# scrabble-solver — project guide
A Go library that, given a dictionary, a board position and a rack, returns every legal
play ranked by score, and scores/validates arbitrary plays. The move generator is the
**DAWG** algorithm (Appel & Jacobson) over `github.com/iliadenisov/dafsa` — a bit-packed,
minimised DAWG with a compact ≤63-symbol alphabet. A GADDAG generator was also built,
measured by self-play, and **removed**: DAWG won for this scoring-solver workload
(~7× smaller, comparable speed) — see `RESULTS.md`.
Module `scrabble-solver`, Go 1.26. Rulesets: English Scrabble, Russian Scrabble, and
Russian **Эрудит** (`rules` package); Эрудит has no Ё tile and folds Ё→Е in its dictionary.
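
The Ё→Е fold mentioned above can be sketched in a few lines. This is only an illustration: `foldYo` is a hypothetical stand-in, and the real `FoldYo` helper in `internal/wordlist` may have a different signature or operate on encoded bytes rather than strings.

```go
package main

import (
	"fmt"
	"strings"
)

// yoFolder rewrites both cases of Ё to Е; every other rune passes through.
var yoFolder = strings.NewReplacer("Ё", "Е", "ё", "е")

// foldYo is a hypothetical helper (not the project's FoldYo): it returns
// word with Ё→Е and ё→е applied, as an Эрудит dictionary would need.
func foldYo(word string) string {
	return yoFolder.Replace(word)
}

func main() {
	for _, w := range []string{"ёлка", "МЁД", "дом"} {
		fmt.Printf("%s -> %s\n", w, foldYo(w))
	}
}
```

Folding at dictionary-build time means the generator never needs a Ё tile for Эрудит; lookups only ever see Е.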
## Layout
- `scrabble/` — the public API: `Solver` (`NewSolver`, `GenerateMoves`, `ScorePlay`,
`ValidatePlay`), the `Move`/`Placement`/`Word` types, the DAWG generator and scoring.
- `board/`, `rack/`, `rules/` — board grid (+ transpose), rack as per-letter counts,
and rulesets (geometry, premium layout, tile values/counts, alphabet, bonus):
`rules.English()`, `rules.RussianScrabble()`, `rules.Erudit()`.
- `internal/` — `dictdawg` (build/load/serialise DAWGs over dafsa), `wordlist`
(encode/filter/sort/dedupe + `FoldYo`), `graph`, `dict`.
- `cmd/builddict` — word list → serialised DAWG (`-alphabet latin|russian`).
- `cmd/stress`, `selfplay/` — the self-play stress harness behind `RESULTS.md`.
- `dawg/` — **committed** dictionaries: `en_sowpods.dawg`, `ru_scrabble.dawg`,
`ru_erudit.dawg` (Ё→Е folded). Rebuild with `make dawg`.
- `dictionaries/` — `kamilmielnik/scrabble-dictionaries` git submodule (English source).
- `dictprep/` — self-contained tooling that turns the Russian academic orthographic
dictionary into a common-noun word list. See `dictprep/README.md`. Committed output is
`dictprep/russian/{all,scrabble}.txt` (+ `orfo_dict_2025.{pdf,txt}`, `manual_confirm.txt`).
Running Stage 2 needs a Python venv with `mawo-pymorphy3` and the `libmorph` apt packages
(see `dictprep/README.md`).
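
To make "rack as per-letter counts" concrete, here is a minimal sketch of that representation. The `Rack` type, its fields and the `Take`/`Put` methods are all hypothetical; the actual `rack` package may be laid out differently, in particular in how it tracks blanks.

```go
package main

import "fmt"

// Rack is a hypothetical count table indexed by alphabet index.
// Holding blanks in a separate counter is an assumption of this sketch.
type Rack struct {
	Counts [64]uint8 // one slot per possible 6-bit alphabet index
	Blanks uint8
}

// Take removes one tile for the given alphabet index, falling back to a
// blank when no natural tile is left. It reports whether a tile was
// available and whether a blank was spent.
func (r *Rack) Take(index byte) (ok, usedBlank bool) {
	if int(index) >= len(r.Counts) {
		return false, false
	}
	if r.Counts[index] > 0 {
		r.Counts[index]--
		return true, false
	}
	if r.Blanks > 0 {
		r.Blanks--
		return true, true
	}
	return false, false
}

// Put undoes a Take, which is what a backtracking generator needs.
func (r *Rack) Put(index byte, wasBlank bool) {
	if wasBlank {
		r.Blanks++
		return
	}
	r.Counts[index]++
}

func main() {
	r := Rack{Blanks: 1}
	r.Counts[0] = 1 // one natural tile of the first letter
	for i := 0; i < 3; i++ {
		ok, blank := r.Take(0)
		fmt.Println(ok, blank)
	}
}
```

A count table makes "is this letter available?" a single array lookup, and taking or returning a tile during backtracking is a constant-time increment or decrement.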
## Build & test
```sh
go test ./...   # all packages green; also run go vet ./... and gofmt
make dawg       # rebuild dawg/*.dawg from the word lists
```
Scoring and move generation are validated against **real tournament games** in GCG format
(`scrabble/gcg_test.go` + `scrabble/testdata/*.gcg`, including the 700+ club): for every
move the test checks the score, the running total, and that the generator actually
produces the played move with that score — canonical play, not invented cases.
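
The running-total part of that check can be sketched as follows. This is not the code in `scrabble/gcg_test.go`: `checkTotals` is a hypothetical helper that assumes each GCG event line starts with `>player:` and ends with the event score and the cumulative total, and it does not re-score or regenerate moves. The sample game is invented for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// checkTotals verifies that, per player, every event's cumulative total
// equals the previous total plus that event's score.
func checkTotals(gcg string) error {
	totals := map[string]int{}
	for n, line := range strings.Split(gcg, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, ">") {
			continue // pragma line (#...) or blank line
		}
		fields := strings.Fields(line)
		if len(fields) < 4 {
			return fmt.Errorf("line %d: too few fields", n+1)
		}
		player := strings.TrimSuffix(strings.TrimPrefix(fields[0], ">"), ":")
		// Atoi accepts a leading sign, so "+66" and "-12" both parse.
		score, err := strconv.Atoi(fields[len(fields)-2])
		if err != nil {
			return fmt.Errorf("line %d: bad score: %w", n+1, err)
		}
		total, err := strconv.Atoi(fields[len(fields)-1])
		if err != nil {
			return fmt.Errorf("line %d: bad total: %w", n+1, err)
		}
		if want := totals[player] + score; total != want {
			return fmt.Errorf("line %d: %s total %d, want %d", n+1, player, total, want)
		}
		totals[player] = total
	}
	return nil
}

func main() {
	// Invented sample; the scores are placeholders, not real plays.
	game := "#player1 A Alice\n#player2 B Bob\n" +
		">A: ABCDEFG 8H CAB +10 10\n" +
		">B: HIJKLMN 9H HI +7 7\n" +
		">A: DEFGOPQ 10H FED +12 22\n"
	fmt.Println(checkTotals(game))
}
```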
## Key facts
- Compact byte encoding: low 6 bits = alphabet index; `0x80` = blank/wildcard (board, rack
and output bytes only — never inside the graph). The public API is byte-indexed.
- DAWG is the production generator; the GADDAG was removed after measurement.
- Detailed docs: `ALGORITHM.md` (the algorithm — single source of truth), `PLAN.md`
(design and decisions), `RESULTS.md` (DAWG-vs-GADDAG), `dictprep/README.md` (RU pipeline).