scrabble-solver/PLAN.md

# Scrabble Solver — Implementation Plan

## Outcome (current state)

Both generators were implemented and verified to produce identical moves, then compared
by self-play stress test (`RESULTS.md`). The **GADDAG was removed**: for a scoring solver
it was ~7× larger and no faster than the **DAWG**, which is now the sole generator.
Shipped: the DAWG generator, full scoring + breakdown, the public `Solver`
(`GenerateMoves`/`ScorePlay`/`ValidatePlay`), and three rulesets (English Scrabble,
Russian Scrabble, Эрудит). The rest of this document is the original roadmap, kept for
history; the DAWG/GADDAG comparison it describes is preserved in `RESULTS.md`.

## Context

We are building a Go library that, given a dictionary, a current game position and a
player's rack, returns every legal new play ranked by descending score. The core is a
fast finite-automaton move generator based on two papers (analysed in `ALGORITHM.md`):

- Appel & Jacobson, *The World's Fastest Scrabble Program* (CACM 1988) — the **DAWG**
  algorithm (cross-checks, anchor squares, `LeftPart`/`ExtendRight`, edge encoding,
  cross-sums for scoring, transpose for the perpendicular direction).
- Gordon, *A Faster Scrabble Move Generation Algorithm* (SP&E 1994) — the **GADDAG**
  (`REV(x)◊y` representation, single `Gen`/`GoOn` generator, deterministic cross-sets,
  construction algorithm).

The graph engine is `github.com/iliadenisov/dafsa` — a compact, bit-packed minimized
DAWG with a 6-bit "compact alphabet" (`alphabet.Indexer`, ≤63 symbols) and an
index-based (`*B`) API, checked out locally at `../dafsa`.

### Headline approach

**Implement BOTH generators — DAWG and GADDAG — behind one shared `Generator`
interface, then decide which becomes the production default empirically**, via a clean
self-play stress test (two greedy players, several games) on the *same* dictionary,
measuring speed and memory. The choice is made **after** implementation and measurement.
Both implementations are kept; the comparison output (`RESULTS.md`) picks the default.

### Locked decisions

| # | Topic | Decision |
|---|---|---|
| 1 | Core algorithm | **Implement both**: DAWG (Appel-Jacobson) and GADDAG (Gordon, over `dafsa` as a DAWG of `{REV(x)◊y}` with per-node final flags). Pick the default after a self-play stress test. |
| 2 | dafsa changes | **Edit `../dafsa` directly**, wire via `go.mod replace`. Leave a spec/CHANGELOG there. |
| 3 | Ruleset scope | Default **standard English Scrabble**, fully **parameterizable** (board geometry, premium layout DL/TL/DW/TW, tile values & counts, alphabet, blank count, bingo bonus). Must support **Russian "Эрудит"** (same 15×15 board + premiums; different tile values/counts; Cyrillic alphabet; one word per move, horizontal **or** vertical). |
| 4 | Scoring | **Full tournament scoring + breakdown**: main word + all cross-words + premiums (newly-placed tiles only) + bingo bonus; result carries formed cross-words and a per-tile breakdown. |
| 5 | Symbol encoding | **`0x80` = wildcard/blank** flag (board/rack/output only — never in the graph). **GADDAG separator `◊` = index == alphabet size** (`cbits` minimal; measured optimum). |
| 6 | State model | **Compact byte board** is the generation core; a structured **`Play`** type + a constructor that applies plays to build a board provide the full game-state overlay. |
| 7 | API scope | **Generation + scoring + validation** of arbitrary plays. |
| 8 | Dictionary | `kamilmielnik/scrabble-dictionaries` as a **git submodule**; `cmd/builddict` builds serialized structures **cached in `testdata/` (gitignored)**. English now; Russian later. |

### Static structure probe (informs expectations; NOT the decision)

Full SOWPODS (267,752 words, Σlen = 2,439,269), built through `dafsa`:

| Structure | nodes | bytes | bits/char | build | ns/arc |
|---|---:|---:|---:|---:|---:|
| DAWG (a–z) | 77,808 | 750 KB | 2.46 | 186 ms | 48.7 |
| GADDAG sep=size(26) · cbits5 | 587,940 | 5.37 MB | 17.61 | 2.92 s | 61.0 |
| GADDAG sep=62 · cbits6 (measured) | 587,940 | 5.53 MB | 18.13 | — | 60.9 |
| GADDAG sep=0x40 · cbits7 (extrapolated) | 587,940 | ~5.69 MB | ~18.6 | — | ~61 |

GADDAG is ~7× the DAWG and ~25% costlier per arc, but ~2× faster at actual move
generation (Gordon Table IV: ~2.5× fewer arcs). The stress test settles it.

## Deliverable documents

1. **`ALGORITHM.md`** — single source of truth (verbatim pseudocode + our adaptation).
2. **`PLAN.md`** — this plan.
3. **`RESULTS.md`** — stress-test comparison + the production-default decision.

## Architecture & package layout

```
scrabble-solver/
  go.mod                      # + replace github.com/iliadenisov/dafsa => ../dafsa
  PLAN.md  ALGORITHM.md  RESULTS.md  README.md
  dictionaries/               # git submodule: kamilmielnik/scrabble-dictionaries
  testdata/                   # gitignored: cached serialized DAWG + GADDAG
  internal/
    gaddag/                   # REV(x)◊y transform + build + traversal wrapper over dafsa
    dictdawg/                 # plain-DAWG build + traversal wrapper over dafsa
    encoding/                 # byte conventions (wildcard 0x80, separator, board cells)
  board/                      # compact board grid, transpose, premium layout
  rack/                       # rack as per-letter counts + blanks
  rules/                      # Ruleset: geometry, premiums, tile values/counts, alphabet, bonus
  scrabble/  (public pkg)     # Solver + Generator interface; Play/Move types;
    gen_dawg.go               #   DAWG generator (LeftPart/ExtendRight)
    gen_gaddag.go             #   GADDAG generator (Gen/GoOn)
  selfplay/                   # bag + greedy player + game loop (self-play engine)
  cmd/builddict/              # word list -> serialized DAWG/GADDAG -> testdata
  cmd/stress/                 # run N self-play games per generator, emit comparison
```

Shared `Generator` interface so the harness can swap implementations:

```go
type Generator interface {
    GenerateMoves(b *board.Board, r rack.Rack, mode Mode) []Move // ranked, descending score
    Name() string
}
```

Board, rack, rules and **scoring are shared**; cross-set computation is per-generator
(DAWG: probe the dictionary DAWG incl. the non-deterministic left set; GADDAG:
deterministic GADDAG walk) — that difference is part of what is measured.

## Changes to `../dafsa` (additive ⇒ backward compatible)

1. **Low-level traversal API**: `type Node` (opaque bit-offset); `Root() Node`;
   `Next(n, ch) (child, final, ok)` (= Gordon's `NextArc`, wraps private `getEdge`);
   `Arcs(n, fn)` (wraps `getNode`, for blanks/cross-sets); a reusable allocation-free
   cursor for hot-path `Next`.
2. **Custom-alphabet persistence**: `WriteTo`/`SaveWith` (allow non-embedded alphabet);
   `ReadWith`/`LoadWith` (inject a known indexer, skip language reconstruction).
3. (Optional) accurate serialized arc/node count; document that `NumEdges()` is build-time.

## Data model & compact formats

- **Byte symbol**: low 6 bits = alphabet index; `0x80` = wildcard/blank (I/O only);
  `◊` = index `len(alphabet)` (GADDAG graph only).
- **Board**: `[]byte`, row-major. `0` = empty; occupied = `letterIndex+1`; blank =
  `(letterIndex+1) | 0x80`. Helpers: `At`, `Set`, `Transpose`, premium lookup.
- **Rack**: `[]byte` counts, length `alphabetSize+1`; last slot = blank count.
- **`Play`**: `{Row, Col; Dir; Tiles []byte (0x80 flags); Main; CrossWords; Score;
  Breakdown}` — input for apply/validate/score and the output element.
- **Modes**: `Both`, `Horizontal`, `Vertical`.

## Staged implementation

- **Stage 0** — Scaffolding & docs: `ALGORITHM.md`, `PLAN.md`, `dictionaries/` submodule,
  `go.mod` replace, `.gitignore`.
- **Stage 1** — dafsa traversal API (shared): `Node`, `Root`, `Next`, `Arcs`, cursor; tests.
- **Stage 2** — dafsa custom-alphabet persistence: `SaveWith`/`ReadWith`; round-trip.
- **Stage 3** — Shared infra: encoding, board (+transpose), rack, rules (EN; Эрудит stub),
  scoring, `Generator` interface, `Move`/`Play` types.
- **Stage 4** — Dictionary build: `internal/dictdawg` + `internal/gaddag`; `cmd/builddict`
  caching serialized DAWG **and** GADDAG in `testdata`.
- **Stage 5** — Cross-sets: DAWG cross-checks (incl. non-deterministic left set) and GADDAG
  deterministic cross-sets; validated against each other + brute force on a small lexicon.
- **Stage 6** — DAWG generator (`LeftPart`/`ExtendRight`).
- **Stage 7** — GADDAG generator (`Gen`/`GoOn`).
- **Stage 8** — Correctness gate: DAWG and GADDAG identical move sets on random positions
  (each move once) + brute force on a tiny dictionary. Must pass before perf comparison.
- **Stage 9** — Self-play stress test: `selfplay` engine (bag, racks, greedy policy,
  seeded RNG, end conditions); `cmd/stress` plays N games per generator measuring time,
  arcs, allocations (`runtime.MemStats`), peak RSS (`/proc/self/status` VmHWM), footprint;
  emit `RESULTS.md`.
- **Stage 10** — Decision + public API: choose default from `RESULTS.md` (both selectable);
  finalize `Solver` API, Play↔board constructors, examples.
- **Stage 11** — Polish: benchmarks, README, optional prebuilt-graph distribution.

## Verification

- `go test ./...`, `go vet`, lint green per stage.
- Mutual oracle (Stage 8): identical move sets; brute force on a tiny dictionary.
- Build EN structures from the SOWPODS submodule via `cmd/builddict`; run `GenerateMoves`
  on canonical positions (e.g. Gordon's "CARE on ABLE") and assert top moves/scores.
- Run `cmd/stress` (100–1000 seeded games per generator) → `RESULTS.md`.

## Assumptions & caveats

- Both algorithms ship; the production default is decided by the stress test. Both remain
  selectable.
- Self-play policy defaults to **greedy** (deterministic tie-break, seeded RNG); tunable.
- Separator = real 27th token (`◊`, index = size, `cbits=5`); `0x40` reserved on the board.
  Wildcard/blank = `0x80`, never in the graph.
- **Stateless per-call** generation in v1 (anchors + cross-sets recomputed per call);
  incremental maintenance is a later optimization (both generators run stateless — a
  fairness note for the comparison).
- Persistence stores only the graph; the (custom) alphabet is injected on load.
- Russian "Эрудит" alphabet specifics (Е/Ё handling, tile values/counts) resolved at
  Stage 3/4; "one word per move, H or V" is satisfied by the modes.
- The final-flag GADDAG is larger than Gordon's letter-set form; letter-sets-on-arcs
  remain a possible future optimization.