Implement Scrabble move generator (DAWG) with English and Russian rules

A Go library that returns every legal play ranked by score and scores or validates plays, using the Appel-Jacobson DAWG algorithm over github.com/iliadenisov/dafsa v1.1.0. - DAWG move generation (across / down / both), full tournament scoring with a per-tile breakdown; public Solver: GenerateMoves (ranked), ScorePlay, ValidatePlay. - Rulesets: English Scrabble, Russian Scrabble, Эрудит (parameterizable Ruleset). - cmd/builddict (build the DAWG from the dictionaries submodule), cmd/stress (self-play benchmark), selfplay engine; brute-force test oracle. - A GADDAG was implemented, benchmarked and removed (the DAWG was smaller and faster for a scoring solver); see RESULTS.md and ALGORITHM.md.
2026-06-01 16:07:32 +02:00
parent f51a1fe2f2
commit 15c7959d96
43 changed files with 3406 additions and 0 deletions
@@ -0,0 +1,87 @@
+# DAWG vs GADDAG — self-play stress test
+
+> **Note.** The GADDAG generator has since been **removed** from the codebase — the DAWG
+> is the sole move generator. This document is kept as the record of the comparison that
+> justified that choice. `cmd/stress` now benchmarks the DAWG alone.
+
+The two move generators were compared by playing greedy AI-vs-AI games on the same
+dictionary with the same seeds, measuring speed and memory. Reproduce with:
+
+```
+go run ./cmd/builddict          # build testdata/sowpods.{dawg,gaddag}
+go run ./cmd/stress -games 200
+```
+
+## Setup
+
+- Dictionary: English **SOWPODS**, 267,752 words (2–15 letters).
+- Board: standard 15×15; greedy player (highest-scoring move), seeded RNG.
+- 200 games per generator, identical seeds; mode = both orientations.
+- Machine: this dev container (Go 1.26, 12 cores; single-threaded run).
+
+## Results (200 games)
+
+Includes the **deterministic cross-set optimization** (one arc enumeration via
+`completers()` for one-sided squares instead of probing every letter); both one-sided
+cases are deterministic for the GADDAG, only the right-extension is for the DAWG.
+
+| metric | DAWG | GADDAG |
+|---|---:|---:|
+| structure size | **732.5 KB** | 5.12 MB |
+| games / turns / plays | 200 / 4783 / 4769 | 200 / 4783 / 4769 |
+| moves generated | 3,880,236 | 3,880,236 |
+| generation time | **23.68 s** | 25.98 s |
+| µs / move-generation call | **4951** | 5431 |
+| moves generated / sec | **163,863** | 149,383 |
+| arcs traversed | **261.8 M** | 276.0 M |
+| arcs / move generated | **67.5** | 71.1 |
+| heap allocated | 16.79 GB | 16.79 GB |
+| GC cycles | 5624 | 5605 |
+| avg final game score | 849.3 | 849.3 |
+
+`GADDAG vs DAWG: 1.10× generation time, 7.16× structure size, 1.00× heap.`
+Peak process RSS (both structures mapped): ~42 MB.
+
+The optimization cut the GADDAG's cross-set arcs (285.5 M → 276.0 M) and narrowed the
+arc gap (1.08× → 1.05×), but the verdict is unchanged: **the GADDAG is still ~10 %
+slower, 7× larger and traverses slightly more arcs.** End-to-end time is dominated by
+the shared per-move scoring (`Evaluate`, ~3.9 M calls), not by graph search, so the
+search-algorithm difference barely moves the total — and what difference remains favours
+the narrower, smaller DAWG.
+
+## Interpretation
+
+- **Correctness.** Both generators produce the *identical* set of moves and scores at
+  every position (identical turns/plays/moves/score above, and the Stage-8 mutual oracle
+  agreeing on 119 positions over 5 games). They differ only in how they search.
+- **Speed.** The DAWG is ~10 % faster end-to-end and traverses ~8 % **fewer** arcs.
+- **The GADDAG does not win here, contrary to Gordon's paper.** Two reasons:
+  1. We use the **final-flag GADDAG** (the minimized DAWG of `REV(x)◊y`, completion via
+     accepting states) so that `dafsa` can be used essentially unmodified. This variant
+     lacks Gordon's *letter-sets-on-arcs* compression, so it is both ~7× larger and has
+     wider nodes — it traverses *more* arcs, not fewer, erasing the theoretical edge.
+  2. The workload is a **scoring solver**: every generated move is scored (cross-words,
+     premiums, bonus) by the shared `Evaluate`. That shared per-move cost dominates, so
+     the search-algorithm difference is small — and what remains favours the simpler,
+     narrower DAWG.
+- **Memory.** The GADDAG structure is 7.16× larger (5.12 MB vs 732 KB). Per-move heap
+  allocation is identical (dominated by shared scoring), and overall RSS is modest.
+
+## Decision
+
+**Use the DAWG (Appel-Jacobson) generator as the production default.** For this library
+(a move generator that scores and ranks every play) it is smaller (7×), at least as fast,
+and simpler to operate: it needs no separator symbol or custom alphabet, `dafsa`'s
+`Save`/`Load` work unchanged, and it requires the fewest `dafsa` additions (only the
+shared low-level traversal API).
+
+The **GADDAG generator is kept and remains selectable** (`scrabble.NewGADDAGGenerator`),
+both as an alternative and as a continuing correctness oracle for the DAWG.
+
+### Caveats / what could change the picture
+
+- A **letter-set GADDAG** (Gordon's true compressed form) plus **incremental cross-set
+  maintenance** would shrink the GADDAG and cut its arc count; it might then beat the DAWG
+  on raw move generation. This was not pursued: the DAWG already meets the goal, and the
+  7× size gap is decisive for a scoring-solver workload where generation is not the
+  bottleneck. It remains a documented future optimization.