Ilia Denisov 85baabe4ba
Stage 5: robot opponent (pool, seed-derived strategy, move driver, matchmaker substitution)
- internal/robot: durable kind='robot' account pool (migration 00004); every
  per-game and per-turn choice derived deterministically from the game seed
  (restart-stable FNV mix); a background move driver; margin targeting (band
  1-30, closest-to-band); right-skewed [2,90]min delays (median ~10m);
  opponent-anchored sleep with +/-3h drift; daytime nudge reply + proactive
  12h nudge; friend/chat blocked via profile toggles.
- engine.Candidates (decoded ranked plays); game.Candidates + RobotTurns;
  social.LastNudgeAt.
- matchmaker: 10s wait then robot substitution (reaper) + Poll delivery seam.
- config (BACKEND_ROBOT_DRIVE_INTERVAL, BACKEND_LOBBY_ROBOT_WAIT,
  BACKEND_LOBBY_REAPER_INTERVAL); main wiring + boot-time pool provisioning.
- metrics: robot account_stats (authoritative balance) + robot_games_finished_total
  OTel counter + per-finish log.
- docs: PLAN, ARCHITECTURE, FUNCTIONAL(+ru), TESTING, README; account.go comment.
- tests: robot strategy units, matchmaker reaper/Poll, engine.Candidates; inttest
  robot full-game / substitution / proactive-nudge.
2026-06-02 21:02:20 +02:00

# Scrabble Game — implementation plan
Living plan and **stage tracker**. Each stage is implemented in its own session;
the rules for starting and finishing a stage are in [`CLAUDE.md`](CLAUDE.md).
The architecture/decision record is [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md);
behaviour is [`docs/FUNCTIONAL.md`](docs/FUNCTIONAL.md). When a stage produces a
decision, bake it back here **and** into the affected docs/code in the same PR.
## Context
Greenfield multiplatform Scrabble. Players arrive from a platform (Telegram
first; later VK/MAX/iOS/Android) or standalone web (email / guest). Three
executables — `gateway`, `backend`, `ui` — plus per-platform side-services.
Deliberately simpler than the sibling `../galaxy-game` (idea donor, not a
template). The `../scrabble-solver` engine is embedded in-process as a library.
## Locked decisions (recap — full record in docs/ARCHITECTURE.md)
Stack: `go.work` monorepo, modules `scrabble/<name>`, Go 1.26.x, backend
gin+pgx+Postgres(schema `backend`)+goose+zap+OTel (deps added when first used).
Wire: Connect-RPC + FlatBuffers (client↔gateway), REST/JSON + `X-User-ID`
(gateway↔backend), gRPC server-stream for live events. Auth: platform-native,
thin opaque session token, no Ed25519/signing, likely no Redis. UI: pure
HTML5/CSS, plain Svelte + Vite, Capacitor for native. MVP surfaces: Telegram +
web (email + ephemeral guest) + link/merge. Variants: ru/en/Эрудит.
Legality: validate-at-submit. End: empty bag+rack / 6 scoreless / 24h timeout.
Hint: top-1. Word-check: unlimited + complaint. Robot: P(win)≈0.40, margin
targeting, [2,90]min skewed timing, sleep 00:00–07:00 opp-tz, nudge logic.
Dictionary: pin per game. History: structured + GCG export, dictionary-
independent (see ARCHITECTURE §9.1).
## Stage tracker
| # | Stage | Status |
|---|-------|--------|
| 0 | Scaffolding (go.work, backend skeleton, docs, CI) | **done** |
| 1 | Backend foundation (config, server, Postgres+goose, sessions, accounts) | **done** |
| 2 | Engine package over scrabble-solver | **done** |
| 3 | Game domain (lifecycle, rules, hint, word-check, history+GCG, stats) | **done** |
| 4 | Lobby & social (matchmaking, friends, block, chat, profile, nudge) | **done** |
| 5 | Robot opponent | **done** |
| 6 | Gateway edge (Connect/FB, platform auth, sessions, push bridge, admin) | todo |
| 7 | UI (plain Svelte + Vite, board, lobby, chat, i18n) | todo |
| 8 | Telegram integration (bot side-service, deep-link, push) | todo |
| 9 | Admin & dictionary ops (complaint review, version reload) | todo |
| 10 | Account linking & merge | todo |
| 11 | Polish (observability, perf with evidence, deploy) | todo |
Scaffolding is incremental: `go.work` lists only existing modules; each stage
adds the modules it needs.
## Stages
Each stage: read this plan + relevant docs, **interview the owner on the open
details below**, implement within scope, then update plan/docs/code and get CI
green before marking done.
### Stage 0 — Scaffolding *(done)*
Scope: `go.work` (Go 1.26.3, `use ./backend`); minimal runnable `backend`
(gin, zap, `/healthz`, `/readyz`, env config); docs skeleton; `PLAN.md`;
`CLAUDE.md`; `.gitea/workflows/go-unit.yaml`; README; `.gitignore`.
Acceptance: `go build ./backend/...` + `go vet` + gofmt clean +
`go test ./backend/...` green; CI green on push.
### Stage 1 — Backend foundation
Scope: config/server route groups (`/api/v1/{public,user,internal,admin}`,
probes), Postgres (pgx) + embedded goose migrations + schema `backend`,
telemetry (OTel) wiring, in-memory cache scaffolding, thin sessions + accounts +
platform identities.
Open details: Postgres version + DSN/`search_path` convention; jet vs
sqlc/sqlx (default jet); migration naming; exact session-token shape (opaque
random length, TTL, revocation); account/identity table shape; whether the
admin bootstrap lands here or in Stage 9.
### Stage 2 — Engine package
Scope: `backend/internal/engine` over scrabble-solver — versioned DAWG
load/registry, GenerateMoves/ValidatePlay/ScorePlay wrappers, bag/rack, the
**dictionary-independent** game-state model + decode helpers. Add
`replace scrabble-solver => ../scrabble-solver` to `go.work` here and solve the
CI sibling-checkout (clone `gitea.iliadenisov.ru/.../scrabble-solver`).
Open details: how CI obtains the solver (clone sibling vs publish/tag the
solver module); in-memory game-state representation; how blanks and exchanges
are modelled; Эрудит specifics to verify against the solver.
### Stage 3 — Game domain
Scope: create/join, turn order, submit play/pass/exchange/resign,
validate-at-submit, scoring, end-conditions, 24h timeout/auto-resign, hint,
word-check + complaint capture, structured history + GCG writer, stats on
finish.
Open details: GCG dialect details (blanks, exchanges, notation); exact stats
edge cases; turn-timeout scheduler mechanism (cron vs per-game timer);
complaint payload shape.
### Stage 4 — Lobby & social
Scope: matchmaking pool, friends, block, per-game chat, profile + email
confirm-code, nudge.
Open details: pool fairness/keying confirmation; deep-link format per platform;
chat length limit + retention; friend-request lifecycle; email-code provider
(SMTP relay choice).
### Stage 5 — Robot opponent
Scope: human-like player — balance ~0.40, margin targeting, skewed [2,90]min
timing + sleep + nudge logic, friend/DM blocking, name pool.
Open details: exact delay distribution + parameters; margin band; name pool
source; how the scheduler drives robot moves; metrics for tuning balance.
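
The margin targeting named in this scope can be illustrated with a minimal sketch. It assumes the band recorded for Stage 5 in the refinements below ([1,30] when playing to win, [-30,-1] when playing to lose). The `Candidate` type, the `pick` helper and the tie-break rule (prefer the smaller absolute margin) are hypothetical, chosen only for illustration; they are not the `internal/robot` API.

```go
package main

import "fmt"

// Candidate is a hypothetical stand-in for one ranked play.
type Candidate struct {
	Word  string
	Score int
}

// distToBand returns how far margin m lies outside [lo, hi]; 0 when inside.
func distToBand(m, lo, hi int) int {
	switch {
	case m < lo:
		return lo - m
	case m > hi:
		return m - hi
	default:
		return 0
	}
}

func abs(x int) int {
	if x < 0 {
		return -x
	}
	return x
}

// pick returns the index of the candidate whose resulting margin
// (own + move score - opp) is closest to the target band, or -1 when
// there are no candidates. Ties go to the smaller absolute margin,
// which is an assumed reading of "the conservative edge".
func pick(cands []Candidate, own, opp int, toWin bool) int {
	lo, hi := 1, 30
	if !toWin {
		lo, hi = -30, -1
	}
	best, bestDist, bestAbs := -1, 0, 0
	for i, c := range cands {
		m := own + c.Score - opp
		d, a := distToBand(m, lo, hi), abs(m)
		if best == -1 || d < bestDist || (d == bestDist && a < bestAbs) {
			best, bestDist, bestAbs = i, d, a
		}
	}
	return best
}

func main() {
	cands := []Candidate{{"CAT", 5}, {"QUIZ", 20}, {"JACKPOT", 60}}
	fmt.Println(pick(cands, 100, 110, true), pick(cands, 100, 110, false))
}
```

With scores 100 against 110, the three plays give margins of -5, 10 and 50, so the sketch selects the second play when playing to win and the first when playing to lose.
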
### Stage 6 — Gateway edge
Scope: Connect/gRPC-Web (h2c), Telegram initData validation → session →
`X-User-ID`, in-memory rate-limit, admin Basic-Auth passthrough, FlatBuffers
transcoding, in-app push stream bridging backend `push` gRPC stream, email +
ephemeral-guest paths.
Open details: FlatBuffers schema layout + message_type catalog; rate-limit
classes/limits; admin surface routing; session cache shape at the gateway.
### Stage 7 — UI
Scope: plain Svelte + Vite static; Connect-web + FlatBuffers client; lobby (my
games, profile tabs); board (HTML5/CSS grid, drag-n-drop, no assets); chat;
hint/word-check; in-app stream; i18n en/ru; in-memory session (+IndexedDB if
available); Capacitor-ready structure.
Open details: detailed game-board UX (deferred by the owner to this stage);
client routing; offline/refresh behaviour; design system / theming.
### Stage 8 — Telegram integration
Scope: bot side-service, deep-link invites, platform push (your-turn / nudge),
Mini App launch/auth; backend↔platform internal API.
Open details: bot framework/library; deep-link scheme; push message templates;
internal API contract; Mini App hosting/origin.
### Stage 9 — Admin & dictionary ops
Scope: admin endpoints (users, games, complaint review queue, dictionary
versions + reload), complaint→dictionary update pipeline.
Open details: whether a server-rendered console is wanted or JSON-only; the
dictionary rebuild/deploy pipeline; complaint resolution workflow.
### Stage 10 — Account linking & merge
Scope: link-via-confirm; merge-into-A (stats sum, transfer games/friends,
dedupe). High blast-radius — focused regression tests.
Open details: conflict resolution (active games on both, duplicate friends,
display-name collisions); irreversibility/audit; confirm-flow per platform.
### Stage 11 — Polish
Scope: observability dashboards, evidence-based performance work, prod
build/deploy.
Open details: deployment target/host; dashboards; load expectations.
## Refinements logged during implementation
- **Stage 0**: solver `replace` deferred to Stage 2 (nothing imports it yet;
adding the path now would break CI, which checks out only this repo). Docker /
compose deferred to a stage that has something to deploy. Trunk is `master`
(owner preference); `feature/*` + PR from Stage 1; the genesis commit lands on
`master` by necessity.
- **Stage 1** (interview + implementation):
- Query layer: **go-jet** over `database/sql` (pgx stdlib) + otelsql; a
`cmd/jetgen` tool regenerates the **committed** code from a throwaway
container. Postgres **17** pinned for jetgen, tests and prod.
- Sessions: opaque token stored only as a **SHA-256 hash** (kept as hex
`text`, not `bytea` — avoids jet bytea-literal friction), **revoke-only**
(no TTL); revocation-audit table deferred. Backend keeps a warmed
write-through session cache that gates `/readyz`.
- Data model: **UUIDv7** PKs; one unified `identities` table
(`kind ∈ telegram|email`, widen to `vk`/`max` later); no soft-delete /
actor-audit columns yet.
- HTTP surface: **service/store/cache layer only**. `/api/v1/{public,user,
internal,admin}` groups + `X-User-ID` middleware are scaffolding (exposed via
`Server` group accessors); the session/account REST handlers land with the
gateway in **Stage 6**. Admin bootstrap deferred to **Stage 9**.
- Telemetry: providers + request-timing middleware + otelsql; exporters
`none` (default) / `stdout`; OTLP + dashboards deferred to **Stage 11**.
- Tests/CI: integration tests behind the `integration` build tag in
`backend/internal/inttest` + new `integration.yaml` (testcontainers, Ryuk
off, serial), firing on push and PR. Backend now **hard-depends on Postgres
at boot** (migrations at startup) — a deliberate contract change from
Stage 0, documented in both READMEs. All code stays in the existing
`backend` module under `internal/` (+ `cmd/jetgen`); `go.work` untouched.
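
The session-token decision above (opaque token, only its SHA-256 hash persisted as hex `text`) can be sketched as follows. This is a minimal illustration under stated assumptions: the 32-byte token length and the helper names `newToken` and `hashToken` are hypothetical and do not come from the backend code.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// newToken returns an opaque random token as the hex encoding of n random
// bytes. The length is an assumption; the plan does not fix it here.
func newToken(n int) (string, error) {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// hashToken returns the lower-case hex SHA-256 digest of the token. Only
// this value would be stored, in a text column rather than bytea.
func hashToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

func main() {
	tok, err := newToken(32)
	if err != nil {
		panic(err)
	}
	// The client keeps tok; the store keeps only the hash.
	fmt.Println(len(tok), len(hashToken(tok)))
}
```

A lookup then hashes the presented token and compares against the stored hex string, so a leaked table does not expose usable tokens.
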
- **Stage 2** (interview + implementation):
- Scope: `internal/engine` is a self-contained **library** (registry, bag,
`Game` state machine, decode/replay). No `config`/`main`/`server` wiring this
stage — there is no consumer yet; wiring lands in **Stage 3**, mirroring
Stage 1's deferred handlers.
- **Pure rules engine** (interview): the engine owns the in-memory `Game`,
pure transitions (play/pass/exchange/resign + draw) **and end-condition
detection**, including the standard **end-game rack-adjustment scoring** — a
deliberate slice of Stage 3's "scoring/end-conditions" that the pure-engine
boundary implies. Stage 3 keeps scheduling, the 24h timeout, persistence and
GCG.
- **Solver wiring**: `replace scrabble-solver => ../scrabble-solver` in
`go.work`; `backend/go.mod` requires `scrabble-solver` (placeholder version,
redirected by the replace) and `github.com/iliadenisov/dafsa` directly (for
`dawg.Load`). CI clones the **public** solver repo at **master HEAD**
anonymously into `../scrabble-solver` (no token); both Go workflows gained
the step (the engine's untagged tests run under the integration workflow too)
and set `BACKEND_DICT_DIR`.
- **Dictionaries**: registry loads the committed DAWGs from a directory
parameter; `dict_version` is an explicit string label; the latest version
per variant is tracked. Smoke tests validate a known word per variant
(English/Russian/Эрудит). **Эрудит is handled uniformly** — every real
difference is already in `rules.Erudit()`; the move.go "single orientation
per turn" note needs no special code (any single play is one-directional).
- **Bag/blanks/exchange**: own deterministic `Bag` (Draw + Return) because
`selfplay.Bag` cannot return tiles; exchange is legal only when the bag holds
at least a rack and draws replacements before returning the swapped tiles. A
blank is `Placement{Blank:true}` carrying its designated letter; the history
keeps the concrete letter plus a blank flag (decoded via `Alphabet.Character`
/ `Decode`). `ReplayBoard` reuses `scrabble.Apply`, so no `internal/encoding`
dependency.
- **Deviation from the approved plan**: `docs/FUNCTIONAL.md` (+`_ru`) was left
unchanged. Stage 2 adds no user-visible behaviour; the variant, per-game
dictionary and dictionary-independent-history user stories already live in
Stages 3–4, so a "light touch" here would have duplicated or pre-empted them.
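
The bag and exchange rule described for Stage 2 (a deterministic bag with Draw and Return; an exchange is legal only when the bag holds at least a rack, and replacements are drawn before the swapped tiles go back) can be sketched as below. The type layout, the `math/rand` shuffle and the `rackSize` constant are assumptions for illustration, not the `internal/engine` implementation.

```go
package main

import (
	"fmt"
	"math/rand"
)

const rackSize = 7 // assumed rack size

// Bag is a seeded tile bag, so the same seed and the same sequence of
// operations always yield the same draws.
type Bag struct {
	tiles []rune
	rng   *rand.Rand
}

func NewBag(tiles []rune, seed int64) *Bag {
	b := &Bag{tiles: append([]rune(nil), tiles...), rng: rand.New(rand.NewSource(seed))}
	b.shuffle()
	return b
}

func (b *Bag) shuffle() {
	b.rng.Shuffle(len(b.tiles), func(i, j int) { b.tiles[i], b.tiles[j] = b.tiles[j], b.tiles[i] })
}

func (b *Bag) Len() int { return len(b.tiles) }

// Draw removes up to n tiles from the bag and returns them.
func (b *Bag) Draw(n int) []rune {
	if n > len(b.tiles) {
		n = len(b.tiles)
	}
	cut := len(b.tiles) - n
	out := append([]rune(nil), b.tiles[cut:]...)
	b.tiles = b.tiles[:cut]
	return out
}

// Return puts tiles back and reshuffles with the bag's own generator.
func (b *Bag) Return(tiles []rune) {
	b.tiles = append(b.tiles, tiles...)
	b.shuffle()
}

// Exchange draws replacements first and only then returns the swapped
// tiles, so a player can never draw back a tile they just gave up.
func (b *Bag) Exchange(swapped []rune) ([]rune, bool) {
	if b.Len() < rackSize {
		return nil, false
	}
	drawn := b.Draw(len(swapped))
	b.Return(swapped)
	return drawn, true
}

func main() {
	bag := NewBag([]rune("ABCDEFGHIJ"), 1)
	drawn, ok := bag.Exchange([]rune("XY"))
	fmt.Println(len(drawn), ok, bag.Len())
}
```

Because every shuffle uses the generator seeded at construction, replaying the same moves against the same seed reproduces the same bag, which is the property the journal replay in Stage 3 relies on.
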
- **Stage 3** (interview + implementation):
- Scope, as in Stages 1–2: **domain service/store layer + engine wiring, no
HTTP** (`internal/game`). The gateway↔backend REST surface lands in Stage 6;
the only active driver this stage is a background turn-timeout sweeper started
from `main`. The robot (Stage 5) will consume the same service API.
- **Persistence = event-sourcing + warm cache** (interview): durable state is
the `games` row plus an append-only decoded move journal (`game_moves`); the
live position is an `engine.Game` kept in an in-memory cache with a ~24h idle
TTL and rebuilt by replaying the journal on a miss (the seeded bag makes
replay exact). Each game is serialised by a per-game mutex; a persistence
failure evicts the live game so the next access rebuilds. §9 reworded from
"stored structurally" to this model.
- **Resign/timeout split** (interview): 2-player resign/timeout only this stage
(the other player wins); multiplayer drop-out-and-continue + resigned-tiles
disposition deferred to Stage 4. Per-game **turn-timeout duration** setting
(5/10/15/30 min, 1/2/3/6/12/24 h; default 24 h) and a per-user **away window**
(`accounts.away_start/away_end`, default 00:00–07:00 local, honoured by the
sweeper with midnight-cross handling) added now; profile editing of the away
window is Stage 4 and the robot's sleep (Stage 5) reuses it.
- **Engine `Resign` fix** (interview, in `internal/engine`): the resigner keeps
their accumulated score (no end-game rack adjustment) and never wins; `winner`
excludes the resigner, so a two-player resign/timeout gives the win to the
other player regardless of score. Timeout reuses `Resign`, so the game domain
needs no winner override.
- **Additive engine domain API**: `Direction`, `Game.SubmitPlay/SubmitExchange/
EvaluatePlay/HintView/Hand`, `MoveRecord.{Dir,MainRow,MainCol}`,
`Registry.Lookup`, `ParseVariant` — so `internal/game` never imports
`scrabble-solver` (keeps the §5 single-importer invariant).
- **Create = atomic with seats** (interview): `Create` seats all accounts and
starts; lobby seat-filling is Stage 4. **Sweeper = periodic goroutine**
(interview; default 60 s, `BACKEND_GAME_TIMEOUT_SWEEP_INTERVAL`).
- **Hint = settings + wallet** (interview): per-game `hints_allowed` +
`hints_per_player`, plus a profile wallet `accounts.hint_balance` (spent after
the allowance; purchases later). Category defaults (random 1 / tournament 0 /
friendly 1-or-0) are the caller's job (lobby/tournaments).
- **Stats** (interview): `account_stats` with **`draws`** added beyond §9's
wins/losses; `max_word_points` = best single **move** score; ties draw,
resign/timeout is a loss, guests get no stats.
- **Complaint** (interview): full payload with `game_id`; word-check is scoped
to the game's pinned `(variant, dict_version)`. Stage 9 owns the resolution
lifecycle, so the `status` column carries no value CHECK yet.
- **GCG** (interview): standard Poslfit dialect (UTF-8, `#player`/`#lexicon`
pragmas, `8G`/`H8` coordinates, lower-case blanks, `.` pass-throughs, `-TILES`
exchange) plus `#note` lines for resign/timeout; derived from the journal, so
dictionary-independent.
- **Engine wiring + config**: `main` loads the registry (`engine.Open`, a hard
boot dependency like migrations) and starts the sweeper. New config:
`BACKEND_DICT_DIR` (required), `BACKEND_DICT_VERSION` (default `v1`),
`BACKEND_GAME_TIMEOUT_SWEEP_INTERVAL` (60 s), `BACKEND_GAME_CACHE_TTL` (24 h).
No CI change — both Go workflows already clone the solver sibling and export
`BACKEND_DICT_DIR`. `accounts` gained `away_start`/`away_end`/`hint_balance`
and the `account` package gained `SpendHint` (it owns its table).
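
The midnight-cross handling mentioned for the away window can be sketched as a small predicate over minutes of the local day. This is an illustration only: the function name, the minute-of-day representation and the half-open interval `[start, end)` are assumptions, not the sweeper's actual code.

```go
package main

import "fmt"

// inAwayWindow reports whether now falls inside the away window.
// All arguments are minutes since local midnight (0..1439).
// A window with start > end is treated as crossing midnight,
// for example 23:00 to 06:00. An empty window (start == end) never matches.
func inAwayWindow(now, start, end int) bool {
	switch {
	case start == end:
		return false
	case start < end:
		// Same-day window, such as the default 00:00 to 07:00.
		return now >= start && now < end
	default:
		// Window wraps past midnight: late evening or early morning.
		return now >= start || now < end
	}
}

func main() {
	fmt.Println(inAwayWindow(3*60, 0, 7*60))          // 03:00 in 00:00 to 07:00
	fmt.Println(inAwayWindow(23*60+30, 23*60, 6*60)) // 23:30 in 23:00 to 06:00
	fmt.Println(inAwayWindow(12*60, 23*60, 6*60))    // noon in 23:00 to 06:00
}
```

A sweeper following this shape would skip a timeout while the predicate is true for the player whose turn it is, evaluated in that player's local time.
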
- **Stage 4** (interview + implementation):
- Scope, as in Stages 1–3: **domain service/store layer, no HTTP** — REST/stream
is Stage 6. Chat and nudges are **persisted** now; live delivery (push /
in-app stream) is Stage 6/8. New packages `internal/social` (friends, blocks,
chat+nudge) and `internal/lobby` (matchmaking + invitations); profile editing
and the email confirm-code extend `internal/account`. The services have no
active driver this stage, so `main` builds them and hands them to the server,
which exposes them via accessors (the Stage 1 scaffolding-accessor pattern) for
the Stage 6 handlers.
- **Friends** (interview): request → accept on a single `friendships` table;
decline/cancel delete the pending row; **blocking severs** any friendship.
- **Blocks** (interview): the existing global toggles **plus** a per-user
`blocks` table; block effects are **mutual** (a block either way suppresses
chat visibility and prevents requests/invitations between the pair).
- **Friend games** (interview): invitation → accept; the game starts only when
**all** invitees accept, any decline cancels it, and a pending invitation
**lazily expires after 7 days** (checked on access — no new sweeper).
- **Chat** (interview): ≤ **60 runes**, stored with the game forever, the
sender **IP** kept for moderation (as `text`, following Stage 1's no-`bytea`
precedent; the gateway forwards it in Stage 6), input **content-filtered**
(links/emails/phone numbers incl. obfuscated forms) via `mvdan.cc/xurls/v2`
plus a compact leet/separator normaliser and a ≥7-digit phone heuristic — the
one new dependency. **Nudge is a chat message** (`kind='nudge'`), rate-limited
to once per hour per game per sender.
- **Matchmaking** (interview): an **in-memory** FIFO pool keyed by **variant**
only (variant fixes the board language), pairing two humans (seat order
randomised). The 10 s wait and **robot substitution are deferred to Stage 5**.
The pool does **not** consult blocks (auto-match is anonymous) — a deliberate
simplification of the plan's optional block-skip that also avoids a DB call
under the pool lock.
- **Email confirm-code** (interview): 6-digit code, 15-min TTL, ≤ 5 attempts,
stored as a **SHA-256 hash**; a `Mailer` seam with an SMTP relay
(`BACKEND_SMTP_*`) and a default **log mailer**. It binds an email to the
current account; an email already confirmed by another account → `ErrEmailTaken`
(**merge is Stage 10**); email-as-login is Stage 6 and reuses this mechanism.
- **Multi-player drop-out** (interview; discharges the Stage 3 deferral): the
engine's `Resign` now drops a seat and the rest **play on** while ≥ 2 are
active, finishing (last-survivor wins) when one remains; `winner` excludes all
resigned seats. A per-game **`dropout_tiles`** setting (`remove` default |
`return`) governs the leaver's rack, which is **never revealed** to the others.
Timeout reuses `Resign`, so a multi-player timeout drops one seat and play
continues; `game.commit`/`timeoutGame` were already keyed on `g.Over()`, so they
only needed the setting threaded through create/replay.
- **Build/deps**: `go mod tidy` is not run — the bare-path `scrabble-solver`
replace lives only in `go.work`, so `tidy`/`go get` cannot resolve it; the
`xurls` dependency was added with `go mod edit -require` + `go mod download`,
its checksums recorded in the committed **`go.work.sum`**. No CI workflow change
(both Go workflows already clone the solver sibling and export
`BACKEND_DICT_DIR`).
- **Stage 5** (interview + implementation):
- Scope, as in Stages 1–4: **domain layer, no HTTP** — the robot consumes the
public game API as an ordinary seated player (`internal/robot`), so only
`internal/engine` still imports the solver. New: `engine.Candidates()` (decoded
ranked plays) and a thin `game.Service.Candidates` + `RobotTurns` read.
- **Account model** (interview): a pool of **durable accounts**, each a single
`identities` row `kind='robot'` (migration `00004` widens the kind CHECK — a
CHECK-only change, no jetgen). A curated ~16-name pool in code; `EnsurePool`
provisions them idempotently at boot (a hard dependency, like the registry) with
`block_chat`/`block_friend_requests` set, which is **all** the friend/DM blocking
needs (no special-casing).
- **Driver + state** (interview): a background sweeper goroutine
(`robot.Service.Run`/`Drive`, mirroring the timeout sweeper); **every per-game
and per-turn choice is derived deterministically from the game `seed`** (FNV-1a
mix, restart-stable — not `hash/maphash`), so the robot keeps **no extra state**.
`playToWin = mix(seed,"win")%100 < 40`; per-turn `delay`; sleep `drift`.
- **Timing** (interview): per-move delay `2 + 88·u^k` minutes, `u~U(0,1)`,
**k≈3.5 → median ~10 min**, clamped to [2,90]. A daytime nudge on the robot's
turn pulls the move into a 2–10 min reply window; the robot proactively nudges
after **12 h** idle on the human's turn (reusing `social.Nudge`'s once-per-hour
guard; `social.LastNudgeAt` added to detect the human's nudge).
- **Sleep** (interview — resolves the §7-vs-`account.go` mismatch): the robot
sleeps 00:00–07:00 in the **opponent's timezone shifted by a per-game drift ∈
[-3,+3]h** (so its night overlaps the human's rather than running anti-phase),
computed on the fly per game — **no profile mutation, no concurrency cap**. The
`account.go` away-window comment was corrected accordingly.
- **Margin** (interview): pick the candidate whose resulting margin (own+move-opp)
is closest to **[1,30]** when playing to win / **[-30,-1]** when playing to lose,
tie-broken toward the conservative edge; no legal play → exchange the full rack
when the bag can refill it, else pass.
- **Substitution** (interview): a matchmaker **reaper** (`Reap`/`RunReaper`)
substitutes a pooled robot after a **10 s** wait (`BACKEND_LOBBY_ROBOT_WAIT`);
`NewMatchmaker` now takes a `RobotProvider`. A waiter learns of a match — human
pairing **or** substitution — through a new `Poll` + results map; production
delivery is a **match-found notification** (session/in-app push + side-service),
Stage 6/8 — noted in §10.
- **Metrics** (interview, 1+2): robots are durable accounts, so `account_stats`
is the authoritative, complete balance ground-truth (target ~40% robot wins);
an OTel counter (`robot_games_finished_total`, exporter `none` today) and a
structured log cover robot-finished games for live observation.
- **Config**: `BACKEND_ROBOT_DRIVE_INTERVAL` (30 s), `BACKEND_LOBBY_ROBOT_WAIT`
(10 s), `BACKEND_LOBBY_REAPER_INTERVAL` (1 s). No CI change (both Go workflows
already clone the solver sibling and export `BACKEND_DICT_DIR`).
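
The seed-derived choices and the delay formula recorded for Stage 5 can be sketched together. The sketch assumes a `uint64` seed, a `mix` that feeds the seed bytes and a label into 64-bit FNV-1a, and a hypothetical per-turn label `"delay:<turn>"`; only the `mix(seed,"win")%100 < 40` rule and the `2 + 88·u^k` formula with k≈3.5 come from the notes above.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"math"
)

// mix hashes the seed and a label with FNV-1a. Unlike hash/maphash, the
// result is the same across processes and restarts.
func mix(seed uint64, label string) uint64 {
	h := fnv.New64a()
	var b [8]byte
	binary.LittleEndian.PutUint64(b[:], seed)
	h.Write(b[:])
	h.Write([]byte(label))
	return h.Sum64()
}

// playToWin follows the rule from the notes: roughly 40% of seeds play to win.
func playToWin(seed uint64) bool { return mix(seed, "win")%100 < 40 }

// unit maps a 64-bit hash to a float in [0,1) using its top 53 bits.
func unit(x uint64) float64 { return float64(x>>11) / (1 << 53) }

// delayMinutes applies 2 + 88*u^k with k = 3.5, clamped to [2,90].
func delayMinutes(u float64) float64 {
	const k = 3.5
	return math.Min(90, math.Max(2, 2+88*math.Pow(u, k)))
}

// turnDelay derives the delay for one turn from the seed alone, so no
// per-robot state has to be stored. The label format is hypothetical.
func turnDelay(seed uint64, turn int) float64 {
	return delayMinutes(unit(mix(seed, fmt.Sprintf("delay:%d", turn))))
}

func main() {
	const seed = 42
	fmt.Println("plays to win:", playToWin(seed))
	for turn := 0; turn < 3; turn++ {
		fmt.Printf("turn %d: %.1f min\n", turn, turnDelay(seed, turn))
	}
	fmt.Printf("median (u=0.5): %.2f min\n", delayMinutes(0.5))
}
```

At u = 0.5 the formula gives 2 + 88·0.5^3.5 ≈ 9.8 minutes, which matches the stated median of about 10 minutes; the exponent pushes most of the mass toward short delays while keeping a long tail up to 90.
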
## Deferred TODOs (cross-stage)
- **TODO-1 — publish & version the solver.** Once `scrabble-solver` is stable,
give it a real module URL and switch `backend` to a versioned dependency,
dropping the `go.work` replace and the CI clone. Removes the floating
`master` dependency accepted for now (Stage 2 interview).
- **TODO-2 — split the solver into engine vs dictionary generator + versioned
dictionary artifacts.** Owner's idea, with the caveats agreed at the Stage 2
interview: the split is sound (build-time wordlist→DAWG vs runtime load have
different lifecycles and shrink the runtime dependency surface), **but** the
generator must pin the **same** `dafsa`/`alphabet` versions and alphabet
definitions as the runtime engine or the on-disk format / letter indexing
drifts and silently corrupts validation. For delivery prefer **Git LFS or an
artifact store** (Gitea releases / OCI artifact / object storage) over a raw
git submodule (the ~0.5–0.7 MB DAWGs are regenerated wholesale and bloat git
history); pin by tag/hash for a reproducible startup set. A submodule/LFS pull
is a **deploy-time** way to populate the directory, **not** the runtime
dynamic-reload mechanism (Stage 9) — keep the `BACKEND_DICT_DIR` directory as
the runtime contract: a new `.dawg` appears in it and is loaded with
`dawg.Load`.