# Scrabble Game — implementation plan

Living plan and **stage tracker**. Each stage is implemented in its own session; the rules for
starting and finishing a stage are in [`CLAUDE.md`](CLAUDE.md). The architecture/decision record is
[`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md); behaviour is
[`docs/FUNCTIONAL.md`](docs/FUNCTIONAL.md). When a stage produces a decision, bake it back here
**and** into the affected docs/code in the same PR.

## Context

Greenfield multiplatform Scrabble. Players arrive from a platform (Telegram first; later
VK/MAX/iOS/Android) or standalone web (email / guest). Three executables — `gateway`, `backend`,
`ui` — plus per-platform side-services. Deliberately simpler than the sibling `../galaxy-game`
(idea donor, not a template). The `../scrabble-solver` engine is embedded in-process as a library.

## Locked decisions (recap — full record in docs/ARCHITECTURE.md)

Stack: `go.work` monorepo, modules `scrabble/`, Go 1.26.x, backend
gin+pgx+Postgres(schema `backend`)+goose+zap+OTel (deps added when first used).
Wire: Connect-RPC + FlatBuffers (client↔gateway), REST/JSON + `X-User-ID` (gateway↔backend),
gRPC server-stream for live events.
Auth: platform-native, thin opaque session token, no Ed25519/signing, likely no Redis.
UI: pure HTML5/CSS, plain Svelte + Vite, Capacitor for native.
MVP surfaces: Telegram + web (email + ephemeral guest) + link/merge.
Variants: ru/en/Эрудит.
Legality: validate-at-submit.
End: empty bag+rack / 6 scoreless / 24h timeout.
Hint: top-1.
Word-check: unlimited + complaint.
Robot: P(win)≈0.40, margin targeting, [2,90]min skewed timing, sleep 00:00–07:00 opp-tz, nudge logic.
Dictionary: pin per game.
History: structured + GCG export, dictionary-independent (see ARCHITECTURE §9.1).

## Stage tracker

| # | Stage | Status |
|---|-------|--------|
| 0 | Scaffolding (go.work, backend skeleton, docs, CI) | **done** |
| 1 | Backend foundation (config, server, Postgres+goose, sessions, accounts) | **done** |
| 2 | Engine package over scrabble-solver | **done** |
| 3 | Game domain (lifecycle, rules, hint, word-check, history+GCG, stats) | **done** |
| 4 | Lobby & social (matchmaking, friends, block, chat, profile, nudge) | **done** |
| 5 | Robot opponent | **done** |
| 6 | Gateway edge (Connect/FB, platform auth, sessions, push bridge, admin) | todo |
| 7 | UI (plain Svelte + Vite, board, lobby, chat, i18n) | todo |
| 8 | Telegram integration (bot side-service, deep-link, push) | todo |
| 9 | Admin & dictionary ops (complaint review, version reload) | todo |
| 10 | Account linking & merge | todo |
| 11 | Polish (observability, perf with evidence, deploy) | todo |

Scaffolding is incremental: `go.work` lists only existing modules; each stage adds the modules it
needs.

## Stages

Each stage: read this plan + relevant docs, **interview the owner on the open details below**,
implement within scope, then update plan/docs/code and get CI green before marking done.

### Stage 0 — Scaffolding *(done)*

Scope: `go.work` (Go 1.26.3, `use ./backend`); minimal runnable `backend` (gin, zap, `/healthz`,
`/readyz`, env config); docs skeleton; `PLAN.md`; `CLAUDE.md`; `.gitea/workflows/go-unit.yaml`;
README; `.gitignore`.

Acceptance: `go build ./backend/...` + `go vet` + gofmt clean + `go test ./backend/...` green; CI
green on push.

### Stage 1 — Backend foundation

Scope: config/server route groups (`/api/v1/{public,user,internal,admin}`, probes), Postgres (pgx)
+ embedded goose migrations + schema `backend`, telemetry (OTel) wiring, in-memory cache
scaffolding, thin sessions + accounts + platform identities.

Open details: Postgres version + DSN/`search_path` convention; jet vs sqlc/sqlx (default jet);
migration naming; exact session-token shape (opaque random length, TTL, revocation);
account/identity table shape; whether the admin bootstrap lands here or in Stage 9.

### Stage 2 — Engine package

Scope: `backend/internal/engine` over scrabble-solver — versioned DAWG load/registry,
GenerateMoves/ValidatePlay/ScorePlay wrappers, bag/rack, the **dictionary-independent** game-state
model + decode helpers. Add `replace scrabble-solver => ../scrabble-solver` to `go.work` here and
solve the CI sibling-checkout (clone `gitea.iliadenisov.ru/.../scrabble-solver`).

Open details: how CI obtains the solver (clone sibling vs publish/tag the solver module);
in-memory game-state representation; how blanks and exchanges are modelled; Эрудит specifics to
verify against the solver.

### Stage 3 — Game domain

Scope: create/join, turn order, submit play/pass/exchange/resign, validate-at-submit, scoring,
end-conditions, 24h timeout/auto-resign, hint, word-check + complaint capture, structured history
+ GCG writer, stats on finish.

Open details: GCG dialect details (blanks, exchanges, notation); exact stats edge cases;
turn-timeout scheduler mechanism (cron vs per-game timer); complaint payload shape.

### Stage 4 — Lobby & social

Scope: matchmaking pool, friends, block, per-game chat, profile + email confirm-code, nudge.

Open details: pool fairness/keying confirmation; deep-link format per platform; chat length limit
+ retention; friend-request lifecycle; email-code provider (SMTP relay choice).

### Stage 5 — Robot opponent

Scope: human-like player — balance ~0.40, margin targeting, skewed [2,90]min timing + sleep +
nudge logic, friend/DM blocking, name pool.

Open details: exact delay distribution + parameters; margin band; name pool source; how the
scheduler drives robot moves; metrics for tuning balance.

### Stage 6 — Gateway edge

Scope: Connect/gRPC-Web (h2c), Telegram initData validation → session → `X-User-ID`, in-memory
rate-limit, admin Basic-Auth passthrough, FlatBuffers transcoding, in-app push stream bridging
backend `push` gRPC stream, email + ephemeral-guest paths.

Open details: FlatBuffers schema layout + message_type catalog; rate-limit classes/limits; admin
surface routing; session cache shape at the gateway.

### Stage 7 — UI

Scope: plain Svelte + Vite static; Connect-web + FlatBuffers client; lobby (my games, profile
tabs); board (HTML5/CSS grid, drag-n-drop, no assets); chat; hint/word-check; in-app stream; i18n
en/ru; in-memory session (+IndexedDB if available); Capacitor-ready structure.

Open details: detailed game-board UX (deferred by the owner to this stage); client routing;
offline/refresh behaviour; design system / theming.

### Stage 8 — Telegram integration

Scope: bot side-service, deep-link invites, platform push (your-turn / nudge), Mini App
launch/auth; backend↔platform internal API.

Open details: bot framework/library; deep-link scheme; push message templates; internal API
contract; Mini App hosting/origin.

### Stage 9 — Admin & dictionary ops

Scope: admin endpoints (users, games, complaint review queue, dictionary versions + reload),
complaint→dictionary update pipeline.

Open details: whether a server-rendered console is wanted or JSON-only; the dictionary
rebuild/deploy pipeline; complaint resolution workflow.

### Stage 10 — Account linking & merge

Scope: link-via-confirm; merge-into-A (stats sum, transfer games/friends, dedupe). High
blast-radius — focused regression tests.

Open details: conflict resolution (active games on both, duplicate friends, display-name
collisions); irreversibility/audit; confirm-flow per platform.

### Stage 11 — Polish

Scope: observability dashboards, evidence-based performance work, prod build/deploy.

Open details: deployment target/host; dashboards; load expectations.

## Refinements logged during implementation

- **Stage 0**: solver `replace` deferred to Stage 2 (nothing imports it yet; adding the path now
  would break CI, which checks out only this repo). Docker / compose deferred to a stage that has
  something to deploy. Trunk is `master` (owner preference); `feature/*` + PR from Stage 1; the
  genesis commit lands on `master` by necessity.
- **Stage 1** (interview + implementation):
  - Query layer: **go-jet** over `database/sql` (pgx stdlib) + otelsql; a `cmd/jetgen` tool
    regenerates the **committed** code from a throwaway container. Postgres **17** pinned for
    jetgen, tests and prod.
  - Sessions: opaque token stored only as a **SHA-256 hash** (kept as hex `text`, not `bytea` —
    avoids jet bytea-literal friction), **revoke-only** (no TTL); revocation-audit table deferred.
    Backend keeps a warmed write-through session cache that gates `/readyz`.
  - Data model: **UUIDv7** PKs; one unified `identities` table (`kind ∈ telegram|email`, widen to
    `vk`/`max` later); no soft-delete / actor-audit columns yet.
  - HTTP surface: **service/store/cache layer only**. `/api/v1/{public,user,internal,admin}`
    groups + `X-User-ID` middleware are scaffolding (exposed via `Server` group accessors); the
    session/account REST handlers land with the gateway in **Stage 6**. Admin bootstrap deferred
    to **Stage 9**.
  - Telemetry: providers + request-timing middleware + otelsql; exporters `none` (default) /
    `stdout`; OTLP + dashboards deferred to **Stage 11**.
  - Tests/CI: integration tests behind the `integration` build tag in `backend/internal/inttest`
    + new `integration.yaml` (testcontainers, Ryuk off, serial), firing on push and PR. Backend
    now **hard-depends on Postgres at boot** (migrations at startup) — a deliberate contract
    change from Stage 0, documented in both READMEs. All code stays in the existing `backend`
    module under `internal/` (+ `cmd/jetgen`); `go.work` untouched.
- **Stage 2** (interview + implementation):
  - Scope: `internal/engine` is a self-contained **library** (registry, bag, `Game` state machine,
    decode/replay). No `config`/`main`/`server` wiring this stage — there is no consumer yet;
    wiring lands in **Stage 3**, mirroring Stage 1's deferred handlers.
  - **Pure rules engine** (interview): the engine owns the in-memory `Game`, pure transitions
    (play/pass/exchange/resign + draw) **and end-condition detection**, including the standard
    **end-game rack-adjustment scoring** — a deliberate slice of Stage 3's
    "scoring/end-conditions" that the pure-engine boundary implies. Stage 3 keeps scheduling, the
    24h timeout, persistence and GCG.
  - **Solver wiring**: `replace scrabble-solver => ../scrabble-solver` in `go.work`;
    `backend/go.mod` requires `scrabble-solver` (placeholder version, redirected by the replace)
    and `github.com/iliadenisov/dafsa` directly (for `dawg.Load`). CI clones the **public** solver
    repo at **master HEAD** anonymously into `../scrabble-solver` (no token); both Go workflows
    gained the step (the engine's untagged tests run under the integration workflow too) and set
    `BACKEND_DICT_DIR`.
  - **Dictionaries**: registry loads the committed DAWGs from a directory parameter;
    `dict_version` is an explicit string label; the latest version per variant is tracked. Smoke
    tests validate a known word per variant (English/Russian/Эрудит). **Эрудит is handled
    uniformly** — every real difference is already in `rules.Erudit()`; the move.go "single
    orientation per turn" note needs no special code (any single play is one-directional).
  - **Bag/blanks/exchange**: own deterministic `Bag` (Draw + Return) because `selfplay.Bag` cannot
    return tiles; exchange is legal only when the bag holds at least a rack and draws replacements
    before returning the swapped tiles. A blank is `Placement{Blank:true}` carrying its designated
    letter; the history keeps the concrete letter plus a blank flag (decoded via
    `Alphabet.Character` / `Decode`).
    `ReplayBoard` reuses `scrabble.Apply`, so no `internal/encoding` dependency.
  - **Deviation from the approved plan**: `docs/FUNCTIONAL.md` (+`_ru`) was left unchanged.
    Stage 2 adds no user-visible behaviour; the variant, per-game dictionary and
    dictionary-independent-history user stories already live in Stages 3–4, so a "light touch"
    here would have duplicated or pre-empted them.
- **Stage 3** (interview + implementation):
  - Scope, as in Stages 1–2: **domain service/store layer + engine wiring, no HTTP**
    (`internal/game`). The gateway↔backend REST surface lands in Stage 6; the only active driver
    this stage is a background turn-timeout sweeper started from `main`. The robot (Stage 5) will
    consume the same service API.
  - **Persistence = event-sourcing + warm cache** (interview): durable state is the `games` row
    plus an append-only decoded move journal (`game_moves`); the live position is an
    `engine.Game` kept in an in-memory cache with a ~24h idle TTL and rebuilt by replaying the
    journal on a miss (the seeded bag makes replay exact). Each game is serialised by a per-game
    mutex; a persistence failure evicts the live game so the next access rebuilds. §9 reworded
    from "stored structurally" to this model.
  - **Resign/timeout split** (interview): 2-player resign/timeout only this stage (the other
    player wins); multiplayer drop-out-and-continue + resigned-tiles disposition deferred to
    Stage 4. Per-game **turn-timeout duration** setting (5/10/15/30 min, 1/2/3/6/12/24 h; default
    24 h) and a per-user **away window** (`accounts.away_start/away_end`, default 00:00–07:00
    local, honoured by the sweeper with midnight-cross handling) added now; profile editing of
    the away window is Stage 4 and the robot's sleep (Stage 5) reuses it.
  - **Engine `Resign` fix** (interview, in `internal/engine`): the resigner keeps their
    accumulated score (no end-game rack adjustment) and never wins; `winner` excludes the
    resigner, so a two-player resign/timeout gives the win to the other player regardless of
    score. Timeout reuses `Resign`, so the game domain needs no winner override.
  - **Additive engine domain API**: `Direction`,
    `Game.SubmitPlay/SubmitExchange/EvaluatePlay/HintView/Hand`,
    `MoveRecord.{Dir,MainRow,MainCol}`, `Registry.Lookup`, `ParseVariant` — so `internal/game`
    never imports `scrabble-solver` (keeps the §5 single-importer invariant).
  - **Create = atomic with seats** (interview): `Create` seats all accounts and starts; lobby
    seat-filling is Stage 4. **Sweeper = periodic goroutine** (interview; default 60 s,
    `BACKEND_GAME_TIMEOUT_SWEEP_INTERVAL`).
  - **Hint = settings + wallet** (interview): per-game `hints_allowed` + `hints_per_player`, plus
    a profile wallet `accounts.hint_balance` (spent after the allowance; purchases later).
    Category defaults (random 1 / tournament 0 / friendly 1-or-0) are the caller's job
    (lobby/tournaments).
  - **Stats** (interview): `account_stats` with **`draws`** added beyond §9's wins/losses;
    `max_word_points` = best single **move** score; ties draw, resign/timeout is a loss, guests
    get no stats.
  - **Complaint** (interview): full payload with `game_id`; word-check is scoped to the game's
    pinned `(variant, dict_version)`. Stage 9 owns the resolution lifecycle, so the `status`
    column carries no value CHECK yet.
  - **GCG** (interview): standard Poslfit dialect (UTF-8, `#player`/`#lexicon` pragmas,
    `8G`/`H8` coordinates, lower-case blanks, `.` pass-throughs, `-TILES` exchange) plus `#note`
    lines for resign/timeout; derived from the journal, so dictionary-independent.
  - **Engine wiring + config**: `main` loads the registry (`engine.Open`, a hard boot dependency
    like migrations) and starts the sweeper.
    New config: `BACKEND_DICT_DIR` (required), `BACKEND_DICT_VERSION` (default `v1`),
    `BACKEND_GAME_TIMEOUT_SWEEP_INTERVAL` (60 s), `BACKEND_GAME_CACHE_TTL` (24 h). No CI change —
    both Go workflows already clone the solver sibling and export `BACKEND_DICT_DIR`. `accounts`
    gained `away_start`/`away_end`/`hint_balance` and the `account` package gained `SpendHint`
    (it owns its table).
- **Stage 4** (interview + implementation):
  - Scope, as in Stages 1–3: **domain service/store layer, no HTTP** — REST/stream is Stage 6.
    Chat and nudges are **persisted** now; live delivery (push / in-app stream) is Stage 6/8. New
    packages `internal/social` (friends, blocks, chat+nudge) and `internal/lobby` (matchmaking +
    invitations); profile editing and the email confirm-code extend `internal/account`. The
    services have no active driver this stage, so `main` builds them and hands them to the
    server, which exposes them via accessors (the Stage 1 scaffolding-accessor pattern) for the
    Stage 6 handlers.
  - **Friends** (interview): request → accept on a single `friendships` table; decline/cancel
    delete the pending row; **blocking severs** any friendship.
  - **Blocks** (interview): the existing global toggles **plus** a per-user `blocks` table; block
    effects are **mutual** (a block either way suppresses chat visibility and prevents
    requests/invitations between the pair).
  - **Friend games** (interview): invitation → accept; the game starts only when **all** invitees
    accept, any decline cancels it, and a pending invitation **lazily expires after 7 days**
    (checked on access — no new sweeper).
  - **Chat** (interview): ≤ **60 runes**, stored with the game forever, the sender **IP** kept
    for moderation (as `text`, following Stage 1's no-`bytea` precedent; the gateway forwards it
    in Stage 6), input **content-filtered** (links/emails/phone numbers incl. obfuscated forms)
    via `mvdan.cc/xurls/v2` plus a compact leet/separator normaliser and a ≥7-digit phone
    heuristic — the one new dependency.
    **Nudge is a chat message** (`kind='nudge'`), rate-limited to once per hour per game per
    sender.
  - **Matchmaking** (interview): an **in-memory** FIFO pool keyed by **variant** only (variant
    fixes the board language), pairing two humans (seat order randomised). The 10 s wait and
    **robot substitution are deferred to Stage 5**. The pool does **not** consult blocks
    (auto-match is anonymous) — a deliberate simplification of the plan's optional block-skip
    that also avoids a DB call under the pool lock.
  - **Email confirm-code** (interview): 6-digit code, 15-min TTL, ≤ 5 attempts, stored as a
    **SHA-256 hash**; a `Mailer` seam with an SMTP relay (`BACKEND_SMTP_*`) and a default **log
    mailer**. It binds an email to the current account; an email already confirmed by another
    account → `ErrEmailTaken` (**merge is Stage 10**); email-as-login is Stage 6 and reuses this
    mechanism.
  - **Multi-player drop-out** (interview; discharges the Stage 3 deferral): the engine's `Resign`
    now drops a seat and the rest **play on** while ≥ 2 are active, finishing (last-survivor
    wins) when one remains; `winner` excludes all resigned seats. A per-game **`dropout_tiles`**
    setting (`remove` default | `return`) governs the leaver's rack, which is **never revealed**
    to the others. Timeout reuses `Resign`, so a multi-player timeout drops one seat and play
    continues; `game.commit`/`timeoutGame` were already keyed on `g.Over()`, so they only needed
    the setting threaded through create/replay.
  - **Build/deps**: `go mod tidy` is not run — the bare-path `scrabble-solver` replace lives only
    in `go.work`, so `tidy`/`go get` cannot resolve it; the `xurls` dependency was added with
    `go mod edit -require` + `go mod download`, its checksums recorded in the committed
    **`go.work.sum`**. No CI workflow change (both Go workflows already clone the solver sibling
    and export `BACKEND_DICT_DIR`).
- **Stage 5** (interview + implementation):
  - Scope, as in Stages 1–4: **domain layer, no HTTP** — the robot consumes the public game API
    as an ordinary seated player (`internal/robot`), so only `internal/engine` still imports the
    solver. New: `engine.Candidates()` (decoded ranked plays) and a thin
    `game.Service.Candidates` + `RobotTurns` read.
  - **Account model** (interview): a pool of **durable accounts**, each a single `identities` row
    `kind='robot'` (migration `00004` widens the kind CHECK — a CHECK-only change, no jetgen). A
    curated ~16-name pool in code; `EnsurePool` provisions them idempotently at boot (a hard
    dependency, like the registry) with `block_chat`/`block_friend_requests` set, which is
    **all** the friend/DM blocking needs (no special-casing).
  - **Driver + state** (interview): a background sweeper goroutine (`robot.Service.Run`/`Drive`,
    mirroring the timeout sweeper); **every per-game and per-turn choice is derived
    deterministically from the game `seed`** (FNV-1a mix, restart-stable — not `hash/maphash`),
    so the robot keeps **no extra state**. `playToWin = mix(seed,"win")%100 < 40`; per-turn
    `delay`; sleep `drift`.
  - **Timing** (interview): per-move delay `2 + 88·u^k` minutes, `u~U(0,1)`, **k≈3.5 → median
    ~10 min**, clamped to [2,90]. A daytime nudge on the robot's turn pulls the move into a
    2–10 min reply window; the robot proactively nudges after **12 h** idle on the human's turn
    (reusing `social.Nudge`'s once-per-hour guard; `social.LastNudgeAt` added to detect the
    human's nudge).
  - **Sleep** (interview — resolves the §7-vs-`account.go` mismatch): the robot sleeps
    00:00–07:00 in the **opponent's timezone shifted by a per-game drift ∈ [−3,+3]h** (so its
    night overlaps the human's rather than running anti-phase), computed on the fly per game —
    **no profile mutation, no concurrency cap**. The `account.go` away-window comment was
    corrected accordingly.
  - **Margin** (interview): pick the candidate whose resulting margin (own+move−opp) is closest
    to **[1,30]** when playing to win / **[−30,−1]** when playing to lose, tie-broken toward the
    conservative edge; no legal play → exchange the full rack when the bag can refill it, else
    pass.
  - **Substitution** (interview): a matchmaker **reaper** (`Reap`/`RunReaper`) substitutes a
    pooled robot after a **10 s** wait (`BACKEND_LOBBY_ROBOT_WAIT`), `NewMatchmaker` now takes a
    `RobotProvider`. A waiter learns of a match — human pairing **or** substitution — through a
    new `Poll` + results map; production delivery is a **match-found notification** (session/in-app
    push + side-service), Stage 6/8 — noted in §10.
  - **Metrics** (interview, 1+2): robots are durable accounts, so `account_stats` is the
    authoritative, complete balance ground-truth (target ~40% robot wins); an OTel counter
    (`robot_games_finished_total`, exporter `none` today) and a structured log cover
    robot-finished games for live observation.
  - **Config**: `BACKEND_ROBOT_DRIVE_INTERVAL` (30 s), `BACKEND_LOBBY_ROBOT_WAIT` (10 s),
    `BACKEND_LOBBY_REAPER_INTERVAL` (1 s). No CI change (both Go workflows already clone the
    solver sibling and export `BACKEND_DICT_DIR`).

## Deferred TODOs (cross-stage)

- **TODO-1 — publish & version the solver.** Once `scrabble-solver` is stable, give it a real
  module URL and switch `backend` to a versioned dependency, dropping the `go.work` replace and
  the CI clone. Removes the floating `master` dependency accepted for now (Stage 2 interview).
- **TODO-2 — split the solver into engine vs dictionary generator + versioned dictionary
  artifacts.** Owner's idea, with the caveats agreed at the Stage 2 interview: the split is sound
  (build-time wordlist→DAWG vs runtime load have different lifecycles and shrink the runtime
  dependency surface), **but** the generator must pin the **same** `dafsa`/`alphabet` versions
  and alphabet definitions as the runtime engine or the on-disk format / letter indexing drifts
  and silently corrupts validation. For delivery prefer **Git LFS or an artifact store** (Gitea
  releases / OCI artifact / object storage) over a raw git submodule (the ~0.5–0.7 MB DAWGs are
  regenerated wholesale and bloat git history); pin by tag/hash for a reproducible startup set. A
  submodule/LFS pull is a **deploy-time** way to populate the directory, **not** the runtime
  dynamic-reload mechanism (Stage 9) — keep the `BACKEND_DICT_DIR` directory as the runtime
  contract: a new `.dawg` appears in it and is loaded with `dawg.Load`.