internal/game drives the engine over a single match and owns everything the
engine does not: event-sourced persistence (a games row + an append-only decoded
move journal, with the live engine.Game kept warm in a cache and rebuilt by
replay on a miss), the play/pass/exchange/resign transitions with
validate-at-submit scoring, an unlimited score/legality preview, the hint
(per-game allowance + profile wallet), the word-check tool with complaint
capture, per-player game state, history and GCG export (Poslfit dialect), and a
background turn-timeout sweeper that auto-resigns overdue turns honouring each
player's daily away window. Like Stages 1-2 it is a service/store layer with no
HTTP; the gateway surface lands in Stage 6.
Engine: additive decoded domain API (Direction, SubmitPlay/SubmitExchange/
EvaluatePlay/HintView/Hand, MoveRecord.{Dir,MainRow,MainCol}, Registry.Lookup,
ParseVariant) so internal/game never imports scrabble-solver; and a Resign fix
so the resigner keeps their score yet never wins (the other player wins a
two-player game). Timeout reuses Resign.
Persistence: migration 00002 adds games, game_players, game_moves, complaints,
account_stats and extends accounts with away_start/away_end/hint_balance; go-jet
regenerated. account gained SpendHint. Config adds BACKEND_DICT_DIR (required),
BACKEND_DICT_VERSION, BACKEND_GAME_TIMEOUT_SWEEP_INTERVAL, BACKEND_GAME_CACHE_TTL;
main loads the registry at boot (hard dependency) and starts the sweeper.
Tests: engine resign + decoded-API tests; game unit tests (GCG, away-window
boundaries, hint budget, cache, keyed mutex, payload); inttest integration
(lifecycle, replay equivalence, timeout sweep with away grace, resign stats,
hint policy, word-check/complaint, per-game-lock). Docs (PLAN, ARCHITECTURE,
FUNCTIONAL +_ru, TESTING, README) updated.
Scrabble Game — Architecture
Source of truth for the platform architecture, transport, security model and
cross-service contracts. User-visible behaviour per domain lives in
FUNCTIONAL.md; the staged build order lives in
../PLAN.md. This document always describes the current
design, not the history of how it was reached. Sections describing
not-yet-implemented components are marked (planned).
1. Overview
Three executables plus per-platform side-services:
- gateway (planned): the only public ingress. Performs anti-abuse (rate limiting), authenticates the player against the originating platform (or an email/guest session), resolves the internal user_id, and forwards authenticated traffic to backend with an X-User-ID header. Hosts an admin surface behind HTTP Basic Auth. Bridges live events from backend to the client.
- backend: internal-only service that owns every domain concern (identity/sessions, accounts and linking, lobby and matchmaking, the game runtime, the robot opponent, chat, notifications, statistics, history, and administration). Embeds the scrabble-solver engine as a library, in-process; there is no per-game container. The only network consumer of backend is gateway (plus platform side-services over an internal API).
- ui (planned): pure-HTML5 client (plain Svelte + Vite, static build). Talks to backend only through gateway. Embeddable in platform webviews; packageable to native (iOS/Android) via Capacitor.
- platform/<name> (planned): per-platform side-services (Telegram bot first) providing deep-link invites and platform-native push notifications. They talk to backend over an internal API.
flowchart LR
Client((Client / webview)) -- Connect-RPC + FlatBuffers (h2c) --> Gateway
Gateway -- REST/JSON, X-User-ID --> Backend
Backend -- gRPC server-stream (live events) --> Gateway
Gateway -- in-app stream --> Client
Backend -- pgx --> Postgres[(Postgres)]
Backend -. embeds .- Solver[[scrabble-solver library]]
Telegram[Telegram bot side-service] -- internal API --> Backend
The MVP runs gateway and backend as single-instance processes inside a
trusted network. No Redis is planned (anti-replay crypto was deliberately
dropped). Horizontal scaling is explicit future work.
2. Transport
- client ↔ gateway: Connect-RPC + FlatBuffers over HTTP/2 cleartext (h2c). Binary payloads, server-streaming for the in-app live channel, first-class JS clients (@connectrpc/connect-web + the flatbuffers npm package). The contract is kept minimal.
- gateway ↔ backend (sync): plain HTTP REST/JSON. The gateway injects X-User-ID for authenticated requests; backend never re-derives identity from the body.
- backend → gateway (live): a single gRPC server-stream carries live events (your-turn, opponent-moved, chat, nudge). The gateway bridges them to the client's in-app stream while the app is open. Out-of-app delivery uses platform-native push via the platform side-service.
3. Authentication & sessions
Platform-native, deliberately simple: no Ed25519 client keys, no per-request signing, no anti-replay crypto (these were considered and dropped — players arrive from a platform rather than completing a mandatory registration).
- The gateway validates the originating credential once (the platform's signed launch data, e.g. Telegram initData HMAC; an email-code login; or a guest bootstrap), then mints a thin opaque server session token (session_id).
- The client holds session_id in memory for the app session (browser/OS storage is optional and may be unavailable; losing it means re-login).
- The gateway caches session → user_id and injects X-User-ID. Session records live in backend, which stores only a SHA-256 hash of the opaque token (never the plaintext), keeps a warmed in-memory cache for fast resolution, and treats sessions as revoke-only: they have no TTL and live until explicitly revoked (status → revoked).
- Guest = ephemeral web session (no platform, no email): session-only, nothing persisted; restricted to auto-match, with no friends and no stats/history. Platform users are auto-provisioned durable accounts.
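The hash-only, revoke-only session handling described above can be sketched as follows. This is a minimal illustration, not the backend's actual code: the `sessionStore` type, its method names and the 32-byte token length are assumptions, and an in-memory map stands in for the sessions table and its warmed cache.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

// sessionStore is a hypothetical in-memory stand-in for the sessions table
// plus its cache. Entries are keyed by the SHA-256 hash of the opaque token,
// so the plaintext token is never stored.
type sessionStore struct {
	mu      sync.RWMutex
	byHash  map[string]string // token hash -> user_id
	revoked map[string]bool   // token hash -> revoked
}

var errUnknownSession = errors.New("unknown or revoked session")

func newSessionStore() *sessionStore {
	return &sessionStore{byHash: map[string]string{}, revoked: map[string]bool{}}
}

// hashToken returns the hex-encoded SHA-256 digest of the token.
func hashToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

// Mint creates a random opaque token, records only its hash, and returns the
// plaintext exactly once to the caller. The 32-byte length is an assumption.
func (s *sessionStore) Mint(userID string) (string, error) {
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	token := hex.EncodeToString(raw)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.byHash[hashToken(token)] = userID
	return token, nil
}

// Resolve maps a presented token to its user_id. There is no TTL check:
// a session is valid until it is explicitly revoked.
func (s *sessionStore) Resolve(token string) (string, error) {
	h := hashToken(token)
	s.mu.RLock()
	defer s.mu.RUnlock()
	userID, ok := s.byHash[h]
	if !ok || s.revoked[h] {
		return "", errUnknownSession
	}
	return userID, nil
}

// Revoke marks the session as revoked; the record itself is kept.
func (s *sessionStore) Revoke(token string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.revoked[hashToken(token)] = true
}

func main() {
	store := newSessionStore()
	token, err := store.Mint("user-1")
	if err != nil {
		panic(err)
	}
	userID, _ := store.Resolve(token)
	fmt.Println("resolved:", userID)
	store.Revoke(token)
	_, err = store.Resolve(token)
	fmt.Println("after revoke:", err)
}
```

Keying the map by the hash means a leaked copy of the store does not reveal usable tokens, which is the property the hash-only rule is after.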
4. Accounts, identities, linking & merge
- One internal account may carry several platform identities (telegram, vk, …) plus an optional email identity. First contact from a platform auto-provisions a durable account bound to that platform identity. Concretely, platform and email identities share one identities table keyed by a unique (kind, external_id); email is an identity with kind=email and a confirmed flag (the confirm-code flow lands later). Accounts and identities use application-generated UUIDv7 primary keys.
- Linking is initiated from an authenticated profile: choose a platform → complete that platform's web-auth confirm → attach the identity to the current account.
- Merge: if the identity being linked already has its own account with history, the two accounts are merged into the current one (the current account is primary): statistics are summed, games and friends are transferred, duplicates are de-duplicated, and the secondary account is retired. This is a high-blast-radius operation, so it is built as an isolated, well-tested stage.
5. Game engine integration (scrabble-solver)
backend embeds the solver library in-process behind internal/engine, the
only package that imports scrabble-solver (see CLAUDE.md for
the solver's public API and constraints). The engine is a self-contained rules
library — no persistence, transport or scheduling; the game domain drives it.
Key points:
- Variants at launch: English Scrabble, Russian Scrabble, Эрудит (engine.Variant, mapping to rules.English() / RussianScrabble() / Erudit()). Эрудит's specifics (non-doubling centre, ё with no tiles, 3 blanks, a 15-point bonus) live entirely in the solver ruleset, so the engine treats every variant uniformly.
- Dictionaries are committed DAWGs loaded with dawg.Load from the directory BACKEND_DICT_DIR; backend loads the engine.Registry at startup as a hard dependency (like migrations), so a missing dictionary fails the boot. The registry holds dictionaries in memory addressed by (variant, dict_version), tracking the latest version per variant, and answers the word-check tool through Registry.Lookup.
- Dictionary versioning: pin per game. A game records the dict_version it started on and finishes on that version; new games use the latest. Multiple versions may be resident at once. An admin reload (planned, Stage 9) registers a new version through Registry.Load; delivery is the DAWG file in the image / a volume mounted at the dictionary directory. (A future split of the solver into engine + dictionary generator with versioned artifacts is recorded in ../PLAN.md TODO-2.)
- Move generation/validation/scoring use Solver.GenerateMoves (ranked), Solver.ValidatePlay and Solver.ScorePlay; board mutation uses scrabble.Apply. The engine adds its own deterministic, seeded tile bag that can return tiles (an exchange needs this; the solver's self-play bag cannot).
- engine.Game is the in-memory match state and the pure rules engine: it deals racks, applies legal plays / passes / exchanges / resignations, refills from the bag, keeps the scores and whose turn it is, and detects the end of the game (empty bag with an empty rack, or six consecutive scoreless turns, applying the end-game rack-value adjustment, or a resignation). On a resignation the resigner keeps their accumulated score (no rack adjustment) and never wins: the win goes to the highest score among the remaining seats, unconditionally the other player in a two-player game. The engine exposes a decoded, solver-free API (SubmitPlay / SubmitExchange / EvaluatePlay / HintView / Hand) so internal/game drives it without importing the solver.
- The game domain (internal/game) owns everything the engine does not (persistence, turn scheduling, the configurable turn timeout / auto-resign, the hint budget, word-check complaints, history and GCG) and is the engine's only consumer. Timeout auto-resign reuses engine.Resign, recording the move as a timeout, so it inherits the resignation win/loss.
- History is dictionary-independent (§9.1): the engine emits decoded MoveRecords and reconstructs the board from them with engine.ReplayBoard (alphabet only, no dictionary).
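The pin-per-game dictionary rule can be illustrated with a small registry keyed by (variant, dict_version). This is a sketch under stated assumptions: `dictRegistry`, `load`, `latestVersion` and `lookup` are hypothetical names and shapes, not the real engine.Registry API; a plain word set stands in for a DAWG; and "latest" is taken to mean the most recently loaded version.

```go
package main

import (
	"errors"
	"fmt"
)

// dictKey addresses one resident dictionary by (variant, dict_version).
type dictKey struct {
	variant string
	version string
}

// wordSet is a hypothetical stand-in for a loaded DAWG.
type wordSet map[string]bool

// dictRegistry keeps several versions resident at once and remembers which
// version new games of each variant should pin.
type dictRegistry struct {
	dicts  map[dictKey]wordSet
	latest map[string]string // variant -> most recently loaded version
}

var errNoDict = errors.New("dictionary not resident")

func newRegistry() *dictRegistry {
	return &dictRegistry{dicts: map[dictKey]wordSet{}, latest: map[string]string{}}
}

// load registers a version and marks it as the latest for its variant
// (assumption: the most recent load wins).
func (r *dictRegistry) load(variant, version string, words []string) {
	set := make(wordSet, len(words))
	for _, w := range words {
		set[w] = true
	}
	r.dicts[dictKey{variant, version}] = set
	r.latest[variant] = version
}

// latestVersion returns the version a new game of this variant should pin.
func (r *dictRegistry) latestVersion(variant string) (string, error) {
	v, ok := r.latest[variant]
	if !ok {
		return "", errNoDict
	}
	return v, nil
}

// lookup answers a word check against one specific pinned version.
func (r *dictRegistry) lookup(variant, version, word string) (bool, error) {
	set, ok := r.dicts[dictKey{variant, version}]
	if !ok {
		return false, errNoDict
	}
	return set[word], nil
}

func main() {
	r := newRegistry()
	r.load("english", "v1", []string{"cat"})
	pinned, _ := r.latestVersion("english") // a game starts here and pins v1

	r.load("english", "v2", []string{"cat", "qi"}) // a later reload
	oldGame, _ := r.lookup("english", pinned, "qi")
	newest, _ := r.latestVersion("english")
	newGame, _ := r.lookup("english", newest, "qi")
	fmt.Println("pinned game accepts qi:", oldGame, "| new game accepts qi:", newGame)
}
```

Because lookups always carry the pinned version, a reload changes what new games see without altering the outcome of a game already in progress.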
6. Game rules
- Word legality: validate-at-submit. An illegal play is rejected by Solver.ValidatePlay; there is no challenge phase.
- End of game: the bag is empty and a player empties their rack, or 6 consecutive scoreless turns (passes/exchanges), or a resignation, or a missed turn. The per-game turn timeout is chosen at creation (5/10/15/30 min, 1/2/3/6/12/24 h; default 24 h); a turn not made within it becomes an automatic resignation, applied by a background sweeper. The sweeper honours each player's away window, a daily local-time sleep interval on the account (default 00:00–07:00, midnight-cross aware), so a player is never timed out while asleep.
- Players: auto-match is always 2 players; friend games are 2–4 players. backend owns turn order and the bag for any player count. A resignation or timeout in a two-player game ends it with the other player winning; richer multi-player drop-out (a leaver's seat skipped while the rest play on, with a per-game disposition of their tiles) is deferred to Stage 4, when friend games are formed.
- Hint: governed by two per-game settings (whether hints are allowed and the starting per-player allowance) plus a per-account hint wallet (hint_balance, spent after the allowance; top-ups are a later feature). A hint reveals the top-1 ranked move (GenerateMoves[0]). The lobby/tournament caller picks the per-game defaults (e.g. one in casual random games, none in tournaments).
- Word-check tool: unlimited dictionary lookups against the game's pinned dictionary; each result offers a complaint (complainant, game, variant, dict_version, word, the disputed result, an optional note) that lands in an admin review queue (admin side planned, Stage 9).
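The midnight-cross-aware away window mentioned in the end-of-game rule can be sketched as below. The half-open interval, the minutes-since-midnight representation, and the `shouldTimeOut` predicate are assumptions made for illustration; the document does not specify how the real sweeper computes its grace period.

```go
package main

import (
	"fmt"
	"time"
)

// awayWindow is a daily local-time interval in minutes since midnight.
// start > end means the window crosses midnight (for example 23:00-06:00).
// The interval is treated as half-open, [start, end), which is an assumption.
type awayWindow struct {
	start, end int
}

// contains reports whether the local wall-clock time t falls in the window.
func (w awayWindow) contains(t time.Time) bool {
	m := t.Hour()*60 + t.Minute()
	switch {
	case w.start == w.end:
		return false // empty window
	case w.start < w.end:
		return m >= w.start && m < w.end
	default: // crosses midnight
		return m >= w.start || m < w.end
	}
}

// shouldTimeOut is a hypothetical sweeper predicate: a turn is auto-resigned
// only when its deadline has passed and the player is not currently away.
func shouldTimeOut(nowLocal, deadlineLocal time.Time, w awayWindow) bool {
	return nowLocal.After(deadlineLocal) && !w.contains(nowLocal)
}

// at builds a fixed-date local time for the examples.
func at(hour, minute int) time.Time {
	return time.Date(2024, 1, 15, hour, minute, 0, 0, time.UTC)
}

func main() {
	def := awayWindow{start: 0, end: 7 * 60} // the default 00:00-07:00
	fmt.Println("03:00 asleep:", def.contains(at(3, 0)))
	fmt.Println("overdue at 03:00:", shouldTimeOut(at(3, 0), at(2, 0), def))
	fmt.Println("overdue at 07:00:", shouldTimeOut(at(7, 0), at(2, 0), def))
}
```

The times passed in are expected to be already converted to the player's local timezone; the sketch uses UTC only to keep the example deterministic.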
7. Robot opponent
Substitutes for a human in 2-player auto-match when the pool yields no human within 10 seconds. Designed to be indistinguishable from a person.
- Balance: at game start it decides once whether to play to win, with P(play-to-win) ≈ 0.40 (so the human wins ≈ 60%). Adaptive difficulty is post-MVP.
- Margin targeting: each turn it picks from GenerateMoves a move that keeps the resulting lead (when playing to win) or deficit (when playing to lose) small (≈ 1–20 points), rather than always the maximum.
- Timing: per-move delay sampled from a right-skewed distribution (short delays frequent), clamped to [2, 90] minutes; sleeps 00:00–07:00 in the opponent's profile timezone (fallback UTC); on a daytime nudge after 60 minutes idle it replies within 2–10 minutes; it proactively nudges the human after 12 hours idle.
- Blocks friend requests and direct messages; uses a human-like name pool.
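The balance and margin-targeting rules above might be sketched like this. The function names, the "closest to the 1–20 band" fallback and the first-candidate tie-break are assumptions; the document only states the target probability and the target margin.

```go
package main

import (
	"fmt"
	"math/rand"
)

// decidePlayToWin makes the once-per-game decision with P(win) ≈ 0.40.
func decidePlayToWin(r *rand.Rand) bool {
	return r.Float64() < 0.40
}

// distanceToBand returns 0 when v lies in [lo, hi], else how far outside it is.
func distanceToBand(v, lo, hi int) int {
	switch {
	case v < lo:
		return lo - v
	case v > hi:
		return v - hi
	default:
		return 0
	}
}

// pickMove returns the index of the candidate whose resulting lead (when
// playing to win) or deficit (when playing to lose) is closest to the
// 1..20 point band. scores holds the candidate move scores in ranked order;
// on ties the earlier (higher-ranked) candidate is kept. Returns -1 when
// there are no candidates.
func pickMove(scores []int, robotScore, humanScore int, playToWin bool) int {
	best, bestDist := -1, 0
	for i, s := range scores {
		margin := robotScore + s - humanScore // positive = robot leads
		if !playToWin {
			margin = -margin // measure the deficit instead
		}
		d := distanceToBand(margin, 1, 20)
		if best == -1 || d < bestDist {
			best, bestDist = i, d
		}
	}
	return best
}

func main() {
	r := rand.New(rand.NewSource(1))
	fmt.Println("play to win this game:", decidePlayToWin(r))

	scores := []int{60, 30, 12, 4} // hypothetical ranked candidates
	fmt.Println("to win, pick index:", pickMove(scores, 100, 110, true))
	fmt.Println("to lose, pick index:", pickMove(scores, 100, 110, false))
}
```

With the robot at 100 and the human at 110, the 30-point move yields a lead of exactly 20, so it is preferred over the 60-point maximum when playing to win.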
8. Lobby & social
- Matchmaking (detail planned): a FIFO pool keyed by (variant, language); 10 s with no human match → substitute the robot.
- Friends: add by friend list, internal ID, or platform deep-link.
- Block settings independently suppress in-game chat and friend requests.
- Chat: per-game, persisted, length-limited, suppressed by the block setting.
- Nudge: a player may nudge the opponent whose turn is awaited once per hour; the opponent receives a platform-native notification.
- Profile: preferred_language (en/ru), display name, linked platform accounts, email (confirm-code binding), timezone (drives robot sleep; default from platform/locale, user-editable), block toggles.
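Since matchmaking detail is still planned, the following is only one possible shape for the FIFO pool keyed by (variant, language) with the 10-second robot fallback. Every name here (`matchPool`, `join`, `expire`, `humanWait`) is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// humanWait is how long a player waits for a human before the robot steps in.
const humanWait = 10 * time.Second

// poolKey separates queues by variant and language.
type poolKey struct {
	variant, language string
}

// ticket is one waiting player.
type ticket struct {
	userID   string
	enqueued time.Time
}

// matchPool holds one FIFO queue per key. It is not safe for concurrent use;
// a real pool would need locking.
type matchPool struct {
	queues map[poolKey][]ticket
}

func newMatchPool() *matchPool {
	return &matchPool{queues: map[poolKey][]ticket{}}
}

// join pairs the caller with the longest-waiting player under the same key,
// or enqueues the caller when nobody is waiting.
func (p *matchPool) join(k poolKey, userID string, now time.Time) (string, bool) {
	q := p.queues[k]
	if len(q) > 0 {
		head := q[0]
		p.queues[k] = q[1:]
		return head.userID, true
	}
	p.queues[k] = append(q, ticket{userID: userID, enqueued: now})
	return "", false
}

// expire removes and returns every player who has waited at least humanWait,
// so the caller can substitute the robot for each of them.
func (p *matchPool) expire(now time.Time) []string {
	var out []string
	for k, q := range p.queues {
		n := 0
		for n < len(q) && now.Sub(q[n].enqueued) >= humanWait {
			out = append(out, q[n].userID)
			n++
		}
		p.queues[k] = q[n:]
	}
	return out
}

// secs builds a timestamp n seconds after the Unix epoch for the examples.
func secs(n int) time.Time {
	return time.Unix(int64(n), 0)
}

func main() {
	p := newMatchPool()
	en := poolKey{variant: "english", language: "en"}
	p.join(en, "alice", secs(0))
	opponent, ok := p.join(en, "bob", secs(3))
	fmt.Println("bob matched:", ok, "with", opponent)
	p.join(en, "carol", secs(4))
	fmt.Println("robot substitutes for:", p.expire(secs(14)))
}
```

Because each queue is ordered by arrival, expiry only needs to scan from the head until it finds the first ticket that is still within the wait limit.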
9. Persistence
- Single Postgres database, schema backend; backend is the only writer. The "pgx pool" is a database/sql handle backed by the pgx stdlib driver and instrumented with otelsql; type-safe queries use go-jet (code generated into internal/postgres/jet and committed, regenerated by cmd/jetgen). Migrations are embedded SQL applied with pressly/goose/v3 at startup. Primary keys are application-generated UUIDv7.
- Tables: accounts (durable internal accounts; Stage 3 added the away-window columns away_start / away_end and the hint wallet hint_balance), identities (platform/email identities, unique (kind, external_id)), sessions (revoke-only opaque-token hashes), and the Stage 3 game tables games, game_players, game_moves (the move journal), complaints and account_stats.
- Active games are event-sourced. A game is a games row (pinned variant / dict_version, bag seed, the per-game settings, and a denormalised turn cursor) plus an append-only, decoded move journal (game_moves); the live position is an engine.Game held in an in-memory cache (≈24 h idle TTL) and rebuilt by replaying the journal on a miss, which the seeded bag makes exact. Each game is serialised by a per-game lock; a persistence failure evicts the live game so the next access rebuilds from the journal. game_players records each seat's account, running score, hints used and winner flag.
- Statistics (account_stats, recomputed on each finish, durable accounts only; guests never appear): wins, losses, draws, max points in a game, and max points for a single move (which already folds in every word the move formed plus the all-tiles bonus). A tie increments draws only; a resignation or timeout is a loss for the acting player.
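Why a seeded bag makes journal replay exact can be shown with a toy model. This sketch is heavily simplified and entirely hypothetical: one rack instead of several seats, no board, math/rand as the seeded source, and invented `bag`, `move` and `replay` names. It only demonstrates that the same seed plus the same journal reproduces the same rack and bag.

```go
package main

import (
	"fmt"
	"math/rand"
)

// bag is a seeded tile bag that can also take tiles back (for exchanges).
type bag struct {
	tiles []rune
	rng   *rand.Rand
}

func newBag(seed int64, letters string) *bag {
	b := &bag{tiles: []rune(letters), rng: rand.New(rand.NewSource(seed))}
	b.shuffle()
	return b
}

func (b *bag) shuffle() {
	b.rng.Shuffle(len(b.tiles), func(i, j int) {
		b.tiles[i], b.tiles[j] = b.tiles[j], b.tiles[i]
	})
}

// draw removes up to n tiles from the end of the bag.
func (b *bag) draw(n int) []rune {
	if n > len(b.tiles) {
		n = len(b.tiles)
	}
	out := append([]rune(nil), b.tiles[len(b.tiles)-n:]...)
	b.tiles = b.tiles[:len(b.tiles)-n]
	return out
}

// putBack returns tiles and reshuffles using the same seeded source.
func (b *bag) putBack(ts []rune) {
	b.tiles = append(b.tiles, ts...)
	b.shuffle()
}

// move is one decoded journal entry: kind is "play" or "exchange", and
// tiles are the concrete letters that left the rack.
type move struct {
	kind  string
	tiles []rune
}

type game struct {
	bag  *bag
	rack []rune
}

// removeTiles drops one occurrence of each used tile from the rack.
func removeTiles(rack, used []rune) []rune {
	out := append([]rune(nil), rack...)
	for _, u := range used {
		for i, r := range out {
			if r == u {
				out = append(out[:i], out[i+1:]...)
				break
			}
		}
	}
	return out
}

// apply executes one move: tiles leave the rack, replacements are drawn,
// and exchanged tiles go back into the bag afterwards.
func (g *game) apply(m move) {
	g.rack = removeTiles(g.rack, m.tiles)
	g.rack = append(g.rack, g.bag.draw(len(m.tiles))...)
	if m.kind == "exchange" {
		g.bag.putBack(m.tiles)
	}
}

// replay rebuilds a game from its seed and journal, as on a cache miss.
func replay(seed int64, letters string, journal []move) *game {
	g := &game{bag: newBag(seed, letters)}
	g.rack = g.bag.draw(7)
	for _, m := range journal {
		g.apply(m)
	}
	return g
}

func main() {
	const letters = "AAEEIIOOUUBCDLNRSTMP"
	live := replay(42, letters, nil)
	m := move{kind: "exchange", tiles: append([]rune(nil), live.rack[:3]...)}
	live.apply(m)

	rebuilt := replay(42, letters, []move{m})
	fmt.Println("racks equal:", string(rebuilt.rack) == string(live.rack))
	fmt.Println("bags equal:", string(rebuilt.bag.tiles) == string(live.bag.tiles))
}
```

The journal stores which tiles left the rack, not which tiles were drawn; the draws are recovered from the seed, which is why the random source must be consumed in exactly the same order during replay.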
9.1 History invariant (must hold forever)
Archived games must replay independently of any dictionary and of the
solver's internal encoding — at least visually. Therefore the move journal
persists only decoded concrete values: action kind (play / pass / exchange /
resign / timeout), acting player, per-move score and running total, timestamp,
and — in a per-move JSON payload — the acting player's rack before the move (with
? for a blank), and for a play its direction, main-word anchor, placed tiles
(letter as text, coordinate, blank flag) and the words formed; for an exchange,
the swapped tiles. This is exactly what is needed both to replay the game
through the engine (a cache miss) and to render history or emit GCG without a
dictionary: the board for visual replay is reconstructed by applying placements
onto an empty grid, since moves were validated at play time and scores are
stored. variant and dict_version are kept as metadata only (audit,
complaint review), never as a replay dependency. GCG export is derived from
the same rows and is likewise self-contained — we ship our own writer (the solver
exposes none): the standard Poslfit dialect (UTF-8, #player/#lexicon
pragmas, 8G/H8 coordinates, lower-case blanks, . pass-throughs, -TILES
exchanges), plus #note lines for resignations and timeouts, which the standard
does not cover.
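Two pieces of the invariant lend themselves to a short sketch: rebuilding the board from decoded placements alone, and the 8G/H8 coordinate convention (row number first for an across play, column letter first for a down play). The `placement` type and both function signatures are assumptions for illustration, not the real engine.ReplayBoard or GCG writer.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode"
)

// placement is one decoded tile from a move's journal payload.
type placement struct {
	row, col int  // zero-based board coordinates
	letter   rune // the letter as text, never a solver-internal code
	blank    bool // true when a blank tile stands for the letter
}

// replayBoard applies placements onto an empty grid. No dictionary is needed:
// the moves were validated when they were played. Blanks are rendered in
// lower case, matching the GCG convention.
func replayBoard(size int, plays [][]placement) [][]rune {
	board := make([][]rune, size)
	for r := range board {
		board[r] = []rune(strings.Repeat(".", size))
	}
	for _, play := range plays {
		for _, t := range play {
			ch := unicode.ToUpper(t.letter)
			if t.blank {
				ch = unicode.ToLower(t.letter)
			}
			board[t.row][t.col] = ch
		}
	}
	return board
}

// gcgCoord formats a zero-based main-word anchor as a GCG coordinate:
// "8G" for an across play, "H8" for a down play.
func gcgCoord(row, col int, vertical bool) string {
	letter := string(rune('A' + col))
	number := strconv.Itoa(row + 1)
	if vertical {
		return letter + number
	}
	return number + letter
}

func main() {
	plays := [][]placement{{
		{row: 7, col: 6, letter: 'c'},
		{row: 7, col: 7, letter: 'a', blank: true},
		{row: 7, col: 8, letter: 't'},
	}}
	board := replayBoard(15, plays)
	fmt.Println(string(board[7]))
	fmt.Println(gcgCoord(7, 6, false), gcgCoord(7, 7, true))
}
```

Both functions read only decoded values, so they keep working even if the dictionary or the solver's internal letter encoding changes later.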
10. Notifications
Two channels: platform-native push (out-of-app, via the platform side-service — your-turn, nudge) and the in-app live stream (chat, opponent-moved, while the app is open). Backend emits notification intents; delivery fans out to the appropriate channel.
11. Observability
- Structured logging with go.uber.org/zap (JSON). OpenTelemetry tracer and meter providers are wired (Stage 1), env-gated by BACKEND_OTEL_{TRACES,METRICS}_EXPORTER with a default of none (so no collector is required locally or in CI); stdout is available for debugging and the Postgres pool is instrumented with otelsql. OTLP export, a Prometheus pull endpoint, and dashboards arrive with the first real workload.
- Per-request server-side timing via gin middleware from day one (the access log carries method, route, status, latency and the active trace id). A client-measured RTT piggybacked on the next request is a later enhancement.
- Unauthenticated GET /healthz (liveness) and GET /readyz (readiness: the database answers a bounded ping and the session cache is warmed).
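The liveness/readiness split can be sketched with the standard library alone. The real service uses gin; net/http is swapped in here to keep the example self-contained, and the `pinger` type, the one-second ping bound and the 503 status on failure are assumptions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// pinger is a hypothetical stand-in for a database ping.
type pinger func(ctx context.Context) error

var errDown = errors.New("database unreachable")

func pingOK(context.Context) error   { return nil }
func pingDown(context.Context) error { return errDown }

// always returns a cache-warmed probe with a fixed answer.
func always(b bool) func() bool { return func() bool { return b } }

// newHealthMux wires /healthz (process is alive) and /readyz (dependencies
// are usable: the database answers a bounded ping and the cache is warm).
func newHealthMux(ping pinger, cacheWarm func() bool) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		// Bound the ping so a hung database cannot hang the probe.
		ctx, cancel := context.WithTimeout(r.Context(), time.Second)
		defer cancel()
		if err := ping(ctx); err != nil || !cacheWarm() {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

// status issues an in-process GET and returns the response code.
func status(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	degraded := newHealthMux(pingDown, always(true))
	fmt.Println("healthz:", status(degraded, "/healthz"))
	fmt.Println("readyz:", status(degraded, "/readyz"))
}
```

Keeping liveness independent of the database means an outage marks the instance as not ready without causing an orchestrator to restart a process that is otherwise healthy.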
12. Security boundaries
| Concern | Enforced by |
|---|---|
| Public rate limiting / anti-abuse | gateway |
| Platform credential validation, session minting | gateway |
| Session → user_id resolution, X-User-ID injection | gateway |
| Authorisation, ownership, state transitions | backend (X-User-ID is the sole identity input) |
| Admin authentication | gateway Basic Auth → backend admin endpoints |
| backend ↔ gateway trust | the network (only gateway may reach backend) |
This is an explicit, accepted MVP risk: compromise of the gateway↔backend network segment defeats backend authentication. Mitigated by network isolation; mutual auth is a future hardening step.
13. Deployment (informational)
Single public origin, path-routed: the UI, the gateway public surface and the
admin surface share one host that terminates TLS. MVP runs one gateway, one
backend, one Postgres. Docker/compose environments are introduced when there
is something to deploy.
14. CI & branches
- Trunk is master; feature work happens on feature/* branches merged via PR with a green CI gate (from Stage 1 onward; the genesis commit necessarily lands on master).
- .gitea/workflows/ holds the CI. go-unit.yaml runs gofmt/vet/build/unit-test on Go changes; integration.yaml runs the Postgres-backed tests behind the integration build tag (testcontainers postgres:17-alpine, Ryuk disabled, serial). Further workflows (ui-test, deploy) are added with the components they cover.
- Since Stage 2 both Go workflows clone the public scrabble-solver sibling (master HEAD, no credentials) into ../scrabble-solver before building, so the go.work replace resolves; the engine tests read the committed DAWGs from that checkout via BACKEND_DICT_DIR.
- After any push, the run is watched to green before a stage is declared done (python3 ~/.claude/bin/gitea-ci-watch.py).