scrabble-game/docs/ARCHITECTURE.md
Ilia Denisov 6d0dd4fb14
Stage 2: engine package over scrabble-solver (registry, bag, Game, replay)
backend/internal/engine wraps the sibling scrabble-solver library in-process:

- Registry: versioned DAWG load via dafsa.Load, keyed by (variant, dict_version),
  latest-per-variant; English / Russian / Эрудит handled uniformly.
- Bag: own deterministic, seeded tile bag with Draw + Return (for exchanges),
  since the solver's self-play bag cannot return tiles.
- Game: pure rules engine — deal, play/pass/exchange/resign, refill, per-move
  scoring, turn order, and end-condition detection (empty bag + empty rack, six
  scoreless turns, resignation) with end-game rack adjustment.
- decode/ReplayBoard: dictionary-independent MoveRecords and board replay via
  scrabble.Apply (no internal/encoding), realising ARCHITECTURE §9.1.

Wiring: go.work gains "replace scrabble-solver => ../scrabble-solver"; backend
requires scrabble-solver (placeholder) and github.com/iliadenisov/dafsa directly.
Both Go CI workflows clone the public solver sibling (master HEAD, no token) and
set BACKEND_DICT_DIR.

Docs: ARCHITECTURE §5/§14, TESTING engine layer, backend README, and PLAN
refinements + deferred TODOs (publish/version solver; split engine vs dictionary
generator).
2026-06-02 15:10:08 +02:00

Scrabble Game — Architecture

Source of truth for the platform architecture, transport, security model and cross-service contracts. User-visible behaviour per domain lives in FUNCTIONAL.md; the staged build order lives in ../PLAN.md. This document always describes the current design, not the history of how it was reached. Sections describing not-yet-implemented components are marked (planned).

1. Overview

Three executables plus per-platform side-services:

  • gateway (planned) — the only public ingress. Performs anti-abuse (rate limiting), authenticates the player against the originating platform (or an email/guest session), resolves the internal user_id, and forwards authenticated traffic to backend with an X-User-ID header. Hosts an admin surface behind HTTP Basic Auth. Bridges live events from backend to the client.
  • backend — internal-only service that owns every domain concern: identity/sessions, accounts and linking, lobby and matchmaking, the game runtime, the robot opponent, chat, notifications, statistics, history, and administration. Embeds the scrabble-solver engine as a library, in-process — there is no per-game container. The only network consumer of backend is gateway (plus platform side-services over an internal API).
  • ui (planned) — pure-HTML5 client (plain Svelte + Vite, static build). Talks to backend only through gateway. Embeddable in platform webviews; packageable to native (iOS/Android) via Capacitor.
  • platform/<name> (planned) — per-platform side-services (Telegram bot first): deep-link invites and platform-native push notifications. They talk to backend over an internal API.

```mermaid
flowchart LR
  Client((Client / webview)) -- Connect-RPC + FlatBuffers (h2c) --> Gateway
  Gateway -- REST/JSON, X-User-ID --> Backend
  Backend -- gRPC server-stream (live events) --> Gateway
  Gateway -- in-app stream --> Client
  Backend -- pgx --> Postgres[(Postgres)]
  Backend -. embeds .- Solver[[scrabble-solver library]]
  Telegram[Telegram bot side-service] -- internal API --> Backend
```

The MVP runs gateway and backend as single-instance processes inside a trusted network. No Redis is planned (anti-replay crypto was deliberately dropped). Horizontal scaling is explicit future work.

2. Transport

  • client ↔ gateway: Connect-RPC + FlatBuffers over HTTP/2 cleartext (h2c). Binary payloads, server-streaming for the in-app live channel, first-class JS clients (@connectrpc/connect-web + the flatbuffers npm package). The contract is kept minimal.
  • gateway ↔ backend (sync): plain HTTP REST/JSON. The gateway injects X-User-ID for authenticated requests; backend never re-derives identity from the body.
  • backend → gateway (live): a single gRPC server-stream carries live events (your-turn, opponent-moved, chat, nudge). The gateway bridges them to the client's in-app stream while the app is open. Out-of-app delivery uses platform-native push via the platform side-service.
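
The gateway-to-backend identity hand-off can be sketched as follows. This is a minimal illustration using only net/http, not the backend's actual middleware (which §11 says runs on gin); the names userIDFromHeader, requireUserID and userIDFromContext are hypothetical.

```go
package main

import (
	"context"
	"net/http"
)

// userIDKey is an unexported context key type, so no other package can collide with it.
type userIDKey struct{}

// userIDFromHeader reads the identity injected by the gateway.
// The header is the only identity input; the request body is never consulted.
func userIDFromHeader(h http.Header) (string, bool) {
	id := h.Get("X-User-ID")
	return id, id != ""
}

// requireUserID (hypothetical name) rejects requests that lack X-User-ID and
// stores the identity in the request context for downstream handlers.
func requireUserID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id, ok := userIDFromHeader(r.Header)
		if !ok {
			http.Error(w, "missing X-User-ID", http.StatusUnauthorized)
			return
		}
		ctx := context.WithValue(r.Context(), userIDKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// userIDFromContext returns the identity stored by requireUserID, if present.
func userIDFromContext(ctx context.Context) (string, bool) {
	id, ok := ctx.Value(userIDKey{}).(string)
	return id, ok
}
```

Handlers wrapped with requireUserID would call userIDFromContext instead of parsing identity out of the JSON body.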

3. Authentication & sessions

Platform-native, deliberately simple: no Ed25519 client keys, no per-request signing, no anti-replay crypto (these were considered and dropped — players arrive from a platform rather than completing a mandatory registration).

  • The gateway validates the originating credential once — the platform's signed launch data (e.g. Telegram initData HMAC), an email-code login, or a guest bootstrap — then mints a thin opaque server session token (session_id).
  • The client holds session_id in memory for the app session (browser/OS storage is optional and may be unavailable; losing it means re-login).
  • The gateway caches session → user_id and injects X-User-ID. Session records live in backend, which stores only a SHA-256 hash of the opaque token (never the plaintext), keeps a warmed in-memory cache for fast resolution, and treats sessions as revoke-only: they have no TTL and live until explicitly revoked (status = revoked).
  • Guest = ephemeral web session (no platform, no email): session-only, nothing persisted; restricted to auto-match, with no friends and no stats/history. Platform users are auto-provisioned durable accounts.
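
The hash-only, revoke-only session model above could look roughly like this. It is a sketch under assumptions: SessionStore is a hypothetical in-memory stand-in for the sessions table plus its warmed cache, and the 32-byte hex token format is illustrative rather than taken from the codebase.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"sync"
)

// hashToken returns the hex-encoded SHA-256 of an opaque session token.
func hashToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

// SessionStore (hypothetical) maps token hashes to user IDs.
// The plaintext token is never stored.
type SessionStore struct {
	mu     sync.RWMutex
	byHash map[string]string // token hash -> user_id
}

func NewSessionStore() *SessionStore {
	return &SessionStore{byHash: map[string]string{}}
}

// Mint creates a random opaque token, records only its hash, and returns
// the plaintext to the caller exactly once.
func (s *SessionStore) Mint(userID string) (string, error) {
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	token := hex.EncodeToString(raw)
	s.mu.Lock()
	s.byHash[hashToken(token)] = userID
	s.mu.Unlock()
	return token, nil
}

// Resolve looks a session up by hashing the presented token.
func (s *SessionStore) Resolve(token string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	id, ok := s.byHash[hashToken(token)]
	return id, ok
}

// Revoke is the only way a session ends; there is no TTL.
func (s *SessionStore) Revoke(token string) {
	s.mu.Lock()
	delete(s.byHash, hashToken(token))
	s.mu.Unlock()
}
```

A persistent implementation would mark the row as revoked rather than delete it; the map deletion here only keeps the sketch short.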

4. Accounts, identities, linking & merge

  • One internal account may carry several platform identities (telegram, vk, …) plus an optional email identity. First contact from a platform auto-provisions a durable account bound to that platform identity. Concretely, platform and email identities share one identities table keyed by a unique (kind, external_id); email is an identity with kind=email and a confirmed flag (the confirm-code flow lands later). Accounts and identities use application-generated UUIDv7 primary keys.
  • Linking is initiated from an authenticated profile: choose a platform → complete that platform's web-auth confirm → attach the identity to the current account.
  • Merge: if the identity being linked already has its own account with history, that account is merged into the current one (the current account is primary): statistics are summed, games and friends are transferred, duplicates are de-duplicated, and the secondary account is retired. This is a high blast-radius operation, so it is built as an isolated, well-tested stage.
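
A rough in-memory sketch of the merge step follows. The Account and Stats types and the mergeInto function are hypothetical, game transfer is omitted, and one detail is an assumption beyond the text: counters (wins, losses) are summed while the "max" statistics take the larger value, since summing a maximum would not be meaningful.

```go
package main

// Stats mirrors the statistics listed in §9 (field names are illustrative).
type Stats struct {
	Wins, Losses, MaxGamePoints, MaxWordPoints int
}

// Account is a hypothetical in-memory view of an account.
type Account struct {
	ID      string
	Stats   Stats
	Friends []string
	Retired bool
}

func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// mergeInto folds secondary into primary and retires secondary.
func mergeInto(primary, secondary *Account) {
	// Counters are summed; maxima keep the larger value (assumption).
	primary.Stats.Wins += secondary.Stats.Wins
	primary.Stats.Losses += secondary.Stats.Losses
	primary.Stats.MaxGamePoints = maxInt(primary.Stats.MaxGamePoints, secondary.Stats.MaxGamePoints)
	primary.Stats.MaxWordPoints = maxInt(primary.Stats.MaxWordPoints, secondary.Stats.MaxWordPoints)

	// Transfer friends, dropping duplicates and the two merged accounts themselves.
	seen := map[string]bool{primary.ID: true, secondary.ID: true}
	merged := make([]string, 0, len(primary.Friends)+len(secondary.Friends))
	for _, list := range [][]string{primary.Friends, secondary.Friends} {
		for _, f := range list {
			if seen[f] {
				continue
			}
			seen[f] = true
			merged = append(merged, f)
		}
	}
	primary.Friends = merged

	secondary.Friends = nil
	secondary.Retired = true
}
```

In the real service this would run inside a single database transaction, which is part of why the stage is isolated.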

5. Game engine integration (scrabble-solver)

backend embeds the solver library in-process behind internal/engine, the only package that imports scrabble-solver (see CLAUDE.md for the solver's public API and constraints). The engine is a self-contained rules library — no persistence, transport or scheduling; the game domain drives it. Key points:

  • Variants at launch: English Scrabble, Russian Scrabble, Эрудит (engine.Variant, mapping to rules.English() / RussianScrabble() / Erudit()). Эрудит's specifics (non-doubling centre, ё with no tiles, 3 blanks, a 15-point bonus) live entirely in the solver ruleset, so the engine treats every variant uniformly.
  • Dictionaries are committed DAWGs loaded with dafsa.Load from a directory (a parameter today; a configurable BACKEND_DICT_DIR is wired when the first consumer needs it). The engine.Registry holds them in memory addressed by (variant, dict_version), tracking the latest version per variant.
  • Dictionary versioning — pin per game. A game records the dict_version it started on and finishes on that version; new games use the latest. Multiple versions may be resident at once. An admin reload (planned, Stage 9) registers a new version through Registry.Load; delivery is the DAWG file in the image / a volume mounted at the dictionary directory. (A future split of the solver into engine + dictionary generator with versioned artifacts is recorded in ../PLAN.md TODO-2.)
  • Move generation/validation/scoring use Solver.GenerateMoves (ranked), Solver.ValidatePlay and Solver.ScorePlay; board mutation uses scrabble.Apply. The engine adds its own deterministic, seeded tile bag that can return tiles (an exchange needs this; the solver's self-play bag cannot).
  • engine.Game is the in-memory match state and the pure rules engine: it deals racks, applies legal plays / passes / exchanges / resignations, refills from the bag, keeps the scores and whose turn it is, and detects the end of the game — empty bag with an empty rack, six consecutive scoreless turns, or a resignation — applying the end-game rack-value adjustment. The 24-hour timeout / auto-resign, turn scheduling and persistence belong to the game domain (Stage 3).
  • History is dictionary-independent (§9.1): the engine emits decoded MoveRecords and reconstructs the board from them with engine.ReplayBoard (alphabet only, no dictionary).
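
The pin-per-game versioning described above can be illustrated with a small registry. This is a sketch, not engine.Registry itself: the Dict type stands in for a loaded DAWG, the method names are hypothetical, and dict_version is assumed to be an ordered integer so that "latest" is well defined.

```go
package main

import (
	"fmt"
	"sync"
)

// Variant names a ruleset, e.g. "english" (illustrative values).
type Variant string

// Dict is a placeholder for a loaded dictionary.
type Dict struct{ Words int }

type dictKey struct {
	variant Variant
	version int
}

// Registry keeps every loaded version resident and tracks the newest per variant.
type Registry struct {
	mu     sync.RWMutex
	dicts  map[dictKey]*Dict
	latest map[Variant]int
}

func NewRegistry() *Registry {
	return &Registry{dicts: map[dictKey]*Dict{}, latest: map[Variant]int{}}
}

// Register adds (or replaces) a version and advances "latest" only if it is newer.
func (r *Registry) Register(v Variant, version int, d *Dict) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.dicts[dictKey{variant: v, version: version}] = d
	if cur, ok := r.latest[v]; !ok || version > cur {
		r.latest[v] = version
	}
}

// Get returns the exact version a running game was pinned to.
func (r *Registry) Get(v Variant, version int) (*Dict, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	d, ok := r.dicts[dictKey{variant: v, version: version}]
	if !ok {
		return nil, fmt.Errorf("no dictionary for %s version %d", v, version)
	}
	return d, nil
}

// Latest returns the version that new games should start on.
func (r *Registry) Latest(v Variant) (int, *Dict, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	version, ok := r.latest[v]
	if !ok {
		return 0, nil, fmt.Errorf("no dictionary for %s", v)
	}
	return version, r.dicts[dictKey{variant: v, version: version}], nil
}
```

A new game would call Latest once and store the returned version; every later lookup for that game goes through Get with the stored value, so registering a newer version never changes a game in progress.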

6. Game rules

  • Word legality: validate-at-submit. An illegal play is rejected by Solver.ValidatePlay; there is no challenge phase.
  • End of game: the bag is empty and a player empties their rack, or 6 consecutive scoreless turns (passes/exchanges). A move that is not made within the 24-hour turn timeout becomes an automatic resignation.
  • Players: auto-match is always 2 players; friend games are 2–4 players. backend owns turn order and the bag for any player count.
  • Hint: one per game; reveals the top-1 ranked move (GenerateMoves[0]).
  • Word-check tool: unlimited dictionary lookups; each result offers a complaint that lands in an admin review queue (admin side planned).
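
The end-of-game conditions listed above reduce to a small pure function. The sketch below is illustrative only: the Snapshot type, the EndReason values and endReason are hypothetical names, and a timeout is assumed to have already been converted into a resignation by the caller.

```go
package main

// EndReason says why a game ended; the empty value means it is still running.
type EndReason string

const (
	NotEnded       EndReason = ""
	OutOfTiles     EndReason = "out_of_tiles"
	ScorelessLimit EndReason = "scoreless_limit"
	Resignation    EndReason = "resignation"
)

// maxScorelessTurns is the number of consecutive passes/exchanges that ends a game.
const maxScorelessTurns = 6

// Snapshot holds just the state the end check needs (hypothetical shape).
type Snapshot struct {
	BagSize        int   // tiles left in the bag
	RackSizes      []int // tiles on each player's rack
	ScorelessTurns int   // consecutive scoreless turns so far
	Resigned       bool  // a player resigned (or timed out)
}

// endReason reports whether and why the game is over.
func endReason(s Snapshot) EndReason {
	if s.Resigned {
		return Resignation
	}
	if s.BagSize == 0 {
		for _, n := range s.RackSizes {
			if n == 0 {
				return OutOfTiles
			}
		}
	}
	if s.ScorelessTurns >= maxScorelessTurns {
		return ScorelessLimit
	}
	return NotEnded
}
```

An empty rack only ends the game once the bag is also empty; while tiles remain, the rack is simply refilled.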

7. Robot opponent

Substitutes for a human in 2-player auto-match when the pool yields no human within 10 seconds. Designed to be indistinguishable from a person.

  • Balance: at game start it decides once whether to play to win, with P(play-to-win) ≈ 0.40 (so the human wins ≈ 60%). Adaptive difficulty is post-MVP.
  • Margin targeting: each turn it picks from GenerateMoves a move that keeps the resulting lead (when playing to win) or deficit (when playing to lose) small (≈ 1–20 points), rather than always the maximum.
  • Timing: per-move delay sampled from a right-skewed distribution (short delays frequent), clamped to [2, 90] minutes; sleeps 00:00–07:00 in the opponent's profile timezone (fallback UTC); on a daytime nudge after 60 minutes idle it replies within 2–10 minutes; it proactively nudges the human after 12 hours idle.
  • Blocks friend requests and direct messages; uses a human-like name pool.
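
Margin targeting and delay sampling might be sketched as below. Everything here is an assumption for illustration: pickMove takes the desired score difference as a signed parameter (positive when playing to win, negative when playing to lose), and the exponential distribution is just one possible right-skewed choice; the document does not name the distribution actually used.

```go
package main

import (
	"math/rand"
	"time"
)

// Delay bounds from the timing rule above.
const (
	MinDelay = 2 * time.Minute
	MaxDelay = 90 * time.Minute
)

func abs(n int) int {
	if n < 0 {
		return -n
	}
	return n
}

// pickMove returns the index of the candidate whose resulting score
// difference (robot minus human) is closest to target, or -1 if there
// are no candidates. scores holds the points of each candidate move.
func pickMove(scores []int, robotTotal, humanTotal, target int) int {
	best, bestGap := -1, 0
	for i, s := range scores {
		gap := abs(robotTotal + s - humanTotal - target)
		if best == -1 || gap < bestGap {
			best, bestGap = i, gap
		}
	}
	return best
}

// clampDelay forces a delay into [MinDelay, MaxDelay].
func clampDelay(d time.Duration) time.Duration {
	if d < MinDelay {
		return MinDelay
	}
	if d > MaxDelay {
		return MaxDelay
	}
	return d
}

// sampleDelay draws an exponentially distributed delay (right-skewed,
// so short delays are frequent) around mean and clamps it.
func sampleDelay(rng *rand.Rand, mean time.Duration) time.Duration {
	return clampDelay(time.Duration(rng.ExpFloat64() * float64(mean)))
}

// newRNG builds a seeded generator so robot behaviour is reproducible in tests.
func newRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}
```

With candidate scores 5, 12, 30 and 48, totals of 100 (robot) and 110 (human) and a target lead of 10, pickMove selects the 12-point move, which leaves the robot 2 ahead, rather than the 48-point maximum.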

8. Lobby & social

  • Matchmaking (detail planned): a FIFO pool keyed by (variant, language); 10 s with no human match → substitute the robot.
  • Friends: add by friend list, internal ID, or platform deep-link.
  • Block settings independently suppress in-game chat and friend requests.
  • Chat: per-game, persisted, length-limited, suppressed by the block setting.
  • Nudge: a player may nudge the opponent whose turn is awaited once per hour; the opponent receives a platform-native notification.
  • Profile: preferred_language (en/ru), display name, linked platform accounts, email (confirm-code binding), timezone (drives robot sleep; default from platform/locale, user-editable), block toggles.
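
Since matchmaking detail is still planned, the following is only one possible shape for the FIFO pool with a robot fallback. Pool, PoolKey, Join and SubstituteRobot are hypothetical names, and the caller is assumed to track how long each player has waited.

```go
package main

import "time"

// RobotWait is how long a player waits for a human before the robot steps in.
const RobotWait = 10 * time.Second

// PoolKey separates queues by variant and language.
type PoolKey struct {
	Variant, Language string
}

// Pool holds one FIFO queue of waiting user IDs per key (not goroutine-safe).
type Pool struct {
	waiting map[PoolKey][]string
}

func NewPool() *Pool {
	return &Pool{waiting: map[PoolKey][]string{}}
}

// Join pairs userID with the longest-waiting player under the same key,
// or queues userID if nobody is waiting.
func (p *Pool) Join(k PoolKey, userID string) (opponent string, matched bool) {
	q := p.waiting[k]
	if len(q) > 0 {
		opponent, p.waiting[k] = q[0], q[1:]
		return opponent, true
	}
	p.waiting[k] = append(q, userID)
	return "", false
}

// SubstituteRobot removes userID from the queue and reports true once the
// player has waited at least RobotWait without a human match.
func (p *Pool) SubstituteRobot(k PoolKey, userID string, waited time.Duration) bool {
	if waited < RobotWait {
		return false
	}
	q := p.waiting[k]
	for i, id := range q {
		if id == userID {
			p.waiting[k] = append(q[:i], q[i+1:]...)
			return true
		}
	}
	return false
}
```

A production version would need locking and a timer per waiting player; both are left out to keep the queueing rule visible.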

9. Persistence

  • Single Postgres database, schema backend; backend is the only writer. The "pgx pool" is a database/sql handle backed by the pgx stdlib driver and instrumented with otelsql; type-safe queries use go-jet (code generated into internal/postgres/jet and committed, regenerated by cmd/jetgen). Migrations are embedded SQL applied with pressly/goose/v3 at startup. Primary keys are application-generated UUIDv7.
  • Stage 1 tables: accounts (durable internal accounts), identities (platform/email identities, unique (kind, external_id)) and sessions (revoke-only opaque-token hashes).
  • Active game state is stored structurally with the dict_version pinned.
  • Statistics (computed on finish): wins, losses, max points in a game, max points for a single word.

9.1 History invariant (must hold forever)

Archived games must replay independently of any dictionary and of the solver's internal encoding — at least visually. Therefore the move log persists only decoded concrete values: letters as text, coordinates, blank flag, action kind (play / pass / exchange / resign / timeout), acting player, per-move score and running total, timestamp. The board for visual replay is reconstructed by applying placements onto an empty grid; no dictionary is needed because moves were validated at play time and scores are stored. variant and dict_version are kept as metadata only (audit, complaint review), never as a replay dependency. GCG export is derived from the same rows and is likewise self-contained (we ship our own writer; the solver exposes no public GCG writer).
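
The replay rule can be sketched without any dictionary or solver dependency. The types below are hypothetical stand-ins for the persisted move rows (the real entry point is engine.ReplayBoard, whose signature is not shown here), and the 15×15 board size is an assumption.

```go
package main

import "fmt"

// boardSize is assumed to be 15 for every variant in this sketch.
const boardSize = 15

// Placement is one decoded tile: text letter, coordinates, blank flag.
type Placement struct {
	Row, Col int
	Letter   string
	Blank    bool
}

// MoveRecord is a decoded history row (illustrative subset of the fields).
type MoveRecord struct {
	Kind       string // "play", "pass", "exchange", "resign", "timeout"
	Player     int
	Placements []Placement
	Score      int
}

// Board stores the letter on each cell; "" means empty.
type Board [boardSize][boardSize]string

// replay applies the placements of every play onto an empty grid.
// Moves are not re-validated: they were validated at play time.
func replay(moves []MoveRecord) (Board, error) {
	var b Board
	for i, m := range moves {
		if m.Kind != "play" {
			continue // passes, exchanges and resignations do not touch the board
		}
		for _, p := range m.Placements {
			if p.Row < 0 || p.Row >= boardSize || p.Col < 0 || p.Col >= boardSize {
				return b, fmt.Errorf("move %d: placement (%d,%d) is off the board", i, p.Row, p.Col)
			}
			if b[p.Row][p.Col] != "" {
				return b, fmt.Errorf("move %d: cell (%d,%d) is already occupied", i, p.Row, p.Col)
			}
			b[p.Row][p.Col] = p.Letter
		}
	}
	return b, nil
}
```

Because letters are stored as text, the same function replays English and Russian games alike, and scores come straight from the stored rows.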

10. Notifications

Two channels: platform-native push (out-of-app, via the platform side-service — your-turn, nudge) and the in-app live stream (chat, opponent-moved, while the app is open). Backend emits notification intents; delivery fans out to the appropriate channel.

11. Observability

  • Structured logging with go.uber.org/zap (JSON). OpenTelemetry tracer and meter providers are wired (Stage 1), env-gated by BACKEND_OTEL_{TRACES,METRICS}_EXPORTER with a default of none (so no collector is required locally or in CI); stdout is available for debugging and the Postgres pool is instrumented with otelsql. OTLP export, a Prometheus pull endpoint, and dashboards arrive with the first real workload.
  • Per-request server-side timing via gin middleware from day one (the access log carries method, route, status, latency and the active trace id). A client-measured RTT piggybacked on the next request is a later enhancement.
  • Unauthenticated GET /healthz (liveness) and GET /readyz (readiness — the database answers a bounded ping and the session cache is warmed).
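
The liveness/readiness split could be sketched with plain net/http as follows. The backend itself uses gin, so this is an illustration of the rule rather than the real handlers; the ping and warmed callbacks and all function names are hypothetical.

```go
package main

import (
	"context"
	"net/http"
	"time"
)

// readyStatus maps the two readiness conditions to an HTTP status.
func readyStatus(dbOK, cacheWarm bool) int {
	if dbOK && cacheWarm {
		return http.StatusOK
	}
	return http.StatusServiceUnavailable
}

// healthz reports liveness only: the process is up and serving.
func healthz(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(http.StatusOK)
}

// readyz builds a readiness handler. ping is expected to check the database
// and must respect the context deadline, which bounds the check to timeout.
func readyz(ping func(context.Context) error, warmed func() bool, timeout time.Duration) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), timeout)
		defer cancel()
		w.WriteHeader(readyStatus(ping(ctx) == nil, warmed()))
	}
}
```

With database/sql, the ping callback could simply be the PingContext method of the pool handle.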

12. Security boundaries

| Concern | Enforced by |
|---|---|
| Public rate limiting / anti-abuse | gateway |
| Platform credential validation, session minting | gateway |
| Session → user_id resolution, X-User-ID injection | gateway |
| Authorisation, ownership, state transitions | backend (X-User-ID is the sole identity input) |
| Admin authentication | gateway Basic Auth → backend admin endpoints |
| backend ↔ gateway trust | the network (only gateway may reach backend) |

Trusting the network between gateway and backend is an explicit, accepted MVP risk: compromise of the gateway↔backend network segment defeats backend authentication. It is mitigated by network isolation; mutual auth is a future hardening step.

13. Deployment (informational)

Single public origin, path-routed: the UI, the gateway public surface and the admin surface share one host that terminates TLS. MVP runs one gateway, one backend, one Postgres. Docker/compose environments are introduced when there is something to deploy.

14. CI & branches

  • Trunk is master; feature work happens on feature/* branches merged via PR with a green CI gate (from Stage 1 onward — the genesis commit necessarily lands on master).
  • .gitea/workflows/ holds the CI. go-unit.yaml runs gofmt/vet/build/unit-test on Go changes; integration.yaml runs the Postgres-backed tests behind the integration build tag (testcontainers postgres:17-alpine, Ryuk disabled, serial). Further workflows (ui-test, deploy) are added with the components they cover.
  • Since Stage 2 both Go workflows clone the public scrabble-solver sibling (master HEAD, no credentials) into ../scrabble-solver before building, so the go.work replace resolves; the engine tests read the committed DAWGs from that checkout via BACKEND_DICT_DIR.
  • After any push, the run is watched to green before a stage is declared done (python3 ~/.claude/bin/gitea-ci-watch.py).