# Scrabble Game — Architecture

Source of truth for the platform architecture, transport, security model and cross-service contracts. User-visible behaviour per domain lives in [`FUNCTIONAL.md`](FUNCTIONAL.md); the staged build order lives in [`../PLAN.md`](../PLAN.md).

This document always describes the **current** design, not the history of how it was reached. Sections describing not-yet-implemented components are marked *(planned)*.

## 1. Overview

Three executables plus per-platform side-services:

- **`gateway`** *(planned)* — the only public ingress. Performs anti-abuse (rate limiting), authenticates the player against the originating platform (or an email/guest session), resolves the internal `user_id`, and forwards authenticated traffic to `backend` with an `X-User-ID` header. Hosts an admin surface behind HTTP Basic Auth. Bridges live events from `backend` to the client.
- **`backend`** — internal-only service that owns every domain concern: identity/sessions, accounts and linking, lobby and matchmaking, the game runtime, the robot opponent, chat, notifications, statistics, history, and administration. Embeds the **`scrabble-solver`** engine **as a library, in-process** — there is no per-game container. The only network consumer of `backend` is `gateway` (plus platform side-services over an internal API).
- **`ui`** *(planned)* — pure-HTML5 client (plain Svelte + Vite, static build). Talks to `backend` only through `gateway`. Embeddable in platform webviews; packageable to native (iOS/Android) via Capacitor.
- **`platform/`** *(planned)* — per-platform side-services (Telegram bot first): deep-link invites and platform-native push notifications. They talk to `backend` over an internal API.
```mermaid
flowchart LR
    Client((Client / webview)) -- Connect-RPC + FlatBuffers (h2c) --> Gateway
    Gateway -- REST/JSON, X-User-ID --> Backend
    Backend -- gRPC server-stream (live events) --> Gateway
    Gateway -- in-app stream --> Client
    Backend -- pgx --> Postgres[(Postgres)]
    Backend -. embeds .- Solver[[scrabble-solver library]]
    Telegram[Telegram bot side-service] -- internal API --> Backend
```

The MVP runs `gateway` and `backend` as single-instance processes inside a trusted network. No Redis is planned (anti-replay crypto was deliberately dropped). Horizontal scaling is explicit future work.

## 2. Transport

- **client ↔ gateway**: **Connect-RPC + FlatBuffers** over HTTP/2 cleartext (`h2c`). Binary payloads, server-streaming for the in-app live channel, first-class JS clients (`@connectrpc/connect-web` + the `flatbuffers` npm package). The contract is kept minimal.
- **gateway ↔ backend (sync)**: plain HTTP REST/JSON. The gateway injects `X-User-ID` for authenticated requests; `backend` never re-derives identity from the body.
- **backend → gateway (live)**: a single gRPC server-stream carries live events (your-turn, opponent-moved, chat, nudge). The gateway bridges them to the client's in-app stream while the app is open. Out-of-app delivery uses platform-native push via the platform side-service.

## 3. Authentication & sessions

Platform-native, deliberately simple: **no Ed25519 client keys, no per-request signing, no anti-replay crypto** (these were considered and dropped — players arrive from a platform rather than completing a mandatory registration).

- The gateway validates the originating credential **once** — the platform's signed launch data (e.g. Telegram `initData` HMAC), an email-code login, or a guest bootstrap — then mints a **thin opaque server session token** (`session_id`).
- The client holds `session_id` in memory for the app session (browser/OS storage is optional and may be unavailable; losing it means re-login).
- The gateway caches `session → user_id` and injects `X-User-ID`. Session records live in `backend`, which stores only a **SHA-256 hash** of the opaque token (never the plaintext), keeps a warmed in-memory cache for fast resolution, and treats sessions as **revoke-only** — they have no TTL and live until explicitly revoked (`status` → `revoked`).
- **Guest** = ephemeral web session (no platform, no email): session-only, nothing persisted; restricted to auto-match, with no friends and no stats/history. Platform users are auto-provisioned **durable** accounts.

## 4. Accounts, identities, linking & merge

- One internal account may carry several **platform identities** (`telegram`, `vk`, …) plus an optional **email** identity. First contact from a platform auto-provisions a durable account bound to that platform identity. Concretely, platform and email identities share one `identities` table keyed by a unique `(kind, external_id)`; email is an identity with `kind=email` and a `confirmed` flag (the confirm-code flow lands later). Accounts and identities use application-generated **UUIDv7** primary keys.
- **Linking** is initiated from an authenticated profile: choose a platform → complete that platform's web-auth confirm → attach the identity to the current account.
- **Merge**: if the identity being linked already has its own account with history, the two accounts are **merged into the current one (A is primary)**: statistics are summed, games and friends are transferred, duplicates are de-duplicated, the secondary account is retired. High blast-radius; an isolated, well-tested stage.

## 5. Game engine integration (`scrabble-solver`)

`backend` embeds the solver library (see [`CLAUDE.md`](../CLAUDE.md) for the exact public API and constraints). Key points:

- Variants at launch: **English Scrabble**, **Russian Scrabble**, **Эрудит** — `rules.English()`, `rules.RussianScrabble()`, `rules.Erudit()`.
- Dictionaries are committed DAWGs loaded with `dawg.Load`; held in memory and addressed by `(variant, dict_version)`.
- **Dictionary versioning — pin per game.** A game records the `dict_version` it started on and finishes on that version; new games use the latest. Multiple versions may be resident at once. An admin reload endpoint *(planned)* adds a new version; delivery is the DAWG file in the image / a mounted volume.
- Move generation/validation/scoring use `Solver.GenerateMoves` (ranked), `Solver.ValidatePlay`, `Solver.ScorePlay`; board mutation uses `scrabble.Apply`. Tile bag follows the `selfplay.Bag` pattern.

## 6. Game rules

- **Word legality: validate-at-submit.** An illegal play is rejected by `Solver.ValidatePlay`; there is no challenge phase.
- **End of game**: the bag is empty **and** a player empties their rack, **or** **6 consecutive scoreless turns** (passes/exchanges). A move that is not made within the **24-hour** turn timeout becomes an automatic resignation.
- **Players**: auto-match is always 2 players; friend games are 2–4 players. `backend` owns turn order and the bag for any player count.
- **Hint**: one per game; reveals the top-1 ranked move (`GenerateMoves[0]`).
- **Word-check tool**: unlimited dictionary lookups; each result offers a **complaint** that lands in an admin review queue *(admin side planned)*.

## 7. Robot opponent

Substitutes for a human in 2-player auto-match when the pool yields no human within 10 seconds. Designed to be indistinguishable from a person.

- **Balance**: at game start it decides once whether to play to win, with `P(play-to-win) ≈ 0.40` (so the human wins ≈ 60%). Adaptive difficulty is post-MVP.
- **Margin targeting**: each turn it picks from `GenerateMoves` a move that keeps the resulting lead (when playing to win) or deficit (when playing to lose) small (≈ 1–20 points), rather than always the maximum.
- **Timing**: per-move delay sampled from a right-skewed distribution (short delays frequent), clamped to **[2, 90] minutes**; **sleeps 00:00–07:00** in the opponent's profile timezone (fallback UTC); on a daytime nudge after 60 minutes idle it replies within **2–10 minutes**; it proactively nudges the human after 12 hours idle.
- Blocks friend requests and direct messages; uses a human-like name pool.

## 8. Lobby & social

- **Matchmaking** *(detail planned)*: a FIFO pool keyed by `(variant, language)`; 10 s with no human match → substitute the robot.
- **Friends**: add by friend list, internal ID, or platform deep-link.
- **Block** settings independently suppress in-game chat and friend requests.
- **Chat**: per-game, persisted, length-limited, suppressed by the block setting.
- **Nudge**: a player may nudge the opponent whose turn is awaited once per hour; the opponent receives a platform-native notification.
- **Profile**: `preferred_language` (en/ru), display name, linked platform accounts, email (confirm-code binding), **timezone** (drives robot sleep; default from platform/locale, user-editable), block toggles.

## 9. Persistence

- Single Postgres database, schema `backend`; `backend` is the only writer. The "pgx pool" is a `database/sql` handle backed by the pgx stdlib driver and instrumented with otelsql; type-safe queries use **go-jet** (code generated into `internal/postgres/jet` and committed, regenerated by `cmd/jetgen`). Migrations are embedded SQL applied with `pressly/goose/v3` at startup. Primary keys are application-generated **UUIDv7**.
- Stage 1 tables: `accounts` (durable internal accounts), `identities` (platform/email identities, unique `(kind, external_id)`) and `sessions` (revoke-only opaque-token hashes).
- **Active game state** is stored structurally with the `dict_version` pinned.
- **Statistics** (computed on finish): wins, losses, max points in a game, max points for a single word.
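
The `sessions` table listed above keeps only hashes of opaque tokens. A minimal sketch of minting a token and deriving the stored value might look like the following. The function names, the 32-byte token length, and the base64url and hex encodings are assumptions made for illustration; the design only states that a SHA-256 hash is stored and the plaintext never is.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// newSessionToken returns a random opaque token to hand to the client.
// The 32-byte length and base64url encoding are assumptions.
func newSessionToken() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf), nil
}

// hashSessionToken returns the hex SHA-256 digest that would be persisted
// and used as the lookup key; the plaintext token is never stored.
func hashSessionToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

func main() {
	token, err := newSessionToken()
	if err != nil {
		panic(err)
	}
	fmt.Println("client holds:  ", token)
	fmt.Println("backend stores:", hashSessionToken(token))
}
```

Because the token is high-entropy random data rather than a user-chosen secret, a plain unsalted digest is a reasonable lookup key in this sketch: resolving a session is a single hash plus an indexed read or cache hit.
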
### 9.1 History invariant (must hold forever)

Archived games must replay **independently of any dictionary and of the solver's internal encoding** — at least visually. Therefore the move log persists only **decoded concrete values**: letters as text, coordinates, blank flag, action kind (play / pass / exchange / resign / timeout), acting player, per-move score and running total, timestamp.

The board for visual replay is reconstructed by applying placements onto an empty grid; no dictionary is needed because moves were validated at play time and scores are stored. `variant` and `dict_version` are kept as **metadata only** (audit, complaint review), never as a replay dependency.

**GCG export** is derived from the same rows and is likewise self-contained (we ship our own writer; the solver exposes no public GCG writer).

## 10. Notifications

Two channels: **platform-native push** (out-of-app, via the platform side-service — your-turn, nudge) and the **in-app live stream** (chat, opponent-moved, while the app is open). Backend emits notification intents; delivery fans out to the appropriate channel.

## 11. Observability

- Structured logging with `go.uber.org/zap` (JSON). OpenTelemetry tracer and meter providers are wired (Stage 1), env-gated by `BACKEND_OTEL_{TRACES,METRICS}_EXPORTER` with a default of `none` (so no collector is required locally or in CI); `stdout` is available for debugging and the Postgres pool is instrumented with otelsql. OTLP export, a Prometheus pull endpoint, and dashboards arrive with the first real workload.
- Per-request server-side timing via gin middleware from day one (the access log carries method, route, status, latency and the active trace id). A client-measured RTT piggybacked on the next request is a later enhancement.
- Unauthenticated `GET /healthz` (liveness) and `GET /readyz` (readiness — the database answers a bounded ping and the session cache is warmed).

## 12. Security boundaries

| Concern | Enforced by |
| --- | --- |
| Public rate limiting / anti-abuse | gateway |
| Platform credential validation, session minting | gateway |
| Session → `user_id` resolution, `X-User-ID` injection | gateway |
| Authorisation, ownership, state transitions | backend (`X-User-ID` is the sole identity input) |
| Admin authentication | gateway Basic Auth → backend admin endpoints |
| backend ↔ gateway trust | the network (only gateway may reach backend) |

This is an explicit, accepted MVP risk: compromise of the gateway↔backend network segment defeats backend authentication. Mitigated by network isolation; mutual auth is a future hardening step.

## 13. Deployment (informational)

Single public origin, path-routed: the UI, the gateway public surface and the admin surface share one host that terminates TLS. MVP runs one `gateway`, one `backend`, one Postgres. Docker/compose environments are introduced when there is something to deploy.

## 14. CI & branches

- Trunk is **`master`**; feature work happens on `feature/*` branches merged via PR with a green CI gate (from Stage 1 onward — the genesis commit necessarily lands on `master`).
- `.gitea/workflows/` holds the CI. `go-unit.yaml` runs gofmt/vet/build/unit-test on Go changes; `integration.yaml` runs the Postgres-backed tests behind the `integration` build tag (testcontainers `postgres:17-alpine`, Ryuk disabled, serial). Further workflows (ui-test, deploy) are added with the components they cover.
- After any push, the run is watched to green before a stage is declared done (`python3 ~/.claude/bin/gitea-ci-watch.py`).