Ilia Denisov 6d0dd4fb14
Stage 2: engine package over scrabble-solver (registry, bag, Game, replay)
backend/internal/engine wraps the sibling scrabble-solver library in-process:

- Registry: versioned DAWG load via dafsa.Load, keyed by (variant, dict_version),
  latest-per-variant; English / Russian / Эрудит handled uniformly.
- Bag: own deterministic, seeded tile bag with Draw + Return (for exchanges),
  since the solver's self-play bag cannot return tiles.
- Game: pure rules engine — deal, play/pass/exchange/resign, refill, per-move
  scoring, turn order, and end-condition detection (empty bag + empty rack, six
  scoreless turns, resignation) with end-game rack adjustment.
- decode/ReplayBoard: dictionary-independent MoveRecords and board replay via
  scrabble.Apply (no internal/encoding), realising ARCHITECTURE §9.1.

Wiring: go.work gains "replace scrabble-solver => ../scrabble-solver"; backend
requires scrabble-solver (placeholder) and github.com/iliadenisov/dafsa directly.
Both Go CI workflows clone the public solver sibling (master HEAD, no token) and
set BACKEND_DICT_DIR.

Docs: ARCHITECTURE §5/§14, TESTING engine layer, backend README, and PLAN
refinements + deferred TODOs (publish/version solver; split engine vs dictionary
generator).
2026-06-02 15:10:08 +02:00

# Scrabble Game — Architecture
Source of truth for the platform architecture, transport, security model and
cross-service contracts. User-visible behaviour per domain lives in
[`FUNCTIONAL.md`](FUNCTIONAL.md); the staged build order lives in
[`../PLAN.md`](../PLAN.md). This document always describes the **current**
design, not the history of how it was reached. Sections describing
not-yet-implemented components are marked *(planned)*.
## 1. Overview
Three executables plus per-platform side-services:
- **`gateway`** *(planned)* — the only public ingress. Performs anti-abuse
(rate limiting), authenticates the player against the originating platform
(or an email/guest session), resolves the internal `user_id`, and forwards
authenticated traffic to `backend` with an `X-User-ID` header. Hosts an admin
surface behind HTTP Basic Auth. Bridges live events from `backend` to the
client.
- **`backend`** — internal-only service that owns every domain concern:
identity/sessions, accounts and linking, lobby and matchmaking, the game
runtime, the robot opponent, chat, notifications, statistics, history, and
administration. Embeds the **`scrabble-solver`** engine **as a library,
in-process** — there is no per-game container. The only network consumer of
`backend` is `gateway` (plus platform side-services over an internal API).
- **`ui`** *(planned)* — pure-HTML5 client (plain Svelte + Vite, static build).
Talks to `backend` only through `gateway`. Embeddable in platform webviews;
packageable to native (iOS/Android) via Capacitor.
- **`platform/<name>`** *(planned)* — per-platform side-services (Telegram bot
first): deep-link invites and platform-native push notifications. They talk
to `backend` over an internal API.
```mermaid
flowchart LR
Client((Client / webview)) -- Connect-RPC + FlatBuffers (h2c) --> Gateway
Gateway -- REST/JSON, X-User-ID --> Backend
Backend -- gRPC server-stream (live events) --> Gateway
Gateway -- in-app stream --> Client
Backend -- pgx --> Postgres[(Postgres)]
Backend -. embeds .- Solver[[scrabble-solver library]]
Telegram[Telegram bot side-service] -- internal API --> Backend
```
The MVP runs `gateway` and `backend` as single-instance processes inside a
trusted network. No Redis is planned (anti-replay crypto was deliberately
dropped). Horizontal scaling is explicit future work.
## 2. Transport
- **client ↔ gateway**: **Connect-RPC + FlatBuffers** over HTTP/2 cleartext
(`h2c`). Binary payloads, server-streaming for the in-app live channel,
first-class JS clients (`@connectrpc/connect-web` + the `flatbuffers` npm
package). The contract is kept minimal.
- **gateway ↔ backend (sync)**: plain HTTP REST/JSON. The gateway injects
`X-User-ID` for authenticated requests; `backend` never re-derives identity
from the body.
- **backend → gateway (live)**: a single gRPC server-stream carries live events
(your-turn, opponent-moved, chat, nudge). The gateway bridges them to the
client's in-app stream while the app is open. Out-of-app delivery uses
platform-native push via the platform side-service.
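
The gateway-to-backend identity contract can be illustrated with the standard library alone. This is a minimal sketch, not the project's code: the middleware name `requireUserID`, the context key, the `/whoami` route and the 401 response are all hypothetical, and the real backend uses gin rather than bare `net/http`.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// userIDKey is a private context key type so other packages cannot collide with it.
type userIDKey struct{}

// requireUserID is a hypothetical backend middleware: it takes identity only
// from the X-User-ID header set by the gateway and never reads it from the body.
func requireUserID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get("X-User-ID")
		if id == "" {
			http.Error(w, "missing X-User-ID", http.StatusUnauthorized)
			return
		}
		ctx := context.WithValue(r.Context(), userIDKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// whoami echoes the identity stored by requireUserID.
func whoami(w http.ResponseWriter, r *http.Request) {
	id, _ := r.Context().Value(userIDKey{}).(string)
	fmt.Fprint(w, id)
}

// call simulates the gateway forwarding a request; an empty userID means the
// header is not injected.
func call(userID string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/whoami", nil)
	if userID != "" {
		req.Header.Set("X-User-ID", userID)
	}
	rec := httptest.NewRecorder()
	requireUserID(http.HandlerFunc(whoami)).ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(call("u-123"))
	fmt.Println(call(""))
}
```

Because the header is the sole identity input, the sketch is only safe when nothing but the gateway can reach the backend, which is the trust assumption stated in §12.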
## 3. Authentication & sessions
Platform-native, deliberately simple: **no Ed25519 client keys, no per-request
signing, no anti-replay crypto** (these were considered and dropped — players
arrive from a platform rather than completing a mandatory registration).
- The gateway validates the originating credential **once** — the platform's
signed launch data (e.g. Telegram `initData` HMAC), an email-code login, or a
guest bootstrap — then mints a **thin opaque server session token**
(`session_id`).
- The client holds `session_id` in memory for the app session (browser/OS
storage is optional and may be unavailable; losing it means re-login).
- The gateway caches `session → user_id` and injects `X-User-ID`. Session
records live in `backend`, which stores only a **SHA-256 hash** of the opaque
token (never the plaintext), keeps a warmed in-memory cache for fast
resolution, and treats sessions as **revoke-only** — they have no TTL and live
until explicitly revoked (`status` set to `revoked`).
- **Guest** = ephemeral web session (no platform, no email): session-only,
nothing persisted; restricted to auto-match, with no friends and no
stats/history. Platform users are auto-provisioned **durable** accounts.
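
The hash-only, revoke-only session model described above can be sketched as follows. This is an illustration under assumptions: the `store` type, its method names and the 32-byte token length are hypothetical, and an in-memory map stands in for the Postgres table and the warmed cache.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// session is a hypothetical record; the plaintext token is never stored.
type session struct {
	UserID  string
	Revoked bool
}

// store maps hex(SHA-256(token)) to the session record.
type store map[string]*session

// hashToken returns the hex-encoded SHA-256 digest used as the lookup key.
func hashToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

// mint creates a random opaque token, stores only its hash, and returns the
// plaintext exactly once to the caller.
func (s store) mint(userID string) (string, error) {
	raw := make([]byte, 32) // token length is an assumption
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	token := hex.EncodeToString(raw)
	s[hashToken(token)] = &session{UserID: userID}
	return token, nil
}

// resolve returns the user for a live session; there is no TTL check because
// sessions are revoke-only.
func (s store) resolve(token string) (string, bool) {
	rec, ok := s[hashToken(token)]
	if !ok || rec.Revoked {
		return "", false
	}
	return rec.UserID, true
}

// revoke marks the session as revoked; the record itself is kept.
func (s store) revoke(token string) {
	if rec, ok := s[hashToken(token)]; ok {
		rec.Revoked = true
	}
}

func main() {
	s := store{}
	token, err := s.mint("user-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(s.resolve(token))
	s.revoke(token)
	fmt.Println(s.resolve(token))
}
```

Storing only the digest means a leaked sessions table does not directly yield usable tokens, while lookups stay a single hash plus a map or index probe.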
## 4. Accounts, identities, linking & merge
- One internal account may carry several **platform identities**
(`telegram`, `vk`, …) plus an optional **email** identity. First contact from
a platform auto-provisions a durable account bound to that platform identity.
Concretely, platform and email identities share one `identities` table keyed by
a unique `(kind, external_id)`; email is an identity with `kind=email` and a
`confirmed` flag (the confirm-code flow lands later). Accounts and identities
use application-generated **UUIDv7** primary keys.
- **Linking** is initiated from an authenticated profile: choose a platform →
complete that platform's web-auth confirm → attach the identity to the
current account.
- **Merge**: if the identity being linked already has its own account with
history, the two accounts are **merged into the current one** (the current
account is primary): statistics are summed, games and friends are transferred,
duplicates are de-duplicated, and the secondary account is retired. Merge has a
high blast radius, so it is built as an isolated, well-tested stage.
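
The merge rule can be made concrete with a small in-memory sketch. All type and field names here are hypothetical. One deliberate assumption: win and loss counters are summed as the text says, but for the two "max points" statistics the sketch keeps the larger value, since adding two maxima would not give a meaningful maximum.

```go
package main

import "fmt"

// stats mirrors the statistics listed in the persistence section.
type stats struct {
	Wins, Losses     int
	MaxGame, MaxWord int
}

// account is a hypothetical in-memory stand-in for an account row and its relations.
type account struct {
	ID      string
	Stats   stats
	Friends map[string]bool
	Games   map[string]bool
	Retired bool
}

func newAccount(id string) *account {
	return &account{ID: id, Friends: map[string]bool{}, Games: map[string]bool{}}
}

func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// merge folds secondary into primary and retires secondary.
func merge(primary, secondary *account) {
	primary.Stats.Wins += secondary.Stats.Wins
	primary.Stats.Losses += secondary.Stats.Losses
	primary.Stats.MaxGame = maxInt(primary.Stats.MaxGame, secondary.Stats.MaxGame)
	primary.Stats.MaxWord = maxInt(primary.Stats.MaxWord, secondary.Stats.MaxWord)
	for id := range secondary.Games {
		primary.Games[id] = true // set semantics de-duplicate shared games
	}
	for id := range secondary.Friends {
		if id != primary.ID { // an account cannot befriend itself
			primary.Friends[id] = true
		}
	}
	delete(primary.Friends, secondary.ID) // drop the link to the retired account
	secondary.Retired = true
}

func main() {
	a, b := newAccount("A"), newAccount("B")
	a.Stats = stats{Wins: 3, Losses: 1, MaxGame: 410, MaxWord: 72}
	b.Stats = stats{Wins: 2, Losses: 4, MaxGame: 380, MaxWord: 95}
	a.Friends["carol"], b.Friends["carol"], b.Friends["A"] = true, true, true
	a.Games["g1"], b.Games["g2"] = true, true
	merge(a, b)
	fmt.Println(a.Stats, len(a.Friends), len(a.Games), b.Retired)
}
```

In a real database the same steps would run inside one transaction, which is part of why the merge is treated as a high-risk stage.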
## 5. Game engine integration (`scrabble-solver`)
`backend` embeds the solver library in-process behind `internal/engine`, the
only package that imports `scrabble-solver` (see [`CLAUDE.md`](../CLAUDE.md) for
the solver's public API and constraints). The engine is a self-contained rules
library — no persistence, transport or scheduling; the game domain drives it.
Key points:
- Variants at launch: **English Scrabble**, **Russian Scrabble**, **Эрудит**
(`engine.Variant`, mapping to `rules.English()` / `RussianScrabble()` /
`Erudit()`). Эрудит's specifics (non-doubling centre, `ё` with no tiles, 3
blanks, a 15-point bonus) live entirely in the solver ruleset, so the engine
treats every variant uniformly.
- **Dictionaries** are committed DAWGs loaded with `dafsa.Load` from a directory
(a parameter today; a configurable `BACKEND_DICT_DIR` is wired when the first
consumer needs it). The `engine.Registry` holds them in memory addressed by
`(variant, dict_version)`, tracking the latest version per variant.
- **Dictionary versioning — pin per game.** A game records the `dict_version` it
started on and finishes on that version; new games use the latest. Multiple
versions may be resident at once. An admin reload *(planned, Stage 9)*
registers a new version through `Registry.Load`; delivery is the DAWG file in
the image / a volume mounted at the dictionary directory. (A future split of
the solver into engine + dictionary generator with versioned artifacts is
recorded in [`../PLAN.md`](../PLAN.md) TODO-2.)
- Move generation/validation/scoring use `Solver.GenerateMoves` (ranked),
`Solver.ValidatePlay` and `Solver.ScorePlay`; board mutation uses
`scrabble.Apply`. The engine adds its own deterministic, seeded tile **bag**
that can return tiles (an exchange needs this; the solver's self-play bag
cannot).
- **`engine.Game`** is the in-memory match state and the pure rules engine: it
deals racks, applies legal plays / passes / exchanges / resignations, refills
from the bag, keeps the scores and whose turn it is, and **detects the end of
the game** — empty bag with an empty rack, six consecutive scoreless turns, or
a resignation — applying the end-game rack-value adjustment. The 24-hour
timeout / auto-resign, turn scheduling and persistence belong to the game
domain *(Stage 3)*.
- History is dictionary-independent (§9.1): the engine emits decoded
`MoveRecord`s and reconstructs the board from them with `engine.ReplayBoard`
(alphabet only, no dictionary).
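
The "pin per game" rule can be sketched with a small in-memory registry. This is not the real `engine.Registry`: the `Register`, `Get` and `Latest` methods, the string-typed version and the `Dictionary` placeholder are hypothetical, and "latest" is assumed here to mean the most recently registered version.

```go
package main

import (
	"fmt"
	"sync"
)

// Variant names a ruleset; the values used below are illustrative.
type Variant string

// Dictionary is a placeholder for a loaded DAWG.
type Dictionary struct{ Name string }

type dictKey struct {
	variant Variant
	version string
}

// Registry keeps every registered version resident and remembers the latest
// version per variant.
type Registry struct {
	mu     sync.RWMutex
	dicts  map[dictKey]*Dictionary
	latest map[Variant]string
}

func NewRegistry() *Registry {
	return &Registry{dicts: map[dictKey]*Dictionary{}, latest: map[Variant]string{}}
}

// Register adds a version and makes it the latest for its variant.
func (r *Registry) Register(v Variant, version string, d *Dictionary) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.dicts[dictKey{v, version}] = d
	r.latest[v] = version
}

// Get resolves the exact version a running game was pinned to.
func (r *Registry) Get(v Variant, version string) (*Dictionary, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	d, ok := r.dicts[dictKey{v, version}]
	return d, ok
}

// Latest returns the version a new game should pin.
func (r *Registry) Latest(v Variant) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	version, ok := r.latest[v]
	return version, ok
}

func main() {
	reg := NewRegistry()
	reg.Register("english", "v1", &Dictionary{Name: "english-v1"})
	pinned, _ := reg.Latest("english") // a game starts and pins v1

	reg.Register("english", "v2", &Dictionary{Name: "english-v2"}) // admin reload
	d, _ := reg.Get("english", pinned)
	newest, _ := reg.Latest("english")
	fmt.Println(d.Name, newest)
}
```

The running game keeps resolving its pinned version after the reload, while new games pick up the newer one.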
## 6. Game rules
- **Word legality: validate-at-submit.** An illegal play is rejected by
`Solver.ValidatePlay`; there is no challenge phase.
- **End of game**: the bag is empty **and** a player empties their rack, **or**
**6 consecutive scoreless turns** (passes/exchanges). A move that is not made
within the **24-hour** turn timeout becomes an automatic resignation.
- **Players**: auto-match is always 2 players; friend games are 2-4 players.
`backend` owns turn order and the bag for any player count.
- **Hint**: one per game; reveals the top-1 ranked move (`GenerateMoves[0]`).
- **Word-check tool**: unlimited dictionary lookups; each result offers a
**complaint** that lands in an admin review queue *(admin side planned)*.
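
The end-of-game conditions above reduce to a small predicate. This is a sketch under assumptions: the `state` struct and the reason strings are hypothetical, the real `engine.Game` may track these differently, and a timeout is assumed to have been converted into a resignation before the check runs.

```go
package main

import "fmt"

// state holds only what the end-of-game check needs.
type state struct {
	BagSize        int   // tiles left in the bag
	RackSizes      []int // tiles on each player's rack
	ScorelessTurns int   // consecutive passes/exchanges
	Resigned       bool  // set by a resignation or a turn timeout
}

const maxScorelessTurns = 6

// endReason returns a non-empty reason when the game is over.
func endReason(s state) string {
	if s.Resigned {
		return "resignation"
	}
	if s.BagSize == 0 {
		for _, n := range s.RackSizes {
			if n == 0 {
				return "played-out" // empty bag and an empty rack
			}
		}
	}
	if s.ScorelessTurns >= maxScorelessTurns {
		return "scoreless"
	}
	return ""
}

func main() {
	fmt.Println(endReason(state{BagSize: 0, RackSizes: []int{0, 4}}))
	fmt.Println(endReason(state{BagSize: 12, RackSizes: []int{7, 7}, ScorelessTurns: 6}))
	fmt.Println(endReason(state{BagSize: 0, RackSizes: []int{3, 4}}) == "")
}
```

An empty rack alone does not end the game while tiles remain in the bag, because the rack is refilled after the move.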
## 7. Robot opponent
Substitutes for a human in 2-player auto-match when the pool yields no human
within 10 seconds. Designed to be indistinguishable from a person.
- **Balance**: at game start it decides once whether to play to win, with
`P(play-to-win) ≈ 0.40` (so the human wins ≈ 60%). Adaptive difficulty is
post-MVP.
- **Margin targeting**: each turn it picks from `GenerateMoves` a move that
keeps the resulting lead (when playing to win) or deficit (when playing to
lose) small (≈ 1-20 points), rather than always the maximum.
- **Timing**: per-move delay sampled from a right-skewed distribution (short
delays frequent), clamped to **[2, 90] minutes**; **sleeps 00:00-07:00** in
the opponent's profile timezone (fallback UTC); on a daytime nudge after 60
minutes idle it replies within **2-10 minutes**; it proactively nudges the
human after 12 hours idle.
- Blocks friend requests and direct messages; uses a human-like name pool.
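
One way to read the balance and margin-targeting rules is sketched below. It is an interpretation, not the robot's actual strategy: the `move` type, the `pickMove` signature and the `target` parameter are hypothetical, and "keep the margin small" is modelled as choosing the candidate whose resulting margin is closest to a small signed target.

```go
package main

import (
	"fmt"
	"math/rand"
)

// move is a hypothetical candidate as returned by a ranked move generator.
type move struct {
	Word  string
	Score int
}

// decidePlayToWin is evaluated once per game with P(play-to-win) = 0.40.
func decidePlayToWin(rng *rand.Rand) bool {
	return rng.Float64() < 0.40
}

func abs(x int) int {
	if x < 0 {
		return -x
	}
	return x
}

// pickMove chooses the candidate whose resulting margin (robot minus human) is
// closest to +target when playing to win, or -target when playing to lose.
// Ties keep the earlier, higher-ranked candidate.
func pickMove(cands []move, robotScore, humanScore, target int, playToWin bool) (move, bool) {
	if len(cands) == 0 {
		return move{}, false
	}
	desired := target
	if !playToWin {
		desired = -target
	}
	best, bestDiff := cands[0], -1
	for _, c := range cands {
		margin := robotScore + c.Score - humanScore
		if diff := abs(margin - desired); bestDiff < 0 || diff < bestDiff {
			best, bestDiff = c, diff
		}
	}
	return best, true
}

func main() {
	rng := rand.New(rand.NewSource(1))
	cands := []move{{"QUIZ", 48}, {"ZIT", 22}, {"IT", 4}}
	m, _ := pickMove(cands, 100, 110, 10, decidePlayToWin(rng))
	fmt.Println(m.Word)
}
```

With the robot at 100 and the human at 110, a target of 10 selects the 22-point move when playing to win and the 4-point move when playing to lose, instead of the 48-point maximum.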
## 8. Lobby & social
- **Matchmaking** *(detail planned)*: a FIFO pool keyed by `(variant,
language)`; 10 s with no human match → substitute the robot.
- **Friends**: add by friend list, internal ID, or platform deep-link.
- **Block** settings independently suppress in-game chat and friend requests.
- **Chat**: per-game, persisted, length-limited, suppressed by the block
setting.
- **Nudge**: a player may nudge the opponent whose turn is awaited once per
hour; the opponent receives a platform-native notification.
- **Profile**: `preferred_language` (en/ru), display name, linked platform
accounts, email (confirm-code binding), **timezone** (drives robot sleep;
default from platform/locale, user-editable), block toggles.
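
The matchmaking rule (FIFO per `(variant, language)`, robot substitution after 10 s) can be sketched as follows. The real design is still marked planned, so everything here is hypothetical: the `Pool` type, its `Join` and `Expire` methods, and the `at` helper used to build timestamps. The sketch is single-threaded; a real pool would need locking.

```go
package main

import (
	"fmt"
	"time"
)

// robotTimeout is how long a player waits before the robot is substituted.
const robotTimeout = 10 * time.Second

type poolKey struct{ Variant, Language string }

type waiter struct {
	UserID string
	Since  time.Time
}

// Pool keeps one FIFO queue per (variant, language).
type Pool struct{ queues map[poolKey][]waiter }

func NewPool() *Pool { return &Pool{queues: map[poolKey][]waiter{}} }

// Join pairs the caller with the longest-waiting player for the key, or
// enqueues the caller when nobody is waiting.
func (p *Pool) Join(k poolKey, userID string, now time.Time) (string, bool) {
	if q := p.queues[k]; len(q) > 0 {
		p.queues[k] = q[1:]
		return q[0].UserID, true
	}
	p.queues[k] = append(p.queues[k], waiter{UserID: userID, Since: now})
	return "", false
}

// Expire removes and returns players who have waited at least timeout; the
// caller would start a robot game for each of them.
func (p *Pool) Expire(now time.Time, timeout time.Duration) []string {
	var expired []string
	for k, q := range p.queues {
		i := 0
		for i < len(q) && now.Sub(q[i].Since) >= timeout {
			expired = append(expired, q[i].UserID)
			i++
		}
		p.queues[k] = q[i:]
	}
	return expired
}

// at builds a timestamp a given number of seconds after a fixed epoch.
func at(sec int) time.Time { return time.Unix(int64(sec), 0) }

func main() {
	p := NewPool()
	k := poolKey{"english", "en"}
	p.Join(k, "alice", at(0))
	fmt.Println(p.Join(k, "bob", at(3)))     // bob is paired with alice
	p.Join(k, "carol", at(4))                // carol waits
	fmt.Println(p.Expire(at(14), robotTimeout)) // carol gets the robot
}
```

Because each queue is ordered by arrival, `Expire` can stop at the first waiter who has not timed out yet.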
## 9. Persistence
- Single Postgres database, schema `backend`; `backend` is the only writer. The
"pgx pool" is a `database/sql` handle backed by the pgx stdlib driver and
instrumented with otelsql; type-safe queries use **go-jet** (code generated
into `internal/postgres/jet` and committed, regenerated by `cmd/jetgen`).
Migrations are embedded SQL applied with `pressly/goose/v3` at startup. Primary
keys are application-generated **UUIDv7**.
- Stage 1 tables: `accounts` (durable internal accounts), `identities`
(platform/email identities, unique `(kind, external_id)`) and `sessions`
(revoke-only opaque-token hashes).
- **Active game state** is stored structurally with the `dict_version` pinned.
- **Statistics** (computed on finish): wins, losses, max points in a game, max
points for a single word.
### 9.1 History invariant (must hold forever)
Archived games must replay **independently of any dictionary and of the
solver's internal encoding** — at least visually. Therefore the move log
persists only **decoded concrete values**: letters as text, coordinates, blank
flag, action kind (play / pass / exchange / resign / timeout), acting player,
per-move score and running total, timestamp. The board for visual replay is
reconstructed by applying placements onto an empty grid; no dictionary is
needed because moves were validated at play time and scores are stored.
`variant` and `dict_version` are kept as **metadata only** (audit, complaint
review), never as a replay dependency. **GCG export** is derived from the same
rows and is likewise self-contained (we ship our own writer; the solver exposes
no public GCG writer).
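
The invariant can be illustrated with a dictionary-free replay. The record shape below is hypothetical (the real `engine.MoveRecord` and `engine.ReplayBoard` may differ), the timestamp field is omitted, and a 15x15 board is assumed.

```go
package main

import "fmt"

const boardSize = 15 // standard board size, assumed here

// Placement stores decoded values only: a coordinate, the letter as text and
// whether the tile is a blank.
type Placement struct {
	Row, Col int
	Letter   string
	Blank    bool
}

// MoveRecord is a hypothetical persisted row of the move log.
type MoveRecord struct {
	Player     int
	Kind       string // "play", "pass", "exchange", "resign" or "timeout"
	Placements []Placement
	Score      int
	Total      int
}

// Board holds the letter on each square; an empty string is an empty square.
type Board [boardSize][boardSize]string

// replay rebuilds the board by applying placements onto an empty grid. It needs
// no dictionary: legality was checked at play time and scores are stored.
func replay(records []MoveRecord) (Board, error) {
	var b Board
	for i, rec := range records {
		if rec.Kind != "play" {
			continue // other actions never change the board
		}
		for _, p := range rec.Placements {
			if p.Row < 0 || p.Row >= boardSize || p.Col < 0 || p.Col >= boardSize {
				return b, fmt.Errorf("record %d: placement (%d,%d) is off the board", i, p.Row, p.Col)
			}
			b[p.Row][p.Col] = p.Letter
		}
	}
	return b, nil
}

func main() {
	log := []MoveRecord{
		{Player: 0, Kind: "play", Score: 10, Total: 10, Placements: []Placement{
			{7, 7, "К", false}, {7, 8, "О", false}, {7, 9, "Т", false},
		}},
		{Player: 1, Kind: "pass", Total: 0},
	}
	board, err := replay(log)
	if err != nil {
		panic(err)
	}
	fmt.Println(board[7][7] + board[7][8] + board[7][9])
}
```

Storing letters as text rather than solver tile codes is what keeps archived games readable even if the solver's internal encoding changes.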
## 10. Notifications
Two channels: **platform-native push** (out-of-app, via the platform
side-service — your-turn, nudge) and the **in-app live stream** (chat,
opponent-moved, while the app is open). Backend emits notification intents;
delivery fans out to the appropriate channel.
## 11. Observability
- Structured logging with `go.uber.org/zap` (JSON). OpenTelemetry tracer and
meter providers are wired (Stage 1), env-gated by
`BACKEND_OTEL_{TRACES,METRICS}_EXPORTER` with a default of `none` (so no
collector is required locally or in CI); `stdout` is available for debugging
and the Postgres pool is instrumented with otelsql. OTLP export, a Prometheus
pull endpoint, and dashboards arrive with the first real workload.
- Per-request server-side timing via gin middleware from day one (the access log
carries method, route, status, latency and the active trace id). A
client-measured RTT piggybacked on the next request is a later enhancement.
- Unauthenticated `GET /healthz` (liveness) and `GET /readyz` (readiness — the
database answers a bounded ping and the session cache is warmed).
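
The two probes can be sketched with the standard library. This is an illustration, not the backend's handlers: the real service uses gin, and `newMux`, `pinger`, the one-second ping bound, and the `okPing`/`failPing` stand-ins are hypothetical. For brevity the sketch does not restrict the routes to `GET`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// pinger abstracts the database ping so the sketch runs without Postgres.
type pinger func(ctx context.Context) error

// newMux wires liveness and readiness probes.
func newMux(ping pinger, cacheWarm func() bool) *http.ServeMux {
	mux := http.NewServeMux()
	// Liveness: the process is up and serving.
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	// Readiness: the database answers a bounded ping and the cache is warm.
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), time.Second)
		defer cancel()
		if err := ping(ctx); err != nil || !cacheWarm() {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	return mux
}

// okPing and failPing stand in for a healthy and an unreachable database.
func okPing(context.Context) error   { return nil }
func failPing(context.Context) error { return errors.New("database unreachable") }

// status performs an in-process GET and returns the response code.
func status(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	warm := func() bool { return true }
	fmt.Println(status(newMux(okPing, warm), "/readyz"))
	fmt.Println(status(newMux(failPing, warm), "/readyz"))
	fmt.Println(status(newMux(failPing, warm), "/healthz"))
}
```

Keeping liveness independent of the database means a database outage marks the instance as not ready rather than causing it to be restarted.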
## 12. Security boundaries
| Concern | Enforced by |
| --- | --- |
| Public rate limiting / anti-abuse | gateway |
| Platform credential validation, session minting | gateway |
| Session → `user_id` resolution, `X-User-ID` injection | gateway |
| Authorisation, ownership, state transitions | backend (`X-User-ID` is the sole identity input) |
| Admin authentication | gateway Basic Auth → backend admin endpoints |
| backend ↔ gateway trust | the network (only gateway may reach backend) |
This is an explicit, accepted MVP risk: compromise of the gateway↔backend
network segment defeats backend authentication. Mitigated by network isolation;
mutual auth is a future hardening step.
## 13. Deployment (informational)
Single public origin, path-routed: the UI, the gateway public surface and the
admin surface share one host that terminates TLS. MVP runs one `gateway`, one
`backend`, one Postgres. Docker/compose environments are introduced when there
is something to deploy.
## 14. CI & branches
- Trunk is **`master`**; feature work happens on `feature/*` branches merged via
PR with a green CI gate (from Stage 1 onward — the genesis commit necessarily
lands on `master`).
- `.gitea/workflows/` holds the CI. `go-unit.yaml` runs gofmt/vet/build/unit-test
on Go changes; `integration.yaml` runs the Postgres-backed tests behind the
`integration` build tag (testcontainers `postgres:17-alpine`, Ryuk disabled,
serial). Further workflows (ui-test, deploy) are added with the components they
cover.
- Since Stage 2 both Go workflows clone the public `scrabble-solver` sibling
(master HEAD, no credentials) into `../scrabble-solver` before building, so the
`go.work` `replace` resolves; the engine tests read the committed DAWGs from
that checkout via `BACKEND_DICT_DIR`.
- After any push, the run is watched to green before a stage is declared done
(`python3 ~/.claude/bin/gitea-ci-watch.py`).