R6(a): de-stage code, docs, READMEs; split stage6_test

Mechanical, behaviour-preserving removal of Stage N / TODO-N / phase (RN)
references from comments, doc-comments, service READMEs, the current-state docs
(ARCHITECTURE, FUNCTIONAL+_ru, TESTING, UI_DESIGN), config-file comments, and the
.fbs/.proto schema comments. PLAN.md / PRERELEASE.md / CLAUDE.md keep the stage
history.

- Rename the only stage-named identifiers: registerStage8 -> registerSocialOps,
  registerStage11 -> registerLinkOps (gateway transcode).
- Split stage6_test.go: TestEmailLoginFlow -> email_test.go,
  TestGuestAutoMatchLeavesNoStats (+ provisionGuest) -> account_test.go.
- Regenerated proto bindings (push.pb.go, telegram_grpc.pb.go) from the de-staged
  .proto comments; FB Go/TS bindings unchanged (flatc strips schema comments).

go build/vet/gofmt clean across modules; integration typecheck and pnpm check green.
This commit is contained in:
Ilia Denisov
2026-06-10 16:56:03 +02:00
parent a372343797
commit 8881214213
156 changed files with 749 additions and 778 deletions
+74 -77
View File
@@ -28,10 +28,10 @@ Three executables plus per-platform side-services:
- **`ui`** — pure-HTML5 client (plain Svelte 5 + TypeScript + Vite, static build;
no SvelteKit). Talks to `backend` only through `gateway` over Connect-RPC +
FlatBuffers, with the edge TS bindings generated from the **same** `edge.proto`
and `scrabble.fbs` and committed under `ui/src/gen/`. The **playable slice**
(Stage 7) covers auth, "my games", auto-match, the board (play/pass/exchange/
and `scrabble.fbs` and committed under `ui/src/gen/`. The client covers auth,
"my games", auto-match, the board (play/pass/exchange/
resign), hint, word-check, chat/nudge, the live stream, i18n (en/ru) and a profile
view; the social/account/history surfaces follow in Stage 8. There is no board on
view, plus the social/account/history surfaces. There is no board on
the wire — the client **reconstructs the 15×15 board by replaying the move
journal** (§9.1) and renders board, tiles, premium squares and effects as pure
CSS + Unicode (no image/font/SVG assets). Tiles are placed by Pointer-Events drag
@@ -41,7 +41,7 @@ Three executables plus per-platform side-services:
Embeddable in platform webviews; packageable to native (iOS/Android) via Capacitor.
The client uses a mobile-app shell (a growing nav bar; content pinned to the bottom),
a one-line **announcement banner** under the nav (a client-side mock rotation, **gated off
in the build until polished after release**, Stage 17 — a server-driven channel later, §10),
in the build until polished after release** — a server-driven channel later, §10),
and a client **board-style** setting (bonus-label
mode). The visual/interaction design system is documented in
[`UI_DESIGN.md`](UI_DESIGN.md).
@@ -89,7 +89,7 @@ dropped). Horizontal scaling is explicit future work.
auth operations are unauthenticated and return the minted token. A unary
operation's domain outcome rides back in `ExecuteResponse.result_code` (HTTP
200); only edge failures (rate limit, missing session, unknown type, internal)
surface as Connect error codes. The client (Stage 17) treats a connectivity edge failure as
surface as Connect error codes. The client treats a connectivity edge failure as
**state, not a per-call toast**: a transport `unavailable` or a `rate_limited` flips a global
`online` signal that drives a header **"Connecting…"** spinner and softly disables proactive
actions, and the transport **auto-retries with capped exponential backoff** — every op on a
@@ -98,7 +98,7 @@ dropped). Horizontal scaling is explicit future work.
response was lost — its button is disabled while offline and the player re-issues it on
reconnect). A reachability watcher (a lightweight `profile.get` probe) clears the signal when no
other traffic is in flight; the live `Subscribe` stream's drop/recovery feeds the same signal.
**Edge hardening (R3):** every request body on the public listener is capped at
**Edge hardening:** every request body on the public listener is capped at
`GATEWAY_MAX_BODY_BYTES` (default 1 MiB — far above any legitimate payload), both at the HTTP
layer (`http.MaxBytesReader`) and as the Connect per-message read limit, so an oversized
`Execute` is refused (`resource_exhausted`) without buffering. The h2c server carries explicit
@@ -106,8 +106,7 @@ dropped). Horizontal scaling is explicit future work.
`Subscribe` stream plus a few unary calls) and a 3-minute connection `IdleTimeout` (a live
`Subscribe` stream keeps its connection active, so only abandoned connections are reaped); the
`http.Server` sets only `ReadHeaderTimeout` (10 s) — Read/WriteTimeout would kill the stream.
R7 revisits the exact values under load.
- **Alphabet on the wire (Stage 13)**: live play exchanges **alphabet indices**, not
- **Alphabet on the wire**: live play exchanges **alphabet indices**, not
concrete letters. The rack (`StateView.rack`), the `SubmitPlay`/`Evaluate` tiles, the
`Exchange` tiles and the `CheckWord` word are `ubyte` indices into the variant's alphabet
(a blank is the sentinel index **255**). The client is **alphabet-agnostic**: on a
@@ -139,7 +138,7 @@ arrive from a platform rather than completing a mandatory registration).
bootstrap — then mints a **thin opaque server session token** (`session_id`). First
Telegram contact seeds the new account's language (from the launch `language_code`)
and display name (§4).
- **Service language & variant gating (Stage 15).** The connector hosts **one bot per
- **Service language & variant gating.** The connector hosts **one bot per
service language** (`en`/`ru`), each its own token + game channel; the same Telegram
user id spans both. `ValidateInitData` tries each token in turn and returns the
validating bot's **service language** and its **supported-languages set**. The set
@@ -151,7 +150,7 @@ arrive from a platform rather than completing a mandatory registration).
**persisted** per account (`accounts.service_language`, updated on every Telegram
login — last-login-wins) and routes the user's out-of-app push back through the right
bot (§10) — **except a game event, which routes by the game's own language** (its variant →
en/ru, Stage 17), so a game's notification always comes from the game's bot rather than the
en/ru), so a game's notification always comes from the game's bot rather than the
recipient's latest login bot. The service language is distinct from `preferred_language` (the
interface language) and from a game's variant language. Non-Telegram logins (web / email / guest) carry the
gateway's default set (`GATEWAY_DEFAULT_SUPPORTED_LANGUAGES`, all variants by default).
@@ -170,7 +169,7 @@ arrive from a platform rather than completing a mandatory registration).
require one, the same way the robot pool is durable), not a profile: no
friends, statistics or history are kept for it, and it is restricted to
auto-match. Platform and email users are auto-provisioned **durable** accounts
with an identity. (Reaping abandoned guest rows is deferred — PLAN.md TODO-3.)
with an identity.
## 4. Accounts, identities, linking & merge
@@ -179,15 +178,15 @@ arrive from a platform rather than completing a mandatory registration).
a platform auto-provisions a durable account bound to that platform identity.
Concretely, platform and email identities share one `identities` table keyed by
a unique `(kind, external_id)`; email is an identity with `kind=email` and a
`confirmed` flag. A synthetic `kind='robot'` identity (Stage 5) backs each pooled
robot opponent (§7). The **email confirm-code flow** (Stage 4) binds an email to the
`confirmed` flag. A synthetic `kind='robot'` identity backs each pooled
robot opponent (§7). The **email confirm-code flow** binds an email to the
authenticated account: a 6-digit code (stored only as a SHA-256 hash, 15-minute
TTL, ≤ 5 attempts) is sent through a `Mailer` seam (an SMTP relay, or a
development log mailer when none is configured) and, once verified, attaches a
confirmed email identity. Accounts and identities use application-generated
**UUIDv7** primary keys. A service flag `paid_account` (lifetime one-time
payment; no purchase flow yet) is carried on the account and ORed on a merge.
- **Linking** (Stage 11) is initiated from an authenticated profile and proves
- **Linking** is initiated from an authenticated profile and proves
control of the identity before attaching it: **email** through the confirm-code
flow, **Telegram** through the web **Login Widget** (validated by the connector,
HMAC under `SHA-256(bot_token)` — distinct from Mini App initData; the gateway
@@ -210,7 +209,7 @@ arrive from a platform rather than completing a mandatory registration).
already has a **durable** owner: then the durable account wins, the guest's active
games move into it, the guest is retired, and a **fresh session** is minted for the
durable account (the client switches to it). The secondary's sessions are revoked
(§3). High blast-radius; an isolated, well-tested stage.
(§3). High blast-radius; isolated and well-tested.
## 5. Game engine integration (`scrabble-solver`)
@@ -241,8 +240,7 @@ Key points:
boot version plus each subdirectory). In-flight games keep their pinned version;
new games use the latest. (The solver is published as a versioned module and the
dictionaries ship as a separate versioned **release artifact** from the
`scrabble-dictionary` repo — TODO-1/TODO-2, Stage 14; the runtime contract above is
unchanged.)
`scrabble-dictionary` repo; the runtime contract above is unchanged.)
- Move generation/validation/scoring use `Solver.GenerateMoves` (ranked),
`Solver.ValidatePlay` and `Solver.ScorePlay`; board mutation uses
`scrabble.Apply`. The engine adds its own deterministic, seeded tile **bag**
@@ -258,7 +256,7 @@ Key points:
unconditionally the other player in a two-player game. A player may resign **on the
opponent's turn** (a forfeit is not a turn-scoped move): `engine.ResignSeat(seat)`
resigns that player's own seat whoever is to move, and the game domain skips the turn
check for resign (Stage 17). The engine exposes a
check for resign. The engine exposes a
decoded, solver-free API (`SubmitPlay`/`SubmitExchange`/`EvaluatePlay`/
`HintView`/`Hand`) so `internal/game` drives it without importing the solver.
- The **game domain** (`internal/game`) owns everything the engine does not —
@@ -326,7 +324,7 @@ behaviour on every scan and after a restart — the same philosophy as journal
replay. A pool of durable accounts — each a `kind='robot'` identity (§4), keyed
`robot-<lang>-<index>` and provisioned at startup with **chat blocked but friend
requests open** — a request to a robot is accepted as pending and expires unanswered
(the robot never responds), mirroring a human who ignores it (Stage 17); the chat
(the robot never responds), mirroring a human who ignores it; the chat
block backs the human-like names (there is no DM surface; chat is per-game). Names are
**composed per language** from a first-name pool (32 full + 32 colloquial forms) and
a surname pool (gender-agreed for Russian) in one of three forms (first only /
@@ -352,12 +350,12 @@ English game the Latin pool.
rather than running anti-phase; on a daytime nudge it replies near the move's lower
band; it proactively nudges the idle human on a **lengthening, randomized schedule** — the
first ~60-90 min into the turn, each later reminder spaced further out toward 1-6 h — so a long
wait gets a handful of increasingly-spaced nudges rather than an hourly stream (Stage 17).
wait gets a handful of increasingly-spaced nudges rather than an hourly stream.
- **Observability**: robot accounts accrue ordinary statistics (§9) — the
authoritative balance metric (target ≈ 40% robot wins) — and a
`robot_games_finished_total` OTel counter plus a per-finish log give a live view.
The **admin game card** surfaces each robot seat's per-game play-to-win intent (from
the seed) and, on the robot's turn, its deterministic **next-move ETA** (Stage 17).
the seed) and, on the robot's turn, its deterministic **next-move ETA**.
## 8. Lobby & social
@@ -371,8 +369,8 @@ English game the Latin pool.
`Poll` remains as a fallback for a client that is not currently streaming.
**Cancel** (`POST /lobby/cancel`) removes the player from the pool and drops any
pending matched result, so a cancelled quick-match is dequeued rather than left for
the reaper to robot-substitute (Stage 17).
- **Friends** (Stage 8): two add paths over one `friendships` table. A **one-time
the reaper to robot-substitute.
- **Friends**: two add paths over one `friendships` table. A **one-time
code** the to-be-added player issues (a `friend_codes` row: 6-digit numeric,
SHA-256-hashed, **12 h** TTL, one live code per issuer, single-use, redeem
rate-limited) is redeemed by the other player to become friends immediately.
@@ -382,7 +380,7 @@ English game the Latin pool.
remembered (`status='declined'`) and blocks further requests from that sender,
unless they hand them a code, which overrides it. The requester's own cancel still
deletes the row; blocking someone severs an existing friendship. (Discovery by
friend list or platform deep-link arrives with Stage 9 / TODO-5.)
friend list or platform deep-link is future work.)
- **Block**: two independent **global** account toggles (`block_chat`,
`block_friend_requests`) **plus** a **per-user block list**. A per-user block is
applied mutually: it hides the pair's chat from each other and refuses friend
@@ -395,25 +393,25 @@ English game the Latin pool.
and **validated on input** — links, email addresses and phone numbers (including
lightly obfuscated forms) are rejected, since the chat is for quick reactions,
not contact exchange. Each message stores the sender's IP (forwarded by the
gateway in Stage 6) for moderation. A sender who has disabled chat cannot post,
gateway) for moderation. A sender who has disabled chat cannot post,
and messages from a blocked sender are hidden from the viewer. The operator console
has a **Messages** section (Stage 17) that lists posted messages (nudges excluded)
has a **Messages** section that lists posted messages (nudges excluded)
newest-first with the sender's resolved name, **source** (guest / robot / oldest
identity kind), IP and game, searchable by sender name / external-id glob masks and
pinnable to one game or sender (linked from the game and user cards).
- **Nudge**: folded into the chat as a `nudge` message kind. The player awaiting
the opponent may nudge **once per hour per game**; it is not allowed on one's own
turn. The platform-native delivery is wired with the gateway / platform
side-service (Stage 6 / 8).
turn. The platform-native delivery runs through the gateway and the platform
side-service.
- **Profile**: `preferred_language` (en/ru, edited in Settings), display name, email
(confirm-code binding, see §4), **timezone**, the daily **away window** and the
block toggles — all editable through `account.UpdateProfile`, which validates them
(Stage 8): a display name is Unicode letters joined by single ` `/`.`/`_`
block toggles — all editable through `account.UpdateProfile`, which validates them:
a display name is Unicode letters joined by single ` `/`.`/`_`
separators (no leading/trailing/adjacent separators, ≤ 32 runes); the timezone is a
fixed `±HH:MM` **UTC offset** (or a legacy IANA name) resolved by `account.ResolveZone`
for the sweeper and the robot's sleep (a fixed offset trades DST for a simple
picker); the away window is at most **12 h** (midnight-wrap aware). Linked platform
accounts and merge are Stage 11.
accounts and merge are covered in §4.
## 9. Persistence
@@ -423,23 +421,22 @@ English game the Latin pool.
into `internal/postgres/jet` and committed, regenerated by `cmd/jetgen`).
Migrations are embedded SQL applied with `pressly/goose/v3` at startup. Primary
keys are application-generated **UUIDv7**.
- Tables: `accounts` (durable internal accounts; Stage 3 added the away-window
columns `away_start`/`away_end` and the hint wallet `hint_balance`; Stage 6's
migration `00005` added the `is_guest` flag for ephemeral guest rows; Stage 9's
migration `00007` added the `notifications_in_app_only` out-of-app push toggle;
Stage 11's migration `00009` added the `paid_account` service flag and the
merge-tombstone columns `merged_into`/`merged_at`),
`identities` (platform/email/robot identities, unique `(kind, external_id)`;
Stage 5's migration `00004` admits the `robot` kind),
`sessions` (revoke-only opaque-token hashes), the Stage 3 game tables
`games` (Stage 4 added the `dropout_tiles` disposition column), `game_players`,
- Tables: `accounts` (durable internal accounts, carrying the away-window
columns `away_start`/`away_end`, the hint wallet `hint_balance`, the `is_guest`
flag for ephemeral guest rows, the `notifications_in_app_only` out-of-app push
toggle, the `paid_account` service flag and the merge-tombstone columns
`merged_into`/`merged_at`),
`identities` (platform/email/robot identities, unique `(kind, external_id)`,
the `kind` admitting `robot`),
`sessions` (revoke-only opaque-token hashes), the game tables
`games` (carrying the `dropout_tiles` disposition column), `game_players`,
`game_moves` (the move journal), `complaints` and `account_stats`, and the
Stage 4 social/lobby tables `friendships` (the request/accept graph), `blocks`
social/lobby tables `friendships` (the request/accept graph, its status admitting
`declined`), `blocks`
(per-user blocks), `chat_messages` (per-game chat and nudges), `email_confirmations`
(pending confirm-codes) and `game_invitations` / `game_invitation_invitees`
(friend-game invitations). Stage 8's migration `00006` widened the `friendships`
status to admit `declined` and added `friend_codes` (one-time add-a-friend codes).
Stage 17 added `game_drafts` (a player's in-progress rack order + board composition per
(pending confirm-codes), `game_invitations` / `game_invitation_invitees`
(friend-game invitations), `friend_codes` (one-time add-a-friend codes),
`game_drafts` (a player's in-progress rack order + board composition per
game) and `game_hidden` (`(account_id, game_id)` rows that drop a finished game from one
account's own lobby list, leaving it visible to the other players — finished-only and
irreversible by design, so there is no un-hide).
@@ -479,18 +476,18 @@ exposes none): the standard Poslfit dialect (UTF-8, `#player`/`#lexicon`
pragmas, `8G`/`H8` coordinates, lower-case blanks, `.` pass-throughs, `-TILES`
exchanges), plus `#note` lines for resignations and timeouts, which the standard
does not cover. **GCG export is offered only on a finished game** (`game.ErrGameActive`
otherwise, Stage 8), so an in-progress journal is never leaked mid-play; the client
otherwise), so an in-progress journal is never leaked mid-play; the client
shares the `.gcg` file via the Web Share API where available, else downloads it.
The Stage 13 alphabet-on-the-wire change does **not** touch this invariant: the live edge
The alphabet-on-the-wire transport does **not** touch this invariant: the live edge
exchanges alphabet indices, but the persisted journal (and everything derived from it —
replay, history, GCG) keeps the decoded concrete letters described above, so an archived
game still replays with the variant's `rules.Alphabet` alone, independent of any dictionary.
## 10. Notifications
Two channels: the **in-app live stream** (delivered from Stage 6) and
**platform-native push** (out-of-app, via the platform side-service — Stage 9).
Two channels: the **in-app live stream** and
**platform-native push** (out-of-app, via the platform side-service).
The backend emits notification intents through an in-process hub
(`internal/notify`, a `Publisher` seam installed on the game, social and lobby
services); a single backend→gateway **gRPC server-stream** (`Push.Subscribe`,
@@ -501,16 +498,16 @@ robot-driver and timeout-sweeper moves emit too; opponent-moved goes to **every
including the mover**, so the mover's own other devices and their lobby refresh — it is
in-app only, so the actor gets no out-of-app push for their own move), **chat-message** and **nudge**
(from the social service), **match-found** (from the matchmaker, §8), and **notify**
(Stage 8 — a lightweight "re-poll" signal carrying a sub-kind: friend-request,
(a lightweight "re-poll" signal carrying a sub-kind: friend-request,
friend-added, friend-declined, invitation or game-started; emitted on a friend-request,
on answering one (accept → friend-added, decline → friend-declined — to the original
requester, so a game screen watching that opponent re-derives its "add to friends" state,
Stage 17), and on an invitation create or its game start). Stage 17 added **game-over** (emitted to every
requester, so a game screen watching that opponent re-derives its "add to friends" state),
and on an invitation create or its game start). **game-over** is emitted to every
seat from the same game commit when a game finishes — any path: a closing play, all-pass,
resign or timeout) and **enriched your-turn** so the out-of-app push reads in full: it now
resign or timeout and **your-turn** is enriched so the out-of-app push reads in full: it
also carries the mover's display name, their last action and the main word of a scoring play,
and a **recipient-first** running score line (e.g. `120:95:80`, the reader's score first).
**R4 enriched the in-app stream into a delta channel** so the client renders from the event
The in-app stream is a **delta channel** so the client renders from the event
without a follow-up `game.state`: **opponent-moved** carries the committed move plus the post-move
summary (per-seat scores, whose turn, move count, status) and the bag size, which the client
applies to its per-game cache keyed on the **move count** — idempotent (a re-delivered or own-move
@@ -526,12 +523,12 @@ verbatim. A client that is not currently streaming falls back to the matchmaker'
match-found — the client polls **only while the stream is down**, since a live stream delivers
match-found itself; for the lobby **notification badge** (incoming friend requests + open
invitations) the client re-polls on the `notify` event and on lobby open / focus, covering a push
missed while the app was hidden. **Out-of-app platform push** (Stage 9) is a fallback
missed while the app was hidden. **Out-of-app platform push** is a fallback
the **gateway** routes from the same firehose: for an event whose recipient has **no
live in-app stream** it resolves the backend `/internal/push-target` (their Telegram
`external_id`, the **service language** — the bot they last signed in through, falling
back to the interface language — and the `notifications_in_app_only` flag). A **game** event,
however, carries the **game's own language** on the push (Stage 17), and the gateway routes by
however, carries the **game's own language** on the push, and the gateway routes by
that instead of the service language — so a game's notification always comes from the game's bot,
not the recipient's latest-login bot. It then asks the **Telegram connector** to deliver a
localized message with a Mini App deep-link button — only when the recipient has a Telegram
@@ -560,19 +557,19 @@ promotions) is future work and would deliver short markdown messages (text + lin
and the gateway↔connector calls. The OTLP **Collector** (OTLP/gRPC → Prometheus
metrics + Tempo traces), **Prometheus** (15d), **Tempo** (72h) and **Grafana**
(provisioned datasources + dashboards, behind the caddy `/_gm/grafana` Basic-Auth)
are stood up with the deploy (`deploy/`, Stage 16); the default exporter stays
are stood up with the deploy (`deploy/`); the default exporter stays
`none`, so CI needs no collector. The contour also runs **cAdvisor** (per-container
CPU/memory/network) and **postgres_exporter** (connections, cache-hit ratio,
transactions, db size), scraped by Prometheus and surfaced on the **Scrabble —
Resources** Grafana dashboard (R2), so the pre-release stress runs capture a resource
Resources** Grafana dashboard, which captures a resource
baseline; these export directly in Prometheus format (not through the collector).
- Per-request server-side timing via gin middleware from day one (the access log
carries method, route, status, latency and the active trace id). A
client-measured RTT piggybacked on the next request is a later enhancement.
- Domain/operational metrics (Stage 12), recorded through the meter and invisible
- Domain/operational metrics, recorded through the meter and invisible
until an exporter is configured: histograms `game_replay_duration` (journal
rebuild on a cache miss), `game_move_validate_duration` and `game_move_duration`
(Stage 17 — a seat's think time per committed move, attributed by `variant` and a
(a seat's think time per committed move, attributed by `variant` and a
`phase` of opening/middle/endgame; it aggregates **all** seats including robots,
whose synthetic timing dominates the tail, so per-human analysis lives in the admin
console, below); counters `games_started_total`, `games_abandoned_total` (a
@@ -581,18 +578,18 @@ promotions) is future work and would deliver short markdown messages (text + lin
`edge_request_duration` (the UI-perceived roundtrip, by `message_type`/`result`);
and Go runtime/heap metrics. Game-scoped metrics carry a `variant` attribute
(scrabble_en/scrabble_ru/erudit_ru).
- Per-user move-time analytics (Stage 17) are **offline**, derived in the admin
- Per-user move-time analytics are **offline**, derived in the admin
console from the move journal (`game_moves.created_at` deltas, the first move from
the game's creation), not Prometheus labels (which an `account_id` would explode):
the user list shows each account's min/avg/max think time, and the user-detail page
draws a zero-JS inline-SVG chart of min/mean/max by the player's move number.
- User metrics (Stage 16): a backend counter `accounts_created_total` (`kind` =
- User metrics: a backend counter `accounts_created_total` (`kind` =
telegram/email/guest; robots are a provisioned pool, not users, and are excluded)
and a gateway **in-memory** observable gauge `active_users` (`window` = 24h/7d) —
distinct accounts that performed an authenticated edge action in the window. The
gauge is single-process by design (single-instance MVP, §10): it is correct for one
gateway, resets on restart, and is a live operational figure, not a billing count.
- **Rate-limit observability (R3):** every limiter rejection increments the gateway
- **Rate-limit observability:** every limiter rejection increments the gateway
counter `gateway_rate_limited_total` (`class` = user/public/email/admin — aggregate
only, honouring the no-per-user-label discipline above) and logs one **Debug** line;
a gateway reporter drains the per-key rejection tracker every 30 s, emits one **Warn**
@@ -617,19 +614,19 @@ promotions) is future work and would deliver short markdown messages (text + lin
| Concern | Enforced by |
| --- | --- |
| Public rate limiting / anti-abuse | gateway (per-IP public/email/admin classes, per-user authenticated class; a request body cap of `GATEWAY_MAX_BODY_BYTES`; rejections are metered, summarised to the backend and surfaced in the admin console with a conservative reversible auto-flag — R3, §11) |
| Public rate limiting / anti-abuse | gateway (per-IP public/email/admin classes, per-user authenticated class; a request body cap of `GATEWAY_MAX_BODY_BYTES`; rejections are metered, summarised to the backend and surfaced in the admin console with a conservative reversible auto-flag — §11) |
| Telegram initData validation (bot-token HMAC) | the Telegram connector; the gateway delegates it over gRPC, so the bot token lives only in the connector |
| Session minting; email-code / guest validation | gateway (with backend) |
| Session → `user_id` resolution, `X-User-ID` injection | gateway |
| Authorisation, ownership, state transitions | backend (`X-User-ID` is the sole identity input) |
| Admin authentication | a single Basic-Auth gate on `/_gm/*`, forwarded **verbatim** to the backend's server-rendered admin console (and, in the deployed contour, routing `/_gm/grafana/*` to Grafana). In the deploy the **caddy** owns this gate (§13); a local non-caddy run uses the gateway's own `GATEWAY_ADMIN_*` proxy, which the per-IP admin limiter class guards ahead of its Basic-Auth (R3) — the caddy-fronted path has no limiter (stock caddy), an accepted gap. The backend trusts the proxy (no admin principal) and guards its state-changing POSTs with a **same-origin** check — the console's CSRF defence. No operator identity is tracked |
| Admin authentication | a single Basic-Auth gate on `/_gm/*`, forwarded **verbatim** to the backend's server-rendered admin console (and, in the deployed contour, routing `/_gm/grafana/*` to Grafana). In the deploy the **caddy** owns this gate (§13); a local non-caddy run uses the gateway's own `GATEWAY_ADMIN_*` proxy, which the per-IP admin limiter class guards ahead of its Basic-Auth — the caddy-fronted path has no limiter (stock caddy), an accepted gap. The backend trusts the proxy (no admin principal) and guards its state-changing POSTs with a **same-origin** check — the console's CSRF defence. No operator identity is tracked |
| backend ↔ gateway ↔ connector trust | the network (only gateway may reach backend; the connector serves unauthenticated gRPC on the internal segment) |
This is an explicit, accepted MVP risk: compromise of the gateway↔backend
network segment defeats backend authentication. Mitigated by network isolation;
mutual auth is a future hardening step.
**Short numeric codes** (email confirm-codes and Stage 8 friend codes) are stored
**Short numeric codes** (email confirm-codes and friend codes) are stored
only as SHA-256 hashes and are short-lived and single-use. The unauthenticated
email path carries a tight per-IP sub-limit (5 / 10 min); the **friend-code redeem**
is authenticated, so it rides the per-user limit (300 / min) and is further bounded
@@ -645,7 +642,7 @@ Single public origin, path-routed. The Vite build has two entries: a lightweight
(`go:embed`, baked in by a node stage in `gateway/Dockerfile`) and serves it at
`/app/` (web) and `/telegram/` (the Telegram Mini App; outside Telegram that path
redirects to the root — the client-side guard); a stray hit on the gateway's `/`
308-redirects to `/app/`. The **landing** ships in its own static container (R3): the
308-redirects to `/app/`. The **landing** ships in its own static container: the
`landing` target of `gateway/Dockerfile` (caddy:2-alpine + the same Vite build,
`deploy/landing/Caddyfile`) serves it at `/`, so stray public traffic is absorbed by
static file serving and never reaches the Go edge. Hash-named `/assets/*` are served
@@ -671,23 +668,23 @@ network (project-scoped DNS); only caddy joins the shared external `edge` networ
(alias `scrabble`).
Two contours, two secret/variable prefixes (`TEST_` / `PROD_`):
- **Test** (Stage 16): auto-deploys on a PR into — or a push to — `development`
- **Test**: auto-deploys on a PR into — or a push to — `development`
(`.gitea/workflows/ci.yaml``docker compose up -d --build` on the Gitea runner
host, then `GET /` + `GET /app/` probes through caddy — the landing container and
the gateway, R3). The host caddy terminates TLS and
the gateway). The host caddy terminates TLS and
forwards the domain to `scrabble:80`, so the in-compose caddy serves plain HTTP
(`CADDY_SITE_ADDRESS=:80`). The in-compose caddy **trusts X-Forwarded-For from
private-range upstreams** (`trusted_proxies private_ranges`), so the real client IP —
used for chat-moderation logging and the gateway's per-IP rate limiting — survives the
host-caddy hop; in prod (no host caddy) public clients are untrusted and Caddy uses the
real peer, so the single config is correct and spoof-safe in both contours (Stage 17).
- **Prod** (Stage 18): a manual SSH deploy after `development → master`. There is no
real peer, so the single config is correct and spoof-safe in both contours.
- **Prod**: a manual SSH deploy after `development → master`. There is no
host caddy, so the contour ships its own caddy terminating TLS — set
`CADDY_SITE_ADDRESS` to the domain and the caddy does its own ACME.
## 14. CI & branches
- **Two long-lived branches** (Stage 16): **`development`** is the integration
- **Two long-lived branches**: **`development`** is the integration
trunk and **`master`** the production trunk; `feature/*` branches are cut from
`development` and PR back into it (the genesis commit necessarily landed on
`master`). A commit to a `feature/*` branch triggers nothing.
@@ -695,18 +692,18 @@ Two contours, two secret/variable prefixes (`TEST_` / `PROD_`):
suite on a PR into `development`/`master` and on a push to `development`. Its
`unit` (gofmt/vet/build/unit-test), `integration` (Postgres-backed `integration`
tag, testcontainers `postgres:17-alpine`, Ryuk off, serial) and `ui`
(check/unit/build/bundle-budget/e2e) jobs are **path-conditional** (Stage 17 — a
(check/unit/build/bundle-budget/e2e) jobs are **path-conditional** (a
`changes` job filters by changed paths), and an always-running **`gate`** job
aggregates them (passing when each succeeded or was **skipped**) and is the single
branch-protection required check (`CI / gate`), so a path-skipped job never blocks
a merge.
- A gated **`deploy`** job auto-rolls the **test contour** on a PR into — or a push
to — `development` (`docker compose up -d --build` on the runner host), then probes
the gateway (`GET /`) **and the Telegram connector's liveness** (Stage 17 —
the gateway (`GET /`) **and the Telegram connector's liveness** (via
`docker inspect`: running, not restarting, stable restart count, with a
VPN-handshake grace period, since the connector has no public ingress and a
crash-loop is otherwise invisible). A PR into `master` is test-only; the prod
deploy is the manual Stage 18 workflow. Secrets/variables are prefixed
deploy is the manual workflow. Secrets/variables are prefixed
`TEST_`/`PROD_` per contour.
- The engine consumes `scrabble-solver` as a **published, versioned module**
(`gitea.iliadenisov.ru/developer/scrabble-solver`, pinned in `backend/go.mod`); both Go
@@ -714,6 +711,6 @@ Two contours, two secret/variable prefixes (`TEST_` / `PROD_`):
(no public proxy/checksum DB, no sibling clone). The dictionaries ship as a **release
artifact** from the `scrabble-dictionary` repo; the workflows download
`scrabble-dawg-<DICT_VERSION>.tar.gz` and point the engine tests at it via
`BACKEND_DICT_DIR` (TODO-1/TODO-2 discharged in Stage 14).
`BACKEND_DICT_DIR`.
- After any push, the run is watched to green before a stage is declared done
(`python3 ~/.claude/bin/gitea-ci-watch.py`).