R6(a): de-stage code, docs, READMEs; split stage6_test
Mechanical, behaviour-preserving removal of Stage N / TODO-N / phase (RN) references from comments, doc-comments, service READMEs, the current-state docs (ARCHITECTURE, FUNCTIONAL+_ru, TESTING, UI_DESIGN), config-file comments, and the .fbs/.proto schema comments. PLAN.md / PRERELEASE.md / CLAUDE.md keep the stage history. - Rename the only stage-named identifiers: registerStage8 -> registerSocialOps, registerStage11 -> registerLinkOps (gateway transcode). - Split stage6_test.go: TestEmailLoginFlow -> email_test.go, TestGuestAutoMatchLeavesNoStats (+ provisionGuest) -> account_test.go. - Regenerated proto bindings (push.pb.go, telegram_grpc.pb.go) from the de-staged .proto comments; FB Go/TS bindings unchanged (flatc strips schema comments). go build/vet/gofmt clean across modules; integration typecheck and pnpm check green.
This commit is contained in:
+74
-77
@@ -28,10 +28,10 @@ Three executables plus per-platform side-services:
|
||||
- **`ui`** — pure-HTML5 client (plain Svelte 5 + TypeScript + Vite, static build;
|
||||
no SvelteKit). Talks to `backend` only through `gateway` over Connect-RPC +
|
||||
FlatBuffers, with the edge TS bindings generated from the **same** `edge.proto`
|
||||
and `scrabble.fbs` and committed under `ui/src/gen/`. The **playable slice**
|
||||
(Stage 7) covers auth, "my games", auto-match, the board (play/pass/exchange/
|
||||
and `scrabble.fbs` and committed under `ui/src/gen/`. The client covers auth,
|
||||
"my games", auto-match, the board (play/pass/exchange/
|
||||
resign), hint, word-check, chat/nudge, the live stream, i18n (en/ru) and a profile
|
||||
view; the social/account/history surfaces follow in Stage 8. There is no board on
|
||||
view, plus the social/account/history surfaces. There is no board on
|
||||
the wire — the client **reconstructs the 15×15 board by replaying the move
|
||||
journal** (§9.1) and renders board, tiles, premium squares and effects as pure
|
||||
CSS + Unicode (no image/font/SVG assets). Tiles are placed by Pointer-Events drag
|
||||
@@ -41,7 +41,7 @@ Three executables plus per-platform side-services:
|
||||
Embeddable in platform webviews; packageable to native (iOS/Android) via Capacitor.
|
||||
The client uses a mobile-app shell (a growing nav bar; content pinned to the bottom),
|
||||
a one-line **announcement banner** under the nav (a client-side mock rotation, **gated off
|
||||
in the build until polished after release**, Stage 17 — a server-driven channel later, §10),
|
||||
in the build until polished after release** — a server-driven channel later, §10),
|
||||
and a client **board-style** setting (bonus-label
|
||||
mode). The visual/interaction design system is documented in
|
||||
[`UI_DESIGN.md`](UI_DESIGN.md).
|
||||
@@ -89,7 +89,7 @@ dropped). Horizontal scaling is explicit future work.
|
||||
auth operations are unauthenticated and return the minted token. A unary
|
||||
operation's domain outcome rides back in `ExecuteResponse.result_code` (HTTP
|
||||
200); only edge failures (rate limit, missing session, unknown type, internal)
|
||||
surface as Connect error codes. The client (Stage 17) treats a connectivity edge failure as
|
||||
surface as Connect error codes. The client treats a connectivity edge failure as
|
||||
**state, not a per-call toast**: a transport `unavailable` or a `rate_limited` flips a global
|
||||
`online` signal that drives a header **"Connecting…"** spinner and softly disables proactive
|
||||
actions, and the transport **auto-retries with capped exponential backoff** — every op on a
|
||||
@@ -98,7 +98,7 @@ dropped). Horizontal scaling is explicit future work.
|
||||
response was lost — its button is disabled while offline and the player re-issues it on
|
||||
reconnect). A reachability watcher (a lightweight `profile.get` probe) clears the signal when no
|
||||
other traffic is in flight; the live `Subscribe` stream's drop/recovery feeds the same signal.
|
||||
**Edge hardening (R3):** every request body on the public listener is capped at
|
||||
**Edge hardening:** every request body on the public listener is capped at
|
||||
`GATEWAY_MAX_BODY_BYTES` (default 1 MiB — far above any legitimate payload), both at the HTTP
|
||||
layer (`http.MaxBytesReader`) and as the Connect per-message read limit, so an oversized
|
||||
`Execute` is refused (`resource_exhausted`) without buffering. The h2c server carries explicit
|
||||
@@ -106,8 +106,7 @@ dropped). Horizontal scaling is explicit future work.
|
||||
`Subscribe` stream plus a few unary calls) and a 3-minute connection `IdleTimeout` (a live
|
||||
`Subscribe` stream keeps its connection active, so only abandoned connections are reaped); the
|
||||
`http.Server` sets only `ReadHeaderTimeout` (10 s) — Read/WriteTimeout would kill the stream.
|
||||
R7 revisits the exact values under load.
|
||||
- **Alphabet on the wire (Stage 13)**: live play exchanges **alphabet indices**, not
|
||||
- **Alphabet on the wire**: live play exchanges **alphabet indices**, not
|
||||
concrete letters. The rack (`StateView.rack`), the `SubmitPlay`/`Evaluate` tiles, the
|
||||
`Exchange` tiles and the `CheckWord` word are `ubyte` indices into the variant's alphabet
|
||||
(a blank is the sentinel index **255**). The client is **alphabet-agnostic**: on a
|
||||
@@ -139,7 +138,7 @@ arrive from a platform rather than completing a mandatory registration).
|
||||
bootstrap — then mints a **thin opaque server session token** (`session_id`). First
|
||||
Telegram contact seeds the new account's language (from the launch `language_code`)
|
||||
and display name (§4).
|
||||
- **Service language & variant gating (Stage 15).** The connector hosts **one bot per
|
||||
- **Service language & variant gating.** The connector hosts **one bot per
|
||||
service language** (`en`/`ru`), each its own token + game channel; the same Telegram
|
||||
user id spans both. `ValidateInitData` tries each token in turn and returns the
|
||||
validating bot's **service language** and its **supported-languages set**. The set
|
||||
@@ -151,7 +150,7 @@ arrive from a platform rather than completing a mandatory registration).
|
||||
**persisted** per account (`accounts.service_language`, updated on every Telegram
|
||||
login — last-login-wins) and routes the user's out-of-app push back through the right
|
||||
bot (§10) — **except a game event, which routes by the game's own language** (its variant →
|
||||
en/ru, Stage 17), so a game's notification always comes from the game's bot rather than the
|
||||
en/ru), so a game's notification always comes from the game's bot rather than the
|
||||
recipient's latest login bot. The service language is distinct from `preferred_language` (the
|
||||
interface language) and from a game's variant language. Non-Telegram logins (web / email / guest) carry the
|
||||
gateway's default set (`GATEWAY_DEFAULT_SUPPORTED_LANGUAGES`, all variants by default).
|
||||
@@ -170,7 +169,7 @@ arrive from a platform rather than completing a mandatory registration).
|
||||
require one, the same way the robot pool is durable), not a profile: no
|
||||
friends, statistics or history are kept for it, and it is restricted to
|
||||
auto-match. Platform and email users are auto-provisioned **durable** accounts
|
||||
with an identity. (Reaping abandoned guest rows is deferred — PLAN.md TODO-3.)
|
||||
with an identity.
|
||||
|
||||
## 4. Accounts, identities, linking & merge
|
||||
|
||||
@@ -179,15 +178,15 @@ arrive from a platform rather than completing a mandatory registration).
|
||||
a platform auto-provisions a durable account bound to that platform identity.
|
||||
Concretely, platform and email identities share one `identities` table keyed by
|
||||
a unique `(kind, external_id)`; email is an identity with `kind=email` and a
|
||||
`confirmed` flag. A synthetic `kind='robot'` identity (Stage 5) backs each pooled
|
||||
robot opponent (§7). The **email confirm-code flow** (Stage 4) binds an email to the
|
||||
`confirmed` flag. A synthetic `kind='robot'` identity backs each pooled
|
||||
robot opponent (§7). The **email confirm-code flow** binds an email to the
|
||||
authenticated account: a 6-digit code (stored only as a SHA-256 hash, 15-minute
|
||||
TTL, ≤ 5 attempts) is sent through a `Mailer` seam (an SMTP relay, or a
|
||||
development log mailer when none is configured) and, once verified, attaches a
|
||||
confirmed email identity. Accounts and identities use application-generated
|
||||
**UUIDv7** primary keys. A service flag `paid_account` (lifetime one-time
|
||||
payment; no purchase flow yet) is carried on the account and ORed on a merge.
|
||||
- **Linking** (Stage 11) is initiated from an authenticated profile and proves
|
||||
- **Linking** is initiated from an authenticated profile and proves
|
||||
control of the identity before attaching it: **email** through the confirm-code
|
||||
flow, **Telegram** through the web **Login Widget** (validated by the connector,
|
||||
HMAC under `SHA-256(bot_token)` — distinct from Mini App initData; the gateway
|
||||
@@ -210,7 +209,7 @@ arrive from a platform rather than completing a mandatory registration).
|
||||
already has a **durable** owner: then the durable account wins, the guest's active
|
||||
games move into it, the guest is retired, and a **fresh session** is minted for the
|
||||
durable account (the client switches to it). The secondary's sessions are revoked
|
||||
(§3). High blast-radius; an isolated, well-tested stage.
|
||||
(§3). High blast-radius; isolated and well-tested.
|
||||
|
||||
## 5. Game engine integration (`scrabble-solver`)
|
||||
|
||||
@@ -241,8 +240,7 @@ Key points:
|
||||
boot version plus each subdirectory). In-flight games keep their pinned version;
|
||||
new games use the latest. (The solver is published as a versioned module and the
|
||||
dictionaries ship as a separate versioned **release artifact** from the
|
||||
`scrabble-dictionary` repo — TODO-1/TODO-2, Stage 14; the runtime contract above is
|
||||
unchanged.)
|
||||
`scrabble-dictionary` repo; the runtime contract above is unchanged.)
|
||||
- Move generation/validation/scoring use `Solver.GenerateMoves` (ranked),
|
||||
`Solver.ValidatePlay` and `Solver.ScorePlay`; board mutation uses
|
||||
`scrabble.Apply`. The engine adds its own deterministic, seeded tile **bag**
|
||||
@@ -258,7 +256,7 @@ Key points:
|
||||
unconditionally the other player in a two-player game. A player may resign **on the
|
||||
opponent's turn** (a forfeit is not a turn-scoped move): `engine.ResignSeat(seat)`
|
||||
resigns that player's own seat whoever is to move, and the game domain skips the turn
|
||||
check for resign (Stage 17). The engine exposes a
|
||||
check for resign. The engine exposes a
|
||||
decoded, solver-free API (`SubmitPlay`/`SubmitExchange`/`EvaluatePlay`/
|
||||
`HintView`/`Hand`) so `internal/game` drives it without importing the solver.
|
||||
- The **game domain** (`internal/game`) owns everything the engine does not —
|
||||
@@ -326,7 +324,7 @@ behaviour on every scan and after a restart — the same philosophy as journal
|
||||
replay. A pool of durable accounts — each a `kind='robot'` identity (§4), keyed
|
||||
`robot-<lang>-<index>` and provisioned at startup with **chat blocked but friend
|
||||
requests open** — a request to a robot is accepted as pending and expires unanswered
|
||||
(the robot never responds), mirroring a human who ignores it (Stage 17); the chat
|
||||
(the robot never responds), mirroring a human who ignores it; the chat
|
||||
block backs the human-like names (there is no DM surface; chat is per-game). Names are
|
||||
**composed per language** from a first-name pool (32 full + 32 colloquial forms) and
|
||||
a surname pool (gender-agreed for Russian) in one of three forms (first only /
|
||||
@@ -352,12 +350,12 @@ English game the Latin pool.
|
||||
rather than running anti-phase; on a daytime nudge it replies near the move's lower
|
||||
band; it proactively nudges the idle human on a **lengthening, randomized schedule** — the
|
||||
first ~60-90 min into the turn, each later reminder spaced further out toward 1-6 h — so a long
|
||||
wait gets a handful of increasingly-spaced nudges rather than an hourly stream (Stage 17).
|
||||
wait gets a handful of increasingly-spaced nudges rather than an hourly stream.
|
||||
- **Observability**: robot accounts accrue ordinary statistics (§9) — the
|
||||
authoritative balance metric (target ≈ 40% robot wins) — and a
|
||||
`robot_games_finished_total` OTel counter plus a per-finish log give a live view.
|
||||
The **admin game card** surfaces each robot seat's per-game play-to-win intent (from
|
||||
the seed) and, on the robot's turn, its deterministic **next-move ETA** (Stage 17).
|
||||
the seed) and, on the robot's turn, its deterministic **next-move ETA**.
|
||||
|
||||
## 8. Lobby & social
|
||||
|
||||
@@ -371,8 +369,8 @@ English game the Latin pool.
|
||||
`Poll` remains as a fallback for a client that is not currently streaming.
|
||||
**Cancel** (`POST /lobby/cancel`) removes the player from the pool and drops any
|
||||
pending matched result, so a cancelled quick-match is dequeued rather than left for
|
||||
the reaper to robot-substitute (Stage 17).
|
||||
- **Friends** (Stage 8): two add paths over one `friendships` table. A **one-time
|
||||
the reaper to robot-substitute.
|
||||
- **Friends**: two add paths over one `friendships` table. A **one-time
|
||||
code** the to-be-added player issues (a `friend_codes` row: 6-digit numeric,
|
||||
SHA-256-hashed, **12 h** TTL, one live code per issuer, single-use, redeem
|
||||
rate-limited) is redeemed by the other player to become friends immediately.
|
||||
@@ -382,7 +380,7 @@ English game the Latin pool.
|
||||
remembered (`status='declined'`) and blocks further requests from that sender,
|
||||
unless they hand them a code, which overrides it. The requester's own cancel still
|
||||
deletes the row; blocking someone severs an existing friendship. (Discovery by
|
||||
friend list or platform deep-link arrives with Stage 9 / TODO-5.)
|
||||
friend list or platform deep-link is future work.)
|
||||
- **Block**: two independent **global** account toggles (`block_chat`,
|
||||
`block_friend_requests`) **plus** a **per-user block list**. A per-user block is
|
||||
applied mutually: it hides the pair's chat from each other and refuses friend
|
||||
@@ -395,25 +393,25 @@ English game the Latin pool.
|
||||
and **validated on input** — links, email addresses and phone numbers (including
|
||||
lightly obfuscated forms) are rejected, since the chat is for quick reactions,
|
||||
not contact exchange. Each message stores the sender's IP (forwarded by the
|
||||
gateway in Stage 6) for moderation. A sender who has disabled chat cannot post,
|
||||
gateway) for moderation. A sender who has disabled chat cannot post,
|
||||
and messages from a blocked sender are hidden from the viewer. The operator console
|
||||
has a **Messages** section (Stage 17) that lists posted messages (nudges excluded)
|
||||
has a **Messages** section that lists posted messages (nudges excluded)
|
||||
newest-first with the sender's resolved name, **source** (guest / robot / oldest
|
||||
identity kind), IP and game, searchable by sender name / external-id glob masks and
|
||||
pinnable to one game or sender (linked from the game and user cards).
|
||||
- **Nudge**: folded into the chat as a `nudge` message kind. The player awaiting
|
||||
the opponent may nudge **once per hour per game**; it is not allowed on one's own
|
||||
turn. The platform-native delivery is wired with the gateway / platform
|
||||
side-service (Stage 6 / 8).
|
||||
turn. The platform-native delivery runs through the gateway and the platform
|
||||
side-service.
|
||||
- **Profile**: `preferred_language` (en/ru, edited in Settings), display name, email
|
||||
(confirm-code binding, see §4), **timezone**, the daily **away window** and the
|
||||
block toggles — all editable through `account.UpdateProfile`, which validates them
|
||||
(Stage 8): a display name is Unicode letters joined by single ` `/`.`/`_`
|
||||
block toggles — all editable through `account.UpdateProfile`, which validates them:
|
||||
a display name is Unicode letters joined by single ` `/`.`/`_`
|
||||
separators (no leading/trailing/adjacent separators, ≤ 32 runes); the timezone is a
|
||||
fixed `±HH:MM` **UTC offset** (or a legacy IANA name) resolved by `account.ResolveZone`
|
||||
for the sweeper and the robot's sleep (a fixed offset trades DST for a simple
|
||||
picker); the away window is at most **12 h** (midnight-wrap aware). Linked platform
|
||||
accounts and merge are Stage 11.
|
||||
accounts and merge are covered in §4.
|
||||
|
||||
## 9. Persistence
|
||||
|
||||
@@ -423,23 +421,22 @@ English game the Latin pool.
|
||||
into `internal/postgres/jet` and committed, regenerated by `cmd/jetgen`).
|
||||
Migrations are embedded SQL applied with `pressly/goose/v3` at startup. Primary
|
||||
keys are application-generated **UUIDv7**.
|
||||
- Tables: `accounts` (durable internal accounts; Stage 3 added the away-window
|
||||
columns `away_start`/`away_end` and the hint wallet `hint_balance`; Stage 6's
|
||||
migration `00005` added the `is_guest` flag for ephemeral guest rows; Stage 9's
|
||||
migration `00007` added the `notifications_in_app_only` out-of-app push toggle;
|
||||
Stage 11's migration `00009` added the `paid_account` service flag and the
|
||||
merge-tombstone columns `merged_into`/`merged_at`),
|
||||
`identities` (platform/email/robot identities, unique `(kind, external_id)`;
|
||||
Stage 5's migration `00004` admits the `robot` kind),
|
||||
`sessions` (revoke-only opaque-token hashes), the Stage 3 game tables
|
||||
`games` (Stage 4 added the `dropout_tiles` disposition column), `game_players`,
|
||||
- Tables: `accounts` (durable internal accounts, carrying the away-window
|
||||
columns `away_start`/`away_end`, the hint wallet `hint_balance`, the `is_guest`
|
||||
flag for ephemeral guest rows, the `notifications_in_app_only` out-of-app push
|
||||
toggle, the `paid_account` service flag and the merge-tombstone columns
|
||||
`merged_into`/`merged_at`),
|
||||
`identities` (platform/email/robot identities, unique `(kind, external_id)`,
|
||||
the `kind` admitting `robot`),
|
||||
`sessions` (revoke-only opaque-token hashes), the game tables
|
||||
`games` (carrying the `dropout_tiles` disposition column), `game_players`,
|
||||
`game_moves` (the move journal), `complaints` and `account_stats`, and the
|
||||
Stage 4 social/lobby tables `friendships` (the request/accept graph), `blocks`
|
||||
social/lobby tables `friendships` (the request/accept graph, its status admitting
|
||||
`declined`), `blocks`
|
||||
(per-user blocks), `chat_messages` (per-game chat and nudges), `email_confirmations`
|
||||
(pending confirm-codes) and `game_invitations` / `game_invitation_invitees`
|
||||
(friend-game invitations). Stage 8's migration `00006` widened the `friendships`
|
||||
status to admit `declined` and added `friend_codes` (one-time add-a-friend codes).
|
||||
Stage 17 added `game_drafts` (a player's in-progress rack order + board composition per
|
||||
(pending confirm-codes), `game_invitations` / `game_invitation_invitees`
|
||||
(friend-game invitations), `friend_codes` (one-time add-a-friend codes),
|
||||
`game_drafts` (a player's in-progress rack order + board composition per
|
||||
game) and `game_hidden` (`(account_id, game_id)` rows that drop a finished game from one
|
||||
account's own lobby list, leaving it visible to the other players — finished-only and
|
||||
irreversible by design, so there is no un-hide).
|
||||
@@ -479,18 +476,18 @@ exposes none): the standard Poslfit dialect (UTF-8, `#player`/`#lexicon`
|
||||
pragmas, `8G`/`H8` coordinates, lower-case blanks, `.` pass-throughs, `-TILES`
|
||||
exchanges), plus `#note` lines for resignations and timeouts, which the standard
|
||||
does not cover. **GCG export is offered only on a finished game** (`game.ErrGameActive`
|
||||
otherwise, Stage 8), so an in-progress journal is never leaked mid-play; the client
|
||||
otherwise), so an in-progress journal is never leaked mid-play; the client
|
||||
shares the `.gcg` file via the Web Share API where available, else downloads it.
|
||||
|
||||
The Stage 13 alphabet-on-the-wire change does **not** touch this invariant: the live edge
|
||||
The alphabet-on-the-wire transport does **not** touch this invariant: the live edge
|
||||
exchanges alphabet indices, but the persisted journal (and everything derived from it —
|
||||
replay, history, GCG) keeps the decoded concrete letters described above, so an archived
|
||||
game still replays with the variant's `rules.Alphabet` alone, independent of any dictionary.
|
||||
|
||||
## 10. Notifications
|
||||
|
||||
Two channels: the **in-app live stream** (delivered from Stage 6) and
|
||||
**platform-native push** (out-of-app, via the platform side-service — Stage 9).
|
||||
Two channels: the **in-app live stream** and
|
||||
**platform-native push** (out-of-app, via the platform side-service).
|
||||
The backend emits notification intents through an in-process hub
|
||||
(`internal/notify`, a `Publisher` seam installed on the game, social and lobby
|
||||
services); a single backend→gateway **gRPC server-stream** (`Push.Subscribe`,
|
||||
@@ -501,16 +498,16 @@ robot-driver and timeout-sweeper moves emit too; opponent-moved goes to **every
|
||||
including the mover**, so the mover's own other devices and their lobby refresh — it is
|
||||
in-app only, so the actor gets no out-of-app push for their own move), **chat-message** and **nudge**
|
||||
(from the social service), **match-found** (from the matchmaker, §8), and **notify**
|
||||
(Stage 8 — a lightweight "re-poll" signal carrying a sub-kind: friend-request,
|
||||
(a lightweight "re-poll" signal carrying a sub-kind: friend-request,
|
||||
friend-added, friend-declined, invitation or game-started; emitted on a friend-request,
|
||||
on answering one (accept → friend-added, decline → friend-declined — to the original
|
||||
requester, so a game screen watching that opponent re-derives its "add to friends" state,
|
||||
Stage 17), and on an invitation create or its game start). Stage 17 added **game-over** (emitted to every
|
||||
requester, so a game screen watching that opponent re-derives its "add to friends" state),
|
||||
and on an invitation create or its game start). **game-over** is emitted to every
|
||||
seat from the same game commit when a game finishes — any path: a closing play, all-pass,
|
||||
resign or timeout) and **enriched your-turn** so the out-of-app push reads in full: it now
|
||||
resign or timeout — and **your-turn** is enriched so the out-of-app push reads in full: it
|
||||
also carries the mover's display name, their last action and the main word of a scoring play,
|
||||
and a **recipient-first** running score line (e.g. `120:95:80`, the reader's score first).
|
||||
**R4 enriched the in-app stream into a delta channel** so the client renders from the event
|
||||
The in-app stream is a **delta channel** so the client renders from the event
|
||||
without a follow-up `game.state`: **opponent-moved** carries the committed move plus the post-move
|
||||
summary (per-seat scores, whose turn, move count, status) and the bag size, which the client
|
||||
applies to its per-game cache keyed on the **move count** — idempotent (a re-delivered or own-move
|
||||
@@ -526,12 +523,12 @@ verbatim. A client that is not currently streaming falls back to the matchmaker'
|
||||
match-found — the client polls **only while the stream is down**, since a live stream delivers
|
||||
match-found itself; for the lobby **notification badge** (incoming friend requests + open
|
||||
invitations) the client re-polls on the `notify` event and on lobby open / focus, covering a push
|
||||
missed while the app was hidden. **Out-of-app platform push** (Stage 9) is a fallback
|
||||
missed while the app was hidden. **Out-of-app platform push** is a fallback
|
||||
the **gateway** routes from the same firehose: for an event whose recipient has **no
|
||||
live in-app stream** it resolves the backend `/internal/push-target` (their Telegram
|
||||
`external_id`, the **service language** — the bot they last signed in through, falling
|
||||
back to the interface language — and the `notifications_in_app_only` flag). A **game** event,
|
||||
however, carries the **game's own language** on the push (Stage 17), and the gateway routes by
|
||||
however, carries the **game's own language** on the push, and the gateway routes by
|
||||
that instead of the service language — so a game's notification always comes from the game's bot,
|
||||
not the recipient's latest-login bot. It then asks the **Telegram connector** to deliver a
|
||||
localized message with a Mini App deep-link button — only when the recipient has a Telegram
|
||||
@@ -560,19 +557,19 @@ promotions) is future work and would deliver short markdown messages (text + lin
|
||||
and the gateway↔connector calls. The OTLP **Collector** (OTLP/gRPC → Prometheus
|
||||
metrics + Tempo traces), **Prometheus** (15d), **Tempo** (72h) and **Grafana**
|
||||
(provisioned datasources + dashboards, behind the caddy `/_gm/grafana` Basic-Auth)
|
||||
are stood up with the deploy (`deploy/`, Stage 16); the default exporter stays
|
||||
are stood up with the deploy (`deploy/`); the default exporter stays
|
||||
`none`, so CI needs no collector. The contour also runs **cAdvisor** (per-container
|
||||
CPU/memory/network) and **postgres_exporter** (connections, cache-hit ratio,
|
||||
transactions, db size), scraped by Prometheus and surfaced on the **Scrabble —
|
||||
Resources** Grafana dashboard (R2), so the pre-release stress runs capture a resource
|
||||
Resources** Grafana dashboard, which captures a resource
|
||||
baseline; these export directly in Prometheus format (not through the collector).
|
||||
- Per-request server-side timing via gin middleware from day one (the access log
|
||||
carries method, route, status, latency and the active trace id). A
|
||||
client-measured RTT piggybacked on the next request is a later enhancement.
|
||||
- Domain/operational metrics (Stage 12), recorded through the meter and invisible
|
||||
- Domain/operational metrics, recorded through the meter and invisible
|
||||
until an exporter is configured: histograms `game_replay_duration` (journal
|
||||
rebuild on a cache miss), `game_move_validate_duration` and `game_move_duration`
|
||||
(Stage 17 — a seat's think time per committed move, attributed by `variant` and a
|
||||
(a seat's think time per committed move, attributed by `variant` and a
|
||||
`phase` of opening/middle/endgame; it aggregates **all** seats including robots,
|
||||
whose synthetic timing dominates the tail, so per-human analysis lives in the admin
|
||||
console, below); counters `games_started_total`, `games_abandoned_total` (a
|
||||
@@ -581,18 +578,18 @@ promotions) is future work and would deliver short markdown messages (text + lin
|
||||
`edge_request_duration` (the UI-perceived roundtrip, by `message_type`/`result`);
|
||||
and Go runtime/heap metrics. Game-scoped metrics carry a `variant` attribute
|
||||
(scrabble_en/scrabble_ru/erudit_ru).
|
||||
- Per-user move-time analytics (Stage 17) are **offline**, derived in the admin
|
||||
- Per-user move-time analytics are **offline**, derived in the admin
|
||||
console from the move journal (`game_moves.created_at` deltas, the first move from
|
||||
the game's creation), not Prometheus labels (which an `account_id` would explode):
|
||||
the user list shows each account's min/avg/max think time, and the user-detail page
|
||||
draws a zero-JS inline-SVG chart of min/mean/max by the player's move number.
|
||||
- User metrics (Stage 16): a backend counter `accounts_created_total` (`kind` =
|
||||
- User metrics: a backend counter `accounts_created_total` (`kind` =
|
||||
telegram/email/guest; robots are a provisioned pool, not users, and are excluded)
|
||||
and a gateway **in-memory** observable gauge `active_users` (`window` = 24h/7d) —
|
||||
distinct accounts that performed an authenticated edge action in the window. The
|
||||
gauge is single-process by design (single-instance MVP, §10): it is correct for one
|
||||
gateway, resets on restart, and is a live operational figure, not a billing count.
|
||||
- **Rate-limit observability (R3):** every limiter rejection increments the gateway
|
||||
- **Rate-limit observability:** every limiter rejection increments the gateway
|
||||
counter `gateway_rate_limited_total` (`class` = user/public/email/admin — aggregate
|
||||
only, honouring the no-per-user-label discipline above) and logs one **Debug** line;
|
||||
a gateway reporter drains the per-key rejection tracker every 30 s, emits one **Warn**
|
||||
@@ -617,19 +614,19 @@ promotions) is future work and would deliver short markdown messages (text + lin
|
||||
|
||||
| Concern | Enforced by |
|
||||
| --- | --- |
|
||||
| Public rate limiting / anti-abuse | gateway (per-IP public/email/admin classes, per-user authenticated class; a request body cap of `GATEWAY_MAX_BODY_BYTES`; rejections are metered, summarised to the backend and surfaced in the admin console with a conservative reversible auto-flag — R3, §11) |
|
||||
| Public rate limiting / anti-abuse | gateway (per-IP public/email/admin classes, per-user authenticated class; a request body cap of `GATEWAY_MAX_BODY_BYTES`; rejections are metered, summarised to the backend and surfaced in the admin console with a conservative reversible auto-flag — §11) |
|
||||
| Telegram initData validation (bot-token HMAC) | the Telegram connector; the gateway delegates it over gRPC, so the bot token lives only in the connector |
|
||||
| Session minting; email-code / guest validation | gateway (with backend) |
|
||||
| Session → `user_id` resolution, `X-User-ID` injection | gateway |
|
||||
| Authorisation, ownership, state transitions | backend (`X-User-ID` is the sole identity input) |
|
||||
| Admin authentication | a single Basic-Auth gate on `/_gm/*`, forwarded **verbatim** to the backend's server-rendered admin console (and, in the deployed contour, routing `/_gm/grafana/*` to Grafana). In the deploy the **caddy** owns this gate (§13); a local non-caddy run uses the gateway's own `GATEWAY_ADMIN_*` proxy, which the per-IP admin limiter class guards ahead of its Basic-Auth (R3) — the caddy-fronted path has no limiter (stock caddy), an accepted gap. The backend trusts the proxy (no admin principal) and guards its state-changing POSTs with a **same-origin** check — the console's CSRF defence. No operator identity is tracked |
|
||||
| Admin authentication | a single Basic-Auth gate on `/_gm/*`, forwarded **verbatim** to the backend's server-rendered admin console (and, in the deployed contour, routing `/_gm/grafana/*` to Grafana). In the deploy the **caddy** owns this gate (§13); a local non-caddy run uses the gateway's own `GATEWAY_ADMIN_*` proxy, which the per-IP admin limiter class guards ahead of its Basic-Auth — the caddy-fronted path has no limiter (stock caddy), an accepted gap. The backend trusts the proxy (no admin principal) and guards its state-changing POSTs with a **same-origin** check — the console's CSRF defence. No operator identity is tracked |
|
||||
| backend ↔ gateway ↔ connector trust | the network (only gateway may reach backend; the connector serves unauthenticated gRPC on the internal segment) |
|
||||
|
||||
This is an explicit, accepted MVP risk: compromise of the gateway↔backend
|
||||
network segment defeats backend authentication. Mitigated by network isolation;
|
||||
mutual auth is a future hardening step.
|
||||
|
||||
**Short numeric codes** (email confirm-codes and Stage 8 friend codes) are stored
|
||||
**Short numeric codes** (email confirm-codes and friend codes) are stored
|
||||
only as SHA-256 hashes and are short-lived and single-use. The unauthenticated
|
||||
email path carries a tight per-IP sub-limit (5 / 10 min); the **friend-code redeem**
|
||||
is authenticated, so it rides the per-user limit (300 / min) and is further bounded
|
||||
@@ -645,7 +642,7 @@ Single public origin, path-routed. The Vite build has two entries: a lightweight
|
||||
(`go:embed`, baked in by a node stage in `gateway/Dockerfile`) and serves it at
|
||||
`/app/` (web) and `/telegram/` (the Telegram Mini App; outside Telegram that path
|
||||
redirects to the root — the client-side guard); a stray hit on the gateway's `/`
|
||||
308-redirects to `/app/`. The **landing** ships in its own static container (R3): the
|
||||
308-redirects to `/app/`. The **landing** ships in its own static container: the
|
||||
`landing` target of `gateway/Dockerfile` (caddy:2-alpine + the same Vite build,
|
||||
`deploy/landing/Caddyfile`) serves it at `/`, so stray public traffic is absorbed by
|
||||
static file serving and never reaches the Go edge. Hash-named `/assets/*` are served
|
||||
@@ -671,23 +668,23 @@ network (project-scoped DNS); only caddy joins the shared external `edge` networ
|
||||
(alias `scrabble`).
|
||||
|
||||
Two contours, two secret/variable prefixes (`TEST_` / `PROD_`):
|
||||
- **Test** (Stage 16): auto-deploys on a PR into — or a push to — `development`
|
||||
- **Test**: auto-deploys on a PR into — or a push to — `development`
|
||||
(`.gitea/workflows/ci.yaml` → `docker compose up -d --build` on the Gitea runner
|
||||
host, then `GET /` + `GET /app/` probes through caddy — the landing container and
|
||||
the gateway, R3). The host caddy terminates TLS and
|
||||
the gateway). The host caddy terminates TLS and
|
||||
forwards the domain to `scrabble:80`, so the in-compose caddy serves plain HTTP
|
||||
(`CADDY_SITE_ADDRESS=:80`). The in-compose caddy **trusts X-Forwarded-For from
|
||||
private-range upstreams** (`trusted_proxies private_ranges`), so the real client IP —
|
||||
used for chat-moderation logging and the gateway's per-IP rate limiting — survives the
|
||||
host-caddy hop; in prod (no host caddy) public clients are untrusted and Caddy uses the
|
||||
real peer, so the single config is correct and spoof-safe in both contours (Stage 17).
|
||||
- **Prod** (Stage 18): a manual SSH deploy after `development → master`. There is no
|
||||
real peer, so the single config is correct and spoof-safe in both contours.
|
||||
- **Prod**: a manual SSH deploy after `development → master`. There is no
|
||||
host caddy, so the contour ships its own caddy terminating TLS — set
|
||||
`CADDY_SITE_ADDRESS` to the domain and the caddy does its own ACME.
|
||||
|
||||
## 14. CI & branches
|
||||
|
||||
- **Two long-lived branches** (Stage 16): **`development`** is the integration
|
||||
- **Two long-lived branches**: **`development`** is the integration
|
||||
trunk and **`master`** the production trunk; `feature/*` branches are cut from
|
||||
`development` and PR back into it (the genesis commit necessarily landed on
|
||||
`master`). A commit to a `feature/*` branch triggers nothing.
|
||||
@@ -695,18 +692,18 @@ Two contours, two secret/variable prefixes (`TEST_` / `PROD_`):
|
||||
suite on a PR into `development`/`master` and on a push to `development`. Its
|
||||
`unit` (gofmt/vet/build/unit-test), `integration` (Postgres-backed `integration`
|
||||
tag, testcontainers `postgres:17-alpine`, Ryuk off, serial) and `ui`
|
||||
(check/unit/build/bundle-budget/e2e) jobs are **path-conditional** (Stage 17 — a
|
||||
(check/unit/build/bundle-budget/e2e) jobs are **path-conditional** (a
|
||||
`changes` job filters by changed paths), and an always-running **`gate`** job
|
||||
aggregates them (passing when each succeeded or was **skipped**) and is the single
|
||||
branch-protection required check (`CI / gate`), so a path-skipped job never blocks
|
||||
a merge.
|
||||
- A gated **`deploy`** job auto-rolls the **test contour** on a PR into — or a push
|
||||
to — `development` (`docker compose up -d --build` on the runner host), then probes
|
||||
the gateway (`GET /`) **and the Telegram connector's liveness** (Stage 17 —
|
||||
the gateway (`GET /`) **and the Telegram connector's liveness** (via
|
||||
`docker inspect`: running, not restarting, stable restart count, with a
|
||||
VPN-handshake grace period, since the connector has no public ingress and a
|
||||
crash-loop is otherwise invisible). A PR into `master` is test-only; the prod
|
||||
deploy is the manual Stage 18 workflow. Secrets/variables are prefixed
|
||||
deploy is the manual workflow. Secrets/variables are prefixed
|
||||
`TEST_`/`PROD_` per contour.
|
||||
- The engine consumes `scrabble-solver` as a **published, versioned module**
|
||||
(`gitea.iliadenisov.ru/developer/scrabble-solver`, pinned in `backend/go.mod`); both Go
|
||||
@@ -714,6 +711,6 @@ Two contours, two secret/variable prefixes (`TEST_` / `PROD_`):
|
||||
(no public proxy/checksum DB, no sibling clone). The dictionaries ship as a **release
|
||||
artifact** from the `scrabble-dictionary` repo; the workflows download
|
||||
`scrabble-dawg-<DICT_VERSION>.tar.gz` and point the engine tests at it via
|
||||
`BACKEND_DICT_DIR` (TODO-1/TODO-2 discharged in Stage 14).
|
||||
`BACKEND_DICT_DIR`.
|
||||
- After any push, the run is watched to green before a stage is declared done
|
||||
(`python3 ~/.claude/bin/gitea-ci-watch.py`).
|
||||
|
||||
Reference in New Issue
Block a user