Pre-release plan — hardening before Stage 18
Living tracker for the pre-release hardening pass that runs before Stage 18 (the
prod cutover). Same discipline as PLAN.md: one phase per session,
interview the owner on the open details at the start of each phase, bake every
decision back into PLAN.md / docs/ / the affected READMEs / Go Doc comments in
the same PR, get CI green, then mark the phase done. Phases run as
feature/* → development PRs (the Stage 16 branch model); the owner approves+merges.
Why now: the system is feature-complete through Stage 17 and the test contour is green, but there is no prod data yet — schema, wire labels and the dictionary layout can still change for free. These phases spend that one-time freedom and harden the edge before prod. Each phase maps back to the owner's raw pre-release TODO list (numbers in the tracker).
Phase tracker
| # | Phase | Raw TODOs | Status |
|---|---|---|---|
| R1 | Schema & naming reset | 1 + 10 | done |
| R2 | Stress harness + contour observability + early run | 9a | done |
| R3 | Edge hardening | 2 + 8 + 3 | done |
| R4 | Push enrichment + kill the last poll | 4 + 5 | done |
| R5 | Bundle slimming | 6 | done |
| R6 | Refactor + docs reconciliation + de-staging | 7 | done |
| R7 | Final stress run + tuning | 9b | todo |
| → | Stage 18 — prod contour deploy | — | see PLAN.md |
Key findings (these reshaped the raw list — read before starting a phase)
- R1 (TODO 1 + 10) is one cheap moment, now. Squashing the 12 goose migrations is safe precisely because there is no prod data and the contour DB is wiped. Folding the new variant labels (scrabble_ru/scrabble_en/erudit_ru) into that single baseline makes the rename need no data migration and no back-compat mapping. Today's labels (english/russian_scrabble/erudit) are persisted in games.variant, game_invitations.variant, in pkg/fbs and the UI: ~100 files, but a mechanical sweep on a clean DB.
- R4 (TODO 4 + 5): the app is already push-first. Game state refreshes on your_turn/opponent_moved, the lobby on notify, chat on chat_message. The only genuine periodic server poll is lobby.poll (matchmaking, 2.5 s, ui/src/screens/NewGame.svelte). What remains is killing that one poll and enriching push events to carry payloads so the UI stops re-fetching after each signal.
- R3 (TODO 2): identity forgery is already mitigated. Identity is always derived from the session (Authorization: Bearer → X-User-ID); the client cannot inject identity, the backend re-validates resource ownership, Telegram initData is HMAC-checked. The real gaps are a missing request-body size limit (cheap DoS) and invisible rate-limit rejections (no log/metric/admin view; that is TODO 8). Static landing serving is not covered by the gateway token bucket (it only guards Execute).
- R6 (TODO 7) scale: ~431 Stage N references across ~104 files (incl. the file name backend/internal/inttest/stage6_test.go). Code is the source of truth; docs/ describe current state; PLAN.md keeps the decision history.
Locked decisions (owner interview)
- Stress test (TODO 9): early + final runs. Driver = edge protocol (Connect/FB through the gateway, moves generated by the solver) plus a separate gateway-hammer saturation test. Pacing = realistic (under limits) + saturation (ramp to the knee). Resource metrics = add cAdvisor + postgres_exporter to the contour (today only Go-runtime metrics exist). The harness stays in the repo for repeats.
- Push (TODO 4 + 5): both. Kill lobby.poll (use the existing match_found, keep poll as the ws-down fallback) and enrich push events with payloads.
- Refactor (TODO 7): hygiene + structural changes by a reviewed list; behaviour-preserving, test-gated, contentious items surfaced to the owner before applying.
- Landing (TODO 3): separate static container behind the project caddy (/ → landing, /app/ + /telegram/ → gateway); drop landing.html from the gateway go:embed.
- Rate-abuse (TODO 8): metric + Grafana + admin view plus a conservative auto-flag: a soft, reversible "suspected high-rate" marker for operator review, tunable threshold, no auto-ban.
Phases
Each phase: read this tracker + the relevant docs/, interview the owner on the open
details below, implement within scope, then update the tracker + docs/code and get CI
green before marking it done.
R1 — Schema & naming reset (TODO 1 + 10) — first
Squash backend/internal/postgres/migrations/00001..00012 into one 00001_baseline.sql
(method: pg_dump --schema-only from a fully-migrated DB → wrap as the goose baseline →
prove a fresh migrate yields a schema identical to the 12-migration chain via the
integration suite → delete the old files; keep goose). Bake the new variant labels into the
baseline. Propagate scrabble_ru/scrabble_en/erudit_ru through the backend
(engine.Variant/ParseVariant, registry.dictFiles, the CHECK values), the wire
(pkg/fbs variant:string, regenerate FB) and the UI (lib/model.ts union, variants.ts,
fixtures, premium/alphabet keys, tests); i18n display keys stay display-only. Tidy
../scrabble-dictionary to a single source→dawg build point and align the dawg artifact
names to the new labels (crosses into ../scrabble-solver's committed fixtures — keep them
byte-identical). After merge, wipe the contour DB (drop the volume) so it re-provisions
on the next deploy.
- Critical files: backend/internal/postgres/migrations/, backend/internal/engine/{engine,registry}.go, pkg/fbs/scrabble.fbs, ui/src/lib/{model,variants}.ts, ../scrabble-dictionary/{Makefile,cmd/builddict,…}.
- Open details to interview: the exact dawg filename scheme; whether the dict-repo tidy is one PR or split; how to script the contour DB wipe in the deploy.
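To make the label propagation concrete, here is a minimal Go sketch of how a variant enum could map to the new wire labels. It is an illustration, not the engine package: VariantEnglish is named in this document, while VariantRussian, VariantErudit and the variantLabels map are hypothetical. Consistent with the "no back-compat mapping" finding, the old labels are simply rejected.

```go
package main

import "fmt"

// Variant identifies a game variant. Only VariantEnglish is named in this
// document; the other two identifiers are hypothetical.
type Variant int

const (
	VariantEnglish Variant = iota
	VariantRussian
	VariantErudit
)

// variantLabels holds the persisted/wire labels introduced by R1.
var variantLabels = map[Variant]string{
	VariantEnglish: "scrabble_en",
	VariantRussian: "scrabble_ru",
	VariantErudit:  "erudit_ru",
}

// String returns the wire label, or a diagnostic form for unknown values.
func (v Variant) String() string {
	if s, ok := variantLabels[v]; ok {
		return s
	}
	return fmt.Sprintf("Variant(%d)", int(v))
}

// ParseVariant accepts only the new labels; the pre-R1 labels are errors.
func ParseVariant(s string) (Variant, error) {
	for v, label := range variantLabels {
		if label == s {
			return v, nil
		}
	}
	return 0, fmt.Errorf("unknown variant %q", s)
}

func main() {
	for _, s := range []string{"scrabble_ru", "english"} {
		if v, err := ParseVariant(s); err != nil {
			fmt.Println("rejected:", err)
		} else {
			fmt.Println("parsed:", v)
		}
	}
}
```

Because every consumer derives the label from String, changing the map is the only edit needed on the Go side; the enum identifiers can stay as they are.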
R2 — Stress harness + contour observability + early run (TODO 9, part 1)
Build the reusable load harness as a new loadtest module in go.work (reuses pkg/fbs,
connect-go, and scrabble-solver for legal-move generation): a seeder that inserts
1000 guest + 10000 durable accounts with pre-created sessions (token hashes) directly in
the DB and hands the plaintext tokens to the client; a driver that runs N virtual users,
each in 3–5 concurrent 2–4-player games, exercising submit-play / pass / exchange / nudge /
chat / check-word / draft-move / profile-save through the edge protocol, in
realistic (under rate limits) and saturation (ramp) modes; plus a separate
gateway-hammer that deliberately exceeds limits to verify the limiter holds and measure
its cost. Add cAdvisor + postgres_exporter to deploy/docker-compose.yml and a Grafana
resource dashboard. Run the early pass against the freshly-wiped contour; produce a
trip report (logic/concurrency bugs + a resource baseline) that feeds R3 and R6.
- Critical files: new loadtest/, deploy/docker-compose.yml, deploy/observability/*, docs/TESTING.md.
- Open details: the scale ramp steps; the move-selection policy (a mid-ranked solver move for realistic game progress); run duration; the pass/fail bar.
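The mid-ranked move-selection policy mentioned in the open details could look roughly like the Go sketch below. It assumes the solver returns scored candidates; the move type and pickMidRanked are hypothetical names, and choosing the median by score is one possible reading of "mid-ranked", not a confirmed rule.

```go
package main

import (
	"fmt"
	"sort"
)

// move is a hypothetical solver candidate: a word and its score.
type move struct {
	Word  string
	Score int
}

// pickMidRanked ranks candidates by descending score and returns the one in
// the middle. ok is false when there is no legal move, in which case a
// virtual user would presumably pass or exchange instead (an assumption).
func pickMidRanked(cands []move) (m move, ok bool) {
	if len(cands) == 0 {
		return move{}, false
	}
	ranked := append([]move(nil), cands...) // do not reorder the caller's slice
	sort.SliceStable(ranked, func(i, j int) bool {
		return ranked[i].Score > ranked[j].Score
	})
	return ranked[len(ranked)/2], true
}

func main() {
	cands := []move{{"cat", 10}, {"quartz", 40}, {"table", 25}}
	m, ok := pickMidRanked(cands)
	fmt.Println(m.Word, m.Score, ok)
}
```

A median pick avoids both extremes: always playing the best move ends games unrealistically fast, while always playing the worst one drags them out.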
R3 — Edge hardening (TODO 2 + 8 + 3)
Add a request-body size cap at the gateway h2c mux / Execute (e.g. ~1 MB). Add
rate-limit observability: a gateway_rate_limited_total{class} counter + a structured
log per rejection; an aggregate Grafana panel (request rate + rejection rate — spikes
visible without per-user label cardinality, honouring the Stage 12/17 discipline); an
admin-console view of recently throttled users/IPs (in-memory ring buffer,
single-instance, reset-on-restart, like the active_users gauge). Add the conservative
auto-flag: when a user is sustained-throttled past a tunable threshold, set a soft,
reversible account.flagged_high_rate_at marker (baked into the R1 baseline) surfaced in the
admin user list/detail — no auto-ban; the operator clears it. Split the landing into
its own static container (deploy/ + a Caddyfile route / → landing) and drop
landing.html from the gateway go:embed.
- Critical files: gateway/internal/connectsrv/server.go, gateway/internal/ratelimit/, gateway/internal/connectsrv/metrics.go, backend/internal/adminconsole/, deploy/caddy/Caddyfile, deploy/docker-compose.yml, gateway/internal/webui/.
- Open details: the auto-flag threshold/window + whether the marker is persisted vs in-memory; the landing image base (caddy vs nginx).
R4 — Push enrichment + kill the last poll (TODO 4 + 5)
Replace lobby.poll with the existing match_found push (keep the poll as a ws-down
fallback). Enrich your_turn/opponent_moved/notify to carry the state payload so the UI
renders from the event without a follow-up game.state (removes the lobby↔game nav latency
the owner noticed). Wire-contract change: pkg/fbs event payloads → backend notify emit →
UI stream consumers (ui/src/lib/app.svelte.ts), with the per-game cache as the landing
spot; regenerate FB.
- Critical files: pkg/fbs/scrabble.fbs, backend/internal/notify/events.go, ui/src/lib/{app.svelte,transport}.ts, ui/src/screens/NewGame.svelte.
- Open details: which events carry full vs delta payloads; the fallback-poll cadence when the stream is down.
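The delta-first decision logged for R4 (events keyed on move_count, idempotent and gap-safe) boils down to a three-way comparison. The Go sketch below shows that logic only; it is not the UI's TypeScript reducer, the types and the applyDelta name are hypothetical, and it assumes the event's move_count is the count after the move is applied. The "actor's own move" fallback is not modelled here.

```go
package main

import "fmt"

// gameCache is a hypothetical per-game cache: the moves seen so far.
type gameCache struct {
	MoveCount int
	Moves     []string
}

// moveDelta is a hypothetical enriched event carrying only the new move.
type moveDelta struct {
	MoveCount int // assumed: the game's move count after this move
	Move      string
}

type applyResult int

const (
	applied     applyResult = iota // cache advanced by exactly one move
	duplicate                      // already seen: safe to ignore (idempotent)
	needRefetch                    // a gap: fall back to a full state refetch
)

// applyDelta advances the cache only when the delta is the next move in
// sequence. It never mutates the cache on a duplicate or a gap.
func applyDelta(c *gameCache, d moveDelta) applyResult {
	switch {
	case d.MoveCount <= c.MoveCount:
		return duplicate
	case d.MoveCount == c.MoveCount+1:
		c.Moves = append(c.Moves, d.Move)
		c.MoveCount = d.MoveCount
		return applied
	default:
		return needRefetch
	}
}

func main() {
	c := &gameCache{MoveCount: 3}
	fmt.Println(applyDelta(c, moveDelta{MoveCount: 4, Move: "QUARTZ"}) == applied)
	fmt.Println(applyDelta(c, moveDelta{MoveCount: 4, Move: "QUARTZ"}) == duplicate)
	fmt.Println(applyDelta(c, moveDelta{MoveCount: 6, Move: "CAT"}) == needRefetch)
}
```

Redelivered events are harmless and a missed event is detected on the next one, so the stream needs no acknowledgement protocol of its own.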
R5 — Bundle slimming (TODO 6) — done
Analysed the bundle against the 100 KB-gzip budget; no code slimming was warranted, and the budget metric was retargeted to measure the app correctly. The build already minifies + tree-shakes; the dominant cost is the Connect/FlatBuffers transport runtime + generated bindings + the Svelte runtime (≈⅔ of main's source is third-party/generated), irreducible within scope. Lazy-loading was rejected: bundle-size.mjs sums every emitted chunk, so code-splitting yields no total-size win and adds request latency (+N gateway fetches on first navigation to a split screen). i18n lazy-load was skipped (the catalogs are a sliver of a Svelte-runtime-dominated shared chunk, and en must stay bundled as the MessageKey type source + fallback). Instead, bundle-size.mjs now measures per HTML entry, with three independent gates on the natural chunk boundaries (app entry ≤ 100 KB, the Svelte+i18n shared chunk ≤ 30 KB, the landing's own chunk ≤ 5 KB), since the app's real payload is its entry chunk plus the shared chunk (≈97 KB), while the landing (≈24 KB) is reported separately and kept minimal. Same CLI + exit-code contract, so the CI step is unchanged.
- Critical files: ui/scripts/bundle-size.mjs; no app code changed.
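The three-gate budget check is simple enough to state as code. The sketch below is written in Go for consistency with the rest of the repo's examples and is not a port of bundle-size.mjs; the gate type and overBudget are hypothetical, and the sizes are the approximate gzip figures quoted in the R5 notes of this document.

```go
package main

import "fmt"

// gate is one independently enforced size budget, in gzip KB.
type gate struct {
	Name    string
	SizeKB  float64
	LimitKB float64
}

// overBudget returns the names of the gates whose size exceeds their limit.
// An empty result means the check passes.
func overBudget(gates []gate) []string {
	var failed []string
	for _, g := range gates {
		if g.SizeKB > g.LimitKB {
			failed = append(failed, g.Name)
		}
	}
	return failed
}

func main() {
	gates := []gate{
		{Name: "app entry", SizeKB: 74, LimitKB: 100},
		{Name: "shared", SizeKB: 23, LimitKB: 30},
		{Name: "landing-own", SizeKB: 1.6, LimitKB: 5},
	}
	fmt.Println("failed gates:", overBudget(gates))
	// The app loads its entry chunk plus the shared chunk, never the landing's.
	fmt.Println("app total KB:", gates[0].SizeKB+gates[1].SizeKB)
}
```

Gating each chunk separately means growth in the shared runtime cannot hide behind headroom in the app entry, which a single summed budget would allow.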
R6 — Refactor + docs reconciliation + de-staging (TODO 7) — done
Behaviour-preserving only. Three separable, separately-committed passes: (a) mechanical
de-staging — remove Stage N/TODO-N references from code, comments and service
READMEs (rename stage6_test.go); (b) docs↔code reconciliation — reconcile
docs/ARCHITECTURE.md / docs/FUNCTIONAL.md(+_ru) against the code-as-truth, fixing drift
and Go Doc comments; (c) structural changes by a reviewed list — surface a list of
proposed optimizations / test-suite consolidations to the owner, apply only the approved,
behaviour-preserving, test-gated ones. The full suite + the final stress run (R7) are the
regression gate. Incorporates the early-run (R2) bug fixes not already shipped.
- Open details: the structural-changes list itself (owner-approved before applying); the test consolidation targets.
R7 — Final stress run + tuning (TODO 9, part 2) — before Stage 18
Re-run the R2 harness against the final, refactored system on a clean contour; analyse resource consumption across all components (gateway, backend, Postgres, the metrics/observability stack, docker log volume) and agree the tuning (pool sizes, rate limits, cache TTLs, container limits, GOMAXPROCS, log levels). Apply the agreed tuning; record the methodology + results in the repo.
→ Stage 18 (prod contour) then proceeds per PLAN.md.
Sequencing rationale
R1 first (cheapest now; everything builds on the final schema/naming and the stress test
must run against it). R2 builds the harness and runs the early pass to surface bugs and
a resource baseline that feed R3 and R6. R3/R4/R5 harden and improve the system.
R6 (de-stage + reconcile + structural) runs near the end so it sweeps settled code once and
benefits from all accumulated bug knowledge. R7 validates the final system and tunes it.
Then Stage 18.
Regression-safety discipline (cross-cutting)
- Every phase is a feature/* → development PR; CI (unit + integration + ui behind the CI / gate check) must be green before the owner merges; watch the post-merge contour deploy with gitea-ci-watch.py.
- R6 structural changes are behaviour-preserving, test-gated, and split from the mechanical sweeps; contentious items are owner-approved first.
- The two stress runs (R2 early, R7 final) are the system-level regression gate.
Verification (per phase)
- go build ./<module>/..., go vet, gofmt -l . clean, go test -count=1 ./<module>/...; UI: pnpm check && pnpm test:unit && pnpm build; the integration suite (-tags integration) for DB/schema changes; docker compose config for deploy changes; green CI on the PR + a healthy contour deploy.
- R1: prove the squashed baseline yields a schema identical to the 12-migration chain (integration suite on a fresh DB) before deleting the old files.
- R2/R7: the harness runs end-to-end against the contour; the trip report lists concrete defects + a resource profile from the Grafana cAdvisor/postgres_exporter panels.
Refinements logged during implementation
- R1 (interview + implementation):
  - Variant labels english/russian_scrabble/erudit → scrabble_en/scrabble_ru/erudit_ru across the backend (engine.Variant.String/ParseVariant; the games/game_invitations variant CHECK in the baseline; GCG #lexicon and the variant metric attribute both flow from String), the wire (pkg/fbs variant is a string field, so values change with no FlatBuffers regen) and the UI (model.ts union, variants.ts records, codec/premiums/mocks/tests, the admin dictionary.gohtml). Kept: the Go enum identifiers (VariantEnglish…, internal) and the i18n display keys (new.english/new.russian/new.erudit, display-only). complaints.variant stays free-text (no CHECK, as before).
  - dawg filenames kept descriptive (en_sowpods/ru_scrabble/ru_erudit): only the registry's Variant key carries the rename, so registry.go, the published scrabble-solver fixtures and the dictionary release artifact are untouched (decouples the three repos).
  - Migrations squashed 12 → one hand-written 00001_baseline.sql. Verified by a pg_dump --schema-only diff (the chain vs the baseline are identical but for the two intended variant-CHECK values) plus the green integration suite. No data migration (no production data).
  - Done (cross-repo + contour): the scrabble-dictionary tidy merged (PR #2) and was re-cut as the byte-identical v1.0.1 release for clean provenance (the backend stays on v1.0.0: same bytes, no rewire; the backend pulls a version-pinned release artifact, not master). Post-merge the contour backend schema was wiped (DROP SCHEMA backend CASCADE + restart, not a volume drop) and re-migrated to the baseline; verified the new variant CHECK (scrabble_en/scrabble_ru/erudit_ru), games=0 and a clean boot.
- R2 (interview + implementation):
  - Locked decisions: game assembly via invitations (real path, no robots; not direct game-row inserts); moderate ramp 50 → 200 → 500 at 10 min/step; diagnostic pass bar (no SLO gate); run as a one-shot container on scrabble-internal in this PR.
  - Harness = new scrabble/loadtest module (use ./loadtest + a replace scrabble/gateway for the dot-free edge-proto import). It seeds 1000 guest + 10000 durable accounts + sessions directly in Postgres (token hash mirrors backend/internal/session), drives players over the edge protocol, generates mid-ranked legal moves locally with the embedded scrabble-solver by replaying game.history (the edge carries no board; this mirrors engine.ReplayBoard via the public API), and includes a gateway-hammer. Compact CLI (run/cleanup), distroless Dockerfile (DAWGs baked), Go unit tests.
  - Adding the module broke the other images' builds: the backend/gateway/telegram Dockerfiles reduce the workspace but still referenced ./loadtest (not in their context); each now also passes -dropuse=./loadtest (backend/telegram additionally -dropreplace the gateway replace). Caught by the first deploy run; verified by building all four images.
  - Harness payload fixes found by the smoke pass: the draft DTO's rack_order is a string (was sent as [] → bad_request); the display-name validator forbids digits/colons, so the cleanup marker became a letters-only Zzloadtest so profile.update resends the seeded name. chat_not_your_turn/nudge_own_turn are by-design turn gates, correctly exercised.
  - Observability: added cAdvisor + postgres_exporter + the Scrabble — Resources dashboard + two Prometheus jobs. Finding: cAdvisor yields only the root cgroup on the contour host (a separate XFS /var/lib/docker breaks its layer-ID resolution; the existing galaxy deploy has the same limit), so per-container CPU/RSS for the early pass was captured via docker stats. R7: adopt the otelcol docker_stats receiver (already the contrib image) for per-container metrics in Grafana.
  - Early run (2026-06-09): ramped clean to 500 players, no crash/deadlock, cleanup removed all 11000 accounts. 1.2 M edge calls, 48 870 plays, 2 798 games finished; the per-user limiter held under the hammer (99.97 % rejected, p99 2 ms). Top finding: ~14 % transport_error on game.state at 500 players, under CPU saturation (backend/gateway/Postgres each ~1 core) and amplified by the harness's single shared http2.Transport; the harness itself peaked at 86 % of a core on the same host, so the figures are pessimistic. Full trip report in ../loadtest/REPORT-R2.md; it feeds R3 (h2c MaxConcurrentStreams/timeouts, body-size cap), R6 and R7 (per-player transports, separate hardware, pool/limit sizing).
  - CI: ./loadtest/... added to the path filter + vet/build/test; go.work.sum carries the new deps.
- R3 (interview + implementation):
  - Locked decisions: the flag column lands by editing the R1 baseline (+ a contour schema wipe after merge; no migration chain accrues before prod); auto-flag defaults 1000 rejected / 10 min (BACKEND_HIGHRATE_FLAG_THRESHOLD/_WINDOW, rolling window, set-once, operator clears, no auto-ban); landing image = caddy:2-alpine; throttle data flows gateway → backend (a 30 s per-key summary POST to the new /api/v1/internal/ratelimit/report, the existing trusted direction) with the episode window + flag rule in the backend (internal/ratewatch); rejection logging = Warn summary per key per window + Debug per rejection, a deliberate deviation from the phase's "structured log per rejection" (the R2 hammer would have logged ~522k lines in minutes); all three R2-report tails included (explicit h2c sizing, the session-resolve failure cause at Warn, reviving the admin limiter).
  - Body cap: GATEWAY_MAX_BODY_BYTES (default 1 MiB) as both the Connect per-message read limit and an http.MaxBytesReader wrap of the public mux; an oversized Execute is resource_exhausted.
  - Dead config found: AdminPerMinute/AdminBurst were never wired; the gateway /_gm mount is now 429-guarded per IP ahead of its Basic-Auth. The caddy-fronted contour path stays unlimited (stock caddy has no limiter), an accepted gap recorded in docs/ARCHITECTURE.md §12.
  - Landing split: a landing target in gateway/Dockerfile (the UI build stage is shared; identical compose build args keep it one cached build); the gateway drops landing.html from the embed and 308-redirects / → /app/; the contour caddy routes /app/, /telegram/ and the Connect path to the gateway and the catch-all to the landing container; the CI deploy probe now checks both / (landing) and /app/ (gateway).
  - Observability: gateway_rate_limited_total{class} (user/public/email/admin, aggregate-only) + a rate-vs-rejections panel on the Edge/UX dashboard; the admin console gains the Throttled page (the in-memory episode window, reset-on-restart like active_users, plus the flagged-account queue) and the flag badge / clear action on the user list / card.
  - The jet regen also restored the previously missing game_drafts/game_hidden generated models (their tables were added after the last jetgen run; no behaviour change).
- R4 (interview + implementation):
  - Locked decisions: delta-first, not full snapshots; an event carries only the new move and the UI applies it to its per-game cache, keyed on move_count (idempotent + gap-safe: a gap or the actor's own move falls back to a game.state + game.history refetch). match_found/game_started carry the recipient's initial StateView (instant lobby→game); the fallback refetch stays the existing two calls (no merged endpoint); the matchmaking poll runs only while the stream is down (2.5 s); all UI-state-changing events carry their payload (incl. lobby notify).
  - Enriched events (pkg/fbs trailing fields: backward-compatible, no FB regen of values, only the schema): opponent_moved (+move/game/bag_len), your_turn (+move_count), match_found (+state), game_over (+game), notify (+account/invitation/state). The pre-R4 opponent_moved scalars (seat/action/score/total) stay for wire back-compat, now redundant with move/game, and are slated for the R6 de-stage.
  - Encoding placement: the notify package keeps ownership of the FlatBuffers encoding (a new encode.go mirrors the gateway transcode but reads wire-agnostic notify.* input structs + engine.MoveRecord); the game/lobby/social services map their domain types to those structs, so the wire schema stays out of the domain. Flagged for R6: this partly duplicates the gateway encoders (different source types), a candidate consolidation.
  - Actor self-fetch killed too (beyond literal "push"): the submit_play/pass/exchange/resign response (MoveResult) now returns the actor's refilled rack + bag size, so the mover renders the next turn from the response; Game.svelte's commit/pass/exchange/resign drop their await load().
  - match_found enrichment needs a per-seat initial state: lobby.GameCreator gained InitialState, and game.Service.InitialState builds the notify.PlayerState (rack re-encoded to wire indices, the variant alphabet embedded for a first-seen variant).
  - UI: a pure lib/gamedelta.ts reducer (applyMoveDelta/applyGameOver/seedInitialState, unit-tested) advances the cache; app.svelte seeds it on match_found/game_started; Game.svelte applies the delta (falling back to load() while composing, on a gap, or on its own move's new rack); NewGame.svelte polls only when app.streamAlive is false and guards its teardown so a push-delivered match is not cancelled.
  - notify (friends/invitations) scope: the backend carries the full account / invitation payload on the wire (per "all events → push"); the UI seeds the game cache from game_started but keeps its lightweight authoritative badge refresh (refreshNotifications, on the rare notify event + on foreground) rather than adding client-side friend/invitation caches; the per-move hot path is fully de-fetched, which was the goal. Deeper lobby-cache consumption is an easy follow-up.
  - No schema change (no migration); the contour needs no DB wipe. Tests: notify FB round-trips + emitMove delta + the gamedelta reducer; the e2e mock now emits the enriched delta.
- R5 (interview + implementation):
  - No code slimming, by analysis. A gzip measure + sourcemap attribution of the real dist showed the app bundle is already minified + tree-shaken and dominated by the Connect/FlatBuffers transport runtime + generated FB/PB bindings (≈⅔ of main's source) and the Svelte runtime, all third-party/generated and irreducible within R5's scope. App-authored code carries no hand-trimmable fat.
  - Lazy-load rejected (screens and i18n): bundle-size.mjs sums every emitted chunk, so code-splitting moves bytes between chunks for zero total-size win while adding request latency (+N gateway fetches on first navigation to a split screen). i18n lazy-load additionally buys ≤3 KB (en-only users) at the cost of an async t(), and en must stay bundled (it is the MessageKey type source + fallback). Chunk-collapsing rejected too: keeping the near-static Svelte runtime in its own cacheable chunk is the recommended practice (an app deploy then re-busts only main, not the runtime), and HTTP/2 makes the extra preload request negligible.
  - Metric retargeted to the app. The two-entry build (index.html app + landing.html) makes Rollup hoist the code shared by both (Svelte runtime + i18n + aboutContent) into one preloaded chunk, so the app actually loads its entry chunk + the shared chunk (≈74 + ≈23 = ≈97 KB), never landing.js (≈1.6 KB). The old script summed all three chunks (98.8 KB), over-counting the app by landing.js. bundle-size.mjs now parses each built HTML for the JS it eagerly loads and gates three parts independently (app entry ≤ 100 KB, shared (Svelte+i18n) ≤ 30 KB, landing-own ≤ 5 KB), reporting the app total (≈97) and landing total (≈24.5). Same CLI + exit-code contract, so the CI step is unchanged.
  - No app/source/build change (App.svelte, lib/i18n/, vite.config.ts untouched); no schema change, no contour wipe. The stale "~82 KB" figure was corrected in bundle-size.mjs and ui/README.md.
- R6 (interview + implementation):
  - Locked decisions: apply both wire/code structural changes (B + A) and only C1+C2 of the test consolidation (not C3/C5); strip the *(Stage N)* tags from all current-state docs (ARCHITECTURE / FUNCTIONAL+_ru / TESTING / UI_DESIGN), keeping PLAN.md / PRERELEASE.md / CLAUDE.md as history; split stage6_test.go by domain. The h2c MaxConcurrentStreams sizing stays an R7 concern (tuning, not behaviour-preserving); the R2 early run forced no code fix, so nothing was carried in.
  - (a) De-staging: removed the Stage N/TODO-N/(RN) references across code, comments, service READMEs and the current-state docs, rewording narratives to present tense (no technical content lost). Renamed the only stage-named identifiers (registerStage8 → registerSocialOps, registerStage11 → registerLinkOps) and split stage6_test.go (TestEmailLoginFlow → email_test.go; TestGuestAutoMatchLeavesNoStats + provisionGuest → account_test.go). De-staged the .fbs/.proto comments and regenerated: only the .proto-derived Go docstrings (*_grpc.pb.go, push.pb.go) changed; flatc strips schema comments, so the FB Go/TS bindings were untouched.
  - (b) Reconciliation: the docs were accurate (each R-phase baked its own); the one drift was a stale "guest-reaping deferred (TODO-3)" note in ARCHITECTURE.md §3; guest reaping is implemented, so the note was replaced with the current behaviour (FUNCTIONAL/TESTING already described it).
  - (c) B, dead opponent_moved scalars: removed seat/action/score/total from OpponentMovedEvent (pkg/fbs/scrabble.fbs + the notify emit + the round-trip test); regenerated FB Go + TS. No reader used them (the UI codec/mock take move/game/bag_len; the gateway forwards the payload verbatim). A pre-release wire-slot renumber: free with no prod data, no DB change.
  - (c) A, shared FB builders: new scrabble/pkg/wire holds the single definition of the nested wire tables (GameView / MoveRecord / StateView / AccountRef / Invitation) shared by the backend notify encoder and the gateway transcode; both map their own source types to neutral wire.* structs and delegate. Honest tradeoff: the verbose Start/Add/End + reverse-prepend boilerplate is now written once, but the field set is still mapped per side, and the new package makes the change net +~145 LOC: a single-source / anti-drift win for the fiddly mechanics rather than a line-count cut. Behaviour-preserving: the two sides' field sets were verified identical and the round-trip tests pass unchanged.
  - (c) C1+C2, inttest fixtures: moved the cross-file service/game fixtures (newGameService was used by 10 files) into backend/internal/inttest/helpers.go; single-file helpers stay local. Pure relocation.
  - No schema change → no contour DB wipe. Regression gate: the full unit + integration + UI suites plus the R7 stress run.