# Pre-release plan — hardening before Stage 18

Living tracker for the pre-release hardening pass that runs **before Stage 18** (the
prod cutover). Same discipline as [`PLAN.md`](PLAN.md): one phase per session,
**interview the owner on the open details** at the start of each phase, bake every
decision back into `PLAN.md` / `docs/` / the affected `README`s / Go Doc comments in
the **same** PR, get CI green, then mark the phase done. Phases run as
`feature/* → development` PRs (the Stage 16 branch model); the owner approves+merges.

**Why now:** the system is feature-complete through Stage 17 and the test contour is
green, but there is **no prod data yet** — schema, wire labels and the dictionary
layout can still change for free. These phases spend that one-time freedom and harden
the edge before prod. Each phase maps back to the owner's raw pre-release TODO list
(numbers in the tracker).

## Phase tracker

| # | Phase | Raw TODOs | Status |
|---|-------|-----------|--------|
| R1 | Schema & naming reset | 1 + 10 | **done** |
| R2 | Stress harness + contour observability + early run | 9a | **done** |
| R3 | Edge hardening | 2 + 8 + 3 | **done** |
| R4 | Push enrichment + kill the last poll | 4 + 5 | **done** |
| R5 | Bundle slimming | 6 | **done** |
| R6 | Refactor + docs reconciliation + de-staging | 7 | todo |
| R7 | Final stress run + tuning | 9b | todo |
| → | Stage 18 — prod contour deploy | — | see [`PLAN.md`](PLAN.md) |

## Key findings (these reshaped the raw list — read before starting a phase)

- **R1 (TODO 1 + 10) is one cheap moment, now.** Squashing the 12 goose migrations is
  safe precisely because there is no prod data and the contour DB is wiped. Folding the
  new variant labels (`scrabble_ru`/`scrabble_en`/`erudit_ru`) into that single baseline
  makes the rename need **no data migration and no back-compat mapping**. Today's labels
  (`english`/`russian_scrabble`/`erudit`) are persisted in `games.variant`,
  `game_invitations.variant`, in `pkg/fbs` and the UI — ~100 files, but a mechanical sweep
  on a clean DB.
- **R4 (TODO 4 + 5): the app is already push-first.** Game state refreshes on
  `your_turn`/`opponent_moved`, the lobby on `notify`, chat on `chat_message`. The **only**
  genuine periodic server poll is `lobby.poll` (matchmaking, 2.5 s,
  `ui/src/screens/NewGame.svelte`). What remains is killing that one poll **and** enriching
  push events to carry payloads so the UI stops re-fetching after each signal.
- **R3 (TODO 2): identity forgery is already mitigated.** Identity is always derived from
  the session (`Authorization: Bearer` → `X-User-ID`); the client cannot inject identity,
  the backend re-validates resource ownership, Telegram initData is HMAC-checked. The real
  gaps are a missing **request-body size limit** (cheap DoS) and **invisible rate-limit
  rejections** (no log/metric/admin view — that is TODO 8). Static landing serving is **not**
  covered by the gateway token bucket (it only guards `Execute`).
- **R6 (TODO 7) scale:** ~431 `Stage N` references across ~104 files (incl. the file name
  `backend/internal/inttest/stage6_test.go`). Code is the source of truth; `docs/` describe
  current state; `PLAN.md` keeps the decision history.

## Locked decisions (owner interview)

- **Stress test (TODO 9):** **early + final** runs. Driver = **edge protocol** (Connect/FB
  through the gateway, moves generated by the solver) **plus a separate gateway-hammer**
  saturation test. Pacing = **realistic (under limits) + saturation (ramp to the knee)**.
  Resource metrics = **add cAdvisor + postgres_exporter to the contour** (today only
  Go-runtime metrics exist). The harness stays in the repo for repeats.
- **Push (TODO 4 + 5):** **both** — kill `lobby.poll` (use the existing `match_found`, keep
  poll as the ws-down fallback) **and** enrich push events with payloads.
- **Refactor (TODO 7):** **hygiene + structural changes by a reviewed list** —
  behaviour-preserving, test-gated, contentious items surfaced to the owner before applying.
- **Landing (TODO 3):** **separate static container** behind the project caddy
  (`/` → landing, `/app/` + `/telegram/` → gateway); drop `landing.html` from the gateway
  `go:embed`.
- **Rate-abuse (TODO 8):** metric + Grafana + admin view **plus a conservative auto-flag** —
  a *soft, reversible* "suspected high-rate" marker for operator review, tunable threshold,
  **no auto-ban**.

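The conservative auto-flag rule can be pictured as a rolling-window counter with a set-once marker. The Go sketch below is illustrative only: `flagWatcher`, `Report`, `Clear` and the `at` demo helper are hypothetical names, not the real `internal/ratewatch` API, and it assumes the flag trips when the windowed total reaches the threshold (the 1000 rejected / 10 min defaults come from the R3 refinements recorded later in this document).

```go
package main

import (
	"fmt"
	"time"
)

// report is one batch of rejections received for a key.
type report struct {
	at    time.Time
	count int
}

// flagWatcher is a hypothetical rolling-window counter with a set-once flag.
type flagWatcher struct {
	threshold int
	window    time.Duration
	reports   map[string][]report
	flagged   map[string]time.Time
}

func newFlagWatcher(threshold int, window time.Duration) *flagWatcher {
	return &flagWatcher{
		threshold: threshold,
		window:    window,
		reports:   make(map[string][]report),
		flagged:   make(map[string]time.Time),
	}
}

// Report records a batch of rejections and returns true only on the call
// that sets the flag. Reports older than the window are dropped first.
func (w *flagWatcher) Report(key string, rejected int, now time.Time) bool {
	cutoff := now.Add(-w.window)
	kept := w.reports[key][:0]
	total := 0
	for _, r := range w.reports[key] {
		if r.at.After(cutoff) {
			kept = append(kept, r)
			total += r.count
		}
	}
	kept = append(kept, report{at: now, count: rejected})
	total += rejected
	w.reports[key] = kept

	if _, already := w.flagged[key]; already || total < w.threshold {
		return false
	}
	w.flagged[key] = now // soft marker: nothing is banned
	return true
}

// Clear is the operator action that removes the marker.
func (w *flagWatcher) Clear(key string) { delete(w.flagged, key) }

// Flagged reports whether the marker is currently set.
func (w *flagWatcher) Flagged(key string) bool {
	_, ok := w.flagged[key]
	return ok
}

// at is a demo helper: a fixed base time plus the given number of minutes.
func at(minutes int) time.Time {
	base := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	return base.Add(time.Duration(minutes) * time.Minute)
}

func main() {
	w := newFlagWatcher(1000, 10*time.Minute)
	fmt.Println(w.Report("user-1", 600, at(0))) // false: below threshold
	fmt.Println(w.Report("user-1", 400, at(5))) // true: 1000 within the window
	fmt.Println(w.Report("user-1", 400, at(6))) // false: already flagged
}
```

Keeping the marker set-once and leaving the clear to the operator is what makes the rule reversible without ever blocking traffic on its own.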
## Phases

Each phase: read this tracker + the relevant `docs/`, **interview the owner on the open
details below**, implement within scope, then update the tracker + docs/code and get CI
green before marking it done.

### R1 — Schema & naming reset *(TODO 1 + 10)* — first

Squash `backend/internal/postgres/migrations/00001..00012` into one `00001_baseline.sql`
(method: `pg_dump --schema-only` from a fully-migrated DB → wrap as the goose baseline →
prove a fresh migrate yields a schema identical to the 12-migration chain via the
integration suite → delete the old files; keep goose). Bake the new variant labels into the
baseline. Propagate `scrabble_ru`/`scrabble_en`/`erudit_ru` through the backend
(`engine.Variant`/`ParseVariant`, `registry.dictFiles`, the CHECK values), the wire
(`pkg/fbs` `variant:string`, regenerate FB) and the UI (`lib/model.ts` union, `variants.ts`,
fixtures, premium/alphabet keys, tests); i18n display keys stay display-only. Tidy
`../scrabble-dictionary` to a single source→dawg build point and align the dawg artifact
names to the new labels (crosses into `../scrabble-solver`'s committed fixtures — keep them
byte-identical). After merge, **wipe the contour DB** (drop the volume) so it re-provisions
on the next deploy.

- Critical files: `backend/internal/postgres/migrations/`,
  `backend/internal/engine/{engine,registry}.go`, `pkg/fbs/scrabble.fbs`,
  `ui/src/lib/{model,variants}.ts`, `../scrabble-dictionary/{Makefile,cmd/builddict,…}`.
- Open details to interview: the exact dawg filename scheme; whether the dict-repo tidy is
  one PR or split; how to script the contour DB wipe in the deploy.

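To make the rename concrete, here is a minimal Go sketch of a label round trip in the spirit of `engine.Variant` / `ParseVariant`. It is not the backend's code: only `VariantEnglish` and the three labels come from this document, while `variantRussianScrabble`, `variantErudit`, the `labels` map and `ErrUnknownVariant` are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Variant is an illustrative stand-in for engine.Variant.
type Variant int

const (
	VariantEnglish         Variant = iota // identifier named in this plan
	variantRussianScrabble                // hypothetical identifier
	variantErudit                         // hypothetical identifier
)

// labels holds the new persisted/wire labels; the old ones are simply gone,
// so there is no back-compat mapping.
var labels = map[Variant]string{
	VariantEnglish:         "scrabble_en",
	variantRussianScrabble: "scrabble_ru",
	variantErudit:          "erudit_ru",
}

// ErrUnknownVariant is a hypothetical sentinel error for this sketch.
var ErrUnknownVariant = errors.New("unknown variant")

// String returns the label stored in the DB and sent on the wire.
func (v Variant) String() string {
	if s, ok := labels[v]; ok {
		return s
	}
	return fmt.Sprintf("Variant(%d)", int(v))
}

// ParseVariant accepts only the new labels.
func ParseVariant(s string) (Variant, error) {
	for v, label := range labels {
		if label == s {
			return v, nil
		}
	}
	return 0, fmt.Errorf("%w: %q", ErrUnknownVariant, s)
}

func main() {
	v, err := ParseVariant("scrabble_ru")
	fmt.Println(v, err)

	_, err = ParseVariant("english") // pre-rename label
	fmt.Println(errors.Is(err, ErrUnknownVariant))
}
```

Because the baseline starts from an empty database, rejecting the old labels outright is safe; with existing rows the parser would need an alias table instead.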
### R2 — Stress harness + contour observability + early run *(TODO 9, part 1)*

Build the reusable load harness as a new `loadtest` module in `go.work` (reuses `pkg/fbs`,
`connect-go`, and `scrabble-solver` for legal-move generation): a seeder that inserts
**1000 guest + 10000 durable** accounts with pre-created sessions (token hashes) directly in
the DB and hands the plaintext tokens to the client; a driver that runs N virtual users,
each in 3–5 concurrent 2–4-player games, exercising submit-play / pass / exchange / nudge /
chat / check-word / draft-move / profile-save through the **edge protocol**, in
**realistic** (under rate limits) and **saturation** (ramp) modes; plus a separate
**gateway-hammer** that deliberately exceeds limits to verify the limiter holds and measure
its cost. Add **cAdvisor + postgres_exporter** to `deploy/docker-compose.yml` and a Grafana
resource dashboard. Run the **early pass** against the freshly-wiped contour; produce a
**trip report** (logic/concurrency bugs + a resource baseline) that feeds R3 and R6.

- Critical files: new `loadtest/`, `deploy/docker-compose.yml`, `deploy/observability/*`,
  `docs/TESTING.md`.
- Open details: the scale ramp steps; the move-selection policy (a mid-ranked solver move
  for realistic game progress); run duration; the pass/fail bar.

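One way to read the "mid-ranked solver move" policy is: rank the legal candidates by score and play the one in the middle. The Go sketch below assumes exactly that; the `move` type and `pickMidRanked` are hypothetical and do not reflect the `scrabble-solver` API.

```go
package main

import (
	"fmt"
	"sort"
)

// move is a hypothetical candidate as a solver might return it.
type move struct {
	Word  string
	Score int
}

// pickMidRanked sorts the candidates by score (best first) and returns the
// one at the middle rank. ok is false when there is no legal move, in which
// case the caller would pass or exchange.
func pickMidRanked(cands []move) (m move, ok bool) {
	if len(cands) == 0 {
		return move{}, false
	}
	ranked := append([]move(nil), cands...) // do not reorder the caller's slice
	sort.SliceStable(ranked, func(i, j int) bool {
		return ranked[i].Score > ranked[j].Score
	})
	return ranked[len(ranked)/2], true
}

func main() {
	cands := []move{{"CAT", 10}, {"QUARTZ", 30}, {"TRACE", 20}}
	m, ok := pickMidRanked(cands)
	fmt.Println(m.Word, m.Score, ok)
}
```

A middle-ranked pick keeps games progressing at a plausible pace: always playing the best move ends games unrealistically fast, and always playing the worst drags them out.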
### R3 — Edge hardening *(TODO 2 + 8 + 3)*

Add a **request-body size cap** at the gateway h2c mux / `Execute` (e.g. ~1 MB). Add
**rate-limit observability**: a `gateway_rate_limited_total{class}` counter + a structured
log per rejection; an **aggregate** Grafana panel (request rate + rejection rate — spikes
visible without per-user label cardinality, honouring the Stage 12/17 discipline); an
**admin-console view** of recently throttled users/IPs (in-memory ring buffer,
single-instance, reset-on-restart, like the `active_users` gauge). Add the **conservative
auto-flag**: when a user is *sustained*-throttled past a tunable threshold, set a soft,
reversible `account.flagged_high_rate_at` marker (baked into the R1 baseline) surfaced in the
admin user list/detail — **no auto-ban**; the operator clears it. Split the **landing** into
its own static container (`deploy/` + a Caddyfile route `/` → landing) and drop
`landing.html` from the gateway `go:embed`.

- Critical files: `gateway/internal/connectsrv/server.go`, `gateway/internal/ratelimit/`,
  `gateway/internal/connectsrv/metrics.go`, `backend/internal/adminconsole/`,
  `deploy/caddy/Caddyfile`, `deploy/docker-compose.yml`, `gateway/internal/webui/`.
- Open details: the auto-flag threshold/window + whether the marker is persisted vs
  in-memory; the landing image base (caddy vs nginx).

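The body cap can be sketched with the standard library's `http.MaxBytesReader`. This is a minimal illustration, not the gateway's code: `capBody`, `echoLen` and `statusFor` are hypothetical helpers, and the sketch answers an oversized body with a plain HTTP 413, whereas the gateway maps an oversized `Execute` to Connect's `resource_exhausted`.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxBodyBytes is the example cap (1 MiB).
const maxBodyBytes = 1 << 20

// capBody wraps next so that reading more than limit bytes from the request
// body fails with *http.MaxBytesError.
func capBody(next http.Handler, limit int64) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, limit)
		next.ServeHTTP(w, r)
	})
}

// echoLen is a stand-in handler that drains the body and reports its size.
func echoLen(w http.ResponseWriter, r *http.Request) {
	n, err := io.Copy(io.Discard, r.Body)
	var tooBig *http.MaxBytesError
	if errors.As(err, &tooBig) {
		http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
		return
	}
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "read %d bytes", n)
}

// statusFor sends a body of the given size through the capped handler and
// returns the resulting HTTP status code.
func statusFor(size int, limit int64) int {
	body := strings.NewReader(strings.Repeat("x", size))
	req := httptest.NewRequest(http.MethodPost, "/", body)
	rec := httptest.NewRecorder()
	capBody(http.HandlerFunc(echoLen), limit).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(512, maxBodyBytes))            // within the cap
	fmt.Println(statusFor(maxBodyBytes+1, maxBodyBytes)) // one byte over
}
```

Wrapping at the mux level caps every public route at once, so new handlers cannot forget the limit.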
### R4 — Push enrichment + kill the last poll *(TODO 4 + 5)*

Replace `lobby.poll` with the existing `match_found` push (keep the poll as a ws-down
fallback). Enrich `your_turn`/`opponent_moved`/`notify` to carry the state payload so the UI
renders from the event without a follow-up `game.state` (removes the lobby↔game nav latency
the owner noticed). Wire-contract change: `pkg/fbs` event payloads → backend `notify` emit →
UI stream consumers (`ui/src/lib/app.svelte.ts`), with the per-game cache as the landing
spot; regenerate FB.

- Critical files: `pkg/fbs/scrabble.fbs`, `backend/internal/notify/events.go`,
  `ui/src/lib/{app.svelte,transport}.ts`, `ui/src/screens/NewGame.svelte`.
- Open details: which events carry full vs delta payloads; the fallback-poll cadence when the
  stream is down.

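The refinements log settles the "full vs delta" question as delta-first, keyed on `move_count`. The real reducer lives in the UI (TypeScript); the Go sketch below only illustrates the idempotent, gap-safe rule. `gameCache`, `applyMoveDelta` and the result constants are hypothetical, and the sketch assumes an event's `move_count` is the count after the move it carries.

```go
package main

import "fmt"

// gameCache is a hypothetical per-game cache entry.
type gameCache struct {
	MoveCount int
	Moves     []string
}

// applyResult tells the caller what to do after a delta arrives.
type applyResult int

const (
	applied   applyResult = iota // the cache advanced by one move
	duplicate                    // event already reflected: ignore it
	gap                          // events were missed: refetch state + history
)

// applyMoveDelta applies a pushed move only when it is the direct successor
// of the cached state, which makes redelivery harmless and gaps detectable.
func applyMoveDelta(c *gameCache, moveCount int, move string) applyResult {
	switch {
	case moveCount <= c.MoveCount:
		return duplicate
	case moveCount == c.MoveCount+1:
		c.Moves = append(c.Moves, move)
		c.MoveCount = moveCount
		return applied
	default:
		return gap
	}
}

func main() {
	c := &gameCache{MoveCount: 3, Moves: []string{"m1", "m2", "m3"}}
	fmt.Println(applyMoveDelta(c, 4, "m4") == applied)   // next in sequence
	fmt.Println(applyMoveDelta(c, 4, "m4") == duplicate) // redelivered
	fmt.Println(applyMoveDelta(c, 7, "m7") == gap)       // missed 5 and 6
}
```

On `gap` the caller falls back to the existing `game.state` + `game.history` refetch, so the cache is never advanced from an event it cannot place.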
### R5 — Bundle slimming *(TODO 6)* — done

Analysed the bundle against the 100 KB-gzip budget; **no code slimming was warranted**, and the
budget metric was retargeted to measure the app correctly. The build already minifies +
tree-shakes; the dominant cost is the Connect/FlatBuffers transport runtime + generated bindings
+ the Svelte runtime (≈⅔ of `main`'s source is third-party/generated) — irreducible within scope.
**Lazy-loading was rejected**: `bundle-size.mjs` sums every emitted chunk, so code-splitting yields
no total-size win and adds request latency (+N gateway fetches on first navigation to a split
screen). i18n lazy-load was skipped (the catalogs are a sliver of a Svelte-runtime-dominated shared
chunk, and `en` must stay bundled as the `MessageKey` type source + fallback). Instead,
`bundle-size.mjs` now measures **per HTML entry**, with three independent gates on the natural chunk
boundaries — **app entry ≤ 100 KB, the Svelte+i18n shared chunk ≤ 30 KB, the landing's own chunk
≤ 5 KB** — since the app's real payload is its entry chunk plus the shared chunk (≈97 KB), while the
landing (≈24 KB) is reported separately and kept minimal. Same CLI + exit-code contract, so the CI
step is unchanged.

- Critical files: `ui/scripts/bundle-size.mjs`; no app code changed.

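The three-gate check is simple enough to sketch. The real implementation is the Node script `ui/scripts/bundle-size.mjs`; the Go version below only mirrors the gating logic and the exit-code contract. The `gate` type and `check` function are hypothetical, the sizes are the approximate figures from the R5 refinements, and the assumption that a failure exits with code 1 is mine.

```go
package main

import (
	"fmt"
	"os"
)

// gate is one independent size budget on a chunk (sizes in gzip KB).
type gate struct {
	Name    string
	SizeKB  float64
	LimitKB float64
}

// check returns the names of every gate whose size exceeds its limit.
// Each gate is evaluated on its own, so one large chunk cannot hide
// behind headroom in another.
func check(gates []gate) []string {
	var failed []string
	for _, g := range gates {
		if g.SizeKB > g.LimitKB {
			failed = append(failed, g.Name)
		}
	}
	return failed
}

func main() {
	// Approximate figures from the R5 notes.
	gates := []gate{
		{"app entry", 74, 100},
		{"shared (Svelte+i18n)", 23, 30},
		{"landing-own", 1.6, 5},
	}
	fmt.Printf("app total ≈ %.0f KB\n", gates[0].SizeKB+gates[1].SizeKB)

	if failed := check(gates); len(failed) > 0 {
		fmt.Println("over budget:", failed)
		os.Exit(1) // non-zero exit fails the CI step
	}
	fmt.Println("all gates pass")
}
```

Reporting the app total (entry plus shared) separately from the gates keeps the headline number honest while still failing on whichever chunk actually grew.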
### R6 — Refactor + docs reconciliation + de-staging *(TODO 7)* — near last

Behaviour-preserving only. Three separable, separately-committed passes: (a) mechanical
**de-staging** — remove `Stage N`/`TODO-N` references from code, comments and service
READMEs (rename `stage6_test.go`); (b) **docs↔code reconciliation** — reconcile
`docs/ARCHITECTURE.md` / `docs/FUNCTIONAL.md`(+`_ru`) against the code-as-truth, fixing drift
and Go Doc comments; (c) **structural changes by a reviewed list** — surface a list of
proposed optimizations / test-suite consolidations to the owner, apply only the approved,
behaviour-preserving, test-gated ones. The full suite + the final stress run (R7) are the
regression gate. Incorporates the early-run (R2) bug fixes not already shipped.

- Open details: the structural-changes list itself (owner-approved before applying); the test
  consolidation targets.

### R7 — Final stress run + tuning *(TODO 9, part 2)* — before Stage 18

Re-run the R2 harness against the final, refactored system on a clean contour; analyse
resource consumption across **all** components (gateway, backend, Postgres, the
metrics/observability stack, docker log volume) and agree the tuning (pool sizes, rate
limits, cache TTLs, container limits, GOMAXPROCS, log levels). Apply the agreed tuning; record
the methodology + results in the repo.

→ **Stage 18** (prod contour) then proceeds per [`PLAN.md`](PLAN.md).

## Sequencing rationale

`R1` first (cheapest now; everything builds on the final schema/naming and the stress test
must run against it). `R2` builds the harness and runs the **early** pass to surface bugs and
a resource baseline that feed `R3` and `R6`. `R3`/`R4`/`R5` harden and improve the system.
`R6` (de-stage + reconcile + structural) runs near the end so it sweeps settled code once and
benefits from all accumulated bug knowledge. `R7` validates the final system and tunes it.
Then Stage 18.

## Regression-safety discipline (cross-cutting)

- Every phase is a `feature/* → development` PR; CI (`unit` + `integration` + `ui` behind the
  `CI / gate` check) must be green before the owner merges; watch the post-merge contour
  deploy with `gitea-ci-watch.py`.
- `R6` structural changes are behaviour-preserving, test-gated, and split from the mechanical
  sweeps; contentious items are owner-approved first.
- The two stress runs (`R2` early, `R7` final) are the system-level regression gate.

## Verification (per phase)

- `go build ./<module>/...`, `go vet`, `gofmt -l .` clean, `go test -count=1 ./<module>/...`;
  UI: `pnpm check && pnpm test:unit && pnpm build`; the integration suite
  (`-tags integration`) for DB/schema changes; `docker compose config` for deploy changes;
  green CI on the PR + a healthy contour deploy.
- `R1`: prove the squashed baseline yields a schema identical to the 12-migration chain
  (integration suite on a fresh DB) **before** deleting the old files.
- `R2`/`R7`: the harness runs end-to-end against the contour; the trip report lists concrete
  defects + a resource profile from the Grafana cAdvisor/postgres_exporter panels.

## Refinements logged during implementation

- **R1** (interview + implementation):
  - **Variant labels** `english`/`russian_scrabble`/`erudit` → **`scrabble_en`/`scrabble_ru`/`erudit_ru`**
    across the backend (`engine.Variant.String`/`ParseVariant`; the `games`/`game_invitations` `variant`
    CHECK in the baseline; GCG `#lexicon` and the `variant` metric attribute both flow from `String`),
    the wire (`pkg/fbs` `variant` is a `string` field — values change with **no FlatBuffers regen**) and
    the UI (`model.ts` union, `variants.ts` records, `codec`/`premiums`/mocks/tests, the admin
    `dictionary.gohtml`). **Kept:** the Go enum identifiers (`VariantEnglish`…, internal) and the i18n
    display keys (`new.english`/`new.russian`/`new.erudit`, display-only). `complaints.variant` stays
    free-text (no CHECK, as before).
  - **dawg filenames kept descriptive** (`en_sowpods`/`ru_scrabble`/`ru_erudit`) — only the registry's
    `Variant` key carries the rename, so `registry.go`, the published `scrabble-solver` fixtures and the
    dictionary release artifact are untouched (decouples the three repos).
  - **Migrations squashed** 12 → one hand-written `00001_baseline.sql`. Verified by a
    `pg_dump --schema-only` diff (the chain vs the baseline are **identical** but for the two intended
    variant-CHECK values) plus the green integration suite. **No data migration** (no production data).
  - **Done (cross-repo + contour):** the **`scrabble-dictionary` tidy** merged (PR #2) and was re-cut as
    the **byte-identical `v1.0.1`** release for clean provenance (the backend stays on `v1.0.0` — same
    bytes, no rewire; the backend pulls a version-pinned release artifact, not master). Post-merge the
    contour `backend` schema was wiped (`DROP SCHEMA backend CASCADE` + restart, not a volume drop) and
    re-migrated to the baseline — verified the new variant CHECK (`scrabble_en/scrabble_ru/erudit_ru`),
    `games`=0 and a clean boot.

- **R2** (interview + implementation):
  - **Locked decisions:** game assembly via **invitations** (real path, no robots; not direct game-row
    inserts); **moderate** ramp **50 → 200 → 500** at 10 min/step; **diagnostic** pass bar (no SLO gate);
    run as a **one-shot container on `scrabble-internal`** in this PR.
  - **Harness** = new `scrabble/loadtest` module (`use ./loadtest` + a `replace scrabble/gateway` for the
    dot-free edge-proto import). It seeds 1000 guest + 10000 durable accounts + sessions **directly in
    Postgres** (token hash mirrors `backend/internal/session`), drives players over the **edge protocol**,
    generates **mid-ranked legal moves locally** with the embedded `scrabble-solver` by replaying
    `game.history` (the edge carries no board — mirrors `engine.ReplayBoard` via the public API), and a
    **gateway-hammer**. Compact CLI (`run` / `cleanup`), distroless Dockerfile (DAWGs baked), Go unit tests.
  - **Adding the module broke the other images' builds** — backend/gateway/telegram Dockerfiles reduce the
    workspace but still referenced `./loadtest` (not in their context); each now also
    `-dropuse=./loadtest` (backend/telegram additionally `-dropreplace` the gateway replace). Caught by the
    first deploy run; verified by building all four images.
  - **Harness payload fixes found by the smoke pass:** the draft DTO's `rack_order` is a string (was sent
    as `[]` → `bad_request`); the display-name validator forbids digits/colons, so the cleanup marker
    became a letters-only `Zzloadtest` so `profile.update` resends the seeded name. `chat_not_your_turn` /
    `nudge_own_turn` are **by-design** turn gates, correctly exercised.
  - **Observability:** added **cAdvisor + postgres_exporter** + the **Scrabble — Resources** dashboard +
    two Prometheus jobs. **Finding:** cAdvisor yields only the root cgroup on the contour host (separate
    XFS `/var/lib/docker` breaks its layer-ID resolution — the existing galaxy deploy has the same limit),
    so per-container CPU/RSS for the early pass was captured via `docker stats`. **R7:** adopt the otelcol
    `docker_stats` receiver (already the contrib image) for per-container metrics in Grafana.
  - **Early run (2026-06-09):** ramped clean to 500 players, no crash/deadlock, cleanup removed all 11000
    accounts. 1.2 M edge calls, 48 870 plays, 2 798 games finished; the per-user limiter held under the
    hammer (99.97 % rejected, p99 2 ms). **Top finding:** ~14 % `transport_error` on `game.state` at 500
    players, under CPU saturation (backend/gateway/Postgres each ~1 core) and amplified by the harness's
    single shared `http2.Transport`; the harness itself peaked at 86 % of a core on the same host, so the
    figures are pessimistic. Full trip report in [`../loadtest/REPORT-R2.md`](../loadtest/REPORT-R2.md);
    it feeds R3 (h2c `MaxConcurrentStreams`/timeouts, body-size cap), R6 and R7 (per-player transports,
    separate hardware, pool/limit sizing).
  - **CI:** `./loadtest/...` added to the path filter + vet/build/test; `go.work.sum` carries the new deps.

- **R3** (interview + implementation):
  - **Locked decisions:** the flag column lands by **editing the R1 baseline** (+ a contour schema
    wipe after merge — no migration chain accrues before prod); auto-flag defaults **1000 rejected /
    10 min** (`BACKEND_HIGHRATE_FLAG_THRESHOLD`/`_WINDOW`, rolling window, set-once, operator clears,
    no auto-ban); landing image = **caddy:2-alpine**; throttle data flows **gateway → backend** (a
    30 s per-key summary POST to the new `/api/v1/internal/ratelimit/report`, the existing trusted
    direction) with the episode window + flag rule in the backend (`internal/ratewatch`); rejection
    logging = **Warn summary per key per window + Debug per rejection** — a deliberate deviation from
    the phase's "structured log per rejection" (the R2 hammer would have logged ~522k lines in
    minutes); all three R2-report tails included (explicit h2c sizing, the session-resolve failure
    cause at Warn, reviving the admin limiter).
  - **Body cap:** `GATEWAY_MAX_BODY_BYTES` (default 1 MiB) as both the Connect per-message read limit
    and an `http.MaxBytesReader` wrap of the public mux; an oversized Execute is `resource_exhausted`.
  - **Dead config found:** `AdminPerMinute`/`AdminBurst` were never wired — the gateway `/_gm` mount is
    now 429-guarded per IP ahead of its Basic-Auth. The caddy-fronted contour path stays unlimited
    (stock caddy has no limiter) — an accepted gap, recorded in `docs/ARCHITECTURE.md` §12.
  - **Landing split:** a `landing` target in `gateway/Dockerfile` (the UI build stage is shared;
    identical compose build args keep it one cached build); the gateway drops `landing.html` from the
    embed and 308-redirects `/` → `/app/`; the contour caddy routes `/app/`, `/telegram/` and the
    Connect path to the gateway and the catch-all to the landing container; the CI deploy probe now
    checks both `/` (landing) and `/app/` (gateway).
  - **Observability:** `gateway_rate_limited_total{class}` (user/public/email/admin, aggregate-only)
    + a rate-vs-rejections panel on the Edge/UX dashboard; the admin console gains the **Throttled**
    page (the in-memory episode window, reset-on-restart like `active_users`, plus the flagged-account
    queue) and the flag badge / clear action on the user list / card.
  - The jet regen also restored the previously missing `game_drafts`/`game_hidden` generated models
    (their tables were added after the last jetgen run; no behaviour change).

- **R4** (interview + implementation):
  - **Locked decisions:** **delta-first**, not full snapshots — an event carries only the new move and
    the UI applies it to its per-game cache, keyed on `move_count` (idempotent + gap-safe: a gap or the
    actor's own move falls back to a `game.state` + `game.history` refetch). `match_found` /
    `game_started` carry the recipient's **initial `StateView`** (instant lobby→game); the fallback
    refetch stays the existing two calls (no merged endpoint); the matchmaking poll runs **only while
    the stream is down** (2.5 s); **all** UI-state-changing events carry their payload (incl. lobby `notify`).
  - **Enriched events** (`pkg/fbs` trailing fields — backward-compatible, no FB regen of *values*, only
    the schema): `opponent_moved` (+`move`/`game`/`bag_len`), `your_turn` (+`move_count`), `match_found`
    (+`state`), `game_over` (+`game`), `notify` (+`account`/`invitation`/`state`). The pre-R4
    `opponent_moved` scalars (`seat`/`action`/`score`/`total`) stay for wire back-compat, now redundant
    with `move`/`game` — slated for the R6 de-stage.
  - **Encoding placement:** the `notify` package keeps ownership of the FlatBuffers encoding (a new
    `encode.go` mirrors the gateway transcode but reads wire-agnostic `notify.*` input structs +
    `engine.MoveRecord`); the game/lobby/social services map their domain types to those structs, so the
    wire schema stays out of the domain. **Flagged for R6:** this partly duplicates the gateway encoders
    (different source types) — a candidate consolidation.
  - **Actor self-fetch killed too** (beyond literal "push"): the `submit_play`/`pass`/`exchange`/`resign`
    **response** (`MoveResult`) now returns the actor's refilled rack + bag size, so the mover renders the
    next turn from the response — `Game.svelte`'s `commit`/`pass`/`exchange`/`resign` drop their `await load()`.
  - **`match_found` enrichment** needs a per-seat initial state: `lobby.GameCreator` gained `InitialState`,
    and `game.Service.InitialState` builds the `notify.PlayerState` (rack re-encoded to wire indices, the
    variant alphabet embedded for a first-seen variant).
  - **UI:** a pure `lib/gamedelta.ts` reducer (`applyMoveDelta` / `applyGameOver` / `seedInitialState`,
    unit-tested) advances the cache; `app.svelte` seeds it on `match_found` / `game_started`; `Game.svelte`
    applies the delta (falling back to `load()` while composing, on a gap, or on its own move's new rack);
    `NewGame.svelte` polls only when `app.streamAlive` is false and guards its teardown so a push-delivered
    match is not cancelled.
  - **notify (friends/invitations) scope:** the backend carries the full account / invitation payload on the
    wire (per "all events → push"); the UI seeds the game cache from `game_started` but keeps its lightweight
    **authoritative** badge refresh (`refreshNotifications`, on the rare `notify` event + on foreground) rather
    than adding client-side friend/invitation caches — the per-move hot path is fully de-fetched, which was the
    goal. Deeper lobby-cache consumption is an easy follow-up.
  - **No schema change** (no migration); the contour needs no DB wipe. Tests: `notify` FB round-trips +
    `emitMove` delta + the `gamedelta` reducer; the e2e mock now emits the enriched delta.

- **R5** (interview + implementation):
  - **No code slimming — by analysis.** A gzip measure + sourcemap attribution of the real `dist` showed
    the app bundle is already minified + tree-shaken and dominated by the Connect/FlatBuffers transport
    runtime + generated FB/PB bindings (≈⅔ of `main`'s source) and the Svelte runtime — all
    third-party/generated, irreducible within R5's scope. App-authored code carries no hand-trimmable fat.
  - **Lazy-load rejected** (screens *and* i18n): `bundle-size.mjs` sums every emitted chunk, so
    code-splitting moves bytes between chunks for **zero total-size win** while adding request latency (+N
    gateway fetches on first navigation to a split screen). i18n lazy-load additionally buys ≤3 KB (en-only
    users) at the cost of an async `t()`, and `en` must stay bundled (it is the `MessageKey` type source +
    fallback). **Chunk-collapsing rejected** too — keeping the near-static Svelte runtime in its own
    cacheable chunk is the recommended practice (an app deploy then re-busts only `main`, not the runtime),
    and HTTP/2 makes the extra preload request negligible.
  - **Metric retargeted to the app.** The two-entry build (`index.html` app + `landing.html`) makes Rollup
    hoist the code shared by both (Svelte runtime + i18n + `aboutContent`) into one preloaded chunk, so the
    app actually loads its entry chunk **+ the shared chunk** (≈74 + ≈23 = **≈97 KB**), never `landing.js`
    (≈1.6 KB). The old script summed all three chunks (98.8 KB), over-counting the app by `landing.js`.
    `bundle-size.mjs` now parses each built HTML for the JS it eagerly loads and gates three parts
    independently — **app entry ≤ 100 KB, shared (Svelte+i18n) ≤ 30 KB, landing-own ≤ 5 KB** — reporting the
    app total (≈97) and landing total (≈24.5). Same CLI + exit-code contract, so the CI step is unchanged.
  - **No app/source/build change** (`App.svelte`, `lib/i18n/`, `vite.config.ts` untouched); no schema
    change, no contour wipe. The stale "~82 KB" figure was corrected in `bundle-size.mjs` and `ui/README.md`.