R3: dashboards, docs and tracker bake-back
CI / changes (pull_request) Successful in 1s
CI / unit (pull_request) Successful in 8s
CI / integration (pull_request) Successful in 12s
CI / ui (pull_request) Successful in 36s
CI / gate (pull_request) Successful in 0s
CI / deploy (pull_request) Successful in 1m7s
- Edge/UX dashboard: aggregate request-rate vs rejection-rate panel (gateway_rate_limited_total by class; no per-user labels).
- ARCHITECTURE §2/§11/§12/§13: body cap + explicit h2c sizing, the rate-limit observability pipeline and auto-flag policy, the admin-limiter note (and the caddy-path gap), the landing container topology; fixed the stale 120/min per-user figure.
- FUNCTIONAL (+_ru): the Throttled view and the reversible high-rate flag.
- gateway/backend/deploy READMEs, TESTING.md, root CLAUDE.md updated.
- PRERELEASE.md: R3 interview decisions + implementation refinements logged; tracker R3 -> done (this PR implements it; CI gates the merge).
+29
-1
@@ -19,7 +19,7 @@ the edge before prod. Each phase maps back to the owner's raw pre-release TODO l
|---|-------|-----------|--------|
| R1 | Schema & naming reset | 1 + 10 | **done** |
| R2 | Stress harness + contour observability + early run | 9a | **done** |
-| R3 | Edge hardening | 2 + 8 + 3 | todo |
+| R3 | Edge hardening | 2 + 8 + 3 | **done** |
| R4 | Push enrichment + kill the last poll | 4 + 5 | todo |
| R5 | Bundle slimming | 6 | todo |
| R6 | Refactor + docs reconciliation + de-staging | 7 | todo |
@@ -253,3 +253,31 @@ Then Stage 18.
it feeds R3 (h2c `MaxConcurrentStreams`/timeouts, body-size cap), R6 and R7 (per-player transports,
separate hardware, pool/limit sizing).
- **CI:** `./loadtest/...` added to the path filter + vet/build/test; `go.work.sum` carries the new deps.

- **R3** (interview + implementation):
- **Locked decisions:** the flag column lands by **editing the R1 baseline** (+ a contour schema
wipe after merge — no migration chain accrues before prod); auto-flag defaults **1000 rejected /
10 min** (`BACKEND_HIGHRATE_FLAG_THRESHOLD`/`_WINDOW`, rolling window, set-once, operator clears,
no auto-ban); landing image = **caddy:2-alpine**; throttle data flows **gateway → backend** (a
30 s per-key summary POST to the new `/api/v1/internal/ratelimit/report`, the existing trusted
direction) with the episode window + flag rule in the backend (`internal/ratewatch`); rejection
logging = **Warn summary per key per window + Debug per rejection** — a deliberate deviation from
the phase's "structured log per rejection" (the R2 hammer would have logged ~522k lines in
minutes); all three R2-report tails included (explicit h2c sizing, the session-resolve failure
cause at Warn, reviving the admin limiter).
- **Body cap:** `GATEWAY_MAX_BODY_BYTES` (default 1 MiB) as both the Connect per-message read limit
and an `http.MaxBytesReader` wrap of the public mux; an oversized Execute is `resource_exhausted`.
- **Dead config found:** `AdminPerMinute`/`AdminBurst` were never wired — the gateway `/_gm` mount is
now 429-guarded per IP ahead of its Basic-Auth. The caddy-fronted contour path stays unlimited
(stock caddy has no limiter) — an accepted gap, recorded in `docs/ARCHITECTURE.md` §12.
- **Landing split:** a `landing` target in `gateway/Dockerfile` (the UI build stage is shared;
identical compose build args keep it one cached build); the gateway drops `landing.html` from the
embed and 308-redirects `/` → `/app/`; the contour caddy routes `/app/`, `/telegram/` and the
Connect path to the gateway and the catch-all to the landing container; the CI deploy probe now
checks both `/` (landing) and `/app/` (gateway).
- **Observability:** `gateway_rate_limited_total{class}` (user/public/email/admin, aggregate-only)
+ a rate-vs-rejections panel on the Edge/UX dashboard; the admin console gains the **Throttled**
page (the in-memory episode window, reset-on-restart like `active_users`, plus the flagged-account
queue) and the flag badge / clear action on the user list / card.
- The jet regen also restored the previously missing `game_drafts`/`game_hidden` generated models
(their tables were added after the last jetgen run; no behaviour change).
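
For readers of the **Locked decisions** item above, the auto-flag rule (1000 rejected within a rolling 10 min window, set once, cleared only by an operator) can be sketched roughly as follows. This is an illustration only, not the code in `internal/ratewatch`: the names `watcher`, `report`, `clear` and `demo` are hypothetical, the `>=` comparison is an assumption, and whether clearing also resets the window is not stated in the notes (here it does not).

```go
package main

import (
	"fmt"
	"time"
)

// sample is one summary report: n rejections observed at time at.
type sample struct {
	at time.Time
	n  int
}

// watcher is a hypothetical stand-in for the backend's flag rule.
// It is not safe for concurrent use; a real service would add locking.
type watcher struct {
	threshold int
	window    time.Duration
	reports   map[string][]sample
	flagged   map[string]bool
}

func newWatcher(threshold int, window time.Duration) *watcher {
	return &watcher{
		threshold: threshold,
		window:    window,
		reports:   make(map[string][]sample),
		flagged:   make(map[string]bool),
	}
}

// report adds n rejections for key at time now, drops samples that fell out
// of the rolling window, and returns true only when this call sets the flag.
func (w *watcher) report(key string, n int, now time.Time) bool {
	cutoff := now.Add(-w.window)
	kept := w.reports[key][:0]
	total := n
	for _, s := range w.reports[key] {
		if s.at.After(cutoff) {
			kept = append(kept, s)
			total += s.n
		}
	}
	w.reports[key] = append(kept, sample{at: now, n: n})
	// Assumption: the flag trips at ">= threshold". Set-once means an
	// already flagged key never produces a second flag event.
	if total >= w.threshold && !w.flagged[key] {
		w.flagged[key] = true
		return true
	}
	return false
}

// clear is the operator action; nothing in this sketch clears the flag automatically.
func (w *watcher) clear(key string) { delete(w.flagged, key) }

// demo replays four reports for one key and returns what report returned each time.
func demo() []bool {
	w := newWatcher(1000, 10*time.Minute)
	t0 := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	out := []bool{
		w.report("user-1", 600, t0),                    // 600 in the window
		w.report("user-1", 500, t0.Add(5*time.Minute)), // 1100 in the window: flag set
		w.report("user-1", 100, t0.Add(6*time.Minute)), // already flagged
	}
	w.clear("user-1")
	// Twenty minutes in, the earlier samples have aged out of the window.
	out = append(out, w.report("user-1", 10, t0.Add(20*time.Minute)))
	return out
}

func main() {
	fmt.Println(demo()) // prints [false true false false]
}
```

Summing per-report counts rather than individual rejections matches the 30 s summary POST described above, so the window holds at most a few dozen samples per key.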
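
The **Body cap** item above pairs a Connect read limit with an `http.MaxBytesReader` wrap of the public mux. A minimal sketch of the second half, using only the standard library, might look like this. The helper names (`capBody`, `echoLen`, `statusFor`) are hypothetical, and a plain HTTP 413 stands in for the `resource_exhausted` code that the real Connect handler returns; the Connect per-message limit itself is not shown.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// defaultMaxBodyBytes mirrors the documented GATEWAY_MAX_BODY_BYTES default (1 MiB).
const defaultMaxBodyBytes int64 = 1 << 20

// capBody wraps next so that every request body is limited to limit bytes.
func capBody(limit int64, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, limit)
		next.ServeHTTP(w, r)
	})
}

// echoLen is a hypothetical handler that reads the whole body and replies
// with its length. For brevity any read error is treated as "too large".
func echoLen(w http.ResponseWriter, r *http.Request) {
	b, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, "body too large", http.StatusRequestEntityTooLarge)
		return
	}
	fmt.Fprintf(w, "%d", len(b))
}

// statusFor posts an n-byte body through a capped mux and returns the status code.
func statusFor(limit int64, n int) int {
	mux := http.NewServeMux()
	mux.HandleFunc("/", echoLen)
	h := capBody(limit, mux)

	body := strings.NewReader(strings.Repeat("x", n))
	req := httptest.NewRequest(http.MethodPost, "/", body)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(16, 8))                   // under the cap: 200
	fmt.Println(statusFor(16, 32))                  // over the cap: 413
	fmt.Println(statusFor(defaultMaxBodyBytes, 32)) // default cap: 200
}
```

Wrapping the mux rather than each handler means a route added later inherits the cap without further wiring.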
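
The **Dead config found** item above says the `/_gm` mount is now 429-guarded per IP ahead of Basic-Auth. The ordering is the point: a throttled client gets a 429 before any credential check runs. The sketch below shows that ordering with a deliberately simple fixed-window counter, which is a swapped-in technique; the `AdminPerMinute`/`AdminBurst` names suggest the real limiter is rate-plus-burst, and every identifier here (`ipLimiter`, `guardAdmin`, `probe`, the credentials) is hypothetical.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// ipLimiter allows at most perMinute requests per client IP in each
// one-minute window. All counters reset together when the window rolls over.
type ipLimiter struct {
	mu          sync.Mutex
	perMinute   int
	windowStart time.Time
	counts      map[string]int
}

func newIPLimiter(perMinute int) *ipLimiter {
	return &ipLimiter{perMinute: perMinute, counts: make(map[string]int)}
}

func (l *ipLimiter) allow(ip string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if now.Sub(l.windowStart) >= time.Minute {
		l.windowStart = now
		l.counts = make(map[string]int)
	}
	l.counts[ip]++
	return l.counts[ip] <= l.perMinute
}

// sameString compares two strings without an early exit on the first mismatch.
func sameString(a, b string) bool {
	return subtle.ConstantTimeCompare([]byte(a), []byte(b)) == 1
}

// guardAdmin checks the limiter first, then Basic Auth, then calls next.
func guardAdmin(l *ipLimiter, user, pass string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ip, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			ip = r.RemoteAddr
		}
		if !l.allow(ip, time.Now()) {
			http.Error(w, "rate limited", http.StatusTooManyRequests)
			return
		}
		u, p, ok := r.BasicAuth()
		if !ok || !sameString(u, user) || !sameString(p, pass) {
			w.Header().Set("WWW-Authenticate", `Basic realm="admin"`)
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// probe sends n unauthenticated requests from one client IP and returns the status codes.
func probe(perMinute, n int) []int {
	console := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "admin console")
	})
	h := guardAdmin(newIPLimiter(perMinute), "admin", "secret", console)
	codes := make([]int, 0, n)
	for i := 0; i < n; i++ {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/_gm/", nil))
		codes = append(codes, rec.Code)
	}
	return codes
}

func main() {
	fmt.Println(probe(2, 3)) // prints [401 401 429]
}
```

Because failed logins count against the limit, the guard also slows credential guessing, which is one plausible reason to place it ahead of the auth check.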