R3: dashboards, docs and tracker bake-back

- Edge/UX dashboard: aggregate request-rate vs rejection-rate panel
  (gateway_rate_limited_total by class; no per-user labels).
- ARCHITECTURE §2/§11/§12/§13: body cap + explicit h2c sizing, the rate-limit
  observability pipeline and auto-flag policy, the admin-limiter note (and the
  caddy-path gap), the landing container topology; fixed the stale 120/min
  per-user figure.
- FUNCTIONAL (+_ru): the Throttled view and the reversible high-rate flag.
- gateway/backend/deploy READMEs, TESTING.md, root CLAUDE.md updated.
- PRERELEASE.md: R3 interview decisions + implementation refinements logged;
  tracker R3 -> done (this PR implements it; CI gates the merge).
2026-06-10 05:12:17 +02:00
parent f20a4b49ff
commit 7e75c32d07
10 changed files with 144 additions and 29 deletions
@@ -98,6 +98,15 @@ dropped). Horizontal scaling is explicit future work.
  response was lost — its button is disabled while offline and the player re-issues it on
  reconnect). A reachability watcher (a lightweight `profile.get` probe) clears the signal when no
  other traffic is in flight; the live `Subscribe` stream's drop/recovery feeds the same signal.
+  **Edge hardening (R3):** every request body on the public listener is capped at
+  `GATEWAY_MAX_BODY_BYTES` (default 1 MiB — far above any legitimate payload), both at the HTTP
+  layer (`http.MaxBytesReader`) and as the Connect per-message read limit, so an oversized
+  `Execute` is refused (`resource_exhausted`) without buffering. The h2c server carries explicit
+  sizing: `MaxConcurrentStreams` 250 (the x/net default made visible — a real client holds one
+  `Subscribe` stream plus a few unary calls) and a 3-minute connection `IdleTimeout` (a live
+  `Subscribe` stream keeps its connection active, so only abandoned connections are reaped); the
+  `http.Server` sets only `ReadHeaderTimeout` (10 s) — Read/WriteTimeout would kill the stream.
+  R7 revisits the exact values under load.
 - **Alphabet on the wire (Stage 13)**: live play exchanges **alphabet indices**, not
  concrete letters. The rack (`StateView.rack`), the `SubmitPlay`/`Evaluate` tiles, the
  `Exchange` tiles and the `CheckWord` word are `ubyte` indices into the variant's alphabet
@@ -572,6 +581,21 @@ promotions) is future work and would deliver short markdown messages (text + lin
  distinct accounts that performed an authenticated edge action in the window. The
  gauge is single-process by design (single-instance MVP, §10): it is correct for one
  gateway, resets on restart, and is a live operational figure, not a billing count.
+- **Rate-limit observability (R3):** every limiter rejection increments the gateway
+  counter `gateway_rate_limited_total` (`class` = user/public/email/admin — aggregate
+  only, honouring the no-per-user-label discipline above) and logs one **Debug** line;
+  a gateway reporter drains the per-key rejection tracker every 30 s, emits one **Warn**
+  summary per throttled key and posts the report to the backend
+  (`POST /api/v1/internal/ratelimit/report`, network-trusted like `sessions/resolve`).
+  The backend's `ratewatch` keeps a bounded in-memory episode window (single-instance,
+  resets on restart, like `active_users`) surfaced on the admin console's **Throttled**
+  page next to the flagged-account review queue, and applies the **conservative
+  auto-flag**: an account sustaining `BACKEND_HIGHRATE_FLAG_THRESHOLD` rejected calls
+  (default 1000) within `BACKEND_HIGHRATE_FLAG_WINDOW` (default 10 min) gets the soft,
+  reversible `accounts.flagged_high_rate_at` marker — set once, shown in the user
+  list/detail, cleared by the operator, **never an automatic ban** and never a request
+  gate. The Edge/UX dashboard graphs the aggregate request rate against the rejection
+  rate by class.
 - Unauthenticated `GET /healthz` (liveness) and `GET /readyz` (readiness — the
  database answers a bounded ping and the session cache is warmed).
 - The backend serves a **second listener** — a gRPC server
@@ -582,12 +606,12 @@ promotions) is future work and would deliver short markdown messages (text + lin

 | Concern | Enforced by |
 | --- | --- |
-| Public rate limiting / anti-abuse | gateway |
+| Public rate limiting / anti-abuse | gateway (per-IP public/email/admin classes, per-user authenticated class; a request body cap of `GATEWAY_MAX_BODY_BYTES`; rejections are metered, summarised to the backend and surfaced in the admin console with a conservative reversible auto-flag — R3, §11) |
 | Telegram initData validation (bot-token HMAC) | the Telegram connector; the gateway delegates it over gRPC, so the bot token lives only in the connector |
 | Session minting; email-code / guest validation | gateway (with backend) |
 | Session → `user_id` resolution, `X-User-ID` injection | gateway |
 | Authorisation, ownership, state transitions | backend (`X-User-ID` is the sole identity input) |
-| Admin authentication | a single Basic-Auth gate on `/_gm/*`, forwarded **verbatim** to the backend's server-rendered admin console (and, in the deployed contour, routing `/_gm/grafana/*` to Grafana). In the deploy the **caddy** owns this gate (§13); a local non-caddy run uses the gateway's own `GATEWAY_ADMIN_*` proxy. The backend trusts the proxy (no admin principal) and guards its state-changing POSTs with a **same-origin** check — the console's CSRF defence. No operator identity is tracked |
+| Admin authentication | a single Basic-Auth gate on `/_gm/*`, forwarded **verbatim** to the backend's server-rendered admin console (and, in the deployed contour, routing `/_gm/grafana/*` to Grafana). In the deploy the **caddy** owns this gate (§13); a local non-caddy run uses the gateway's own `GATEWAY_ADMIN_*` proxy, which the per-IP admin limiter class guards ahead of its Basic-Auth (R3) — the caddy-fronted path has no limiter (stock caddy), an accepted gap. The backend trusts the proxy (no admin principal) and guards its state-changing POSTs with a **same-origin** check — the console's CSRF defence. No operator identity is tracked |
 | backend ↔ gateway ↔ connector trust | the network (only gateway may reach backend; the connector serves unauthenticated gRPC on the internal segment) |

 This is an explicit, accepted MVP risk: compromise of the gateway↔backend
@@ -597,7 +621,7 @@ mutual auth is a future hardening step.
 **Short numeric codes** (email confirm-codes and Stage 8 friend codes) are stored
 only as SHA-256 hashes and are short-lived and single-use. The unauthenticated
 email path carries a tight per-IP sub-limit (5 / 10 min); the **friend-code redeem**
-is authenticated, so it rides the per-user limit (120 / min) and is further bounded
+is authenticated, so it rides the per-user limit (300 / min) and is further bounded
 by the code's 12 h TTL, single use, and **one live code per issuer** (which caps the
 valid-code population). Brute-forcing a 6-digit friend code within these limits is an
 accepted MVP risk with low blast radius (an unwanted friendship is removable/blockable);
@@ -605,22 +629,27 @@ a dedicated redeem sub-limit or a longer code is the hardening step if abuse app

 ## 13. Deployment (informational)

-Single public origin, path-routed. The gateway **embeds** the static UI build
-(`go:embed`, baked in by a node stage in `gateway/Dockerfile`). The Vite build has two
-entries: a lightweight **landing page** served at `/`, and the game **SPA** served at
+Single public origin, path-routed. The Vite build has two entries: a lightweight
+**landing page** and the game **SPA**. The gateway **embeds** the SPA build
+(`go:embed`, baked in by a node stage in `gateway/Dockerfile`) and serves it at
 `/app/` (web) and `/telegram/` (the Telegram Mini App; outside Telegram that path
-redirects to the root — the client-side guard). Hash-named `/assets/*` are served
+redirects to the root — the client-side guard); a stray hit on the gateway's `/`
+308-redirects to `/app/`. The **landing** ships in its own static container (R3): the
+`landing` target of `gateway/Dockerfile` (caddy:2-alpine + the same Vite build,
+`deploy/landing/Caddyfile`) serves it at `/`, so stray public traffic is absorbed by
+static file serving and never reaches the Go edge. Hash-named `/assets/*` are served
 `immutable` (a relaunch is a cache hit, not a re-download); the HTML shells are
-`no-cache` so a new deploy is picked up. An in-compose **caddy** is the
-contour's edge: it owns a single `/_gm` Basic-Auth and routes `/_gm/grafana/*` to
-**Grafana** (anonymous-admin, so the one shared login gates it with no per-user
-Grafana accounts) and the rest of `/_gm/*` to the backend-rendered **admin console**;
-everything else (`/`, `/app/`, `/telegram/`, the Connect edge) goes to the gateway. The
+`no-cache` so a new deploy is picked up — both containers apply the same caching. An
+in-compose **caddy** is the contour's edge: it owns a single `/_gm` Basic-Auth and
+routes `/_gm/grafana/*` to **Grafana** (anonymous-admin, so the one shared login gates
+it with no per-user Grafana accounts) and the rest of `/_gm/*` to the backend-rendered
+**admin console**; `/app/`, `/telegram/` and the Connect path go to the gateway; the
+catch-all — notably the landing at `/` — goes to the landing container. The
 **Telegram connector** runs as a separate container with **no public ingress** — it
 long-polls Telegram and egresses through a VPN sidecar, answering only internal gRPC.

 The full contour (`deploy/docker-compose.yml`) runs one `gateway`, one `backend`,
-one Postgres, the connector (+ its VPN sidecar) and the **observability stack** —
+one Postgres, the static `landing`, the connector (+ its VPN sidecar) and the **observability stack** —
 OTel Collector (OTLP/gRPC ingest → Prometheus metrics + Tempo traces) and Grafana
 with provisioned datasources and dashboards. All three services export OTLP to the
 collector; the connector shares the VPN sidecar's netns, so its `AWG_CONF` must not
@@ -633,7 +662,8 @@ network (project-scoped DNS); only caddy joins the shared external `edge` networ
 Two contours, two secret/variable prefixes (`TEST_` / `PROD_`):
 - **Test** (Stage 16): auto-deploys on a PR into — or a push to — `development`
  (`.gitea/workflows/ci.yaml` → `docker compose up -d --build` on the Gitea runner
-  host, then a `GET /` probe through caddy). The host caddy terminates TLS and
+  host, then `GET /` + `GET /app/` probes through caddy — the landing container and
+  the gateway, R3). The host caddy terminates TLS and
  forwards the domain to `scrabble:80`, so the in-compose caddy serves plain HTTP
  (`CADDY_SITE_ADDRESS=:80`). The in-compose caddy **trusts X-Forwarded-For from
  private-range upstreams** (`trusted_proxies private_ranges`), so the real client IP —