Files
scrabble-game/gateway/README.md
T
Ilia Denisov 8878711cf3 R3: gateway edge hardening — body cap, h2c sizing, rate-limit observability
- GATEWAY_MAX_BODY_BYTES (1 MiB): connect WithReadMaxBytes + http.MaxBytesReader
  on the public mux; explicit http2.Server MaxConcurrentStreams/IdleTimeout and
  an http.Server ReadHeaderTimeout (R2 report follow-up).
- gateway_rate_limited_total{class} counter, Debug per rejection, a rejection
  tracker drained every 30 s into a Warn summary per key and a report POST to
  /api/v1/internal/ratelimit/report (feeds the admin view + auto-flag).
- The dead AdminPerMinute/AdminBurst policy now guards the /_gm mount (429),
  ahead of its Basic-Auth.
- resolve() logs the cause of infra session-resolve failures at Warn (the
  transient unauthenticated dips from the R2 run); unknown tokens stay silent.
2026-06-10 01:58:48 +02:00

127 lines
6.9 KiB
Markdown

# gateway
The Scrabble platform's only public ingress (module `scrabble/gateway`). It
terminates the client's **Connect-RPC + FlatBuffers** traffic over HTTP/2
cleartext (`h2c`), authenticates the originating credential, mints/resolves a
thin opaque session, rate-limits, injects `X-User-ID` when forwarding to the
backend over REST/JSON, and bridges the backend's gRPC push stream to each
client's in-app live channel. It **embeds the static UI build** (`go:embed`, baked
in by the gateway image's node stage) and serves a **landing page** at `/` and the game
**SPA** at `/app/` (web) and `/telegram/` (the Mini App) — the single-origin model.
Hash-named `/assets/*` are served `immutable`; the HTML shells are `no-cache`. It can also serve the
backend's admin console at `/_gm` behind HTTP Basic-Auth for a local non-caddy run;
in the deployed contour the front caddy owns `/_gm` (see
[`../deploy`](../deploy)). See
[`../docs/ARCHITECTURE.md`](../docs/ARCHITECTURE.md) §2, §3, §10, §12, §13.
## Package layout
```
cmd/gateway/ # main: config -> backend client -> session cache ->
# push hub -> Connect h2c server (+ admin) -> serve
proto/edge/v1/ # Connect envelope contract (committed generated Go)
internal/config/ # GATEWAY_* env config
internal/backendclient/ # typed REST client (+ X-User-ID) and push gRPC client
internal/session/ # in-memory session cache (LRU/TTL, backend fallback)
internal/ratelimit/ # token-bucket limiter (golang.org/x/time/rate) + the rejection tracker (R3)
internal/connector/ # gRPC client to the Telegram connector (initData validate, out-of-app push) + routing
internal/push/ # live-event fan-out hub (per-user client streams)
internal/transcode/ # FlatBuffers<->REST bridge + message_type registry
internal/connectsrv/ # the Connect Gateway service over h2c (+ the in-memory active_users gauge)
internal/admin/ # Basic-Auth reverse proxy mounting the backend admin console at /_gm (verbatim)
internal/webui/ # embedded UI build (go:embed dist): landing at /, SPA at /app/ + /telegram/
```
The FlatBuffers payloads and the backend push proto are the shared wire
contracts in [`../pkg`](../pkg).
## Transport contract
A single `Gateway` Connect service: `Execute(message_type, payload, request_id)`
for unary operations and `Subscribe` for the live stream. The `payload` bytes are
FlatBuffers tables (`scrabble/pkg/fbs`); the gateway transcodes them to and from
the backend's JSON. The session token rides in `Authorization: Bearer`; `auth.*`
operations are unauthenticated and return the minted token. A unary domain
outcome rides back in `ExecuteResponse.result_code` (HTTP 200); only edge
failures become Connect error codes.
`auth.telegram` validates the Mini App `initData` by calling the **Telegram connector**
(`GATEWAY_CONNECTOR_ADDR`), which holds the bot token; the gateway also routes
out-of-app push to that connector for recipients with no live in-app stream
(ARCHITECTURE.md §10). When `GATEWAY_CONNECTOR_ADDR` is unset, both are disabled.
The Stage 6 message-type slice: `auth.telegram`, `auth.guest`,
`auth.email.request`, `auth.email.login`, `profile.get`, `game.submit_play`,
`game.state`, `lobby.enqueue`, `lobby.poll`, `chat.post`; live events
`your_turn`, `opponent_moved`, `chat_message`, `nudge`, `match_found`. Stage 7
added the play-loop ops; **Stage 8** added the social/account/history ops —
`friends.*` (list/incoming/request/respond/cancel/unfriend/code.issue/code.redeem),
`blocks.*`, `invitation.*` (list/create/accept/decline/cancel), `profile.update`,
`stats.get`, `game.gcg`, and the `notify` live event — all via the identical
transcode pattern (`transcode_social.go`). **Stage 11** added account linking & merge
`link.email.request/confirm/merge` and `link.telegram.confirm/merge`
(`transcode_link.go`); the telegram ops validate the **Login Widget** payload via the
connector (`ValidateLoginWidget`) and forward the trusted `external_id`. These
**superseded** the Stage 8 `email.bind.*` ops, which were removed.
## Configuration
| Variable | Default | Notes |
| --- | --- | --- |
| `GATEWAY_HTTP_ADDR` | `:8081` | public Connect/h2c listener (also serves the admin console at `/_gm`) |
| `GATEWAY_LOG_LEVEL` | `info` | zap level |
| `GATEWAY_BACKEND_HTTP_URL` | `http://localhost:8080` | backend REST base URL |
| `GATEWAY_BACKEND_GRPC_ADDR` | `localhost:9090` | backend push gRPC address |
| `GATEWAY_BACKEND_TIMEOUT` | `5s` | per backend REST call |
| `GATEWAY_ADMIN_USER` / `GATEWAY_ADMIN_PASSWORD` | unset | enable + guard the admin console at `/_gm` |
| `GATEWAY_CONNECTOR_ADDR` | unset | Telegram connector gRPC address (enables initData validation + out-of-app push) |
| `GATEWAY_DEFAULT_SUPPORTED_LANGUAGES` | `en,ru` | New Game variant gating set placed on the Session for non-platform logins (web / email / guest); a deployment may narrow it |
| `GATEWAY_SESSION_TTL` | `10m` | cached session lifetime |
| `GATEWAY_SESSION_CACHE_MAX` | `50000` | cached session cap |
| `GATEWAY_PUSH_HEARTBEAT_INTERVAL` | `10s` | live-stream keep-alive (an immediate heartbeat also fires on open, under the ~15s edge idle timeout) |
| `GATEWAY_MAX_BODY_BYTES` | `1048576` | caps one request body and one Connect message read; an oversized Execute is refused with `resource_exhausted` (R3) |
| `GATEWAY_SERVICE_NAME` | `scrabble-gateway` | OpenTelemetry `service.name` |
| `GATEWAY_OTEL_TRACES_EXPORTER` | `none` | `none`, `stdout` or `otlp` (gRPC; endpoint from `OTEL_EXPORTER_OTLP_*`) |
| `GATEWAY_OTEL_METRICS_EXPORTER` | `none` | `none`, `stdout` or `otlp` |
Rate-limit defaults (built-in): public 30/min·IP (burst 10), authenticated
300/min·user (burst 80, raised in Stage 17 for multi-device play), admin
60/min·IP (burst 20, guarding the `/_gm` mount ahead of its Basic-Auth),
email-code 5/10 min·IP (burst 2).
Every rejection increments `gateway_rate_limited_total{class}`
(`user`/`public`/`email`/`admin`) and logs one Debug line; a reporter drains the
per-key rejection tracker every 30 s, emits a Warn summary per throttled key and
posts the report to the backend (`/api/v1/internal/ratelimit/report`), feeding
the admin console's throttled view and the high-rate auto-flag (R3).
## Run
```sh
GATEWAY_BACKEND_HTTP_URL=http://localhost:8080 \
GATEWAY_BACKEND_GRPC_ADDR=localhost:9090 \
go run ./gateway/cmd/gateway # Connect edge on :8081
```
## Generated code
The Connect envelope Go is committed under `proto/edge/v1`. Regenerate after
editing the `.proto` (dev-time, like `backend/cmd/jetgen`):
```sh
make -C gateway tools # go install protoc-gen-go + protoc-gen-connect-go
make -C gateway gen # buf generate (local plugins)
```
The FlatBuffers payloads are generated in [`../pkg`](../pkg) (`make -C pkg fbs`).
## Tests
```sh
go test -count=1 ./gateway/...
```
All gateway tests are hermetic: no real network, a fake backend (`httptest`) and
credential fixtures. There is no integration (Docker) suite — the gateway holds
no database.