ui: plan 01-27 done #1

Merged
developer merged 120 commits from ai/ui-client into main 2026-05-13 18:55:14 +00:00
6 changed files with 3247 additions and 3012 deletions
Showing only changes of commit 08f1917bc1.
File diff suppressed because it is too large.
+39 -13
@@ -30,19 +30,29 @@ This repository hosts the Galaxy Game project.
- `galaxy/<service>/PLAN.md` — staged implementation plan for the service.
May already be complete and remain in place for historical reasons.
- `galaxy/<service>/docs/` — per-stage decision records
(one file per decision, re-organized after full implementation
of `PLAN.md`).
- `galaxy/<service>/docs/` — live topic-based documentation that's
deeper than what fits in `README.md` (per-feature design notes,
protocol specs, runbooks). Not stage-by-stage history.
## Decision records when implementing stages from PLAN.md
## Decisions during stage implementation
- Stage-related discussion and decisions do NOT live in `README.md` or
`docs/ARCHITECTURE.md`. Those files describe the current state, not the history.
- Each non-trivial decision gets its own `.md` under the module's `docs/`,
referenced from the relevant `README.md`.
- Any agreement reached during interactive planning that is not obvious from
the code must be captured — either as a decision record or as an entry in
the module's README.
Stages from `PLAN.md` produce decisions. Those decisions never live in a
separate per-decision history file. Instead, every non-obvious decision is
baked back into the live state in three places:
1. **The plan itself.** Update the relevant stage's text, acceptance
criteria, or targeted tests so it reflects what was decided. If
earlier already-implemented stages need to follow the new agreement,
correct their code, tests, and live docs in the same patch.
2. **Later, not-yet-implemented stages.** When a decision affects later
stages — scope, dependencies, deliverables, or tests — update those
stages now; do not leave future work to re-derive them.
3. **Live documentation.** Module `README.md`, project
`docs/ARCHITECTURE.md`, `docs/FUNCTIONAL.md` (with its
`docs/FUNCTIONAL_ru.md` mirror), the affected service `openapi.yaml`
or `*.proto`, and any topic doc under `galaxy/<service>/docs/` that
the decision touches. `README.md` and `ARCHITECTURE.md` always
describe current state, not the history of how it was reached.
## Scope of PLAN.md changes
@@ -82,8 +92,8 @@ details.
The same behaviour is described in several parallel sources: code,
`docs/ARCHITECTURE.md`, `docs/FUNCTIONAL.md` (with its Russian mirror
`docs/FUNCTIONAL_ru.md`), the affected service `README.md`, the
relevant `openapi.yaml` or `*.proto`, and the per-stage decision
records under `galaxy/<service>/docs/`. They must never disagree.
relevant `openapi.yaml` or `*.proto`, and the topic-based docs under
`galaxy/<service>/docs/`. They must never disagree.
- Any patch that changes user-visible behaviour, an API contract, or a
cross-service flow updates every affected source in the same change
@@ -103,6 +113,22 @@ records under `galaxy/<service>/docs/`. They must never disagree.
`docs/FUNCTIONAL_ru.md` (translate only the touched paragraphs).
Skipping the mirror is treated as an incomplete patch.
## Code compactness
- Prefer compact code over speculative universality. Two similar
occurrences are not yet a pattern — wait for the third real caller
before extracting an abstraction.
- Do not add seams, hooks, or configuration knobs for hypothetical
future requirements. If the next stage of `PLAN.md` will need
something, the next stage will add it.
- A bug fix does not need surrounding cleanup; a one-shot operation
does not need a helper function; a single concrete value does not
need a parameter.
- When the plan can be satisfied by reusing an existing function or
type, do that instead of introducing a new one.
- This rule is about scope, not laziness — well-named identifiers,
precise types, and full test coverage stay non-negotiable.
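The rule above can be made concrete with a small illustrative sketch (the names and the speculative alternative are invented for this example, not taken from the codebase):

```go
package main

import "fmt"

// Speculative version (what this rule forbids): a parameterised helper
// with width and fill-rune knobs that no second caller needs yet.
//
//	func formatBanner(text string, width int, fill rune) string { ... }
//
// Compact version: the single real caller gets the single concrete case.
func banner(title string) string {
	return fmt.Sprintf("== %s ==", title)
}

func main() {
	fmt.Println(banner("Stage 1")) // prints "== Stage 1 =="
}
```

If a third caller later needs a different width or fill character, that caller's stage introduces the parameter — exactly the "next stage will add it" discipline described above.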
## Dependencies
- Before adding a new module, check its upstream repository for the latest
-868
@@ -1,868 +0,0 @@
# backend — Implementation Plan
This plan has already been implemented and stays here for historical
reasons. It should NOT be treated as a source of truth for service
functionality.
---
## Summary
This plan is the technical specification for implementing the
consolidated Galaxy `backend` service. It is read together with
`../docs/ARCHITECTURE.md` (architecture and security model) and
`README.md` (module layout, configuration, operations).
After reading those two documents and this plan, an implementing
engineer should not need to ask architectural questions. Every stage is
self-contained inside its domain area; stages run in order; each stage
has explicit Critical files.
The plan does not invent new domain concepts. It catalogues the work
required to assemble what the architecture document already defines.
## ~~Stage 1~~ — Repository cleanup
This stage was implemented and marked as done.
Goal: remove every module whose responsibility moves into `backend`,
and prepare the workspace for the new module.
Actions:
1. `git rm -r authsession/ lobby/ mail/ notification/ gamemaster/
rtmanager/ geoprofile/ user/ integration/ pkg/redisconn/
pkg/notificationintent/`.
2. Edit `go.work`:
- Remove `use` lines for the deleted modules.
- Remove `replace` lines for `galaxy/redisconn` and
`galaxy/notificationintent`.
- Do not add `./backend` yet — the module is created in Stage 2.
3. Confirm that surviving modules still build:
`go build ./gateway/... ./game/... ./client/... ./pkg/...`.
Any compile error here means a surviving module imported a
removed package and must be patched (the only realistic culprit is
`gateway`, which references `pkg/redisconn` and the deleted streams;
patches there belong to Stage 6, not Stage 1 — for Stage 1 it is
acceptable to leave gateway broken if and only if the only failures
come from imports of removed packages).
4. Run `go vet ./pkg/...` and confirm no diagnostic.
Out of scope: any code change inside surviving modules. Stage 1 is
purely deletion plus `go.work` edits.
Critical files:
- `go.work`
- the deletion of `authsession/`, `lobby/`, `mail/`, `notification/`,
`gamemaster/`, `rtmanager/`, `geoprofile/`, `user/`, `integration/`,
`pkg/redisconn/`, `pkg/notificationintent/`.
Done criteria:
- `git status` shows only deletions plus the `go.work` edit.
- `go build ./pkg/...` is clean.
- `go vet ./pkg/...` is clean.
## ~~Stage 2~~ — Backend skeleton & shared infrastructure
This stage was implemented and marked as done.
Goal: stand up the new module with its boot path, configuration,
telemetry, logger, HTTP listener, Postgres pool, and gRPC listener — all
with empty handlers. After this stage `go run ./backend/cmd/backend`
must boot to a state where probes return 200 and migrations run (with an
empty migration file).
Actions:
1. Create `backend/go.mod` with module path `galaxy/backend` and Go
version matching `go.work`. Add direct dependencies:
`github.com/gin-gonic/gin`, `github.com/jackc/pgx/v5`,
`github.com/go-jet/jet/v2`, `github.com/pressly/goose/v3`,
`go.uber.org/zap`, `go.opentelemetry.io/otel` and the OTLP
trace/metric exporters used by other services, and the `galaxy/*`
pkg modules (`postgres`, `model`, `geoip`, `cronutil`, `error`,
`util`).
2. Add `./backend` to `go.work` `use(...)`.
3. `backend/cmd/backend/main.go` — boot order:
1. Load `config.LoadFromEnv()`; `cfg.Validate()`.
2. Initialise telemetry (`telemetry.NewProcess(cfg.Telemetry)`). Set
global tracer and meter providers.
3. Construct the zap logger; inject trace fields helper.
4. Open Postgres pool. Apply embedded migrations with goose. Fail
fast on any error.
5. Construct module wiring (empty for now; populated in Stage 5).
6. Start the HTTP server (gin engine with empty route groups, plus
`/healthz` and `/readyz`).
7. Start the gRPC push server (no streams accepted yet — Stage 6).
8. Block on `signal.NotifyContext(ctx, SIGINT, SIGTERM)`; on signal,
drain in the order described in `README.md` §16.
4. `backend/internal/config/config.go` — env-loader following the
pattern used by surviving services. Cover every variable listed in
`README.md` §4. Provide `DefaultConfig()` and `Validate()`.
5. `backend/internal/telemetry/runtime.go` — port the existing service
pattern verbatim: configurable OTLP gRPC/HTTP exporter, optional
stdout exporter, Prometheus pull endpoint when configured. Expose
`TraceFieldsFromContext(ctx) []zap.Field`.
6. `backend/internal/server/server.go` — gin engine, three empty route
groups, request id middleware, panic recovery middleware, otel
middleware. Probe handlers in `server/probes.go`.
7. `backend/internal/postgres/pool.go` — pgx pool factory using the
shared `galaxy/postgres` helper.
8. `backend/internal/postgres/migrations/00001_init.sql` — empty file
containing the `-- +goose Up` and `-- +goose Down` markers and a
single `CREATE SCHEMA IF NOT EXISTS backend;` statement so the
migration is non-empty and can be verified.
9. `backend/internal/postgres/migrations/embed.go` — `embed.FS` and
exported `Migrations() fs.FS` helper.
10. `backend/internal/push/server.go` — gRPC server skeleton bound to
`cfg.GRPCPushListenAddr`. No service registered yet.
11. `backend/Makefile` — at minimum a `jet` target stub that prints
"not generated yet"; will be filled in Stage 4.
Critical files:
- `backend/go.mod`, `go.work`
- `backend/cmd/backend/main.go`
- `backend/internal/config/config.go`
- `backend/internal/telemetry/runtime.go`
- `backend/internal/server/server.go`, `backend/internal/server/probes.go`
- `backend/internal/postgres/pool.go`,
`backend/internal/postgres/migrations/00001_init.sql`,
`backend/internal/postgres/migrations/embed.go`
- `backend/internal/push/server.go`
- `backend/Makefile`
Done criteria:
- `go build ./backend/...` is clean.
- `go run ./backend/cmd/backend` starts, applies the placeholder
migration, opens HTTP and gRPC listeners, and serves `/healthz` 200
and `/readyz` 200.
- Telemetry output (stdout exporter) shows trace and metric activity on
a probe hit.
## ~~Stage 3~~ — API contract & routing
This stage was implemented and marked as done.
Goal: define the entire backend REST contract in `openapi.yaml` and
register every handler as a placeholder that returns
`501 Not Implemented`. Wire the middleware stack for each route group.
The contract test suite must validate every endpoint round-trip against
the OpenAPI document and pass on the placeholders.
Actions:
1. Author `backend/openapi.yaml` — single document with three tags
(`Public`, `User`, `Admin`) and the endpoint set below. Reuse
schemas from `pkg/model` where possible; keep the rest under
`components/schemas/*`.
2. Implement middleware in `backend/internal/server/middleware/`:
- `requestid` — assigns and propagates a request id (Stage 2 may
have already done this; consolidate here).
- `logging` — emits an access log entry with trace fields.
- `metrics` — counters and histograms per route group.
- `panicrecovery` — converts panics to 500 with structured logging.
- `userid` — required on `/api/v1/user/*`. Reads `X-User-ID`,
parses as UUID, places it in the request context. Rejects with
400 if missing or malformed. Backend trusts the value (see
architecture trust note).
- `basicauth` — required on `/api/v1/admin/*`. Stage 3 uses a stub
verifier that accepts any non-empty username and a fixed password
read from a test-only env var so contract tests can pass; Stage
5.3 replaces the verifier with the real Postgres-backed one.
3. Implement handlers per endpoint in
`backend/internal/server/handlers_<group>_<topic>.go`. Every handler
returns `501 Not Implemented` with the standard error body
`{"error":{"code":"not_implemented","message":"..."}}`.
4. Implement the contract test:
`backend/internal/server/contract_test.go`. Loads
`backend/openapi.yaml` via `kin-openapi`, builds the gin engine,
walks every operation, sends a representative request, and
validates both the request and response against the OpenAPI
document.
5. Document `openapi.yaml` location and contract test pattern in
`backend/docs/api-contract.md` (a brief decision record).
### Endpoint inventory
Public (`/api/v1/public/*`):
- `POST /auth/send-email-code` — request body `{email, locale?}`;
response `{challenge_id}`.
- `POST /auth/confirm-email-code` — request body
`{challenge_id, code, client_public_key, time_zone}`; response
`{device_session_id}`.
Probes (root):
- `GET /healthz` — `200` always when the process is alive.
- `GET /readyz` — `200` once Postgres reachable, migrations applied,
gRPC listener bound; `503` otherwise.
User (`/api/v1/user/*`, all require `X-User-ID`):
- `GET /account` — current account view (profile + settings +
entitlements).
- `PATCH /account/profile` — update mutable profile fields
(`display_name`).
- `PATCH /account/settings` — update `preferred_language`, `time_zone`.
- `POST /account/delete` — soft delete; cascade is in process.
- `GET /lobby/games` — public list with paging.
- `POST /lobby/games` — create.
- `GET /lobby/games/{game_id}`.
- `PATCH /lobby/games/{game_id}`.
- `POST /lobby/games/{game_id}/open-enrollment`.
- `POST /lobby/games/{game_id}/ready-to-start`.
- `POST /lobby/games/{game_id}/start`.
- `POST /lobby/games/{game_id}/pause`.
- `POST /lobby/games/{game_id}/resume`.
- `POST /lobby/games/{game_id}/cancel`.
- `POST /lobby/games/{game_id}/retry-start`.
- `POST /lobby/games/{game_id}/applications`.
- `POST /lobby/games/{game_id}/applications/{application_id}/approve`.
- `POST /lobby/games/{game_id}/applications/{application_id}/reject`.
- `POST /lobby/games/{game_id}/invites`.
- `POST /lobby/games/{game_id}/invites/{invite_id}/redeem`.
- `POST /lobby/games/{game_id}/invites/{invite_id}/decline`.
- `POST /lobby/games/{game_id}/invites/{invite_id}/revoke`.
- `GET /lobby/games/{game_id}/memberships`.
- `POST /lobby/games/{game_id}/memberships/{membership_id}/remove`.
- `POST /lobby/games/{game_id}/memberships/{membership_id}/block`.
- `GET /lobby/my/games`.
- `GET /lobby/my/applications`.
- `GET /lobby/my/invites`.
- `GET /lobby/my/race-names`.
- `POST /lobby/race-names/register` — promote a `pending_registration`
to `registered` within the 30-day window.
- `POST /games/{game_id}/commands` — proxy to engine command path.
- `POST /games/{game_id}/orders` — proxy to engine order validation.
- `GET /games/{game_id}/reports/{turn}` — proxy to engine report path.
Admin (`/api/v1/admin/*`, all require Basic Auth):
- `GET /admin-accounts`, `POST /admin-accounts`,
`GET /admin-accounts/{username}`,
`POST /admin-accounts/{username}/disable`,
`POST /admin-accounts/{username}/enable`,
`POST /admin-accounts/{username}/reset-password`.
- `GET /users`, `GET /users/{user_id}`,
`POST /users/{user_id}/sanctions`,
`POST /users/{user_id}/limits`,
`POST /users/{user_id}/entitlements`,
`POST /users/{user_id}/soft-delete`.
- `GET /games`, `GET /games/{game_id}`,
`POST /games/{game_id}/force-start`,
`POST /games/{game_id}/force-stop`,
`POST /games/{game_id}/ban-member`.
- `GET /runtimes/{game_id}`,
`POST /runtimes/{game_id}/restart`,
`POST /runtimes/{game_id}/patch`,
`POST /runtimes/{game_id}/force-next-turn`,
`GET /engine-versions`, `POST /engine-versions`,
`PATCH /engine-versions/{id}`,
`POST /engine-versions/{id}/disable`.
- `GET /mail/deliveries`,
`GET /mail/deliveries/{delivery_id}`,
`GET /mail/deliveries/{delivery_id}/attempts`,
`POST /mail/deliveries/{delivery_id}/resend`,
`GET /mail/dead-letters`.
- `GET /notifications`, `GET /notifications/{notification_id}`,
`GET /notifications/dead-letters`,
`GET /notifications/malformed`.
- `GET /geo/users/{user_id}/countries` — counter listing.
Internal (gateway-only, `/api/v1/internal/*`):
- `GET /sessions/{device_session_id}` — gateway session lookup.
- `POST /sessions/{device_session_id}/revoke` — admin or self revoke
passthrough; backend emits `session_invalidation`.
- `POST /sessions/users/{user_id}/revoke-all`.
- `GET /users/{user_id}/account-internal` — server-to-server fetch
used by gateway flows that need account state alongside the session.
The internal group is on `/api/v1/internal/*`. The trust model treats
it as part of the user surface (no extra auth in MVP).
Critical files:
- `backend/openapi.yaml`
- `backend/internal/server/router.go`
- `backend/internal/server/middleware/{requestid,logging,metrics,panicrecovery,userid,basicauth}.go`
- `backend/internal/server/handlers_*.go`
- `backend/internal/server/contract_test.go`
- `backend/docs/api-contract.md`
Done criteria:
- `go test ./backend/internal/server/...` is green; the contract test
exercises every endpoint and validates against `openapi.yaml`.
- Every endpoint returns `501 Not Implemented` with the standard error
body.
- gin route table at startup matches the OpenAPI inventory exactly.
## ~~Stage 4~~ — Persistence layer
This stage was implemented and marked as done.
Goal: define every `backend` schema table, generate jet code, and make
the wiring of the persistence layer ready for the domain modules.
Actions:
1. Replace `backend/internal/postgres/migrations/00001_init.sql` with
the full DDL. The schema is `backend`. The expected tables and
their primary purposes:
Auth:
- `device_sessions(device_session_id uuid pk, user_id uuid not null,
client_public_key bytea not null, status text not null,
created_at, revoked_at, last_seen_at)` plus indexes on
`user_id` and `status`.
- `auth_challenges(challenge_id uuid pk, email text not null,
code_hash bytea not null, created_at, expires_at, consumed_at,
attempts int not null default 0)`. Index on `email`.
- `blocked_emails(email text pk, blocked_at, reason text)`.
User:
- `accounts(user_id uuid pk, email text unique not null,
user_name text unique not null, display_name text not null,
preferred_language text not null, time_zone text not null,
declared_country text, permanent_block bool not null default false,
created_at, updated_at, deleted_at)`.
- `entitlement_records(record_id uuid pk, user_id uuid not null,
tier text not null, source text not null, created_at)`.
- `entitlement_snapshots(user_id uuid pk, tier text not null,
max_registered_race_names int not null, taken_at timestamptz)`.
Updated on every entitlement change.
- `sanction_records`, `sanction_active`, `limit_records`,
`limit_active` — same shape as the previous `user` service had
(record + active rollup pattern).
Admin:
- `admin_accounts(username text pk, password_hash bytea not null,
created_at, last_used_at, disabled_at)`.
Lobby:
- `games(game_id uuid pk, owner_user_id uuid not null,
visibility text not null, status text not null, ...)` covering
enrollment state machine fields documented in
`ARCHITECTURE_deprecated.md` § Game Lobby.
- `applications(application_id uuid pk, game_id uuid not null,
applicant_user_id uuid not null, status text not null, ...)`.
- `invites(invite_id uuid pk, game_id uuid not null,
invited_user_id uuid, code text unique, status text, ...)`.
- `memberships(membership_id uuid pk, game_id uuid not null,
user_id uuid not null, race_name text not null, status text,
...)` plus `unique(game_id, user_id)`.
- `race_names(name text not null, canonical text not null,
status text not null, owner_user_id uuid, game_id uuid,
expires_at, registered_at, ...)` plus
`unique(canonical) where status in ('registered','reservation','pending_registration')`.
Runtime:
- `runtime_records(game_id uuid pk, current_container_id text,
status text not null, image_ref text, started_at, last_observed_at,
...)`.
- `engine_versions(version text pk, image_ref text not null,
enabled bool not null default true, created_at, ...)`.
- `player_mappings(game_id uuid not null, user_id uuid not null,
race_name text not null, engine_player_uuid uuid not null,
primary key(game_id, user_id))`.
- `runtime_operation_log(operation_id uuid pk, game_id uuid,
op text, status text, started_at, finished_at, error text)`.
- `runtime_health_snapshots(snapshot_id uuid pk, game_id uuid,
observed_at, payload jsonb)`.
Mail:
- `mail_deliveries(delivery_id uuid pk, template_id text not null,
idempotency_key text not null, status text not null,
attempts int not null default 0, next_attempt_at timestamptz,
payload_id uuid not null, created_at, ...)` plus
`unique(template_id, idempotency_key)`.
- `mail_recipients(recipient_id uuid pk, delivery_id uuid not null,
address text not null, kind text not null)`.
- `mail_attempts(attempt_id uuid pk, delivery_id uuid, attempt_no int,
started_at, finished_at, outcome text, error text)`.
- `mail_dead_letters(dead_letter_id uuid pk, delivery_id uuid,
archived_at, reason text)`.
- `mail_payloads(payload_id uuid pk, content_type text not null,
subject text, body bytea not null)`.
Notification:
- `notifications(notification_id uuid pk, kind text not null,
idempotency_key text not null, user_id uuid, payload jsonb,
created_at)` plus `unique(kind, idempotency_key)`.
- `notification_routes(route_id uuid pk, notification_id uuid,
channel text not null, status text not null, last_attempt_at,
...)`.
- `notification_dead_letters(dead_letter_id uuid pk, notification_id
uuid, archived_at, reason text)`.
- `notification_malformed_intents(id uuid pk, received_at, payload
jsonb, reason text)`.
Geo:
- `user_country_counters(user_id uuid not null, country text not null,
count bigint not null default 0, last_seen_at timestamptz,
primary key(user_id, country))`.
2. Add `created_at TIMESTAMPTZ DEFAULT now()` to every table; add
`updated_at` and `deleted_at` where the domain reasoning in
`ARCHITECTURE_deprecated.md` applies. UTC normalisation is performed
in Go on read and write (the existing `pkg/postgres` helpers cover
this).
3. `backend/cmd/jetgen/main.go` — port the existing pattern from a
surviving reference (the previous services' `cmd/jetgen` is a good
template; adjust import paths to `galaxy/backend`). The tool spins
up a transient Postgres container, applies the embedded migrations,
and runs `jet -dsn=...` writing into `internal/postgres/jet/`.
4. `backend/Makefile` — fill in the `jet` target.
5. Run `make jet` and commit `internal/postgres/jet/`.
6. Add `backend/internal/postgres/jet/jet.go` — package doc and
`//go:generate` comment pointing to `cmd/jetgen`.
7. Sanity test in `backend/internal/postgres/migrations_test.go`:
spin up a Postgres testcontainer, apply migrations, assert that
the `backend` schema exists and that every expected table is
present.
Critical files:
- `backend/internal/postgres/migrations/00001_init.sql`
- `backend/internal/postgres/jet/**`
- `backend/cmd/jetgen/main.go`
- `backend/Makefile`
- `backend/internal/postgres/migrations_test.go`
Done criteria:
- `go test ./backend/internal/postgres/...` is green.
- `make jet` regenerates without diff.
- All tables listed above exist after a fresh migration.
## ~~Stage 5~~ — Domain implementation
Goal: implement domain modules in dependency order. After each substage
the backend is functional for the substage's slice of behaviour. The
contract tests from Stage 3 progressively flip from `501` to actual
responses as each substage replaces placeholders.
Substages run strictly in order. Each substage:
- Implements package code in `backend/internal/<domain>/`.
- Replaces the corresponding `501` handler bodies in
`backend/internal/server/handlers_*.go` with real logic that calls
the domain package.
- Adds focused unit and contract coverage for the substage's
endpoints.
- Wires the new package into `backend/cmd/backend/main.go`.
### ~~5.1~~ — auth
This substage was implemented and marked as done. See
[`docs/stage05_1-auth.md`](docs/stage05_1-auth.md) for the decisions
taken during implementation.
Behaviour:
- `POST /api/v1/public/auth/send-email-code` — generates a challenge,
hashes the code, persists in `auth_challenges`, calls
`mail.EnqueueLoginCode(email, code)`. Returns `{challenge_id}` for
every non-blocked email (existing user, new user, throttled — all
return identical shape; blocked email rejects with 400 only when the
block is permanent).
- `POST /api/v1/public/auth/confirm-email-code` — looks up the
challenge, verifies the code (constant-time), enforces attempt
ceiling, marks consumed, calls `user.EnsureByEmail(email,
preferred_language, time_zone)` to obtain the user_id, stores the
Ed25519 public key, creates a `device_session` row, populates the
in-memory cache, calls
`geo.SetDeclaredCountryAtRegistration(user_id, source_ip)`, and
returns `{device_session_id}`.
- `GET /api/v1/internal/sessions/{device_session_id}` — sync session
lookup for gateway.
- `POST /api/v1/internal/sessions/{device_session_id}/revoke` and
`POST /api/v1/internal/sessions/users/{user_id}/revoke-all` — mark
sessions revoked, evict from in-memory cache, emit
`session_invalidation` push event (Stage 6 wires the actual
emission; until then `auth` calls a no-op publisher injected at
wiring).
Cache: full session table read at startup; write-through on every
mutation.
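The constant-time verification against `auth_challenges.code_hash` can be sketched as follows; the digest algorithm is not specified in this plan, so SHA-256 is an assumption for the example:

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// hashCode is what gets persisted in auth_challenges.code_hash — only
// the digest, never the plaintext login code. (SHA-256 is assumed here.)
func hashCode(code string) []byte {
	h := sha256.Sum256([]byte(code))
	return h[:]
}

// verifyCode compares digests in constant time, so response timing does
// not leak how many characters of the submitted code matched.
func verifyCode(storedHash []byte, submitted string) bool {
	h := sha256.Sum256([]byte(submitted))
	return subtle.ConstantTimeCompare(storedHash, h[:]) == 1
}

func main() {
	stored := hashCode("483920")
	fmt.Println(verifyCode(stored, "483920")) // true
	fmt.Println(verifyCode(stored, "000000")) // false
}
```

The attempt ceiling sits outside this function: the handler increments `auth_challenges.attempts` before verifying and refuses once the ceiling is reached, regardless of the comparison result.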
### ~~5.2~~ — user
This substage was implemented and marked as done. See
[`docs/stage05_2-user.md`](docs/stage05_2-user.md) for the decisions
taken during implementation.
Behaviour:
- Account CRUD limited to allowed mutations on profile and settings.
- `EnsureByEmail` and `ResolveByEmail` for `auth`.
- Entitlement records and snapshots; tier downgrades never revoke
already-registered race names.
- Sanctions and limits using the record + active rollup pattern.
- Soft delete: writes `deleted_at` and triggers in-process cascade —
`lobby.OnUserDeleted(user_id)`, `notification.OnUserDeleted(user_id)`,
`geo.OnUserDeleted(user_id)`. Permanent block triggers
`lobby.OnUserBlocked(user_id)`.
- Cache: latest entitlement snapshot per user; warmed on startup;
write-through on entitlement mutation.
### ~~5.3~~ — admin
This substage was implemented and marked as done. See
[`docs/stage05_3-admin.md`](docs/stage05_3-admin.md) for the decisions
taken during implementation.
Behaviour:
- `admin_accounts` CRUD with bcrypt hashing.
- Bootstrap on startup via env vars (`BACKEND_ADMIN_BOOTSTRAP_USER`,
`BACKEND_ADMIN_BOOTSTRAP_PASSWORD`); idempotent.
- Replace the Stage 3 stub `basicauth` middleware with the real
Postgres-backed verifier. Constant-time comparison via bcrypt.
- Admin CRUD endpoints across users, games, runtime, mail,
notification, geo. Each admin endpoint delegates to the domain
package's admin-facing methods.
Cache: full admin table at startup; write-through on mutation.
### ~~5.4~~ — lobby
This substage was implemented and marked as done. See
[`docs/stage05_4-lobby.md`](docs/stage05_4-lobby.md) for the decisions
taken during implementation.
Behaviour:
- Games CRUD with the enrollment state machine.
- Applications and invites with their lifecycles.
- Memberships with race name binding.
- Race Name Directory: registered, reservation, and
pending_registration tiers; canonical key via `disciplinedware/go-confusables`;
uniqueness across all three tiers; capability promotion based on
`max_planets > initial AND max_population > initial` from the
runtime snapshot.
- Pending-registration sweeper: scheduled job, releases entries past
the 30-day window; uses `pkg/cronutil`. The same sweeper auto-closes
enrollment-expired games whose `approved_count >= min_players`.
- Hooks consumed from other modules:
- `OnUserBlocked(user_id)` — release all RND/applications/invites/
memberships in one transaction.
- `OnUserDeleted(user_id)` — same.
- `OnRuntimeSnapshot(snapshot)` — update denormalised runtime view
on the game (current_turn, status, per-member max stats).
- `OnGameFinished(game_id)` — drive race name promotion logic and
move game to `finished`.
Cache: active games and memberships, RND canonical set; warmed on
startup; write-through on mutation.
### ~~5.5~~ — runtime (with dockerclient and engineclient)
This substage was implemented and marked as done. See
[`docs/stage05_5-runtime.md`](docs/stage05_5-runtime.md) for the
decisions taken during implementation.
Behaviour:
- Engine version registry CRUD.
- `engineclient` is a thin `net/http` client over `pkg/model` types,
one method per engine endpoint listed in `README.md` §8.
- `dockerclient` wraps `github.com/docker/docker` for: pull, create,
start, stop, remove, inspect, list (filtered by the
`galaxy.backend=1` label), patch (semver-only, validated against
`engine_versions`).
- Per-game serialisation: a `sync.Map[game_id]*sync.Mutex` ensures
concurrent ops on the same game are sequential.
- Worker pool for long-running operations: started in Stage 5.5; jobs
enqueued on a buffered channel; bounded concurrency.
- `runtime_operation_log` records every op (start time, finish time,
outcome, error).
- Reconciliation: on startup and on a `pkg/cronutil` schedule, list
containers labelled `galaxy.backend=1`, match against
`runtime_records`, adopt unrecorded labelled containers, mark
recorded but missing as removed. Emit
`lobby.OnRuntimeJobResult` for each removed.
- Snapshot publication: after every successful engine read or a
health-probe transition, synthesise a snapshot and call
`lobby.OnRuntimeSnapshot(snapshot)` synchronously.
- Turn scheduler: `pkg/cronutil` schedule per running game; each tick
invokes the engine `admin/turn`, on success snapshots and publishes;
force-next-turn sets a one-shot skip flag stored in
`runtime_records`.
Cache: active runtime records, engine version registry; warmed on
startup; write-through on mutation.
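The per-game serialisation described above can be sketched directly from the `sync.Map[game_id]*sync.Mutex` idea (the type and method names are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// gameLocks hands out one mutex per game_id: concurrent operations on
// the same game queue behind it, while different games proceed in
// parallel. Entries are never evicted in this sketch.
type gameLocks struct {
	m sync.Map // game_id (string) -> *sync.Mutex
}

func (g *gameLocks) lock(gameID string) *sync.Mutex {
	mu, _ := g.m.LoadOrStore(gameID, &sync.Mutex{})
	return mu.(*sync.Mutex)
}

func main() {
	var locks gameLocks
	var wg sync.WaitGroup
	count := 0
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu := locks.lock("game-1") // all three ops hit the same game
			mu.Lock()
			defer mu.Unlock()
			count++ // safe: serialised per game
		}()
	}
	wg.Wait()
	fmt.Println(count) // 3
}
```

`LoadOrStore` makes the lookup race-free: two goroutines asking for the same game always receive the same `*sync.Mutex`.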
### ~~5.6~~ — mail
This substage was implemented and marked as done. See
[`docs/stage05_6-mail.md`](docs/stage05_6-mail.md) for the decisions
taken during implementation.
Behaviour:
- Outbox tables defined in Stage 4.
- Worker goroutine: scans `mail_deliveries` with
`SELECT ... FOR UPDATE SKIP LOCKED` ordered by `next_attempt_at`,
attempts SMTP delivery via `wneessen/go-mail`, records in
`mail_attempts`, updates status, schedules backoff with jitter, or
dead-letters past the configured maximum attempts.
- Drain on startup: replays all `pending` and `retrying` rows.
- Public API for producers: `EnqueueLoginCode(email, code, ttl)`,
`EnqueueTemplate(template_id, recipient, payload, idempotency_key)`.
- Admin endpoints implemented: list, view, resend.
### ~~5.7~~ — notification
This substage was implemented and marked as done. See
[`docs/stage05_7-notification.md`](docs/stage05_7-notification.md) for
the decisions taken during implementation.
Behaviour:
- `Submit(intent)` — validate intent shape, enforce idempotency,
persist `notifications`, materialise `notification_routes`, fan out
to push (Stage 6 wires the actual push emission; until then a no-op
publisher) and email (`mail.EnqueueTemplate`).
- Each kind has a fixed channel set documented in `README.md` §10.
- Malformed intents go to `notification_malformed_intents` and never
block the producer.
- Dead-letter handling: a failed route past max attempts moves to
`notification_dead_letters`.
- Producers (lobby, runtime, geo, auth) are wired via direct function
calls.
### ~~5.8~~ — geo
This substage was implemented and marked as done. See
[`docs/stage05_8-geo.md`](docs/stage05_8-geo.md) for the decisions
taken during implementation.
Behaviour:
- Load GeoLite2 Country DB at startup from `BACKEND_GEOIP_DB_PATH`.
- `SetDeclaredCountryAtRegistration(user_id, ip)` — sync; lookup,
update `accounts.declared_country`. No-op on lookup error.
- `IncrementCounterAsync(user_id, ip)` — fire-and-forget goroutine;
upsert `user_country_counters` with `count = count + 1`,
`last_seen_at = now()`.
- Middleware on `/api/v1/user/*` extracts the source IP from
`X-Forwarded-For` (or `RemoteAddr`) and calls
`IncrementCounterAsync` after the handler returns successfully.
- `OnUserDeleted(user_id)` — delete the user's counter rows.
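The source-IP extraction used by the counter middleware can be sketched as a small helper; trusting `X-Forwarded-For` is safe only under the stated MVP trust model where the gateway fronts the backend:

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// clientIP prefers the first address in X-Forwarded-For (the original
// client as recorded by the edge proxy) and falls back to RemoteAddr,
// stripping its port.
func clientIP(xff, remoteAddr string) string {
	if xff != "" {
		first := strings.TrimSpace(strings.Split(xff, ",")[0])
		if net.ParseIP(first) != nil {
			return first
		}
	}
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return remoteAddr // already a bare IP
	}
	return host
}

func main() {
	fmt.Println(clientIP("203.0.113.7, 10.0.0.2", "10.0.0.2:51234")) // 203.0.113.7
	fmt.Println(clientIP("", "10.0.0.2:51234"))                      // 10.0.0.2
}
```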
Critical files (Stage 5 as a whole):
- `backend/internal/auth/**`
- `backend/internal/user/**`
- `backend/internal/admin/**`
- `backend/internal/lobby/**`
- `backend/internal/runtime/**`
- `backend/internal/dockerclient/**`
- `backend/internal/engineclient/**`
- `backend/internal/mail/**`
- `backend/internal/notification/**`
- `backend/internal/geo/**`
- `backend/internal/server/handlers_*.go` (replacing 501 stubs)
- `backend/cmd/backend/main.go` (wiring expansion)
Done criteria:
- All Stage 3 contract tests pass against real responses.
- Each substage adds focused unit tests (`testify`, mocks where
external boundaries justify them).
- `go run ./backend/cmd/backend` boots, all caches warm, all workers
start.
## ~~Stage 6~~ — Push gRPC interface and gateway adaptation
Goal: stand up the server-streaming push channel between backend and
gateway. Backend pushes `client_event` and `session_invalidation`;
gateway opens the stream, signs and forwards client events, immediately
acts on session invalidations. Remove every Redis dependency from
gateway except anti-replay reservations.
### ~~6.1~~ — Backend push server
This substage was implemented and marked as done. See
[`docs/stage06_1-push.md`](docs/stage06_1-push.md) for the decisions
taken during implementation.
Actions:
1. Author `backend/proto/push/v1/push.proto` with
`service Push { rpc SubscribePush(GatewaySubscribeRequest) returns
(stream PushEvent); }` and the message types defined in
`README.md` §7. Include a `cursor` field (string).
2. `backend/buf.yaml`, `backend/buf.gen.yaml` mirroring the gateway
pattern; generate Go bindings into `backend/proto/push/v1/`.
3. `backend/internal/push/server.go` — gRPC service implementation:
- Maintains a connection registry keyed by gateway client id (the
`GatewaySubscribeRequest` provides one; if multiple gateway
instances connect, each gets its own queue).
- Holds an in-memory ring buffer keyed by cursor, with TTL equal to
`BACKEND_FRESHNESS_WINDOW`. Cursors past TTL are discarded.
- Resume: if the client's cursor is still in the buffer, replay
from there; otherwise replay nothing and start fresh.
- Backpressure: per-connection buffered channel; on overflow, drop
the oldest events for that connection and log.
4. Provide a publisher API consumed by `auth`, `lobby`, `notification`,
and `runtime`:
- `push.PublishClientEvent(user_id, device_session_id?, payload, kind)`.
- `push.PublishSessionInvalidation(device_session_id|user_id, reason)`.
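The cursor/resume behaviour of the ring buffer in step 3 can be sketched like this. Names (`ringBuffer`, `resume`) and the string cursor encoding are assumptions; TTL-based eviction is reduced to a size cap for brevity.

```go
package main

import (
	"fmt"
	"strconv"
)

// ringBuffer keeps recent push events keyed by a monotonically increasing
// cursor, so a reconnecting gateway can resume. TTL eviction is reduced
// here to a fixed size cap.
type ringBuffer struct {
	next   uint64
	events map[string]string // cursor -> payload
	order  []string          // cursors in append order
	max    int
}

func newRing(max int) *ringBuffer {
	return &ringBuffer{events: map[string]string{}, max: max}
}

func (r *ringBuffer) append(payload string) string {
	r.next++
	c := strconv.FormatUint(r.next, 10)
	r.events[c] = payload
	r.order = append(r.order, c)
	if len(r.order) > r.max { // evict the oldest entry
		delete(r.events, r.order[0])
		r.order = r.order[1:]
	}
	return c
}

// resume returns events after the given cursor, or nil when the cursor
// was already discarded (the client starts fresh, replaying nothing).
func (r *ringBuffer) resume(cursor string) []string {
	idx := -1
	for i, c := range r.order {
		if c == cursor {
			idx = i
			break
		}
	}
	if idx < 0 {
		return nil
	}
	var out []string
	for _, c := range r.order[idx+1:] {
		out = append(out, r.events[c])
	}
	return out
}

func main() {
	r := newRing(16)
	c1 := r.append("e1")
	r.append("e2")
	r.append("e3")
	fmt.Println(r.resume(c1)) // [e2 e3]
}
```

A cursor older than the buffer's retention simply yields nothing, which matches the "replay nothing and start fresh" rule above.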
### ~~6.2~~ — Gateway adaptation
This substage was implemented and marked as done. See
[`docs/stage06_2-gateway.md`](docs/stage06_2-gateway.md) for the
decisions taken during implementation.
Actions:
1. Remove `redisconn` usage for session projection and for the two
stream consumers. Keep `redisconn` only for anti-replay
reservations.
2. Remove `gateway/internal/config` env vars
`GATEWAY_SESSION_EVENTS_REDIS_STREAM` and
`GATEWAY_CLIENT_EVENTS_REDIS_STREAM`. Add
`GATEWAY_BACKEND_HTTP_URL` and `GATEWAY_BACKEND_GRPC_PUSH_URL`.
3. Add `gateway/internal/backendclient/` with:
- `RESTClient` — HTTP client for `/api/v1/internal/sessions/...` and
for forwarding public/user requests.
- `PushClient` — gRPC client to `SubscribePush` with reconnect
loop, exponential backoff with jitter, and cursor persistence in
process memory.
4. Replace gateway session validation with a sync REST call to
backend per request.
5. Replace gateway client-events Redis consumer with the
`SubscribePush` consumer. On `client_event`: sign envelope (Ed25519)
and deliver to the matching client subscription. On
`session_invalidation`: look up active subscriptions for the target
sessions, close them, and reject any in-flight authenticated
request bound to those sessions.
6. Anti-replay request_id reservations remain in Redis (unchanged).
7. Update gateway tests to use a mocked backend HTTP and gRPC server.
Critical files:
- `backend/proto/push/v1/push.proto`
- `backend/buf.yaml`, `backend/buf.gen.yaml`
- `backend/internal/push/server.go`,
`backend/internal/push/publisher.go`
- `gateway/internal/backendclient/*.go`
- `gateway/internal/config/config.go` (env var changes)
- `gateway/internal/handlers/*.go` (route forwarding to backend)
- `gateway/internal/auth/*.go` (session lookup → REST)
- `gateway/internal/eventfanout/*.go` (replace Redis consumer with
gRPC consumer; rename if helpful)
Done criteria:
- `go run ./backend/cmd/backend` and `go run ./gateway/cmd/gateway`
cooperate end-to-end with no Redis stream usage.
- A revocation through the admin surface causes immediate stream
closure on the affected client.
- Gateway anti-replay still rejects duplicates.
- gateway test suite green.
## ~~Stage 7~~ — Integration testing
This stage was implemented and marked as done. See
[`docs/stage07-integration.md`](docs/stage07-integration.md) for the
decisions taken during implementation, including the testenv layout,
the signed-envelope gRPC client, and the per-scenario coverage notes.
Goal: end-to-end coverage of the platform with real binaries and real
infrastructure where practical.
Actions:
1. Recreate the top-level `integration/` module, registered in
`go.work`. The module hosts black-box test suites that drive
`gateway` from outside and verify behaviour at the public boundary
(with `backend` and `game` running in containers).
2. Add testcontainers fixtures: Postgres, an SMTP capture server (for
example `axllent/mailpit`), the `galaxy/game` engine image, the
`galaxy/backend` image (built from this repo), and the
`galaxy/gateway` image. The Docker daemon used by testcontainers
is the same one backend will use to manage engines.
3. Add a synthetic GeoLite2 mmdb (use `pkg/geoip/test-data/`).
4. Cover scenarios:
- Registration flow: send-email-code → confirm-email-code →
`declared_country` populated from synthetic mmdb.
- User account fetch: `X-User-ID` path returns the expected
account; geo counter increments per request.
- Lobby flow: create game → invite → application → ready-to-start
→ start (engine container starts, healthz green, status read) →
command → force-next-turn → finish → race name promotion.
- Mail flow: trigger an email-bound notification → SMTP capture
receives it → admin resend works.
- Notification flow: lobby invite triggers a push event reaching
the test client's gateway subscription, plus an email captured
by SMTP.
- Admin flow: bootstrap admin authenticates; CRUD admin creates a
second admin; second admin disables the first.
- Soft delete flow: user soft-delete cascades; their RND entries,
memberships, applications, invites, geo counters are released
or removed.
- Session revocation: admin revokes a session → push
`session_invalidation` arrives at gateway → active subscription
closes; subsequent requests with that `device_session_id`
rejected by gateway.
- Anti-replay: same `request_id` replayed within freshness window
is rejected by gateway.
5. CI: run `go test ./integration/... -tags=integration` (or whichever
flag the team prefers). Tests requiring real Docker run only when
a Docker daemon is available; otherwise they skip with a clear
message.
Critical files:
- `integration/go.mod`
- `integration/auth_flow_test.go`
- `integration/lobby_flow_test.go`
- `integration/mail_flow_test.go`
- `integration/notification_flow_test.go`
- `integration/admin_flow_test.go`
- `integration/soft_delete_test.go`
- `integration/session_revoke_test.go`
- `integration/anti_replay_test.go`
- `integration/testenv/*.go` (shared fixtures)
Done criteria:
- `go test ./integration/...` runs the full suite.
- All listed scenarios pass green on a developer machine with Docker
available.
- Failures produce actionable diagnostics (logs from each component
attached to the test report).
## Stage acceptance and decision records
After each stage, the implementing engineer writes a short decision
record under `backend/docs/stage<NN>-<topic>.md` capturing any
non-trivial choice made during implementation that is not obvious from
the code or from this plan. Records that contradict this plan must be
brought to the architecture conversation before merge — the plan and
the architecture document are the agreed contract.
# Edge Gateway Implementation Plan
This plan has already been implemented and stays here for historical
reasons. It should NOT be treated as a source of truth for service
functionality.
---
## Summary
This plan breaks implementation into small, reviewable phases.
Each phase has a single primary goal, clear deliverables, explicit dependencies,
acceptance criteria, and focused tests.
The intended v1 architecture is:
- unauthenticated public ingress over REST/JSON;
- authenticated ingress over gRPC on HTTP/2;
- FlatBuffers payloads for authenticated business commands;
- protobuf-based gRPC control envelopes;
- authenticated server-streaming push through gRPC;
- separate public traffic classes and isolated anti-abuse counters.
## Assumptions and Defaults
- `message_type` is the stable downstream routing key.
- `protocol_version` covers transport and envelope compatibility, not business
payload schema compatibility.
- FlatBuffers are used for business payload bytes only.
- Phase 3 public auth uses a challenge-token REST flow:
`send-email-code(email) -> challenge_id` and
`confirm-email-code(challenge_id, code, client_public_key) -> device_session_id`.
- Phase 3 uses a consumer-side `AuthServiceClient` inside `gateway`; the
default process wiring keeps public auth routes mounted and returns
`503 service_unavailable` until a concrete upstream adapter is added.
- Browser bootstrap and asset traffic are within gateway scope, even when backed
by a pluggable proxy or handler.
- Long-polling is out of scope for v1.
## ~~Phase 1.~~ Module Skeleton
Status: implemented.
Goal: create the runnable gateway process skeleton.
Artifacts:
- `cmd/gateway`
- `internal/app`
- base configuration types
- startup and shutdown wiring
Dependencies: none.
Acceptance criteria:
- the process starts with config;
- the process shuts down cleanly on signal;
- lifecycle wiring is testable.
Targeted tests:
- startup with valid config;
- shutdown without leaked goroutines.
## ~~Phase 2.~~ Public REST Server
Status: implemented.
Goal: add the unauthenticated HTTP server shell.
Artifacts:
- public REST listener
- `GET /healthz`
- `GET /readyz`
- base error serialization
- request classification hook
Dependencies: Phase 1.
Acceptance criteria:
- health endpoints respond deterministically;
- public requests are classified at least into `public_auth` and `browser_*`.
Targeted tests:
- health endpoint responses;
- request classification smoke tests.
## ~~Phase 3.~~ Public Auth REST Handlers
Status: implemented.
Goal: expose unauthenticated auth commands through REST/JSON.
Artifacts:
- `POST /api/v1/public/auth/send-email-code`
- `POST /api/v1/public/auth/confirm-email-code`
- request and response DTOs
- adapter calls into `AuthServiceClient`
Dependencies: Phase 2.
Acceptance criteria:
- no session authentication is required for these routes;
- handlers delegate only through the auth service adapter.
Targeted tests:
- success and validation errors for both routes;
- no session lookup on public auth paths.
## ~~Phase 4.~~ Public Traffic Classification
Status: implemented.
Goal: isolate public traffic into stable anti-abuse classes.
Artifacts:
- `PublicTrafficClassifier`
- classes `public_auth`, `browser_bootstrap`, `browser_asset`, `public_misc`
- isolated rate-limit bucket keys
Dependencies: Phase 2.
Acceptance criteria:
- browser traffic does not share buckets with public auth;
- auth counters remain unaffected by asset bursts.
Targeted tests:
- per-class routing tests;
- bucket isolation tests.
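The classifier-plus-bucket-key idea can be sketched as below. The path prefixes and the `class|ip` key shape are assumptions for illustration; the point is only that the class is part of the bucket key, so asset bursts never land in auth counters.

```go
package main

import (
	"fmt"
	"strings"
)

// classify maps a request path to its public traffic class.
func classify(path string) string {
	switch {
	case strings.HasPrefix(path, "/api/v1/public/auth/"):
		return "public_auth"
	case path == "/" || path == "/index.html":
		return "browser_bootstrap"
	case strings.HasPrefix(path, "/assets/"):
		return "browser_asset"
	default:
		return "public_misc"
	}
}

// bucketKey isolates rate-limit counters per class and per client IP.
func bucketKey(path, ip string) string {
	return classify(path) + "|" + ip
}

func main() {
	fmt.Println(bucketKey("/api/v1/public/auth/send-email-code", "198.51.100.4"))
	fmt.Println(bucketKey("/assets/app.js", "198.51.100.4"))
}
```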
## ~~Phase 5.~~ Public REST Anti-Abuse
Status: implemented.
Goal: add coarse protection to unauthenticated REST traffic.
Artifacts:
- body size limits
- method allow-lists
- malformed request counters
- per-class rate-limit thresholds
Dependencies: Phase 4.
Acceptance criteria:
- first-load browser bursts are not marked hostile because of burst pattern
alone;
- malformed or oversized requests are rejected predictably.
Targeted tests:
- bootstrap burst stays outside auth abuse counters;
- invalid methods and oversized bodies are rejected.
## ~~Phase 6.~~ gRPC Server and Public Contracts
Status: implemented.
Goal: bring up authenticated transport over gRPC and HTTP/2.
Artifacts:
- gRPC listener
- protobuf service definitions
- `ExecuteCommand`
- `SubscribeEvents`
Dependencies: Phase 1.
Acceptance criteria:
- unary and server-streaming RPCs are reachable;
- the server runs only over HTTP/2.
Targeted tests:
- unary transport smoke test;
- stream transport smoke test.
## ~~Phase 7.~~ Envelope Parsing and Protocol Gate
Status: implemented.
Goal: validate the gRPC control envelope before security checks continue.
Artifacts:
- envelope parser
- required-field validation
- protocol version gate
Dependencies: Phase 6.
Acceptance criteria:
- unsupported or malformed envelopes are rejected before routing.
Targeted tests:
- missing field rejection;
- unsupported `protocol_version` rejection.
## ~~Phase 8.~~ Session Cache Lookup
Status: implemented.
Goal: resolve authenticated identity from cache.
Artifacts:
- `SessionCache`
- session lookup pipeline
- revoked versus active session handling
Dependencies: Phase 7.
Acceptance criteria:
- unknown and revoked sessions are blocked before signature verification.
Targeted tests:
- cache hit with active session;
- cache miss reject;
- revoked session reject.
## ~~Phase 9.~~ Payload Hash and Signing Input
Status: implemented.
Goal: verify payload integrity before signature verification.
Artifacts:
- `payload_hash` verification
- canonical signing input builder
Dependencies: Phase 8.
Acceptance criteria:
- changing payload bytes or envelope fields breaks the signing input.
Targeted tests:
- payload hash mismatch reject;
- canonical bytes differ when signed fields change.
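One way the canonical signing input builder can achieve "changing any signed field breaks the signing input" is length-prefixed concatenation over the signed fields plus the SHA-256 payload hash. The field set and encoding here are assumptions, not the gateway's actual wire format.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// canonicalSigningInput builds deterministic bytes over signed envelope
// fields. Length prefixes make the encoding unambiguous, so changing any
// field (or the payload bytes behind payload_hash) changes the result.
func canonicalSigningInput(messageType, requestID string, payload []byte) []byte {
	payloadHash := sha256.Sum256(payload)
	var out []byte
	for _, field := range [][]byte{[]byte(messageType), []byte(requestID), payloadHash[:]} {
		var n [4]byte
		binary.BigEndian.PutUint32(n[:], uint32(len(field)))
		out = append(out, n[:]...)
		out = append(out, field...)
	}
	return out
}

func main() {
	a := canonicalSigningInput("lobby.create", "req-1", []byte("payload"))
	b := canonicalSigningInput("lobby.create", "req-2", []byte("payload"))
	fmt.Println(string(a) == string(b)) // false: request_id is signed
}
```

Without the length prefixes, `("ab", "c")` and `("a", "bc")` would produce identical bytes, which is exactly the ambiguity a canonical encoding must rule out.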
## ~~Phase 10.~~ Client Signature Verification
Status: implemented.
Goal: authenticate the request origin using the session public key.
Artifacts:
- signature verifier
- deterministic auth reject mapping
Dependencies: Phase 9.
Acceptance criteria:
- wrong key and invalid signature produce stable rejects.
Targeted tests:
- success case with valid signature;
- bad signature reject;
- wrong-key reject.
## ~~Phase 11.~~ Freshness and Anti-Replay
Status: implemented.
Goal: enforce transport freshness and replay protection.
Artifacts:
- timestamp freshness window
- `ReplayStore`
- replay reservation and rejection logic
Dependencies: Phase 10.
Acceptance criteria:
- stale requests and duplicate `request_id` values are rejected.
Targeted tests:
- stale timestamp reject;
- replay reject for same session and request ID;
- distinct sessions do not collide.
## ~~Phase 12.~~ Authenticated Rate Limits and Policy
Status: implemented.
Goal: apply edge policy after transport authenticity is established.
Artifacts:
- rate-limit keys for IP, session, user, and message class
- authenticated policy evaluation hook
Dependencies: Phase 11.
Acceptance criteria:
- authenticated buckets are independent from public REST buckets.
Targeted tests:
- per-dimension throttling;
- bucket isolation from public traffic.
## ~~Phase 13.~~ Internal Authenticated Command and Routing
Status: implemented.
Note: delivered together with Phase 14 signed unary responses.
Goal: forward only verified context to downstream services.
Artifacts:
- `AuthenticatedCommand`
- `DownstreamRouter`
- `DownstreamClient`
Dependencies: Phase 12.
Acceptance criteria:
- downstream services receive verified context only;
- raw transport details do not leak as authoritative input.
Targeted tests:
- route selection by `message_type`;
- downstream receives the expected authenticated context.
## ~~Phase 14.~~ Signed Unary Responses
Status: implemented as part of Phase 13 delivery.
Goal: return verifiable server responses to authenticated clients.
Artifacts:
- response envelope builder
- payload hash generation
- `ResponseSigner`
Dependencies: Phase 13.
Acceptance criteria:
- unary responses always carry the original `request_id`, `payload_hash`, and
server signature.
Targeted tests:
- response correlation test;
- server signature generation test.
## ~~Phase 15.~~ Session Update and Revocation Events
Status: implemented.
Goal: keep gateway session state current without synchronous hot-path lookups.
Artifacts:
- `EventSubscriber`
- session update handlers
- session revoke handlers
Dependencies: Phase 8.
Acceptance criteria:
- session updates change gateway behavior without per-request sync calls to the
auth service.
Targeted tests:
- cache update from event;
- revocation event invalidates cached session.
## ~~Phase 16.~~ Authenticated Push Stream
Status: implemented.
Goal: open a verified server-streaming channel for client-facing delivery.
Artifacts:
- `SubscribeEvents` handler
- stream binding to `user_id` and `device_session_id`
- initial server time event
Dependencies: Phase 15.
Acceptance criteria:
- the stream opens only after the full auth pipeline succeeds.
Targeted tests:
- authorized stream open;
- rejected stream open for invalid session;
- first event contains server time.
## ~~Phase 17.~~ Event Fan-Out
Status: implemented.
Goal: deliver client-facing events from internal pub/sub to active streams.
Artifacts:
- `PushHub`
- event fan-out logic
- user and session targeting rules
Dependencies: Phase 16.
Acceptance criteria:
- events are delivered to the correct active streams only.
Targeted tests:
- single-session delivery;
- multi-device delivery for one user;
- unrelated sessions do not receive the event.
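The targeting rules can be sketched as below. `hub`, `stream`, and the empty-session-means-all-devices convention are illustrative assumptions; the real `PushHub` writes to live gRPC streams rather than slices.

```go
package main

import "fmt"

// stream is one active SubscribeEvents connection.
type stream struct {
	userID    string
	sessionID string
	inbox     []string // stands in for the live gRPC stream
}

// hub fans events out to matching active streams only.
type hub struct{ streams []*stream }

// publish targets by user (all devices) or, when sessionID is non-empty,
// a single device session.
func (h *hub) publish(userID, sessionID, event string) {
	for _, s := range h.streams {
		if s.userID != userID {
			continue
		}
		if sessionID != "" && s.sessionID != sessionID {
			continue
		}
		s.inbox = append(s.inbox, event)
	}
}

func main() {
	a1 := &stream{userID: "u1", sessionID: "d1"}
	a2 := &stream{userID: "u1", sessionID: "d2"}
	b := &stream{userID: "u2", sessionID: "d3"}
	h := &hub{streams: []*stream{a1, a2, b}}

	h.publish("u1", "", "e-broadcast") // both of u1's devices
	h.publish("u1", "d2", "e-single")  // only one device

	fmt.Println(len(a1.inbox), len(a2.inbox), len(b.inbox)) // 1 2 0
}
```

This captures all three targeted tests: single-session delivery, multi-device delivery for one user, and no delivery to unrelated sessions.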
## ~~Phase 18.~~ Revocation-Driven Stream Teardown
Status: implemented.
Goal: terminate active delivery channels when a session is revoked.
Artifacts:
- stream teardown on revoke
- connection cleanup logic
Dependencies: Phase 17.
Acceptance criteria:
- revocation blocks new unary requests and closes active streams for the same
session.
Targeted tests:
- revoke closes active stream;
- revoked session cannot reopen the stream.
## ~~Phase 19.~~ Observability and Shutdown Hardening
Status: implemented.
Note: delivered with `zap` structured logging, OpenTelemetry tracing and
metrics, the optional private admin `/metrics` listener, timeout budgets, and
shutdown-driven push-stream teardown.
Goal: make the service operable in production.
Artifacts:
- structured logs
- metrics
- trace propagation
- timeout budgets
- graceful shutdown for unary and streaming traffic
Dependencies: Phase 18.
Acceptance criteria:
- shutdown is deterministic;
- logs and metrics expose stable edge outcomes without leaking secrets.
Targeted tests:
- shutdown closes listeners and active streams;
- secret and signature values are not logged.
## ~~Phase 20.~~ Acceptance Pass
Status: implemented.
Note: acceptance pass reconciled README/OpenAPI/root architecture
documentation, fixed the documented public-auth projected-error contract, and
added focused regression coverage including OpenAPI validation.
Goal: reconcile implementation, documentation, and regression coverage.
Artifacts:
- updated README and PLAN
- final protocol and interface review
- focused regression test run
Dependencies: Phases 1 through 19.
Acceptance criteria:
- implementation matches documented contracts and ordering guarantees;
- docs describe the actual gateway behavior.
Targeted tests:
- run focused package tests for gateway packages;
- rerun cross-cutting regression scenarios.
## Cross-Cutting Regression Scenarios
- `send_email_code` and `confirm_email_code` are available without session auth
and are still limited by public auth policy.
- Public browser bootstrap and asset bursts do not increase auth abuse counters
and are not rejected as hostile because of intensity alone.
- Any gRPC command without a valid session is rejected before routing.
- Unknown and revoked sessions are handled predictably and consistently where
policy requires identical behavior.
- Signature verification fails when `payload_bytes`, `payload_hash`,
`message_type`, `request_id`, or the signing key changes.
- `payload_hash` is verified before downstream execution.
- Requests outside the freshness window are rejected.
- Reused `request_id` values are rejected within the session replay window.
- Public REST and authenticated gRPC traffic use independent buckets and
independent abuse telemetry.
- Downstream services receive `AuthenticatedCommand`, not raw REST or gRPC
transport requests.
- Unary responses preserve `request_id` correlation and are server-signed.
- Streaming connections open only after the auth pipeline and close on revoke.
- Session cache updates from events change gateway behavior without synchronous
auth-service lookups per request.
- Graceful shutdown terminates unary and streaming traffic cleanly.