# Galaxy Functional Specification

This document describes what the Galaxy platform does, in terms of
user-visible operations and the per-service logic that implements them.
Each section walks through one domain scenario: who initiates an
operation, what `gateway` checks and forwards, what `backend` validates
and persists, what is returned to the client, and what side effects
fire (mail, push, container ops).

This is the starting point for any change request that touches
behaviour. The exact wire shape, error code vocabulary, environment
variables, default values, throttle limits, table and column names,
and field-level validation live in the lower-level sources:

- [`ARCHITECTURE.md`](ARCHITECTURE.md) — global architecture, security
  model, transport contract.
- `galaxy/<service>/README.md` — service layout, configuration,
  operations.
- `galaxy/<service>/openapi.yaml`, `*.proto` — wire contracts.
- `galaxy/<service>/docs/flows.md` — sequence diagrams.

This file deliberately omits those details. When this file and a
lower-level source disagree, see the synchronisation rule in the
project `CLAUDE.md`.

A Russian translation lives in
[`FUNCTIONAL_ru.md`](FUNCTIONAL_ru.md). It is a convenience mirror for
the project owner, **not a source of truth** — this English file is
authoritative. Every point edit to this file must be mirrored into the
Russian version in the same patch (translate only the touched
paragraphs); a full re-translation happens only on explicit owner
request.

The document is organised by domain scenario, not by HTTP route group.
Public, user-authenticated, and admin operations may all appear in the
same scenario when they participate in the same business flow.

## Table of Contents

1. [Authentication and device session](#1-authentication-and-device-session)
2. [Account management](#2-account-management)
3. [Lobby game lifecycle](#3-lobby-game-lifecycle)
4. [Lobby participation](#4-lobby-participation)
5. [Race Name Directory](#5-race-name-directory)
6. [In-game session](#6-in-game-session)
7. [Push channel](#7-push-channel)
8. [Notifications and mail](#8-notifications-and-mail)
9. [Geo signal](#9-geo-signal)
10. [Administration](#10-administration)

---

## 1. Authentication and device session

This scenario covers how an anonymous client becomes authenticated and
stays authenticated until a server-side action revokes that authority.

### 1.1 Scope

In scope: issuing an e-mail login challenge, confirming it (with
first-sign-in account creation and registration of the client's
public key), creating a device session, the per-request session
lookup that grounds every authenticated call, and server-initiated
revocation.

Out of scope: the wire envelope and signature scheme used by every
authenticated request — defined once in
[ARCHITECTURE.md §15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary) and reused by every later
section; client-side key storage; how push events are routed inside
gateway to a specific subscriber stream.

### 1.2 Issuing a login challenge

The client posts an e-mail address to the public auth surface on
gateway. The route is unauthenticated — there is no device session
yet to bind to.

Gateway treats this as a stricter "public auth" route class: it
applies per-IP and per-identity (per-email) anti-abuse, a body-size
cap, and a method allow-list, then forwards the request to backend.
Failures of the upstream adapter are projected back to the client
with the same status and error envelope; transport-level failures
become a generic unavailable response.

Backend produces an opaque challenge identifier and emits a
verification e-mail through the durable mail outbox. The response
shape is **identical regardless of whether the e-mail belongs to an
existing account, a fresh account, or a throttled one**, so the
endpoint cannot be used to enumerate accounts.

Branches inside backend:

- **Permanent block.** If the address is permanently blocked at the
  account level, the request is rejected. This is the only
  account-state branch that surfaces a distinct error code; every
  other branch returns the standard challenge-id response.
- **Throttle.** If too many un-consumed, non-expired challenges
  already exist for the same e-mail inside the throttle window,
  backend reuses the latest existing challenge instead of creating a
  new one. The client gets the same response shape and is unaware of
  the reuse.
- **Otherwise.** Backend creates a new challenge with the resolved
  preferred language (derived from the optional `locale` body field
  the caller sends — which takes priority — or, if absent or blank,
  from the `Accept-Language` header forwarded by gateway, falling
  back to a default), and enqueues the auth-mail row directly into
  the outbox in the same transaction. SMTP delivery is asynchronous;
  the auth response returns as soon as the challenge and outbox rows
  are durably committed. The body field is the canonical channel
  because Safari silently drops JS-set `Accept-Language` headers;
  non-Safari clients can still rely on the header alone.
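
The language resolution order in the last branch can be sketched as follows. This is a minimal illustration, not backend's actual code: the function name `resolveLanguage` and the `fallback` parameter are hypothetical, and the sketch takes only the first tag of `Accept-Language`, ignoring q-values.

```go
package main

import "strings"

// resolveLanguage picks the preferred language for a new challenge.
// Priority: the `locale` body field, then the first tag of the
// forwarded Accept-Language header, then the fallback.
// Hypothetical helper; the real resolver may weigh q-values.
func resolveLanguage(locale, acceptLanguage, fallback string) string {
	if l := strings.TrimSpace(locale); l != "" {
		return l
	}
	// Accept-Language looks like "ru-RU,ru;q=0.9,en;q=0.8".
	first, _, _ := strings.Cut(acceptLanguage, ",")
	tag, _, _ := strings.Cut(first, ";")
	if tag = strings.TrimSpace(tag); tag != "" && tag != "*" {
		return tag
	}
	return fallback
}
```

A blank `locale` is treated the same as an absent one, matching the "absent or blank" wording above.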

### 1.3 Confirming the challenge

The client posts the challenge id, the code received by mail, a fresh
Ed25519 public key, and the chosen IANA time zone. Gateway applies
the same public-auth anti-abuse class, with the per-identity bucket
keyed by the challenge id rather than the e-mail. `Accept-Language` is
not consulted on this endpoint — the preferred language was captured
at send-time and is replayed from the challenge row.

Backend validates the challenge under a row lock: it rejects unknown,
expired, or already-consumed ids, increments the attempt counter, and
burns the challenge once the per-challenge attempt ceiling is reached.
After the code matches, backend re-checks the permanent-block flag —
catching the case where an admin applied the block between send and
confirm — and rejects the request when set. On the success path backend
ensures the account exists (synthesising an immutable display handle on
first sign-in only and populating the declared country from the source
IP), then marks the challenge consumed and creates a device session
bound to the caller's public key in the same transaction. The response
carries the new device session id.

A challenge is single-use. A second confirm on the same id returns the
same opaque `invalid_request` shape as confirming an unknown or expired
id; the API deliberately does not differentiate between the three so an
attacker cannot mine challenge state. Throttle reuse on the send side
means a client hitting the throttle gets the latest existing
`challenge_id` back instead of a fresh one, but each id can still be
consumed at most once.
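
A minimal in-memory sketch of the confirm rules, assuming a hypothetical attempt ceiling of 3 and a hypothetical `errCodeMismatch` for a wrong code (the text above does not say what a wrong code returns). The real implementation runs under a row lock inside a transaction; a plain map stands in for the table here.

```go
package main

import (
	"errors"
	"time"
)

var (
	// Same opaque error for unknown, expired, and consumed ids.
	errInvalidRequest = errors.New("invalid_request")
	// Hypothetical: the wrong-code error is not specified in this document.
	errCodeMismatch = errors.New("code_mismatch")
)

const maxAttempts = 3 // hypothetical per-challenge ceiling

type challenge struct {
	code      string
	expiresAt time.Time
	consumed  bool
	attempts  int
}

// confirmChallenge validates one confirm attempt against the store.
func confirmChallenge(store map[string]*challenge, id, code string, now time.Time) error {
	ch, ok := store[id]
	if !ok || ch.consumed || !now.Before(ch.expiresAt) {
		return errInvalidRequest // the three cases are indistinguishable
	}
	ch.attempts++
	if ch.code != code {
		if ch.attempts >= maxAttempts {
			ch.consumed = true // burn the challenge at the ceiling
		}
		return errCodeMismatch
	}
	ch.consumed = true // single-use: a second confirm hits the branch above
	return nil
}
```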

### 1.4 Per-request session lookup

Once the client holds a device session id and a private key, every
authenticated call is a signed request to gateway over the
authenticated edge listener (Connect / gRPC / gRPC-Web on a single
HTTP/h2c port). Gateway is the only component that ever sees the
request signature; backend trusts gateway's verdict.

Gateway needs the session's public key to verify the signature, so each
authenticated request resolves the device session through an in-memory
LRU cache (bounded entry count plus a safety-net TTL). On miss the
cache calls backend's per-request session lookup endpoint and seeds the
entry. Gateway rejects the request when the cache returns "session
unknown" or "revoked"; otherwise it verifies the envelope per
[ARCHITECTURE.md §15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary) and forwards the verified
payload to backend over plain REST, injecting the resolved user id in a
header. Backend never re-derives identity from the request body.

Backend updates `last_seen_at` on the session row on every successful
lookup so admin operators can observe when each cached session was
last resolved at the edge. The update is part of the lookup
transaction; failures are logged but do not surface to the caller.

The cache is invalidated through the push channel rather than a
periodic refresh: a `session_invalidation` event flips the cached
entry's status to revoked, so subsequent requests bound to the
session are rejected without another backend round-trip. The TTL is
the safety net for missed events (cursor aged out, gateway restart) —
in steady state the push events are the authoritative source of
invalidation.
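
The cache behaviour described above can be sketched as below. All names are hypothetical, the LRU entry bound is omitted for brevity, and the backend lookup is modelled as a plain function; this illustrates the miss, hit, invalidation, and TTL paths rather than gateway's actual cache.

```go
package main

import (
	"sync"
	"time"
)

type sessionStatus int

const (
	statusActive sessionStatus = iota
	statusRevoked
)

type cachedSession struct {
	publicKey []byte
	status    sessionStatus
	fetchedAt time.Time
}

// sessionCache is a hypothetical, unbounded stand-in for the LRU cache.
type sessionCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]cachedSession
	// lookup stands in for backend's per-request session lookup.
	lookup func(id string) (cachedSession, bool)
}

// resolve returns the session and whether it may authenticate a request.
func (c *sessionCache) resolve(id string, now time.Time) (cachedSession, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[id]; ok && now.Sub(e.fetchedAt) < c.ttl {
		return e, e.status == statusActive // hit: no backend round-trip
	}
	e, ok := c.lookup(id) // miss or TTL expiry: ask backend
	if !ok {
		delete(c.entries, id)
		return cachedSession{}, false // session unknown
	}
	e.fetchedAt = now
	c.entries[id] = e
	return e, e.status == statusActive
}

// invalidate handles a session_invalidation push event.
func (c *sessionCache) invalidate(id string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[id]; ok {
		e.status = statusRevoked
		c.entries[id] = e
	}
}
```

Holding the mutex across the backend call keeps the sketch short; a production cache would more likely release it or coalesce concurrent misses.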

### 1.5 Revocation

Revocation makes a device session unable to authenticate any future
request and forces in-flight push streams bound to it to close.
Triggers fall in two groups.

**User-driven (logout).** The user surface exposes three operations:
list the caller's active sessions, revoke a single one, and revoke
all of them. Gateway forwards these to backend as ordinary
authenticated requests. Backend verifies the target session belongs
to the caller (otherwise responds with the same shape as a missing
session, so foreign session ids are not probeable), atomically flips
`device_sessions.status` to `revoked` and inserts a row into
`session_revocations`, then publishes one `session_invalidation`
event per revoked session.

**Admin-driven and lifecycle.** Sanctions that imply session
revocation (currently `permanent_block`), admin-driven soft delete,
and user-self soft delete all run an in-process call inside backend.
The same atomic UPDATE + audit-insert + push emission applies; the
audit row carries a different `actor_kind`
(`admin_sanction` / `soft_delete_admin` / `soft_delete_user`).

Once backend has emitted the push event, gateway flips the cached
session entry to revoked and closes any active push streams bound to
it. The per-request internal lookup against backend remains the
durable safety net: if a push event is lost, the next lookup (after
the cache TTL) returns the revoked record.

`session_revocations` is the audit ledger. Each row carries
`revocation_id`, `device_session_id`, `user_id`, `actor_kind`, the
actor pair (`actor_user_id` for user-driven kinds, `actor_username`
for admin-driven kinds — exactly one is non-NULL per row), `reason`,
and `revoked_at`. Operators can query it to answer "who and why
revoked this session"; the table is append-only.
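
The "exactly one is non-NULL" rule on the actor pair can be expressed as a small invariant check. The struct below is a hypothetical, trimmed view of the row; only the column names come from this document.

```go
package main

import "errors"

// revocationRow is a hypothetical subset of a session_revocations row.
type revocationRow struct {
	ActorKind     string
	ActorUserID   *string // set for user-driven kinds
	ActorUsername *string // set for admin-driven kinds
}

// validate enforces that exactly one side of the actor pair is set.
func (r revocationRow) validate() error {
	if (r.ActorUserID == nil) == (r.ActorUsername == nil) {
		return errors.New("exactly one of actor_user_id and actor_username must be set")
	}
	return nil
}
```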

Backend's `/api/v1/internal/sessions/{id}` is read-only — it carries
the per-request session lookup gateway needs to verify signed
envelopes. Internal revoke endpoints no longer exist; revoke is
either user-driven (through the user surface) or admin-driven
(through in-process calls inside backend).

### 1.6 Cross-references

- Wire envelope, signing, freshness window, anti-replay:
  [ARCHITECTURE.md §15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary).
- Backend module responsibilities for `auth`, `user`, `geo`, `mail`,
  `push`: [ARCHITECTURE.md §4](ARCHITECTURE.md#4-backend-domain-modules) and
  `backend/README.md`.
- Mail outbox semantics for the auth login-code template:
  [ARCHITECTURE.md §11](ARCHITECTURE.md#11-mail-outbox).
- Push channel framing and reconnect rules:
  [ARCHITECTURE.md §8](ARCHITECTURE.md#8-backend--gateway-communication). User-facing push semantics
  appear in [Section 7](#7-push-channel) of this document.

---

## 2. Account management

This scenario covers what an authenticated user can read or change
about their own account, and how a user removes the account.

### 2.1 Scope

In scope: reading the account aggregate, updating the mutable profile
slice, updating settings (preferred language, time zone, declared
country), and user-initiated soft delete.

Out of scope: admin-side mutation of the same account (sanctions,
limits, entitlement changes, admin soft delete) — covered in
[Section 10](#10-administration). Permanent block flag toggling is admin-only.

### 2.2 The account aggregate

Backend exposes a single read endpoint that returns the caller's
account aggregate: the durable identifying fields (immutable display
handle, e-mail), the mutable profile and settings slices, the
current entitlement snapshot, and any active sanctions and per-user
limit overrides. The aggregate is the authoritative client-side view
of "what the platform knows about me".

The display handle is synthesised at first sign-in ([Section 1.3](#13-confirming-the-challenge)) and
is never overwritten on subsequent sign-ins or on profile updates.
Clients should treat it as a stable identifier rather than a display
preference.

### 2.3 Profile and settings updates

Two distinct mutating endpoints split user-controlled fields by the
nature of the change. Both follow PATCH semantics — omitted fields
are not touched, present fields replace the stored value — and both
return the updated aggregate.

Profile carries one display-oriented field: `display_name`. An
explicit empty value clears the stored name; omitting the field
leaves it untouched.

Settings carries locale and timezone preferences:
`preferred_language` (BCP 47 tag) and `time_zone` (IANA identifier).
Both must be non-empty after trim when present; the timezone is
validated against the IANA database before commit.
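
A sketch of the settings patch under these rules, assuming pointer fields to distinguish "omitted" from "present". The type and function names are hypothetical. `time.LoadLocation` is used as the IANA check; note that it also accepts `"Local"`, which a real validator would probably reject.

```go
package main

import (
	"errors"
	"strings"
	"time"
	_ "time/tzdata" // embed the IANA database so the check works on any host
)

type settings struct {
	PreferredLanguage string
	TimeZone          string
}

// settingsPatch uses pointers: nil means "omitted, leave untouched".
type settingsPatch struct {
	PreferredLanguage *string `json:"preferred_language"`
	TimeZone          *string `json:"time_zone"`
}

var errInvalidPatch = errors.New("invalid settings patch")

// applySettingsPatch returns the updated settings, or the original
// settings and an error when any present field fails validation.
func applySettingsPatch(cur settings, p settingsPatch) (settings, error) {
	next := cur
	if p.PreferredLanguage != nil {
		v := strings.TrimSpace(*p.PreferredLanguage)
		if v == "" {
			return cur, errInvalidPatch
		}
		next.PreferredLanguage = v
	}
	if p.TimeZone != nil {
		v := strings.TrimSpace(*p.TimeZone)
		if v == "" {
			return cur, errInvalidPatch
		}
		if _, err := time.LoadLocation(v); err != nil {
			return cur, errInvalidPatch
		}
		next.TimeZone = v
	}
	return next, nil
}
```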

`declared_country` is **not** part of either patch. Backend writes it
once at registration from the source IP ([Section 9](#9-geo-signal)) and treats it as
immutable thereafter; there is no user-facing path to change it.

### 2.4 User-initiated soft delete

The user can ask backend to soft-delete their own account. Backend
marks the account row deleted, then runs the in-process cascade
documented in [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns). Concretely:

- Every device session for the user is revoked ([Section 1.5](#15-revocation)), with
  one audit row per session and one `session_invalidation` push
  event per session.
- Active memberships flip to `removed` (admin-driven block flips
  them to `blocked`); pending applications get `rejected`; incoming
  invites get `declined`; outgoing invites get `revoked`.
- Race name entries owned by the user — registered, reservation, or
  pending_registration — are deleted in a single cascade write.
- Owned games in non-running statuses (`draft`, `enrollment_open`,
  `ready_to_start`, `start_failed`, `paused`) are cancelled. Owned
  games already in `running` are **not** cancelled by the cascade —
  the engine container keeps producing turns until it finishes
  naturally; only the membership cleanup detaches the user.
- A single `lobby.membership.removed` notification fans out to the
  user with `reason=removed` (or `reason=blocked` for the admin
  block path).

The endpoint returns no body. The cascade is best-effort within a
single process: if a downstream module fails, the failure is logged
but the account stays marked deleted.
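
The best-effort behaviour can be pictured as a loop that logs a failing step and carries on. This is a sketch under assumptions: `cascadeStep` and `runCascade` are hypothetical names, and the real cascade order and step set are the ones listed above.

```go
package main

import "log"

// cascadeStep is one hypothetical unit of the soft-delete cascade.
type cascadeStep struct {
	name string
	run  func() error
}

// runCascade executes every step even when an earlier one fails and
// returns the names of the failed steps. The account row is already
// marked deleted before this runs and is never rolled back here.
func runCascade(userID string, steps []cascadeStep) []string {
	var failed []string
	for _, s := range steps {
		if err := s.run(); err != nil {
			log.Printf("soft-delete cascade: user=%s step=%s failed: %v", userID, s.name, err)
			failed = append(failed, s.name)
		}
	}
	return failed
}
```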

### 2.5 Cross-references

- Admin-side counterparts (sanction, limit, entitlement, soft delete):
  [Section 10](#10-administration).
- The cascade contract for "user blocked / user deleted":
  [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns).
- Notification kinds emitted during the cascade:
  [`backend/README.md` §10](../backend/README.md#10-notification-catalog).

---

## 3. Lobby game lifecycle

This scenario covers a single game's life from creation to terminal
state. [Section 4](#4-lobby-participation) covers how players join an existing game; this
section focuses on the game itself.

### 3.1 Scope

In scope: creating a game (private vs public), updating its mutable
configuration, transitioning it through the lobby state machine,
cancellation, retry of a failed start, and the terminal transitions
(`finished`, `cancelled`).

Out of scope: applications, invites, memberships ([Section 4](#4-lobby-participation)), Race
Name Directory promotions on finish ([Section 5](#5-race-name-directory)), engine commands
during the running phase ([Section 6](#6-in-game-session)).

### 3.2 The state machine

The lobby state machine is the closed graph documented in
[ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns):

```text
draft → enrollment_open → ready_to_start → starting → running ↔ paused → finished
                                                       ↳ start_failed → ready_to_start (retry)
cancelled is reachable from every pre-finished state.
```

Two ground rules:

- **Ownership decides the surface.** Private games carry an
  `owner_user_id`; transitions are driven by the owner through the
  user surface. Public games are owned collectively by administrators
  (`owner_user_id IS NULL`); their transitions and configuration
  changes go through the admin surface.
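- **The runtime callback owns the exit from `starting`.**
  `starting → running` and `starting → start_failed` are the only
  transitions that the runtime module produces, after the engine
  container is fully up or has confirmed failure. Every other
  transition is a user or admin action, except `finished`, which
  backend produces when the engine reports the end of the game
  ([Section 3.5](#35-cancellation-and-finish)).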
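
The graph above can be encoded as a transition table. This is a sketch, not the lobby module's code: the type and function names are hypothetical, and it assumes `finished` is entered from `running` only, which is how Section 6.5 describes it.

```go
package main

type gameState string

const (
	draft          gameState = "draft"
	enrollmentOpen gameState = "enrollment_open"
	readyToStart   gameState = "ready_to_start"
	starting       gameState = "starting"
	startFailed    gameState = "start_failed"
	running        gameState = "running"
	paused         gameState = "paused"
	finished       gameState = "finished"
	cancelled      gameState = "cancelled"
)

// forward lists the non-cancel edges of the lobby state machine.
var forward = map[gameState][]gameState{
	draft:          {enrollmentOpen},
	enrollmentOpen: {readyToStart},
	readyToStart:   {starting},
	starting:       {running, startFailed}, // runtime callback only
	startFailed:    {readyToStart},         // retry
	running:        {paused, finished},
	paused:         {running},
}

// canTransition reports whether from → to is an edge of the graph.
// Both terminal states are absorbing.
func canTransition(from, to gameState) bool {
	if to == cancelled {
		return from != finished && from != cancelled
	}
	for _, next := range forward[from] {
		if next == to {
			return true
		}
	}
	return false
}
```

An endpoint handler would call `canTransition` with the stored state and answer with a conflict when it returns false.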

### 3.3 Creation

A user creates a private game through the user surface. Backend
records the new game with `owner_user_id` set to the caller and
visibility `private`, in state `draft`, with the request body's
configuration as initial values.

Public games are created exclusively through the admin surface
([Section 10](#10-administration)). The user surface never produces a public game; this
asymmetry is enforced in backend, not at the route level.

### 3.4 Forward transitions

Owners drive forward transitions via dedicated endpoints
(`open-enrollment`, `ready-to-start`, `start`, `pause`, `resume`,
`retry-start`). Each endpoint:

- checks ownership of the game (or admin scope for public games);
- checks the source state matches the transition's precondition,
  rejecting with a conflict if not;
- updates the lobby record and publishes any user-facing
  notifications attached to the transition.

`start` queues a runtime job (long-running container pull / start /
init) and immediately returns "queued". Final state movement
(`starting → running` or `starting → start_failed`) arrives later
through the runtime callback. `retry-start` re-arms a `start_failed`
game back to `ready_to_start` and lets the owner trigger `start`
again.

`pause` and `resume` flip between `running` and `paused`. The
running engine container is not torn down on pause; only the lobby
schedule and command-acceptance flags change.

`ready-to-start` is always an explicit owner (or admin) action,
never auto-fired. The transition checks that the approved member
count is at least `min_players` and rejects with a conflict
otherwise.

### 3.5 Cancellation and finish

`cancel` is reachable from every pre-finished state. Owners can
cancel their own games; admins can cancel any. Cancellation
reconciles outstanding applications, invites, and memberships; it
does not promote race-name reservations.

`finished` is produced inside backend after the engine reports the
game finished. The transition tears down the engine container,
freezes the lobby record, and triggers Race Name Directory
promotions for capable finishes ([Section 5](#5-race-name-directory)). Both terminal states
are absorbing.

### 3.6 Admin overrides

Administrators can `force-start`, `force-stop`, and `ban-member` on
any game (public or private) regardless of state. `force-stop`
transitions the game to a stopped state and tears down the engine
container; `ban-member` removes a membership and prevents the user
from re-joining ([Section 4](#4-lobby-participation)).

### 3.7 Cross-references

- State machine vocabulary and transition rules:
  [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns).
- Runtime job lifecycle (the asynchronous work behind `start`):
  [ARCHITECTURE.md §13](ARCHITECTURE.md#13-container-lifecycle-in-process) and `backend/docs/flows.md`.
- Public-vs-private invariants and the partial index that supports
  them: [ARCHITECTURE.md §4](ARCHITECTURE.md#4-backend-domain-modules).

---

## 4. Lobby participation

This scenario covers everything around joining and leaving an
existing game: applications (public), invites (private), and
memberships (after the join succeeds).

### 4.1 Scope

In scope: submitting an application to a public game, owner / admin
approval or rejection of an application, issuing and redeeming
invites, recipient decline and issuer revocation, listing
memberships per game, and member removal or block.

Out of scope: the game state machine itself ([Section 3](#3-lobby-game-lifecycle)) and the
in-game commands once a member is playing ([Section 6](#6-in-game-session)).

### 4.2 Applications (public games)

A user submits an application to a game by id. Applications are
**only accepted on public games**; an attempt against a private game
is rejected with a conflict. The game must additionally be in
`enrollment_open` (the only enrolment-accepting state for
applications). Backend also rejects the request if the user is
already a member or on the game's block list (via `ban-member`).
Otherwise it stores the application as `pending` and emits a
notification to the admin channel.

The owner — or an administrator for public games — approves or
rejects the application through dedicated endpoints. Approval
creates a membership for the applicant and emits the corresponding
notification. Rejection just records the terminal state; no
membership appears.

### 4.3 Invites (private games)

Invites can **only be issued for private games**; an attempt to issue
one for a public game is rejected with a conflict. The owner issues
an invite while the game is in `draft`, `enrollment_open`, or
`ready_to_start`.

Two flavours coexist:

- **User-bound** — `invited_user_id` is set; only that user may
  redeem. A `lobby.invite.received` notification is emitted to the
  recipient.
- **Code-based** — `invited_user_id` is empty; backend mints a hex
  code at issue time and any caller who knows the code may redeem.
  No notification is emitted at issue time (no recipient is bound
  yet).

Each invite carries an expiry (defaulted from configuration when
the body omits `expires_at`). The recipient redeems (creates a
membership) or declines; the issuer can revoke an outstanding
invite at any time before redemption.
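
Issuing an invite in either flavour could look like the sketch below. The 7-day default TTL and the 16-byte code length are assumptions made for illustration; the real values come from configuration and are not stated in this document.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"time"
)

// defaultInviteTTL is a hypothetical stand-in for the configured default.
const defaultInviteTTL = 7 * 24 * time.Hour

type invite struct {
	InvitedUserID string // empty for code-based invites
	Code          string // empty for user-bound invites
	ExpiresAt     time.Time
}

// newInvite builds a user-bound invite when invitedUserID is set and a
// code-based invite otherwise. A nil expiresAt selects the default TTL.
func newInvite(invitedUserID string, expiresAt *time.Time, now time.Time) (invite, error) {
	inv := invite{InvitedUserID: invitedUserID, ExpiresAt: now.Add(defaultInviteTTL)}
	if expiresAt != nil {
		inv.ExpiresAt = *expiresAt
	}
	if invitedUserID == "" {
		buf := make([]byte, 16) // assumed length: 32 hex characters
		if _, err := rand.Read(buf); err != nil {
			return invite{}, err
		}
		inv.Code = hex.EncodeToString(buf)
	}
	return inv, nil
}
```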

### 4.4 Memberships

Memberships list the players currently attached to a game. Owners
can remove or block a member; a member can also remove themselves.
Removal terminates participation cleanly; block additionally
prevents the same user from re-applying or redeeming a future
invite for the same game.

The admin surface offers `ban-member` as the cross-game-policy
counterpart to the owner's block.

### 4.5 Listing the caller's view

The user surface exposes three "my" listings (games, applications,
invites). They project the caller's involvement across all games
without requiring the client to know game ids in advance, which
makes the dashboard and inbox views possible.

### 4.6 Notifications

Every state change in this scenario emits a notification kind from
the catalog: `lobby.invite.received`, `lobby.invite.revoked`,
`lobby.application.submitted`, `lobby.application.approved`,
`lobby.application.rejected`, `lobby.membership.removed`,
`lobby.membership.blocked`. [Section 8](#8-notifications-and-mail) documents the fan-out.

### 4.7 Cross-references

- Game lifecycle: [Section 3](#3-lobby-game-lifecycle).
- Notification catalog and fan-out: [Section 8](#8-notifications-and-mail) and
  [`backend/README.md` §10](../backend/README.md#10-notification-catalog).

---

## 5. Race Name Directory

This scenario covers how a player picks the name of their in-game
race and, eventually, gets that name registered platform-wide.

### 5.1 Scope

In scope: the three-tier directory (registered, reservation,
pending_registration), promotion through "capable finish",
user-driven promotion of a pending registration to registered,
sweeper-driven release on TTL expiry, and uniqueness through the
canonical-key model.

Out of scope: how the engine actually consumes the chosen name —
that lives in [Section 6](#6-in-game-session).

### 5.2 Three tiers

- **Registered** is platform-unique. A canonical key has at most one
  live binding to a single user.
- **Reservation** is per-game. The same canonical key can be
  reserved by the same user across several active games at the same
  time, but two different users cannot reserve the same canonical
  key in the same game.
- **Pending registration** is the transient tier between
  reservation and registered. It is issued automatically after a
  "capable finish" (the game ended with the player having grown
  their initial planet count and population), and it gives the user
  a bounded window to convert the reservation into a permanent
  registration.

### 5.3 Canonicalisation

Every name (typed by a user or registered by the platform) is
folded into a canonical key. Canonicalisation is confusable-aware
(Latin-Cyrillic look-alikes, digit-letter substitutions) and is
applied uniformly across the directory; uniqueness is enforced on
the canonical key, not on the displayed name. Cross-tier conflicts
on the same canonical key are blocked at write time through a
per-canonical advisory lock.
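
To make the folding idea concrete, here is a deliberately tiny sketch. The confusable table is illustrative and incomplete, and the function name is hypothetical; the platform's actual canonicalisation library and mapping are not described in this document.

```go
package main

import "strings"

// confusables maps a few look-alike runes onto one representative.
// Illustrative subset only.
var confusables = map[rune]rune{
	'\u0430': 'a', // Cyrillic а
	'\u0435': 'e', // Cyrillic е
	'\u043e': 'o', // Cyrillic о
	'\u0440': 'p', // Cyrillic р
	'\u0441': 'c', // Cyrillic с
	'\u0445': 'x', // Cyrillic х
	'0':      'o',
	'1':      'l',
	'3':      'e',
	'5':      's',
}

// canonicalKey folds a display name into the key on which uniqueness
// is enforced: trim, lower-case, then replace confusable runes.
func canonicalKey(name string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(strings.TrimSpace(name)) {
		if mapped, ok := confusables[r]; ok {
			r = mapped
		}
		b.WriteRune(r)
	}
	return b.String()
}
```

Under this table, `Terran` written with Cyrillic `е` and `а` folds to the same key as the all-Latin spelling, so the second registration would collide with the first.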

### 5.4 Promotion path

A reservation appears when a player names their race during a game.
When the game finishes capably, backend automatically converts the
reservation into a pending_registration with a TTL. While the
pending entry is alive, the user can call the registration endpoint
to promote the entry to `registered`. If the TTL expires first, a
periodic sweeper releases the entry; the canonical key becomes
available again.

A pending registration can be claimed only by the user who earned
it; backend rejects an attempt by a different user even if the
canonical key matches.

### 5.5 Notifications

The directory emits `lobby.race_name.registered`,
`lobby.race_name.pending`, and `lobby.race_name.expired` to the
owning user. [Section 8](#8-notifications-and-mail) covers fan-out.

### 5.6 Cross-references

- Canonicalisation library and glossary entries
  ("canonical key", "capable finish"):
  [ARCHITECTURE.md §19](ARCHITECTURE.md#19-glossary).
- The promotion trigger inside the lobby module:
  [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns) (`lobby.OnGameFinished`)
  and `backend/docs/flows.md`.

---

## 6. In-game session

This scenario covers what an active player does while a game is
running: submit commands and orders, read turn reports.

### 6.1 Scope

In scope: command submission, order submission, report reading, and
the turn-cutoff behaviour that closes the command window during
generation.

Out of scope: how the engine container itself is started, scheduled,
or stopped — those are runtime concerns covered in [Section 3](#3-lobby-game-lifecycle) (start
/ stop) and [Section 10](#10-administration) (admin runtime overrides). The wire format of
commands, orders, and reports is the engine's own contract and is
not duplicated here.

### 6.2 Backend's role: pass-through with authorisation

In-game traffic on the signed authenticated edge uses three message
types: `user.games.command`, `user.games.order`, and
`user.games.report`, each with a typed FlatBuffers (FB) payload.
Gateway transcodes the FB request into the JSON shape backend expects,
forwards it over plain REST to the corresponding
`/api/v1/user/games/{game_id}/*` endpoint, then transcodes the JSON
response back into FB before signing the reply.

For every in-game endpoint the user surface acts as an authorised
pass-through to the engine container. Backend:

- verifies the caller is an active member of the target game and
  that the game is in a state that accepts the operation;
- rebinds the actor field in the body to the caller's race name from
  the runtime player mapping (clients never supply a trusted actor);
- resolves the engine endpoint (the running container for the
  `game_id`) and forwards the call;
- returns the engine's response payload back to the client without
  re-interpretation.

Backend does not parse command or order payload contents beyond
what authorisation requires. The engine is the source of truth for
validity and ordering of in-game decisions. Gateway needs to know
the typed FB shape only to transcode the wire format; the per-command
semantics live in the engine.

### 6.3 Turn cutoff

A running game continuously alternates between a command-accepting
window and a generation phase. The transition `running →
generation_in_progress` is the cutoff: any command or order that
arrives after the cutoff is rejected by backend before forwarding,
because the engine no longer accepts writes for the closing turn.
After generation finishes, backend re-opens the window for the next
turn.

`force-next-turn` (admin) schedules a one-shot extra tick that
advances the next scheduled turn by one cron step.

### 6.4 Reports

Per-turn reports are read-only views fetched from the engine on
demand. Backend authorises the caller and forwards the request;
there is no caching or denormalisation in this path.

### 6.5 Side effects

A successful turn generation publishes a runtime snapshot into the
lobby module, which updates the denormalised view (current turn,
runtime status, per-player stats). The engine's "game finished"
report drives the `running → finished` transition ([Section 3.5](#35-cancellation-and-finish))
and triggers Race Name Directory promotions ([Section 5](#5-race-name-directory)).

The `game.*` notification kinds (`game.started`, `game.turn.ready`,
`game.generation.failed`, `game.finished`) are reserved in the
documentation but have **no producer** in the codebase today; the
notification catalog explicitly omits them (`backend/internal/notification/catalog.go`).
Adding a producer is purely additive: register the kind in the
catalog, populate `MailTemplateID` if email fan-out is desired, and
have the appropriate domain module call `notification.Submit`.

### 6.6 Cross-references

- Backend ↔ engine wire contract (`pkg/model/{order,report,rest}`):
  [ARCHITECTURE.md §9](ARCHITECTURE.md#9-backend--game-engine-communication).
- Container lifecycle, label discipline, reconciliation:
  [ARCHITECTURE.md §13](ARCHITECTURE.md#13-container-lifecycle-in-process) and `backend/docs/flows.md`.

---

## 7. Push channel

This scenario covers how the platform pushes real-time events to
authenticated clients (turn-ready signals, lobby state changes,
session invalidations).

### 7.1 Scope

In scope: the server-streaming subscription a client opens against
gateway (Connect / gRPC / gRPC-Web framing all map to the same
endpoint), the bootstrap event, the framing of forwarded events, and
the backend → gateway control channel that produces those events.

Out of scope: the catalog of event kinds — see [Section 8](#8-notifications-and-mail) for the
notification side and [`backend/README.md` §10](../backend/README.md#10-notification-catalog) for the closed list.

### 7.2 Client subscription

An authenticated client opens a `SubscribeEvents` server-streaming
call on gateway. Gateway runs the same envelope verification as for
unary requests ([Section 1.4](#14-per-request-session-lookup)), then registers the stream with its
internal hub. The first frame the client receives is a
gateway-signed bootstrap event carrying the current server time, so
the client can calibrate its local clock without a separate request.

### 7.3 Backend → gateway control

Backend hosts a single gRPC service `Push.SubscribePush`, consumed
by gateway. There is exactly one logical subscription per gateway
client identity at a time; a reconnect with the same id replaces
the old subscription. Each frame on the stream carries a monotonic
cursor and one of two payload shapes:

- **Client event.** A typed payload destined for one user (and
  optionally one device session). Producers pass a `push.Event`
  (Kind + Marshal) to `push.Service`; the service invokes Marshal
  and places the bytes into `pushv1.ClientEvent.Payload`. Gateway
  forwards the bytes inside a signed client envelope without
  re-interpreting them. Producers attach correlation ids that
  gateway carries verbatim. New kinds ship with a FlatBuffers-backed
  Event implementation; kinds that have not migrated yet use the
  `push.JSONEvent` fallback so the pipeline can keep emitting them.
- **Session invalidation.** Tells gateway to drop active streams and
  reject in-flight requests for the affected session(s) — the
  revocation propagation path described in [Section 1.5](#15-revocation).
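
The producer-side contract for client events can be sketched as follows. This is a minimal illustration, assuming `push.Event` is an interface with `Kind` and `Marshal` methods; the exact method signatures, the `JSONEvent` field names, the reduced `ClientEvent` struct, and the `Publish` helper are hypothetical and stand in for the real `push` and `pushv1` definitions.

```go
package main

import "encoding/json"

// Event is an assumed shape for push.Event: a kind plus a marshaller.
type Event interface {
	Kind() string
	Marshal() ([]byte, error)
}

// JSONEvent is a hypothetical stand-in for the push.JSONEvent fallback:
// it serialises an arbitrary value as JSON for kinds that have not
// migrated to a FlatBuffers-backed implementation.
type JSONEvent struct {
	EventKind string
	Value     any
}

func (e JSONEvent) Kind() string             { return e.EventKind }
func (e JSONEvent) Marshal() ([]byte, error) { return json.Marshal(e.Value) }

// ClientEvent carries only the fields this sketch needs; the real
// pushv1.ClientEvent is generated from the proto contract.
type ClientEvent struct {
	UserID  string
	Kind    string
	Payload []byte
}

// Publish invokes Marshal once and stores the resulting bytes as an
// opaque payload, so downstream hops never need to interpret them.
func Publish(userID string, ev Event) (ClientEvent, error) {
	payload, err := ev.Marshal()
	if err != nil {
		return ClientEvent{}, err
	}
	return ClientEvent{UserID: userID, Kind: ev.Kind(), Payload: payload}, nil
}
```

Because the payload is opaque bytes by the time it reaches gateway, a kind can move from the JSON fallback to FlatBuffers without any change on the forwarding path.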

### 7.4 Reliability and reconnect

Backend keeps an in-memory ring buffer of recent events. On
reconnect, gateway sends its last consumed cursor; backend resumes
from the next event when the cursor is still inside the
freshness-window TTL or restarts from the head when the cursor has
aged out. Per-connection backpressure is drop-oldest: a slow
gateway connection loses its oldest events first, with a log line
on each drop so both sides can correlate the gap.

The push channel is best-effort. The durable record of "we tried to
tell this user about this thing" lives in `notifications` /
`notification_routes` ([Section 8](#8-notifications-and-mail)); a missed push event does not
mean the platform forgets the event.

### 7.5 Producers

Backend producers that emit onto the push channel are: the
notification dispatcher (push routes from the catalog) and the
session module (revocation events). No domain module emits client
events outside of the notification dispatcher.

### 7.6 Cross-references

- Wire envelope used for push frames:
  [ARCHITECTURE.md §15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary).
- Reconnect and ring-buffer semantics:
  [ARCHITECTURE.md §8](ARCHITECTURE.md#8-backend--gateway-communication) and
  `backend/docs/flows.md` "Push gRPC".
- Notification dispatcher: [Section 8](#8-notifications-and-mail).

---

## 8. Notifications and mail

This scenario covers how the platform tells a user about an event
through push or e-mail (or both).

### 8.1 Scope

In scope: the notification intent submission flow, fan-out across
push and email channels, the durable mail outbox, dead-letter
handling, and operator-driven resend.

Out of scope: per-event semantics — when each kind fires is
documented in the relevant feature section ([Section 4](#4-lobby-participation) for lobby
kinds, [Section 5](#5-race-name-directory) for race-name kinds, [Section 6](#6-in-game-session) for game kinds).

### 8.2 Notification intent and fan-out

Domain producers (lobby, runtime, geo) submit a typed intent to the
notification module rather than handing the message off to a
specific channel. The module then:

- enforces idempotency on the intent kind plus a producer-supplied
  idempotency key;
- resolves recipients;
- materialises one route per recipient per channel, based on the
  type-specific policy in the catalog (push only, email only, both,
  or admin email);
- emits push routes onto the gRPC push stream consumed by gateway;
- inserts email routes directly into the mail outbox.

Malformed intents are quarantined to a dedicated table and never
block the producer.
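
The idempotency check and per-channel fan-out can be sketched as below. This is a simplified in-memory illustration: the `Dispatcher`, `Intent`, and `Route` types are hypothetical, recipients are assumed to be already resolved on the intent, the admin-email policy is omitted, and an unknown kind or empty key stands in for "malformed". The real module persists its state rather than holding it in maps.

```go
package main

// Channel names a delivery channel for a route.
type Channel string

const (
	ChannelPush  Channel = "push"
	ChannelEmail Channel = "email"
)

// Intent is what a domain producer submits.
type Intent struct {
	Kind           string
	IdempotencyKey string
	Recipients     []string
}

// Route is one (recipient, channel) delivery unit.
type Route struct {
	Recipient string
	Channel   Channel
}

// Dispatcher holds the per-kind channel policy and the keys seen so far.
type Dispatcher struct {
	catalog map[string][]Channel
	seen    map[[2]string]bool
}

func NewDispatcher(catalog map[string][]Channel) *Dispatcher {
	return &Dispatcher{catalog: catalog, seen: map[[2]string]bool{}}
}

// Submit materialises one route per recipient per channel. A repeated
// (kind, key) pair yields no routes; a malformed intent is reported as
// quarantined instead of returning an error to the producer.
func (d *Dispatcher) Submit(in Intent) (routes []Route, quarantined bool) {
	channels, known := d.catalog[in.Kind]
	if !known || in.IdempotencyKey == "" {
		return nil, true
	}
	key := [2]string{in.Kind, in.IdempotencyKey}
	if d.seen[key] {
		return nil, false
	}
	d.seen[key] = true
	for _, rcpt := range in.Recipients {
		for _, ch := range channels {
			routes = append(routes, Route{Recipient: rcpt, Channel: ch})
		}
	}
	return routes, false
}
```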

### 8.3 The catalog

The catalog is a closed set of kinds. Each kind specifies its
channels and the payload fields the templates and clients consume.
Three groups of entries deserve a callout:

- **`auth.login_code`.** This is the only kind that bypasses the
  notification pipeline entirely. Auth writes the email row
  directly to the outbox so the challenge commit is atomic with the
  mail enqueue.
- **`runtime.*` kinds.** They deliver to a configured admin email.
  When the admin email is unset, routes land with a `skipped`
  status and an operator log line — the request never fails because
  of missing operator config.
- **Reserved kinds without a producer.** `game.*` and
  `mail.dead_lettered` are listed in the catalog but no current
  module emits them. Adding a producer is purely additive.

### 8.4 Mail outbox

Email is a Postgres-backed durable outbox. Producers (notification
routes and the auth login-code path) write the delivery row plus
the rendered payload bytes in a single transaction. A worker
goroutine drains the outbox: it picks rows under a row lock,
attempts SMTP delivery, records the attempt, and either marks the
row sent or schedules the next attempt with exponential backoff and
jitter.

A delivery that exceeds the configured attempt budget moves to the
dead-letter table; the dead-lettering itself emits an admin
notification intent. On startup the worker drains everything that
is still pending or retrying — there is no separate recovery flow.

Operators can resend a non-`sent` delivery from the admin surface
([Section 10](#10-administration)). Resending a `sent` delivery is rejected so an
operator cannot accidentally re-deliver mail that has already left
the relay.

### 8.5 Operator visibility

The admin surface lists deliveries, attempts per delivery,
dead-letters, notifications, notification dead-letters, and
malformed notification intents. None of these listings are reachable
from the user surface.

### 8.6 Cross-references

- Notification catalog table (kinds, channels, payloads):
  [`backend/README.md` §10](../backend/README.md#10-notification-catalog).
- Mail outbox internals (tables, attempt log, worker pickup):
  [ARCHITECTURE.md §11](ARCHITECTURE.md#11-mail-outbox) and
  `backend/docs/flows.md` "Mail outbox".
- Push transport for client_event routes: [Section 7](#7-push-channel).

---

## 9. Geo signal

This scenario covers what backend records about the source IP of an
authenticated request, and what it deliberately does not do with it.

### 9.1 Scope

In scope: the one-shot declared country at registration, the
fire-and-forget per-request country counter, and the operator-only
inspection endpoint.

Out of scope: any kind of automatic flagging, account-takeover
detection, geo-fencing, sanctions enforcement, or version history.
The geo signal is a passive record, not an enforcement mechanism.

### 9.2 What backend records

At registration ([Section 1.3](#13-confirming-the-challenge)), backend looks up the source IP
against the GeoLite2 country database and stores the resulting ISO
country code on the account. This value is written exactly once per
account; subsequent sign-ins from a different country do not
overwrite it.

On every authenticated request through the user surface, a
fire-and-forget goroutine performs the same lookup against the
request IP and increments a per-(user, country) counter. The
request itself never blocks on this work; the goroutine runs after
the handler returns.

Both paths fail open: a geoip lookup error is logged but never
blocks the user.
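
The per-request counter path can be sketched as follows. This is a minimal in-memory illustration: the `Counters` type, the `Lookup` function type, and `ErrLookup` are hypothetical, the real counter is persisted, and the `Wait` method exists only so a test or shutdown hook can observe completion.

```go
package main

import (
	"errors"
	"sync"
)

// ErrLookup is a placeholder for a geoip lookup failure.
var ErrLookup = errors.New("geoip lookup failed")

// Lookup resolves an IP to an ISO country code.
type Lookup func(ip string) (country string, err error)

// Counters holds per-(user, country) request counts.
type Counters struct {
	mu sync.Mutex
	n  map[[2]string]int
	wg sync.WaitGroup
}

func NewCounters() *Counters {
	return &Counters{n: map[[2]string]int{}}
}

// Record starts a goroutine and returns immediately, so the caller
// (the request path) never waits on the lookup or the increment.
func (c *Counters) Record(userID, ip string, lookup Lookup) {
	c.wg.Add(1)
	go func() {
		defer c.wg.Done()
		country, err := lookup(ip)
		if err != nil {
			return // fail open: a real implementation would log here
		}
		c.mu.Lock()
		c.n[[2]string{userID, country}]++
		c.mu.Unlock()
	}()
}

// Wait blocks until all in-flight recordings have finished.
func (c *Counters) Wait() { c.wg.Wait() }

// Get returns the current count for one (user, country) pair.
func (c *Counters) Get(userID, country string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n[[2]string{userID, country}]
}
```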

### 9.3 What backend does NOT do

- No aggregation across users.
- No automatic flagging when the country changes.
- No notifications, ever, derived from the geo signal.
- No version history of `declared_country`.
- No correlation with sanctions, limits, or entitlements.

### 9.4 Operator access

The admin surface exposes a single read endpoint that lists per-user
country counters. The data is intended for manual inspection during
operator triage; there is no UI workflow built on top of it.

### 9.5 Source IP discipline

Backend reads the source IP from the leftmost `X-Forwarded-For`
entry, falling back to the connection peer when the header is
absent. Backend trusts the value because the network segment
between gateway and backend is the platform trust boundary — the
edge has already sanitised it. This is intentional and is restated
in [ARCHITECTURE.md §10](ARCHITECTURE.md#10-geo-profile-reduced) and [§16](ARCHITECTURE.md#16-security-boundaries-summary).

E-mail addresses are never written to logs verbatim. Backend logs a
process-scoped HMAC-truncated hash so operators can correlate log
lines within a single process lifetime without persisting PII.
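
A process-scoped log hash can be sketched as below. The choice of HMAC-SHA-256, the 32-byte random key, the 8-byte truncation, and the `EmailHasher` name are assumptions for illustration; the source only states that the hash is HMAC-based, truncated, and scoped to the process.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
)

// EmailHasher holds a key that lives only as long as the process.
type EmailHasher struct{ key []byte }

// NewEmailHasher draws a fresh random key, so hashes from different
// process lifetimes cannot be correlated with each other.
func NewEmailHasher() (*EmailHasher, error) {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		return nil, err
	}
	return &EmailHasher{key: key}, nil
}

// Hash returns a short hex token that is safe to write to logs.
func (h *EmailHasher) Hash(email string) string {
	mac := hmac.New(sha256.New, h.key)
	mac.Write([]byte(email))
	return hex.EncodeToString(mac.Sum(nil)[:8])
}
```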

### 9.6 Cross-references

- Trust-boundary rationale:
  [ARCHITECTURE.md §10](ARCHITECTURE.md#10-geo-profile-reduced),
  [§15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary),
  [§16](ARCHITECTURE.md#16-security-boundaries-summary).
- One-shot registration write vs. per-request counter contract:
  [`backend/README.md` §11](../backend/README.md#11-geo-profile).

---

## 10. Administration

This scenario covers every admin-only operation. Many of them have
been referenced in earlier sections (admin overrides for lobby,
admin-side soft delete and sanctions, mail and notification
inspection); this section is the consolidated view.

### 10.1 Scope

In scope: admin authentication, the cross-domain admin operations,
and their side effects on the rest of the platform.

Out of scope: end-user-driven workflows that share a domain with an
admin operation — those live in their owning section.

### 10.2 Authentication and bootstrap

The admin surface uses HTTP Basic Auth against a backend-owned
admin-account table; passwords are bcrypt-hashed. On startup, if a
bootstrap admin username and password are configured and the table
does not yet contain a row with that username, backend inserts one.
The insert is idempotent: subsequent restarts do nothing.

A failed Basic Auth attempt returns the standard challenge response,
which prompts the operator's tooling for credentials; the realm
string is fixed so an operator's password manager can match it
across deployments.

After the first deployment, the bootstrap password should be
rotated through the admin surface.

### 10.3 Admin account management

Existing admins can list other admins, create new ones, look up a
specific admin, disable or re-enable an admin, and reset an admin's
password. A disabled admin row cannot authenticate; the row is kept
to preserve audit references rather than deleted.

Reset-password takes the new password in the request body. Backend
bcrypt-hashes it, replaces `admin_accounts.password_hash`, and
returns the updated `AdminAccount` shape — the new password itself
is never echoed back. The new value is delivered out-of-band: the
admin who initiates the reset must communicate it to the target
through some channel outside the platform (secure messenger, voice,
etc.); the platform does not e-mail or otherwise auto-deliver it.

### 10.4 User administration

For any user account, an admin can:

- list and inspect accounts;
- apply a sanction;
- apply a per-user limit override that adjusts a specific quota;
- update the entitlement (plan, paid flag, source, validity);
- soft-delete the account (the same in-process cascade as
  [Section 2.4](#24-user-initiated-soft-delete)).

The sanction catalogue is intentionally minimal in the MVP: the
only supported `sanction_code` is `permanent_block`. Applying it
flips `accounts.permanent_block`, revokes every active session
([Section 1.5](#15-revocation)), and runs the same lobby cascade as soft-delete with
membership status `blocked` ([Section 2.4](#24-user-initiated-soft-delete)). The openapi schema
encodes this as a closed enum so future additions are an explicit,
breaking change. Soft-delete always revokes sessions; a sanction
revokes them only when its code documents that side effect (today:
only `permanent_block`).

### 10.5 Game administration

Admins create public games, list and inspect any game, force-start
or force-stop a game, and ban a member. Force-stop tears down the
running engine container for the game; ban-member adds the user to
the game's block list and removes any active membership
([Section 4.4](#44-memberships)).

Public-game ownership is collective: the row carries
`owner_user_id IS NULL` and any admin can act on it. The user
surface never produces or transitions a public game.

### 10.6 Runtime administration

Admins inspect the runtime record for a game, restart the engine
container, patch its image to a newer semver-patch within the same
major / minor line, and force a one-shot extra turn tick.

Patch is intentionally restricted to the patch component. A major
or minor version change requires the explicit stop / start of the
game, not an in-place upgrade. Engine version registration and
disabling are covered in [Section 10.7](#107-engine-version-registry).
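
The patch restriction can be sketched as a version comparison. This is an illustration only: the `Version`, `ParseVersion`, and `CanPatch` names are hypothetical, the parser accepts plain `MAJOR.MINOR.PATCH` with an optional `v` prefix, and pre-release or build metadata is not handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Version is a plain MAJOR.MINOR.PATCH triple.
type Version struct{ Major, Minor, Patch int }

// ParseVersion parses "1.4.2" or "v1.4.2".
func ParseVersion(s string) (Version, error) {
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return Version{}, fmt.Errorf("version %q: want MAJOR.MINOR.PATCH", s)
	}
	var nums [3]int
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return Version{}, fmt.Errorf("version %q: bad component %q", s, p)
		}
		nums[i] = n
	}
	return Version{Major: nums[0], Minor: nums[1], Patch: nums[2]}, nil
}

// CanPatch allows only a strictly newer patch on the same
// major / minor line; anything else needs a stop / start.
func CanPatch(current, target Version) bool {
	return target.Major == current.Major &&
		target.Minor == current.Minor &&
		target.Patch > current.Patch
}
```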

### 10.7 Engine version registry

The engine version registry is the source of allowed engine images.
Producers (start, restart, patch) never pick image references on
their own; they read from the registry. Disabling a version is a
forward-looking decision: existing running containers keep their
current image until a stop / start, but the disabled version is no
longer eligible for new starts or patches.
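
The eligibility rule can be sketched as a lookup that refuses disabled versions. The `Registry` and `EngineVersion` types, the `ImageFor` method, and the image references in the usage are hypothetical; the sketch covers new starts and patches only and says nothing about containers that are already running.

```go
package main

import "fmt"

// EngineVersion is one registry entry.
type EngineVersion struct {
	Image    string
	Disabled bool
}

// Registry maps a version string to its entry.
type Registry map[string]EngineVersion

// ImageFor resolves the image for a new start or patch. Unknown and
// disabled versions are both rejected, so callers never pick an
// image reference on their own.
func (r Registry) ImageFor(version string) (string, error) {
	v, ok := r[version]
	if !ok {
		return "", fmt.Errorf("engine version %q is not registered", version)
	}
	if v.Disabled {
		return "", fmt.Errorf("engine version %q is disabled", version)
	}
	return v.Image, nil
}
```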

### 10.8 Mail and notifications administration

Operators can list and inspect mail deliveries, attempts per
delivery, dead-letters, notifications, notification dead-letters,
and malformed notification intents. They can also resend a non-sent
mail delivery ([Section 8.4](#84-mail-outbox)).

These views are the only path to mail and notification observability
outside of telemetry.

### 10.9 Geo administration

The single geo admin endpoint lists per-user country counters
([Section 9.4](#94-operator-access)). There is no admin write access to geo data; the
declared country is set once at registration and never changes,
counters are populated by the per-request lookup ([Section 9.2](#92-what-backend-records)), and operators can only read.

### 10.10 Cross-references

- Cascade contract for soft delete:
  [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns).
- Container lifecycle and version arbitration:
  [ARCHITECTURE.md §13](ARCHITECTURE.md#13-container-lifecycle-in-process).
- Mail outbox and notification dispatcher:
  [ARCHITECTURE.md §11](ARCHITECTURE.md#11-mail-outbox),
  [§12](ARCHITECTURE.md#12-notification-pipeline) and [Section 8](#8-notifications-and-mail).