Browser fetch-streaming layers close response bodies they consider
idle after roughly 15-30 s without incoming bytes. Safari is the
most aggressive, but the symptom matters everywhere: a quiet
SubscribeEvents stream (lobby, between turns, mailbox empty) gets
torn down by the browser, the EventStream singleton reconnects with
backoff, and any push event that fires inside the reconnect window
is lost because `push.Hub` queues are not persisted across
subscription closes. The user-visible failure mode is the
intermittent "Fetch API cannot load … due to access control checks"
console error (a misleading WebKit symptom — CORS headers are
actually present) plus missed turn-ready / mail-received toasts.

Server-side fix: a silence-based heartbeat at the
`authenticatedPushStreamService` wrapper layer. After the signed
`gateway.server_time` bootstrap event, gateway wraps the bound
stream with `heartbeatingStream`. Every tail Send (fan-out, future
variants) resets the silence timer; when the timer elapses, a
goroutine emits `gateway.heartbeat` with only `EventType` set —
everything else stays at proto3 defaults, so the wire frame is
~45 bytes amortised. A `sendMu` serialises the heartbeat goroutine
with tail Sends because grpc.ServerStream.Send is not goroutine-safe.

The heartbeat is intentionally UNSIGNED: heartbeats carry no
payload, dispatch to no handler on the client, and an injected
heartbeat trivially causes no user-visible state change. TLS still
protects the wire and real events keep the signed envelope
unchanged. Documented in `docs/ARCHITECTURE.md` § 15 alongside the
per-scale bandwidth projection (100…100 000 clients × 15…60 s).

Config: new `GATEWAY_PUSH_HEARTBEAT_INTERVAL` (default `15s`,
`0s` disables). Telemetry: new
`gateway.push.heartbeats_sent{outcome}` counter so operators can
budget bandwidth and spot a sudden `outcome=error` bump as an
upstream-failing-before-flush signal.

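The wrapper described above can be sketched roughly as follows. This is a minimal illustration, not the gateway's code: the `Event` and `sender` types, the `recorder` test double, the `game.turn_ready` event name, and the 30 ms interval are assumptions made for the example, and heartbeat send errors are ignored where the real wrapper reportedly counts them.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Event is a hypothetical stand-in for the gateway's push event message.
type Event struct {
	EventType string
}

// sender is the single method of the server stream this sketch relies on.
type sender interface {
	Send(*Event) error
}

// heartbeatingStream emits a heartbeat whenever `interval` passes without a Send.
type heartbeatingStream struct {
	inner    sender
	interval time.Duration
	sendMu   sync.Mutex // serialises heartbeat sends with tail Sends
	lastSend time.Time  // guarded by sendMu
	stop     chan struct{}
	stopOnce sync.Once
}

// newHeartbeatingStream returns nil when the interval disables heartbeats.
func newHeartbeatingStream(inner sender, interval time.Duration) *heartbeatingStream {
	if interval <= 0 {
		return nil
	}
	h := &heartbeatingStream{inner: inner, interval: interval, lastSend: time.Now(), stop: make(chan struct{})}
	go h.loop()
	return h
}

func (h *heartbeatingStream) loop() {
	timer := time.NewTimer(h.interval)
	defer timer.Stop()
	for {
		select {
		case <-h.stop:
			return
		case <-timer.C:
			h.sendMu.Lock()
			wait := h.interval - time.Since(h.lastSend)
			if wait <= 0 { // a full interval of silence: emit a heartbeat
				_ = h.inner.Send(&Event{EventType: "gateway.heartbeat"})
				h.lastSend = time.Now()
				wait = h.interval
			}
			h.sendMu.Unlock()
			timer.Reset(wait) // safe: the timer channel was just drained
		}
	}
}

// Send forwards a real event and restarts the silence window.
func (h *heartbeatingStream) Send(ev *Event) error {
	h.sendMu.Lock()
	defer h.sendMu.Unlock()
	h.lastSend = time.Now()
	return h.inner.Send(ev)
}

// Stop halts the heartbeat goroutine; it is safe to call more than once.
func (h *heartbeatingStream) Stop() { h.stopOnce.Do(func() { close(h.stop) }) }

// recorder is a test double that remembers the event types it was sent.
type recorder struct {
	mu    sync.Mutex
	types []string
}

func (r *recorder) Send(ev *Event) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.types = append(r.types, ev.EventType)
	return nil
}

// runDemo sends one real event, stays silent, and returns what was sent.
func runDemo() []string {
	rec := &recorder{}
	hs := newHeartbeatingStream(rec, 30*time.Millisecond)
	_ = hs.Send(&Event{EventType: "game.turn_ready"})
	time.Sleep(200 * time.Millisecond) // several silence windows
	hs.Stop()
	rec.mu.Lock()
	defer rec.mu.Unlock()
	return append([]string(nil), rec.types...)
}

func main() {
	fmt.Println(runDemo())
}
```

Tracking the time of the last send under `sendMu`, instead of resetting a shared timer from the sending goroutine, keeps all timer operations inside the heartbeat goroutine.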
Client (`ui/frontend/src/api/events.svelte.ts`): early `continue`
on `event.eventType === "gateway.heartbeat"` before `verifyEvent`,
`verifyPayloadHash`, or dispatch — empty signature would otherwise
trip SignatureError and reconnect. A leading heartbeat still flips
`connectionStatus` to `connected` and resets backoff, because
receiving one is proof the stream is healthy.

Tests:
- `push_heartbeat_test.go`: unit tests for the wrapper — zero
  interval returns nil, heartbeat fires after silence, real Send
  resets the timer, Stop / context-cancel halt the goroutine,
  Send errors propagate.
- `server_test.go`: integration tests through the full gateway
  pipeline — heartbeat fires after the configured silence window,
  zero interval keeps the stream silent.
- `config_test.go`: default applied, env-override parsed,
  negative value rejected.
- `events.test.ts`: heartbeat skipped before verification + not
  dispatched to handlers; leading heartbeat still flips
  `connectionStatus` to `connected`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Galaxy Functional Specification

This document describes what the Galaxy platform does, in terms of
user-visible operations and the per-service logic that implements them.
Each section walks through one domain scenario: who initiates an
operation, what `gateway` checks and forwards, what `backend` validates
and persists, what is returned to the client, and what side effects
fire (mail, push, container ops).

This is the starting point for any change request that touches
behaviour. The exact wire shape, error code vocabulary, environment
variables, default values, throttle limits, table and column names,
and field-level validation live in the lower-level sources:

- [`ARCHITECTURE.md`](ARCHITECTURE.md) — global architecture, security
  model, transport contract.
- `galaxy/<service>/README.md` — service layout, configuration,
  operations.
- `galaxy/<service>/openapi.yaml`, `*.proto` — wire contracts.
- `galaxy/<service>/docs/flows.md` — sequence diagrams.

This file deliberately omits those details. When this file and a
lower-level source disagree, see the synchronisation rule in the
project `CLAUDE.md`.

A Russian translation lives in
[`FUNCTIONAL_ru.md`](FUNCTIONAL_ru.md). It is a convenience mirror for
the project owner, **not a source of truth** — this English file is
authoritative. Every point edit to this file must be mirrored into the
Russian version in the same patch (translate only the touched
paragraphs); a full re-translation happens only on explicit owner
request.

The document is organised by domain scenario, not by HTTP route group.
Public, user-authenticated, and admin operations may all appear in the
same scenario when they participate in the same business flow.

## Table of Contents

1. [Authentication and device session](#1-authentication-and-device-session)
2. [Account management](#2-account-management)
3. [Lobby game lifecycle](#3-lobby-game-lifecycle)
4. [Lobby participation](#4-lobby-participation)
5. [Race Name Directory](#5-race-name-directory)
6. [In-game session](#6-in-game-session)
7. [Push channel](#7-push-channel)
8. [Notifications and mail](#8-notifications-and-mail)
9. [Geo signal](#9-geo-signal)
10. [Administration](#10-administration)
11. [Diplomatic mail](#11-diplomatic-mail)

---

## 1. Authentication and device session

This scenario covers how an anonymous client becomes authenticated and
stays authenticated until a server-side action revokes that authority.

### 1.1 Scope

In scope: issuing an e-mail login challenge, confirming it (with
first-sign-in account creation and registration of the client's
public key), creating a device session, the per-request session
lookup that grounds every authenticated call, and server-initiated
revocation.

Out of scope: the wire envelope and signature scheme used by every
authenticated request — defined once in
[ARCHITECTURE.md §15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary) and reused by every later
section; client-side key storage; how push events are routed inside
gateway to a specific subscriber stream.

### 1.2 Issuing a login challenge

The client posts an e-mail address to the public auth surface on
gateway. The route is unauthenticated — there is no device session
yet to bind to.

Gateway treats this as a stricter "public auth" route class: it
applies per-IP and per-identity (per-email) anti-abuse, a body-size
cap, and a method allow-list, then forwards the request to backend.
Failures of the upstream adapter are projected back to the client
with the same status and error envelope; transport-level failures
become a generic unavailable response.

Backend produces an opaque challenge identifier and emits a
verification e-mail through the durable mail outbox. The response
shape is **identical regardless of whether the e-mail belongs to an
existing account, a fresh account, or a throttled one**, so the
endpoint cannot be used to enumerate accounts.

Branches inside backend:

- **Permanent block.** If the address is permanently blocked at the
  account level, the request is rejected. This is the only
  account-state branch that surfaces a distinct error code; every
  other branch returns the standard challenge-id response.
- **Throttle.** If too many un-consumed, non-expired challenges
  already exist for the same e-mail inside the throttle window,
  backend reuses the latest existing challenge instead of creating a
  new one. The client gets the same response shape and is unaware of
  the reuse.
- **Otherwise.** Backend creates a new challenge with the resolved
  preferred language (derived from the optional `locale` body field
  the caller sends — which takes priority — or, if absent or blank,
  from the `Accept-Language` header forwarded by gateway, falling
  back to a default), and enqueues the auth-mail row directly into
  the outbox in the same transaction. SMTP delivery is asynchronous;
  the auth response returns as soon as the challenge and outbox rows
  are durably committed. The body field is the canonical channel
  because Safari silently drops JS-set `Accept-Language` headers;
  non-Safari clients can still rely on the header alone.

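The three backend branches and the language fallback can be sketched as below. This is an in-memory illustration under stated assumptions: `challengeStore`, `maxOpenChallenges`, the sequential ids, and the count-only throttle (no time window) are hypothetical simplifications, and real challenge ids would be opaque and random.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// maxOpenChallenges is a hypothetical throttle limit.
const maxOpenChallenges = 3

var errPermanentlyBlocked = errors.New("permanently blocked")

// challengeStore is an in-memory stand-in for the challenge table.
type challengeStore struct {
	blocked map[string]bool     // permanently blocked addresses
	open    map[string][]string // un-consumed, non-expired ids per e-mail, oldest first
	nextID  int
}

// resolveLanguage prefers the body `locale`, then Accept-Language, then a default.
func resolveLanguage(bodyLocale, acceptLanguage, fallback string) string {
	if l := strings.TrimSpace(bodyLocale); l != "" {
		return l
	}
	// Take the first listed tag and drop any ";q=" weight.
	first, _, _ := strings.Cut(acceptLanguage, ",")
	tag, _, _ := strings.Cut(first, ";")
	if tag = strings.TrimSpace(tag); tag != "" && tag != "*" {
		return tag
	}
	return fallback
}

// issue returns a challenge id; `created` reports whether a new row was written.
func (s *challengeStore) issue(email string) (id string, created bool, err error) {
	if s.blocked[email] {
		return "", false, errPermanentlyBlocked // the only distinct error branch
	}
	if ids := s.open[email]; len(ids) >= maxOpenChallenges {
		return ids[len(ids)-1], false, nil // throttle: reuse the latest challenge
	}
	s.nextID++
	id = fmt.Sprintf("challenge-%d", s.nextID)
	s.open[email] = append(s.open[email], id)
	return id, true, nil
}

func main() {
	s := &challengeStore{blocked: map[string]bool{}, open: map[string][]string{}}
	for i := 0; i < 5; i++ {
		id, created, _ := s.issue("a@example.com")
		fmt.Println(id, created)
	}
	fmt.Println(resolveLanguage("", "ru-RU,ru;q=0.9", "en"))
}
```

The caller sees the same `(id, nil)` result whether the challenge was created or reused, which mirrors the identical response shape the section requires.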
### 1.3 Confirming the challenge

The client posts the challenge id, the code received by mail, a fresh
Ed25519 public key, and the chosen IANA time zone. Gateway applies
the same public-auth anti-abuse class, with the per-identity bucket
keyed by the challenge id rather than the e-mail. `Accept-Language` is
not consulted on this endpoint — the preferred language was captured
at send-time and is replayed from the challenge row.

Backend validates the challenge under a row lock: it rejects unknown,
expired, or already-consumed ids, increments the attempt counter, and
burns the challenge once the per-challenge attempt ceiling is reached.
After the code matches, backend re-checks the permanent-block flag —
catching the case where an admin applied the block between send and
confirm — and rejects the request when set. On the success path backend
ensures the account exists (synthesising an immutable display handle on
first sign-in only and populating the declared country from the source
IP), then marks the challenge consumed and creates a device session
bound to the caller's public key in the same transaction. The response
carries the new device session id.

A challenge is single-use. A second confirm on the same id returns the
same opaque `invalid_request` shape as confirming an unknown or expired
id; the API deliberately does not differentiate between the three so an
attacker cannot mine challenge state. Throttle reuse on the send side
means a client hitting the throttle gets the latest existing
`challenge_id` back instead of a fresh one, but every id is still
consumed exactly once.

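A rough sketch of the challenge validation follows. It assumes an in-memory map in place of the locked row, a hypothetical ceiling `maxAttempts`, and a placeholder `errCodeMismatch` (the error returned for a wrong code is not specified here); account creation, the block re-check, and session creation are left out.

```go
package main

import (
	"crypto/subtle"
	"errors"
	"fmt"
	"time"
)

const maxAttempts = 3 // hypothetical per-challenge attempt ceiling

var (
	errInvalidRequest = errors.New("invalid_request") // unknown, expired, or consumed
	errCodeMismatch   = errors.New("code mismatch")   // placeholder name
)

type challenge struct {
	code      string
	expiresAt time.Time
	consumed  bool
	attempts  int
}

// confirm validates one attempt; a real backend would hold a row lock here.
func confirm(store map[string]*challenge, id, code string, now time.Time) error {
	ch, ok := store[id]
	if !ok || ch.consumed || !now.Before(ch.expiresAt) {
		return errInvalidRequest // the three cases are deliberately indistinguishable
	}
	ch.attempts++
	if subtle.ConstantTimeCompare([]byte(ch.code), []byte(code)) != 1 {
		if ch.attempts >= maxAttempts {
			ch.consumed = true // burn the challenge at the ceiling
		}
		return errCodeMismatch
	}
	ch.consumed = true // single use
	return nil
}

// demoStore returns two live challenges, one expired challenge, and a clock.
func demoStore() (map[string]*challenge, time.Time) {
	now := time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC)
	return map[string]*challenge{
		"c1": {code: "123456", expiresAt: now.Add(10 * time.Minute)},
		"c2": {code: "654321", expiresAt: now.Add(10 * time.Minute)},
		"c3": {code: "111111", expiresAt: now.Add(-time.Minute)},
	}, now
}

func main() {
	store, now := demoStore()
	fmt.Println(confirm(store, "c1", "123456", now)) // first confirm succeeds
	fmt.Println(confirm(store, "c1", "123456", now)) // second confirm is rejected
	fmt.Println(confirm(store, "c3", "111111", now)) // expired looks the same
}
```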
### 1.4 Per-request session lookup

Once the client holds a device session id and a private key, every
authenticated call is a signed request to gateway over the
authenticated edge listener (Connect / gRPC / gRPC-Web on a single
HTTP/h2c port). Gateway is the only component that ever sees the
request signature; backend trusts gateway's verdict.

Gateway needs the session's public key to verify the signature, so each
authenticated request resolves the device session through an in-memory
LRU cache (bounded entry count plus a safety-net TTL). On miss the
cache calls backend's per-request session lookup endpoint and seeds the
entry. Gateway rejects the request when the cache returns "session
unknown" or "revoked"; otherwise it verifies the envelope per
[ARCHITECTURE.md §15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary) and forwards the verified
payload to backend over plain REST, injecting the resolved user id in a
header. Backend never re-derives identity from the request body.

Backend updates `last_seen_at` on the session row on every successful
lookup so admin operators can observe when each cached session was
last resolved at the edge. The update is part of the lookup
transaction; failures are logged but do not surface to the caller.

The cache is invalidated through the push channel rather than a
periodic refresh: a `session_invalidation` event flips the cached
entry's status to revoked, so subsequent requests bound to the
session are rejected without another backend round-trip. The TTL is
the safety net for missed events (cursor aged out, gateway restart) —
in steady state the push events are the authoritative source of
invalidation.

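The cache behaviour described above (bounded LRU, safety-net TTL, push-driven flip to revoked) might look roughly like this. All type and function names are hypothetical, the backend lookup is a plain callback, and the sketch is not safe for concurrent use; a real gateway cache would need locking.

```go
package main

import (
	"container/list"
	"fmt"
	"time"
)

type sessionStatus int

const (
	statusActive sessionStatus = iota
	statusRevoked
)

type session struct {
	id        string
	publicKey string
	status    sessionStatus
	fetchedAt time.Time
}

// lookupFunc stands in for backend's per-request session lookup endpoint.
type lookupFunc func(id string) (publicKey string, status sessionStatus, found bool)

type sessionCache struct {
	capacity int
	ttl      time.Duration
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // session id -> element holding *session
	lookup   lookupFunc
}

func newSessionCache(capacity int, ttl time.Duration, lookup lookupFunc) *sessionCache {
	return &sessionCache{capacity: capacity, ttl: ttl, order: list.New(),
		items: map[string]*list.Element{}, lookup: lookup}
}

// resolve returns the cached session, calling backend only on miss or TTL expiry.
func (c *sessionCache) resolve(id string) (session, bool) {
	if el, ok := c.items[id]; ok {
		s := el.Value.(*session)
		if time.Since(s.fetchedAt) < c.ttl {
			c.order.MoveToFront(el)
			return *s, true
		}
		c.order.Remove(el) // safety-net TTL elapsed: drop and refetch
		delete(c.items, id)
	}
	key, status, found := c.lookup(id)
	if !found {
		return session{}, false
	}
	s := &session{id: id, publicKey: key, status: status, fetchedAt: time.Now()}
	c.items[id] = c.order.PushFront(s)
	if c.order.Len() > c.capacity { // evict the least recently used entry
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*session).id)
	}
	return *s, true
}

// invalidate applies a session_invalidation push event to the cached entry.
func (c *sessionCache) invalidate(id string) {
	if el, ok := c.items[id]; ok {
		el.Value.(*session).status = statusRevoked
	}
}

// demo reports backend calls after cache hits, the post-push status, and
// backend calls after an eviction forces a refetch.
func demo() (hitCalls int, revoked bool, evictionCalls int) {
	calls := 0
	c := newSessionCache(2, time.Minute, func(id string) (string, sessionStatus, bool) {
		calls++
		return "pk-" + id, statusActive, true
	})
	c.resolve("s1")
	c.resolve("s1") // served from cache
	c.invalidate("s1")
	s, _ := c.resolve("s1") // still no backend round-trip
	hitCalls, revoked = calls, s.status == statusRevoked
	c.resolve("s2")
	c.resolve("s3") // capacity 2: evicts s1
	c.resolve("s1") // miss, refetched from backend
	return hitCalls, revoked, calls
}

func main() {
	fmt.Println(demo())
}
```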
### 1.5 Revocation

Revocation makes a device session unable to authenticate any future
request and forces in-flight push streams bound to it to close.
Triggers fall in two groups.

**User-driven (logout).** The user surface exposes three operations:
list the caller's active sessions, revoke a single one, and revoke
all of them. Gateway forwards these to backend as ordinary
authenticated requests. Backend verifies the target session belongs
to the caller (otherwise responds with the same shape as a missing
session, so foreign session ids are not probeable), atomically flips
`device_sessions.status` to `revoked` and inserts a row into
`session_revocations`, then publishes one `session_invalidation`
event per revoked session.

**Admin-driven and lifecycle.** Sanctions that imply session
revocation (currently `permanent_block`), admin-driven soft delete,
and user-self soft delete all run an in-process call inside backend.
The same atomic UPDATE + audit-insert + push emission applies; the
audit row carries a different `actor_kind`
(`admin_sanction` / `soft_delete_admin` / `soft_delete_user`).

Once backend has emitted the push event, gateway flips the cached
session entry to revoked and closes any active push streams bound to
it. The per-request internal lookup against backend remains the
durable safety net: if a push event is lost, the next lookup (after
the cache TTL) returns the revoked record.

`session_revocations` is the audit ledger. Each row carries
`revocation_id`, `device_session_id`, `user_id`, `actor_kind`, the
actor pair (`actor_user_id` for user-driven kinds, `actor_username`
for admin-driven kinds — exactly one is non-NULL per row), `reason`,
and `revoked_at`. Operators can query it to answer "who and why
revoked this session"; the table is append-only.

Backend's `/api/v1/internal/sessions/{id}` is read-only — it carries
the per-request session lookup gateway needs to verify signed
envelopes. Internal revoke endpoints no longer exist; revoke is
either user-driven (through the user surface) or admin-driven
(through in-process calls inside backend).

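Two details of the user-driven path lend themselves to a small sketch: the non-probeable ownership check and the "exactly one actor column" invariant of the audit row. The types below are hypothetical, the `user_logout` actor kind and the `logout` reason are invented for the example (the user-driven kind is not named in this section), and the transaction and push emission are omitted.

```go
package main

import (
	"errors"
	"fmt"
)

var errSessionNotFound = errors.New("session not found")

type deviceSession struct {
	id, userID, status string
}

// revocation mirrors a subset of the audit columns named in the text.
type revocation struct {
	deviceSessionID string
	userID          string
	actorKind       string
	actorUserID     *string // set for user-driven kinds
	actorUsername   *string // set for admin-driven kinds
	reason          string
}

// valid checks that exactly one of the two actor columns is non-NULL.
func (r revocation) valid() bool {
	return (r.actorUserID != nil) != (r.actorUsername != nil)
}

// revokeOwn revokes a session only when it belongs to the caller. A foreign
// session yields the same error as a missing one, so ids cannot be probed.
func revokeOwn(sessions map[string]*deviceSession, callerID, sessionID string) (revocation, error) {
	s, ok := sessions[sessionID]
	if !ok || s.userID != callerID {
		return revocation{}, errSessionNotFound
	}
	s.status = "revoked"
	return revocation{
		deviceSessionID: s.id,
		userID:          s.userID,
		actorKind:       "user_logout", // hypothetical kind name
		actorUserID:     &callerID,
		reason:          "logout", // hypothetical reason
	}, nil
}

func main() {
	sessions := map[string]*deviceSession{
		"s1": {id: "s1", userID: "alice", status: "active"},
		"s2": {id: "s2", userID: "bob", status: "active"},
	}
	_, err := revokeOwn(sessions, "alice", "s2")
	fmt.Println(err, sessions["s2"].status)
	row, err := revokeOwn(sessions, "alice", "s1")
	fmt.Println(err, sessions["s1"].status, row.valid())
}
```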
### 1.6 Cross-references

- Wire envelope, signing, freshness window, anti-replay:
  [ARCHITECTURE.md §15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary).
- Backend module responsibilities for `auth`, `user`, `geo`, `mail`,
  `push`: [ARCHITECTURE.md §4](ARCHITECTURE.md#4-backend-domain-modules) and
  `backend/README.md`.
- Mail outbox semantics for the auth login-code template:
  [ARCHITECTURE.md §11](ARCHITECTURE.md#11-mail-outbox).
- Push channel framing and reconnect rules:
  [ARCHITECTURE.md §8](ARCHITECTURE.md#8-backend--gateway-communication). User-facing push semantics
  appear in [Section 7](#7-push-channel) of this document.

---

## 2. Account management

This scenario covers what an authenticated user can read or change
about their own account, and how a user removes the account.

### 2.1 Scope

In scope: reading the account aggregate, updating the mutable profile
slice, updating settings (preferred language, time zone), and
user-initiated soft delete.

Out of scope: admin-side mutation of the same account (sanctions,
limits, entitlement changes, admin soft delete) — covered in
[Section 10](#10-administration). Permanent block flag toggling is admin-only.

### 2.2 The account aggregate

Backend exposes a single read endpoint that returns the caller's
account aggregate: the durable identifying fields (immutable display
handle, e-mail), the mutable profile and settings slices, the
current entitlement snapshot, and any active sanctions and per-user
limit overrides. The aggregate is the authoritative client-side view
of "what the platform knows about me".

The display handle is synthesised at first sign-in ([Section 1.3](#13-confirming-the-challenge)) and
is never overwritten on subsequent sign-ins or on profile updates.
Clients should treat it as a stable identifier rather than a display
preference.

### 2.3 Profile and settings updates

Two distinct mutating endpoints split user-controlled fields by the
nature of the change. Both follow PATCH semantics — omitted fields
are not touched, present fields replace the stored value — and both
return the updated aggregate.

Profile carries one display-oriented field: `display_name`. An
explicit empty value clears the stored name; omitting the field
leaves it untouched.

Settings carries locale and timezone preferences:
`preferred_language` (BCP 47 tag) and `time_zone` (IANA identifier).
Both must be non-empty after trim when present; the timezone is
validated against the IANA database before commit.

`declared_country` is **not** part of either patch. Backend writes it
once at registration from the source IP ([Section 9](#9-geo-signal)) and treats it as
immutable thereafter; there is no user-facing path to change it.

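One common way to express these PATCH semantics in Go is a struct of pointer fields, where `nil` means "omitted". The sketch below applies that idea to the settings slice; the type names are hypothetical, storing the trimmed value is an assumption, and `time.LoadLocation` (with the embedded `time/tzdata` database) stands in for whatever IANA validation backend actually performs.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
	_ "time/tzdata" // embed the IANA database so validation works everywhere
)

type settings struct {
	PreferredLanguage string
	TimeZone          string
}

// settingsPatch uses pointers so an omitted field (nil) differs from a present one.
type settingsPatch struct {
	PreferredLanguage *string `json:"preferred_language"`
	TimeZone          *string `json:"time_zone"`
}

// applySettingsPatch returns the updated settings or leaves them unchanged on error.
func applySettingsPatch(cur settings, body []byte) (settings, error) {
	var p settingsPatch
	if err := json.Unmarshal(body, &p); err != nil {
		return cur, err
	}
	next := cur
	if p.PreferredLanguage != nil {
		v := strings.TrimSpace(*p.PreferredLanguage)
		if v == "" {
			return cur, errors.New("preferred_language must not be empty")
		}
		next.PreferredLanguage = v
	}
	if p.TimeZone != nil {
		v := strings.TrimSpace(*p.TimeZone)
		if v == "" {
			return cur, errors.New("time_zone must not be empty")
		}
		if _, err := time.LoadLocation(v); err != nil {
			return cur, fmt.Errorf("time_zone: %w", err)
		}
		next.TimeZone = v
	}
	return next, nil
}

func main() {
	cur := settings{PreferredLanguage: "en", TimeZone: "UTC"}
	next, err := applySettingsPatch(cur, []byte(`{"time_zone":"Europe/Berlin"}`))
	fmt.Println(next, err)
	_, err = applySettingsPatch(cur, []byte(`{"time_zone":"Mars/Olympus"}`))
	fmt.Println(err)
}
```

Note that `time.LoadLocation` also accepts the special name `Local`, which a production validator would probably want to reject explicitly.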
### 2.4 User-initiated soft delete

The user can ask backend to soft-delete their own account. Backend
marks the account row deleted, then runs the in-process cascade
documented in [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns). Concretely:

- Every device session for the user is revoked ([Section 1.5](#15-revocation)), with
  one audit row per session and one `session_invalidation` push
  event per session.
- Active memberships flip to `removed` (admin-driven block flips
  them to `blocked`); pending applications get `rejected`; incoming
  invites get `declined`; outgoing invites get `revoked`.
- Race name entries owned by the user — registered, reservation, or
  pending_registration — are deleted in a single cascade write.
- Owned games in non-running statuses (`draft`, `enrollment_open`,
  `ready_to_start`, `start_failed`, `paused`) are cancelled. Owned
  games already in `running` are **not** cancelled by the cascade —
  the engine container keeps producing turns until it finishes
  naturally; only the membership cleanup detaches the user.
- A single `lobby.membership.removed` notification fans out to the
  user with `reason=removed` (or `reason=blocked` for the admin
  block path).

The endpoint returns no body. The cascade is best-effort within a
single process: if a downstream module fails, the failure is logged
but the account stays marked deleted.

### 2.5 Cross-references

- Admin-side counterparts (sanction, limit, entitlement, soft delete):
  [Section 10](#10-administration).
- The cascade contract for "user blocked / user deleted":
  [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns).
- Notification kinds emitted during the cascade:
  [`backend/README.md` §10](../backend/README.md#10-notification-catalog).

---

## 3. Lobby game lifecycle

This scenario covers a single game's life from creation to terminal
state. [Section 4](#4-lobby-participation) covers how players join an existing game; this
section focuses on the game itself.

### 3.1 Scope

In scope: creating a game (private vs public), updating its mutable
configuration, transitioning it through the lobby state machine,
cancellation, retry of a failed start, and the terminal transitions
(`finished`, `cancelled`).

Out of scope: applications, invites, memberships ([Section 4](#4-lobby-participation)), Race
Name Directory promotions on finish ([Section 5](#5-race-name-directory)), engine commands
during the running phase ([Section 6](#6-in-game-session)).

### 3.2 The state machine

The lobby state machine is the closed graph documented in
[ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns):

```text
draft → enrollment_open → ready_to_start → starting → running ↔ paused → finished
                                           ↳ start_failed → ready_to_start (retry)
cancelled is reachable from every pre-finished state.
```

Two ground rules:

- **Ownership decides the surface.** Private games carry a
  `owner_user_id`; transitions are driven by the owner through the
  user surface. Public games are owned collectively by administrators
  (`owner_user_id IS NULL`); their transitions and configuration
  changes go through the admin surface.
- **The runtime callback owns the exit from `starting`.**
  `starting → running` and `starting → start_failed` are the only
  transitions that the runtime module produces, after the engine
  container is fully up or has confirmed failure. Every other
  transition is a user or admin action.

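The graph above translates naturally into a transition table. The sketch below is one reading of the diagram, not the backend's implementation: it assumes `finished` is entered from `running` only (whether `paused → finished` is also legal is not stated here), and the function names are hypothetical.

```go
package main

import "fmt"

type state string

const (
	draft          state = "draft"
	enrollmentOpen state = "enrollment_open"
	readyToStart   state = "ready_to_start"
	starting       state = "starting"
	startFailed    state = "start_failed"
	running        state = "running"
	paused         state = "paused"
	finished       state = "finished"
	cancelled      state = "cancelled"
)

// next lists the non-cancel transitions read off the diagram.
var next = map[state][]state{
	draft:          {enrollmentOpen},
	enrollmentOpen: {readyToStart},
	readyToStart:   {starting},
	starting:       {running, startFailed},
	startFailed:    {readyToStart}, // retry
	running:        {paused, finished},
	paused:         {running},
}

// canTransition reports whether the graph allows from -> to.
func canTransition(from, to state) bool {
	if to == cancelled {
		// cancelled is reachable from every pre-finished state
		return from != finished && from != cancelled
	}
	for _, s := range next[from] {
		if s == to {
			return true
		}
	}
	return false
}

// runtimeOwned reports the transitions only the runtime callback may produce.
func runtimeOwned(from, to state) bool {
	return from == starting && (to == running || to == startFailed)
}

func main() {
	fmt.Println(canTransition(draft, enrollmentOpen)) // true
	fmt.Println(canTransition(draft, running))        // false
	fmt.Println(canTransition(paused, cancelled))     // true
	fmt.Println(canTransition(finished, cancelled))   // false
	fmt.Println(runtimeOwned(starting, startFailed))  // true
}
```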
### 3.3 Creation

A user creates a private game through the user surface. Backend
records the new game with `owner_user_id` set to the caller and
visibility `private`, in state `draft`, with the request body's
configuration as initial values.

Public games are created exclusively through the admin surface
([Section 10](#10-administration)). The user surface never produces a public game; this
asymmetry is enforced in backend, not at the route level.

### 3.4 Forward transitions

Owners drive forward transitions via dedicated endpoints
(`open-enrollment`, `ready-to-start`, `start`, `pause`, `resume`,
`retry-start`). Each endpoint:

- checks ownership of the game (or admin scope for public games);
- checks the source state matches the transition's precondition,
  rejecting with a conflict if not;
- updates the lobby record and publishes any user-facing
  notifications attached to the transition.

`start` queues a runtime job (long-running container pull / start /
init) and immediately returns "queued". Final state movement
(`starting → running` or `starting → start_failed`) arrives later
through the runtime callback. `retry-start` re-arms a `start_failed`
game back to `ready_to_start` and lets the owner trigger `start`
again.

`pause` and `resume` flip between `running` and `paused`. The
running engine container is not torn down on pause; only the lobby
schedule and command-acceptance flags change.

`ready-to-start` is always an explicit owner (or admin) action,
never auto-fired. The transition checks that the approved member
count is at least `min_players` and rejects with a conflict
otherwise.

### 3.5 Cancellation and finish

`cancel` is reachable from every pre-finished state. Owners can
cancel their own games; admins can cancel any. Cancellation
reconciles outstanding applications, invites, and memberships; it
does not promote race-name reservations.

`finished` is produced inside backend after the engine reports the
game finished. The transition tears down the engine container,
freezes the lobby record, and triggers Race Name Directory
promotions for capable finishes ([Section 5](#5-race-name-directory)). Both terminal states
are absorbing.

### 3.6 Admin overrides

Administrators can `force-start`, `force-stop`, and `ban-member` on
any game (public or private) regardless of state. `force-stop`
transitions the game to a stopped state and tears down the engine
container; `ban-member` removes a membership and prevents the user
from re-joining ([Section 4](#4-lobby-participation)).

### 3.7 Cross-references

- State machine vocabulary and transition rules:
  [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns).
- Runtime job lifecycle (the asynchronous work behind `start`):
  [ARCHITECTURE.md §13](ARCHITECTURE.md#13-container-lifecycle-in-process) and `backend/docs/flows.md`.
- Public-vs-private invariants and the partial index that supports
  them: [ARCHITECTURE.md §4](ARCHITECTURE.md#4-backend-domain-modules).

---

## 4. Lobby participation

This scenario covers everything around joining and leaving an
existing game: applications (public), invites (private), and
memberships (after the join succeeds).

### 4.1 Scope

In scope: submitting an application to a public game, owner / admin
approval or rejection of an application, issuing and redeeming
invites, recipient decline and issuer revocation, listing
memberships per game, and member removal or block.

Out of scope: the game state machine itself ([Section 3](#3-lobby-game-lifecycle)) and the
in-game commands once a member is playing ([Section 6](#6-in-game-session)).

### 4.2 Applications (public games)

A user submits an application to a game by id. Applications are
**only accepted on public games**; an attempt against a private game
is rejected with a conflict. The game must additionally be in
`enrollment_open` (the only enrolment-accepting state for
applications). Backend also rejects the request if the user is
already a member or on the game's block list (via `ban-member`).
Otherwise it stores the application as `pending` and emits a
notification to the admin channel.

The owner — or an administrator for public games — approves or
rejects the application through dedicated endpoints. Approval
creates a membership for the applicant and emits the corresponding
notification. Rejection just records the terminal state; no
membership appears.

### 4.3 Invites (private games)

Invites are **only accepted on private games**; an attempt to issue
one for a public game is rejected with a conflict. The owner issues
an invite while the game is in `draft`, `enrollment_open`, or
`ready_to_start`.

Two flavours coexist:

- **User-bound** — `invited_user_id` is set; only that user may
  redeem. A `lobby.invite.received` notification is emitted to the
  recipient.
- **Code-based** — `invited_user_id` is empty; backend mints a hex
  code at issue time and any caller who knows the code may redeem.
  No notification is emitted at issue time (no recipient is bound
  yet).

Each invite carries an expiry (defaulted from configuration when
the body omits `expires_at`). The recipient redeems (creates a
membership) or declines; the issuer can revoke an outstanding
invite at any time before redemption.

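The two invite flavours and the default expiry can be illustrated with a short sketch. Everything specific here is an assumption: the 72-hour default, the 16-byte code length, the error texts, and the function names are placeholders, and notification, decline, and revoke handling are omitted.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"errors"
	"fmt"
	"time"
)

const defaultInviteTTL = 72 * time.Hour // hypothetical configuration default

var demoNow = time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC) // fixed clock for examples

type invite struct {
	invitedUserID string // empty for code-based invites
	code          string // empty for user-bound invites
	expiresAt     time.Time
}

// issueInvite creates a user-bound invite when invitedUserID is set and a
// code-based invite otherwise. A nil expiresAt selects the default TTL.
func issueInvite(gameState string, private bool, invitedUserID string, expiresAt *time.Time, now time.Time) (invite, error) {
	if !private {
		return invite{}, errors.New("conflict: invites are only accepted on private games")
	}
	switch gameState {
	case "draft", "enrollment_open", "ready_to_start":
	default:
		return invite{}, errors.New("conflict: game state does not accept invites")
	}
	inv := invite{invitedUserID: invitedUserID, expiresAt: now.Add(defaultInviteTTL)}
	if expiresAt != nil {
		inv.expiresAt = *expiresAt
	}
	if invitedUserID == "" {
		buf := make([]byte, 16) // code length is an assumption
		if _, err := rand.Read(buf); err != nil {
			return invite{}, err
		}
		inv.code = hex.EncodeToString(buf)
	}
	return inv, nil
}

// redeemableBy reports whether the caller may redeem the invite right now.
func (inv invite) redeemableBy(callerID, code string, now time.Time) bool {
	if !now.Before(inv.expiresAt) {
		return false
	}
	if inv.invitedUserID != "" {
		return callerID == inv.invitedUserID
	}
	return code != "" && subtle.ConstantTimeCompare([]byte(code), []byte(inv.code)) == 1
}

func main() {
	byCode, _ := issueInvite("draft", true, "", nil, demoNow)
	fmt.Println(len(byCode.code), byCode.redeemableBy("anyone", byCode.code, demoNow))
	bound, _ := issueInvite("enrollment_open", true, "u1", nil, demoNow)
	fmt.Println(bound.redeemableBy("u1", "", demoNow), bound.redeemableBy("u2", "", demoNow))
}
```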
### 4.4 Memberships

Memberships list the players currently attached to a game. Owners
can remove or block a member; a member can also remove themselves.
Removal terminates participation cleanly; block additionally
prevents the same user from re-applying or redeeming a future
invite for the same game.

The admin surface offers `ban-member` as the cross-game-policy
counterpart to the owner's block.

### 4.5 Listing the caller's view

The user surface exposes three "my" listings (games, applications,
invites). They project the caller's involvement across all games
without requiring the client to know game ids in advance, which
makes the dashboard and inbox views possible.

### 4.6 Notifications

Every state change in this scenario emits a notification kind from
the catalog: `lobby.invite.received`, `lobby.invite.revoked`,
`lobby.application.submitted`, `lobby.application.approved`,
`lobby.application.rejected`, `lobby.membership.removed`,
`lobby.membership.blocked`. [Section 8](#8-notifications-and-mail) documents the fan-out.

### 4.7 Cross-references

- Game lifecycle: [Section 3](#3-lobby-game-lifecycle).
- Notification catalog and fan-out: [Section 8](#8-notifications-and-mail) and
  [`backend/README.md` §10](../backend/README.md#10-notification-catalog).

---

## 5. Race Name Directory

This scenario covers how a player picks the name of their in-game
race and, eventually, gets that name registered platform-wide.

### 5.1 Scope

In scope: the three-tier directory (registered, reservation,
pending_registration), promotion through "capable finish",
user-driven promotion of a pending registration to registered,
sweeper-driven release on TTL expiry, and uniqueness through the
canonical-key model.

Out of scope: how the engine actually consumes the chosen name —
that lives in [Section 6](#6-in-game-session).

### 5.2 Three tiers

- **Registered** is platform-unique. A canonical key has at most one
  live binding to a single user.
- **Reservation** is per-game. The same canonical key can be
  reserved by the same user across several active games at the same
  time, but two different users cannot reserve the same canonical
  key in the same game.
- **Pending registration** is the transient tier between
  reservation and registered. It is issued automatically after a
  "capable finish" (the game ended with the player having grown
  their initial planet count and population), and it gives the user
  a bounded window to convert the reservation into a permanent
  registration.

### 5.3 Canonicalisation

Every name (typed by a user or registered by the platform) is
folded into a canonical key. Canonicalisation is confusable-aware
(latin-cyrillic look-alikes, digit-letter substitutions) and is
applied uniformly across the directory; uniqueness is enforced on
the canonical key, not on the displayed name. Cross-tier conflicts
on the same canonical key are blocked at write time through a
per-canonical advisory lock.

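To make the folding idea concrete, here is a toy canonicaliser. The `confusables` table is a tiny illustrative subset invented for this example; the platform's real canonicalisation library, its table, and its normalisation steps are not shown in this document, and the advisory lock is not modelled.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// confusables maps look-alike runes onto one representative. Keys are
// lower-case because folding happens after unicode.ToLower.
var confusables = map[rune]rune{
	'\u0430': 'a', // Cyrillic a
	'\u0435': 'e', // Cyrillic ie
	'\u043e': 'o', // Cyrillic o
	'\u0440': 'p', // Cyrillic er
	'\u0441': 'c', // Cyrillic es
	'\u0445': 'x', // Cyrillic ha
	'\u0443': 'y', // Cyrillic u
	'\u043a': 'k', // Cyrillic ka
	'0':      'o', // digit-letter substitutions
	'1':      'l',
	'3':      'e',
	'5':      's',
}

// canonicalKey folds a displayed name into the key uniqueness is enforced on.
func canonicalKey(name string) string {
	var b strings.Builder
	for _, r := range strings.TrimSpace(name) {
		r = unicode.ToLower(r)
		if folded, ok := confusables[r]; ok {
			r = folded
		}
		b.WriteRune(r)
	}
	return b.String()
}

// registry is a stand-in for the registered tier: canonical key -> user id.
type registry map[string]string

// claim binds the name to the user unless another user holds the same key.
func (reg registry) claim(userID, name string) bool {
	key := canonicalKey(name)
	if owner, taken := reg[key]; taken && owner != userID {
		return false
	}
	reg[key] = userID
	return true
}

func main() {
	reg := registry{}
	fmt.Println(reg.claim("u1", "Zorg"))       // true
	fmt.Println(reg.claim("u2", "Z0rg"))       // false: digit look-alike
	fmt.Println(reg.claim("u2", "Z\u041erg"))  // false: Cyrillic look-alike
	fmt.Println(canonicalKey("  Z\u041eRG  ")) // zorg
}
```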
### 5.4 Promotion path

A reservation appears when a player names their race during a game.
When the game finishes capably, backend automatically converts the
reservation into a pending_registration with a TTL. While the
pending entry is alive, the user can call the registration endpoint
to promote the entry to `registered`. If the TTL expires first, a
periodic sweeper releases the entry; the canonical key becomes
available again.

A pending registration can be claimed only by the user who earned
it; backend rejects an attempt by a different user even if the
canonical key matches.

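The pending-registration lifecycle can be sketched with an in-memory directory. The 30-day `pendingTTL`, the fixed clock `t0`, and all names are assumptions for illustration; the reservation tier, notifications, and locking are left out.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"time"
)

const pendingTTL = 30 * 24 * time.Hour // hypothetical claim window

var t0 = time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC) // fixed clock for the demo

type pendingEntry struct {
	userID    string
	expiresAt time.Time
}

type directory struct {
	pending    map[string]pendingEntry // canonical key -> pending registration
	registered map[string]string       // canonical key -> owning user id
}

func newDirectory() *directory {
	return &directory{pending: map[string]pendingEntry{}, registered: map[string]string{}}
}

// onCapableFinish converts a reservation into a pending registration with a TTL.
func (d *directory) onCapableFinish(key, userID string, now time.Time) {
	d.pending[key] = pendingEntry{userID: userID, expiresAt: now.Add(pendingTTL)}
}

// register promotes a live pending entry, but only for the user who earned it.
func (d *directory) register(key, userID string, now time.Time) error {
	p, ok := d.pending[key]
	if !ok || !now.Before(p.expiresAt) {
		return errors.New("no live pending registration")
	}
	if p.userID != userID {
		return errors.New("pending registration belongs to another user")
	}
	delete(d.pending, key)
	d.registered[key] = userID
	return nil
}

// sweep releases expired pending entries and returns the freed keys, sorted.
func (d *directory) sweep(now time.Time) []string {
	var released []string
	for key, p := range d.pending {
		if !now.Before(p.expiresAt) {
			delete(d.pending, key)
			released = append(released, key)
		}
	}
	sort.Strings(released)
	return released
}

func main() {
	d := newDirectory()
	d.onCapableFinish("zorg", "u1", t0)
	fmt.Println(d.register("zorg", "u2", t0)) // rejected: different user
	fmt.Println(d.register("zorg", "u1", t0)) // <nil>
	d.onCapableFinish("vex", "u3", t0)
	fmt.Println(d.sweep(t0.Add(pendingTTL))) // [vex]
}
```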
### 5.5 Notifications

The directory emits `lobby.race_name.registered`,
`lobby.race_name.pending`, and `lobby.race_name.expired` to the
owning user. [Section 8](#8-notifications-and-mail) covers fan-out.

### 5.6 Cross-references

- Canonicalisation library and glossary entries
  ("canonical key", "capable finish"):
  [ARCHITECTURE.md §19](ARCHITECTURE.md#19-glossary).
- The promotion trigger inside the lobby module:
  [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns) (`lobby.OnGameFinished`)
  and `backend/docs/flows.md`.

---

## 6. In-game session

This scenario covers what an active player does while a game is
running: submit commands and orders, read turn reports.

### 6.1 Scope

In scope: command submission, order submission, report reading, and
the turn-cutoff behaviour that closes the command window during
generation.

Out of scope: how the engine container itself is started, scheduled,
or stopped — those are runtime concerns covered in [Section 3](#3-lobby-game-lifecycle) (start
/ stop) and [Section 10](#10-administration) (admin runtime overrides). The wire format of
commands, orders, and reports is the engine's own contract and is
not duplicated here.

### 6.2 Backend's role: pass-through with authorisation

The signed authenticated-edge pipeline for in-game traffic uses four
message types on the authenticated surface (`user.games.command`,
`user.games.order`, `user.games.order.get`, `user.games.report`),
each with a typed FlatBuffers payload. Gateway transcodes the FB
request into the JSON shape backend expects, forwards over plain
REST to the corresponding `/api/v1/user/games/{game_id}/*` endpoint,
then transcodes the JSON response back into FB before signing the
reply. `user.games.order.get` is the read-back companion to
`user.games.order`: clients use it to hydrate the local order draft
after a cache loss (fresh install, cleared storage, new device).

For every in-game endpoint the user surface acts as an authorised
pass-through to the engine container. Backend:

- verifies the caller is an active member of the target game and
  that the game is in a state that accepts the operation;
- rebinds the actor field in the body to the caller's race name from
  the runtime player mapping (clients never supply a trusted actor);
- resolves the engine endpoint (the running container for the
  `game_id`) and forwards the call;
- returns the engine's response payload back to the client without
  re-interpretation.

Backend does not parse command or order payload contents beyond
what authorisation requires. The engine is the source of truth for
validity and ordering of in-game decisions. Gateway needs to know
the typed FB shape only to transcode the wire format; the per-command
semantics live in the engine.

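The actor-rebinding step can be illustrated with a minimal sketch. It assumes the body is a JSON object and that the field is literally named `actor`; both are assumptions made for the example, and `rebindActor` is a hypothetical helper rather than a backend function. Every other field is carried as raw bytes, in line with the rule that backend does not interpret the payload.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rebindActor overwrites the "actor" field (assumed name) with the race name
// resolved server-side and leaves every other field untouched.
func rebindActor(body []byte, raceName string) ([]byte, error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(body, &fields); err != nil {
		return nil, err
	}
	if fields == nil { // body was the JSON literal null
		fields = map[string]json.RawMessage{}
	}
	actor, err := json.Marshal(raceName)
	if err != nil {
		return nil, err
	}
	fields["actor"] = actor
	return json.Marshal(fields)
}

func main() {
	out, err := rebindActor([]byte(`{"actor":"Spoofed","cmd":"noop"}`), "Zerg")
	fmt.Println(string(out), err) // {"actor":"Zerg","cmd":"noop"} <nil>
}
```
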
### 6.3 Turn cutoff and auto-pause

A running game continuously alternates between a command-accepting
window and a generation phase, driven by the cron expression stored
in `runtime_records.turn_schedule`. The backend scheduler
(`backend/internal/runtime/scheduler.go`) wraps each engine
`/admin/turn` call between two `runtime_status` flips:

- Before the engine call: `running → generation_in_progress`.
  The user-games command/order handlers
  (`backend/internal/server/handlers_user_games.go`) consult the
  per-game runtime record on every request and reject with
  HTTP 409 + `code = turn_already_closed` while the runtime sits in
  `generation_in_progress`. The error envelope mirrors backend's
  standard `httperr` shape:
  `{"error": {"code": "turn_already_closed", "message": "..."}}`.
- After a successful tick: `generation_in_progress → running`.
  The order window re-opens for the new turn and the next
  scheduled tick continues normally.
- After a failed tick (`engine_unreachable` /
  `generation_failed`): the lobby's `OnRuntimeSnapshot` flips the
  game from `running` to `paused` and publishes a `game.paused`
  push event (see §6.6). The order handlers reject with HTTP 409
  + `code = game_paused` until an admin resume succeeds.

`force-next-turn` (admin) schedules a one-shot extra tick that
advances the next scheduled turn by one cron step; the same
status-flip and rejection rules apply.

Clients distinguish the two rejections by `code`:
`turn_already_closed` means "wait for the next `game.turn.ready`
and resubmit", whereas `game_paused` means "wait for an admin
resume". The web client implements both reactions in
`ui/docs/sync-protocol.md`.

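A client-side reading of the two rejection codes might look like the sketch below. It is written in Go for consistency with the backend paths quoted above; the real web client is not Go, and `nextStep` together with its returned labels is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// errorEnvelope matches the httperr shape quoted in this section.
type errorEnvelope struct {
	Error struct {
		Code    string `json:"code"`
		Message string `json:"message"`
	} `json:"error"`
}

// nextStep turns the body of an HTTP 409 response into the reaction the
// text prescribes. The returned strings are illustrative labels only.
func nextStep(body []byte) (string, error) {
	var env errorEnvelope
	if err := json.Unmarshal(body, &env); err != nil {
		return "", err
	}
	switch env.Error.Code {
	case "turn_already_closed":
		return "wait for game.turn.ready, then resubmit", nil
	case "game_paused":
		return "wait for an admin resume", nil
	default:
		return "", fmt.Errorf("unhandled rejection code %q", env.Error.Code)
	}
}

func main() {
	body := []byte(`{"error": {"code": "turn_already_closed", "message": "..."}}`)
	fmt.Println(nextStep(body))
}
```
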
### 6.4 Reports

Per-turn reports are read-only views fetched from the engine on
demand. Backend authorises the caller and forwards the request;
there is no caching or denormalisation in this path.

The web client renders the report as one section per FBS array
(galaxy summary, votes, player status, my / foreign sciences, my /
foreign ship classes, battles, bombings, approaching groups, my /
foreign / uninhabited / unknown planets, ships in production,
cargo routes, my fleets, my / foreign / unidentified ship groups).
Empty sections render explicit empty-state copy. Section anchors
are exposed in a sticky table of contents (a `<select>` on mobile)
and the scroll position is preserved across active-view switches
via SvelteKit's `Snapshot` API.

The Bombings section is a flat read-only table: one row per
bombing event, with columns for `attacker`, `attack_power`, `wiped`
state, and the post-bombing resource snapshot. The Battles section
is a list of links into the Battle Viewer (see [§6.5](#65-battle-viewer)).

### 6.5 Battle viewer

The Battle Viewer is a dedicated view that replaces the map and
renders one battle at a time. Entry points:

- A row in the Reports view's Battles section (link with the
  current turn pinned via `?turn=`).
- A battle marker on the map (yellow cross drawn through the
  corners of the square that circumscribes the planet circle;
  stroke width scales with the protocol length).

The viewer is a logically isolated component that consumes a
`BattleReport` (shape per `pkg/model/report/battle.go`). The page
loader (`ui/frontend/src/lib/active-view/battle.svelte`) fetches
the report through the signed `user.games.battle` ConnectRPC
command on the authenticated edge: the gateway translates the
verified envelope into `GET /api/v1/user/games/{game_id}/battles/{turn}/{battle_id}`
against the backend, which in turn proxies the engine's
`GET /api/v1/battle/:turn/:uuid`. For synthetic games the loader
short-circuits to the in-memory fixture map populated by the
synthetic-report envelope (see below) and never touches the
gateway.

The visual model is radial: the planet sits at the centre, races are
placed at equal angular spacing on an outer ring, and each race is
rendered as a cloud of ship-class circles arranged on a Vogel
sunflower spiral biased toward the planet (the largest group by
`NumberLeft` sits closest to the planet, smaller buckets fan out
behind it). Tech-variants of the same `(race, className)` collapse
into one visual bucket labelled `<className>:<numLeft>`; per-class
detail stays available in the Reports view. Circle radius scales with
per-ship `FullMass` (range `[6, 24] px`, per-battle normalisation)
so heavy ships visually dominate. Observer groups (`inBattle:
false`) are not drawn. Eliminated races drop out and the survivors
re-spread on the next frame. The viewer is pinned to the viewport
(scene grows, log scrolls internally) so no page-level scroll
appears.

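The per-battle radius normalisation can be sketched as follows. The source gives only the range and the fact that it is normalised per battle; the linear interpolation between the lightest and heaviest ship, and the `radii` helper itself, are assumptions. The sketch is in Go for consistency with the other examples, although the viewer lives in the web client.

```go
package main

import "fmt"

const (
	minRadius = 6.0  // px, lower bound from the text
	maxRadius = 24.0 // px, upper bound from the text
)

// radii maps each per-ship mass to a circle radius in [6, 24] px, normalised
// against the lightest and heaviest ship in this battle. Linear interpolation
// is an assumption; the source does not name the curve.
func radii(masses []float64) []float64 {
	if len(masses) == 0 {
		return nil
	}
	lo, hi := masses[0], masses[0]
	for _, m := range masses {
		if m < lo {
			lo = m
		}
		if m > hi {
			hi = m
		}
	}
	out := make([]float64, len(masses))
	for i, m := range masses {
		if hi == lo { // all ships weigh the same: use the midpoint
			out[i] = (minRadius + maxRadius) / 2
			continue
		}
		out[i] = minRadius + (m-lo)/(hi-lo)*(maxRadius-minRadius)
	}
	return out
}

func main() {
	fmt.Println(radii([]float64{10, 55, 100})) // [6 15 24]
}
```
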
Each frame is one protocol entry; the shot is drawn as a thin line
from attacker to defender, red on `destroyed`, green otherwise.
Continuous playback offers 1x / 2x / 4x speeds (400 / 200 / 100 ms
per frame), plus play/pause, step ±, and rewind. The accessibility
text protocol below the scene mirrors the same events line-by-line.

Bombings and battles are intentionally not mixed: bombings remain a
static table in the Reports view; the bombing marker on the map is
a thin stroke-only ring around the planet (yellow when damaged, red
when wiped) and a click scrolls the corresponding row into view.

The current report wire carries a `battle: [{ id, planet, shots }]`
summary per battle so the map markers know where to anchor without
fetching every full `BattleReport`.

For DEV / e2e the legacy-report CLI
(`tools/local-dev/legacy-report/cmd/legacy-report-to-json`) emits an
envelope `{version: 1, report, battles}` where `battles` carries the
full `BattleReport` objects parsed out of legacy `Battle at (#N)` blocks.
The synthetic-report loader on the lobby unwraps the envelope and
hands every battle to `registerSyntheticBattle`, so the Battle Viewer
resolves any UUID without a network fetch.

### 6.6 Side effects

A successful turn generation publishes a runtime snapshot into the
lobby module, which updates the denormalised view (current turn,
runtime status, per-player stats). The engine's "game finished"
report drives the `running → finished` transition ([Section 3.5](#35-cancellation-and-finish))
and triggers Race Name Directory promotions ([Section 5](#5-race-name-directory)).

Among the `game.*` notification kinds, `game.turn.ready` and
`game.paused` are wired:

- `game.turn.ready`:
  `lobby.Service.OnRuntimeSnapshot` (`backend/internal/lobby/runtime_hooks.go`)
  emits one intent per advancing `current_turn`, addressed to every
  active membership of the game, with idempotency key
  `turn-ready:<game_id>:<turn>` and JSON payload `{game_id, turn}`.
- `game.paused`: the same hook publishes one intent per transition
  into `paused` driven by an `engine_unreachable` /
  `generation_failed` runtime snapshot, addressed to every active
  membership, with idempotency key `paused:<game_id>:<turn>` and
  JSON payload `{game_id, turn, reason}`. The runtime status that
  triggered the transition is carried as `reason` so the UI can
  differentiate the copy in a future revision.

Both kinds route through the push channel only; email is
deliberately omitted to avoid per-turn / per-pause spam.

The remaining `game.*` kinds (`game.started`, `game.generation.failed`,
`game.finished`) and `mail.dead_lettered` are reserved without a
producer; adding one is purely additive (register the kind in the
catalog, extend the migration `CHECK` constraint, and call
`notification.Submit` from the appropriate domain module).

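The idempotency keys above make a repeated runtime snapshot harmless. The sketch below builds the two key formats quoted in the text and deduplicates on kind plus key; the helper names and the in-memory `intentLog` are hypothetical stand-ins for whatever the notification module actually persists.

```go
package main

import "fmt"

// Hypothetical helpers that build the idempotency keys quoted above.
func turnReadyKey(gameID string, turn int) string {
	return fmt.Sprintf("turn-ready:%s:%d", gameID, turn)
}

func pausedKey(gameID string, turn int) string {
	return fmt.Sprintf("paused:%s:%d", gameID, turn)
}

// intentLog remembers the (kind, key) pairs that were already accepted.
type intentLog map[[2]string]bool

// Submit reports whether the intent is new; a repeat is ignored.
func (l intentLog) Submit(kind, key string) bool {
	id := [2]string{kind, key}
	if l[id] {
		return false
	}
	l[id] = true
	return true
}

func main() {
	seen := intentLog{}
	fmt.Println(seen.Submit("game.turn.ready", turnReadyKey("g1", 7))) // true
	fmt.Println(seen.Submit("game.turn.ready", turnReadyKey("g1", 7))) // false
	fmt.Println(seen.Submit("game.paused", pausedKey("g1", 7)))        // true
}
```
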
### 6.7 Cross-references

- Backend ↔ engine wire contract (`pkg/model/{order,report,rest}`):
  [ARCHITECTURE.md §9](ARCHITECTURE.md#9-backend--game-engine-communication).
- Container lifecycle, label discipline, reconciliation:
  [ARCHITECTURE.md §13](ARCHITECTURE.md#13-container-lifecycle-in-process) and `backend/docs/flows.md`.

---

## 7. Push channel

This scenario covers how the platform pushes real-time events to
authenticated clients (turn-ready signals, lobby state changes,
session invalidations).

### 7.1 Scope

In scope: the server-streaming subscription a client opens against
gateway (Connect / gRPC / gRPC-Web framing all map to the same
endpoint), the bootstrap event, the framing of forwarded events, and
the backend → gateway control channel that produces those events.

Out of scope: the catalog of event kinds; see [Section 8](#8-notifications-and-mail) for the
notification side and [`backend/README.md` §10](../backend/README.md#10-notification-catalog) for the closed list.

### 7.2 Client subscription

An authenticated client opens a `SubscribeEvents` server-streaming
call on gateway. Gateway runs the same envelope verification as for
unary requests ([Section 1.4](#14-per-request-session-lookup)), then registers the stream with its
internal hub. The first frame the client receives is a
gateway-signed bootstrap event carrying the current server time, so
the client can calibrate its local clock without a separate request.

While the stream is open, gateway tracks a silence timer; if no real
event has been forwarded for `GATEWAY_PUSH_HEARTBEAT_INTERVAL`
(default `15s`, `0s` disables), gateway emits an unsigned
`gateway.heartbeat` event to keep browser fetch-streaming layers
from closing the response body as idle. Real events reset the
timer, so on busy streams the heartbeat fires rarely. The UI client
short-circuits the heartbeat type before signature verification and
never dispatches it to handlers; see
[`docs/ARCHITECTURE.md` § 15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary)
for the wire-cost projection and the security rationale of leaving
the heartbeat unsigned.

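The silence timer can be sketched with `time.AfterFunc` and a mutex that serialises real sends with the timer goroutine. This is a minimal illustration under stated assumptions: `heartbeater`, `silentStream`, and the plain `func(string) error` send callback are hypothetical and stand in for the gateway's stream wrapper, which is not shown in this document.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// heartbeater wraps a send function that is not safe for concurrent use.
type heartbeater struct {
	mu       sync.Mutex // serialises real sends with the timer goroutine
	send     func(eventType string) error
	interval time.Duration
	timer    *time.Timer
	closed   bool
}

func newHeartbeater(send func(string) error, interval time.Duration) *heartbeater {
	h := &heartbeater{send: send, interval: interval}
	h.mu.Lock()
	defer h.mu.Unlock()
	if interval > 0 { // a zero interval disables the heartbeat
		h.timer = time.AfterFunc(interval, h.beat)
	}
	return h
}

// beat runs on the timer goroutine once the stream has been silent.
func (h *heartbeater) beat() {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.closed {
		return
	}
	_ = h.send("gateway.heartbeat")
	h.timer.Reset(h.interval)
}

// Send forwards a real event and restarts the silence timer.
func (h *heartbeater) Send(eventType string) error {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.timer != nil {
		h.timer.Reset(h.interval)
	}
	return h.send(eventType)
}

// Close stops further heartbeats.
func (h *heartbeater) Close() {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.closed = true
	if h.timer != nil {
		h.timer.Stop()
	}
}

// silentStream opens a stream, stays quiet for wait, and returns what was sent.
func silentStream(interval, wait time.Duration) []string {
	var sent []string
	record := func(t string) error { sent = append(sent, t); return nil }
	h := newHeartbeater(record, interval)
	_ = h.Send("gateway.server_time")
	time.Sleep(wait)
	h.Close()
	return sent
}

func main() {
	fmt.Println(silentStream(20*time.Millisecond, 110*time.Millisecond))
}
```

A quiet stream yields the bootstrap event followed by heartbeats at roughly the configured interval; with an interval of zero only the bootstrap event is sent.
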
### 7.3 Backend → gateway control

Backend hosts a single gRPC service `Push.SubscribePush`, consumed
by gateway. There is exactly one logical subscription per gateway
client identity at a time; a reconnect with the same id replaces
the old subscription. Each frame on the stream carries a monotonic
cursor and one of two payload shapes:

- **Client event.** A typed payload destined for one user (and
  optionally one device session). Producers pass a `push.Event`
  (Kind + Marshal) to `push.Service`; the service invokes Marshal
  and places the bytes into `pushv1.ClientEvent.Payload`. Gateway
  forwards the bytes inside a signed client envelope without
  re-interpreting them. Producers attach correlation ids that
  gateway carries verbatim. New kinds ship with a FlatBuffers-backed
  Event implementation; kinds that have not migrated yet use the
  `push.JSONEvent` fallback so the pipeline can keep emitting them.
- **Session invalidation.** Tells gateway to drop active streams and
  reject in-flight requests for the affected session(s); this is the
  revocation propagation path described in [Section 1.5](#15-revocation).

### 7.4 Reliability and reconnect

Backend keeps an in-memory ring buffer of recent events. On
reconnect, gateway sends its last consumed cursor; backend resumes
from the next event when the cursor is still inside the
freshness-window TTL or restarts from the head when the cursor has
aged out. Per-connection backpressure is drop-oldest: a slow
gateway connection loses its oldest events first, with a log line
on each drop so both sides can correlate the gap.

The push channel is best-effort. The durable record of "we tried to
tell this user about this thing" lives in `notifications` /
`notification_routes` ([Section 8](#8-notifications-and-mail)); a missed push event does not
mean the platform forgets the event.

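The drop-oldest backpressure rule can be sketched with a bounded queue. The `connQueue` type and its fixed `limit` are hypothetical; the sketch covers only the per-connection drop policy and deliberately leaves out the ring buffer's freshness-window resume logic.

```go
package main

import (
	"fmt"
	"log"
)

// connQueue is a bounded per-connection buffer of event cursors.
type connQueue struct {
	limit   int // must be positive
	cursors []uint64
	dropped int
}

// Push appends a cursor; when the queue is full the oldest entry is dropped
// and logged so the gap can be correlated later.
func (q *connQueue) Push(cursor uint64) {
	if len(q.cursors) > 0 && len(q.cursors) >= q.limit {
		log.Printf("push: slow connection, dropping cursor %d", q.cursors[0])
		q.cursors = q.cursors[1:]
		q.dropped++
	}
	q.cursors = append(q.cursors, cursor)
}

func main() {
	q := &connQueue{limit: 3}
	for c := uint64(1); c <= 5; c++ {
		q.Push(c)
	}
	fmt.Println(q.cursors, q.dropped) // [3 4 5] 2
}
```
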
### 7.5 Producers

Backend producers that emit onto the push channel are: the
notification dispatcher (push routes from the catalog) and the
session module (revocation events). No domain module emits client
events outside of the notification dispatcher.

### 7.6 Cross-references

- Wire envelope used for push frames:
  [ARCHITECTURE.md §15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary).
- Reconnect and ring-buffer semantics:
  [ARCHITECTURE.md §8](ARCHITECTURE.md#8-backend--gateway-communication) and
  `backend/docs/flows.md` "Push gRPC".
- Notification dispatcher: [Section 8](#8-notifications-and-mail).

---

## 8. Notifications and mail

This scenario covers how the platform tells a user about an event
through push or e-mail (or both).

### 8.1 Scope

In scope: the notification intent submission flow, fan-out across
push and email channels, the durable mail outbox, dead-letter
handling, and operator-driven resend.

Out of scope: per-event semantics. When each kind fires is
documented in the relevant feature section ([Section 4](#4-lobby-participation) for lobby
kinds, [Section 5](#5-race-name-directory) for race-name kinds, [Section 6](#6-in-game-session) for game kinds).

### 8.2 Notification intent and fan-out

Domain producers (lobby, runtime, geo) submit a typed intent to the
notification module rather than handing the message off to a
specific channel. The module then:

- enforces idempotency on the intent kind plus a producer-supplied
  idempotency key;
- resolves recipients;
- materialises one route per recipient per channel, based on the
  type-specific policy in the catalog (push only, email only, both,
  or admin email);
- emits push routes onto the gRPC push stream consumed by gateway;
- inserts email routes directly into the mail outbox.

Malformed intents are quarantined to a dedicated table and never
block the producer.

### 8.3 The catalog

The catalog is a closed set of kinds. Each kind specifies its
channels and the payload fields the templates and clients consume.
Three kinds of entries deserve a callout:

- **`auth.login_code`.** This is the only kind that bypasses the
  notification pipeline entirely. Auth writes the email row
  directly to the outbox so the challenge commit is atomic with the
  mail enqueue.
- **`runtime.*` kinds.** They deliver to a configured admin email.
  When the admin email is unset, routes land with a `skipped`
  status and an operator log line; the request never fails because
  of missing operator config.
- **Reserved kinds without a producer.** `game.started`,
  `game.generation.failed`, `game.finished`, and
  `mail.dead_lettered` are listed in the catalog but no current
  module emits them ([Section 6.6](#66-side-effects) covers the
  wired `game.turn.ready` and `game.paused`). Adding a producer is
  purely additive.

### 8.4 Mail outbox

Email is a Postgres-backed durable outbox. Producers (notification
routes and the auth login-code path) write the delivery row plus
the rendered payload bytes in a single transaction. A worker
goroutine drains the outbox: it picks rows under a row lock,
attempts SMTP delivery, records the attempt, and either marks the
row sent or schedules the next attempt with exponential backoff and
jitter.

A delivery that exceeds the configured attempt budget moves to the
dead-letter table; the dead-lettering itself emits an admin
notification intent. On startup the worker drains everything that
is still pending or retrying; there is no separate recovery flow.

Operators can resend a non-`sent` delivery from the admin surface
([Section 10](#10-administration)). Resending a `sent` delivery is rejected so an
operator cannot accidentally re-deliver mail that has already left
the relay.

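The retry schedule can be illustrated with a small sketch. None of the numbers come from the source: the doubling factor, the 20% jitter spread, the cap, and the `retry` / `dead_letter` labels are assumptions chosen to show the shape of exponential backoff with jitter and an attempt budget.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// nextDelay returns the wait before the next attempt. The delay doubles with
// every failed attempt (1-based), is capped at ceiling, and is stretched by
// up to 20% according to jitter, a random fraction in [0, 1).
func nextDelay(failedAttempts int, base, ceiling time.Duration, jitter float64) time.Duration {
	d := base
	for i := 1; i < failedAttempts && d < ceiling; i++ {
		d *= 2
	}
	if d > ceiling {
		d = ceiling
	}
	return d + time.Duration(float64(d)*0.2*jitter)
}

// outcome decides what the worker does after a failed SMTP attempt. Treating
// "budget reached" as the dead-letter trigger is an assumption.
func outcome(failedAttempts, budget int) string {
	if failedAttempts >= budget {
		return "dead_letter"
	}
	return "retry"
}

func main() {
	for n := 1; n <= 5; n++ {
		delay := nextDelay(n, time.Second, time.Minute, rand.Float64())
		fmt.Println(n, outcome(n, 5), delay)
	}
}
```

Passing the jitter fraction in as a parameter keeps the function deterministic, which makes the schedule easy to check in isolation.
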
### 8.5 Operator visibility

The admin surface lists deliveries, attempts per delivery,
dead-letters, notifications, notification dead-letters, and
malformed notification intents. None of these listings are reachable
from the user surface.

### 8.6 Cross-references

- Notification catalog table (kinds, channels, payloads):
  [`backend/README.md` §10](../backend/README.md#10-notification-catalog).
- Mail outbox internals (tables, attempt log, worker pickup):
  [ARCHITECTURE.md §11](ARCHITECTURE.md#11-mail-outbox) and
  `backend/docs/flows.md` "Mail outbox".
- Push transport for client_event routes: [Section 7](#7-push-channel).

---

## 9. Geo signal

This scenario covers what backend records about the source IP of an
authenticated request, and what it deliberately does not do with it.

### 9.1 Scope

In scope: the one-shot declared country at registration, the
fire-and-forget per-request country counter, and the operator-only
inspection endpoint.

Out of scope: any kind of automatic flagging, account-takeover
detection, geo-fencing, sanctions enforcement, or version history.
The geo signal is a passive record, not an enforcement mechanism.

### 9.2 What backend records

At registration ([Section 1.3](#13-confirming-the-challenge)), backend looks up the source IP
against the GeoLite2 country database and stores the resulting ISO
country code on the account. This value is written exactly once per
account; subsequent sign-ins from a different country do not
overwrite it.

On every authenticated request through the user surface, a
fire-and-forget goroutine performs the same lookup against the
request IP and increments a per-(user, country) counter. The
request itself never blocks on this work; the goroutine runs after
the handler returns.

Both paths fail open: a GeoIP lookup error is logged but never
blocks the user.

### 9.3 What backend does NOT do

- No aggregation across users.
- No automatic flagging when the country changes.
- No notifications, ever, derived from the geo signal.
- No version history of `declared_country`.
- No correlation with sanctions, limits, or entitlements.

### 9.4 Operator access

The admin surface exposes a single read endpoint that lists per-user
country counters. The data is intended for manual inspection during
operator triage; there is no UI workflow built on top of it.

### 9.5 Source IP discipline

Backend reads the source IP from the leftmost `X-Forwarded-For`
entry, falling back to the connection peer when the header is
absent. Backend trusts the value because the network segment
between gateway and backend is the platform trust boundary: the
edge has already sanitised it. This is intentional and is restated
in [ARCHITECTURE.md §10](ARCHITECTURE.md#10-geo-profile-reduced) and [§16](ARCHITECTURE.md#16-security-boundaries-summary).

E-mail addresses are never written to logs verbatim. Backend logs a
process-scoped HMAC-truncated hash so operators can correlate log
lines within a single process lifetime without persisting PII.

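Both rules in this section are small enough to sketch. The helper names `pickIP` and `emailTag` are hypothetical, and the choice of HMAC-SHA-256 truncated to 8 bytes is an assumption; the source only says the hash is HMAC-based, truncated, and keyed per process.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net"
	"strings"
)

// pickIP applies the rule above: leftmost X-Forwarded-For entry, else the
// connection peer with its port stripped.
func pickIP(xForwardedFor, remoteAddr string) string {
	first := strings.TrimSpace(strings.Split(xForwardedFor, ",")[0])
	if first != "" {
		return first
	}
	if host, _, err := net.SplitHostPort(remoteAddr); err == nil {
		return host
	}
	return remoteAddr
}

// emailTag returns a short keyed hash that can be logged instead of the
// address. The 8-byte truncation is an assumption, not the backend's value.
func emailTag(processKey []byte, email string) string {
	mac := hmac.New(sha256.New, processKey)
	mac.Write([]byte(email))
	return hex.EncodeToString(mac.Sum(nil)[:8])
}

func main() {
	fmt.Println(pickIP("203.0.113.7, 10.0.0.2", "10.0.0.1:4242")) // 203.0.113.7
	fmt.Println(pickIP("", "10.0.0.1:4242"))                      // 10.0.0.1

	// A real process would draw this key from a random source at startup.
	key := []byte("sample-process-key")
	fmt.Println(emailTag(key, "player@example.com"))
}
```

Because the key lives only for the process lifetime, the same address produces a different tag after a restart, which is what keeps the tag from becoming a durable identifier.
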
### 9.6 Cross-references

- Trust-boundary rationale:
  [ARCHITECTURE.md §10](ARCHITECTURE.md#10-geo-profile-reduced),
  [§15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary),
  [§16](ARCHITECTURE.md#16-security-boundaries-summary).
- One-shot registration write vs. per-request counter contract:
  [`backend/README.md` §11](../backend/README.md#11-geo-profile).

---

## 10. Administration

This scenario covers every admin-only operation. Many of them have
been referenced in earlier sections (admin overrides for lobby,
admin-side soft delete and sanctions, mail and notification
inspection); this section is the consolidated view.

### 10.1 Scope

In scope: admin authentication, the cross-domain admin operations,
and their side effects on the rest of the platform.

Out of scope: end-user-driven workflows that share a domain with an
admin operation; those live in their owning section.

### 10.2 Authentication and bootstrap

The admin surface uses HTTP Basic Auth against a backend-owned
admin-account table; passwords are bcrypt-hashed. On startup, if a
bootstrap admin username and password are configured and the table
does not yet contain a row with that username, backend inserts one.
The insert is idempotent: subsequent restarts do nothing.

A failed Basic Auth response prompts the operator's tooling for
credentials in the standard way; the realm string is fixed so an
operator's password manager can match it across deployments.

After the first deployment, the bootstrap password should be
rotated through the admin surface.

### 10.3 Admin account management

Existing admins can list other admins, create new ones, look up a
specific admin, disable or re-enable an admin, and reset an admin's
password. A disabled admin row cannot authenticate; the row is kept
to preserve audit references rather than deleted.

Reset-password takes the new password in the request body. Backend
bcrypt-hashes it, replaces `admin_accounts.password_hash`, and
returns the updated `AdminAccount` shape; the new password itself
is never echoed back. "Delivered out-of-band" therefore means: the
admin who initiates the reset is the one who must communicate the
new value to the target through some channel outside the platform
(secure messenger, voice, etc.); the platform does not e-mail or
otherwise auto-deliver it.

### 10.4 User administration

For any user account, an admin can:

- list and inspect accounts;
- apply a sanction;
- apply a per-user limit override that adjusts a specific quota;
- update the entitlement (plan, paid flag, source, validity);
- soft-delete the account (the same in-process cascade as
  [Section 2.4](#24-user-initiated-soft-delete)).

The sanction catalogue is intentionally minimal in the MVP: the
only supported `sanction_code` is `permanent_block`. Applying it
flips `accounts.permanent_block`, revokes every active session
([Section 1.5](#15-revocation)), and runs the same lobby cascade as soft-delete with
membership status `blocked` ([Section 2.4](#24-user-initiated-soft-delete)). The OpenAPI schema
encodes this as a closed enum so future additions are an explicit,
breaking change. Soft-delete always revokes sessions; sanctions
revoke only when the kind documents that side effect (today: only
`permanent_block`).

### 10.5 Game administration

Admins create public games, list and inspect any game, force-start
or force-stop a game, and ban a member. Force-stop tears down the
running engine container for the game; ban-member adds the user to
the game's block list and removes any active membership
([Section 4.4](#44-memberships)).

Public-game ownership is collective: the row carries
`owner_user_id IS NULL` and any admin can act on it. The user
surface never produces or transitions a public game.

### 10.6 Runtime administration

Admins inspect the runtime record for a game, restart the engine
container, patch its image to a newer semver-patch within the same
major / minor line, and force a one-shot extra turn tick.

Patch is intentionally restricted to the patch component. A major
or minor version change requires the explicit stop / start of the
game, not an in-place upgrade. Engine version registration and
disable live next door.

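The patch restriction can be expressed as a small version check. This is a sketch under assumptions: versions are plain `MAJOR.MINOR.PATCH` strings with an optional `v` prefix, pre-release and build suffixes are not handled, and `canPatch` is a hypothetical helper rather than the backend's validator.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits "MAJOR.MINOR.PATCH" (optionally prefixed with "v").
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("not MAJOR.MINOR.PATCH: %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, fmt.Errorf("bad component %q in %q", p, v)
		}
		out[i] = n
	}
	return out, nil
}

// canPatch reports whether target is a newer patch release on the same
// major.minor line as current.
func canPatch(current, target string) (bool, error) {
	c, err := parseVersion(current)
	if err != nil {
		return false, err
	}
	t, err := parseVersion(target)
	if err != nil {
		return false, err
	}
	return c[0] == t[0] && c[1] == t[1] && t[2] > c[2], nil
}

func main() {
	fmt.Println(canPatch("1.4.2", "1.4.5")) // true <nil>
	fmt.Println(canPatch("1.4.2", "1.5.0")) // false <nil>
}
```
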
### 10.7 Engine version registry

The engine version registry is the source of allowed engine images.
Producers (start, restart, patch) never pick image references on
their own; they read from the registry. Disabling a version is a
forward-looking decision: existing running containers keep their
current image until a stop / start, but the disabled version is no
longer eligible for new starts or patches.

### 10.8 Mail and notifications administration

Operators can list and inspect mail deliveries, attempts per
delivery, dead-letters, notifications, notification dead-letters,
and malformed notification intents. They can also resend a non-sent
mail delivery ([Section 8.4](#84-mail-outbox)).

These views are the only path to mail and notification observability
outside of telemetry.

### 10.9 Geo administration

The single geo admin endpoint lists per-user country counters
([Section 9.4](#94-operator-access)). There is no admin write access to geo data; the
declared country is set once at registration and never changes,
counters are populated by the runtime, and operators can only read.

### 10.10 Cross-references

- Cascade contract for soft delete:
  [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns).
- Container lifecycle and version arbitration:
  [ARCHITECTURE.md §13](ARCHITECTURE.md#13-container-lifecycle-in-process).
- Mail outbox and notification dispatcher:
  [ARCHITECTURE.md §11](ARCHITECTURE.md#11-mail-outbox),
  [§12](ARCHITECTURE.md#12-notification-pipeline) and [Section 8](#8-notifications-and-mail).

---

## 11. Diplomatic mail

This scenario covers the player-to-player and admin-to-player
messaging system exposed inside a game. The system is conceptually
part of the lobby (messages outlive game runtime restarts), but
messages are surfaced exclusively inside the in-game UI; the lobby
surfaces only an unread counter.

### 11.1 Scope

In scope: sending personal mail between active members of the same
game; replying to personal mail; reading and marking-read /
soft-deleting one's own incoming mail; admin / owner notifications
addressed to one player or broadcast to a game; paid-tier player
broadcasts; site-admin multi-game broadcasts; bulk purge of
messages tied to terminated games; auto-translation of the body
into the recipient's `preferred_language` with a cached rendering.

Out of scope: out-of-game chat, group chats spanning multiple
games, file attachments, message editing or unsend, end-to-end
encryption.

### 11.2 The message model

Every send produces exactly one row in `diplomail_messages` plus
one row per recipient in `diplomail_recipients`. A broadcast to N
recipients is one message + N recipient rows; the translation row,
when materialised, is shared across every recipient with the same
target language.

`diplomail_messages.kind` is the closed set
`{personal, admin}`. Personal messages are replyable (the
recipient sends back a new personal message); admin messages are
non-replyable acknowledgements of a state change or operator
action. `sender_kind` is `{player, admin, system}` and identifies
the originator's role: a player owns the game (admin notification
from owner), a site administrator pushed it (admin notification
from operator), or the lobby state machine produced it
(`game.paused`, `game.cancelled`, `membership.removed`,
`membership.blocked`).

`broadcast_scope` records whether the send was a single-recipient
delivery (`single`), a one-game broadcast (`game_broadcast`), or a
cross-game admin broadcast (`multi_game_broadcast`). Recipients of
a multi-game broadcast see one independently-deletable inbox entry
per game they were addressed in.

Per-row snapshots travel with each message: `game_name`,
`sender_username`, `sender_ip`, plus on the recipient row
`recipient_user_name`, `recipient_race_name`, and
`recipient_preferred_language`. These survive game-name changes,
membership revocation, account soft-delete, and the eventual
bulk-purge cascade; they let the admin observability surface
render correctly long after the live rows have moved on.

Bodies and subjects are plain UTF-8 text. The server does not
parse, sanitise, or escape HTML; the client renders bodies through
`textContent`. Maximum body size is
`BACKEND_DIPLOMAIL_MAX_BODY_BYTES` (default `4096`); maximum
subject size is `BACKEND_DIPLOMAIL_MAX_SUBJECT_BYTES` (default
`256`).

### 11.3 Sending mail

Personal sends require active membership in the game for both the
sender and the recipient. Free-tier players send one personal
message per request. Paid-tier players additionally have access to
a game-scoped broadcast that addresses every other active member
in one call; replies fan back to the broadcast author.

Game owners (of private games) and site administrators send admin
notifications. The owner endpoint lives under the user surface
(authenticated by `X-User-ID`, owner check enforced); the admin
endpoint lives under the admin surface (HTTP Basic). Both accept
`target=user` (single recipient) or `target=all` (game broadcast).
Site administrators additionally have a multi-game endpoint that
accepts `scope=selected` with a list of game ids or
`scope=all_running` that enumerates every game with non-terminal
status.

Broadcast composition is parameterised by `recipients`: `active`
(default), `active_and_removed`, or `all_members` (includes
blocked rows for audit-style mail). The broadcast author's own
recipient row is never created.

A paid-tier broadcast is rejected with `403 forbidden` when the
caller's entitlement tier is `free`.

### 11.4 Receiving mail

The recipient sees the message in their in-game inbox once it
becomes available: immediately when no translation is needed,
otherwise once the async translation worker has finished processing
it (see [§11.6](#116-translation)). Until then the row stays
invisible: absent from the inbox listing, not counted in the unread
badge, no push event delivered. This avoids a surprise where the
inbox shows a row with no translation and an outdated unread count.

The unread badge in the lobby aggregates by game. The
`/api/v1/user/lobby/mail/unread-counts` endpoint returns one entry
per game with non-zero unread plus the global total; the lobby UI
renders the total badge and a per-game tile counter without
exposing the messages themselves.

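A minimal sketch of that aggregation, assuming a simplified
recipient row, is shown below. The `recipientRow` and `unreadCounts`
types are hypothetical; the real response shape of the endpoint is
not specified in this section. Rows that are not yet available are
skipped, matching the visibility rule above.

```go
package main

import "fmt"

// recipientRow is a hypothetical, trimmed-down view of one recipient row.
type recipientRow struct {
	GameID    string
	Available bool // available_at is set
	Read      bool // read_at is set
}

// unreadCounts is an assumed result shape: per-game counts plus a total.
type unreadCounts struct {
	PerGame map[string]int
	Total   int
}

// countUnread counts available, unread rows per game. Games without
// unread mail never get an entry, so only non-zero counts are returned.
func countUnread(rows []recipientRow) unreadCounts {
	out := unreadCounts{PerGame: map[string]int{}}
	for _, r := range rows {
		if !r.Available || r.Read {
			continue
		}
		out.PerGame[r.GameID]++
		out.Total++
	}
	return out
}

func main() {
	rows := []recipientRow{
		{GameID: "g1", Available: true},
		{GameID: "g1", Available: true, Read: true},
		{GameID: "g2", Available: false}, // still waiting for translation
		{GameID: "g3", Available: true},
	}
	c := countUnread(rows)
	fmt.Println(c.Total, c.PerGame)
}
```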
Marking a message as read is idempotent. Soft-deletion requires the
message to already be marked read, so a client cannot erase an
unopened message. Soft-deletion is per-recipient: the underlying
message row survives until the admin bulk-purge endpoint removes
the entire game's mail tree.

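The read and soft-delete rules can be modelled as a small state
check on the recipient row. This is only a sketch: the
`recipientState` type, its methods, and the choice to keep the first
`read_at` timestamp on repeated calls are assumptions, not the
backend's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errUnread = errors.New("message must be read before it can be deleted")

// epoch is a fixed instant used by the example below.
var epoch = time.Unix(0, 0)

// recipientState holds the two per-recipient timestamps named in this section.
type recipientState struct {
	ReadAt    *time.Time
	DeletedAt *time.Time
}

// markRead is idempotent: a second call leaves the first timestamp in place.
func (s *recipientState) markRead(now time.Time) {
	if s.ReadAt == nil {
		s.ReadAt = &now
	}
}

// softDelete refuses to hide a message that was never opened.
// It only touches this recipient's state, never the message itself.
func (s *recipientState) softDelete(now time.Time) error {
	if s.ReadAt == nil {
		return errUnread
	}
	if s.DeletedAt == nil {
		s.DeletedAt = &now
	}
	return nil
}

func main() {
	var s recipientState
	fmt.Println(s.softDelete(epoch)) // rejected: still unread
	s.markRead(epoch)
	fmt.Println(s.softDelete(epoch.Add(time.Minute))) // accepted
}
```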
The message detail response includes both the original body and,
when available, the cached translation; the client UI defaults to
the translated text and offers a "show original" toggle.

The in-game UI groups personal mail into per-race threads: every
personal message exchanged between the local player and another
race lands in one thread keyed on the other party's race. System
mail, admin notifications, and the player's own paid-tier
broadcasts render as stand-alone entries in the same list pane and
are never threaded. `read_at` and `deleted_at` drive the local
unread counter and the soft-delete affordance but are not surfaced
to the user; diplomatic mail does not promise read receipts. The
compose form picks the recipient by race name (resolved
server-side from `Memberships.ListMembers(game_id, "active")`); no
client-side memberships listing is fetched. See
[`ui/docs/diplomail-ui.md`](../ui/docs/diplomail-ui.md) for the
detailed UI breakdown.

### 11.5 Lifecycle hooks

The following lobby transitions land as system mail in the affected
players' inboxes:

- **Game paused / cancelled.** When the game state machine moves
  through `paused` or `cancelled`, the lobby emits a system mail
  addressed to every active member. The message explains the
  transition with a server-rendered template, so even an offline
  player finds the context the next time they open the inbox.
- **Membership removed / blocked.** Manual self-leave, owner-driven
  removal, and admin ban each emit a system mail addressed to the
  affected player only. This mail survives the membership going
  to `removed` / `blocked`, so a kicked player keeps read access
  to the explanation forever (soft-access rule).

Future inactivity-driven removal must call the same publisher so
the explanation reaches the affected player; the lobby package
README pins this contract for the next implementer.

### 11.6 Translation

`diplomail_messages.body_lang` is filled at send time by an
in-process language detector that operates on the body only.
Subject inherits the body's detected language for the translation
cache lookup. When detection cannot confidently label the body
(too short, empty, mixed scripts) the value is the BCP 47
`und` ("undetermined") sentinel and the translation pipeline is
short-circuited: recipients receive the original.

Translation happens asynchronously. Every recipient row stores a
snapshot of the addressee's `preferred_language` plus an
`available_at` timestamp. A recipient whose language matches the
detected `body_lang` (or whose preferred language is empty / the
body language is `und`) gets `available_at = now()` on insert and
the push event fires immediately. A recipient whose language
differs is inserted with `available_at IS NULL` and waits for the
translation worker.

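The insert-time decision above reduces to a three-way check. The
sketch below assumes language tags are compared by exact string
equality, which this section does not confirm; the function name
`availableOnInsert` is hypothetical.

```go
package main

import "fmt"

// undetermined is the BCP 47 sentinel stored when detection is not confident.
const undetermined = "und"

// availableOnInsert reports whether a recipient row can be created with
// available_at = now(), that is, without waiting for the translation worker.
// ASSUMPTION: tags are compared verbatim, so "en" and "en-GB" would differ.
func availableOnInsert(preferredLang, bodyLang string) bool {
	switch {
	case preferredLang == "":
		return true // no preference: deliver the original
	case bodyLang == undetermined:
		return true // detection failed: translation is short-circuited
	default:
		return preferredLang == bodyLang
	}
}

func main() {
	fmt.Println(availableOnInsert("en", "en"))  // same language
	fmt.Println(availableOnInsert("de", "en"))  // needs the worker
	fmt.Println(availableOnInsert("de", "und")) // undetermined body
}
```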
The worker (`internal/diplomail.Worker`) ticks every
`BACKEND_DIPLOMAIL_WORKER_INTERVAL` (default `2s`) and processes
one `(message_id, target_lang)` pair per tick. It consults the
translation cache first; on miss it asks the configured
`Translator`. The default deployment ships the LibreTranslate HTTP
client; an empty `BACKEND_DIPLOMAIL_TRANSLATOR_URL` falls back to
the noop translator that delivers every message in the original
language.

Translation outcomes:

- **Success.** A row in `diplomail_translations` is inserted (or
  reused if another worker won the race), every pending recipient
  of the pair is flipped to `available_at = now()`, and one push
  event per recipient is published.
- **Unsupported language pair** (HTTP 400 from LibreTranslate).
  No translation row is persisted; recipients are delivered with
  the original body. Subsequent reads return the original.
- **Transient failure** (timeout, 5xx, network error). The
  attempt counter is bumped and the next attempt is scheduled via
  exponential backoff `1s → 2s → 4s → 8s → 16s` (capped at 60s).
  After `BACKEND_DIPLOMAIL_TRANSLATOR_MAX_ATTEMPTS` (default `5`)
  the worker falls back to delivering the original body. A
  prolonged translator outage therefore stalls delivery by at
  most ~30 seconds per pair before the receiver sees the
  original.

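The backoff schedule for transient failures can be sketched as
below. The sketch assumes a delay is scheduled after each failed
attempt and counts only the waits, not the time spent inside the
translator calls themselves; under that assumption the five default
delays add up to 31 s, in line with the "~30 seconds" figure. The
function names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

const (
	baseDelay   = time.Second      // first retry delay
	maxDelay    = 60 * time.Second // documented cap
	maxAttempts = 5                // default BACKEND_DIPLOMAIL_TRANSLATOR_MAX_ATTEMPTS
)

// retryDelay returns the wait scheduled after the attempt-th consecutive
// failure (1-based): 1s, 2s, 4s, 8s, 16s, ... capped at maxDelay.
func retryDelay(attempt int) time.Duration {
	if attempt < 1 {
		attempt = 1
	}
	d := baseDelay
	for i := 1; i < attempt; i++ {
		d *= 2
		if d >= maxDelay {
			return maxDelay
		}
	}
	return d
}

// totalBackoff sums the delays for the given number of failed attempts.
func totalBackoff(attempts int) time.Duration {
	var sum time.Duration
	for a := 1; a <= attempts; a++ {
		sum += retryDelay(a)
	}
	return sum
}

func main() {
	for a := 1; a <= maxAttempts; a++ {
		fmt.Printf("after failure %d: wait %v\n", a, retryDelay(a))
	}
	fmt.Println("total backoff:", totalBackoff(maxAttempts))
}
```

With the default of five attempts the 60 s cap is never reached; it
only matters when the attempt limit is raised to seven or more.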
The translation cache is shared: a broadcast to N recipients with
the same preferred language produces one cache row and one
translator call, not N.

### 11.7 Storage and purge

Messages live in `diplomail_messages`; per-recipient state lives
in `diplomail_recipients` with a foreign-key cascade to the
message; translations live in `diplomail_translations` also with a
cascade. The sender IP is captured at insert time from
`X-Forwarded-For` (forwarded by gateway) for evidence preservation.

There is no automatic retention. The admin bulk-purge endpoint
removes every message whose game finished more than
`older_than_years` years ago (minimum `1`); the cascade drops the
recipient and translation rows in the same transaction.

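The purge threshold can be illustrated with a small cutoff
calculation. This is a sketch under assumptions: the `purgeCutoff`
name, the error value, and the use of calendar years via
`time.AddDate` are not taken from the backend; only the
`older_than_years` parameter and its minimum of `1` are documented
here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errThresholdTooLow = errors.New("older_than_years must be at least 1")

// exampleNow is a fixed instant so the example is reproducible.
var exampleNow = time.Date(2026, time.March, 1, 0, 0, 0, 0, time.UTC)

// purgeCutoff returns the instant before which a game must have finished
// for its mail to be eligible for the bulk purge.
func purgeCutoff(now time.Time, olderThanYears int) (time.Time, error) {
	if olderThanYears < 1 {
		return time.Time{}, errThresholdTooLow
	}
	return now.AddDate(-olderThanYears, 0, 0), nil
}

func main() {
	cutoff, err := purgeCutoff(exampleNow, 2)
	fmt.Println(cutoff.Format(time.RFC3339), err)

	_, err = purgeCutoff(exampleNow, 0)
	fmt.Println(err)
}
```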
### 11.8 Operator visibility

The admin surface exposes a paginated listing of every persisted
message (`/api/v1/admin/mail/messages`) filterable by `game_id`,
`kind`, and `sender_kind`. The bulk-purge endpoint
(`/api/v1/admin/mail/cleanup`) accepts the `older_than_years`
threshold. Per-game admin sends and multi-game broadcasts live
under `/api/v1/admin/games/{game_id}/mail` and
`/api/v1/admin/mail/broadcast`.

### 11.9 Cross-references

- Package overview and stage map:
  [`backend/internal/diplomail/README.md`](../backend/internal/diplomail/README.md).
- LibreTranslate setup recipe for local development:
  [`backend/docs/diplomail-translator-setup.md`](../backend/docs/diplomail-translator-setup.md).
- Storage detail:
  [ARCHITECTURE.md §12.1](ARCHITECTURE.md#121-diplomatic-mail-subsystem).
- Push transport for delivery events: [Section 7](#7-push-channel).
- Notification catalog kind `diplomail.message.received`:
  [`backend/README.md` §10](../backend/README.md#10-notification-catalog).