# Galaxy Functional Specification

This document describes what the Galaxy platform does, in terms of user-visible operations and the per-service logic that implements them. Each section walks through one domain scenario: who initiates an operation, what `gateway` checks and forwards, what `backend` validates and persists, what is returned to the client, and what side effects fire (mail, push, container ops).

This is the starting point for any change request that touches behaviour. The exact wire shape, error code vocabulary, environment variables, default values, throttle limits, table and column names, and field-level validation live in the lower-level sources:

- [`ARCHITECTURE.md`](ARCHITECTURE.md) — global architecture, security model, transport contract.
- `galaxy//README.md` — service layout, configuration, operations.
- `galaxy//openapi.yaml`, `*.proto` — wire contracts.
- `galaxy//docs/flows.md` — sequence diagrams.

This file deliberately omits those details. When this file and a lower-level source disagree, see the synchronisation rule in the project `CLAUDE.md`.

A Russian translation lives in [`FUNCTIONAL_ru.md`](FUNCTIONAL_ru.md). It is a convenience mirror for the project owner, **not a source of truth** — this English file is authoritative. Every point edit to this file must be mirrored into the Russian version in the same patch (translate only the touched paragraphs); a full re-translation happens only on explicit owner request.

The document is organised by domain scenario, not by HTTP route group. Public, user-authenticated, and admin operations may all appear in the same scenario when they participate in the same business flow.

## Table of Contents

1. [Authentication and device session](#1-authentication-and-device-session)
2. [Account management](#2-account-management)
3. [Lobby game lifecycle](#3-lobby-game-lifecycle)
4. [Lobby participation](#4-lobby-participation)
5. [Race Name Directory](#5-race-name-directory)
6. [In-game session](#6-in-game-session)
7. [Push channel](#7-push-channel)
8. [Notifications and mail](#8-notifications-and-mail)
9. [Geo signal](#9-geo-signal)
10. [Administration](#10-administration)
11. [Diplomatic mail](#11-diplomatic-mail)

---

## 1. Authentication and device session

This scenario covers how an anonymous client becomes authenticated and stays authenticated until a server-side action revokes that authority.

### 1.1 Scope

In scope: issuing an e-mail login challenge, confirming it (with first-sign-in account creation and registration of the client's public key), creating a device session, the per-request session lookup that grounds every authenticated call, and server-initiated revocation.

Out of scope: the wire envelope and signature scheme used by every authenticated request — defined once in [ARCHITECTURE.md §15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary) and reused by every later section; client-side key storage; how push events are routed inside gateway to a specific subscriber stream.

### 1.2 Issuing a login challenge

The client posts an e-mail address to the public auth surface on gateway. The route is unauthenticated — there is no device session yet to bind to.

Gateway treats this as a stricter "public auth" route class: it applies per-IP and per-identity (per-email) anti-abuse, a body-size cap, and a method allow-list, then forwards the request to backend. Failures of the upstream adapter are projected back to the client with the same status and error envelope; transport-level failures become a generic unavailable response.

Backend produces an opaque challenge identifier and emits a verification e-mail through the durable mail outbox. The response shape is **identical regardless of whether the e-mail belongs to an existing account, a fresh account, or a throttled one**, so the endpoint cannot be used to enumerate accounts.
Branches inside backend:

- **Permanent block.** If the address is permanently blocked at the account level, the request is rejected. This is the only account-state branch that surfaces a distinct error code; every other branch returns the standard challenge-id response.
- **Throttle.** If too many un-consumed, non-expired challenges already exist for the same e-mail inside the throttle window, backend reuses the latest existing challenge instead of creating a new one. The client gets the same response shape and is unaware of the reuse.
- **Otherwise.** Backend creates a new challenge with the resolved preferred language (derived from the optional `locale` body field the caller sends — which takes priority — or, if absent or blank, from the `Accept-Language` header forwarded by gateway, falling back to a default), and enqueues the auth-mail row directly into the outbox in the same transaction.

SMTP delivery is asynchronous; the auth response returns as soon as the challenge and outbox rows are durably committed.

The body field is the canonical channel because Safari silently drops JS-set `Accept-Language` headers; non-Safari clients can still rely on the header alone.

### 1.3 Confirming the challenge

The client posts the challenge id, the code received by mail, a fresh Ed25519 public key, and the chosen IANA time zone. Gateway applies the same public-auth anti-abuse class, with the per-identity bucket keyed by the challenge id rather than the e-mail. `Accept-Language` is not consulted on this endpoint — the preferred language was captured at send-time and is replayed from the challenge row.

Backend validates the challenge under a row lock: it rejects unknown, expired, or already-consumed ids, increments the attempt counter, and burns the challenge once the per-challenge attempt ceiling is reached.
After the code matches, backend re-checks the permanent-block flag — catching the case where an admin applied the block between send and confirm — and rejects the request when set.

On the success path backend ensures the account exists (synthesising an immutable display handle on first sign-in only and populating the declared country from the source IP), then marks the challenge consumed and creates a device session bound to the caller's public key in the same transaction. The response carries the new device session id.

A challenge is single-use. A second confirm on the same id returns the same opaque `invalid_request` shape as confirming an unknown or expired id; the API deliberately does not differentiate between the three so an attacker cannot mine challenge state. Throttle reuse on the send side means a client hitting the throttle gets the latest existing `challenge_id` back instead of a fresh one, but every id is still consumed exactly once.

### 1.4 Per-request session lookup

Once the client holds a device session id and a private key, every authenticated call is a signed request to gateway over the authenticated edge listener (Connect / gRPC / gRPC-Web on a single HTTP/h2c port). Gateway is the only component that ever sees the request signature; backend trusts gateway's verdict.

Gateway needs the session's public key to verify the signature, so each authenticated request resolves the device session through an in-memory LRU cache (bounded entry count plus a safety-net TTL). On miss the cache calls backend's per-request session lookup endpoint and seeds the entry. Gateway rejects the request when the cache returns "session unknown" or "revoked"; otherwise it verifies the envelope per [ARCHITECTURE.md §15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary) and forwards the verified payload to backend over plain REST, injecting the resolved user id in a header. Backend never re-derives identity from the request body.
Backend updates `last_seen_at` on the session row on every successful lookup so admin operators can observe when each cached session was last resolved at the edge. The update is part of the lookup transaction; failures are logged but do not surface to the caller.

The cache is invalidated through the push channel rather than a periodic refresh: a `session_invalidation` event flips the cached entry's status to revoked, so subsequent requests bound to the session are rejected without another backend round-trip. The TTL is the safety net for missed events (cursor aged out, gateway restart) — in steady state the push events are the authoritative source of invalidation.

### 1.5 Revocation

Revocation makes a device session unable to authenticate any future request and forces in-flight push streams bound to it to close. Triggers fall into two groups.

**User-driven (logout).** The user surface exposes three operations: list the caller's active sessions, revoke a single one, and revoke all of them. Gateway forwards these to backend as ordinary authenticated requests. Backend verifies the target session belongs to the caller (otherwise responds with the same shape as a missing session, so foreign session ids are not probeable), atomically flips `device_sessions.status` to `revoked` and inserts a row into `session_revocations`, then publishes one `session_invalidation` event per revoked session.

**Admin-driven and lifecycle.** Sanctions that imply session revocation (currently `permanent_block`), admin-driven soft delete, and user-self soft delete all run an in-process call inside backend. The same atomic UPDATE + audit-insert + push emission applies; the audit row carries a different `actor_kind` (`admin_sanction` / `soft_delete_admin` / `soft_delete_user`).

Once backend has emitted the push event, gateway flips the cached session entry to revoked and closes any active push streams bound to it.
The per-request internal lookup against backend remains the durable safety net: if a push event is lost, the next lookup (after the cache TTL) returns the revoked record.

`session_revocations` is the audit ledger. Each row carries `revocation_id`, `device_session_id`, `user_id`, `actor_kind`, the actor pair (`actor_user_id` for user-driven kinds, `actor_username` for admin-driven kinds — exactly one is non-NULL per row), `reason`, and `revoked_at`. Operators can query it to answer "who revoked this session, and why"; the table is append-only.

Backend's `/api/v1/internal/sessions/{id}` is read-only — it carries the per-request session lookup gateway needs to verify signed envelopes. Internal revoke endpoints no longer exist; revoke is either user-driven (through the user surface) or admin-driven (through in-process calls inside backend).

### 1.6 Cross-references

- Wire envelope, signing, freshness window, anti-replay: [ARCHITECTURE.md §15](ARCHITECTURE.md#15-transport-security-model-gateway-boundary).
- Backend module responsibilities for `auth`, `user`, `geo`, `mail`, `push`: [ARCHITECTURE.md §4](ARCHITECTURE.md#4-backend-domain-modules) and `backend/README.md`.
- Mail outbox semantics for the auth login-code template: [ARCHITECTURE.md §11](ARCHITECTURE.md#11-mail-outbox).
- Push channel framing and reconnect rules: [ARCHITECTURE.md §8](ARCHITECTURE.md#8-backend--gateway-communication). User-facing push semantics appear in [Section 7](#7-push-channel) of this document.

---

## 2. Account management

This scenario covers what an authenticated user can read or change about their own account, and how a user removes the account.

### 2.1 Scope

In scope: reading the account aggregate, updating the mutable profile slice, updating settings (preferred language, time zone, declared country), and user-initiated soft delete.
Out of scope: admin-side mutation of the same account (sanctions, limits, entitlement changes, admin soft delete) — covered in [Section 10](#10-administration). Permanent block flag toggling is admin-only.

### 2.2 The account aggregate

Backend exposes a single read endpoint that returns the caller's account aggregate: the durable identifying fields (immutable display handle, e-mail), the mutable profile and settings slices, the current entitlement snapshot, and any active sanctions and per-user limit overrides. The aggregate is the authoritative client-side view of "what the platform knows about me".

The display handle is synthesised at first sign-in ([Section 1.3](#13-confirming-the-challenge)) and is never overwritten on subsequent sign-ins or on profile updates. Clients should treat it as a stable identifier rather than a display preference.

### 2.3 Profile and settings updates

Two distinct mutating endpoints split user-controlled fields by the nature of the change. Both follow PATCH semantics — omitted fields are not touched, present fields replace the stored value — and both return the updated aggregate.

Profile carries one display-oriented field: `display_name`. An explicit empty value clears the stored name; omitting the field leaves it untouched.

Settings carries locale and timezone preferences: `preferred_language` (BCP 47 tag) and `time_zone` (IANA identifier). Both must be non-empty after trim when present; the timezone is validated against the IANA database before commit.

`declared_country` is **not** part of either patch. Backend writes it once at registration from the source IP ([Section 9](#9-geo-signal)) and treats it as immutable thereafter; there is no user-facing path to change it.

### 2.4 User-initiated soft delete

The user can ask backend to soft-delete their own account. Backend marks the account row deleted, then runs the in-process cascade documented in [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns).
Concretely:

- Every device session for the user is revoked ([Section 1.5](#15-revocation)), with one audit row per session and one `session_invalidation` push event per session.
- Active memberships flip to `removed` (admin-driven block flips them to `blocked`); pending applications get `rejected`; incoming invites get `declined`; outgoing invites get `revoked`.
- Race name entries owned by the user — registered, reservation, or pending_registration — are deleted in a single cascade write.
- Owned games in non-running statuses (`draft`, `enrollment_open`, `ready_to_start`, `start_failed`, `paused`) are cancelled. Owned games already in `running` are **not** cancelled by the cascade — the engine container keeps producing turns until it finishes naturally; only the membership cleanup detaches the user.
- A single `lobby.membership.removed` notification fans out to the user with `reason=removed` (or `reason=blocked` for the admin block path).

The endpoint returns no body. The cascade is best-effort within a single process: if a downstream module fails, the failure is logged but the account stays marked deleted.

### 2.5 Cross-references

- Admin-side counterparts (sanction, limit, entitlement, soft delete): [Section 10](#10-administration).
- The cascade contract for "user blocked / user deleted": [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns).
- Notification kinds emitted during the cascade: [`backend/README.md` §10](../backend/README.md#10-notification-catalog).

---

## 3. Lobby game lifecycle

This scenario covers a single game's life from creation to terminal state. [Section 4](#4-lobby-participation) covers how players join an existing game; this section focuses on the game itself.

### 3.1 Scope

In scope: creating a game (private vs public), updating its mutable configuration, transitioning it through the lobby state machine, cancellation, retry of a failed start, and the terminal transitions (`finished`, `cancelled`).
Out of scope: applications, invites, memberships ([Section 4](#4-lobby-participation)), Race Name Directory promotions on finish ([Section 5](#5-race-name-directory)), engine commands during the running phase ([Section 6](#6-in-game-session)).

### 3.2 The state machine

The lobby state machine is the closed graph documented in [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns):

```text
draft → enrollment_open → ready_to_start → starting → running ↔ paused → finished
                                              ↳ start_failed → ready_to_start (retry)
cancelled is reachable from every pre-finished state.
```

Two ground rules:

- **Ownership decides the surface.** Private games carry an `owner_user_id`; transitions are driven by the owner through the user surface. Public games are owned collectively by administrators (`owner_user_id IS NULL`); their transitions and configuration changes go through the admin surface.
- **The runtime callback owns the transitions out of `starting`.** `starting → running` and `starting → start_failed` are the only transitions that the runtime module produces, after the engine container is fully up or has confirmed failure. Every other transition is a user or admin action.

### 3.3 Creation

A user creates a private game through the user surface. Backend records the new game with `owner_user_id` set to the caller and visibility `private`, in state `draft`, with the request body's configuration as initial values.

Public games are created exclusively through the admin surface ([Section 10](#10-administration)). The user surface never produces a public game; this asymmetry is enforced in backend, not at the route level.

### 3.4 Forward transitions

Owners drive forward transitions via dedicated endpoints (`open-enrollment`, `ready-to-start`, `start`, `pause`, `resume`, `retry-start`).
Each endpoint:

- checks ownership of the game (or admin scope for public games);
- checks the source state matches the transition's precondition, rejecting with a conflict if not;
- updates the lobby record and publishes any user-facing notifications attached to the transition.

`start` queues a runtime job (long-running container pull / start / init) and immediately returns "queued". Final state movement (`starting → running` or `starting → start_failed`) arrives later through the runtime callback. `retry-start` re-arms a `start_failed` game back to `ready_to_start` and lets the owner trigger `start` again.

`pause` and `resume` flip between `running` and `paused`. The running engine container is not torn down on pause; only the lobby schedule and command-acceptance flags change.

`ready-to-start` is always an explicit owner (or admin) action, never auto-fired. The transition checks that the approved member count is at least `min_players` and rejects with a conflict otherwise.

### 3.5 Cancellation and finish

`cancel` is reachable from every pre-finished state. Owners can cancel their own games; admins can cancel any. Cancellation reconciles outstanding applications, invites, and memberships; it does not promote race-name reservations.

`finished` is produced inside backend after the engine reports the game finished. The transition tears down the engine container, freezes the lobby record, and triggers Race Name Directory promotions for capable finishes ([Section 5](#5-race-name-directory)).

Both terminal states are absorbing.

### 3.6 Admin overrides

Administrators can `force-start`, `force-stop`, and `ban-member` on any game (public or private) regardless of state. `force-stop` transitions the game to a stopped state and tears down the engine container; `ban-member` removes a membership and prevents the user from re-joining ([Section 4](#4-lobby-participation)).
### 3.7 Cross-references

- State machine vocabulary and transition rules: [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns).
- Runtime job lifecycle (the asynchronous work behind `start`): [ARCHITECTURE.md §13](ARCHITECTURE.md#13-container-lifecycle-in-process) and `backend/docs/flows.md`.
- Public-vs-private invariants and the partial index that supports them: [ARCHITECTURE.md §4](ARCHITECTURE.md#4-backend-domain-modules).

---

## 4. Lobby participation

This scenario covers everything around joining and leaving an existing game: applications (public), invites (private), and memberships (after the join succeeds).

### 4.1 Scope

In scope: submitting an application to a public game, owner / admin approval or rejection of an application, issuing and redeeming invites, recipient decline and issuer revocation, listing memberships per game, and member removal or block.

Out of scope: the game state machine itself ([Section 3](#3-lobby-game-lifecycle)) and the in-game commands once a member is playing ([Section 6](#6-in-game-session)).

### 4.2 Applications (public games)

A user submits an application to a game by id. Applications are **only accepted on public games**; an attempt against a private game is rejected with a conflict. The game must additionally be in `enrollment_open` (the only enrolment-accepting state for applications).

Backend also rejects the request if the user is already a member or on the game's block list (via `ban-member`). Otherwise it stores the application as `pending` and emits a notification to the admin channel.

The owner — or an administrator for public games — approves or rejects the application through dedicated endpoints. Approval creates a membership for the applicant and emits the corresponding notification. Rejection just records the terminal state; no membership appears.

### 4.3 Invites (private games)

Invites are **only accepted on private games**; an attempt to issue one for a public game is rejected with a conflict.
The owner issues an invite while the game is in `draft`, `enrollment_open`, or `ready_to_start`. Two flavours coexist:

- **User-bound** — `invited_user_id` is set; only that user may redeem. A `lobby.invite.received` notification is emitted to the recipient.
- **Code-based** — `invited_user_id` is empty; backend mints a hex code at issue time and any caller who knows the code may redeem. No notification is emitted at issue time (no recipient is bound yet).

Each invite carries an expiry (defaulted from configuration when the body omits `expires_at`). The recipient redeems (creates a membership) or declines; the issuer can revoke an outstanding invite at any time before redemption.

### 4.4 Memberships

Memberships list the players currently attached to a game. Owners can remove or block a member; a member can also remove themselves. Removal terminates participation cleanly; block additionally prevents the same user from re-applying or redeeming a future invite for the same game.

The admin surface offers `ban-member` as the cross-game-policy counterpart to the owner's block.

### 4.5 Listing the caller's view

The user surface exposes three "my" listings (games, applications, invites). They project the caller's involvement across all games without requiring the client to know game ids in advance, which makes the dashboard and inbox views possible.

### 4.6 Notifications

Every state change in this scenario emits a notification kind from the catalog: `lobby.invite.received`, `lobby.invite.revoked`, `lobby.application.submitted`, `lobby.application.approved`, `lobby.application.rejected`, `lobby.membership.removed`, `lobby.membership.blocked`. [Section 8](#8-notifications-and-mail) documents the fan-out.

### 4.7 Cross-references

- Game lifecycle: [Section 3](#3-lobby-game-lifecycle).
- Notification catalog and fan-out: [Section 8](#8-notifications-and-mail) and [`backend/README.md` §10](../backend/README.md#10-notification-catalog).

---

## 5. Race Name Directory

This scenario covers how a player picks the name of their in-game race and, eventually, gets that name registered platform-wide.

### 5.1 Scope

In scope: the three-tier directory (registered, reservation, pending_registration), promotion through "capable finish", user-driven promotion of a pending registration to registered, sweeper-driven release on TTL expiry, and uniqueness through the canonical-key model.

Out of scope: how the engine actually consumes the chosen name — that lives in [Section 6](#6-in-game-session).

### 5.2 Three tiers

- **Registered** is platform-unique. A canonical key has at most one live binding to a single user.
- **Reservation** is per-game. The same canonical key can be reserved by the same user across several active games at the same time, but two different users cannot reserve the same canonical key in the same game.
- **Pending registration** is the transient tier between reservation and registered. It is issued automatically after a "capable finish" (the game ended with the player having grown their initial planet count and population), and it gives the user a bounded window to convert the reservation into a permanent registration.

### 5.3 Canonicalisation

Every name (typed by a user or registered by the platform) is folded into a canonical key. Canonicalisation is confusable-aware (latin-cyrillic look-alikes, digit-letter substitutions) and is applied uniformly across the directory; uniqueness is enforced on the canonical key, not on the displayed name.

Cross-tier conflicts on the same canonical key are blocked at write time through a per-canonical advisory lock.

### 5.4 Promotion path

A reservation appears when a player names their race during a game. When the game finishes capably, backend automatically converts the reservation into a pending_registration with a TTL. While the pending entry is alive, the user can call the registration endpoint to promote the entry to `registered`.
If the TTL expires first, a periodic sweeper releases the entry; the canonical key becomes available again.

A pending registration can be claimed only by the user who earned it; backend rejects an attempt by a different user even if the canonical key matches.

### 5.5 Notifications

The directory emits `lobby.race_name.registered`, `lobby.race_name.pending`, and `lobby.race_name.expired` to the owning user. [Section 8](#8-notifications-and-mail) covers fan-out.

### 5.6 Cross-references

- Canonicalisation library and glossary entries ("canonical key", "capable finish"): [ARCHITECTURE.md §19](ARCHITECTURE.md#19-glossary).
- The promotion trigger inside the lobby module: [ARCHITECTURE.md §7](ARCHITECTURE.md#7-in-process-async-patterns) (`lobby.OnGameFinished`) and `backend/docs/flows.md`.

---

## 6. In-game session

This scenario covers what an active player does while a game is running: submit commands and orders, read turn reports.

### 6.1 Scope

In scope: command submission, order submission, report reading, and the turn-cutoff behaviour that closes the command window during generation.

Out of scope: how the engine container itself is started, scheduled, or stopped — those are runtime concerns covered in [Section 3](#3-lobby-game-lifecycle) (start / stop) and [Section 10](#10-administration) (admin runtime overrides). The wire format of commands, orders, and reports is the engine's own contract and is not duplicated here.

### 6.2 Backend's role: pass-through with authorisation

The signed authenticated-edge pipeline for in-game traffic uses four message types on the authenticated surface — `user.games.command`, `user.games.order`, `user.games.order.get`, `user.games.report` — each with a typed FlatBuffers payload. Gateway transcodes the FB request into the JSON shape backend expects, forwards over plain REST to the corresponding `/api/v1/user/games/{game_id}/*` endpoint, then transcodes the JSON response back into FB before signing the reply.
`user.games.order.get` is the read-back companion to `user.games.order`: clients use it to hydrate the local order draft after a cache loss (fresh install, cleared storage, new device).

For every in-game endpoint the user surface acts as an authorised pass-through to the engine container. Backend:

- verifies the caller is an active member of the target game and that the game is in a state that accepts the operation;
- rebinds the actor field in the body to the caller's race name from the runtime player mapping (clients never supply a trusted actor);
- resolves the engine endpoint (the running container for the `game_id`) and forwards the call;
- returns the engine's response payload back to the client without re-interpretation.

Backend does not parse command or order payload contents beyond what authorisation requires. The engine is the source of truth for validity and ordering of in-game decisions. Gateway needs to know the typed FB shape only to transcode the wire format; the per-command semantics live in the engine.

### 6.3 Turn cutoff and auto-pause

A running game continuously alternates between a command-accepting window and a generation phase, driven by the cron expression stored in `runtime_records.turn_schedule`. The backend scheduler (`backend/internal/runtime/scheduler.go`) wraps each engine `/admin/turn` call between two `runtime_status` flips:

- Before the engine call: `running → generation_in_progress`. The user-games command/order handlers (`backend/internal/server/handlers_user_games.go`) consult the per-game runtime record on every request and reject with HTTP 409 + `code = turn_already_closed` while the runtime sits in `generation_in_progress`. The error envelope mirrors backend's standard `httperr` shape: `{"error": {"code": "turn_already_closed", "message": "..."}}`.
- After a successful tick: `generation_in_progress → running`. The order window re-opens for the new turn and the next scheduled tick continues normally.
- After a failed tick (`engine_unreachable` / `generation_failed`): the lobby's `OnRuntimeSnapshot` flips the game from `running` to `paused` and publishes a `game.paused` push event (see §6.6). The order handlers reject with HTTP 409 + `code = game_paused` until an admin resume succeeds.

`force-next-turn` (admin) schedules a one-shot extra tick that advances the next scheduled turn by one cron step; the same status-flip and rejection rules apply.

Clients distinguish the two rejections by `code`: `turn_already_closed` means "wait for the next `game.turn.ready` and resubmit", whereas `game_paused` means "wait for an admin resume". The web client implements both reactions in `ui/docs/sync-protocol.md`.

### 6.4 Reports

Per-turn reports are read-only views fetched from the engine on demand. Backend authorises the caller and forwards the request; there is no caching or denormalisation in this path.

The web client renders the report as one section per FBS array (galaxy summary, votes, player status, my / foreign sciences, my / foreign ship classes, battles, bombings, approaching groups, my / foreign / uninhabited / unknown planets, ships in production, cargo routes, my fleets, my / foreign / unidentified ship groups). Empty sections render explicit empty-state copy. Section anchors are exposed in a sticky table of contents (a `