# Game Lobby Service

`galaxy/lobby` owns platform-level metadata and lifecycle of game sessions.

## References

- [Public REST contract](api/public-openapi.yaml)
- [Internal REST contract](api/internal-openapi.yaml)
- [System architecture](../ARCHITECTURE.md)
- [Notification catalog](../notification/README.md)
- [User Service lobby eligibility](../user/README.md)
- [Service-local docs](docs/)

## Purpose

`Game Lobby Service` is the platform source of truth for game sessions as
platform entities, from creation through enrollment, start, runtime tracking,
and finish. It mediates all player participation actions and maintains the
roster state that `Game Master` may cache for runtime authorization.

## Scope

`Game Lobby` is the source of truth for:

- opaque stable game identifiers in `game-*` form
- game metadata: name, description, type, owner, schedule, engine version
- platform-level game status from `draft` through `finished` or `cancelled`
- enrollment configuration: `min_players`, `max_players`, `start_gap_hours`,
  `start_gap_players`, `enrollment_ends_at`
- applications and their approval or rejection status (public games)
- user-bound invitations and their lifecycle (private games)
- platform membership roster and participant status
- Race Name Directory state across all regular platform users: registered
  race names (permanent ownership), per-game reservations, and 30-day
  pending-registration windows
- per-game per-user `player_turn_stats` aggregate used at game finish for
  capability evaluation
- denormalized runtime snapshot imported from `Game Master`
- user-facing lists: active games, pending applications, open invitations

`Game Lobby` is not the source of truth for:

- platform user identity or profile: owned by `User Service`
- device sessions or authentication state: owned by `Auth / Session Service`
- runtime container lifecycle or technical health: owned by `Runtime Manager`
- current turn, generation state, engine reachability: owned by `Game Master`
- full per-player game state: owned by the game engine container
- player-to-engine UUID mapping: owned by `Game Master`

## Non-Goals

- `Game Lobby` does not call game engine containers directly; all engine
  interaction goes through `Game Master`.
- `Game Lobby` owns the Race Name Directory data in v1 (Redis adapter); the
  contract is kept behind a port interface so a future dedicated
  `Race Name Service` can replace the adapter without domain changes.
- `Game Lobby` does not compute notification audiences from roster data at
  delivery time; notification intents carry explicit `recipient_user_id` values.
- `Game Lobby` does not apply sanctions or session-level access control;
  `User Service` and `Auth / Session Service` remain authoritative for those.
- `Game Lobby` does not own billing or entitlement decisions; it reads the
  current entitlement snapshot from `User Service`.

## Position in the System

```mermaid
flowchart LR
    Gateway["Edge Gateway"]
    Lobby["Game Lobby Service"]
    User["User Service"]
    GM["Game Master"]
    Runtime["Runtime Manager"]
    Notify["Notification Service"]
    Redis["Redis\nKV + Streams"]

    Gateway --> Lobby
    Lobby --> User
    Lobby --> GM
    Lobby --> Redis
    Lobby --> Notify
    GM --> Redis
    Redis --> Lobby
    Runtime --> Redis
```

`Gateway` routes authenticated platform-level commands to `Lobby` over trusted
REST.
`Lobby` reads user eligibility from `User Service` synchronously.
`Lobby` registers running games with `Game Master` synchronously at start.
`Lobby` submits start jobs to `Runtime Manager` and reads job results from a
dedicated Redis Stream.
`Game Master` publishes runtime events to a dedicated Redis Stream that `Lobby`
consumes asynchronously.
`Lobby` publishes notification intents to `notification:intents`.

## Responsibility Boundaries

`Game Lobby` is responsible for:

- accepting and validating game creation and configuration commands
- opening and managing enrollment for public and private games
- validating user eligibility before accepting applications and invite redeems
- checking race name availability through the Race Name Directory port
- enforcing enrollment deadline and roster-size auto-transitions
- orchestrating the game start sequence with `Runtime Manager` and `Game Master`
- persisting game metadata atomically and removing orphaned containers when
  metadata persistence fails
- maintaining the denormalized runtime snapshot for user-facing reads
- emitting notification intents for all participant lifecycle events
- enforcing visibility rules: private games are visible only to owner and members

`Game Lobby` is not responsible for:

- verifying authenticated transport signatures: handled by `Edge Gateway`
- checking session revocation state: handled by `Edge Gateway` and `Auth`
- email delivery: handled by `Mail Service`
- push delivery: handled by `Notification Service` and `Edge Gateway`
- container start and stop mechanics: handled by `Runtime Manager`
- per-turn player command routing: handled by `Game Master`

## Runtime Surface

The service starts two HTTP listeners and one Redis Stream consumer pipeline.

### Listeners

- public authenticated REST on `LOBBY_PUBLIC_HTTP_ADDR` with default `:8094`
- internal trusted REST on `LOBBY_INTERNAL_HTTP_ADDR` with default `:8095`

### Background workers

- enrollment automation ticker: checks enrollment deadlines and roster
  thresholds at a configurable interval
- Runtime Manager result consumer: reads start-job results from a Redis Stream
- Game Master event consumer: reads runtime snapshot updates and game-finish
  events from a dedicated Redis Stream

### Startup dependencies

- one reachable Redis deployment at `LOBBY_REDIS_MASTER_ADDR` (mandatory
  password via `LOBBY_REDIS_PASSWORD`; replicas optional via
  `LOBBY_REDIS_REPLICA_ADDRS`). Used for streams, race-name directory,
  per-game runtime aggregates, and stream offsets.
- one reachable PostgreSQL primary at `LOBBY_POSTGRES_PRIMARY_DSN` (DSN
  must include `search_path=lobby&sslmode=disable`). Embedded goose
  migrations apply at startup before any listener opens; on migration or
  ping failure the service exits non-zero. The four core enrollment
  entities (game / application / invite / membership) live here after
  PG_PLAN.md §6A; `docs/postgres-migration.md` is the decision record.
- `User Service` reachable at `LOBBY_USER_SERVICE_BASE_URL` (startup check only;
  runtime failures are surfaced as request errors, not boot failures)
- `Game Master` at `LOBBY_GM_BASE_URL` (same policy: startup check omitted;
  unreachability at registration triggers the forced-pause path)

### Probes

- `GET /healthz` on both ports returns `{"status":"ok"}`
- `GET /readyz` on both ports returns `{"status":"ready"}` after successful
  startup; no live Redis or PostgreSQL ping per request

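
The probe contract can be illustrated with the standard library alone. This is a minimal sketch, not the service's actual wiring: `newProbeMux`, `staticJSON`, and `probe` are hypothetical names, and the handlers answer from static bodies, which matches the rule that probes do not ping Redis or PostgreSQL per request.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newProbeMux is a hypothetical helper that registers both probe routes.
// The handlers return static bodies and never touch Redis or PostgreSQL.
func newProbeMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", staticJSON(`{"status":"ok"}`))
	mux.HandleFunc("/readyz", staticJSON(`{"status":"ready"}`))
	return mux
}

// staticJSON builds a handler that always writes the given JSON body.
func staticJSON(body string) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		io.WriteString(w, body)
	}
}

// probe issues an in-process GET and returns the status code and body.
func probe(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	mux := newProbeMux()
	for _, path := range []string{"/healthz", "/readyz"} {
		code, body := probe(mux, path)
		fmt.Println(path, code, body)
	}
}
```

In a real deployment the same mux would presumably be mounted on both listeners only after startup has succeeded, so `/readyz` never needs a per-request dependency check.
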
## Game Record Model

### Fields

| Field | Type | Notes |
| --- | --- | --- |
| `game_id` | string | opaque, stable, `game-*` form |
| `game_name` | string | human-readable; mutable in `draft` |
| `description` | string | optional; mutable in `draft` and `enrollment_open` |
| `game_type` | enum | `public` or `private` |
| `owner_user_id` | string | private games only; empty for public |
| `status` | enum | see status table below |
| `min_players` | int | minimum approved participants to proceed to start |
| `max_players` | int | target roster size that activates the gap window |
| `start_gap_hours` | int | hours of gap window after `max_players` is reached |
| `start_gap_players` | int | additional participants admitted during the gap |
| `enrollment_ends_at` | int64 | UTC Unix seconds; deadline for automatic enrollment close |
| `turn_schedule` | string | cron expression, e.g. `0 18 * * *`; passed to GM at registration |
| `target_engine_version` | string | semver of the engine to launch; passed to GM at registration |
| `created_at` | int64 | UTC Unix milliseconds |
| `updated_at` | int64 | UTC Unix milliseconds |
| `started_at` | int64 | UTC Unix milliseconds; set when status becomes `running` |
| `finished_at` | int64 | UTC Unix milliseconds; set when status becomes `finished` |
| `current_turn` | int | denormalized from GM; zero until running |
| `runtime_status` | string | denormalized from GM; empty until running |
| `engine_health_summary` | string | denormalized from GM; empty until running |
| `runtime_binding` | object? | non-null after successful container start; contains `container_id`, `engine_endpoint`, `runtime_job_id`, `bound_at` (Unix ms) |

All fields set at creation are validated before the game record is persisted.
`game_name` is required and must be non-empty after trim.
`min_players`, `max_players`, `start_gap_hours`, `start_gap_players`, and
`enrollment_ends_at` are required positive integers with `min_players <= max_players`.
`turn_schedule` must be a valid five-field cron expression.
`target_engine_version` must be a non-empty semver string.

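
The creation-time rules can be sketched as a single validation method. The `CreateGameInput` type and its error messages are hypothetical, and the cron and semver checks are deliberately shallow (field count and non-emptiness only); a real implementation would use proper parsers.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// CreateGameInput is a hypothetical request shape; its fields mirror the
// game record columns that are validated at creation.
type CreateGameInput struct {
	GameName            string
	MinPlayers          int
	MaxPlayers          int
	StartGapHours       int
	StartGapPlayers     int
	EnrollmentEndsAt    int64 // UTC Unix seconds
	TurnSchedule        string
	TargetEngineVersion string
}

// Validate applies the structural rules only. It counts cron fields and
// checks that the version is non-empty; full cron and semver parsing are
// out of scope for this sketch.
func (in CreateGameInput) Validate() error {
	if strings.TrimSpace(in.GameName) == "" {
		return errors.New("game_name must be non-empty after trim")
	}
	if in.MinPlayers <= 0 || in.MaxPlayers <= 0 || in.StartGapHours <= 0 ||
		in.StartGapPlayers <= 0 || in.EnrollmentEndsAt <= 0 {
		return errors.New("enrollment fields must be positive integers")
	}
	if in.MinPlayers > in.MaxPlayers {
		return errors.New("min_players must not exceed max_players")
	}
	if len(strings.Fields(in.TurnSchedule)) != 5 {
		return errors.New("turn_schedule must have five cron fields")
	}
	if strings.TrimSpace(in.TargetEngineVersion) == "" {
		return errors.New("target_engine_version must be non-empty")
	}
	return nil
}

func main() {
	in := CreateGameInput{
		GameName: "Nebula Cup", MinPlayers: 2, MaxPlayers: 8,
		StartGapHours: 12, StartGapPlayers: 2, EnrollmentEndsAt: 1767225600,
		TurnSchedule: "0 18 * * *", TargetEngineVersion: "1.4.0",
	}
	fmt.Println("valid:", in.Validate() == nil)
}
```
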
### Status vocabulary

| Status | Meaning |
| --- | --- |
| `draft` | Created; enrollment not yet open; editable |
| `enrollment_open` | Accepting applications (public) or invite redeems (private) |
| `ready_to_start` | Enrollment closed; start command accepted |
| `starting` | Start job submitted to Runtime Manager; awaiting result |
| `start_failed` | Container start or metadata persistence failed |
| `running` | Game engine container live; normal gameplay |
| `paused` | Platform-level pause; engine container may still be alive |
| `finished` | Game ended; record is terminal |
| `cancelled` | Cancelled before start; record is terminal |

### Status transition table

| From | To | Trigger |
| --- | --- | --- |
| `draft` | `enrollment_open` | explicit command from admin (public) or owner (private) |
| `enrollment_open` | `ready_to_start` | manual command when `approved_count >= min_players` |
| `enrollment_open` | `ready_to_start` | `enrollment_ends_at` reached and `approved_count >= min_players` |
| `enrollment_open` | `ready_to_start` | gap window exhausted (time or player count) |
| `ready_to_start` | `starting` | start command from admin (public) or owner (private) |
| `starting` | `running` | Runtime Manager confirms container; GM registration succeeds |
| `starting` | `paused` | Runtime Manager confirms container; GM registration fails (unavailable) |
| `starting` | `start_failed` | Runtime Manager reports container start failure |
| `start_failed` | `ready_to_start` | explicit retry command from admin or owner |
| `running` | `paused` | explicit pause command from admin or owner |
| `running` | `finished` | `game_finished` event from `Game Master` via Redis Stream |
| `paused` | `running` | explicit resume command from admin or owner |
| `paused` | `finished` | `game_finished` event from `Game Master` via Redis Stream |
| `draft` | `cancelled` | explicit cancel command from admin or owner |
| `enrollment_open` | `cancelled` | explicit cancel command from admin or owner |
| `ready_to_start` | `cancelled` | explicit cancel command from admin or owner |
| `start_failed` | `cancelled` | explicit cancel command from admin or owner |
| `draft` | `cancelled` | `external_block` cascade on owner permanent_block / DeleteUser |
| `enrollment_open` | `cancelled` | `external_block` cascade on owner permanent_block / DeleteUser |
| `ready_to_start` | `cancelled` | `external_block` cascade on owner permanent_block / DeleteUser |
| `start_failed` | `cancelled` | `external_block` cascade on owner permanent_block / DeleteUser |
| `starting` | `cancelled` | `external_block` cascade on owner permanent_block / DeleteUser |
| `running` | `cancelled` | `external_block` cascade on owner permanent_block / DeleteUser |
| `paused` | `cancelled` | `external_block` cascade on owner permanent_block / DeleteUser |

Outside the `external_block` cascade, `running` and `paused` games cannot be
cancelled directly; use stop operations through `Game Master` and await the
`game_finished` event instead. The cascade publishes a stop-job to Runtime
Manager before applying the `external_block` transition for in-flight games.

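
One way to encode the transition table is a lookup map plus a flag for the `external_block` cascade. This is an illustrative sketch under that assumption; the `Status` type, the `regular` map, and `CanTransition` are hypothetical names, and trigger-specific guards such as `approved_count >= min_players` are not modelled.

```go
package main

import "fmt"

type Status string

const (
	Draft          Status = "draft"
	EnrollmentOpen Status = "enrollment_open"
	ReadyToStart   Status = "ready_to_start"
	Starting       Status = "starting"
	StartFailed    Status = "start_failed"
	Running        Status = "running"
	Paused         Status = "paused"
	Finished       Status = "finished"
	Cancelled      Status = "cancelled"
)

// regular lists the non-cascade transitions from the table. Terminal
// statuses (finished, cancelled) have no entry.
var regular = map[Status][]Status{
	Draft:          {EnrollmentOpen, Cancelled},
	EnrollmentOpen: {ReadyToStart, Cancelled},
	ReadyToStart:   {Starting, Cancelled},
	Starting:       {Running, Paused, StartFailed},
	StartFailed:    {ReadyToStart, Cancelled},
	Running:        {Paused, Finished},
	Paused:         {Running, Finished},
}

// CanTransition reports whether moving from one status to another is allowed.
// externalBlock marks the owner permanent_block / DeleteUser cascade, which
// may cancel any non-terminal game but performs no other transition.
func CanTransition(from, to Status, externalBlock bool) bool {
	targets, nonTerminal := regular[from]
	if externalBlock {
		return nonTerminal && to == Cancelled
	}
	for _, t := range targets {
		if t == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(Running, Cancelled, false)) // direct cancel of a running game
	fmt.Println(CanTransition(Running, Cancelled, true))  // the same cancel through the cascade
}
```

Keeping the cascade as an explicit flag, rather than adding `cancelled` to the in-flight rows of the map, makes it harder for a regular command handler to cancel a `running` or `paused` game by accident.
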
## Enrollment Rules

`enrollment_open → ready_to_start` fires on the first of these conditions:

### Manual close

Admin (public game) or owner (private game) issues `lobby.game.ready_to_start`
when `approved_count >= min_players`.

### Deadline

Enrollment automation worker detects that `enrollment_ends_at` is in the past
and `approved_count >= min_players`.
If the deadline is reached but `approved_count < min_players`, the game remains
in `enrollment_open`; the transition does not fire until the player count
condition is also satisfied.

### Gap exhaustion

When `approved_count` reaches `max_players`, the gap window opens.
During the gap window:

- new applications and invite redeems continue to be accepted up to
  `max_players + start_gap_players` total approved participants
- the game does not automatically transition while the gap is open

The transition fires when either:

- `start_gap_hours` have elapsed since the gap window opened, or
- `approved_count` reaches `max_players + start_gap_players`

### On enrollment close

When any path transitions the game to `ready_to_start`:

- all invites in `created` status transition to `expired`
- a `lobby.invite.expired` notification intent is published for each expired invite
  (recipient: private-game owner)
- no new applications are accepted in `ready_to_start` status

## Application Lifecycle

Applications are used for public games only.
Private games use the invite flow exclusively.

### Submit

An authenticated user submits `lobby.application.submit` with `race_name`.

Pre-conditions checked synchronously:

- game status is `enrollment_open`
- game type is `public`
- user has no existing non-rejected application to the same game
- `User Service` eligibility check confirms `can_join_game=true`
- `approved_count < max_players + start_gap_players` (or gap window not yet open)
- Race Name Directory confirms `race_name` is available for the applicant

On success:

- an `Application` record is created with `status=submitted`
- `lobby.application.submitted` intent published (`audience_kind=admin_email`)
  with payload: `game_id`, `game_name`, `applicant_user_id`, `applicant_name`

`applicant_name` in the notification payload equals the submitted `race_name`.

### Approve

Admin issues `lobby.application.approve`.

Pre-conditions:

- game is `enrollment_open`
- application is in `submitted` status
- `approved_count < max_players + start_gap_players`

On success:

- Race Name Directory reserves `race_name` for the applicant
- application `status` → `approved`
- `Membership` record created with `status=active`
- `lobby.membership.approved` intent published (recipient: applicant)
  with payload: `game_id`, `game_name`
- gap window opens automatically if `approved_count` now equals `max_players`
- auto-transition to `ready_to_start` if gap exhaustion condition is immediately met

### Reject

Admin issues `lobby.application.reject`.

Pre-conditions:

- application is in `submitted` status

On success:

- application `status` → `rejected`
- any pending Race Name Directory reservation for the applicant is released
- `lobby.membership.rejected` intent published (recipient: applicant)
  with payload: `game_id`, `game_name`

### Application state machine

```text
submitted → approved
submitted → rejected
```

Rejected applicants may re-apply while enrollment is open, subject to the
single-active-application constraint (at most one non-rejected application per
user per game).

The single-active constraint is enforced at the persistence layer by the
`user_game_application` key (see Redis Logical Model). The key is created
atomically with the submitted application record, removed on rejection, and
preserved on approval. Service-layer code can rely on this invariant without
performing its own scan of `user_applications`.

## Invite Lifecycle

Invites are used for private games only.
Public games use the application flow exclusively.

### Create

Private-game owner issues `lobby.invite.create` with `invitee_user_id`.

Pre-conditions:

- game status is `enrollment_open`
- game type is `private`
- the invitee has no active invite or active membership in the game
- `approved_count < max_players + start_gap_players`

On success:

- `Invite` record created with `status=created`
- `expires_at` is set to `enrollment_ends_at` of the game
- `lobby.invite.created` intent published (recipient: invitee)
  with payload: `game_id`, `game_name`, `inviter_user_id`, `inviter_name`

`inviter_name` is the owner's race name if already a member of the game;
otherwise it is the owner's `user_id`.

### Redeem

The invited user issues `lobby.invite.redeem` with `race_name`.

Pre-conditions:

- invite status is `created`
- game is `enrollment_open`
- `approved_count < max_players + start_gap_players`
- inviter and invitee both exist and are not permanently blocked in
  `User Service`
- Race Name Directory confirms `race_name` is available for the invitee

On success:

- Race Name Directory reserves `race_name` for the invitee
- invite `status` → `redeemed`
- `Membership` record created with `status=active`
- `lobby.invite.redeemed` intent published (recipient: private-game owner)
  with payload: `game_id`, `game_name`, `invitee_user_id`, `invitee_name`
- gap window opens automatically if `approved_count` now equals `max_players`
- auto-transition to `ready_to_start` if gap exhaustion condition is immediately met

The synchronous `User Service` check on both inviter and invitee enforces the
rule that an invite from or to a permanently blocked or deleted user behaves
as if it never existed, even before the asynchronous user-lifecycle cascade
has flipped the invite to `revoked`. Cascade-deleted accounts and
`permanent_block` sanctions surface as `subject_not_found`.

### Decline

The invited user issues `lobby.invite.decline`.

Pre-conditions:

- invite status is `created`

On success:

- invite `status` → `declined`
- no notification in v1

Declined users may receive a new invite from the owner while enrollment is open.

### Revoke

Owner issues `lobby.invite.revoke`.

Pre-conditions:

- invite status is `created`

On success:

- invite `status` → `revoked`
- no notification in v1

### Expire

Pending invites (`status=created`) are transitioned to `expired` automatically
when the game moves to `ready_to_start`.

`lobby.invite.expired` intent is published for each expired invite
(recipient: private-game owner)
with payload: `game_id`, `game_name`, `invitee_user_id`, `invitee_name`.

### Invite state machine

```text
created → redeemed
created → declined
created → revoked
created → expired
```

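
The invite state machine has a single non-terminal status, so a guard function is enough to enforce it. The sketch below is illustrative only; `InviteStatus`, `ErrInviteNotPending`, and `Transition` are hypothetical names rather than the service's actual API.

```go
package main

import (
	"errors"
	"fmt"
)

type InviteStatus string

const (
	InviteCreated  InviteStatus = "created"
	InviteRedeemed InviteStatus = "redeemed"
	InviteDeclined InviteStatus = "declined"
	InviteRevoked  InviteStatus = "revoked"
	InviteExpired  InviteStatus = "expired"
)

// ErrInviteNotPending is a hypothetical sentinel for commands that hit an
// invite which has already left the created status.
var ErrInviteNotPending = errors.New("invite is not in created status")

// Transition moves an invite out of created. Every other status is terminal,
// so any attempt to move a non-created invite is rejected.
func Transition(current, next InviteStatus) (InviteStatus, error) {
	if current != InviteCreated {
		return current, ErrInviteNotPending
	}
	switch next {
	case InviteRedeemed, InviteDeclined, InviteRevoked, InviteExpired:
		return next, nil
	}
	return current, fmt.Errorf("unsupported target status %q", next)
}

func main() {
	status, err := Transition(InviteCreated, InviteRedeemed)
	fmt.Println(status, err)

	// A second command against the same invite is rejected.
	status, err = Transition(status, InviteRevoked)
	fmt.Println(status, err)
}
```
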
## Membership Model

### Fields

| Field | Type | Notes |
| --- | --- | --- |
| `membership_id` | string | opaque, stable |
| `game_id` | string | reference to game |
| `user_id` | string | reference to platform user |
| `race_name` | string | confirmed in-game name as submitted (original casing) |
| `canonical_key` | string | canonicalized key under which the RND reservation is held |
| `status` | enum | `active`, `removed`, `blocked` |
| `joined_at` | int64 | UTC Unix milliseconds |
| `removed_at` | int64 | UTC Unix milliseconds; set on remove or block |

### Status vocabulary

| Status | Meaning |
| --- | --- |
| `active` | Full participant; may send commands through `Game Master` |
| `removed` | Permanently removed; engine slot deactivated after game start |
| `blocked` | Platform-level block; engine slot retained but commands blocked |

### Status transition table

| From | To | Trigger |
| --- | --- | --- |
| `active` | `removed` | explicit remove command from admin or owner (post-start) |
| `active` | `blocked` | explicit block command from admin or owner |

`removed` and `blocked` are terminal statuses. Pre-start remove drops the
membership record entirely rather than transitioning to `removed`
(see Removal rules below).

### Removal rules

Before game start:

- remove drops membership and releases the race name reservation

After game start:

- `blocked`: the player cannot send commands; engine keeps the player slot
- `removed`: `Game Lobby` marks membership `removed`; `Game Master` must also
  deactivate the player inside the engine; race name reservation remains until
  game is finished

This distinction is architectural and must remain explicit in all implementations.

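
The pre-start versus post-start distinction can be made explicit by returning a plan of side effects instead of branching inline. The following is a sketch with hypothetical names (`RemovalPlan`, `PlanRemove`, `PlanBlock`); it only models the rules listed in this section.

```go
package main

import "fmt"

// RemovalPlan is a hypothetical description of the side effects a remove or
// block command must carry out.
type RemovalPlan struct {
	DropRecord         bool   // delete the membership record entirely
	NewStatus          string // "" when the record is dropped
	ReleaseReservation bool   // release the race name reservation now
	DeactivateInEngine bool   // Game Master must deactivate the engine slot
}

// PlanRemove returns the effects of a remove command.
func PlanRemove(gameStarted bool) RemovalPlan {
	if !gameStarted {
		// Before start: the membership disappears and the name is freed.
		return RemovalPlan{DropRecord: true, ReleaseReservation: true}
	}
	// After start: the record stays as removed, the engine slot is
	// deactivated, and the reservation is kept until the game finishes.
	return RemovalPlan{NewStatus: "removed", DeactivateInEngine: true}
}

// PlanBlock returns the effects of a block command: the engine keeps the
// player slot and the reservation is untouched.
func PlanBlock() RemovalPlan {
	return RemovalPlan{NewStatus: "blocked"}
}

func main() {
	fmt.Printf("%+v\n", PlanRemove(false))
	fmt.Printf("%+v\n", PlanRemove(true))
	fmt.Printf("%+v\n", PlanBlock())
}
```
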
## Race Name Directory

### Purpose

`Race Name Directory` (RND) is the platform source of truth for all in-game
`race_name` values. It owns three levels of state per name:

- **registered**: permanent user-owned names. Once registered, the name is
  unavailable to any other user and cannot be released by the owner; only
  `permanent_block` or `DeleteUser` on the owning account frees it.
- **reservation**: a per-game holding created when a participant joins
  through application approval or invite redeem. Reservations are keyed by
  `(game_id, canonical_key)`. One user may hold the same name in multiple
  active games concurrently.
- **pending_registration**: a reservation that survived a capable finish and
  is now waiting up to 30 days for the owner to upgrade it into a registered
  name via `lobby.race_name.register`. Expiration releases the binding.

`User Service` does not store `race_name` values. It only exposes
`max_registered_race_names` in the eligibility snapshot and publishes
`user.lifecycle.permanent_blocked` / `user.lifecycle.deleted` events.

### Canonical key + confusable-pair policy

Every RND key is derived by
`racename.Canonicalize(raceName) (canonical string, err error)`, which lives in
`lobby/internal/domain/racename/policy.go`:

1. trim and validate the character set via `pkg/util/string.go:ValidateTypeName`;
2. lowercase Unicode fold;
3. apply the frozen confusable-pair replacement map (ported from the former
   `user/internal/ports/race_name_policy.go`).

A name is considered taken for the actor when the RND holds at least one
`registered`, active `reservation`, or `pending_registration` whose owner
differs from the actor on the same canonical key.

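
The three canonicalization steps and the "taken for the actor" rule can be sketched as follows. Everything here is an assumption made for illustration: the confusable pairs are invented and are not the project's frozen map, the character-set validation is reduced to a non-empty check in place of `ValidateTypeName`, `strings.ToLower` stands in for the Unicode fold, and `Binding` and `TakenFor` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var ErrInvalidName = errors.New("invalid race name")

// confusables is a stand-in for the frozen replacement map. These pairs are
// illustrative only and are NOT the project's real table.
var confusables = strings.NewReplacer("0", "o", "1", "l")

// Canonicalize sketches the three steps: trim and validate, lowercase,
// then apply the confusable replacements.
func Canonicalize(raceName string) (string, error) {
	trimmed := strings.TrimSpace(raceName)
	if trimmed == "" {
		// Placeholder for the real character-set validation.
		return "", ErrInvalidName
	}
	return confusables.Replace(strings.ToLower(trimmed)), nil
}

// Binding is a hypothetical view of one RND entry (registered, reservation,
// or pending registration).
type Binding struct {
	CanonicalKey string
	OwnerUserID  string
}

// TakenFor reports whether a user other than the actor holds the key.
func TakenFor(actorUserID, canonicalKey string, bindings []Binding) bool {
	for _, b := range bindings {
		if b.CanonicalKey == canonicalKey && b.OwnerUserID != actorUserID {
			return true
		}
	}
	return false
}

func main() {
	key, err := Canonicalize("  Z0RG ")
	fmt.Println(key, err)

	held := []Binding{{CanonicalKey: key, OwnerUserID: "user-a"}}
	fmt.Println(TakenFor("user-a", key, held), TakenFor("user-b", key, held))
}
```

Because the owner's own bindings never count as "taken", the same user can reserve one name in several games, which matches the reservation rule in the Purpose section.
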
### Port interface

```go
type RaceNameDirectory interface {
	Canonicalize(raceName string) (canonical string, err error)

	Check(ctx context.Context, raceName, actorUserID string) (Availability, error)

	Reserve(ctx context.Context, gameID, userID, raceName string) error
	ReleaseReservation(ctx context.Context, gameID, userID, raceName string) error

	MarkPendingRegistration(
		ctx context.Context,
		gameID, userID, raceName string,
		eligibleUntil time.Time,
	) error
	ExpirePendingRegistrations(ctx context.Context, now time.Time) ([]ExpiredPending, error)

	Register(ctx context.Context, gameID, userID, raceName string) error

	ListRegistered(ctx context.Context, userID string) ([]RegisteredName, error)
	ListPendingRegistrations(ctx context.Context, userID string) ([]PendingRegistration, error)
	ListReservations(ctx context.Context, userID string) ([]Reservation, error)

	ReleaseAllByUser(ctx context.Context, userID string) error
}

type Availability struct {
	Taken        bool
	HolderUserID string // "" when available
	Kind         string // "registered" | "reservation" | "pending_registration"
}
```

Sentinel errors: `ErrNameTaken`, `ErrInvalidName`, `ErrPendingMissing`,
`ErrPendingExpired`, `ErrQuotaExceeded`.

### v1 backends

- **PostgreSQL** (`lobby/internal/adapters/postgres/racenamedir/directory.go`):
  the production adapter; one row per binding under
  `lobby.race_names`, transactional writes guarded by
  `pg_advisory_xact_lock(hashtextextended(canonical_key, 0))`. See
  `docs/postgres-migration.md` §6B for the full schema and decision
  record.
- **In-memory** (`lobby/internal/adapters/racenameinmem/directory.go`):
  in-process implementation used by unit tests that do not need
  PostgreSQL and by deployments that select the in-memory backend with
  `LOBBY_RACE_NAME_DIRECTORY_BACKEND=stub` (the config token name is
  preserved for backward compatibility).

A future dedicated `Race Name Service` replaces the adapter without changing
the domain or service layer.

### Reservation lifecycle and capability

1. `approveapplication` / `redeeminvite` →
   `Reserve(game_id, user_id, race_name)`.
2. `removemember` before start → `ReleaseReservation`.
3. `removemember` / `blockmember` after start → reservation kept; resolved at
   `game_finished`.
4. On `game_finished` the capability evaluator runs per active membership:
   - `capable = max_planets > initial_planets AND max_population > initial_population`,
     using the per-game stats aggregate (see §Runtime Snapshot);
   - capable ⇒ `MarkPendingRegistration(..., finished_at + 30 days)` +
     `lobby.race_name.registration_eligible`;
   - not capable ⇒ `ReleaseReservation` + optional
     `lobby.race_name.registration_denied`.
5. The pending-registration worker
   (`LOBBY_RACE_NAME_EXPIRATION_INTERVAL`) releases expired entries.

### Registration flow

`lobby.race_name.register` → `POST /api/v1/lobby/race-names/register`:

- actor is the authenticated user;
- body: `{race_name, source_game_id}`;
- preconditions:
  - `pending_registration` exists for `(source_game_id, user_id, canonical_key)`
    with `eligible_until > now`;
  - `UserService.GetEligibility` snapshot: no `permanent_block`,
    `current_registered_count < max_registered_race_names` (a snapshot value
    of `0` denotes unlimited);
- commit: `RND.Register` atomically deletes the pending entry, creates a
  registered entry, and publishes `lobby.race_name.registered`.

Errors: `race_name_registration_quota_exceeded`,
`race_name_pending_window_expired`, `subject_not_found`, `forbidden`.

### Self-service reads

`lobby.race_names.list` → `GET /api/v1/lobby/my/race-names` returns the
acting user's `{registered[], pending[], reservations[]}` using the
`user_registered` / `user_reservations` indexes (no full scan).

The response shape is fixed by `api/public-openapi.yaml` and carries:

- `registered[]`: `canonical_key`, `race_name`, `source_game_id`,
  `registered_at_ms`;
- `pending[]`: `canonical_key`, `race_name`, `source_game_id` (the
  game whose capable finish promoted the reservation),
  `reserved_at_ms`, `eligible_until_ms`;
- `reservations[]`: `canonical_key`, `race_name`, `game_id`,
  `reserved_at_ms`, `game_status` (current `game.Status` of the
  hosting game, joined on read).

Each slice is sorted ascending by its time field with `canonical_key`
as the tie-breaker so the wire output is stable. The endpoint is
exclusively self-service: there is no `?user_id=` parameter and no
admin counterpart on the internal port. Visibility is enforced by the
`X-User-ID` header alone.

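
The stable ordering can be expressed as a two-key comparator. The sketch below covers `reservations[]` only and uses a trimmed, hypothetical `Reservation` struct; the other two slices would follow the same pattern with their own time field.

```go
package main

import (
	"fmt"
	"sort"
)

// Reservation is a trimmed, hypothetical view of one reservations[] item.
type Reservation struct {
	CanonicalKey string
	ReservedAtMs int64
}

// SortReservations orders items by reserved_at_ms ascending and breaks ties
// by canonical_key, so equal timestamps still produce a stable wire order.
func SortReservations(rs []Reservation) {
	sort.Slice(rs, func(i, j int) bool {
		if rs[i].ReservedAtMs != rs[j].ReservedAtMs {
			return rs[i].ReservedAtMs < rs[j].ReservedAtMs
		}
		return rs[i].CanonicalKey < rs[j].CanonicalKey
	})
}

func main() {
	rs := []Reservation{
		{CanonicalKey: "zorg", ReservedAtMs: 200},
		{CanonicalKey: "beta", ReservedAtMs: 100},
		{CanonicalKey: "alpha", ReservedAtMs: 200},
	}
	SortReservations(rs)
	fmt.Println(rs)
}
```
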
### Cascade release

`Game Lobby` consumes `user:lifecycle_events` through a dedicated worker. On
`user.lifecycle.permanent_blocked` or `user.lifecycle.deleted`:

- `RND.ReleaseAllByUser(user_id)` clears every registered, reservation, and
  pending entry owned by the user;
- every active membership held by the user transitions to `blocked`. For each
  such membership in a third-party private game, a `lobby.membership.blocked`
  intent is published to the game owner;
- every outstanding `submitted` application authored by the user is rejected;
- every `created` invite where the user is invitee or inviter transitions to
  `revoked`;
- every non-terminal game owned by the user transitions to `cancelled` via
  the `external_block` trigger. For in-flight games (`starting`, `running`,
  `paused`) a stop-job is published to Runtime Manager before the status
  transition.

Synchronous guard: `lobby.invite.redeem` calls `UserService.GetEligibility`
for both the inviter and the invitee. If either party has been permanently
blocked or soft-deleted, the redeem fails with `subject_not_found`, matching
the "as if the invite never existed" semantic even before the cascade
flips the invite to `revoked`.

### Retry and release semantics

- `Reserve` is idempotent for the same holder under the same game. A second
  call returns no error so that `approveapplication` and `redeeminvite`
  retries after transient upstream failures stay safe.
- `ReleaseReservation` is a no-op when no reservation exists for the tuple
  and also when the reservation belongs to a different user. Defensive
  release paths (`rejectapplication`, `revokeinvite`, `declineinvite`) never
  surface an error.
- `Register` is idempotent only for the same `(game_id, user_id, race_name)`
  tuple: repeated calls after success return the same registered record
  without consuming additional quota.
- `MarkPendingRegistration` is idempotent when called with the same
  `eligible_until`; re-emitting it with a different timestamp returns
  `ErrInvalidName`.

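
The idempotency rules for `Reserve` and `ReleaseReservation` can be demonstrated with a toy in-memory store. This is a sketch under simplifying assumptions: it tracks reservations only (no registered or pending entries), takes an already canonicalized key, omits the `context.Context` parameter of the real port, and is not the project's `racenameinmem` adapter.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrNameTaken = errors.New("race name taken")

type slot struct{ gameID, canonicalKey string }

// Directory is a toy store that holds per-game reservations only.
type Directory struct {
	mu      sync.Mutex
	holders map[slot]string // slot to holder user ID
}

func NewDirectory() *Directory {
	return &Directory{holders: make(map[slot]string)}
}

// Reserve succeeds when no other user holds the key in any game. Repeating
// the call for the same holder and game is a harmless overwrite.
func (d *Directory) Reserve(gameID, userID, canonicalKey string) error {
	d.mu.Lock()
	defer d.mu.Unlock()
	for s, holder := range d.holders {
		if s.canonicalKey == canonicalKey && holder != userID {
			return ErrNameTaken
		}
	}
	d.holders[slot{gameID, canonicalKey}] = userID
	return nil
}

// ReleaseReservation never fails: it is a no-op when the slot is empty or
// when the reservation belongs to a different user.
func (d *Directory) ReleaseReservation(gameID, userID, canonicalKey string) error {
	d.mu.Lock()
	defer d.mu.Unlock()
	s := slot{gameID, canonicalKey}
	if holder, ok := d.holders[s]; ok && holder == userID {
		delete(d.holders, s)
	}
	return nil
}

func main() {
	d := NewDirectory()
	fmt.Println(d.Reserve("game-1", "user-a", "zorg")) // first call
	fmt.Println(d.Reserve("game-1", "user-a", "zorg")) // retry by the same holder
	fmt.Println(d.Reserve("game-2", "user-b", "zorg")) // different user
}
```
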
## Game Start Flow

The start sequence spans three services and must be treated as a distributed
transaction with explicit failure handling.

```mermaid
sequenceDiagram
    participant Admin as Admin or Private Owner
    participant Lobby
    participant Runtime
    participant GM as Game Master
    participant Redis

    Admin->>Lobby: lobby.game.start
    Lobby->>Lobby: validate ready_to_start + roster
    Lobby->>Lobby: status → starting
    Lobby->>Redis: publish start job to runtime:start_jobs
    Runtime->>Runtime: start container
    Runtime->>Redis: publish result to runtime:job_results

    alt container start failed
        Lobby->>Lobby: status → start_failed
    else container started
        Lobby->>Lobby: persist runtime binding
        Lobby->>GM: POST /api/v1/internal/games/{game_id}/register-runtime (sync)
        alt GM registration success
            GM-->>Lobby: 200 OK
            Lobby->>Lobby: status → running, set started_at
        else GM unavailable
            GM-->>Lobby: error / timeout
            Lobby->>Lobby: status → paused
            Lobby->>Redis: publish lobby.runtime_paused_after_start intent
        end
    end
```

### Critical invariants

- If the container starts but `Lobby` cannot persist the runtime binding metadata,
  the start is a full failure: `Lobby` must issue a stop job to `Runtime Manager`
  with `reason=orphan_cleanup` before setting `start_failed`.
- If metadata is persisted but `Game Master` is unavailable, the game must be
  placed in `paused`, not in `start_failed`. The container is alive; only the
  platform tracking is incomplete.
- No start job is accepted while the game is not in `ready_to_start`.
- Concurrent start attempts for the same game must be serialized; the second
  attempt must fail if the first already moved the game to `starting`.

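The first two invariants amount to a small decision table over three checkpoints. The sketch below is only an illustration of that table: `StartResult`, `Outcome`, and `Decide` are hypothetical names, and the real start flow is spread across a command handler and a stream consumer rather than a single function.

```go
package main

import "fmt"

// StartResult records the three checkpoints of the start sequence.
// All names here are illustrative, not taken from the Lobby codebase.
type StartResult struct {
	ContainerStarted bool // runtime job result reported success
	BindingPersisted bool // Lobby stored the runtime binding metadata
	GMRegistered     bool // synchronous Game Master registration succeeded
}

// Outcome is what Lobby should do next.
type Outcome struct {
	Status        string // next platform status
	OrphanCleanup bool   // issue a stop job with reason=orphan_cleanup first
	NotifyAdmins  bool   // publish lobby.runtime_paused_after_start
}

// Decide maps checkpoints to the documented outcomes.
func Decide(r StartResult) Outcome {
	switch {
	case !r.ContainerStarted:
		return Outcome{Status: "start_failed"}
	case !r.BindingPersisted:
		// Container is alive but untracked: stop it before failing the start.
		return Outcome{Status: "start_failed", OrphanCleanup: true}
	case !r.GMRegistered:
		// Container is alive and tracked: pause instead of failing.
		return Outcome{Status: "paused", NotifyAdmins: true}
	default:
		return Outcome{Status: "running"}
	}
}

func main() {
	fmt.Printf("%+v\n", Decide(StartResult{ContainerStarted: true}))
	fmt.Printf("%+v\n", Decide(StartResult{ContainerStarted: true, BindingPersisted: true}))
}
```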
### Runtime Manager envelopes

`Lobby` is the producer for both `runtime:start_jobs` and `runtime:stop_jobs`.
The `Lobby ↔ Runtime Manager` transport stays asynchronous indefinitely; there
is no synchronous `Lobby` → `Runtime Manager` REST call in v1, and none is
planned for v2.

`runtime:start_jobs` envelope:

| Field | Type | Notes |
| --- | --- | --- |
| `game_id` | string | Lobby `game_id`. |
| `image_ref` | string | Docker reference resolved from `target_engine_version` via `LOBBY_ENGINE_IMAGE_TEMPLATE`. |
| `requested_at_ms` | int64 | UTC milliseconds; diagnostics only. |

`runtime:stop_jobs` envelope:

| Field | Type | Notes |
| --- | --- | --- |
| `game_id` | string | Lobby `game_id`. |
| `reason` | enum | `orphan_cleanup`, `cancelled`, `finished`, `admin_request`, `timeout`. |
| `requested_at_ms` | int64 | UTC milliseconds. |

`reason` semantics (Lobby producer side):

- `orphan_cleanup` — used by Lobby's runtime-job-result consumer to release a
  container whose metadata persistence failed after a successful container
  start.
- `cancelled` — used by the user-lifecycle cascade and by explicit cancel paths
  for in-flight games.
- `finished` — reserved; not produced by Lobby in v1 because `game_finished`
  is engine-driven and stop jobs after finish are an Admin/GM concern.
- `admin_request` — reserved for future admin-initiated stop paths through
  Lobby; not produced in v1.
- `timeout` — reserved for future enrollment-timeout-driven stop paths; not
  produced in v1.

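A minimal sketch of how the `reason` enum and the stop-job envelope might look in Go follows. The constant names, the `ProducedInV1` helper, and the `StopJob` struct are assumptions for illustration; only the five string values, the three envelope fields, and the existence of a `Validate` method come from this document.

```go
package main

import "fmt"

// StopReason mirrors the reason discriminator on runtime:stop_jobs.
type StopReason string

// Constant names are hypothetical; the string values are the documented ones.
const (
	StopReasonOrphanCleanup StopReason = "orphan_cleanup"
	StopReasonCancelled     StopReason = "cancelled"
	StopReasonFinished      StopReason = "finished"
	StopReasonAdminRequest  StopReason = "admin_request"
	StopReasonTimeout       StopReason = "timeout"
)

// Validate accepts only the five documented values.
func (r StopReason) Validate() error {
	switch r {
	case StopReasonOrphanCleanup, StopReasonCancelled, StopReasonFinished,
		StopReasonAdminRequest, StopReasonTimeout:
		return nil
	default:
		return fmt.Errorf("unknown stop reason %q", string(r))
	}
}

// ProducedInV1 is a hypothetical helper: per the list above, Lobby emits
// only orphan_cleanup and cancelled in v1; the rest are reserved.
func (r StopReason) ProducedInV1() bool {
	return r == StopReasonOrphanCleanup || r == StopReasonCancelled
}

// StopJob carries the three documented envelope fields.
type StopJob struct {
	GameID        string
	Reason        StopReason
	RequestedAtMs int64
}

func main() {
	job := StopJob{GameID: "game-123", Reason: StopReasonOrphanCleanup, RequestedAtMs: 1700000000000}
	fmt.Println(job.Reason.Validate() == nil, job.Reason.ProducedInV1())
	fmt.Println(StopReason("bogus").Validate())
}
```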
### Design rationale: StopReason placement

The `StopReason` enum is declared in
`lobby/internal/ports/runtimemanager.go` alongside the `RuntimeManager`
interface that consumes it. The enum is publisher-side protocol: it
mirrors the AsyncAPI discriminator on `runtime:stop_jobs`, has no
behaviour beyond `Validate`, and co-locating it with the interface keeps
the AsyncAPI ↔ Go mapping visible in one file.

Alternatives considered and rejected:

- a dedicated `lobby/internal/domain/runtimejob` package — manufactures
  a domain layer for a single string enum that exists only to be
  serialised onto a Redis Stream;
- placing the enum in the publisher adapter package
  (`lobby/internal/adapters/runtimemanager`) — the callers (start-game
  service, runtime-job-result worker, user-lifecycle worker) live
  outside that package and would have to depend on a concrete adapter
  for an enum value.

### Design rationale: `engineimage.Resolver` validates the template at construction

`engineimage.Resolver` stores the validated template; the per-game
`Resolve(version)` call is therefore a pure string substitution that
cannot fail except on an empty `version`.

`LOBBY_ENGINE_IMAGE_TEMPLATE` is loaded at startup. A malformed value
(missing `{engine_version}` placeholder, empty string) is an
operational misconfiguration that fails fast before any traffic arrives
— not on the first start-game request hours later. The synchronous
start handler then incurs no per-call template-shape recheck.

A stateless free function `engineimage.Resolve(template, version)` was
rejected: the only useful checkpoint for the template literal is at
startup; a free function would either re-validate on every call (waste)
or skip validation (regression).

The resolver only guards against an empty/whitespace `version`. Semver
validation lives in `lobby/internal/domain/game/model.go:validateSemver`
and runs at game-record construction time. Re-running it inside the
resolver would either duplicate the rule (drift risk) or import the
validator across package boundaries for no behavioural gain. Keeping the
resolver narrow leaves it reusable from a future producer (for example
`Game Master`, when it takes over `image_ref` resolution) without
dragging Lobby's domain rules along.

The defensive `return start game: resolve image ref: %w` in
`startgame.Service.Handle` is a guard against a future invariant
violation; it is not exercised by the service-level test suite because
the only resolver-failure mode (empty `version`) requires bypassing
`game.Validate`, which `gameinmem.Save` always runs. Adding test
scaffolding to skip validation would teach the test suite a back door
that the production code path does not have.

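The validate-at-construction design described in this rationale can be sketched as follows. This is not the actual `engineimage` package: the constructor name `NewResolver`, the error texts, and the use of `strings.ReplaceAll` are assumptions. Only the `{engine_version}` placeholder, the startup-time check, and the empty-version guard come from the text.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// placeholder is the literal the template must contain.
const placeholder = "{engine_version}"

// Resolver holds a template that was validated once, at construction.
type Resolver struct{ template string }

// NewResolver (hypothetical name) rejects an empty template or one that
// lacks the placeholder, so misconfiguration surfaces at startup.
func NewResolver(template string) (*Resolver, error) {
	if strings.TrimSpace(template) == "" {
		return nil, errors.New("engine image template is empty")
	}
	if !strings.Contains(template, placeholder) {
		return nil, fmt.Errorf("engine image template %q lacks %s", template, placeholder)
	}
	return &Resolver{template: template}, nil
}

// Resolve is a plain substitution; it only guards against a blank version
// and deliberately performs no semver validation.
func (r *Resolver) Resolve(version string) (string, error) {
	if strings.TrimSpace(version) == "" {
		return "", errors.New("engine version is empty")
	}
	return strings.ReplaceAll(r.template, placeholder, version), nil
}

func main() {
	r, err := NewResolver("galaxy/game:{engine_version}")
	if err != nil {
		panic(err)
	}
	ref, _ := r.Resolve("1.4.2")
	fmt.Println(ref) // galaxy/game:1.4.2
}
```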
## Paused State

`Lobby.paused` is a platform-level pause, distinct from `Game Master` runtime
failure states. Two paths lead to `paused`:

### Voluntary pause

Admin or owner issues `lobby.game.pause` while the game is `running`.
Resume is issued with `lobby.game.resume`; `Lobby` performs a synchronous
liveness check against `Game Master` before transitioning back to `running`.

### Forced pause (GM unavailable after start)

If the game start sequence succeeds at the runtime layer but `Game Master`
registration fails, `Lobby` transitions to `paused` and publishes
`lobby.runtime_paused_after_start` to administrators.

Administrators investigate, restore `Game Master`, and issue `lobby.game.resume`
through the internal admin surface.

## Game Finish Flow

`Game Master` publishes a `game_finished` event to the GM events Redis Stream
when the engine reports that the game has ended.

`Lobby` consumes this event and, before advancing the stream offset:

- transitions game status to `finished`
- sets `finished_at` to the event timestamp
- updates the denormalized runtime snapshot with the final values
- runs the capability evaluator against every `active` membership:
  - `capable = max_planets > initial_planets AND max_population > initial_population`
    from the per-member stats aggregate
  - capable ⇒ `RND.MarkPendingRegistration(game_id, user_id, race_name, finished_at + 30 days)`
    and publish `lobby.race_name.registration_eligible`
  - not capable ⇒ `RND.ReleaseReservation(game_id, user_id, race_name)` and
    optionally publish `lobby.race_name.registration_denied`
- resolves outstanding reservations on `removed` and `blocked` memberships by
  calling `RND.ReleaseReservation` (post-start remove/block keeps the
  reservation alive specifically so capability evaluation resolves it here)
- deletes the per-game stats aggregate

The `game_finished` event from `Game Master` is the sole trigger for the
`finished` status. `Lobby` does not independently decide that a game is
finished. Capability evaluation must be idempotent: a replayed
`game_finished` event must not produce additional RND side effects or
notifications.

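The capability rule and the replay guard can be sketched together. The sketch is illustrative only: `Evaluator`, `Decision`, and the in-memory `done` set are hypothetical stand-ins (the set plays the role of a per-game "evaluation done" sentinel), and timestamps are simplified to Unix milliseconds. The 30-day window and the `capable` formula come from the flow above.

```go
package main

import (
	"fmt"
	"time"
)

// PendingRegistrationWindow is the 30-day eligibility window.
const PendingRegistrationWindow = 30 * 24 * time.Hour

// MemberStats holds the aggregate fields the capability rule needs.
type MemberStats struct {
	InitialPlanets, MaxPlanets       int64
	InitialPopulation, MaxPopulation int64
}

// Capable applies the rule: both planets and population must have grown.
func Capable(s MemberStats) bool {
	return s.MaxPlanets > s.InitialPlanets && s.MaxPopulation > s.InitialPopulation
}

// Decision is the per-member result (hypothetical shape).
type Decision struct {
	Capable         bool
	EligibleUntilMs int64 // zero when not capable
}

// Evaluator remembers which games were already evaluated so that a
// replayed game_finished event produces no further side effects.
type Evaluator struct{ done map[string]bool }

func NewEvaluator() *Evaluator { return &Evaluator{done: map[string]bool{}} }

// Evaluate returns one decision per member, or nil for a replayed event.
func (e *Evaluator) Evaluate(gameID string, finishedAtMs int64, members map[string]MemberStats) map[string]Decision {
	if e.done[gameID] {
		return nil
	}
	out := make(map[string]Decision, len(members))
	for userID, s := range members {
		d := Decision{Capable: Capable(s)}
		if d.Capable {
			d.EligibleUntilMs = finishedAtMs + PendingRegistrationWindow.Milliseconds()
		}
		out[userID] = d
	}
	e.done[gameID] = true
	return out
}

func main() {
	e := NewEvaluator()
	members := map[string]MemberStats{
		"user-a": {InitialPlanets: 1, MaxPlanets: 3, InitialPopulation: 100, MaxPopulation: 250},
		"user-b": {InitialPlanets: 1, MaxPlanets: 3, InitialPopulation: 100, MaxPopulation: 100},
	}
	fmt.Println(len(e.Evaluate("game-1", 1000, members))) // first delivery
	fmt.Println(e.Evaluate("game-1", 1000, members) == nil) // replay
}
```

Note that `user-b` is not capable even though its planet count grew: the rule requires strict growth in both dimensions.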
## Runtime Snapshot

`Game Lobby` stores a denormalized runtime snapshot on the game record to
prevent fan-out reads to `Game Master` on every user-facing list or detail
request, and aggregates per-member stats to support capability evaluation at
game finish.

### Denormalized snapshot fields

| Field | Source |
| --- | --- |
| `current_turn` | GM event `runtime_snapshot_update` |
| `runtime_status` | GM event `runtime_snapshot_update` |
| `engine_health_summary` | GM event `runtime_snapshot_update` |

### Per-member stats aggregate

Each `runtime_snapshot_update` carries a `player_turn_stats` array with one
entry per active member: `{user_id, planets, population, ships_built}`.
`Lobby` aggregates these in `lobby:game_turn_stats:<game_id>:<user_id>` with
the shape
`{initial_planets, initial_population, initial_ships_built, max_planets, max_population, max_ships_built}`.

Rules:

- `initial_*` values are frozen from the first event after
  `starting → running`; later events must not change them.
- `max_*` values are maintained by max-semantic update; they never decrease.
- the aggregate is read once by the capability evaluator at `game_finished`
  and then deleted.

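The two update rules (frozen `initial_*`, monotonic `max_*`) reduce to a small fold over incoming entries. The following is a sketch under assumptions: the `Apply` function and the Go struct names are hypothetical, and a `nil` previous aggregate stands for "no event seen yet for this member".

```go
package main

import "fmt"

// TurnStats is one player_turn_stats entry without the user_id.
type TurnStats struct{ Planets, Population, ShipsBuilt int64 }

// Aggregate mirrors the documented aggregate shape.
type Aggregate struct {
	InitialPlanets, InitialPopulation, InitialShipsBuilt int64
	MaxPlanets, MaxPopulation, MaxShipsBuilt             int64
}

func maxInt64(a, b int64) int64 {
	if a > b {
		return a
	}
	return b
}

// Apply folds one entry into the aggregate. The first entry seeds both
// initial_* and max_*; later entries only ever raise max_*.
func Apply(prev *Aggregate, s TurnStats) Aggregate {
	if prev == nil {
		return Aggregate{
			InitialPlanets: s.Planets, InitialPopulation: s.Population, InitialShipsBuilt: s.ShipsBuilt,
			MaxPlanets: s.Planets, MaxPopulation: s.Population, MaxShipsBuilt: s.ShipsBuilt,
		}
	}
	next := *prev // initial_* fields are copied unchanged
	next.MaxPlanets = maxInt64(prev.MaxPlanets, s.Planets)
	next.MaxPopulation = maxInt64(prev.MaxPopulation, s.Population)
	next.MaxShipsBuilt = maxInt64(prev.MaxShipsBuilt, s.ShipsBuilt)
	return next
}

func main() {
	a := Apply(nil, TurnStats{Planets: 1, Population: 100, ShipsBuilt: 0})
	a = Apply(&a, TurnStats{Planets: 4, Population: 90, ShipsBuilt: 2})
	a = Apply(&a, TurnStats{Planets: 2, Population: 300, ShipsBuilt: 2})
	fmt.Printf("%+v\n", a)
}
```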
### Update mechanism

`Game Master` publishes events to a dedicated Redis Stream consumed by `Lobby`:

- `runtime_snapshot_update`: carries updated `current_turn`, `runtime_status`,
  `engine_health_summary`, and `player_turn_stats`; `Lobby` applies a
  compare-and-swap update on the game record plus a stats aggregate upsert.
- `game_finished`: carries final snapshot values and signals the finish
  transition; capability evaluator (see §Game Finish Flow) runs before the
  stream offset is advanced.

`Lobby` does not expose the runtime snapshot update as an internal HTTP
endpoint. All snapshot updates are asynchronous and delivered through the
stream.

## Public vs Private Game Rules

### Public games

- created and controlled by system administrators through the internal admin surface
- visible in the public game list when in `enrollment_open`, `ready_to_start`,
  `running`, or `finished` status
- `draft` public games are not visible to non-admin users
- players join through the application flow; admission requires admin approval
- turn schedule and engine version are set by the administrator

### Private games

- created only by eligible paid users whose `User Service` eligibility snapshot
  carries `can_create_private_game=true` and whose `max_owned_private_games`
  limit allows it
- visible only to the owner and to users who have an active membership or a
  non-expired invite
- `draft` private games are visible only to the owner
- players join through the invite flow; invite redemption creates active
  membership immediately without further owner approval
- owner manages invites, turn schedule, and engine version

## Owner-Admin Capabilities

Private-game owners have a limited owner-admin capability set over their own
games only:

- open enrollment (`draft` → `enrollment_open`)
- create and revoke invites
- manually close enrollment (`enrollment_open` → `ready_to_start`)
- start the game (`ready_to_start` → `starting`)
- pause and resume the game (`running` ↔ `paused`)
- retry start or cancel after `start_failed`
- remove or block members
- cancel the game (from `draft`, `enrollment_open`, `ready_to_start`, `start_failed`)

Owners do not have system-admin power.
They cannot see or operate on other users' private games.
They cannot approve or reject applications (applications are public-game only).

## Trusted Surfaces

### Public authenticated REST (gateway-facing)

All user-facing commands arrive through `Edge Gateway`.
Gateway verifies the authenticated session, transcodes the FlatBuffers command
to a trusted REST call, and forwards it to `Lobby` on the public port.

Gateway enriches each request with the authenticated `user_id` via the
`X-User-ID` header.
`Lobby` must never derive the acting user from the request payload.

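One way to honor the "header only, never the payload" rule is to resolve the acting user in a single middleware and hand it to handlers through the request context. The sketch below uses only the Go standard library; `WithActingUser`, `ActingUser`, and the `401` response for a missing header are assumptions, not the service's actual behaviour. The `probe` function exists only to exercise the middleware in this example.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// actingUserKey is an unexported context key type to avoid collisions.
type actingUserKey struct{}

// WithActingUser (hypothetical) copies X-User-ID into the request context.
// Rejecting a missing header with 401 is an assumption of this sketch.
func WithActingUser(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		userID := r.Header.Get("X-User-ID")
		if userID == "" {
			http.Error(w, "missing X-User-ID", http.StatusUnauthorized)
			return
		}
		ctx := context.WithValue(r.Context(), actingUserKey{}, userID)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// ActingUser is the only accessor handlers should use for the caller identity.
func ActingUser(ctx context.Context) (string, bool) {
	v, ok := ctx.Value(actingUserKey{}).(string)
	return v, ok
}

// probe is a demo helper: it sends one request through the middleware and
// reports the status code plus the user the inner handler observed.
func probe(header string) (int, string) {
	var seen string
	h := WithActingUser(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen, _ = ActingUser(r.Context())
		w.WriteHeader(http.StatusNoContent)
	}))
	req := httptest.NewRequest(http.MethodPost, "/api/v1/lobby/games/game-1/start", nil)
	if header != "" {
		req.Header.Set("X-User-ID", header)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, seen
}

func main() {
	fmt.Println(probe("user-42"))
	fmt.Println(probe(""))
}
```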
#### Message type catalog

| `message_type` | Method | Path | Actor |
| --- | --- | --- | --- |
| `lobby.game.create` | `POST` | `/api/v1/lobby/games` | admin (public), eligible user (private) |
| `lobby.game.update` | `PATCH` | `/api/v1/lobby/games/{game_id}` | admin or owner; draft only |
| `lobby.game.get` | `GET` | `/api/v1/lobby/games/{game_id}` | any authenticated user (visibility rules apply) |
| `lobby.games.list` | `GET` | `/api/v1/lobby/games` | any authenticated user |
| `lobby.game.open_enrollment` | `POST` | `/api/v1/lobby/games/{game_id}/open-enrollment` | admin or owner |
| `lobby.game.ready_to_start` | `POST` | `/api/v1/lobby/games/{game_id}/ready-to-start` | admin or owner |
| `lobby.game.start` | `POST` | `/api/v1/lobby/games/{game_id}/start` | admin or owner |
| `lobby.game.pause` | `POST` | `/api/v1/lobby/games/{game_id}/pause` | admin or owner |
| `lobby.game.resume` | `POST` | `/api/v1/lobby/games/{game_id}/resume` | admin or owner |
| `lobby.game.cancel` | `POST` | `/api/v1/lobby/games/{game_id}/cancel` | admin or owner |
| `lobby.game.retry_start` | `POST` | `/api/v1/lobby/games/{game_id}/retry-start` | admin or owner |
| `lobby.application.submit` | `POST` | `/api/v1/lobby/games/{game_id}/applications` | authenticated user |
| `lobby.application.approve` | `POST` | `/api/v1/lobby/games/{game_id}/applications/{application_id}/approve` | admin |
| `lobby.application.reject` | `POST` | `/api/v1/lobby/games/{game_id}/applications/{application_id}/reject` | admin |
| `lobby.invite.create` | `POST` | `/api/v1/lobby/games/{game_id}/invites` | private-game owner |
| `lobby.invite.redeem` | `POST` | `/api/v1/lobby/games/{game_id}/invites/{invite_id}/redeem` | invited user |
| `lobby.invite.decline` | `POST` | `/api/v1/lobby/games/{game_id}/invites/{invite_id}/decline` | invited user |
| `lobby.invite.revoke` | `POST` | `/api/v1/lobby/games/{game_id}/invites/{invite_id}/revoke` | private-game owner |
| `lobby.membership.remove` | `POST` | `/api/v1/lobby/games/{game_id}/memberships/{membership_id}/remove` | admin or owner |
| `lobby.membership.block` | `POST` | `/api/v1/lobby/games/{game_id}/memberships/{membership_id}/block` | admin or owner |
| `lobby.memberships.list` | `GET` | `/api/v1/lobby/games/{game_id}/memberships` | admin, owner, or active member |
| `lobby.my_games.list` | `GET` | `/api/v1/lobby/my/games` | authenticated user |
| `lobby.my_applications.list` | `GET` | `/api/v1/lobby/my/applications` | authenticated user |
| `lobby.my_invites.list` | `GET` | `/api/v1/lobby/my/invites` | authenticated user |
| `lobby.race_name.register` | `POST` | `/api/v1/lobby/race-names/register` | authenticated user |
| `lobby.race_names.list` | `GET` | `/api/v1/lobby/my/race-names` | authenticated user |

### Internal trusted REST (internal-facing)

The internal port is not reachable from the public internet.
It is used by `Game Master` for game-detail and membership reads and by the
administrative backend for admin-only operations.

Key internal endpoints:

| Method | Path | Purpose |
| --- | --- | --- |
| `GET` | `/api/v1/internal/games/{game_id}` | game detail read for GM/admin |
| `GET` | `/api/v1/internal/games/{game_id}/memberships` | full membership list for GM |
| `GET` | `/api/v1/internal/healthz` | health probe |
| `GET` | `/api/v1/internal/readyz` | readiness probe |

Note: the registration call from Lobby to Game Master after a successful
container start is **outgoing** — Lobby calls
`POST /api/v1/internal/games/{game_id}/register-runtime` on Game Master's
internal port. Lobby does not expose an inbound `register-runtime`
endpoint.

Admin-only operations (approve, reject, cancel, create public games, etc.) are
also exposed on the internal port and are intended to be called by `Admin Service`
after it enforces the system-admin role check at the gateway boundary.

## User-Facing Lists

### My active games

Returns games where the authenticated user has an active membership and the game
status is `running` or `paused`.
Response includes the denormalized runtime snapshot.

### My pending applications

Returns applications submitted by the authenticated user with status `submitted`.
Includes game name and type for display.

### My open invitations

Returns invites addressed to the authenticated user with status `created`.
Includes game name, inviter name, and `expires_at`.

### Public game list

Paginated list of public games with status in
`enrollment_open`, `ready_to_start`, `running`, or `finished`.
Games in `draft` or `cancelled` are excluded.
Default order: `enrollment_open` and `ready_to_start` first, then `running`, then
`finished` (most recent first within each group).

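The default order of the public game list can be expressed as a filter plus a two-level sort. This is a sketch only: `ListedGame` and `RecencyMs` are hypothetical (the text does not say which timestamp defines "most recent"), other listable statuses are dropped on the assumption that only the four named ones appear, and pagination is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// ListedGame is a hypothetical projection of a public game for listing.
type ListedGame struct {
	GameID    string
	Status    string
	RecencyMs int64 // assumed recency timestamp; larger means more recent
}

// rank groups statuses per the documented order; -1 means "not listed".
func rank(status string) int {
	switch status {
	case "enrollment_open", "ready_to_start":
		return 0
	case "running":
		return 1
	case "finished":
		return 2
	default:
		return -1 // draft, cancelled, and anything else are excluded
	}
}

// PublicList filters out unlisted statuses and applies the default order.
func PublicList(games []ListedGame) []ListedGame {
	out := make([]ListedGame, 0, len(games))
	for _, g := range games {
		if rank(g.Status) >= 0 {
			out = append(out, g)
		}
	}
	sort.SliceStable(out, func(i, j int) bool {
		ri, rj := rank(out[i].Status), rank(out[j].Status)
		if ri != rj {
			return ri < rj
		}
		return out[i].RecencyMs > out[j].RecencyMs // most recent first
	})
	return out
}

func main() {
	games := []ListedGame{
		{"game-a", "finished", 50},
		{"game-b", "running", 10},
		{"game-c", "draft", 99},
		{"game-d", "ready_to_start", 20},
		{"game-e", "enrollment_open", 30},
	}
	for _, g := range PublicList(games) {
		fmt.Println(g.GameID, g.Status)
	}
}
```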
### Visibility rules

- private `draft` games: visible only to the owner
- private non-draft games: visible only to the owner and users with active
  membership or non-expired invite
- public `draft` games: visible only to system administrators
- public non-draft games: visible in the public list

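The four visibility rules fit in one predicate. The sketch below is an illustration under assumptions: the `Game` and `Viewer` types and the `"public"` / `"private"` type strings are hypothetical, administrators get no special access to private games because the rules above do not mention any, and the status filter of the public list is treated as a separate concern.

```go
package main

import "fmt"

// Game holds only the fields the visibility decision needs (hypothetical).
type Game struct {
	Type    string // "public" or "private" (assumed values)
	Status  string
	OwnerID string
}

// Viewer describes the caller's relation to the game (hypothetical).
type Viewer struct {
	UserID        string
	IsAdmin       bool
	ActiveMember  bool
	HasLiveInvite bool // holds a non-expired invite
}

// Visible applies the four documented rules.
func Visible(g Game, v Viewer) bool {
	switch g.Type {
	case "private":
		if g.OwnerID == v.UserID {
			return true
		}
		if g.Status == "draft" {
			return false // private drafts: owner only
		}
		return v.ActiveMember || v.HasLiveInvite
	case "public":
		if g.Status == "draft" {
			return v.IsAdmin // public drafts: system administrators only
		}
		return true
	default:
		return false
	}
}

func main() {
	private := Game{Type: "private", Status: "enrollment_open", OwnerID: "user-1"}
	fmt.Println(Visible(private, Viewer{UserID: "user-1"}))
	fmt.Println(Visible(private, Viewer{UserID: "user-2", HasLiveInvite: true}))
	fmt.Println(Visible(private, Viewer{UserID: "user-3"}))
}
```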
## Notification Contracts

`Game Lobby` publishes normalized notification intents to `notification:intents`
using the `galaxy/notificationintent` producer module.

| Trigger | `notification_type` | Audience | Channels |
| --- | --- | --- | --- |
| Application submitted (public game) | `lobby.application.submitted` | configured admin email list | `email` |
| Application approved | `lobby.membership.approved` | applicant user | `push+email` |
| Application rejected | `lobby.membership.rejected` | applicant user | `push+email` |
| Cascade membership block (`permanent_block`/`DeleteUser`) | `lobby.membership.blocked` | private-game owner | `push+email` |
| Invite created (private game) | `lobby.invite.created` | invited user | `push+email` |
| Invite redeemed (private game) | `lobby.invite.redeemed` | private-game owner | `push+email` |
| Invite expired (on enrollment close) | `lobby.invite.expired` | private-game owner | `email` |
| GM unavailable after start (forced pause) | `lobby.runtime_paused_after_start` | configured admin email list | `email` |
| Race name eligible for registration | `lobby.race_name.registration_eligible` | capable member | `push+email` |
| Race name successfully registered | `lobby.race_name.registered` | registering user | `push+email` |
| Race name registration denied (capability) | `lobby.race_name.registration_denied` | incapable member | `email` |

Rules:

- intents carry explicit `recipient_user_id` values; `Lobby` resolves recipients
  before publishing rather than delegating audience resolution to `Notification Service`
- a failed intent publication is a notification degradation and must not roll back
  already committed business state
- `lobby.invite.revoked` and `lobby.invite.declined` produce no notification in v1
- `lobby.application.submitted` is published only for public games; the private-game
  owner-targeting path defined in the notification catalog is reserved for future use

## Domain Events

`Game Lobby` publishes auxiliary post-commit domain events to the Redis stream
configured for lobby domain events.

Frozen event types:

- `lobby.game.created`
- `lobby.game.status_changed`
- `lobby.membership.activated`
- `lobby.membership.removed`
- `lobby.membership.blocked`

Event rules:

- events are post-commit only; they are not emitted on failed operations
- event envelopes carry `game_id`, optional `user_id`, occurrence timestamp,
  new status (for `status_changed`), and optional trace correlation
- domain events are observability and downstream-read-model artifacts;
  they must not carry full business state payloads

## Error Model

The trusted internal REST contract uses strict JSON error envelopes:

```json
{
  "error": {
    "code": "invalid_request",
    "message": "request is invalid"
  }
}
```

Stable error codes:

- `invalid_request` — malformed input or failed validation
- `conflict` — state transition not allowed from current status
- `subject_not_found` — game, application, invite, membership, or pending
  race-name registration not found
- `eligibility_denied` — user not eligible per `User Service`
- `name_taken` — `race_name` already registered, reserved, or pending for
  another user
- `race_name_registration_quota_exceeded` — user's `max_registered_race_names`
  slot is full
- `race_name_pending_window_expired` — the 30-day registration window has
  passed for the pending entry
- `race_name_capability_not_met` — capability condition not satisfied at
  game finish (reservation released)
- `race_name_permanent_blocked` — the user carries an active
  `permanent_block` sanction
- `forbidden` — caller is not authorized for this operation on this game or
  this race name
- `internal_error` — unexpected service error
- `service_unavailable` — upstream dependency unavailable

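As a rough illustration, the envelope shown above maps onto two small Go structs. The type names and the `EncodeError` helper are hypothetical; the JSON field names are the ones in the example envelope. The mapping from error codes to HTTP status codes is not described here and is deliberately left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ErrorBody and ErrorEnvelope mirror the documented JSON shape.
// The Go type names are illustrative.
type ErrorBody struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

type ErrorEnvelope struct {
	Error ErrorBody `json:"error"`
}

// EncodeError (hypothetical helper) renders one envelope as JSON.
func EncodeError(code, message string) ([]byte, error) {
	return json.Marshal(ErrorEnvelope{Error: ErrorBody{Code: code, Message: message}})
}

func main() {
	b, err := EncodeError("conflict", "game is not in ready_to_start")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```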
## Configuration

### Required

- `LOBBY_REDIS_MASTER_ADDR`
- `LOBBY_REDIS_PASSWORD`
- `LOBBY_POSTGRES_PRIMARY_DSN`
- `LOBBY_USER_SERVICE_BASE_URL`
- `LOBBY_GM_BASE_URL`

### Configuration groups

Process and logging:

- `LOBBY_SHUTDOWN_TIMEOUT` with default `30s`
- `LOBBY_LOG_LEVEL` with default `info`

Public HTTP:

- `LOBBY_PUBLIC_HTTP_ADDR` with default `:8094`
- `LOBBY_PUBLIC_HTTP_READ_HEADER_TIMEOUT` with default `2s`
- `LOBBY_PUBLIC_HTTP_READ_TIMEOUT` with default `10s`
- `LOBBY_PUBLIC_HTTP_IDLE_TIMEOUT` with default `1m`

Internal HTTP:

- `LOBBY_INTERNAL_HTTP_ADDR` with default `:8095`
- `LOBBY_INTERNAL_HTTP_READ_HEADER_TIMEOUT` with default `2s`
- `LOBBY_INTERNAL_HTTP_READ_TIMEOUT` with default `10s`
- `LOBBY_INTERNAL_HTTP_IDLE_TIMEOUT` with default `1m`

Redis connectivity:

- `LOBBY_REDIS_MASTER_ADDR` (required)
- `LOBBY_REDIS_REPLICA_ADDRS` (optional, comma-separated; not consumed yet)
- `LOBBY_REDIS_PASSWORD` (required)
- `LOBBY_REDIS_DB` (default 0)
- `LOBBY_REDIS_OPERATION_TIMEOUT` (default 250ms)

The legacy `LOBBY_REDIS_ADDR`, `LOBBY_REDIS_USERNAME`, and
`LOBBY_REDIS_TLS_ENABLED` env vars were retired in PG_PLAN.md §6A; setting
either of the latter two now fails fast at startup. See
`ARCHITECTURE.md §Persistence Backends` for the architectural rules.

PostgreSQL connectivity (PG_PLAN.md §6A and §6B; durable game /
application / invite / membership records and the Race Name Directory
live here):

- `LOBBY_POSTGRES_PRIMARY_DSN` (required;
  e.g. `postgres://lobbyservice:secret@postgres:5432/galaxy?search_path=lobby&sslmode=disable`)
- `LOBBY_POSTGRES_REPLICA_DSNS` (optional, comma-separated; not consumed yet)
- `LOBBY_POSTGRES_OPERATION_TIMEOUT` (default 1s)
- `LOBBY_POSTGRES_MAX_OPEN_CONNS` (default 25)
- `LOBBY_POSTGRES_MAX_IDLE_CONNS` (default 5)
- `LOBBY_POSTGRES_CONN_MAX_LIFETIME` (default 30m)

Stream names:

- `LOBBY_GM_EVENTS_STREAM` with default `gm:lobby_events`
- `LOBBY_GM_EVENTS_READ_BLOCK_TIMEOUT` with default `2s`
- `LOBBY_RUNTIME_START_JOBS_STREAM` with default `runtime:start_jobs`
- `LOBBY_RUNTIME_STOP_JOBS_STREAM` with default `runtime:stop_jobs`
- `LOBBY_RUNTIME_JOB_RESULTS_STREAM` with default `runtime:job_results`
- `LOBBY_RUNTIME_JOB_RESULTS_READ_BLOCK_TIMEOUT` with default `2s`
- `LOBBY_NOTIFICATION_INTENTS_STREAM` with default `notification:intents`

Runtime Manager integration:

- `LOBBY_ENGINE_IMAGE_TEMPLATE` with default `galaxy/game:{engine_version}` —
  placeholder template applied to a game's `target_engine_version` to resolve
  the Docker `image_ref` published on `runtime:start_jobs`. The template
  must contain the literal placeholder `{engine_version}`; Lobby fails
  fast at startup otherwise.

Upstream clients:

- `LOBBY_USER_SERVICE_TIMEOUT` with default `1s`
- `LOBBY_GM_TIMEOUT` with default `5s`

Enrollment automation:

- `LOBBY_ENROLLMENT_AUTOMATION_INTERVAL` with default `30s`

Race Name Directory:

- `LOBBY_RACE_NAME_DIRECTORY_BACKEND` with default `postgres`
  (alternate: `stub` for in-process tests; PG_PLAN.md §6B retired the
  `redis` backend)
- `LOBBY_RACE_NAME_EXPIRATION_INTERVAL` with default `1h` — pending
  registration expiration worker tick

The 30-day eligibility window for `pending_registration` entries is the
constant `service/capabilityevaluation.PendingRegistrationWindow`. It is
intentionally not operator-tunable today; the env var name
`LOBBY_PENDING_REGISTRATION_TTL_HOURS` is reserved for a future change.

User lifecycle:

- `LOBBY_USER_LIFECYCLE_STREAM` with default `user:lifecycle_events`
- `LOBBY_USER_LIFECYCLE_READ_BLOCK_TIMEOUT` with default `2s`

OpenTelemetry:

- standard `OTEL_*` variables
- `LOBBY_OTEL_STDOUT_TRACES_ENABLED`
- `LOBBY_OTEL_STDOUT_METRICS_ENABLED`

## Persistence Layout

Game / application / invite / membership records live in PostgreSQL after
PG_PLAN.md §6A; the Race Name Directory followed in §6B. See
`docs/postgres-migration.md` for the schema and decision records. The
`lobby` schema owns five tables — `games`, `applications`, `invites`,
`memberships`, `race_names` — plus the partial UNIQUE index on
`applications(applicant_user_id, game_id) WHERE status <> 'rejected'` that
enforces the single-active-application invariant and the partial UNIQUE
index on `race_names(canonical_key) WHERE binding_kind = 'registered'`
that enforces single-registered-per-canonical.

The Redis-backed keys below survive both stages. Redis owns the
runtime-coordination state — per-game runtime aggregates, gap activation,
capability-evaluation guards, and stream consumer offsets — plus the
event-bus streams themselves.

### Redis key table

Storage rules for Redis:

- timestamps are stored in Unix milliseconds unless noted otherwise
- dynamic key segments are base64url-encoded

| Logical artifact | Redis key |
| --- | --- |
| per-game per-user stats aggregate | `lobby:game_turn_stats:<game_id>:<user_id>` → JSON aggregate |
| per-game stats user index | `lobby:game_turn_stats_by_game:<game_id>` (set of `user_id`) |
| capability-evaluation guard | `lobby:capability_evaluation:done:<game_id>` (sentinel string) |
| GM event stream offset | `lobby:stream_offsets:gm_events` |
| runtime job result offset | `lobby:stream_offsets:runtime_results` |
| user lifecycle stream offset | `lobby:stream_offsets:user_lifecycle` |
| gap window activation time | `lobby:gap_activated_at:<game_id>` |

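The "dynamic segments are base64url-encoded" rule can be sketched with the standard library. The helper names are hypothetical, and the choice of the unpadded variant (`base64.RawURLEncoding`) is an assumption: the table does not say whether padding is used.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// segment encodes one dynamic key part. Unpadded base64url is assumed here.
func segment(s string) string {
	return base64.RawURLEncoding.EncodeToString([]byte(s))
}

// TurnStatsKey builds the per-game per-user stats aggregate key.
func TurnStatsKey(gameID, userID string) string {
	return "lobby:game_turn_stats:" + segment(gameID) + ":" + segment(userID)
}

// EvaluationGuardKey builds the capability-evaluation guard key.
func EvaluationGuardKey(gameID string) string {
	return "lobby:capability_evaluation:done:" + segment(gameID)
}

func main() {
	fmt.Println(TurnStatsKey("game-1", "user-1"))
	fmt.Println(EvaluationGuardKey("game-1"))
}
```

Encoding the segments keeps the `:` separator unambiguous even if an identifier ever contains a colon.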
### Frozen record fields

The five durable records are stored in PostgreSQL columns; the field set
per record is unchanged from the previous Redis JSON shape and is
documented inline with the migration scripts under
`internal/adapters/postgres/migrations/`.

| Record | Frozen fields |
| --- | --- |
| game record | all game fields listed in Game Record Model section |
| application record | `application_id`, `game_id`, `applicant_user_id`, `race_name`, `status`, `created_at`, `decided_at` |
| invite record | `invite_id`, `game_id`, `inviter_user_id`, `invitee_user_id`, `race_name` (set at redeem), `status`, `created_at`, `expires_at`, `decided_at` |
| membership record | all membership fields listed in Membership Model section |
| race_names row | `canonical_key`, `game_id`, `holder_user_id`, `race_name`, `binding_kind`, `source_game_id`, `reserved_at_ms`, `eligible_until_ms` (pending only), `registered_at_ms` (registered only) |

## Observability

### Metrics

- `lobby.game.transitions` — counter; attributes: `from_status`, `to_status`, `trigger` (`command`, `manual`, `deadline`, `gap`, `runtime_event`, `external_block`)
- `lobby.application.outcomes` — counter; attributes: `outcome` (`submitted`, `approved`, `rejected`)
- `lobby.invite.outcomes` — counter; attributes: `outcome` (`created`, `redeemed`, `declined`, `revoked`, `expired`)
- `lobby.membership.changes` — counter; attributes: `change` (`activated`, `removed`, `blocked`, `external_block`)
- `lobby.start_flow.outcomes` — counter; attributes: `outcome` (`running`, `paused`, `start_failed`)
- `lobby.notification.publish_attempts` — counter; attributes: `notification_type`, `result` (`ok`, `error`)
- `lobby.active_games` — observable gauge; attributes: `status`
- `lobby.enrollment_automation.checks` — counter; attributes: `result` (`no_op`, `transitioned`)
- `lobby.gm_events.oldest_unprocessed_age_ms` — observable gauge
- `lobby.runtime_results.oldest_unprocessed_age_ms` — observable gauge
- `lobby.user_lifecycle.oldest_unprocessed_age_ms` — observable gauge
- `lobby.race_name.outcomes` — counter; attributes: `outcome` (`reserved`, `reservation_released`, `pending_created`, `pending_released`, `registered`, `registered_released`)
- `lobby.pending_registration.expirations` — counter; attributes: `trigger` (`tick`, `manual`)
- `lobby.user_lifecycle.cascade_releases` — counter; attributes: `event` (`permanent_blocked`, `deleted`)
- `lobby.capability_evaluations` — counter; attributes: `result` (`capable`, `incapable`, `noop`)

Metrics avoid high-cardinality attributes such as `game_id`, `user_id`,
`application_id`, `invite_id`, and `canonical_key`.

### Structured log fields

Key operations emit structured logs with these stable field names where applicable:

- `game_id`
- `game_type`
- `game_status`
- `from_status`
- `to_status`
- `user_id`
- `application_id`
- `invite_id`
- `membership_id`
- `race_name`
- `canonical_key`
- `reservation_kind` (`reserved` / `pending_registration` / `registered`)
- `eligible_until_ms`
- `trigger`
- `lifecycle_event`
- `request_id`
- `trace_id`

## Verification

Test doubles split between two styles. Wide-surface ports with no
production state (`RuntimeManager`, `IntentPublisher`, `GMClient`,
`UserService`) use `gomock`-generated mocks under
`internal/adapters/mocks/`; regenerate with `make -C lobby mocks`.
Stateful behavioural fakes that mirror the production adapter
contract (`gameinmem`, `applicationinmem`, `inviteinmem`,
`membershipinmem`, `gameturnstatsinmem`, `racenameinmem`,
`evaluationguardinmem`, `gapactivationinmem`, `streamoffsetinmem`)
live as in-memory adapters under `internal/adapters/<name>inmem/`
and stay hand-rolled because tests rely on their CAS, status-transition,
and invariant-tracking behaviour.

Focused service-local coverage verifies:

- configuration loading and validation for all env var groups
- both HTTP listeners start and serve `/healthz` and `/readyz`
- game CRUD: create, update, get, list with correct field validation
- each status transition fires only from allowed source statuses
- enrollment automation: deadline trigger, gap trigger, manual trigger
- application flow: submit (eligibility check, race name check), approve, reject
- invite flow: create, redeem (auto-membership), decline, revoke, expire on enrollment close
- membership model: activate, remove, block with correct before/after-start semantics
- Race Name Directory (PostgreSQL + in-memory adapters against the same suite):
  canonicalization + confusable-pair policy, `Reserve`/`ReleaseReservation`
  per-game semantics, `MarkPendingRegistration`/`ExpirePendingRegistrations`
  window, `Register` idempotency + quota, `ReleaseAllByUser` cascade
- game start flow: success path (→ running), GM unavailable path (→ paused),
  container failure path (→ start_failed), metadata persistence failure path
  (container removed, → start_failed)
- GM event stream consumer: snapshot update (stats aggregate),
  `game_finished` with capability evaluation
- user lifecycle stream consumer: `permanent_blocked` and `deleted`
  cascade release + membership/application/invite settlement
- pending-registration expiration worker idempotency
- race name registration service: capability, tariff quota, pending window,
  idempotent retry
- notification intent publication for all ten supported triggers
- visibility rules: private game hidden from non-member non-owner users
- error model: all stable codes returned for correct conditions

Cross-service coverage verifies:

- `Lobby → User Service` eligibility check compatibility (including the new
  `max_registered_race_names` field) and failure handling
- `Lobby → Notification Service` intent publication for all lobby notification types
- `Lobby → Runtime Manager` start job publication and result consumption
- `Lobby → Game Master` synchronous registration call (success and failure)
- `User Service → Lobby` cascade flow: permanent_block or DeleteUser on a
  user leads to full RND release + memberships blocked + applications/invites
  cancelled