# Game Service Engine

`galaxy/game` is the game engine binary that runs inside one
`galaxy-game-{game_id}` container. It hosts a single game instance and exposes
a REST API for game initialization, turn advancement, player reports, and
batched player command execution.

## References

- [`openapi.yaml`](openapi.yaml): REST contract.
- [`../docs/ARCHITECTURE.md`](../docs/ARCHITECTURE.md): system architecture.
- [`../rtmanager/README.md`](../rtmanager/README.md): Runtime Manager owns
  container lifecycle for this binary.

## Container model

The engine is meant to be run inside a Docker container managed by
`Runtime Manager`. One container hosts exactly one game instance and listens
on TCP `:8080` inside the container. Outside the container the endpoint is
addressed as `http://galaxy-game-{game_id}:8080` through Docker's embedded DNS
on the configured `RTMANAGER_DOCKER_NETWORK`.

The container image is built from [`Dockerfile`](Dockerfile) at the root of
this module. The Dockerfile is a multi-stage build (Go builder + small runtime
base) that exposes `:8080`, runs as a non-root user, and ships container
labels that `Runtime Manager` reads at create time:

| Label | Meaning |
| --- | --- |
| `com.galaxy.cpu_quota` | CPU quota for the container (`--cpus`). |
| `com.galaxy.memory` | Memory limit for the container (`--memory`). |
| `com.galaxy.pids_limit` | PID limit for the container (`--pids-limit`). |
| `org.opencontainers.image.title` | `galaxy-game-engine`. |

Image defaults are `cpu_quota=1.0`, `memory=512m`, `pids_limit=512`. Operators
override them at image-build time by editing the Dockerfile labels; producers
do not pass per-game limits.

## Endpoints

The contract is the union of `openapi.yaml` and the technical liveness probe
described below. The `openapi.yaml` endpoints split into two route classes
(admin and player); the probe is listed alongside them:

| Class | Path | Caller | Purpose |
| --- | --- | --- | --- |
| Admin (GM-only) | `POST /api/v1/admin/init` | `Game Master` | Initialise the engine with a canonical `gameId` and the race roster. |
| Admin (GM-only) | `GET /api/v1/admin/status` | `Game Master` | Read the full game state. |
| Admin (GM-only) | `PUT /api/v1/admin/turn` | `Game Master` | Generate the next turn. |
| Admin (GM-only) | `POST /api/v1/admin/race/banish` | `Game Master` | Deactivate a race after a permanent platform removal. |
| Player | `PUT /api/v1/order` | `Game Master` | Validate and store a batch of player orders. |
| Player | `GET /api/v1/order` | `Game Master` | Fetch the previously stored player order for a turn. |
| Player | `GET /api/v1/report` | `Game Master` | Fetch the per-player turn report. |
| Probe | `GET /healthz` | `Runtime Manager` | Technical liveness probe. |

Admin paths are unauthenticated but are routed only from inside the
trusted network segment that connects `Game Master` to the engine
container. The engine does not enforce caller identity; network-level
segmentation is the boundary. Player paths apply the same rule and rely
on `Game Master` to forward only verified player payloads.

### Game endpoints

Documented in [`openapi.yaml`](openapi.yaml). When the engine has not been
initialised through `POST /api/v1/admin/init`, game endpoints respond
`501 Not Implemented` to make the uninitialised state unambiguous.

### `POST /api/v1/admin/init`

The canonical game identity is owned by the orchestrator (`Game Master`),
not by the engine. The request body is `{ "gameId": "<uuid>", "races": [...] }`
where:

- `gameId` is a non-zero UUID generated by the orchestrator before the
  engine container is launched. The same value names the engine's host
  storage directory and is persisted into `state.json`. The engine
  rejects the zero UUID with `400 Bad Request` and any value that
  conflicts with an existing `state.json` on disk with
  `409 Conflict`. A second `init` on the same `gameId` is also
  rejected with `409`; idempotency is not part of the contract.
- `races` is the race roster; minimum 10 entries.

On success the engine responds `201 Created` with a `StateResponse`
whose `id` echoes the supplied `gameId`.

### `StateResponse.finished`

`StateResponse` (returned by `GET /api/v1/admin/status` and
`PUT /api/v1/admin/turn`) carries a required boolean `finished` field.
The engine sets it to `true` exactly once on the turn-generation response
that ends the game; otherwise it stays `false`. `Game Master` uses this
field as the sole signal to run the platform finish flow. The conditional
logic that flips `finished` to `true` lives in the engine's domain code
and is owned by the engine maintainers.

### `POST /api/v1/admin/race/banish`

Deactivates a race after a permanent platform-level membership removal.
`Game Master` calls this endpoint synchronously after a Lobby-driven
remove-and-banish flow.

- Request body: `{ "race_name": "<name>" }`. `race_name` must be
  non-empty and must match an existing race in the engine's roster.
- Successful response: `204 No Content` with an empty body.
- Error responses follow the same `400` / `500` envelope shape as the
  other admin endpoints.

`banish` only flags the race extinct, so the race can no longer submit
orders or have orders applied. Its assets are released at the start of
the next turn generation (`TurnWipeExtinctRaces`), the same way an
idle/quit timeout is handled but without the wait:

- ship groups and fleets are removed;
- its planets become uninhabited (the working industry and the capital
  stockpile are cleared, raw material is retained);
- votes cast for it are reset.

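
The release rule can be illustrated with a simplified model. This is a sketch only, not the engine's `TurnWipeExtinctRaces`: the `Game`, `Race`, and `Planet` types, the `TTL` field, and the "TTL of zero or less means timed out" convention are all assumptions, and vote resetting is omitted.

```go
package main

import "fmt"

// Planet is a simplified stand-in; an empty Owner means uninhabited.
type Planet struct {
	Owner    string
	Industry float64
	Capital  float64
	Material float64
}

// Race is a simplified stand-in; TTL is assumed to count remaining turns.
type Race struct {
	Name    string
	Extinct bool
	TTL     int
}

// Game holds the pieces the wipe pass touches in this sketch.
type Game struct {
	Races   []*Race
	Planets []*Planet
	Ships   map[string]int // ship groups per race name
}

// holdsAssets reports whether the race still owns ships or planets.
func (g *Game) holdsAssets(name string) bool {
	if g.Ships[name] > 0 {
		return true
	}
	for _, p := range g.Planets {
		if p.Owner == name {
			return true
		}
	}
	return false
}

// wipeExtinct visits every race and wipes timed-out or banished ones.
// It returns the number of races wiped during this pass.
func (g *Game) wipeExtinct() int {
	wiped := 0
	for _, r := range g.Races {
		timedOut := !r.Extinct && r.TTL <= 0
		banished := r.Extinct && g.holdsAssets(r.Name)
		if !timedOut && !banished {
			continue
		}
		r.Extinct = true
		delete(g.Ships, r.Name) // ship groups and fleets are removed
		for _, p := range g.Planets {
			if p.Owner == r.Name {
				// Industry and capital are cleared, material is retained.
				p.Owner, p.Industry, p.Capital = "", 0, 0
			}
		}
		wiped++
	}
	return wiped
}

func main() {
	home := &Planet{Owner: "Zorg", Industry: 10, Capital: 3, Material: 7}
	g := &Game{
		Races:   []*Race{{Name: "Zorg", Extinct: true, TTL: 5}}, // banished: extinct, TTL untouched
		Planets: []*Planet{home},
		Ships:   map[string]int{"Zorg": 2},
	}
	first := g.wipeExtinct()
	second := g.wipeExtinct() // nothing left to release, so the pass is a no-op
	fmt.Println(first, second, *home)
}
```

Checking for remaining assets, rather than recording a separate "already wiped" flag, is what keeps a repeated pass harmless in this model.
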
### `GET /healthz`

Technical liveness probe used by `Runtime Manager` and operator tooling.

- Returns `{"status":"ok"}` with HTTP `200` whenever the HTTP server is
  serving requests, regardless of whether the engine has been initialised
  through `POST /api/v1/admin/init`.
- Carries no game-state semantics. Use `GET /api/v1/admin/status` for
  game-state inspection.

This endpoint exists so that `Runtime Manager` can probe a freshly started
container before `init` runs.

## Storage

The engine reads its persistent storage path from environment variables in
the following order of precedence:

1. `STORAGE_PATH`: historical name; honoured for backward compatibility.
2. `GAME_STATE_PATH`: canonical name written by `Runtime Manager`.

If both are set, `STORAGE_PATH` wins. If neither is set, the binary fails
fast on startup. The Dockerfile defaults `STORAGE_PATH=/var/lib/galaxy-game`
so the image runs out of the box if the operator does not supply either
variable.

`Runtime Manager` creates a per-game host directory under
`<RTMANAGER_GAME_STATE_ROOT>/{game_id}` and bind-mounts it into the container
at `RTMANAGER_ENGINE_STATE_MOUNT_PATH` (default `/var/lib/galaxy-game`). The
mount path is then exposed to the engine through `GAME_STATE_PATH` (and, for
compatibility, also as `STORAGE_PATH`).

The engine is responsible for the contents of the storage directory.
`Runtime Manager` never reads or writes the directory contents, never
deletes the directory, and never inspects per-game state files.

### Design rationale: storage-path env precedence

`STORAGE_PATH` wins over `GAME_STATE_PATH` because the engine already
shipped with `STORAGE_PATH` (see `game/Makefile` and
`game/internal/router/handler/handler.go`). Keeping `STORAGE_PATH` as
the authoritative variable means existing engine deployments and
integration fixtures continue to work without code change, while
`GAME_STATE_PATH` is the platform contract written by `Runtime Manager`
and documented in `ARCHITECTURE.md §9`.

Alternatives considered and rejected:

- accept only `GAME_STATE_PATH`: would force a breaking change on the
  engine binary and on every existing `STORAGE_PATH=...` invocation in
  `game/Makefile` and dev scripts;
- `GAME_STATE_PATH` wins over `STORAGE_PATH`: would silently invert
  the meaning of an explicit `STORAGE_PATH=` invocation if the operator
  also sets `GAME_STATE_PATH` for any reason.

### Design rationale: storage-path validation site

`game/internal/router/handler/handler.go` exports `ResolveStoragePath`,
which returns the engine storage path from the env-var pair above and
an error when neither is set. `cmd/http/main.go` calls it once at
startup, prints the error to stderr and exits non-zero on failure, then
builds the engine service (`controller.NewService(path)`) and hands it
to `router.NewRouter`.

Storage is resolved exactly once, at construction, rather than per
request: the `Service` holds the file-backed repo for the process
lifetime and `router.NewRouter` takes the `handler.Engine` it routes
to (in production, the `Service`). This keeps the env binding in one
place (a startup helper plus the `main` check) and leaves the
handlers free of configuration concerns.

## Build

The container image is built from [`Dockerfile`](Dockerfile). The Docker
build context is the workspace root (`galaxy/`) rather than the `game/`
subdirectory, because `game/` resolves `galaxy/{model,error,util,...}`
through `go.work` `replace` directives. From the workspace root:

```sh
docker build -t galaxy/game:test -f game/Dockerfile .
```

The build is two-staged: a `golang:1.26.2-alpine` builder produces a
statically linked binary (`CGO_ENABLED=0`), then `gcr.io/distroless/static-debian12:nonroot`
runs it as the `nonroot` user and exposes `:8080`.

### Design rationale: workspace-root build context

`game/` is a member of the multi-module `go.work` workspace at the
repository root. Its imports of `galaxy/model`, `galaxy/error`,
`galaxy/util`, etc. are satisfied by `replace` directives in `go.work`
that point at sibling modules under `pkg/`. There is no published
`galaxy/model` module to download.

A standalone `docker build ./game` therefore cannot resolve those
imports: the `pkg/` tree is outside the build context, and `game/go.mod`
alone has no `replace` directives pointing at it.

Alternatives rejected:

- adding `replace` directives to `game/go.mod` and copying `pkg/` into a
  vendored layout: duplicates the workspace inside `game/`, drifts from
  the rest of the repository, and forces every other workspace member
  that ships a Dockerfile to repeat the trick;
- running `go mod vendor` inside `game/` before each build: workspaces
  do not vendor cleanly, the resulting `vendor/` would be noisy, and CI
  / Makefile would need a custom pre-build step.

No `.dockerignore` is needed: every `COPY` in `game/Dockerfile` names an
explicit subdirectory (`pkg/calc`, `pkg/error`, `pkg/model`, `pkg/util`,
`game`), and BuildKit (forced by `# syntax=docker/dockerfile:1.7`) only
transfers the paths a `COPY` actually references.

### Design rationale: `gcr.io/distroless/static-debian12:nonroot` runtime base

Distroless static is roughly 2 MB and contains no shell or package
manager, which keeps the attack surface and CVE exposure minimal. That is
appropriate for a service that `Runtime Manager` will start by the
dozen. The image already runs as the `nonroot` user (UID:GID
`65532:65532`), satisfying the non-root-user requirement without an
explicit `RUN adduser`.

Alternatives rejected:

- `alpine:3.20`: provides a shell for ad-hoc debugging but is roughly
  10 MB and inherits regular CVE churn on `musl` / `apk`. The convenience
  is not worth the larger attack surface for a fleet of identical engine
  containers; operators can always `docker exec` from a debug image when
  needed;
- `scratch`: smallest possible image, but ships no `/tmp`, no CA bundle,
  and no `/etc/passwd`. Distroless wins on the same security axis while
  leaving room for future needs (TLS, logging) without rebuilding the
  base layout.