# Game Service Engine
`galaxy/game` is the game engine binary that runs inside one
`galaxy-game-{game_id}` container. It hosts a single game instance and exposes
a REST API for game initialization, turn advancement, player reports, and
batched player command execution.
## References
- [`openapi.yaml`](openapi.yaml) — REST contract.
- [`../docs/ARCHITECTURE.md`](../docs/ARCHITECTURE.md) — system architecture.
- [`../rtmanager/README.md`](../rtmanager/README.md) — Runtime Manager owns
container lifecycle for this binary.
## Container model
The engine is meant to be run inside a Docker container managed by
`Runtime Manager`. One container hosts exactly one game instance and listens
on TCP `:8080` inside the container. Outside the container the endpoint is
addressed as `http://galaxy-game-{game_id}:8080` through Docker's embedded DNS
on the configured `RTMANAGER_DOCKER_NETWORK`.
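
The addressing rule above can be sketched as a small Go helper. This is an illustration only: `engineBaseURL` and `enginePort` are hypothetical names, and only the hostname pattern and the port come from this README.

```go
package main

import "fmt"

// enginePort is the TCP port the engine listens on inside the container.
const enginePort = 8080

// engineBaseURL builds the address of one engine container as seen from
// another container on the same Docker network. The function name is
// hypothetical; the hostname pattern and port are taken from the text above.
func engineBaseURL(gameID string) string {
	return fmt.Sprintf("http://galaxy-game-%s:%d", gameID, enginePort)
}

func main() {
	// "42" is a made-up game ID used only for the demo.
	fmt.Println(engineBaseURL("42") + "/healthz")
}
```

The name only resolves for callers attached to the same Docker network, because the lookup goes through Docker's embedded DNS.
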
The container image is built from [`Dockerfile`](Dockerfile) at the root of
this module. The Dockerfile is a multi-stage build (Go builder + small runtime
base) that exposes `:8080`, runs as a non-root user, and ships container
labels that `Runtime Manager` reads at create time:
| Label | Meaning |
| --- | --- |
| `com.galaxy.cpu_quota` | CPU quota for the container (`--cpus`). |
| `com.galaxy.memory` | Memory limit for the container (`--memory`). |
| `com.galaxy.pids_limit` | PID limit for the container (`--pids-limit`). |
| `org.opencontainers.image.title` | `galaxy-game-engine`. |
Image defaults are `cpu_quota=1.0`, `memory=512m`, `pids_limit=512`. Operators
override them at image-build time by editing the Dockerfile labels; producers
do not pass per-game limits.
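
As a rough sketch of how a reader of these labels might turn them into resource limits, consider the Go snippet below. The `limits` type and `limitsFromLabels` function are hypothetical and not taken from `Runtime Manager`; falling back to the image defaults when a label is missing is also an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// limits is a hypothetical holder for the three resource labels.
type limits struct {
	CPUs      float64 // value for --cpus
	Memory    string  // value for --memory, kept in Docker's syntax, e.g. "512m"
	PidsLimit int64   // value for --pids-limit
}

// limitsFromLabels reads the com.galaxy.* labels from an image label map.
// Assumption: a missing or empty label falls back to the documented default.
func limitsFromLabels(labels map[string]string) (limits, error) {
	get := func(key, def string) string {
		if v := labels[key]; v != "" {
			return v
		}
		return def
	}
	cpus, err := strconv.ParseFloat(get("com.galaxy.cpu_quota", "1.0"), 64)
	if err != nil {
		return limits{}, fmt.Errorf("com.galaxy.cpu_quota: %w", err)
	}
	pids, err := strconv.ParseInt(get("com.galaxy.pids_limit", "512"), 10, 64)
	if err != nil {
		return limits{}, fmt.Errorf("com.galaxy.pids_limit: %w", err)
	}
	return limits{
		CPUs:      cpus,
		Memory:    get("com.galaxy.memory", "512m"),
		PidsLimit: pids,
	}, nil
}

func main() {
	// Only the memory label is overridden; the other two use the defaults.
	l, err := limitsFromLabels(map[string]string{"com.galaxy.memory": "1g"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", l)
}
```
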
## Endpoints
The contract is the union of `openapi.yaml` and the technical liveness probe
described below. Endpoints split into two route classes:
| Class | Path | Caller | Purpose |
| --- | --- | --- | --- |
| Admin (GM-only) | `POST /api/v1/admin/init` | `Game Master` | Initialise the engine with the race roster. |
| Admin (GM-only) | `GET /api/v1/admin/status` | `Game Master` | Read the full game state. |
| Admin (GM-only) | `PUT /api/v1/admin/turn` | `Game Master` | Generate the next turn. |
| Admin (GM-only) | `POST /api/v1/admin/race/banish` | `Game Master` | Deactivate a race after a permanent platform removal. |
| Player | `PUT /api/v1/command` | `Game Master` (forwarded from `Edge Gateway`) | Execute a batch of player commands. |
| Player | `PUT /api/v1/order` | `Game Master` | Validate and store a batch of player orders. |
| Player | `GET /api/v1/report` | `Game Master` | Fetch the per-player turn report. |
| Probe | `GET /healthz` | `Runtime Manager` | Technical liveness probe. |
Admin paths are unauthenticated but are routed only from inside the
trusted network segment that connects `Game Master` to the engine
container. The engine does not enforce caller identity — network-level
segmentation is the boundary. Player paths apply the same rule and rely
on `Game Master` to forward only verified player payloads.
### Game endpoints
Documented in [`openapi.yaml`](openapi.yaml). When the engine has not been
initialised through `POST /api/v1/admin/init`, game endpoints respond
`501 Not Implemented` to make the uninitialised state unambiguous.
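
One way to picture the uninitialised-state rule is a guard wrapped around the game handlers. The sketch below is not the engine's actual middleware: `initGuard`, `wrap`, and `statusOf` are hypothetical names, and the wrapped handler is a stand-in that always answers `200`.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// initGuard tracks whether the engine has been initialised.
type initGuard struct {
	initialised atomic.Bool
}

// wrap answers 501 Not Implemented until the guard is flipped, then
// passes requests through to the wrapped handler.
func (g *initGuard) wrap(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !g.initialised.Load() {
			http.Error(w, "engine not initialised", http.StatusNotImplemented)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusOf sends one in-memory request and returns the response status code.
func statusOf(h http.Handler, method, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code
}

func main() {
	g := &initGuard{}
	report := g.wrap(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK) // stand-in for a real game handler
	}))

	fmt.Println(statusOf(report, http.MethodGet, "/api/v1/report")) // before init
	g.initialised.Store(true)                                       // what a successful init would do
	fmt.Println(statusOf(report, http.MethodGet, "/api/v1/report")) // after init
}
```
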
### `StateResponse.finished`
`StateResponse` (returned by `GET /api/v1/admin/status` and
`PUT /api/v1/admin/turn`) carries a required boolean `finished` field.
The engine sets it to `true` exactly once on the turn-generation response
that ends the game; otherwise it stays `false`. `Game Master` uses this
field as the sole signal to run the platform finish flow. The conditional
logic that flips `finished` to `true` lives in the engine's domain code
and is owned by the engine maintainers.
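
A caller-side sketch of reading the flag could look like the following. It models only the `finished` field; the rest of `StateResponse` is omitted, and `stateResponse` and `gameFinished` are hypothetical names rather than `Game Master` code. The pointer field is used so that a missing `finished` key is reported as an error instead of being read as `false`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// stateResponse models only the required `finished` field of StateResponse.
// A pointer distinguishes "absent" from an explicit false.
type stateResponse struct {
	Finished *bool `json:"finished"`
}

// gameFinished decodes a response body and returns the value of `finished`.
func gameFinished(body []byte) (bool, error) {
	var s stateResponse
	if err := json.Unmarshal(body, &s); err != nil {
		return false, fmt.Errorf("decode state response: %w", err)
	}
	if s.Finished == nil {
		return false, errors.New("state response lacks required field \"finished\"")
	}
	return *s.Finished, nil
}

func main() {
	done, err := gameFinished([]byte(`{"finished": true}`))
	fmt.Println(done, err)
}
```
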
### `POST /api/v1/admin/race/banish`
Deactivates a race after a permanent platform-level membership removal.
`Game Master` calls this endpoint synchronously after a Lobby-driven
remove-and-banish flow.
- Request body: `{ "race_name": "<name>" }`. `race_name` must be
non-empty and must match an existing race in the engine's roster.
- Successful response: `204 No Content` with an empty body.
- Error responses follow the same `400` / `500` envelope shape as the
other admin endpoints. The engine-side mechanics of `banish` (what
exactly happens to the race's planets, fleets, and pending orders) are
owned by the engine maintainers.
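
The request contract above can be sketched as a handler. Several parts are assumptions and not the engine's real code: the `roster` type, the plain-text error bodies (the real envelope shape is defined in `openapi.yaml`), and mapping an unknown race to `400`. What deactivation does to the race's assets is deliberately left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// banishRequest mirrors the documented request body.
type banishRequest struct {
	RaceName string `json:"race_name"`
}

// roster is a hypothetical in-memory race list: name -> active flag.
type roster struct {
	mu     sync.Mutex
	active map[string]bool
}

func newRoster(names ...string) *roster {
	r := &roster{active: make(map[string]bool)}
	for _, n := range names {
		r.active[n] = true
	}
	return r
}

// deactivate marks a known race inactive; it reports false for unknown names.
func (r *roster) deactivate(name string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.active[name]; !ok {
		return false
	}
	r.active[name] = false
	return true
}

// banishHandler validates the body and answers 204 on success.
func banishHandler(r *roster) http.HandlerFunc {
	return func(w http.ResponseWriter, req *http.Request) {
		var body banishRequest
		if err := json.NewDecoder(req.Body).Decode(&body); err != nil || body.RaceName == "" {
			http.Error(w, "race_name must be a non-empty string", http.StatusBadRequest)
			return
		}
		if !r.deactivate(body.RaceName) {
			http.Error(w, "unknown race", http.StatusBadRequest) // assumed status
			return
		}
		w.WriteHeader(http.StatusNoContent)
	}
}

// postJSON sends one in-memory POST and returns the response status code.
func postJSON(h http.Handler, body string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/api/v1/admin/race/banish", strings.NewReader(body))
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := banishHandler(newRoster("Zorg")) // "Zorg" is a made-up race name
	fmt.Println(postJSON(h, `{"race_name":"Zorg"}`))
	fmt.Println(postJSON(h, `{"race_name":""}`))
}
```
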
### `GET /healthz`
Technical liveness probe used by `Runtime Manager` and operator tooling.
- Returns `{"status":"ok"}` with HTTP `200` whenever the HTTP server is
serving requests, regardless of whether the engine has been initialised
through `POST /api/v1/admin/init`.
- Carries no game-state semantics. Use `GET /api/v1/admin/status` for
game-state inspection.

This endpoint exists so that `Runtime Manager` can probe a freshly started
container before `init` runs.
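
A minimal sketch of such a probe handler follows. The response body and status come from the description above; the `Content-Type` header and the names `healthz` and `probe` are assumptions made for the example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// healthz answers 200 with a fixed body and never consults game state,
// so it behaves the same before and after initialisation.
func healthz(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json") // assumed header
	w.WriteHeader(http.StatusOK)
	_, _ = io.WriteString(w, `{"status":"ok"}`)
}

// probe issues one in-memory GET /healthz and returns status and body.
func probe(h http.HandlerFunc) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/healthz", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := probe(healthz)
	fmt.Println(code, body)
}
```
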
## Storage
The engine reads its persistent storage path from environment variables in
the following order of precedence:
1. `STORAGE_PATH` — historical name; honoured for backward compatibility.
2. `GAME_STATE_PATH` — canonical name written by `Runtime Manager`.

If both are set, `STORAGE_PATH` wins. If neither is set, the binary fails
fast on startup. The Dockerfile defaults `STORAGE_PATH=/var/lib/galaxy-game`
so the image runs out of the box if the operator does not supply either
variable.
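
The precedence rule can be sketched in a few lines of Go. This is not the engine's `ResolveStoragePath`; `resolveStoragePath`, `fromMap`, and the error text are hypothetical, and treating an empty value the same as an unset variable is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNoStoragePath is returned when neither variable carries a value.
var errNoStoragePath = errors.New("neither STORAGE_PATH nor GAME_STATE_PATH is set")

// resolveStoragePath checks the variables in precedence order and returns
// the first non-empty value. The lookup function is injected so the rule
// can be exercised without touching the process environment.
func resolveStoragePath(getenv func(string) string) (string, error) {
	for _, key := range []string{"STORAGE_PATH", "GAME_STATE_PATH"} {
		if v := getenv(key); v != "" {
			return v, nil
		}
	}
	return "", errNoStoragePath
}

// fromMap adapts a map to the lookup signature, for demos and tests.
func fromMap(env map[string]string) func(string) string {
	return func(key string) string { return env[key] }
}

func main() {
	// Against the real environment:
	path, err := resolveStoragePath(os.Getenv)
	fmt.Println(path, err)

	// Both set: STORAGE_PATH wins. The paths are made up.
	path, err = resolveStoragePath(fromMap(map[string]string{
		"STORAGE_PATH":    "/data/legacy",
		"GAME_STATE_PATH": "/var/lib/galaxy-game",
	}))
	fmt.Println(path, err)
}
```
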
`Runtime Manager` creates a per-game host directory under
`<RTMANAGER_GAME_STATE_ROOT>/{game_id}` and bind-mounts it into the container
at `RTMANAGER_ENGINE_STATE_MOUNT_PATH` (default `/var/lib/galaxy-game`). The
mount path is then exposed to the engine through `GAME_STATE_PATH` (and, for
compatibility, also as `STORAGE_PATH`).
The engine is responsible for the contents of the storage directory.
`Runtime Manager` never reads or writes the directory contents, never
deletes the directory, and never inspects per-game state files.
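
To make the directory and mount layout concrete, the sketch below derives the host directory, a `source:target` bind-mount string, and the environment handed to the engine. All three function names are hypothetical, `/srv/galaxy` is a made-up root, and the actual `Runtime Manager` code may assemble these values differently.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// hostStateDir returns the per-game directory under the state root.
func hostStateDir(stateRoot, gameID string) string {
	return filepath.Join(stateRoot, gameID)
}

// bindMount renders a host-to-container bind mount in "source:target" form.
func bindMount(hostDir, mountPath string) string {
	return hostDir + ":" + mountPath
}

// engineEnv exposes the mount path under both variable names: the canonical
// GAME_STATE_PATH and the compatibility STORAGE_PATH.
func engineEnv(mountPath string) []string {
	return []string{
		"GAME_STATE_PATH=" + mountPath,
		"STORAGE_PATH=" + mountPath,
	}
}

func main() {
	const mountPath = "/var/lib/galaxy-game" // documented default mount path
	dir := hostStateDir("/srv/galaxy", "42") // made-up root and game ID
	fmt.Println(bindMount(dir, mountPath))
	fmt.Println(engineEnv(mountPath))
}
```
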
### Design rationale: storage-path env precedence
`STORAGE_PATH` wins over `GAME_STATE_PATH` because the engine already
shipped with `STORAGE_PATH` (see `game/Makefile` and
`game/internal/router/handler/handler.go`). Keeping `STORAGE_PATH`
authoritative means existing engine deployments and integration fixtures
keep working without code changes, while `GAME_STATE_PATH` remains the
platform contract written by `Runtime Manager` and documented in
`ARCHITECTURE.md §9`.
Alternatives considered and rejected:
- accept only `GAME_STATE_PATH` — would force a breaking change on the
engine binary and on every existing `STORAGE_PATH=...` invocation in
`game/Makefile` and dev scripts;
- `GAME_STATE_PATH` wins over `STORAGE_PATH` — would silently invert
the meaning of an explicit `STORAGE_PATH=` invocation if the operator
also sets `GAME_STATE_PATH` for any reason.
### Design rationale: storage-path validation site
`game/internal/router/handler/handler.go` exports `ResolveStoragePath`,
which returns the engine storage path from the env-var pair above and
an error when neither is set. `cmd/http/main.go` calls it before
constructing the router, prints the error to stderr, and exits non-zero.
The existing `initConfig` closure also calls `ResolveStoragePath` to
populate `controller.Param.StoragePath` at request time; the error there
is dropped because `main` already validated the environment at startup.
This keeps the public router surface (`router.NewRouter`) unchanged —
the env binding is satisfied by one helper plus a startup check, with
no API ripple. Moving env reading entirely into `main` and changing
`NewRouter` / `NewDefaultExecutor` to accept an explicit path was
rejected: it churns multiple call sites for no functional gain. The
current shape leaves the configurer closure ready for future
config-injection refactors without forcing one now.
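
The validate-once-at-startup shape described above might look roughly like this. It is a sketch, not the contents of `cmd/http/main.go`: `run`, `capture`, and `errNoStoragePath` are hypothetical, the resolver is passed in as a plain function, and router construction is reduced to a comment.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// errNoStoragePath stands in for the resolver's "nothing configured" error.
var errNoStoragePath = errors.New("storage path is not configured")

// run performs the startup check and returns the process exit code.
// A real main would call os.Exit(run(resolver, os.Stderr)).
func run(resolve func() (string, error), stderr io.Writer) int {
	path, err := resolve()
	if err != nil {
		fmt.Fprintln(stderr, "startup:", err)
		return 1
	}
	_ = path // the router would be constructed and served here
	return 0
}

// capture runs the check against an in-memory stderr, for the demo below.
func capture(resolve func() (string, error)) (int, string) {
	var buf strings.Builder
	code := run(resolve, &buf)
	return code, buf.String()
}

func main() {
	code, msg := capture(func() (string, error) { return "", errNoStoragePath })
	fmt.Printf("exit=%d stderr=%q\n", code, msg)

	code, msg = capture(func() (string, error) { return "/var/lib/galaxy-game", nil })
	fmt.Printf("exit=%d stderr=%q\n", code, msg)
}
```

Because the environment is checked once before the router exists, a later request-time call to the same resolver can reasonably ignore the error, which is the trade-off the paragraphs above describe.
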
## Build
The container image is built from [`Dockerfile`](Dockerfile). The Docker
build context is the workspace root (`galaxy/`) rather than the `game/`
subdirectory, because `game/` resolves `galaxy/{model,error,util,...}`
through `go.work` `replace` directives. From the workspace root:
```sh
docker build -t galaxy/game:test -f game/Dockerfile .
```
The build has two stages: a `golang:1.26.2-alpine` builder produces a
statically linked binary (`CGO_ENABLED=0`), then `gcr.io/distroless/static-debian12:nonroot`
runs it as the `nonroot` user and exposes `:8080`.
### Design rationale: workspace-root build context
`game/` is a member of the multi-module `go.work` workspace at the
repository root. Its imports of `galaxy/model`, `galaxy/error`,
`galaxy/util`, etc. are satisfied by `replace` directives in `go.work`
that point at sibling modules under `pkg/`. There is no published
`galaxy/model` module to download.
A standalone `docker build ./game` therefore cannot resolve those
imports: the `pkg/` tree is outside the build context, and `game/go.mod`
alone has no `replace` directives pointing at it.
Alternatives rejected:
- adding `replace` directives to `game/go.mod` and copying `pkg/` into a
vendored layout — duplicates the workspace inside `game/`, drifts from
the rest of the repository, and forces every other workspace member
that ships a Dockerfile to repeat the trick;
- running `go mod vendor` inside `game/` before each build — workspaces
do not vendor cleanly, the resulting `vendor/` would be noisy, and CI
/ Makefile would need a custom pre-build step.

No `.dockerignore` is needed: every `COPY` in `game/Dockerfile` names an
explicit subdirectory (`pkg/calc`, `pkg/error`, `pkg/model`, `pkg/util`,
`game`), and BuildKit (which the `# syntax=docker/dockerfile:1.7` directive
relies on) only transfers the paths a `COPY` actually references.
### Design rationale: `gcr.io/distroless/static-debian12:nonroot` runtime base
Distroless static is roughly 2 MB and contains no shell or package
manager, which keeps the attack surface and CVE exposure minimal —
appropriate for a service that `Runtime Manager` will start by the
dozen. The image already runs as the `nonroot` user (UID:GID
`65532:65532`), satisfying the non-root-user requirement without an
explicit `RUN adduser`.
Alternatives rejected:
- `alpine:3.20` — provides a shell for ad-hoc debugging but is roughly
10 MB and inherits regular CVE churn on `musl` / `apk`. The convenience
is not worth the larger attack surface for a fleet of identical engine
containers; operators can always `docker exec` from a debug image when
needed;
- `scratch` — smallest possible image, but ships no `/tmp`, no CA bundle,
and no `/etc/passwd`. Distroless wins on the same security axis while
leaving room for future needs (TLS, logging) without rebuilding the
base layout.