feat: gamemaster

Ilia Denisov
2026-05-03 07:59:03 +02:00
committed by GitHub
parent a7cee15115
commit 3e2622757e
229 changed files with 41521 additions and 1098 deletions
@@ -0,0 +1,211 @@
---
stage: 12
title: External clients
---
# Stage 12 — External clients
This decision record captures the non-obvious choices made while
implementing the five outbound adapters Game Master uses to talk to
the engine, Game Lobby, Runtime Manager, the notification stream, and
the lobby-events stream at PLAN Stage 12.
## Context
[`../PLAN.md` Stage 12](../PLAN.md) ships the adapter layer the
service-layer stages 13–18 depend on. Ports were frozen by Stage 10
([`stage10-domain-and-ports.md`](./stage10-domain-and-ports.md)) and
the AsyncAPI/OpenAPI contracts were frozen by Stage 06
([`stage06-contract-files.md`](./stage06-contract-files.md)). The
reference precedent is `rtmanager`'s adapter tree
([`rtmanager/internal/adapters/lobbyclient`](../../rtmanager/internal/adapters/lobbyclient),
[`rtmanager/internal/adapters/notificationpublisher`](../../rtmanager/internal/adapters/notificationpublisher),
[`rtmanager/internal/adapters/healtheventspublisher`](../../rtmanager/internal/adapters/healtheventspublisher)),
which Stage 11 already locked in as the canonical shape for Game
Master persistence adapters. Stage 12 extends that precedent to the
HTTP clients and stream publishers.
Seven decisions deviate from a literal copy of the `rtmanager` precedent
or extend the literal task list of PLAN Stage 12. Each is recorded
below.
## Decisions
### 1. Engine client carries no `BaseURL` in `Config`
**Decision.**
[`engineclient.Config`](../internal/adapters/engineclient/client.go)
exposes only `CallTimeout` and `ProbeTimeout`. The engine endpoint
URL is supplied per call from `runtime_records.engine_endpoint`.
**Why.** Game Master operates on N concurrent games at runtime; each
game lives behind its own DNS hostname (`http://galaxy-game-{game_id}:8080`).
Binding a base URL at construction would force a per-game client
instance and complicate the caller. The port already reflects the
right shape (`baseURL` is a method parameter on every method), so the
adapter follows it. The `*http.Client` is shared, so the HTTP
connection pool stays single-instance.
### 2. Two timeouts on the engine client, dispatched per method
**Decision.** The engine client routes turn-generation-class methods
(`Init`, `Turn`, `BanishRace`, `ExecuteCommands`, `PutOrders`)
through `CallTimeout` and inspect-style methods (`Status`,
`GetReport`) through `ProbeTimeout`. Both are required and must be
positive at construction.
**Why.** README §Configuration already declares the two
(`GAMEMASTER_ENGINE_CALL_TIMEOUT=30s`,
`GAMEMASTER_ENGINE_PROBE_TIMEOUT=5s`) for exactly this dispatch:
turn generation on a large game can run for tens of seconds, while
status/report reads are bounded and benefit from a tight ceiling.
A single shared timeout would either starve the long calls or relax
the short ones; the dispatch keeps the contract consistent with the
documented intent.
### 3. Engine `population` (number) decoded into `int` via `math.Round`
**Decision.**
[`engineclient`](../internal/adapters/engineclient/client.go) decodes
each `PlayerState.population` (typed as `number` in `game/openapi.yaml`)
into a private `float64` field, then converts to the port-level `int`
through `int(math.Round(value))`. NaN, infinite, and negative values
are rejected as `ports.ErrEngineProtocolViolation`.
**Why.** The port (Stage 10) and the AsyncAPI for `gm:lobby_events`
both treat population as a non-negative integer; the engine spec is
the only place it is typed as `number`. The engine in practice
returns whole values, but a defensive `math.Round` removes any
floating-point noise that would otherwise propagate to Lobby.
Rejecting NaN/Inf/negative payloads keeps the protocol invariant
explicit at the trust boundary.
### 4. Lobby client walks pagination with a hard page cap
**Decision.**
[`lobbyclient.GetMemberships`](../internal/adapters/lobbyclient/client.go)
walks the `next_page_token` chain transparently with `page_size=200`,
stopping when the upstream response carries an empty
`next_page_token`. A hard cap of 64 pages (`maxPages`) surfaces as
`fmt.Errorf("%w: pagination overflow ...", ports.ErrLobbyUnavailable)`
when crossed.
**Why.** The port contract is "every membership of gameID, in any
status"; the only way to satisfy it across Lobby's paged contract is
to follow the chain. The 64-page cap is a defensive guard against a
broken upstream that keeps issuing tokens; 64 × 200 = 12 800
memberships per game, two orders of magnitude beyond any realistic
Galaxy roster, so legitimate traffic never trips it. Surfacing the
overflow as `ErrLobbyUnavailable` lets the membership cache treat it
the same as any other transport fault.
### 5. RTM client does not introduce `ErrSemverPatchOnly`
**Decision.** RTM's `409 conflict` with `error_code=semver_patch_only`
is wrapped as `fmt.Errorf("%w: rtm patch: ... (error_code=semver_patch_only)", ports.ErrRTMUnavailable)`
without a dedicated typed sentinel.
**Why.** The Stage 10 port [`RTMClient.Patch`](../internal/ports/rtmclient.go)
declares only `ErrRTMUnavailable`. Adding `ErrSemverPatchOnly` here
would extend the port contract beyond Stage 10's frozen surface, and
the v1 service-layer caller (Stage 17, `adminpatch`) already
validates semver-patch eligibility against `engineversionstore`
before issuing the call. The 409 path is therefore a defence-in-depth
signal, not a primary branch; a single wrapped error keeps the port
narrow and lets the caller match on the message substring if it
ever needs to (today it does not).
### 6. Lobby-events publisher reuses the `rtmanager/healtheventspublisher` shape, with two methods sharing one stream
**Decision.**
[`lobbyeventspublisher.Publisher`](../internal/adapters/lobbyeventspublisher/publisher.go)
exposes `PublishSnapshotUpdate` and `PublishGameFinished`, both
hitting the same Redis Stream key (`cfg.Streams.LobbyEvents`,
default `gm:lobby_events`). Each XADD encodes the same field
vocabulary as `rtmanager/healtheventspublisher`: integer fields are
serialised through `strconv.FormatInt` / `strconv.Itoa`, the
per-player projection is JSON-encoded into one stream field
(`player_turn_stats`), and the discriminator field (`event_type`) is
a string literal pinned to one of the two AsyncAPI const values.
No MAXLEN cap is set on XADD; an empty `PlayerTurnStats` slice is
serialised as `"[]"` (literal). All `time.Time` fields are coerced
to UTC before `UnixMilli()` so the published timestamps match the
contract regardless of caller-supplied timezone.
**Why.** The two messages share one channel per the AsyncAPI spec
([`runtime-events-asyncapi.yaml`](../api/runtime-events-asyncapi.yaml));
the discriminator is the documented dispatch key for Lobby's
consumer. Using the existing field-encoding pattern from
`rtmanager/healtheventspublisher` keeps the wire format consistent
across services and lets Lobby reuse the same XADD-decoding helpers
it already runs against `runtime:health_events`. Setting MAXLEN was
considered and rejected: Game Master never processes the stream
itself, and the Lobby consumer owns its consumer-group offset, so
trimming would risk dropping unconsumed entries. The empty `"[]"`
default keeps the stream entry valid JSON for the field even before
the first turn generates (when no per-player stats exist yet).
### 7. Defensive Makefile guard for `make mocks` between Stage 12 and Stage 19
**Decision.** The `mocks` Makefile target now skips the
`internal/api/internalhttp/handlers/...` line when that directory
does not yet exist:
```makefile
mocks:
	go generate ./internal/ports/...
	@if [ -d ./internal/api/internalhttp/handlers ]; then \
		go generate ./internal/api/internalhttp/handlers/...; \
	fi
```
**Why.** Stage 8 wired the Makefile to regenerate both port-level
and handler-level mocks, but the handlers directory only appears at
Stage 19. Without the guard, `make mocks` fails with `lstat: no such
file or directory` between Stage 12 and Stage 19 — exactly when GM
is being grown stage by stage. With the guard, the target succeeds at
every stage, and the check costs nothing once the directory
exists.
## Files landed
- [`../internal/adapters/engineclient/client.go`](../internal/adapters/engineclient/client.go),
[`../internal/adapters/engineclient/client_test.go`](../internal/adapters/engineclient/client_test.go)
- [`../internal/adapters/lobbyclient/client.go`](../internal/adapters/lobbyclient/client.go),
[`../internal/adapters/lobbyclient/client_test.go`](../internal/adapters/lobbyclient/client_test.go)
- [`../internal/adapters/rtmclient/client.go`](../internal/adapters/rtmclient/client.go),
[`../internal/adapters/rtmclient/client_test.go`](../internal/adapters/rtmclient/client_test.go)
- [`../internal/adapters/notificationpublisher/publisher.go`](../internal/adapters/notificationpublisher/publisher.go),
[`../internal/adapters/notificationpublisher/publisher_test.go`](../internal/adapters/notificationpublisher/publisher_test.go)
- [`../internal/adapters/lobbyeventspublisher/publisher.go`](../internal/adapters/lobbyeventspublisher/publisher.go),
[`../internal/adapters/lobbyeventspublisher/publisher_test.go`](../internal/adapters/lobbyeventspublisher/publisher_test.go)
- [`../internal/adapters/mocks/`](../internal/adapters/mocks) — ten
generated `mockgen` files covering every Stage 10 port (engine,
lobby, rtm, notification publisher, lobby-events publisher, plus
the five store/log ports landed by Stage 11).
- [`../Makefile`](../Makefile) — defensive guard on the `mocks`
target.
- [`../README.md`](../README.md) — §References pointer to this
record.
## Verification
```sh
cd gamemaster
# Mocks regenerate cleanly with no diff after a second run.
make mocks
git diff --exit-code internal/adapters/mocks
# Adapter-level unit tests against httptest / miniredis.
go test ./internal/adapters/engineclient/...
go test ./internal/adapters/lobbyclient/...
go test ./internal/adapters/rtmclient/...
go test ./internal/adapters/notificationpublisher/...
go test ./internal/adapters/lobbyeventspublisher/...
# Full repo build remains green; Stage 06/07/09–11 contract and
# adapter tests are unaffected.
go test ./...
```