Stage 19 — Internal REST handlers
This decision record captures the non-obvious choices made while
bringing the trusted internal REST listener of Game Master to full
contract coverage. The handlers wire the existing service layer
(stages 13–17) and the membership cache (stage 16) to the eighteen
operations frozen by
../api/internal-openapi.yaml. The
listener lifecycle, OpenTelemetry middleware, and the /healthz /
/readyz probes were established in stage 08; this stage adds the
per-operation handler subpackage, widens the listener Dependencies
struct to thread every service port, and grows
../internal/app/wiring.go to construct
the entire dependency graph (stores, adapters, services, workers).
The reference precedent for the handler shape is the rtmanager
internal/api/internalhttp/handlers tree; the conformance test
mirrors rtmanager/internal/api/internalhttp/conformance_test.go.
Eight decisions deviate from a literal reading of
../PLAN.md or are sharp enough to surface here.
Decisions
D1. Conformance test lives inside the listener package
Decision. The OpenAPI conformance test ships at
../internal/api/internalhttp/conformance_test.go,
in the internalhttp package, not at
gamemaster/api/openapi_conformance_test.go as the literal text of
PLAN.md Stage 19 suggests.
Why. The test instantiates the live Server.handler through
NewServer(...) with stub services and replays each documented
operation against it. That requires reading the unexported
handler field and wiring stub implementations of the
handler-package interfaces; both are package-internal concerns that a
sibling test under gamemaster/api/ would not have access to without
exporting hooks that exist solely for the test. The rtmanager
service ships the analogous test inside its own internalhttp
package; we follow the same idiom.
How to apply. Future surface-shape audits go in this file. The
PLAN.md wording is treated as drift; the constraint that the spec is
covered by kin-openapi-driven validation is honoured exactly.
D2. DELETE /engine-versions/{version} calls Service.Deprecate
Decision. The handler bound to the OpenAPI operation
internalDeprecateEngineVersion calls
engineversion.Service.Deprecate
and never Service.Delete. The 409 response declared by the
spec for engine_version_in_use is therefore unreachable on this
endpoint.
Why. The operation id and the first sentence of the description
explicitly say «Sets the engine version status to deprecated». The
sentence about hard removal and engine_version_in_use is a
leftover of an earlier intent — Service.Deprecate does not consult
IsReferencedByActiveRuntime, so the in-use rejection cannot fire
through this code path. Hard delete is a future Admin Service
operation; v1 does not expose it through REST.
How to apply. Callers that need to release the registry row
permanently must use Service.Delete directly (not yet wired through
REST). The spec's leftover 409 example is recorded here so a future
contract reviewer does not chase a phantom failure mode.
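The reachability argument can be illustrated with a minimal in-memory sketch. Everything here is hypothetical: the registry type, its fields, and the error values are stand-ins, and the sketch assumes that only the hard-delete path consults the "referenced by an active runtime" check, as the reasoning above implies.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical error values; the real service may use different ones.
var (
	errNotFound           = errors.New("not found")
	errEngineVersionInUse = errors.New("engine_version_in_use")
)

// registry is an illustrative stand-in for the engine version service.
type registry struct {
	status     map[string]string // version -> "active" | "deprecated"
	referenced map[string]bool   // version -> referenced by an active runtime
}

// newRegistry seeds one version that an active runtime still references.
func newRegistry() *registry {
	return &registry{
		status:     map[string]string{"1.2.0": "active"},
		referenced: map[string]bool{"1.2.0": true},
	}
}

// Deprecate flips the status and never looks at runtime references,
// so it has no way to return errEngineVersionInUse.
func (r *registry) Deprecate(version string) error {
	if _, ok := r.status[version]; !ok {
		return errNotFound
	}
	r.status[version] = "deprecated"
	return nil
}

// Delete is the hard-removal path; it is the only method that can
// reject with errEngineVersionInUse (assumed behaviour).
func (r *registry) Delete(version string) error {
	if _, ok := r.status[version]; !ok {
		return errNotFound
	}
	if r.referenced[version] {
		return errEngineVersionInUse
	}
	delete(r.status, version)
	return nil
}

func main() {
	r := newRegistry()
	fmt.Println(r.Deprecate("1.2.0"), r.status["1.2.0"])
	fmt.Println(r.Delete("1.2.0"))
}
```

In this sketch a version that is still referenced deprecates without error, while the in-use rejection only appears on the delete path, which the REST handler never calls.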
D3. Workers wired and started alongside the listener
Decision. This stage constructs the scheduler ticker (stage 15)
and the runtime:health_events consumer (stage 18) inside
wiring.buildWorkers and registers them as App.Component values next
to the internal HTTP server.
Why. Stage 19's narrow text says «ship the gateway-, Lobby- and Admin-facing REST surface backed by the service layer». But the service layer collaborators referenced from the listener (turn generation, membership cache, runtime record store, etc.) only make sense inside a process that is also producing turns and consuming health events. Keeping the workers idle would leave the wiring graph half-built and the dev experience surprising. Constructing and starting them here makes a freshly-deployed process production-ready the moment the listener accepts traffic.
How to apply. The two workers are owned by App.Run exactly
like the listener: both Run (long-lived) and Shutdown are part
of App.Component. See D4 for the trivial Shutdown added on the
scheduler ticker.
D4. schedulerticker.Worker.Shutdown is a no-op
Decision. The scheduler ticker adds a one-line
Shutdown(_ context.Context) error { return nil } so the type
satisfies app.Component.
Why. The worker's Run already returns when the supplied
context is cancelled, and wg.Wait drains the in-flight per-game
goroutines before Run returns. There is nothing additional to
release. The healtheventsconsumer.Worker already had a Shutdown
from stage 18; this just brings the two workers to the same shape.
How to apply. When future workers grow real shutdown logic
(buffered output to flush, persistent connections to drain), they
should embed it inside Shutdown rather than relying on context
cancellation alone.
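A minimal sketch of the shape described above, under stated assumptions: the Component interface signature, the Worker fields, and the choice to return nil from Run on cancellation are illustrative guesses, not the actual scheduler ticker code.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// Component is an assumed rendering of the lifecycle contract.
type Component interface {
	Run(ctx context.Context) error
	Shutdown(ctx context.Context) error
}

// Worker is a hypothetical stand-in for the scheduler ticker.
type Worker struct {
	interval time.Duration
	tick     func() // per-tick work, executed in its own goroutine
}

// Run ticks until ctx is cancelled, then waits for in-flight
// goroutines before returning.
func (w *Worker) Run(ctx context.Context) error {
	var wg sync.WaitGroup
	ticker := time.NewTicker(w.interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			wg.Wait() // drain in-flight work before Run returns
			return nil
		case <-ticker.C:
			wg.Add(1)
			go func() {
				defer wg.Done()
				w.tick()
			}()
		}
	}
}

// Shutdown is a no-op: Run has already released everything it owns.
func (w *Worker) Shutdown(_ context.Context) error { return nil }

// Compile-time check that Worker satisfies Component.
var _ Component = (*Worker)(nil)

// demo runs the worker for a short window and reports what happened.
func demo() (int64, error, error) {
	var ticks int64
	w := &Worker{
		interval: time.Millisecond,
		tick:     func() { atomic.AddInt64(&ticks, 1) },
	}
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	runErr := w.Run(ctx)
	shutdownErr := w.Shutdown(context.Background())
	return atomic.LoadInt64(&ticks), runErr, shutdownErr
}

func main() {
	ticks, runErr, shutdownErr := demo()
	fmt.Println(ticks > 0, runErr, shutdownErr)
}
```

Because wg.Wait sits inside Run, there is nothing left for Shutdown to release once Run has returned, which is why the no-op is sufficient in this shape.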
D5. New RuntimeRecordStore.List(ctx) method
Decision. The port grows a fifth read method:
List(ctx) ([]runtime.RuntimeRecord, error). The PostgreSQL
adapter implements it as one SELECT ordered by
(created_at DESC, game_id ASC).
Why. The OpenAPI operation internalListRuntimes accepts an
optional status query parameter. With the parameter set, the
existing ListByStatus answers; without it, no method on the port
returned every record. Composing the unfiltered list as a
loop-over-statuses would dilute the ordering guarantee and double
the round-trip cost. The new method is additive — every other
caller keeps using its narrow read.
How to apply. Test fakes (fakeRuntimeRecords in service tests,
fakeRuntimeRecordsBackend in scheduler-ticker tests) gained the
method as well. The handler-side RuntimeRecordsReader interface
exposes only the three read methods (Get, List, ListByStatus)
so the listener cannot accidentally mutate runtime state.
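The ordering guarantee and the read-only handler port can be sketched with an in-memory fake. This is an illustration only: the record fields, the Get and ListByStatus signatures, and the assumption that ListByStatus shares the same ordering are not confirmed by the source; only List(ctx) returning a slice and an error is.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"time"
)

// RuntimeRecord is trimmed to the fields the ordering needs.
type RuntimeRecord struct {
	GameID    string
	Status    string
	CreatedAt time.Time
}

// RuntimeRecordsReader is the read-only view given to handlers.
// Get and ListByStatus signatures are assumptions.
type RuntimeRecordsReader interface {
	Get(ctx context.Context, gameID string) (RuntimeRecord, bool, error)
	List(ctx context.Context) ([]RuntimeRecord, error)
	ListByStatus(ctx context.Context, status string) ([]RuntimeRecord, error)
}

type fakeRuntimeRecords struct{ rows []RuntimeRecord }

// ordered copies rows and sorts by (created_at DESC, game_id ASC).
func ordered(rows []RuntimeRecord) []RuntimeRecord {
	out := append([]RuntimeRecord(nil), rows...)
	sort.SliceStable(out, func(i, j int) bool {
		if !out[i].CreatedAt.Equal(out[j].CreatedAt) {
			return out[i].CreatedAt.After(out[j].CreatedAt)
		}
		return out[i].GameID < out[j].GameID
	})
	return out
}

func (f *fakeRuntimeRecords) Get(_ context.Context, gameID string) (RuntimeRecord, bool, error) {
	for _, r := range f.rows {
		if r.GameID == gameID {
			return r, true, nil
		}
	}
	return RuntimeRecord{}, false, nil
}

func (f *fakeRuntimeRecords) List(_ context.Context) ([]RuntimeRecord, error) {
	return ordered(f.rows), nil
}

func (f *fakeRuntimeRecords) ListByStatus(_ context.Context, status string) ([]RuntimeRecord, error) {
	var out []RuntimeRecord
	for _, r := range f.rows {
		if r.Status == status {
			out = append(out, r)
		}
	}
	return ordered(out), nil
}

// listedIDs mimics the handler branch: an empty status means unfiltered.
func listedIDs(status string) []string {
	t0 := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	var reader RuntimeRecordsReader = &fakeRuntimeRecords{rows: []RuntimeRecord{
		{GameID: "b", Status: "running", CreatedAt: t0},
		{GameID: "a", Status: "running", CreatedAt: t0},
		{GameID: "c", Status: "starting", CreatedAt: t0.Add(time.Hour)},
	}}
	var recs []RuntimeRecord
	if status == "" {
		recs, _ = reader.List(context.Background())
	} else {
		recs, _ = reader.ListByStatus(context.Background(), status)
	}
	ids := make([]string, 0, len(recs))
	for _, r := range recs {
		ids = append(ids, r.GameID)
	}
	return ids
}

func main() {
	fmt.Println(listedIDs(""), listedIDs("running"))
}
```

A single sorted read keeps the ordering stable across statuses, which a loop over per-status queries would only achieve by merging and re-sorting on the caller side.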
D6. next_generation_at encodes as 0 when unscheduled
Decision. The wire RuntimeRecord.next_generation_at field is
declared required: true and format: int64. The domain holds
*time.Time and may carry nil — typically while a runtime is in
status starting and the first scheduling write has not yet
landed. The encoder writes 0 in that case and writes the UTC
millisecond value otherwise.
Why. Encoding nil as 0 keeps the wire shape JSON-Schema-valid
without forcing every record reader to handle a missing field.
Optional pointer-typed timestamps (started_at, stopped_at,
finished_at) are still omitted from the JSON form via omitempty,
matching the required list in the spec.
How to apply. Readers must treat next_generation_at == 0 as
«not yet scheduled» when the status warrants it; the field will
turn into a real Unix-millisecond value once the scheduler's first
write lands. The conformance test seeds a non-nil
NextGenerationAt, so the strict response validator never sees
this edge case at the wire boundary.
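The encoding rule can be sketched as follows. The struct and helper names are hypothetical, and encoding started_at as Unix milliseconds is an assumption made for symmetry; only the 0-for-nil rule and the omitempty behaviour come from the decision above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// runtimeRecord is a trimmed, hypothetical domain type.
type runtimeRecord struct {
	GameID           string
	NextGenerationAt *time.Time
	StartedAt        *time.Time
}

// wireRuntimeRecord is the JSON shape: next_generation_at is always
// present, started_at is dropped when nil.
type wireRuntimeRecord struct {
	GameID           string `json:"game_id"`
	NextGenerationAt int64  `json:"next_generation_at"`
	StartedAt        *int64 `json:"started_at,omitempty"`
}

// millisOrZero maps nil to 0 and anything else to UTC Unix milliseconds.
func millisOrZero(t *time.Time) int64 {
	if t == nil {
		return 0
	}
	return t.UTC().UnixMilli()
}

// optionalMillis keeps nil as nil so omitempty can drop the field.
func optionalMillis(t *time.Time) *int64 {
	if t == nil {
		return nil
	}
	ms := t.UTC().UnixMilli()
	return &ms
}

// encode converts the domain record to its wire JSON form.
func encode(r runtimeRecord) string {
	b, err := json.Marshal(wireRuntimeRecord{
		GameID:           r.GameID,
		NextGenerationAt: millisOrZero(r.NextGenerationAt),
		StartedAt:        optionalMillis(r.StartedAt),
	})
	if err != nil {
		return ""
	}
	return string(b)
}

// encodeDemo builds a sample record, scheduled or not, and encodes it.
func encodeDemo(scheduled bool) string {
	rec := runtimeRecord{GameID: "g1"}
	if scheduled {
		at := time.UnixMilli(1700000000000)
		rec.NextGenerationAt = &at
	}
	return encode(rec)
}

func main() {
	fmt.Println(encodeDemo(false)) // {"game_id":"g1","next_generation_at":0}
	fmt.Println(encodeDemo(true))  // {"game_id":"g1","next_generation_at":1700000000000}
}
```

Using a plain int64 for the required field and a pointer with omitempty for the optional ones lets the struct tags alone express which fields may be absent.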
D7. Hot-path bodies are pass-through, not strict-decoded
Decision. The hot-path handlers (internalExecuteCommands,
internalPutOrders) read the request body as raw bytes. The body is
rejected only when empty or not valid JSON; unknown fields pass through.
Why. The OpenAPI request schemas for these operations carry
additionalProperties: true because the envelopes are engine-owned
(galaxy/game/openapi.yaml). Strict decoding here would reject
legitimate engine extensions and force every contract bump to land
in two services in lockstep.
How to apply. Engine engine_validation_error responses still
surface as the canonical Game Master error envelope at HTTP 502 —
the engine response body is recorded in result.RawResponse for
audit but the OpenAPI spec mandates the error envelope on this code
path. If a future contract version requires forwarding the engine's
4xx body to the gateway, a separate response shape needs to land in
the spec first.
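A minimal sketch of the pass-through check, assuming the body is read in full and validated only for non-emptiness and JSON well-formedness. The function and error names are hypothetical, and the sketch omits any body size limit the real handlers may apply.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// Hypothetical sentinel errors for the two rejection reasons.
var (
	errEmptyBody   = errors.New("request body is empty")
	errInvalidJSON = errors.New("request body is not valid JSON")
)

// readPassThroughBody returns the raw bytes untouched when they are
// non-empty, well-formed JSON. It never decodes into a struct, so
// unknown fields survive and can be forwarded to the engine as-is.
func readPassThroughBody(r io.Reader) ([]byte, error) {
	raw, err := io.ReadAll(r)
	if err != nil {
		return nil, err
	}
	if len(raw) == 0 {
		return nil, errEmptyBody
	}
	if !json.Valid(raw) {
		return nil, errInvalidJSON
	}
	return raw, nil
}

// classify is a small helper that names the outcome for a given body.
func classify(body string) string {
	_, err := readPassThroughBody(strings.NewReader(body))
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, errEmptyBody):
		return "empty"
	case errors.Is(err, errInvalidJSON):
		return "invalid"
	default:
		return "error"
	}
}

func main() {
	fmt.Println(classify(`{"commands":[],"x_engine_extension":1}`)) // ok
	fmt.Println(classify(""))                                       // empty
	fmt.Println(classify("{"))                                      // invalid
}
```

json.Valid checks syntax only, which is what keeps engine-owned extension fields from being rejected at this layer.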
D8. X-Galaxy-Caller mapping with admin default
Decision. The resolveOpSource helper maps the
X-Galaxy-Caller header values to
operation.OpSource as
follows: gateway → OpSourceGatewayPlayer,
lobby → OpSourceLobbyInternal, admin → OpSourceAdminRest.
Missing or unrecognised values fall back to OpSourceAdminRest,
matching the contract documented in
../README.md §«Internal REST API».
Why. The default is conservative: an Admin Service request without
the header still records as admin instead of being dropped. The other
two values are reserved for the documented callers and are matched
after trimming whitespace and lowercasing, so a casing slip in
development does not produce a confusing audit row.
How to apply. New REST callers should set the header
explicitly. Adding a fourth caller type requires an OpSource
constant alongside the mapping change.
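The mapping can be sketched as a small switch. The OpSource type and the string values of its constants are assumptions for illustration; the header values, the constant names, the tolerant matching, and the admin fallback follow the decision above.

```go
package main

import (
	"fmt"
	"strings"
)

// OpSource is modelled as a string here; the real type may differ.
type OpSource string

// Constant names follow the decision record; the values are assumed.
const (
	OpSourceGatewayPlayer OpSource = "gateway_player"
	OpSourceLobbyInternal OpSource = "lobby_internal"
	OpSourceAdminRest     OpSource = "admin_rest"
)

// resolveOpSource maps an X-Galaxy-Caller header value to an OpSource.
// Matching ignores surrounding whitespace and case; missing or
// unrecognised values fall back to the admin source.
func resolveOpSource(header string) OpSource {
	switch strings.ToLower(strings.TrimSpace(header)) {
	case "gateway":
		return OpSourceGatewayPlayer
	case "lobby":
		return OpSourceLobbyInternal
	default:
		return OpSourceAdminRest
	}
}

func main() {
	for _, h := range []string{"gateway", " Lobby ", "admin", "", "billing"} {
		fmt.Printf("%q -> %s\n", h, resolveOpSource(h))
	}
}
```

Folding "admin", the empty value, and unknown values into the default branch means a fourth caller type needs both a new constant and a new case, as noted above.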
What ships
- Eighteen operation handlers under
  ../internal/api/internalhttp/handlers.
- The probe-only internal/api/internalhttp/server.go now widens
  Dependencies and forwards the per-operation services to
  handlers.Register.
- Full dependency graph in ../internal/app/wiring.go: five stores,
  five external adapters, eleven services, two workers.
- RuntimeRecordStore.List(ctx) plus its PostgreSQL adapter
  implementation and regression tests
  (../internal/adapters/postgres/runtimerecordstore).
- schedulerticker.Worker.Shutdown so the worker is an App.Component.
- Mockgen-generated handler-port mocks under
  ../internal/api/internalhttp/handlers/mocks.
- A kin-openapi-driven conformance test
  (../internal/api/internalhttp/conformance_test.go) that validates
  request and response shapes for every documented operation against
  ../api/internal-openapi.yaml.
- Per-handler unit tests covering happy paths, error-code mapping,
  unknown-field rejection, and header validation.
What remains for later stages
- Lobby refactor (stage 20) flips Lobby's start flow to call
  GET /api/v1/internal/engine-versions/{version}/image-ref
  synchronously and adds the InvalidateMemberships outbound call on
  every roster mutation.
- Service-local integration suite (stage 21) drives the listener
  end-to-end against a real engine container.
- Cross-service integration tests (stages 22–23) cover Lobby + GM,
  Lobby + GM + RTM happy and failure paths.