---
stage: 19
title: Internal REST handlers
---

# Stage 19: Internal REST handlers

This decision record captures the non-obvious choices made while bringing the trusted internal REST listener of Game Master to full contract coverage. The handlers wire the existing service layer (stages 13–17) and the membership cache (stage 16) to the eighteen operations frozen by [`../api/internal-openapi.yaml`](../api/internal-openapi.yaml). The listener lifecycle, OpenTelemetry middleware, and the `/healthz` / `/readyz` probes were established in stage 08; this stage adds the per-operation handler subpackage, widens the listener `Dependencies` struct to thread every service port, and grows [`../internal/app/wiring.go`](../internal/app/wiring.go) to construct the entire dependency graph (stores, adapters, services, workers).

The reference precedent for the handler shape is the rtmanager `internal/api/internalhttp/handlers` tree; the conformance test mirrors `rtmanager/internal/api/internalhttp/conformance_test.go`. Eight decisions deviate from a literal reading of [`../PLAN.md`](../PLAN.md) or are sharp enough to surface here.

## Decisions

### D1. Conformance test lives inside the listener package

**Decision.** The OpenAPI conformance test ships at [`../internal/api/internalhttp/conformance_test.go`](../internal/api/internalhttp/conformance_test.go), in the `internalhttp` package, not at `gamemaster/api/openapi_conformance_test.go` as the literal text of PLAN.md Stage 19 suggests.

**Why.** The test instantiates the live `Server.handler` through `NewServer(...)` with stub services and replays each documented operation against it. That requires reading the unexported `handler` field and wiring stub implementations of the handler-package interfaces; both are package-internal concerns that a sibling test under `gamemaster/api/` would not have access to without exporting hooks that exist solely for the test.
The rtmanager service ships the analogous test inside its own `internalhttp` package; we follow the same idiom.

**How to apply.** Future surface-shape audits go in this file. The PLAN.md text is treated as drift; the constraint that the spec is covered by kin-openapi-driven validation is honoured exactly.

### D2. `DELETE /engine-versions/{version}` calls `Service.Deprecate`

**Decision.** The handler bound to the OpenAPI operation `internalDeprecateEngineVersion` calls [`engineversion.Service.Deprecate`](../internal/service/engineversion/service.go) and never `Service.Delete`. The 409 response declared by the spec for `engine_version_in_use` is therefore unreachable on this endpoint.

**Why.** The operation id and the first sentence of the description explicitly say «Sets the engine version status to `deprecated`». The sentence about hard removal and `engine_version_in_use` is a leftover from an earlier intent: `Service.Deprecate` does not consult `IsReferencedByActiveRuntime`, so the in-use rejection cannot fire through this code path. Hard delete is a future Admin Service operation; v1 does not expose it through REST.

**How to apply.** Calls that need to release the registry row permanently must use `Service.Delete` directly (not yet wired through REST). The spec's leftover 409 example is recorded here so a future contract reviewer does not chase a phantom failure mode.

### D3. Workers wired and started alongside the listener

**Decision.** This stage constructs the scheduler ticker (stage 15) and the `runtime:health_events` consumer (stage 18) inside `wiring.buildWorkers` and registers them as `App.Component`s next to the internal HTTP server.

**Why.** Stage 19's narrow text says «ship the gateway-, Lobby- and Admin-facing REST surface backed by the service layer». But the service layer collaborators referenced from the listener (turn generation, membership cache, runtime record store, etc.)
only make sense inside a process that is also producing turns and consuming health events. Keeping the workers idle would leave the wiring graph half-built and the dev experience surprising. Constructing and starting them here makes a freshly-deployed process production-ready the moment the listener accepts traffic.

**How to apply.** The two workers are owned by `App.Run` exactly like the listener: both `Run` (long-lived) and `Shutdown` are part of `App.Component`. See D4 for the trivial `Shutdown` added on the scheduler ticker.

### D4. `schedulerticker.Worker.Shutdown` is a no-op

**Decision.** The scheduler ticker adds a one-line `Shutdown(_ context.Context) error { return nil }` so the type satisfies `app.Component`.

**Why.** The worker's `Run` already returns when the supplied context is cancelled, and `wg.Wait` drains the in-flight per-game goroutines before `Run` returns. There is nothing additional to release. The `healtheventsconsumer.Worker` already had a `Shutdown` from stage 18; this just brings the two workers to the same shape.

**How to apply.** When future workers grow real shutdown logic (buffered output to flush, persistent connections to drain), they should embed it inside `Shutdown` rather than relying on context cancellation alone.

### D5. New `RuntimeRecordStore.List(ctx)` method

**Decision.** The port grows a fifth read method: `List(ctx) ([]runtime.RuntimeRecord, error)`. The PostgreSQL adapter implements it as one SELECT ordered by `(created_at DESC, game_id ASC)`.

**Why.** The OpenAPI operation `internalListRuntimes` accepts an optional `status` query parameter. With the parameter set, the existing `ListByStatus` answers; without it, no method on the port returned every record. Composing the unfiltered list as a loop-over-statuses would dilute the ordering guarantee and double the round-trip cost. The new method is additive: every other caller keeps using its narrow read.


**How to apply.** Test fakes (`fakeRuntimeRecords` in service tests, `fakeRuntimeRecordsBackend` in scheduler-ticker tests) gained the method as well. The handler-side `RuntimeRecordsReader` interface exposes only the three read methods (`Get`, `List`, `ListByStatus`) so the listener cannot accidentally mutate runtime state.

### D6. `next_generation_at` encodes as `0` when unscheduled

**Decision.** The wire `RuntimeRecord.next_generation_at` field is declared `required: true` and `format: int64`. The domain holds `*time.Time` and may carry `nil`, typically while a runtime is in status `starting` and the first scheduling write has not yet landed. The encoder writes `0` in that case and writes the UTC millisecond value otherwise.

**Why.** Encoding `nil` as `0` keeps the wire shape JSON-Schema-valid without forcing every record reader to handle a missing field. Optional pointer-typed timestamps (`started_at`, `stopped_at`, `finished_at`) are still omitted from the JSON form via `omitempty`, matching the `required` list in the spec.

**How to apply.** Readers must treat `next_generation_at == 0` as «not yet scheduled» when the status warrants it; the field will turn into a real Unix-millisecond value once the scheduler's first write lands. The conformance test seeds a non-nil `NextGenerationAt`, so the strict response validator never sees this edge case at the wire boundary.

### D7. Hot-path bodies are pass-through, not strict-decoded

**Decision.** The `internalExecuteCommands` and `internalPutOrders` handlers read the request body as raw bytes. The body is rejected only when empty or not valid JSON; unknown fields pass through.

**Why.** The OpenAPI request schemas for these operations carry `additionalProperties: true` because the envelopes are engine-owned (`galaxy/game/openapi.yaml`). Strict decoding here would reject legitimate engine extensions and force every contract bump to land in two services in lockstep.

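
The pass-through rule from the D7 decision (reject only empty or syntactically invalid bodies, let unknown fields through) can be sketched with the standard library's `json.Valid`. This is an illustration under assumptions, not the handler code: `checkPassThroughBody`, the two error values, and the sample field names are hypothetical, and treating a whitespace-only body as empty is an assumption of this sketch.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// Hypothetical sentinel errors; the real handlers map failures to the
// Game Master error envelope instead.
var (
	errEmptyBody   = errors.New("request body is empty")
	errInvalidJSON = errors.New("request body is not valid JSON")
)

// checkPassThroughBody accepts any syntactically valid JSON document and
// returns the bytes unchanged, so unknown fields reach the engine intact.
func checkPassThroughBody(raw []byte) ([]byte, error) {
	// Assumption: a body with only whitespace counts as empty.
	if len(bytes.TrimSpace(raw)) == 0 {
		return nil, errEmptyBody
	}
	if !json.Valid(raw) {
		return nil, errInvalidJSON
	}
	return raw, nil
}

func main() {
	samples := []string{
		`{"commands":[],"engine_extension":true}`, // unknown field passes
		``,             // rejected: empty
		`{"commands":`, // rejected: truncated JSON
	}
	for _, s := range samples {
		_, err := checkPassThroughBody([]byte(s))
		fmt.Printf("%q -> %v\n", s, err)
	}
}
```

Because the bytes are never decoded into a struct, no schema knowledge about the engine-owned envelope leaks into this layer.
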

**How to apply.** Engine `engine_validation_error` responses still surface as the canonical Game Master error envelope at HTTP 502: the engine response body is recorded in `result.RawResponse` for audit, but the OpenAPI spec mandates the error envelope on this code path. If a future contract version requires forwarding the engine's 4xx body to the gateway, a separate response shape needs to land in the spec first.

### D8. `X-Galaxy-Caller` mapping with admin default

**Decision.** The `resolveOpSource` helper maps the `X-Galaxy-Caller` header values to [`operation.OpSource`](../internal/domain/operation/log.go) as follows: `gateway → OpSourceGatewayPlayer`, `lobby → OpSourceLobbyInternal`, `admin → OpSourceAdminRest`. Missing or unrecognised values fall back to `OpSourceAdminRest`, matching the contract documented in [`../README.md` §«Internal REST API»](../README.md).

**Why.** The default is conservative: an Admin Service request without the header still records as admin instead of being dropped. The other two values are reserved for the documented callers; header values are trimmed and lowercased before matching, so a casing slip in development does not produce a confusing audit row.

**How to apply.** New REST callers should set the header explicitly. Adding a fourth caller type requires an `OpSource` constant alongside the mapping change.

## What ships

- Eighteen operation handlers under [`../internal/api/internalhttp/handlers`](../internal/api/internalhttp/handlers).
- The previously probe-only `internal/api/internalhttp/server.go` now widens `Dependencies` and forwards the per-operation services to `handlers.Register`.
- Full dependency graph in [`../internal/app/wiring.go`](../internal/app/wiring.go): five stores, five external adapters, eleven services, two workers.
- `RuntimeRecordStore.List(ctx)` plus its PostgreSQL adapter implementation and regression tests ([`../internal/adapters/postgres/runtimerecordstore`](../internal/adapters/postgres/runtimerecordstore)).
- `schedulerticker.Worker.Shutdown` so the worker is an `App.Component`.
- Mockgen-generated handler-port mocks under [`../internal/api/internalhttp/handlers/mocks`](../internal/api/internalhttp/handlers/mocks).
- A kin-openapi-driven conformance test ([`../internal/api/internalhttp/conformance_test.go`](../internal/api/internalhttp/conformance_test.go)) that validates request and response shapes for every documented operation against [`../api/internal-openapi.yaml`](../api/internal-openapi.yaml).
- Per-handler unit tests covering happy paths, error-code mapping, unknown-field rejection, and header validation.

## What remains for later stages

- Lobby refactor (stage 20) flips Lobby's start flow to call `GET /api/v1/internal/engine-versions/{version}/image-ref` synchronously and adds the `InvalidateMemberships` outbound call on every roster mutation.
- Service-local integration suite (stage 21) drives the listener end-to-end against a real engine container.
- Cross-service integration tests (stages 22–23) cover Lobby + GM, Lobby + GM + RTM happy and failure paths.