| stage | title |
|---|---|
| 14 | Engine version registry service |
Stage 14 — Engine version registry service
This decision record captures the non-obvious choices made while
implementing the engine_version registry service layer at PLAN
Stage 14. The service backs the
/api/v1/internal/engine-versions/* REST surface (Stage 19) and the
hot-path image_ref resolve called synchronously by Game Lobby's
start flow.
Context
../PLAN.md Stage 14 lists seven service methods:
List, Get, Create, Update, Deprecate, Delete,
ResolveImageRef. The lifecycle the service drives is frozen by
../README.md §Engine Version Registry. The reference
precedent for shape and audit semantics is
../internal/service/registerruntime, which landed at Stage 13.
Five decisions deviate from a literal reading of either Stage 14 or the existing port and migration shapes. Each is recorded below.
Decisions
1. EngineVersionStore.Delete extension
Decision. ports.EngineVersionStore
gains a Delete(ctx, version) error method that returns
engineversion.ErrNotFound when no row matches. The PostgreSQL-backed
adapter engineversionstore.Store.Delete
issues a single DELETE FROM engine_versions WHERE version = $1 and
distinguishes "missing" from "removed" via RowsAffected. The mock at
internal/adapters/mocks/mock_engineversionstore.go
is regenerated by make -C gamemaster mocks. Three adapter tests
(TestDeleteHappy, TestDeleteNotFound, TestDeleteRejectsEmptyVersion)
mirror the pattern from the existing Deprecate tests.
Why. Stage 14 explicitly requires the service to expose a hard
Delete distinct from Deprecate. The Stage 11 port surface only
carried Deprecate (idempotent soft-mark) and
IsReferencedByActiveRuntime (read probe). Three alternatives were
considered and rejected:
- Skip hard delete: omits a Stage 14 deliverable and forces a port
  delta later. The OpenAPI 409 engine_version_in_use example would
  also become a dangling spec entry.
- Reuse Deprecate for both soft and hard semantics: contradicts
  README §Engine Version Registry ("status values: ... deprecated
  (rejected on new starts; existing runtimes unaffected)"). A
  referenced version must remain deprecable so the operator can phase
  in a successor while existing runtimes finish out; folding the
  reference check into Deprecate would break that flow.
- Inline the SQL inside the service: contradicts the per-port
  abstraction Stage 10 set up; the service must not import the jet
  table package.
This is the same pattern Stage 13 D1 used for
RuntimeRecordStore.Delete: a small, targeted contract delta admitted
by the pre-launch single-init policy.
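The adapter behaviour in Decision 1 can be sketched as follows. This is a minimal illustration, not the landed code: `Store`, `execer`, `fakeDB`, and the helper `deleteWithRows` are hypothetical names, and `ErrNotFound` stands in for engineversion.ErrNotFound. The fake lets the sketch run without PostgreSQL.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
)

// ErrNotFound stands in for engineversion.ErrNotFound (assumption).
var ErrNotFound = errors.New("engine version not found")

// execer is the slice of *sql.DB the sketch needs; *sql.DB and *sql.Tx
// both satisfy it.
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// Store is a hypothetical stand-in for engineversionstore.Store.
type Store struct{ db execer }

// Delete issues a single DELETE and uses RowsAffected to tell
// "missing" (zero rows) from "removed" (one row).
func (s *Store) Delete(ctx context.Context, version string) error {
	if version == "" {
		return errors.New("delete engine version: empty version")
	}
	res, err := s.db.ExecContext(ctx,
		"DELETE FROM engine_versions WHERE version = $1", version)
	if err != nil {
		return fmt.Errorf("delete engine version: %w", err)
	}
	n, err := res.RowsAffected()
	if err != nil {
		return fmt.Errorf("delete engine version: rows affected: %w", err)
	}
	if n == 0 {
		return ErrNotFound
	}
	return nil
}

// fakeResult and fakeDB report a fixed row count for the demo.
type fakeResult int64

func (r fakeResult) LastInsertId() (int64, error) { return 0, nil }
func (r fakeResult) RowsAffected() (int64, error) { return int64(r), nil }

type fakeDB struct{ rows int64 }

func (f fakeDB) ExecContext(context.Context, string, ...any) (sql.Result, error) {
	return fakeResult(f.rows), nil
}

// deleteWithRows runs Delete against a fake that affects `rows` rows.
func deleteWithRows(rows int64, version string) error {
	return (&Store{db: fakeDB{rows: rows}}).Delete(context.Background(), version)
}

func main() {
	fmt.Println("removed:", deleteWithRows(1, "v1.2.3"))
	fmt.Println("missing:", deleteWithRows(0, "v9.9.9"))
}
```

The three cases exercised here correspond to the adapter tests named in the decision: a removed row, a missing row, and an empty version.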
2. Hard-delete reference probe runs before adapter Delete
Decision. Service.Delete
calls versions.IsReferencedByActiveRuntime first; on a positive
result it surfaces ErrInUse without ever calling the adapter
Delete. Only when the probe reports zero references does the service
issue the SQL DELETE.
Why. Two alternatives were rejected:
- Single transaction with SELECT ... FOR UPDATE plus DELETE: requires
  the adapter to expose a transactional sub-interface and forces the
  service into store-internal locking semantics. The plan is
  single-instance (README §Non-Goals), so the small race window
  between probe and delete is acceptable and self-correcting (a
  late-arriving register-runtime against a deprecated version would
  fail at runtime_records insert anyway because the version row is
  gone, so the eventual outcome is the same).
- Probe-after-delete: leaks the DELETE on transient probe failures
  and surfaces a misleading "deleted" outcome to the caller.
Surfacing engine_version_in_use before any mutation matches the
README §Error Model wording and the OpenAPI EngineVersionInUseError
example.
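The probe-then-delete ordering can be sketched as below. This is an illustration under assumptions: the `versionStore` interface is a hypothetical slice of ports.EngineVersionStore, the probe is assumed to return a bool (the real signature may differ), and `fakeStore` and `tryDelete` exist only so the sketch runs on its own.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ErrInUse stands in for the service's engine_version_in_use error.
var ErrInUse = errors.New("engine_version_in_use")

// versionStore is an assumed slice of ports.EngineVersionStore.
type versionStore interface {
	IsReferencedByActiveRuntime(ctx context.Context, version string) (bool, error)
	Delete(ctx context.Context, version string) error
}

// Service is a hypothetical stand-in for the engineversion service.
type Service struct{ versions versionStore }

// Delete probes for references first and only calls the adapter
// Delete when the probe reports none.
func (s *Service) Delete(ctx context.Context, version string) error {
	referenced, err := s.versions.IsReferencedByActiveRuntime(ctx, version)
	if err != nil {
		return fmt.Errorf("probe references: %w", err)
	}
	if referenced {
		return ErrInUse // no mutation has happened yet
	}
	return s.versions.Delete(ctx, version)
}

// fakeStore counts adapter Delete calls for the demo.
type fakeStore struct {
	referenced  bool
	deleteCalls int
}

func (f *fakeStore) IsReferencedByActiveRuntime(context.Context, string) (bool, error) {
	return f.referenced, nil
}

func (f *fakeStore) Delete(context.Context, string) error {
	f.deleteCalls++
	return nil
}

// tryDelete returns how many times the adapter Delete ran, plus the
// service error.
func tryDelete(referenced bool) (int, error) {
	store := &fakeStore{referenced: referenced}
	err := (&Service{versions: store}).Delete(context.Background(), "v1.2.3")
	return store.deleteCalls, err
}

func main() {
	calls, err := tryDelete(true)
	fmt.Println("referenced:", calls, err)
	calls, err = tryDelete(false)
	fmt.Println("unreferenced:", calls, err)
}
```

When the version is referenced, the adapter Delete call count stays at zero, which is the property this decision relies on.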
3. engine_version_delete op kind added to schema and domain
Decision. A new audit value engine_version_delete is added to:
- domain/operation.OpKind (constant, IsKnown, AllOpKinds);
- migrations/00001_init.sql (the operation_log_op_kind_chk CHECK
  constraint);
- README §Persistence Layout (the op_kind enum listing in the
  operation_log description).
The pre-launch single-init policy from
../../ARCHITECTURE.md §Persistence Backends
allows editing 00001_init.sql until first production deploy.
Why. Two alternatives were rejected:
- Reuse engine_version_deprecate for hard delete: semantically weak;
  audit consumers would have to inspect outcome plus an out-of-band
  column to tell soft from hard, defeating the audit's signal value.
- Skip audit for hard delete: inconsistent with every other
  service-layer mutation (every Stage 13/14 mutation writes
  operation_log). Forensics on a destructive admin action are exactly
  where audit matters most.
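The domain side of this change can be sketched as below. The sketch lists only the engine-version op kinds named in this record; the real enum carries others. Whether IsKnown is a method or a free function in domain/operation is an assumption here.

```go
package main

import "fmt"

// OpKind mirrors the operation_log.op_kind column values.
type OpKind string

const (
	OpKindEngineVersionCreate    OpKind = "engine_version_create"
	OpKindEngineVersionUpdate    OpKind = "engine_version_update"
	OpKindEngineVersionDeprecate OpKind = "engine_version_deprecate"
	// OpKindEngineVersionDelete is the value added at Stage 14.
	OpKindEngineVersionDelete OpKind = "engine_version_delete"
)

// AllOpKinds returns the known kinds (engine-version subset only in
// this sketch).
func AllOpKinds() []OpKind {
	return []OpKind{
		OpKindEngineVersionCreate,
		OpKindEngineVersionUpdate,
		OpKindEngineVersionDeprecate,
		OpKindEngineVersionDelete,
	}
}

// IsKnown reports whether k is one of the kinds in AllOpKinds.
func (k OpKind) IsKnown() bool {
	for _, known := range AllOpKinds() {
		if k == known {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(OpKindEngineVersionDelete.IsKnown())
	fmt.Println(OpKind("engine_version_purge").IsKnown())
}
```

The migration's CHECK constraint has to list the same string values; if the two drift apart, an entry the domain accepts would be rejected at INSERT time.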
4. operation_log.game_id column doubles as audit subject
Decision. Engine-version CRUD audit entries store the canonical
version string in the OperationEntry.GameID field (and therefore
in the operation_log.game_id column). For OpKindEngineVersionCreate
the canonical post-ParseSemver form is used (v1.2.3); for
OpKindEngineVersionUpdate / Deprecate / Delete the user-supplied
version is used so failed lookups still record the attempt verbatim.
Why. Three alternatives were considered and rejected:
- Make game_id nullable and add a subject_id column: requires a
  migration delta + jet regeneration + a domain field rename. Out of
  scope for Stage 14 and inconsistent with the minimal-diff principle.
- Use a sentinel engine_version:<v> prefix: harder to query alongside
  per-game audit reads; the index operation_log (game_id, started_at
  DESC) already covers subject-scoped reads, and a sentinel prefix
  would force callers to strip it.
- Skip audit for engine-version CRUD: README §Persistence Layout
  explicitly lists engine_version_create | engine_version_update |
  engine_version_deprecate as op_kind values; the audit table is the
  canonical surface.
The decision is recorded both here and in the README §Persistence Layout note so future readers can find the overload rationale.
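The subject-selection rule from this decision can be sketched as a small helper. The function name `auditSubject` and its parameters are hypothetical, and the caller is assumed to pass the post-ParseSemver form in `canonical`. The sample inputs are illustrative only and do not claim anything about what ParseSemver accepts.

```go
package main

import "fmt"

// OpKind mirrors the operation_log.op_kind column values.
type OpKind string

const (
	OpKindEngineVersionCreate    OpKind = "engine_version_create"
	OpKindEngineVersionUpdate    OpKind = "engine_version_update"
	OpKindEngineVersionDeprecate OpKind = "engine_version_deprecate"
	OpKindEngineVersionDelete    OpKind = "engine_version_delete"
)

// auditSubject picks the string stored in OperationEntry.GameID.
// Create records the canonical form; every other kind records the
// user-supplied version verbatim so failed lookups stay traceable.
func auditSubject(kind OpKind, supplied, canonical string) string {
	if kind == OpKindEngineVersionCreate {
		return canonical
	}
	return supplied
}

func main() {
	fmt.Println(auditSubject(OpKindEngineVersionCreate, "1.2.3", "v1.2.3"))
	fmt.Println(auditSubject(OpKindEngineVersionDelete, "not-a-version", ""))
}
```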
5. JSON-object validation for Options
Decision. Service.Create
and Service.Update validate the Options byte slice as a JSON
object before persisting (raw bytes are decoded into
map[string]any; non-objects, including arrays and scalars, are
rejected with invalid_request). Empty/whitespace-only input passes
through as nil; the adapter (Stage 11 D5) already substitutes the
schema default '{}'::jsonb.
Why. The engine_versions.options column is jsonb. Malformed JSON
would be rejected by the PostgreSQL parser at INSERT time (surfacing
as a generic 500), while a well-formed array or scalar would be
accepted and break engine-side consumers that expect an object. The
service-layer validation surfaces a clear invalid_request early and
keeps the contract honest. README §Engine Version Registry already
describes options as a "free-form jsonb document" (object
implied); the validation makes that wording load-bearing.
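The validation described in Decision 5 can be sketched as below. The function name `normalizeOptions` and the `ErrInvalidRequest` sentinel are hypothetical. Rejecting a JSON `null` is an assumption drawn from "non-objects are rejected"; it needs an explicit check because `json.Unmarshal` accepts `null` into a map without error.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// ErrInvalidRequest stands in for the service's invalid_request error.
var ErrInvalidRequest = errors.New("invalid_request")

// normalizeOptions returns nil for empty or whitespace-only input,
// the raw bytes for a JSON object, and an error for anything else.
func normalizeOptions(raw []byte) ([]byte, error) {
	if len(bytes.TrimSpace(raw)) == 0 {
		return nil, nil // the adapter substitutes the schema default
	}
	var obj map[string]any
	if err := json.Unmarshal(raw, &obj); err != nil {
		// Covers malformed JSON, arrays, and scalars.
		return nil, fmt.Errorf("%w: options must be a JSON object: %v",
			ErrInvalidRequest, err)
	}
	if obj == nil {
		// JSON null decodes without error but leaves the map nil
		// (assumption: null counts as a non-object).
		return nil, fmt.Errorf("%w: options must be a JSON object, got null",
			ErrInvalidRequest)
	}
	return raw, nil
}

func main() {
	for _, in := range []string{`{"seed": 7}`, `  `, `[1, 2]`, `42`, `null`, `{"a":`} {
		out, err := normalizeOptions([]byte(in))
		fmt.Printf("%q -> %q, err=%v\n", in, out, err)
	}
}
```

Decoding into `map[string]any` keeps the check to one call; the decoded map is discarded and the original bytes are what get persisted.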
Files landed
- ../internal/ports/engineversionstore.go: added Delete to the
  interface and the comment block.
- ../internal/adapters/postgres/engineversionstore/store.go:
  implemented Delete.
- ../internal/adapters/postgres/engineversionstore/store_test.go:
  added TestDeleteHappy, TestDeleteNotFound,
  TestDeleteRejectsEmptyVersion.
- ../internal/adapters/mocks/mock_engineversionstore.go: regenerated.
- ../internal/adapters/postgres/migrations/00001_init.sql: added
  engine_version_delete to operation_log_op_kind_chk.
- ../internal/domain/operation/log.go with log_test.go: added
  OpKindEngineVersionDelete plus IsKnown / AllOpKinds membership.
- ../internal/service/engineversion/service.go with errors.go and
  service_test.go: new orchestrator package and tests.
- ../internal/service/registerruntime/service_test.go:
  fakeEngineVersions gains a stub Delete to satisfy the extended port.
- ../README.md: §References pointer to this record; §Persistence
  Layout note that engine-version CRUD audit entries store version in
  the game_id column and that engine_version_delete joins the op_kind
  enum.
- ../PLAN.md: Stage 14 marked done.
Verification
cd gamemaster
# Mocks regenerate cleanly with no diff after the port extension is
# committed alongside this stage.
make mocks
git diff --exit-code internal/adapters/mocks
# Domain + port tests still pass (operation log enum membership).
go test ./internal/domain/... ./internal/ports/...
# Adapter test for the new Delete method and the migration's CHECK
# constraint.
go test ./internal/adapters/postgres/engineversionstore/...
go test ./internal/adapters/postgres/operationlog/...
# Service-level tests for the new orchestrator.
go test ./internal/service/engineversion/...
# Stage 13 service tests still pass (the fake gains a stub Delete).
go test ./internal/service/registerruntime/...
# Repo build succeeds at the workspace root.
go build ./...