galaxy-game/gamemaster/docs/stage11-persistence-adapters.md
2026-05-03 07:59:03 +02:00



Stage 11 — Persistence adapters

This decision record captures the non-obvious choices made while implementing the four PostgreSQL stores and the Redis offset store of Game Master at PLAN Stage 11.

Context

../PLAN.md Stage 11 ships the persistence layer that the service-layer stages (13-17) and the worker stage (18) consume. Stage 09 already shipped the schema, embedded migration, and the generated jet code; Stage 10 fixed the domain types and the port interfaces. Stage 11 plugs concrete adapters into those ports.

The reference precedent is rtmanager, the most recently landed PG-backed service. Its internal/adapters/postgres/ and internal/adapters/redisstate/ trees define the shape every Stage 11 file follows: one package per store at postgres/<store>/store.go, helper packages under internal/sqlx and internal/pgtest, a Config/Store/New triple, ColumnList-driven canonical SELECTs, and the shared boundary helpers sqlx.WithTimeout, sqlx.IsNoRows, and sqlx.IsUniqueViolation.

Eight decisions either deviate from a literal copy of rtmanager or extend the literal task list of PLAN Stage 11. Each is recorded below.

Decisions

1. internal/sqlx and internal/pgtest are local clones, not a shared module

Decision. internal/adapters/postgres/internal/sqlx/sqlx.go and internal/adapters/postgres/internal/pgtest/pgtest.go are full copies of rtmanager's sibling files. Only the few constants that name the schema and role differ, and they carry Game Master's values (gamemaster, gamemasterservice, galaxy_gamemaster).

Why. Each PG-backed service owns its own role, schema, and migration FS. Promoting these helpers into pkg/postgres would force that package to either know about every schema or take them as configuration; either path adds surface area for a runtime helper that already covers exactly one boundary. The rtmanager precedent settled on the per-service clone first and Game Master mirrors it for the same architectural reason. The duplication cost is small (≈250 lines total, mechanical) and the alternative would couple services through a testing concern that has no business in production code.

2. CAS via (game_id, status) predicate, not SELECT … FOR UPDATE

Decision. runtimerecordstore.UpdateStatus encodes the compare-and-swap as a WHERE game_id = $1 AND status = $2 predicate on a single UPDATE, then probes the row's existence on RowsAffected == 0 to distinguish runtime.ErrConflict (status changed concurrently) from runtime.ErrNotFound (row absent).

Why. Same reasoning as rtmanager/docs/postgres-migration.md §CAS: holding a SELECT … FOR UPDATE lock would block every other tick on the same game while the Go code computed the next status, lengthening the locked region for no correctness gain. The CAS-only path is verified by TestUpdateStatusConcurrentCAS (8 goroutines, exactly one winner).

3. Port-level deviation: UpdateEngineVersionInput.Now and Deprecate(ctx, version, now)

Decision. ports/engineversionstore.go gains a Now time.Time field on UpdateEngineVersionInput (validated by Validate to be non-zero) and a now time.Time argument on Deprecate. The corresponding port-level test fixtures in engineversionstore_test.go are updated to carry the new value.

Why. Stage 10's literal port did not include a wall-clock for the engine-version mutators, while UpdateStatusInput and UpdateSchedulingInput do. Without Now in the input, the adapter would have to either call time.Now() directly (loses test determinism) or accept a Clock dependency in Config (adds adapter infrastructure for a single use case). Aligning the inputs is a small, targeted contract change allowed by the pre-launch single-init policy and consistent with the clock-from-input convention adopted everywhere else in the service.

4. Domain-level conflict sentinels engineversion.ErrConflict and playermapping.ErrConflict

Decision. The domain packages engineversion and playermapping gain ErrConflict sentinels. Adapters surface PostgreSQL unique violations as fmt.Errorf("...: %w", <pkg>.ErrConflict) so service callers can branch with errors.Is.

Why. runtime.ErrConflict already exists in the runtime package and the rest of the codebase (lobby, rtmanager, notification) uses domain-level conflict sentinels (e.g. membership.ErrConflict, runtime.ErrConflict). Returning a generic wrapped error for engine-version and player-mapping conflicts would break the established pattern and force the service layer to carry adapter implementation knowledge (sqlx.IsUniqueViolation). Adding two sentinels is a small, idiomatic deviation from PLAN Stage 11's bullet list, called out here so future contract diffs do not re-litigate it.

5. Options jsonb requires explicit CAST(... AS jsonb) in dynamic UPDATE

Decision. In engineversionstore.Update the dynamic assignment for options wraps the value in pg.StringExp(pg.CAST(pg.String(...)).AS("jsonb")). The plain pg.String(...) literal makes PostgreSQL infer the right-hand side as text and the assignment to a jsonb column then fails with SQLSTATE 42804 (column is of type jsonb but expression is of type text).

Why. INSERT ... VALUES(...) paths bind the []byte through pgx, which knows how to coerce text into jsonb at the protocol level. Dynamic UPDATE … SET options = '...' does not go through that bind because the SQL contains a string literal directly; PostgreSQL applies its own type inference and fails. Using jet's CAST is the cleanest way to force the right-hand-side type without dropping to raw SQL. Storing '{}'::jsonb as the empty default mirrors the SQL column default.
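For comparison, the same intent can be expressed in hand-written SQL. The sketch below is not the jet-based adapter code: it assembles a dynamic UPDATE as a plain string and pins the right-hand side of the options assignment with an explicit CAST, so the column type never depends on inference. The table name, the image_ref column, and the function name are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// buildUpdate assembles a dynamic UPDATE for the fields that are set.
// Table and column names are assumptions made for this example.
func buildUpdate(version string, imageRef *string, options []byte) (string, []any, error) {
	var sets []string
	var args []any
	if imageRef != nil {
		args = append(args, *imageRef)
		sets = append(sets, fmt.Sprintf("image_ref = $%d", len(args)))
	}
	if options != nil {
		// The value is passed as text; the explicit cast forces jsonb.
		args = append(args, string(options))
		sets = append(sets, fmt.Sprintf("options = CAST($%d AS jsonb)", len(args)))
	}
	if len(sets) == 0 {
		return "", nil, errors.New("update engine version: nothing to update")
	}
	args = append(args, version)
	query := fmt.Sprintf("UPDATE engine_versions SET %s WHERE version = $%d",
		strings.Join(sets, ", "), len(args))
	return query, args, nil
}

func main() {
	query, args, err := buildUpdate("1.2.0", nil, []byte(`{}`))
	fmt.Println(query, len(args), err)
}
```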

6. Deprecate is idempotent through a pre-check Get

Decision. engineversionstore.Deprecate runs Get(version) first to distinguish three cases: row absent (return engineversion.ErrNotFound), row already deprecated (return nil with no further mutation), row active (run the UPDATE ... SET status='deprecated'). Without the pre-check the adapter would have to interpret RowsAffected == 0 against an ambiguous SQL guard (WHERE version = ? AND status != 'deprecated').

Why. Deprecation is a relatively rare admin operation; the extra read costs roughly one millisecond and removes the ambiguity. The alternative is the classifyMissingUpdate probe pattern used by UpdateStatus, but that probe would still need a Get to tell "missing" from "already deprecated". The pre-check is the simplest path.

7. BulkInsert ships every row in one multi-row INSERT, not a transaction

Decision. playermappingstore.BulkInsert emits a single INSERT ... VALUES (a), (b), … with as many tuples as the input slice. Any unique-violation rolls back every row in the same statement.

Why. The atomicity guarantee Game Master needs (no partial roster) is already provided by PostgreSQL's per-statement implicit transaction; wrapping the same rows in BEGIN; INSERT; INSERT; COMMIT buys nothing and adds round-trips. The multi-row form is also the only path that lets jet's InsertStatement.VALUES(...) chain without escape hatches. Atomicity is verified end-to-end by TestBulkInsertAtomicConflictRaceName (3 valid rows + 1 conflicting → 0 rows persisted).
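The shape of the single multi-row statement can be sketched as a plain string builder. The real store uses jet's InsertStatement; the PlayerMapping fields, the table name, and the column names below are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// PlayerMapping is a simplified, hypothetical row shape.
type PlayerMapping struct {
	GameID   string
	UserID   string
	RaceName string
}

// buildBulkInsert renders one INSERT with one VALUES tuple per row, so
// PostgreSQL either persists every row or none of them.
func buildBulkInsert(rows []PlayerMapping) (string, []any, error) {
	if len(rows) == 0 {
		return "", nil, errors.New("bulk insert: no rows")
	}
	var sb strings.Builder
	sb.WriteString("INSERT INTO player_mappings (game_id, user_id, race_name) VALUES ")
	args := make([]any, 0, len(rows)*3)
	for i, r := range rows {
		if i > 0 {
			sb.WriteString(", ")
		}
		base := i * 3
		fmt.Fprintf(&sb, "($%d, $%d, $%d)", base+1, base+2, base+3)
		args = append(args, r.GameID, r.UserID, r.RaceName)
	}
	return sb.String(), args, nil
}

func main() {
	query, args, err := buildBulkInsert([]PlayerMapping{
		{GameID: "g1", UserID: "u1", RaceName: "Terran"},
		{GameID: "g1", UserID: "u2", RaceName: "Vogon"},
	})
	fmt.Println(query, len(args), err)
}
```

The statement runs in a single round-trip, and a unique violation on any tuple aborts the whole statement, which gives the no-partial-roster guarantee without an explicit transaction.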

8. miniredis/v2 is a direct gamemaster dependency

Decision. go.mod gains github.com/alicebob/miniredis/v2 as a direct dependency. The streamoffsets test suite uses miniredis.RunT(t) per test for full isolation.

Why. Same reasoning as rtmanager: an in-memory Redis is faster than a testcontainers Redis, fully isolated per test, and fits the shape of the offset-store API. Adding it as a direct dependency matches the pattern in the repo (rtmanager, notification, and lobby all do this for similar adapter test suites).

Files landed

Verification

cd gamemaster

# Domain + port unit tests still pass after the Stage-11 contract
# touch-ups.
go test ./internal/domain/... ./internal/ports/...

# All adapter test suites (require Docker for testcontainers; without
# Docker, the pgtest helpers call t.Skip).
go test ./internal/adapters/postgres/...
go test ./internal/adapters/redisstate/...

# CAS race coverage with -race; the test must observe exactly one
# winner per run.
go test -count=3 -race -run TestUpdateStatusConcurrentCAS \
    ./internal/adapters/postgres/runtimerecordstore

# Stage 06/07 contract freeze tests stay green:
go test ./... -run Contract
go test ./... -run NotificationIntent

The full repo-level go build ./... from the workspace root also succeeds; service-layer stages (13+) and the mocks regeneration (stage 12) are unaffected by Stage 11's adapter additions.