# Service-Local Integration Suite

This document explains the design of the service-local integration suite under `../integration/`. The current-state behaviour (harness layout, env knobs, scenario coverage) lives next to the files themselves; this document records the rationale.

The cross-service Lobby↔RTM suite at `../../integration/lobbyrtm/` follows different rules (it lives in the top-level `galaxy/integration` module) and is documented inside that package.

## 1. The `integration` build tag

The scenarios under `../integration/*_test.go` are guarded by `//go:build integration`. The default `go test ./...` invocation skips them, while `go test -tags=integration ./integration/...` (and the `make integration` target) runs the full set:

```sh
make -C rtmanager integration
```

The harness package itself (`../integration/harness`) has no build tag. It compiles on every run because each helper guards its Docker-dependent paths with `t.Skip` when the daemon is unavailable. This keeps the harness loadable from a tagless `go vet` or IDE workflow without dragging Docker into the default `go test` critical path.
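For reference, every scenario file opens with the guard below. The tag line is the one the suite uses; the package name is illustrative:

```go
//go:build integration

// Compiled only under -tags=integration, so the default `go test ./...`
// pass never links or runs the Docker-heavy scenarios.
package integration
```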

## 2. Smoke test runs in the default `go test` pass

`../internal/adapters/docker/smoke_test.go` runs in the regular `go test ./...` pass and skips itself via `skipUnlessDockerAvailable` when no Docker socket is present. The smoke test is intentionally kept separate from the new `integration/` suite because it exercises the production adapter shape (one container at a time against `alpine:3.21`), not the full runtime; both surfaces are useful.
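A minimal sketch of what such a guard looks like, assuming the Docker Go SDK; the real helper lives next to the smoke test and may differ in detail:

```go
package docker

import (
	"context"
	"testing"

	"github.com/docker/docker/client"
)

// skipUnlessDockerAvailable pings the local daemon and skips the calling
// test when it is unreachable, keeping the default pass green without Docker.
func skipUnlessDockerAvailable(t *testing.T) {
	t.Helper()
	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		t.Skipf("docker client: %v", err)
	}
	defer cli.Close()
	if _, err := cli.Ping(context.Background()); err != nil {
		t.Skipf("docker daemon unavailable: %v", err)
	}
}
```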

## 3. In-process `app.NewRuntime` instead of a `cmd/rtmanager` subprocess

The harness drives Runtime Manager through `app.NewRuntime(ctx, cfg, logger)` directly rather than spawning the binary from `cmd/rtmanager/main.go`:

- Cleanup is deterministic. A `t.Cleanup` block can `cancel()` the runtime context and call `runtime.Close()`; the goroutine driving `runtime.Run` returns with `context.Canceled` and the helper waits on it via the `runDone` channel (see the sketch after this list). With a subprocess the equivalent dance requires SIGTERM, output capture, and graceful shutdown timing tied to the child's signal handler.
- Goroutine and store visibility. Tests read the durable PG state directly through the harness-owned pool and read every Redis stream through the harness-owned client. Both observe the exact wire shape Lobby will see in the cross-service suite.
- Logger isolation. The harness defaults to `slog.Discard` so the default test output stays focused on assertions; flipping `EnvOptions.LogToStderr` lights up the runtime's structured logs for local debugging without requiring any subprocess plumbing.
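A sketch of that start/cleanup shape. The `app` import path, `app.Config`, `app.Runtime`, and the `Run` signature are assumptions here; only `NewRuntime`, `runDone`, and the `context.Canceled` handling come from the design above:

```go
package harness

import (
	"context"
	"errors"
	"log/slog"
	"testing"

	"galaxy-game/rtmanager/internal/app" // hypothetical import path
)

// startRuntime shows the deterministic-cleanup dance: cancel the context,
// close the runtime, then wait for the Run goroutine via runDone.
func startRuntime(t *testing.T, cfg app.Config, logger *slog.Logger) *app.Runtime {
	t.Helper()
	ctx, cancel := context.WithCancel(context.Background())
	rt, err := app.NewRuntime(ctx, cfg, logger)
	if err != nil {
		t.Fatalf("app.NewRuntime: %v", err)
	}
	runDone := make(chan error, 1)
	go func() { runDone <- rt.Run(ctx) }() // Run signature assumed
	t.Cleanup(func() {
		cancel()
		rt.Close()
		if err := <-runDone; err != nil && !errors.Is(err, context.Canceled) {
			t.Errorf("runtime.Run returned unexpectedly: %v", err)
		}
	})
	return rt
}
```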

The cross-service inter-process suite at `integration/lobbyrtm/` reuses the existing `integration/internal/harness` binary-spawn helpers; the in-process choice here is specific to the service-local scope.

## 4. `httptest.Server` stub for the Lobby internal client

Runtime Manager configuration requires a non-empty `RTMANAGER_LOBBY_INTERNAL_BASE_URL`, and the start service makes a diagnostic `GET /api/v1/internal/games/{game_id}` call that v1 treats as a no-op (the start envelope already carries the only required field, `image_ref`; rationale in `services.md` §7). The harness therefore stands up a tiny `httptest.Server` per test that returns a stable `200 OK` response. The stub is intentionally unconfigurable: every integration scenario produces the same ancillary fetch, and adding routing/error injection would invite test code to depend on a contract the start service deliberately ignores.
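The whole stub fits in a few lines. A sketch, with the helper name and the empty JSON body assumed; the route and the fixed 200 are the design points above:

```go
package harness

import (
	"net/http"
	"net/http/httptest"
	"testing"
)

// newLobbyStub answers every request, including the diagnostic
// GET /api/v1/internal/games/{game_id}, with a stable 200 OK.
func newLobbyStub(t *testing.T) *httptest.Server {
	t.Helper()
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusOK)
		_, _ = w.Write([]byte(`{}`))
	}))
	t.Cleanup(srv.Close)
	return srv // srv.URL feeds RTMANAGER_LOBBY_INTERNAL_BASE_URL
}
```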

## 5. One built engine image, two semver-compatible tags

The patch lifecycle expects the new and current image refs to share the same major/minor version (`semver_patch_only` failure otherwise). Building two distinct images would multiply the per-run build cost without changing what the test verifies: the patch path exercises `image_ref_not_semver` and `semver_patch_only` validation plus the recreate-with-new-tag flow, none of which depend on distinct image content. The harness builds the engine once and calls `client.ImageTag` to alias it as both `galaxy/game:1.0.0-rtm-it` and `galaxy/game:1.0.1-rtm-it`. Both share the same digest.
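The aliasing is one SDK call per tag. A sketch assuming the Docker Go SDK client and an already-built source ref; the wrapper function is illustrative:

```go
package harness

import (
	"context"
	"fmt"

	"github.com/docker/docker/client"
)

// tagEngineImage aliases the freshly built image under both integration
// tags; the ref strings differ but point at the same digest.
func tagEngineImage(ctx context.Context, cli *client.Client, builtRef string) error {
	for _, target := range []string{"galaxy/game:1.0.0-rtm-it", "galaxy/game:1.0.1-rtm-it"} {
		if err := cli.ImageTag(ctx, builtRef, target); err != nil {
			return fmt.Errorf("tag %s: %w", target, err)
		}
	}
	return nil
}
```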

The integration tags use the `*-rtm-it` suffix (rather than plain `galaxy/game:1.0.0`) so an operator running the suite locally cannot accidentally consume a hand-built dev image, and so a `docker image rm` of integration leftovers does not nuke a production-shaped tag.

## 6. Per-test Docker network and per-test state root

`EnsureNetwork(t)` creates a uniquely-named bridge network per test and registers cleanup; `t.ArtifactDir()` provides the per-game state root. Both ensure that two scenarios running back-to-back cannot collide on the per-game DNS hostname (`galaxy-game-{game_id}`) or on filesystem state. Game ids are themselves unique per test (`harness.IDFromTestName` adds a nanosecond suffix); combined with the per-test network and state root, the suite is safe to run with `-count` greater than one.
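An illustrative shape for `harness.IDFromTestName` (the real slug rules may differ; the nanosecond suffix is the point):

```go
package harness

import (
	"fmt"
	"strings"
	"testing"
	"time"
)

// IDFromTestName derives a game id from the test name; the nanosecond
// suffix keeps repeated runs (-count > 1) from colliding.
func IDFromTestName(t *testing.T) string {
	slug := strings.ToLower(strings.NewReplacer("/", "-", " ", "-").Replace(t.Name()))
	return fmt.Sprintf("%s-%d", slug, time.Now().UnixNano())
}
```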

`t.ArtifactDir()` keeps the engine state directory around when a test fails (Go ≥ 1.25), so an operator can `cd` into it after a CI failure and inspect what the engine wrote. On success the directory is automatically cleaned up.

## 7. PostgreSQL and Redis containers shared per package

Both fixtures use `sync.Once` to start one testcontainer per test package, mirroring the `../internal/adapters/postgres/internal/pgtest` pattern. `TruncatePostgres` and `FlushRedis` reset state between tests so each scenario starts on an empty stack. The trade-off versus per-test containers is the standard one: container startup dominates the per-package latency, so amortising it across the suite keeps the loop tight while the truncate/flush ensures isolation. The ~12 s difference matters in CI.
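A sketch of the once-per-package fixture shape, assuming testcontainers-go's generic API; the helper name, image tag, and credentials are illustrative, and the real fixture also wires migrations and cleanup:

```go
package harness

import (
	"context"
	"fmt"
	"sync"
	"testing"

	"github.com/testcontainers/testcontainers-go"
	"github.com/testcontainers/testcontainers-go/wait"
)

var (
	pgOnce sync.Once
	pgDSN  string
	pgErr  error
)

// PostgresDSN starts one Postgres container for the whole test package;
// later callers reuse it and rely on TruncatePostgres for isolation.
func PostgresDSN(t *testing.T) string {
	t.Helper()
	pgOnce.Do(func() {
		ctx := context.Background()
		ctr, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
			ContainerRequest: testcontainers.ContainerRequest{
				Image:        "postgres:16-alpine",
				Env:          map[string]string{"POSTGRES_PASSWORD": "it", "POSTGRES_DB": "rtm"},
				ExposedPorts: []string{"5432/tcp"},
				WaitingFor:   wait.ForListeningPort("5432/tcp"),
			},
			Started: true,
		})
		if err != nil {
			pgErr = err
			return
		}
		host, _ := ctr.Host(ctx)             // errors elided for brevity
		port, _ := ctr.MappedPort(ctx, "5432")
		pgDSN = fmt.Sprintf("postgres://postgres:it@%s:%s/rtm", host, port.Port())
	})
	if pgErr != nil {
		t.Skipf("postgres container: %v", pgErr)
	}
	return pgDSN
}
```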

## 8. Engine image cache is intentionally retained between runs

`buildAndTagEngineImage` runs once per package via `sync.Once` and leaves both image tags in the local Docker cache after the suite exits. The cache is a substantial speed-up on a developer laptop (a cold `docker build` of `galaxy/game` takes 30+ seconds, a hot one is sub-second), and a stale image is unlikely because the tags carry the `*-rtm-it` suffix and the underlying Dockerfile is forward-compatible with multiple test runs. Operators who suspect a stale image can `docker image rm galaxy/game:1.0.0-rtm-it galaxy/game:1.0.1-rtm-it`; the next run rebuilds.

## 9. Scenario coverage

The suite covers the four end-to-end flows operators care about:

- lifecycle (`lifecycle_test.go`): start → inspect → stop → restart → patch → stop → cleanup. The intermediate stop between patch and cleanup is intentional: the cleanup endpoint refuses to remove a running container per `../README.md` §Cleanup.
- replay (`replay_test.go`): duplicate start/stop entries surface as `replay_no_op` per `workers.md` §11.
- health (`health_test.go`): an external `docker rm` produces `container_disappeared`; a manual `docker run` is adopted by the reconciler.
- notification (`notification_test.go`): an unresolvable `image_ref` produces `runtime.image_pull_failed` plus a failure `job_result`.

## 10. Service-local scope only

This suite runs Runtime Manager against a real Docker daemon plus testcontainers-managed PG/Redis but does not include any other Galaxy service. Cross-service flows (Lobby ↔ RTM, RTM ↔ Notification) live in the top-level `galaxy/integration/` module, where the harness spawns multiple service binaries and uses real (not stubbed) cross-service streams.