# Testing

Test strategy and runbook for the [Galaxy Game](ARCHITECTURE.md) platform. The platform ships three executables, `gateway`, `backend`, and `game` (the engine container), plus the shared `pkg/*` libraries. This document defines the layering of tests, the mandatory minimum coverage per executable, the integration runbook, and the principles every test must follow.

## Layers

1. **Service tests** verify a single executable in isolation. They live next to the implementation as `*_test.go` files and use only in-process or testcontainers-managed dependencies. A package either runs entirely in process or boots a single Postgres testcontainer per test.
2. **Inter-service integration tests** verify one cross-process seam between two real executables (most often `gateway ↔ backend`, sometimes `backend ↔ game`). They live in [`galaxy/integration/`](../integration/) and drive the platform from outside the trust boundary.
3. **Full system tests** are a small, focused subset of the integration suite that walks an entire user-facing flow from the client edge through every component the flow touches. They live in the same `integration/` module and reuse the same fixtures.

Service tests are the cheapest and the broadest; integration tests are slower and narrower; full-system tests are the slowest and the narrowest. The pyramid stays in this order: never replace a service test with a system test.

## Global rules

- Every executable owns the service tests for its packages. Adding a new package without `_test.go` files is a review blocker.
- Every cross-process seam must have at least one passing inter-service test before the seam is wired in production.
- Async flows (mail outbox, notification routes, runtime workers, push gRPC) get tests for both the success path and the retry / dead-letter path, and a duplicate-event safety check.
- Sync flows get happy path, validation failure, timeout propagation, and dependency unavailable.
- Every external or trusted-internal API must have contract tests alongside behaviour tests. `backend/internal/server/contract_test.go` is the reference; gateway runs the same shape against `gateway/openapi.yaml`.
- The integration suite must keep running on a developer machine with Docker available. The only acceptable `t.Skip` is `testenv.RequireDocker` (no daemon at all). Any failure deeper than that (`tcpostgres.Run`, network create, image build, schema migration) fails the test loudly with `t.Fatal`. The historical bug we fixed (silent skips on reaper failures masking 27 integration tests as "ok") came from treating an environment break as a skip.

## Service-specific coverage

### `galaxy/gateway`

Service tests live under `gateway/internal/`:

- Public REST routing, error projection, and OpenAPI contract validation.
- Authenticated gRPC envelope verification (`grpcapi.Server`): signature, payload hash, freshness window, anti-replay reservation, unknown / revoked sessions.
- Session cache (`session.BackendCache`): the only implementation in the codebase, a thin wrapper around the `backendclient.RESTClient` per-request lookup.
- Response signing for unary responses and stream events (`authn.ResponseSigner`).
- Push hub (`push.Hub`) and push fan-out (`push_fanout.go`).
- Replay store (`replay.RedisStore`) reservation semantics.
- Anti-abuse rate limits per IP / session / user / message class.

### `galaxy/backend`

Service tests live under `backend/internal/`:

- Startup wiring: `app.App` lifecycle, telemetry runtime, Postgres pool, embedded migrations.
- OpenAPI contract test (`internal/server/contract_test.go`): validates every documented operation against the live gin engine.
- Domain unit + e2e tests per package (`auth`, `user`, `admin`, `lobby`, `runtime`, `mail`, `notification`, `geo`, `push`). E2E tests (`*_e2e_test.go`) spin up a Postgres testcontainer.
- Mail outbox: pickup with `SELECT FOR UPDATE SKIP LOCKED`, retry with backoff plus jitter, dead-letter past `MAX_ATTEMPTS`, resend semantics (`pending|retrying|dead_lettered` → re-armed, `sent` → 409).
- Notification: idempotent `Submit`, route materialisation, push + email fan-out, `OnUserDeleted` cascade. Coverage of every catalog kind in `buildClientPushEvent` lives in `internal/notification/events_test.go`.
- Lobby: state-machine transitions, RND canonicalisation, sweeper.
- Runtime: per-game mutex serialisation, worker pool, scheduler, reconciler, force-next-turn skip flag.
- Admin: bcrypt cost 12, idempotent bootstrap, write-through cache, 409 Conflict on duplicate username, last-used timestamp.
- Geo: counter increment on every authenticated request, declared-country write at registration, fail-open semantics.

### `galaxy/game`

The engine has its own service tests under `game/`:

- OpenAPI contract test (`game/openapi_contract_test.go`).
- Engine lifecycle (init, status, turn, banish, command, order, report) implemented by the engine package suites.

## Integration runbook

### Entry points

```bash
make -C integration preclean          # idempotent leftover cleanup
make -C integration integration       # preclean + serial test run
make -C integration integration-step  # preclean + one-test-at-a-time
```

`integration` runs every test in the module sequentially (`-p=1 -parallel=1`), the recommended default on a slow / shared Docker. `integration-step` runs them one at a time with a fresh preclean before each test and stops on the first failure; useful to isolate a flake or build up to a full pass without losing context to subsequent tests.

### Why preclean matters

`preclean` keys off labels and removes:

- Containers labelled `org.testcontainers=true` (every container the testcontainers-go library brings up: backend, gateway, game, postgres, redis, mailpit, ryuk).
- Containers labelled `galaxy.backend=1`: engine instances spawned by backend's runtime adapter directly on the host Docker daemon (see `backend/internal/dockerclient/types.go`).
- Networks labelled `org.testcontainers=true`.
- Locally-built images labelled `galaxy.test.kind=integration-image`: the `galaxy/{backend,gateway,game}:integration` builds produced by `integration/testenv/images.go`.

Pulled service images (`postgres:16-alpine`, `redis:7-alpine`, `axllent/mailpit`, `testcontainers/ryuk`) are **not** touched, so the cache stays warm.

### Ryuk reaper

The integration runners disable the testcontainers Ryuk reaper:

```makefile
export TESTCONTAINERS_RYUK_DISABLED = true
```

This is environment-driven, not principled: Ryuk does not start cleanly on the local colima setup we use, and `preclean` covers the same job by labels. Re-enable Ryuk by exporting `TESTCONTAINERS_RYUK_DISABLED=false` (or unset) before invoking the make target if you have an environment where Ryuk works.

### Cold runs

The first run after a clean checkout (or after `preclean`) rebuilds three images: `galaxy/backend:integration`, `galaxy/gateway:integration`, `galaxy/game:integration`. Cold cost is ~30 s per image. Subsequent runs reuse the build cache; `preclean` removes the tagged images themselves but BuildKit cache mounts survive, so re-builds are fast.

## Integration test coverage

Mandatory inter-service coverage in `integration/`:

- **Gateway ↔ Backend (public auth)**, `auth_flow_test.go`: register + confirm with mailpit-captured code; declared_country populated; idempotent re-confirm.
- **Gateway ↔ Backend (authenticated user surface)**, `user_account_test.go`, `user_profile_update_test.go`, `user_settings_update_test.go`: signed envelope, FlatBuffers payload, response signature verification, BCP 47 / IANA validation.
- **Gateway ↔ Backend (anti-replay, signature, freshness)**, `gateway_edge_test.go`: body-too-large, bad signature, payload_hash mismatch, stale timestamp, unknown session, unsupported `protocol_version`.
- **Gateway ↔ Backend (push)**, `notification_flow_test.go`, `session_revoke_test.go`: push delivery to a SubscribeEvents stream and immediate stream close on revoke.
- **Gateway ↔ Backend (anti-replay)**, `anti_replay_test.go`: duplicate `request_id` rejected.
- **Backend ↔ Postgres** is exercised by every backend e2e test through testcontainers; integration tests do not duplicate it.
- **Backend ↔ SMTP**, `mail_flow_test.go`: login-code email captured by mailpit; admin list reaches `sent`; resend on `sent` returns 409.
- **Backend ↔ Game engine**, `runtime_lifecycle_test.go`, `engine_command_proxy_test.go`: start container, healthz green, command, force-next-turn, finish, race name promotion.
- **Admin surface (REST)**, `admin_flow_test.go`, `admin_global_games_view_test.go`, `admin_engine_versions_test.go`, `admin_user_sanction_test.go`: bootstrap + CRUD; visibility split between user and admin queries; engine-version registry CRUD; permanent block cascade.
- **Lobby flow without engine**, `lobby_flow_test.go`: owner-creates-private-game → open-enrollment → invite → redeem → memberships listing.
- **Soft delete cascade**, `soft_delete_test.go`: `POST /api/v1/user/account/delete` cascades through auth/lobby/notification/geo, gateway rejects subsequent calls.
- **Geo counters**, `geo_counter_increments_test.go`: multiple authenticated requests with different `X-Forwarded-For` values increment the user's per-country counter rows.

Full-system flows beyond the inter-service set are intentionally limited; pick scenarios that exercise the longest vertical slice the platform supports today.

## Principles

### Service tests

- **Postgres testcontainers must pin no-op observability providers.** Tests that call `pgshared.OpenPrimary(ctx, cfg)` from `galaxy/postgres` pass `backendpg.NoObservabilityOptions()...` so `otelsql` cannot fall through to the global tracer/meter providers. Without this, an unset OTEL endpoint in the developer environment can stall the test on a background exporter handshake. See `backend/internal/postgres/testopts.go` for the helper and `backend/internal/{auth,user,admin,lobby,mail,notification,runtime,geo,postgres}/` test files for the established call sites.
- **A bootstrap failure is fatal, not a skip.** A test that needs a testcontainer must fail loudly when the container fails to come up. `t.Skipf` is reserved for `testenv.RequireDocker` (no daemon at all); anything past that (`tcpostgres.Run`, `db.Ping`, schema migration) uses `t.Fatalf`.

### Integration tests

- **Bootstrap is per-test.** Each test calls `testenv.Bootstrap(t)` to spin up a dedicated Postgres, Redis, mailpit, backend, and gateway. Cross-test contamination is impossible.
- **Tests do not call `t.Parallel`.** Docker resource pressure makes parallel bootstraps flaky on commodity hardware.
- **Anti-abuse limits are loosened by `testenv/gateway.go`.** The bulk-scenario default lifts every gateway rate-limit class (`public_auth`, identity-bucket per-email, IP/session/user/message-class) to 10 000 req/window with a 1 000 burst. Negative-path edge tests in `gateway_edge_test.go` tighten specific limits per test to observe the protection firing.
- **Image labels are intentional.** `integration/testenv/images.go` stamps every locally-built image with `galaxy.test.kind=integration-image`; `preclean` keys off this label. Do not strip it from new image builds added to the test harness.

## Test file ownership matrix

| Suite                                 | Where           | Boots                                                            | Runs how                          |
|---------------------------------------|-----------------|------------------------------------------------------------------|-----------------------------------|
| `backend/internal//...` unit          | per package     | one Postgres testcontainer per test                              | `go test ./internal//`            |
| `backend/push`                        | `backend/push/` | nothing                                                          | `go test ./push/`                 |
| `gateway/internal//...` unit          | per package     | mostly nothing; few use redis tc                                 | `go test ./internal//`            |
| `pkg/transcoder`, `pkg/postgres` unit | per package     | nothing / one tc per test                                        | `go test ./...` from the package  |
| `integration/`                        | `integration/`  | postgres + redis + mailpit + backend + gateway (+ optional game) | `make -C integration integration` |

## Adding a new test

1. Decide the layer: service, inter-service, or system. A backend change usually lands as service tests plus an integration test for any new cross-process behaviour.
2. Reuse `testenv` fixtures rather than rolling your own container orchestration.
3. Follow the bootstrap-per-test pattern; do not share a global stack across tests.
4. Make the test deterministic: explicit timeouts (no `time.Sleep`), `t.Logf` instead of `fmt.Println`, no `t.Parallel()` in `integration/`.
5. Service test that hits Postgres: copy the `startPostgres(t)` helper from one of the existing packages (e.g. `backend/internal/auth/auth_e2e_test.go`) and pass `backendpg.NoObservabilityOptions()...` to `pgshared.OpenPrimary`.
6. Integration test: add the file under `integration/`, call `testenv.Bootstrap(t)`, and use the typed clients exposed by `testenv` rather than reaching for raw HTTP. New scenarios that need bespoke gateway env should pass `Extra` through `BootstrapOptions` so the loosened defaults stay shared.
7. Any test that brings up its own Docker container (rare; most go through `testenv`) must label the container so `preclean` can find it on the next run.

## Day-to-day execution

- Run `go test .//...` for the service you are touching; this is fast (Postgres testcontainers add ~3–5 s per package that uses them).
- Run `make -C integration integration` before opening a PR that touches a cross-process seam. Cold runs build three Docker images (`galaxy/backend:integration`, `galaxy/gateway:integration`, `galaxy/game:integration`); budget ~3 min for the cold path, ~75 s for the warm path.
- Use `make -C integration integration-step` when a flake or a real regression needs a per-test isolation pass.
- CI runs every layer on every push. Integration tests rely on a reachable Docker daemon; a missing daemon yields a clear skip from `testenv.RequireDocker`, and anything past that is a hard failure.

## Out-of-scope (legacy architecture)

The previous nine-service architecture defined components that no longer exist as distinct services. Their behaviour either lives inside `backend` (and is therefore covered by backend service or integration tests) or has been removed:

- *Auth/Session Service*, *User Service*, *Notification Service*, *Mail Service*, *Game Lobby Service*, *Runtime Manager*, *Game Master*, *Admin Service*: consolidated into `backend/internal/*`. Inter-service seams between these former services are now in-process function calls; they are exercised by backend service tests, not by integration tests.
- *Geo Profile Service* (suspicious-multi-country detection, review-recommended state, session blocking through geo): not implemented. The geo concern is intentionally minimal (see `ARCHITECTURE.md §10`) and the test plan does not assert on features we do not ship.
- *Billing Service*: not implemented; no tests required until it appears.