# TESTING.md

Test strategy for the [Galaxy Game](ARCHITECTURE.md) platform after the consolidation that moved every domain concern into `galaxy/backend`. The platform now ships three executables — `gateway`, `backend`, `game` (the engine container) — plus the shared `pkg/*` libraries. This document defines the layering of tests, the responsibilities of each layer, and the mandatory minimum coverage per executable.

## Three layers

1. **Service tests** verify a single executable in isolation. They live next to the implementation as `*_test.go` files and use only in-process or testcontainers-managed dependencies.
2. **Inter-service integration tests** verify one cross-process seam between two real executables (most often `gateway ↔ backend`, sometimes `backend ↔ game`). They live in [`integration/`](integration/) and drive the platform from outside the trust boundary.
3. **Full system tests** are a small, focused subset of the integration suite that walks an entire user-facing flow from the client edge through every component the flow touches. They live in the same `integration/` module and reuse the same fixtures.

Service tests are the cheapest and the broadest; integration tests are slower and narrower; full-system tests are the slowest and the narrowest. The pyramid stays in this order — never replace a service test with a system test.

## Global rules

- Every executable owns the service tests for its packages. Adding a new package without `_test.go` files is a review block.
- Every cross-process seam must have at least one passing inter-service test before the seam is wired in production.
- Async flows (mail outbox, notification routes, runtime workers, push gRPC) get tests for both the success path and the retry / dead-letter path, and a duplicate-event safety check.
- Sync flows get happy path, validation failure, timeout propagation, and dependency unavailable.
- Every external or trusted-internal API must have contract tests alongside behaviour tests.
  `backend/internal/server/contract_test.go` is the reference; gateway runs the same shape against `gateway/openapi.yaml`.
- The integration suite must keep running on a developer machine with Docker available; tests skip cleanly with a clear message when the daemon is unreachable.

## Service-specific coverage

### `galaxy/gateway`

Service tests live under `gateway/internal/`:

- Public REST routing, error projection, and OpenAPI contract validation.
- Authenticated gRPC envelope verification (`grpcapi.Server`): signature, payload hash, freshness window, anti-replay reservation, unknown / revoked sessions.
- Session cache (`session.BackendCache`) — the only implementation in the codebase, a thin wrapper around the `backendclient.RESTClient` per-request lookup.
- Response signing for unary responses and stream events (`authn.ResponseSigner`).
- Push hub (`push.Hub`) and push fan-out (`push_fanout.go`).
- Replay store (`replay.RedisStore`) reservation semantics.
- Anti-abuse rate limits per IP / session / user / message class.

### `galaxy/backend`

Service tests live under `backend/internal/`:

- Startup wiring: `app.App` lifecycle, telemetry runtime, Postgres pool, embedded migrations.
- OpenAPI contract test (`internal/server/contract_test.go`): validates every documented operation against the live gin engine.
- Domain unit + e2e tests per package (`auth`, `user`, `admin`, `lobby`, `runtime`, `mail`, `notification`, `geo`, `push`). E2E tests (`*_e2e_test.go`) spin up a Postgres testcontainer.
- Mail outbox: pickup with `SELECT FOR UPDATE SKIP LOCKED`, retry with backoff plus jitter, dead-letter past `MAX_ATTEMPTS`, resend semantics (`pending|retrying|dead_lettered` → re-armed, `sent` → 409).
- Notification: idempotent `Submit`, route materialisation, push + email fan-out, `OnUserDeleted` cascade.
- Lobby: state-machine transitions, RND canonicalisation, sweeper.
- Runtime: per-game mutex serialisation, worker pool, scheduler, reconciler, force-next-turn skip flag.
- Admin: bcrypt cost 12, idempotent bootstrap, write-through cache, 409 Conflict on duplicate username, last-used timestamp.
- Geo: counter increment on every authenticated request, declared-country write at registration, fail-open semantics.

### `galaxy/game`

The engine has its own service tests under `game/`:

- OpenAPI contract test (`game/openapi_contract_test.go`).
- Engine lifecycle (init, status, turn, banish, command, order, report), exercised by the engine package suites.

## Integration test coverage (`integration/`)

The integration module is the single home for inter-service and full-system tests. Every scenario calls `testenv.Bootstrap(t)`, which brings up Postgres, Redis, mailpit, the backend image, the gateway image, and (when needed) the engine image.

Mandatory inter-service coverage:

- **Gateway ↔ Backend (public auth)**: `auth_flow_test.go` — register + confirm with mailpit-captured code; declared_country populated; idempotent re-confirm.
- **Gateway ↔ Backend (authenticated user surface)**: `user_account_test.go`, `user_profile_update_test.go`, `user_settings_update_test.go` — signed envelope, FlatBuffers payload, response signature verification, BCP 47 / IANA validation.
- **Gateway ↔ Backend (anti-replay, signature, freshness)**: `gateway_edge_test.go` — body-too-large, bad signature, payload_hash mismatch, stale timestamp, unknown session, unsupported `protocol_version`.
- **Gateway ↔ Backend (push)**: `notification_flow_test.go`, `session_revoke_test.go` — push delivery to a SubscribeEvents stream and immediate stream close on revoke.
- **Gateway ↔ Backend (anti-replay)**: `anti_replay_test.go` — duplicate `request_id` rejected.
- **Backend ↔ Postgres** is exercised by every backend e2e test through testcontainers; integration tests do not duplicate it.
- **Backend ↔ SMTP**: `mail_flow_test.go` — login-code email captured by mailpit; admin list reaches `sent`; resend on `sent` returns 409.
- **Backend ↔ Game engine**: `runtime_lifecycle_test.go`, `engine_command_proxy_test.go` — start container, healthz green, command, force-next-turn, finish, race name promotion.
- **Admin surface (REST)**: `admin_flow_test.go`, `admin_global_games_view_test.go`, `admin_engine_versions_test.go`, `admin_user_sanction_test.go` — bootstrap + CRUD; visibility split between user and admin queries; engine-version registry CRUD; permanent block cascade.
- **Lobby flow without engine**: `lobby_flow_test.go` — owner-creates-private-game → open-enrollment → invite → redeem → memberships listing.
- **Soft delete cascade**: `soft_delete_test.go` — `POST /api/v1/user/account/delete` cascades through auth/lobby/notification/geo, gateway rejects subsequent calls.
- **Geo counters**: `geo_counter_increments_test.go` — multiple authenticated requests with different `X-Forwarded-For` values increment the user's per-country counter rows.

Full-system flows beyond the inter-service set are intentionally limited; pick scenarios that exercise the longest vertical slice the platform supports today.

## Out-of-scope (legacy architecture)

The previous nine-service architecture defined components that no longer exist as distinct services. Their behaviour either lives inside `backend` (and is therefore covered by backend service or integration tests) or has been removed:

- *Auth/Session Service*, *User Service*, *Notification Service*, *Mail Service*, *Game Lobby Service*, *Runtime Manager*, *Game Master*, *Admin Service* — consolidated into `backend/internal/*`. Inter-service seams between these former services are now in-process function calls; they are exercised by backend service tests, not by integration tests.
- *Geo Profile Service* (suspicious-multi-country detection, review-recommended state, session blocking through geo) — not implemented. The geo concern is intentionally minimal (see `ARCHITECTURE.md §10`) and the test plan does not assert on features we do not ship.
- *Billing Service* — not implemented; no tests required until it appears.

## Practical execution

During day-to-day development:

- Run `go test .//...` for the service you are touching; this is fast (Postgres testcontainers add ~3–5 s per package that uses them).
- Run `go test ./integration/...` before opening a PR that touches a cross-process seam. Cold runs build three Docker images (`galaxy/backend:integration`, `galaxy/gateway:integration`, `galaxy/game:integration`) — budget ~3 min for the cold path, ~75 s for the warm path.
- CI runs every layer on every push. Integration tests skip with a clear message if Docker is not available.

## Adding a new test

1. Decide the layer: service, inter-service, or system. A backend change usually lands as service tests plus an integration test for any new cross-process behaviour.
2. Reuse `testenv` fixtures rather than rolling your own container orchestration.
3. Follow the bootstrap-per-test pattern; do not share a global stack across tests.
4. Make the test deterministic: explicit timeouts (no `time.Sleep`), `t.Logf` instead of `fmt.Println`, no `t.Parallel()` in `integration/`.
5. Adding a new service-test file is fine; adding an integration-test file requires that the seam be reachable through gateway's REST or gRPC surface (or through backend HTTP directly with `X-User-ID` for routes that gateway does not yet register).