TESTING.md
Test strategy for the Galaxy Game platform after the
consolidation that moved every domain concern into galaxy/backend.
The platform now ships three executables — gateway, backend,
game (the engine container) — plus the shared pkg/* libraries.
This document defines the layering of tests, the responsibilities of
each layer, and the mandatory minimum coverage per executable.
Three layers
- Service tests verify a single executable in isolation. They live next to the implementation as *_test.go files and use only in-process or testcontainers-managed dependencies.
- Inter-service integration tests verify one cross-process seam between two real executables (most often gateway ↔ backend, sometimes backend ↔ game). They live in integration/ and drive the platform from outside the trust boundary.
- Full system tests are a small, focused subset of the integration suite that walks an entire user-facing flow from the client edge through every component the flow touches. They live in the same integration/ module and reuse the same fixtures.
Service tests are the cheapest and the broadest; integration tests are slower and narrower; full-system tests are the slowest and the narrowest. The pyramid stays in this order: never replace a service test with a system test.
Global rules
- Every executable owns the service tests for its packages. Adding a new package without _test.go files is a review block.
- Every cross-process seam must have at least one passing inter-service test before the seam is wired in production.
- Async flows (mail outbox, notification routes, runtime workers, push gRPC) get tests for both the success path and the retry / dead-letter path, and a duplicate-event safety check.
- Sync flows get happy path, validation failure, timeout propagation, and dependency unavailable.
- Every external or trusted-internal API must have contract tests alongside behaviour tests. backend/internal/server/contract_test.go is the reference; gateway runs the same shape against gateway/openapi.yaml.
- The integration suite must keep running on a developer machine with Docker available; tests skip cleanly with a clear message when the daemon is unreachable.
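The last rule (skip cleanly when the daemon is unreachable) could be implemented along the following lines. This is a minimal sketch, not the actual testenv code: the names skipper, daemonReachable, and requireDocker are hypothetical, and it assumes that `docker info` exits non-zero when the daemon cannot be reached.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// probeTimeout bounds the probe so an unresponsive daemon cannot hang the suite.
const probeTimeout = 5 * time.Second

// skipper is the subset of *testing.T the helper needs, so the helper
// itself stays testable with a fake.
type skipper interface {
	Helper()
	Skipf(format string, args ...any)
}

// daemonReachable runs `<bin> info` and returns an error when the binary
// is missing, the command fails, or the timeout expires. The binary name
// is a parameter so the probe can be exercised without Docker installed.
func daemonReachable(bin string) error {
	ctx, cancel := context.WithTimeout(context.Background(), probeTimeout)
	defer cancel()
	if err := exec.CommandContext(ctx, bin, "info").Run(); err != nil {
		return fmt.Errorf("%s daemon unreachable: %w", bin, err)
	}
	return nil
}

// requireDocker skips the calling test with a clear message instead of
// failing it. Hypothetical helper: call it first in an integration test.
func requireDocker(t skipper) {
	t.Helper()
	if err := daemonReachable("docker"); err != nil {
		t.Skipf("skipping integration test: %v", err)
	}
}

func main() {
	if err := daemonReachable("docker"); err != nil {
		fmt.Println("would skip:", err)
		return
	}
	fmt.Println("docker reachable, integration tests would run")
}
```

Skipping rather than failing keeps `go test ./...` green on machines without Docker, while the skip message still makes the missing coverage visible in the test output.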
Service-specific coverage
galaxy/gateway
Service tests live under gateway/internal/:
- Public REST routing, error projection, and OpenAPI contract validation.
- Authenticated gRPC envelope verification (grpcapi.Server): signature, payload hash, freshness window, anti-replay reservation, unknown / revoked sessions.
- Session cache (session.BackendCache): the only implementation in the codebase, a thin wrapper around the backendclient.RESTClient per-request lookup.
- Response signing for unary responses and stream events (authn.ResponseSigner).
- Push hub (push.Hub) and push fan-out (push_fanout.go).
- Replay store (replay.RedisStore) reservation semantics.
- Anti-abuse rate limits per IP / session / user / message class.
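To make the freshness-window and anti-replay cases concrete, here is a small in-memory sketch of the behaviour a service test would assert on. It is an illustration only: replayGuard, its methods, and the 30-second window are assumptions, and the real gateway uses replay.RedisStore, whose API is not shown in this document.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var (
	errStale  = errors.New("timestamp outside freshness window")
	errReplay = errors.New("request_id already used")
)

// replayGuard is a hypothetical in-memory stand-in for a replay store.
type replayGuard struct {
	mu     sync.Mutex
	window time.Duration
	seen   map[string]time.Time // request_id -> reservation expiry
}

func newReplayGuard(window time.Duration) *replayGuard {
	return &replayGuard{window: window, seen: make(map[string]time.Time)}
}

// Check rejects timestamps more than window away from now (past or
// future), then reserves the request ID. The reservation lasts until the
// timestamp itself goes stale, so a replay can never outlive it.
func (g *replayGuard) Check(requestID string, ts, now time.Time) error {
	age := now.Sub(ts)
	if age < 0 {
		age = -age
	}
	if age > g.window {
		return errStale
	}
	g.mu.Lock()
	defer g.mu.Unlock()
	if exp, ok := g.seen[requestID]; ok && !now.After(exp) {
		return errReplay
	}
	g.seen[requestID] = ts.Add(g.window)
	return nil
}

// replayDemo runs the three cases: first use, duplicate, stale timestamp.
func replayDemo() (first, duplicate, stale error) {
	g := newReplayGuard(30 * time.Second)
	now := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	first = g.Check("req-1", now, now)
	duplicate = g.Check("req-1", now, now.Add(time.Second))
	stale = g.Check("req-2", now.Add(-time.Minute), now)
	return first, duplicate, stale
}

func main() {
	first, duplicate, stale := replayDemo()
	fmt.Println(first, duplicate, stale)
	// prints: <nil> request_id already used timestamp outside freshness window
}
```

Passing the clock in as an argument keeps the test deterministic; no case depends on wall-clock time.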
galaxy/backend
Service tests live under backend/internal/:
- Startup wiring: app.App lifecycle, telemetry runtime, Postgres pool, embedded migrations.
- OpenAPI contract test (internal/server/contract_test.go): validates every documented operation against the live gin engine.
- Domain unit + e2e tests per package (auth, user, admin, lobby, runtime, mail, notification, geo, push). E2E tests (*_e2e_test.go) spin up a Postgres testcontainer.
- Mail outbox: pickup with SELECT FOR UPDATE SKIP LOCKED, retry with backoff plus jitter, dead-letter past MAX_ATTEMPTS, resend semantics (pending|retrying|dead_lettered → re-armed, sent → 409).
- Notification: idempotent Submit, route materialisation, push + email fan-out, OnUserDeleted cascade.
- Lobby: state-machine transitions, RND canonicalisation, sweeper.
- Runtime: per-game mutex serialisation, worker pool, scheduler, reconciler, force-next-turn skip flag.
- Admin: bcrypt cost 12, idempotent bootstrap, write-through cache, 409 Conflict on duplicate username, last-used timestamp.
- Geo: counter increment on every authenticated request, declared-country write at registration, fail-open semantics.
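The mail outbox retry rules above (backoff plus jitter, dead-letter past MAX_ATTEMPTS) are easiest to test when the random source is injected. The sketch below shows one way to structure that. It is not the backend implementation: nextDelay, nextStatus, the doubling schedule, the half-to-full jitter range, and the value 5 for the attempt limit are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// maxAttempts is an assumed value standing in for MAX_ATTEMPTS.
const maxAttempts = 5

// nextDelay returns an exponential backoff delay with jitter. The
// un-jittered delay d is base doubled once per previous attempt and
// capped at limit; the result lies in [d/2, d). jitter must return a
// value in [0, 1) and is injected so tests can pin it.
func nextDelay(attempt int, base, limit time.Duration, jitter func() float64) time.Duration {
	d := base
	for i := 1; i < attempt && d < limit; i++ {
		d *= 2
	}
	if d > limit {
		d = limit
	}
	half := d / 2
	return half + time.Duration(jitter()*float64(half))
}

// nextStatus returns the outbox status after a failed delivery attempt.
// Assumption: the row is dead-lettered once attempts reaches maxAttempts.
func nextStatus(attempts int) string {
	if attempts >= maxAttempts {
		return "dead_lettered"
	}
	return "retrying"
}

func main() {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		d := nextDelay(attempt, time.Second, time.Minute, rand.Float64)
		fmt.Printf("attempt %d failed: status=%s, next delay=%s\n",
			attempt, nextStatus(attempt), d)
	}
}
```

With the jitter function pinned to a constant, a service test can assert exact delays for each attempt and check the transition to dead_lettered at the limit without any randomness.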
galaxy/game
The engine has its own service tests under game/:
- OpenAPI contract test (game/openapi_contract_test.go).
- Engine lifecycle (init, status, turn, banish, command, order, report) implemented by the engine package suites.
Integration test coverage (integration/)
The integration module is the single home for inter-service and
full-system tests. Every scenario calls testenv.Bootstrap(t) which
brings up Postgres, Redis, mailpit, the backend image, the gateway
image, and (when needed) the engine image.
Mandatory inter-service coverage:
- Gateway ↔ Backend (public auth): auth_flow_test.go: register + confirm with mailpit-captured code; declared_country populated; idempotent re-confirm.
- Gateway ↔ Backend (authenticated user surface): user_account_test.go, user_profile_update_test.go, user_settings_update_test.go: signed envelope, FlatBuffers payload, response signature verification, BCP 47 / IANA validation.
- Gateway ↔ Backend (anti-replay, signature, freshness): gateway_edge_test.go: body-too-large, bad signature, payload_hash mismatch, stale timestamp, unknown session, unsupported protocol_version.
- Gateway ↔ Backend (push): notification_flow_test.go, session_revoke_test.go: push delivery to a SubscribeEvents stream and immediate stream close on revoke.
- Gateway ↔ Backend (anti-replay): anti_replay_test.go: duplicate request_id rejected.
- Backend ↔ Postgres is exercised by every backend e2e test through testcontainers; integration tests do not duplicate it.
- Backend ↔ SMTP: mail_flow_test.go: login-code email captured by mailpit; admin list reaches sent; resend on sent returns 409.
- Backend ↔ Game engine: runtime_lifecycle_test.go, engine_command_proxy_test.go: start container, healthz green, command, force-next-turn, finish, race name promotion.
- Admin surface (REST): admin_flow_test.go, admin_global_games_view_test.go, admin_engine_versions_test.go, admin_user_sanction_test.go: bootstrap + CRUD; visibility split between user and admin queries; engine-version registry CRUD; permanent block cascade.
- Lobby flow without engine: lobby_flow_test.go: owner-creates-private-game → open-enrollment → invite → redeem → memberships listing.
- Soft delete cascade: soft_delete_test.go: POST /api/v1/user/account/delete cascades through auth/lobby/notification/geo, gateway rejects subsequent calls.
- Geo counters: geo_counter_increments_test.go: multiple authenticated requests with different X-Forwarded-For values increment the user's per-country counter rows.
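The geo counter scenario can be illustrated in-process with net/http/httptest. The sketch below is a simplified stand-in, not the backend geo package: countryCounter, the /ping route, and the IP-to-country fixture (documentation address ranges, not real geolocation data) are all hypothetical. It also shows the fail-open rule, where an address with no known country still gets a successful response.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// countryCounter is a hypothetical stand-in for a per-country request counter.
type countryCounter struct {
	mu     sync.Mutex
	lookup map[string]string // client IP -> country code (test fixture)
	counts map[string]int    // country code -> request count
}

// wrap counts the request under the country of the first X-Forwarded-For
// entry, then always calls next: an unresolvable address never fails the
// request (fail-open).
func (c *countryCounter) wrap(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		first, _, _ := strings.Cut(r.Header.Get("X-Forwarded-For"), ",")
		if country, ok := c.lookup[strings.TrimSpace(first)]; ok {
			c.mu.Lock()
			c.counts[country]++
			c.mu.Unlock()
		}
		next.ServeHTTP(w, r)
	})
}

// geoDemo sends one request per X-Forwarded-For value and returns the
// resulting counters together with the response status codes.
func geoDemo(forwarded ...string) (map[string]int, []int) {
	c := &countryCounter{
		lookup: map[string]string{"203.0.113.7": "DE", "198.51.100.9": "FR"},
		counts: map[string]int{},
	}
	h := c.wrap(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	}))
	var codes []int
	for _, xff := range forwarded {
		req := httptest.NewRequest(http.MethodGet, "/ping", nil)
		req.Header.Set("X-Forwarded-For", xff)
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, req)
		codes = append(codes, rec.Code)
	}
	return c.counts, codes
}

func main() {
	counts, codes := geoDemo("203.0.113.7", "198.51.100.9, 10.0.0.1", "203.0.113.7", "192.0.2.1")
	fmt.Println(counts, codes)
	// prints: map[DE:2 FR:1] [204 204 204 204]
}
```

The real integration test asserts on counter rows in Postgres after requests pass through the gateway; this sketch only mirrors the shape of the assertion.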
Full-system flows beyond the inter-service set are intentionally limited; pick scenarios that exercise the longest vertical slice the platform supports today.
Out-of-scope (legacy architecture)
The previous nine-service architecture defined components that no
longer exist as distinct services. Their behaviour either lives
inside backend (and is therefore covered by backend service or
integration tests) or has been removed:
- Auth/Session Service, User Service, Notification Service, Mail Service, Game Lobby Service, Runtime Manager, Game Master, Admin Service: consolidated into backend/internal/*. Inter-service seams between these former services are now in-process function calls; they are exercised by backend service tests, not by integration tests.
- Geo Profile Service (suspicious-multi-country detection, review-recommended state, session blocking through geo): not implemented. The geo concern is intentionally minimal (see ARCHITECTURE.md §10) and the test plan does not assert on features we do not ship.
- Billing Service: not implemented; no tests required until it appears.
Practical execution
During day-to-day development:
- Run go test ./<service>/... for the service you are touching; this is fast (Postgres testcontainers add ~3–5 s per package that uses them).
- Run go test ./integration/... before opening a PR that touches a cross-process seam. Cold runs build three Docker images (galaxy/backend:integration, galaxy/gateway:integration, galaxy/game:integration); budget ~3 min for the cold path, ~75 s for the warm path.
- CI runs every layer on every push. Integration tests skip with a clear message if Docker is not available.
Adding a new test
- Decide the layer: service, inter-service, or system. A backend change usually lands as service tests plus an integration test for any new cross-process behaviour.
- Reuse testenv fixtures rather than rolling your own container orchestration.
- Follow the bootstrap-per-test pattern; do not share a global stack across tests.
- Make the test deterministic: explicit timeouts (no time.Sleep), t.Logf instead of fmt.Println, no t.Parallel() in integration/.
- Adding a new service-test file is fine; adding an integration-test file requires that the seam be reachable through gateway's REST or gRPC surface (or through backend HTTP directly with X-User-ID for routes that gateway does not yet register).
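For the determinism rule, one common way to avoid time.Sleep is to poll a condition under an explicit context deadline. The helper below is a generic sketch of that pattern; waitFor and waitDemo are hypothetical names and are not part of testenv.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// waitFor polls cond every interval until it returns true or ctx is done.
// The deadline lives in ctx, so the caller states the timeout explicitly.
func waitFor(ctx context.Context, interval time.Duration, cond func() bool) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if cond() {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("condition not met: %w", ctx.Err())
		case <-ticker.C:
		}
	}
}

// waitDemo exercises both outcomes: a condition that becomes true and a
// condition that never does.
func waitDemo() (met, timedOut error) {
	var ready atomic.Bool
	go func() { ready.Store(true) }() // stands in for an async worker

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	met = waitFor(ctx, 5*time.Millisecond, ready.Load)

	short, cancelShort := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancelShort()
	timedOut = waitFor(short, 5*time.Millisecond, func() bool { return false })
	return met, timedOut
}

func main() {
	met, timedOut := waitDemo()
	fmt.Println(met, errors.Is(timedOut, context.DeadlineExceeded))
	// prints: <nil> true
}
```

In a test, the condition would typically query the system under test (for example, whether a captured email or a pushed event has arrived), and a returned error would be reported with t.Fatalf.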