# TESTING.md
## Purpose
This document defines the testing strategy for the Galaxy Plus platform and provides a staged testing matrix aligned with the agreed service implementation order.
The strategy is built around the current architecture constraints:
* `Edge Gateway` is the single public ingress and owns the external transport, authenticated gRPC verification pipeline, routing, and push delivery.
* `Auth / Session Service` is the source of truth for challenges and `device_session`, but it must not become the hot-path dependency for every authenticated request.
* `Geo Profile Service` is asynchronous and auxiliary; it must not block the current request and only affects subsequent requests.
* Internal event propagation already exists as an architectural pattern through Redis-backed cache updates and pub/sub-style flows.
## Global Testing Strategy
* Start with **service tests** for each service in isolation.
* As soon as a new service is integrated with already implemented services, add **inter-service integration tests** for that concrete boundary.
* Only after all major components are implemented, add **full system tests** that exercise complete end-to-end platform flows.
* Do not postpone all integration testing until the end.
* Do not try to replace service tests with end-to-end tests.
* Keep most tests deterministic and cheap to run.
* Use real Redis in integration tests where Redis is part of the service contract.
* Keep `Mail Service` stubbed in most integration and system tests, except for a small dedicated smoke suite for the real mail adapter.
* Prefer fake or test-specific implementations for external side effects until the corresponding real service is intentionally introduced.
* For every new service:
  * first add service tests;
  * then add inter-service tests against already implemented services;
  * then add regression scenarios to the growing system test suite.
* For asynchronous flows:
  * test both successful delivery and delayed/eventual delivery;
  * test duplicate event handling;
  * test retry-safe and idempotent consumption;
  * test observability of stuck or failed processing.
* For synchronous flows:
  * test happy path, validation failures, timeout propagation, dependency unavailability, and deterministic error mapping.
* Every service with an external or trusted internal API must have contract tests in addition to behavioral tests.
* Every service that publishes or consumes Redis Streams events must have schema/contract tests for those event payloads.
* Full system tests should be small in number but broad in vertical coverage.
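
The duplicate-handling and idempotent-consumption rules for asynchronous flows can be illustrated with a minimal sketch. It assumes each event carries a unique `event_id` field and that the consumer tracks processed IDs in memory; both are hypothetical choices for illustration, not the platform's actual event schema or storage.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentConsumer:
    """Applies each event at most once, keyed by a hypothetical `event_id`."""

    seen_ids: set = field(default_factory=set)
    applied: list = field(default_factory=list)

    def handle(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        event_id = event["event_id"]
        if event_id in self.seen_ids:
            return False
        self.seen_ids.add(event_id)
        self.applied.append(event["payload"])
        return True


def test_duplicate_delivery_is_applied_once():
    consumer = IdempotentConsumer()
    event = {"event_id": "evt-1", "payload": "game_started"}
    assert consumer.handle(event) is True
    assert consumer.handle(event) is False  # redelivery after a retry
    assert consumer.applied == ["game_started"]
```

A real consumer would likely persist the processed IDs (for example in Redis) so that deduplication survives restarts.
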
## Test Layer Definitions
### Service tests
Service tests verify one component in isolation.
They include:
* domain/model tests;
* use-case/service-layer tests;
* adapter tests for storage, queues, clocks, IDs, and protocol encoding;
* API handler/controller tests;
* contract tests for DTOs and stable error surfaces;
* service-local integration tests with owned infrastructure such as Redis.
### Inter-service integration tests
Inter-service integration tests verify one real boundary between two or more already implemented services.
They include:
* synchronous API compatibility;
* event publication and consumption;
* error propagation across service boundaries;
* cache/projection compatibility;
* retry and idempotency behavior across the seam;
* compatibility of internal authenticated context and domain decisions.
### Full system tests
Full system tests verify complete user or admin flows through the real architecture.
They include:
* gateway ingress;
* authentication;
* user/profile state;
* game lifecycle;
* notifications and push;
* runtime orchestration;
* administrative operations;
* failure and recovery behavior across multiple services.
## Test Environment Rules
* Use an isolated Redis instance per integration test suite or per test worker.
* Use a stub `Mail Service` by default.
* Use fake/test doubles for not-yet-implemented downstream services.
* Introduce real downstream services progressively as they are implemented.
* Use a test engine container or test engine stub for `Game Master` and `Runtime Manager` tests before relying on a real production engine image.
* Use deterministic test clocks where scheduling or expiration matters.
* Make async tests wait on observable states, not arbitrary sleeps, whenever possible.
* Keep one small smoke suite for:
  * real Redis;
  * real runtime backend path;
  * real SMTP adapter later;
  * real signed gateway request/response flow.
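
The rules about deterministic clocks and waiting on observable state can be sketched as two small test helpers. The names `FakeClock`, `is_expired`, and `wait_until` are hypothetical and only illustrate the idea; the project may structure these helpers differently.

```python
import time
from typing import Callable


class FakeClock:
    """Manually advanced clock for expiration and scheduling tests."""

    def __init__(self, start: float = 0.0) -> None:
        self._now = start

    def now(self) -> float:
        return self._now

    def advance(self, seconds: float) -> None:
        self._now += seconds


def is_expired(created_at: float, ttl_seconds: float, clock: FakeClock) -> bool:
    """Expiration check that reads time only through the injected clock."""
    return clock.now() - created_at >= ttl_seconds


def wait_until(predicate: Callable[[], bool], timeout: float = 2.0,
               interval: float = 0.01) -> bool:
    """Poll an observable condition until it holds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()
```

With these helpers a TTL test advances the fake clock instead of sleeping, and an async test asserts `wait_until(lambda: projection_is_visible())` (where `projection_is_visible` stands for whatever observable state the test checks) rather than sleeping for a fixed duration.
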
## Recommended Service Implementation and Testing Order
The testing plan follows this service order:
1. `Edge Gateway Service`
2. `Auth / Session Service`
3. `User Service`
4. `Mail Service`
5. `Notification Service`
6. `Game Lobby Service`
7. `Runtime Manager`
8. `Game Master`
9. `Admin Service`
10. `Geo Profile Service`
11. `Billing Service`
---
## 1. Edge Gateway Service
### Service tests
* Public REST routing tests:
  * `GET /healthz`
  * `GET /readyz`
  * mounted public auth routes
  * rejection of oversized public request bodies
  * public rate-limit behavior
  * stable projection of upstream public auth errors
* Authenticated gRPC envelope validation tests:
  * missing required fields
  * unsupported `protocol_version`
  * malformed `payload_hash`
  * mismatched `payload_hash`
  * invalid signature
  * stale timestamp
  * replay detection
  * unknown session
  * revoked session
* Session cache behavior tests:
  * cache hit
  * cache miss
  * malformed cached record
  * cache invalidation/update handling
* Response signing tests:
  * signed unary response generation
  * signed bootstrap push event generation
  * signed stream event generation
* Routing tests:
  * unrouted `message_type`
  * downstream timeout mapping
  * downstream availability mapping
  * authenticated internal command context construction
* Push tests:
  * `SubscribeEvents` binds `user_id` and `device_session_id`
  * bootstrap server-time event is emitted
  * stream queue overflow closes only the affected stream
  * revoked session closes matching streams only
* Anti-abuse tests:
  * IP/session/user/message-class buckets
  * interaction between rate limits and verification order
* Redis adapter tests:
  * session cache lookup
  * replay reservation
  * client event stream consumption
  * session event stream consumption
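
Several of the envelope validation cases above can be expressed as one table of inputs against a single verification function. The sketch below is an illustration only: the `request_id` and `timestamp` field names, the SHA-256 hex encoding of `payload_hash`, the 30-second freshness window, and the order of checks are all assumptions, and signature and session checks are omitted.

```python
import hashlib

SUPPORTED_PROTOCOL_VERSIONS = {1}  # assumption for the sketch
MAX_CLOCK_SKEW_SECONDS = 30        # assumption for the sketch
HEX_DIGITS = set("0123456789abcdef")


def verify_envelope(envelope: dict, payload: bytes, now: float,
                    seen_request_ids: set) -> str:
    """Return "ok" or a deterministic rejection reason for the envelope."""
    required = ("protocol_version", "payload_hash", "timestamp", "request_id")
    if any(name not in envelope for name in required):
        return "missing_field"
    if envelope["protocol_version"] not in SUPPORTED_PROTOCOL_VERSIONS:
        return "unsupported_protocol_version"
    claimed = envelope["payload_hash"]
    if not isinstance(claimed, str) or len(claimed) != 64 \
            or not set(claimed) <= HEX_DIGITS:
        return "malformed_payload_hash"
    if claimed != hashlib.sha256(payload).hexdigest():
        return "payload_hash_mismatch"
    if abs(now - envelope["timestamp"]) > MAX_CLOCK_SKEW_SECONDS:
        return "stale_timestamp"
    # Replay reservation: the first use of a request_id wins.
    if envelope["request_id"] in seen_request_ids:
        return "replay"
    seen_request_ids.add(envelope["request_id"])
    return "ok"
```

In a service test, each rejection reason would additionally be asserted against the stable error code that the gateway returns to the client.
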
### Inter-service integration tests at this stage
* `Gateway <-> Redis`
  * session cache compatibility
  * replay reservation semantics
  * event stream consumption for push
* `Gateway <-> stub Auth adapter`
  * public auth passthrough
  * timeout/error projection
* `Gateway <-> fake downstream`
  * verified authenticated command routing
  * signed response generation after downstream success
### Regression tests to keep from this stage onward
* Authenticated request verification pipeline remains stable.
* Public auth routes remain mounted and deterministic.
* Push bootstrap event remains signed and schema-compatible.
---
## 2. Auth / Session Service
### Service tests
* Challenge lifecycle tests:
  * challenge creation
  * TTL expiration
  * resend throttling
  * delivery state transitions
  * invalid confirm attempt limits
  * success-shaped `send-email-code` behavior
* Confirm flow tests:
  * valid `challenge_id + code + client_public_key`
  * malformed `client_public_key`
  * blocked user
  * existing user
  * creatable user
  * short-window idempotent confirm retry
  * same challenge plus different public key failure
  * session-limit exceeded
* Session lifecycle tests:
  * create session
  * revoke one session
  * revoke all sessions
  * block user/email and revoke implied sessions
  * already-revoked and already-blocked idempotent results
* Projection tests:
  * source-of-truth session write
  * gateway KV snapshot write
  * gateway session stream event publish
  * repeated publish idempotency
* Public API tests:
  * JSON decoding and unknown field rejection
  * public error mapping
  * stable success DTO shape
* Internal API tests:
  * `GetSession`
  * `ListUserSessions`
  * `RevokeDeviceSession`
  * `RevokeAllUserSessions`
  * `BlockUser`
* Redis adapter tests:
  * challenge store
  * session store
  * config provider
  * projection publisher
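
The confirm idempotency cases (short-window retry, and the same challenge with a different public key) can be sketched as follows. The `ConfirmService` class, the 60-second window, and the in-memory store are hypothetical; code verification and user resolution are left out to keep the example focused.

```python
import uuid

IDEMPOTENCY_WINDOW_SECONDS = 60  # assumption for the sketch


class ConfirmService:
    """Toy confirm flow that remembers the first successful confirmation."""

    def __init__(self) -> None:
        # challenge_id -> (client_public_key, device_session_id, confirmed_at)
        self._confirmed = {}

    def confirm(self, challenge_id: str, client_public_key: str,
                now: float) -> str:
        previous = self._confirmed.get(challenge_id)
        if previous is not None:
            key, session_id, confirmed_at = previous
            if key != client_public_key:
                raise ValueError("challenge confirmed with a different key")
            if now - confirmed_at > IDEMPOTENCY_WINDOW_SECONDS:
                raise ValueError("challenge already used")
            return session_id  # idempotent retry inside the window
        session_id = str(uuid.uuid4())
        self._confirmed[challenge_id] = (client_public_key, session_id, now)
        return session_id


def test_confirm_retry_returns_same_session():
    service = ConfirmService()
    first = service.confirm("challenge-1", "key-A", now=0.0)
    retry = service.confirm("challenge-1", "key-A", now=10.0)
    assert first == retry
```

The remaining cases follow the same pattern: a retry with `"key-B"` and a retry after the window both expect a rejection.
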
### Inter-service integration tests with already implemented components
* `Gateway <-> Auth / Session`
  * public `send-email-code`
  * public `confirm-email-code`
  * upstream timeout handling
  * public error passthrough
* `Auth / Session <-> Redis`
  * challenge persistence
  * session persistence
  * session projection compatibility
* `Gateway <-> Auth / Session <-> Redis`
  * login creates session
  * session projection becomes visible to gateway
  * revoked session invalidates gateway authentication path
  * revoked session closes gateway push stream
* `Auth / Session <-> stub Mail`
  * auth code send path
  * suppression path
  * explicit mail failure path
### Regression tests to keep from this stage onward
* `confirm-email-code` always returns a ready `device_session_id`.
* Gateway continues authenticating from cache rather than synchronous auth lookups.
* Confirm idempotency window behavior remains stable.
* Session projection remains compatible with gateway expectations.
---
## 3. User Service
### Service tests
* User creation and identity tests:
  * create user
  * find by email
  * normalized email uniqueness
  * role assignment
  * tariff/entitlement fields
* Profile tests:
  * allowed profile reads
  * allowed profile edits
  * forbidden profile edits
  * settings reads/writes
* Restriction/sanction tests:
  * block flags
  * user limits
  * override fields
  * declared current sanctions view
* Entitlement tests:
  * free user
  * paid placeholder states
  * default simultaneous-game limit and per-user overrides
* Internal/admin-oriented tests:
  * resolve existing/creatable/blocked decision for auth
  * current `declared_country` read/write path
* Storage and API contract tests:
  * public/trusted endpoints
  * stable DTO mapping
  * Redis persistence if used directly in v1
### Inter-service integration tests with already implemented components
* `Auth / Session <-> User`
  * resolve existing user
  * create new user during confirm
  * blocked-by-policy outcome
* `Gateway <-> User`
  * authenticated profile read
  * authenticated allowed profile update
  * tariff and settings read paths
* `Gateway <-> Auth / Session <-> User`
  * first registration by email
  * repeat login by same email
  * blocked email/user behavior
### Regression tests to keep from this stage onward
* User resolution outcomes remain stable for auth flow.
* User-facing profile APIs do not bypass auth/session rules.
* User limit and sanction data stay compatible with downstream consumers.
---
## 4. Mail Service
### Service tests
* Mail command validation tests:
  * recipient validation
  * template selection
  * payload rendering
* Internal queue tests:
  * enqueue
  * dequeue
  * retry
  * permanent failure
  * idempotent duplicate suppression where applicable
* Delivery adapter tests:
  * stub adapter behavior
  * future SMTP adapter smoke behavior
* Operational tests:
  * queue backlog metrics
  * dead-letter or failure recording behavior
  * timeout handling
### Inter-service integration tests with already implemented components
* `Auth / Session <-> Mail`
  * direct auth-code send
  * explicit mail failure behavior
  * suppression path still preserves correct auth semantics
* `Gateway <-> Auth / Session <-> Mail`
  * public auth flow still behaves correctly with mail delivery involved
* Keep `Mail Service` stubbed in most broader suites.
* Add only a small dedicated smoke suite for the real mail adapter.
### Regression tests to keep from this stage onward
* Auth code mail remains a direct dependency of auth flow.
* Mail failures do not corrupt auth challenge/session state.
* Stub mail remains the default for most non-mail-focused suites.
---
## 5. Notification Service
### Service tests
* Event intake tests:
  * accepted event types
  * malformed event rejection
  * idempotent duplicate handling
* Routing decision tests:
  * push only
  * email only
  * push and email
  * discard/no-delivery cases
* Rendering tests:
  * event-to-notification mapping
  * payload shaping for push
  * payload shaping for email
* Failure isolation tests:
  * push failure does not corrupt email route decision
  * email failure does not corrupt push route decision
  * retriable delivery behavior
* Redis/event bus tests:
  * consume domain/integration events
  * publish client-facing events for gateway
  * enqueue mail commands for mail service
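
The routing-decision and failure-isolation cases lend themselves to a table-driven test. In this sketch the event type names and the routing table are invented for illustration; only the four outcomes (push only, email only, both, discard) come from the list above.

```python
from typing import Callable

# Hypothetical routing table: event type -> delivery channels.
ROUTING_TABLE = {
    "game.turn_generated": {"push"},
    "billing.receipt": {"email"},
    "game.finished": {"push", "email"},
}


def decide_channels(event_type: str) -> set:
    """Unknown event types are discarded (no delivery)."""
    return set(ROUTING_TABLE.get(event_type, set()))


def deliver(event_type: str, send_push: Callable[[str], None],
            send_email: Callable[[str], None]) -> dict:
    """Attempt every routed channel; one failure must not affect the other."""
    channels = decide_channels(event_type)
    results = {}
    for channel, sender in (("push", send_push), ("email", send_email)):
        if channel not in channels:
            continue
        try:
            sender(event_type)
            results[channel] = "sent"
        except Exception:
            results[channel] = "failed"  # candidate for a later retry
    return results


def test_push_failure_does_not_block_email():
    def failing_push(_event_type: str) -> None:
        raise ConnectionError("gateway stream unavailable")

    sent_emails = []
    results = deliver("game.finished", failing_push, sent_emails.append)
    assert results == {"push": "failed", "email": "sent"}
    assert sent_emails == ["game.finished"]
```
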
### Inter-service integration tests with already implemented components
* `Notification <-> Gateway`
  * client-facing event publication and push delivery
  * user-targeted vs session-targeted push routing
* `Notification <-> Mail`
  * non-auth email delivery
  * retry/failure isolation
* `Lobby/other fake producers <-> Notification`
  * domain event intake compatibility
* Assert explicitly that auth-code emails still bypass notification and go directly from auth to mail.
### Regression tests to keep from this stage onward
* Notification stays delivery/orchestration-only and does not become source of truth.
* Non-auth notifications consistently go through notification service.
* Gateway push compatibility remains stable.
---
## 6. Game Lobby Service
### Service tests
* Game lifecycle tests:
  * `draft`
  * `enrollment_open`
  * `enrollment_closed`
  * `ready_to_start`
  * `starting`
  * `running`
  * `paused`
  * `finished`
  * `cancelled`
* Public/private game rules:
  * public game creation by admin only
  * private game creation entitlement checks
  * visibility rules for private games
* Invite lifecycle tests:
  * invite code creation
  * invite code redemption
  * invite approval/rejection
  * invite expiration if applicable later
* Application and approval tests:
  * public game application
  * manual approval
  * duplicate application handling
* Membership tests:
  * invited
  * pending
  * accepted
  * removed
  * blocked from party
* User list/read-model tests:
  * active games
  * finished games
  * pending applications
  * invited games
* Start-preparation tests:
  * roster validation
  * schedule validation
  * engine version target validation
  * readiness to start
* Runtime snapshot import tests:
  * `current_turn`
  * `runtime_status`
  * `engine_health_summary`
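
Game lifecycle tests are easiest to keep exhaustive when the allowed transitions live in one table. The state names below come from the list above, but the transition table itself is an assumption made for this sketch; the actual rules belong to the `Game Lobby Service` design.

```python
# Assumed transition table; only the state names are taken from the document.
ALLOWED_TRANSITIONS = {
    "draft": {"enrollment_open", "cancelled"},
    "enrollment_open": {"enrollment_closed", "cancelled"},
    "enrollment_closed": {"ready_to_start", "cancelled"},
    "ready_to_start": {"starting", "cancelled"},
    "starting": {"running", "paused"},
    "running": {"paused", "finished"},
    "paused": {"running", "finished"},
    "finished": set(),
    "cancelled": set(),
}


def can_transition(current: str, target: str) -> bool:
    return target in ALLOWED_TRANSITIONS.get(current, set())


def transition(current: str, target: str) -> str:
    """Return the new state or raise on a forbidden transition."""
    if not can_transition(current, target):
        raise ValueError(f"forbidden transition: {current} -> {target}")
    return target


def test_every_pair_matches_the_table():
    states = list(ALLOWED_TRANSITIONS)
    for current in states:
        for target in states:
            expected = target in ALLOWED_TRANSITIONS[current]
            assert can_transition(current, target) is expected
```

Iterating over every state pair means a newly added state fails the test until its row in the table is defined.
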
### Inter-service integration tests with already implemented components
* `Gateway <-> Game Lobby`
  * authenticated platform-level command routing
  * owner-only commands before start
* `Lobby <-> User`
  * entitlement checks for private game creation
  * per-user simultaneous-game limits
  * sanctions affecting join/create flows
* `Lobby <-> Notification`
  * invite events
  * approval/rejection events
  * game status change events at platform level
* `Lobby <-> Auth / Session`
  * authenticated context correctly propagated from gateway
* Keep runtime launch boundaries stubbed until `Runtime Manager` exists.
### Regression tests to keep from this stage onward
* `Lobby` remains source of truth for platform game metadata and membership.
* `Lobby` user-facing game lists remain independent from `Game Master`.
* Private-game visibility and invite semantics remain stable.
---
## 7. Runtime Manager
### Service tests
* Runtime job tests:
  * start container
  * stop container
  * restart container
  * patch container
  * inspect/status
* Invariant tests:
  * one game -> one container
  * one container -> one game
* Monitoring tests:
  * health probe collection
  * health event publication
  * container disappearance handling
  * restart/patch result reporting
* Failure tests:
  * Docker API unavailable
  * image missing
  * startup timeout
  * stop timeout
  * patch failure
* Event publication tests:
  * runtime job completion events
  * technical health events
  * duplicate event safety
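
The one-game/one-container invariant can be checked against a small bidirectional registry. `ContainerRegistry` is a hypothetical in-memory stand-in used only to show the shape of the invariant tests, not the Runtime Manager's real bookkeeping.

```python
class ContainerRegistry:
    """Keeps a one-to-one mapping between games and containers."""

    def __init__(self) -> None:
        self._container_by_game = {}
        self._game_by_container = {}

    def bind(self, game_id: str, container_id: str) -> None:
        if game_id in self._container_by_game:
            raise ValueError(f"game {game_id} already has a container")
        if container_id in self._game_by_container:
            raise ValueError(f"container {container_id} already serves a game")
        self._container_by_game[game_id] = container_id
        self._game_by_container[container_id] = game_id

    def unbind(self, game_id: str) -> None:
        container_id = self._container_by_game.pop(game_id)
        del self._game_by_container[container_id]

    def container_for(self, game_id: str):
        return self._container_by_game.get(game_id)


def test_second_container_for_same_game_is_rejected():
    registry = ContainerRegistry()
    registry.bind("game-1", "container-a")
    try:
        registry.bind("game-1", "container-b")
    except ValueError:
        pass
    else:
        raise AssertionError("expected the second bind to be rejected")
    assert registry.container_for("game-1") == "container-a"
```
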
### Inter-service integration tests with already implemented components
* `Lobby <-> Runtime Manager`
  * async start job request
  * completion event consumption
  * full fail-start path
* `Runtime Manager <-> Notification`
  * optional operational event routing if enabled
* Use a fake or test runtime backend first, then a targeted smoke suite against a real local Docker backend.
### Regression tests to keep from this stage onward
* Runtime Manager remains the only component talking to Docker API.
* Runtime job event contracts remain stable for `Lobby` and later `Game Master`.
---
## 8. Game Master
### Service tests
* Runtime registry tests:
  * register running game
  * unregister/stop game
  * runtime state transitions
* Engine version registry tests:
  * version registration
  * patch compatibility policy
  * version-specific options
* Runtime metadata tests:
  * current turn
  * runtime status
  * generation status
  * engine health summary
  * patch state
* Membership/runtime mapping tests:
  * `user_id -> engine player UUID`
  * game-scoped engine identifiers
* Scheduling tests:
  * scheduled turn generation
  * cutoff enforcement
  * manual force-next-turn
  * skip-next-scheduled-slot after manual generation
* Failure tests:
  * `generation_failed`
  * `engine_unreachable`
  * runtime recovery from engine errors
* Post-start administrative tests:
  * `stop game`
  * `patch engine`
  * temporary player removal at platform gate only
  * final player removal/deactivation inside engine
* Engine mediation tests:
  * engine setup after lobby metadata persistence
  * engine finish notification handling
### Inter-service integration tests with already implemented components
* `Gateway <-> Game Master`
  * running-game command routing with `game_id`
  * runtime-admin commands for running games
  * system admin vs private-owner privileges where applicable
* `Game Master <-> Lobby`
  * running-game registration after successful container start
  * membership lookup/cached authorization
  * runtime snapshot backfill into lobby
  * finished-game notification to lobby
* `Game Master <-> Runtime Manager`
  * patch/stop/restart jobs
  * runtime health event consumption
* `Game Master <-> Notification`
  * new turn event publication
  * game finished event publication
  * generation failure admin notification
* `Game Master <-> test engine container`
  * command proxying
  * status read
  * setup call
  * finish callback
### Regression tests to keep from this stage onward
* `Game Master` remains the only service allowed to call game engine containers.
* Turn cutoff logic stays authoritative at platform level.
* Manual next-turn generation always suppresses the next scheduled slot.
* Runtime snapshot compatibility with `Lobby` remains stable.
---
## 9. Admin Service
### Service tests
* Admin API surface tests:
  * admin-only route handling
  * DTO validation
  * aggregation/read models
* Orchestration tests:
  * forwards trusted operations to downstream services
  * error aggregation and normalization
  * partial failure handling for multi-step admin workflows
* Role-handling tests:
  * admin-only enforcement assumptions
  * no accidental privilege leak into normal user flows
### Inter-service integration tests with already implemented components
* `Gateway <-> Admin`
  * separate admin REST surface
  * admin-authenticated request handling
* `Admin <-> User`
  * user restriction/sanction/admin reads
* `Admin <-> Lobby`
  * public game administration
  * global read of private games
* `Admin <-> Game Master`
  * runtime administration
  * global status reads
  * patch/stop/force-next-turn
* `Admin <-> Auth / Session`
  * session revoke/block operations if exposed through admin workflows
* `Admin <-> Notification`
  * admin-generated notifications where needed
### Regression tests to keep from this stage onward
* Admin Service remains orchestration/backend only.
* System admin capabilities remain separate from private-owner capabilities.
---
## 10. Geo Profile Service
### Service tests
* Ingest tests:
  * enqueue authenticated observation
  * ingest validation
  * non-blocking acceptance
* Worker pipeline tests:
  * geo lookup
  * country aggregation
  * `usual_connection_country` derivation
  * suspicious multi-country detection
  * review recommendation calculation
* State tests:
  * durable `country_review_recommended`
  * declared-country version history
  * session block action history
* Admin/query API tests:
  * list review candidates
  * read user geo profile
  * apply approved declared-country change
* Queue and lag tests:
  * backlog observability
  * duplicate observation safety
  * delayed processing behavior
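
Non-blocking acceptance and backlog observability can be tested against a bounded queue that never makes the caller wait. The `ObservationIngest` class below is a hypothetical in-process stand-in; the real ingest path may use Redis or another transport.

```python
import queue


class ObservationIngest:
    """Accepts geo observations without ever blocking the request path."""

    def __init__(self, capacity: int) -> None:
        self._queue = queue.Queue(maxsize=capacity)
        self.dropped = 0

    def submit(self, observation: dict) -> bool:
        """Enqueue if there is room; otherwise drop and count (fail-open)."""
        try:
            self._queue.put_nowait(observation)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def backlog(self) -> int:
        return self._queue.qsize()


def handle_request(ingest: ObservationIngest, user_id: str, ip: str) -> str:
    """The gameplay request succeeds whether or not the observation fits."""
    ingest.submit({"user_id": user_id, "ip": ip})
    return "ok"


def test_full_queue_does_not_fail_the_request():
    ingest = ObservationIngest(capacity=1)
    assert handle_request(ingest, "user-1", "203.0.113.7") == "ok"
    assert handle_request(ingest, "user-1", "203.0.113.8") == "ok"
    assert ingest.backlog() == 1
    assert ingest.dropped == 1
```

The `dropped` counter and `backlog()` are what a lag or observability test would assert on.
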
### Inter-service integration tests with already implemented components
* `Gateway <-> Geo`
  * async observation publish from authenticated request context
* `Geo <-> Auth / Session`
  * suspicious session block request
  * subsequent-request effect rather than current-request effect
* `Geo <-> User`
  * synchronous update of current `declared_country`
  * no divergence between history and current value
* `Geo <-> Notification`
  * review-recommended event fan-out
  * optional admin notification flow
* Keep geo processing fail-open relative to gameplay in all integration tests.
### Regression tests to keep from this stage onward
* Geo processing never blocks the current gameplay request.
* Session suspicion affects only later requests via auth/session.
* Geo owns history, while user service owns current effective declared country.
---
## 11. Billing Service
### Service tests
* Payment event intake tests:
  * accepted event types
  * malformed event rejection
  * idempotent duplicate handling
* Entitlement mapping tests:
  * free
  * monthly-paid
  * annual-paid
  * once-forever-paid
* Lifecycle tests:
  * activate paid entitlement
  * expire renewable entitlement
  * cancel paid entitlement
  * preserve perpetual entitlement
* Failure tests:
  * unknown user
  * invalid payment state
  * downstream user update failure
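
The entitlement lifecycle cases can be sketched with a small pure function. The plan names come from the mapping list above; the `Entitlement` type, the `expires_at` field, and the rule that an expired renewable plan falls back to `free` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entitlement:
    plan: str                    # "free", "monthly-paid", "annual-paid", "once-forever-paid"
    expires_at: Optional[float]  # None means the entitlement has no expiry


def effective_plan(entitlement: Entitlement, now: float) -> str:
    """Resolve the plan that applies at `now`."""
    if entitlement.plan == "once-forever-paid":
        return entitlement.plan  # perpetual entitlement is always preserved
    if entitlement.expires_at is not None and now >= entitlement.expires_at:
        return "free"            # assumed fallback for expired renewable plans
    return entitlement.plan


def test_renewable_expires_but_perpetual_is_preserved():
    monthly = Entitlement(plan="monthly-paid", expires_at=1_000.0)
    forever = Entitlement(plan="once-forever-paid", expires_at=None)
    assert effective_plan(monthly, now=999.0) == "monthly-paid"
    assert effective_plan(monthly, now=1_000.0) == "free"
    assert effective_plan(forever, now=10_000_000.0) == "once-forever-paid"
```
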
### Inter-service integration tests with already implemented components
* `Billing <-> User`
  * entitlement updates become current source of truth in user service
* `Billing <-> Notification`
  * optional billing-related user/admin notifications
* `Gateway <-> User` regression:
  * user-facing entitlement reads reflect billing-fed updates correctly
### Regression tests to keep from this stage onward
* Other services never depend directly on billing for live entitlement decisions.
* `User Service` remains the source of truth for current entitlement.
---
## Full System Tests
These tests are added only after all major components are implemented.
By default, they should use:
* real gateway;
* real auth/session;
* real user;
* real notification;
* real lobby;
* real runtime manager;
* real game master;
* real admin;
* real geo;
* real Redis;
* stub `Mail Service` by default;
* test engine container or stable test engine image.
### A. Authentication and session lifecycle
* Register/login via email code through gateway.
* Confirm that `device_session_id` becomes usable through gateway without synchronous auth lookups on every request.
* Confirm that repeated `confirm-email-code` within the idempotency window returns the same `device_session_id`.
* Revoke one session and verify:
  * authenticated requests fail for that session;
  * only push streams bound to that session are closed.
* Revoke all sessions of a user and verify all sessions are rejected afterward.
### B. User profile and entitlement flow
* Read and update allowed user profile fields through gateway.
* Read tariff/entitlement and user limits through gateway.
* Verify that private-party creation entitlement decisions reflect current user-service state.
* Later, verify billing-fed entitlement changes become visible through user-service reads.
### C. Public game lifecycle
* Admin creates a public game.
* Users see it in public lists.
* Users apply.
* Admin approves roster.
* Lobby validates readiness.
* Runtime Manager starts container.
* Lobby persists metadata.
* Game Master registers the running game and initializes engine.
* Game becomes visible as running in user lists.
### D. Private game lifecycle
* Eligible user creates private game.
* Owner creates invite code.
* Another user redeems invite code and applies.
* Owner approves application.
* Owner starts game.
* Running registration completes.
* Only authorized users see the private game.
### E. Running-game command and push flow
* Player sends valid game command before cutoff.
* Gateway authenticates and routes to Game Master.
* Game Master verifies access and forwards to engine.
* Scheduled turn generation occurs.
* Player receives lightweight push notification through gateway.
* Player separately fetches updated per-player game state.
### F. Force-next-turn flow
* Running game has a fixed schedule.
* Owner or admin triggers manual next-turn generation.
* Current turn increments.
* Next scheduled slot is skipped.
* Subsequent scheduled generation happens only after the following valid slot.
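
The force-next-turn flow can be pinned down with a deterministic scheduler model. `TurnScheduler` is a hypothetical simplification that assumes fixed-interval slots measured in seconds; the real schedule format is not specified here.

```python
class TurnScheduler:
    """Fixed-interval turn schedule with manual force-next-turn support."""

    def __init__(self, interval: float, first_slot: float) -> None:
        self.interval = interval
        self.next_slot = first_slot
        self.current_turn = 0

    def tick(self, now: float) -> bool:
        """Generate a scheduled turn if the next slot is due."""
        if now >= self.next_slot:
            self.current_turn += 1
            self.next_slot += self.interval
            return True
        return False

    def force_next_turn(self) -> None:
        """Generate a turn immediately and skip the next scheduled slot."""
        self.current_turn += 1
        self.next_slot += self.interval


def test_manual_generation_skips_next_slot():
    scheduler = TurnScheduler(interval=100.0, first_slot=100.0)
    scheduler.force_next_turn()            # manual generation before slot 100
    assert scheduler.current_turn == 1
    assert scheduler.tick(100.0) is False  # slot 100 is skipped
    assert scheduler.tick(200.0) is True   # the following slot runs normally
    assert scheduler.current_turn == 2
```

Passing `now` explicitly keeps the test independent of wall-clock time.
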
### G. Runtime failure flow
* Scheduled turn generation fails.
* Game Master marks `generation_failed`.
* Lobby receives updated runtime snapshot.
* Only administrators are notified through notification flow.
* Users can still observe the degraded state through status reads.
### H. Start failure and recovery flow
* Lobby requests runtime start.
* Runtime Manager starts container.
* Simulate metadata persistence failure in Lobby.
* Verify container is removed and game is not left half-started.
* Simulate successful metadata persistence but Game Master registration failure.
* Verify game is marked `paused` and admin is notified.
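
The two failure branches of the start flow can be exercised with simple fakes. Everything named below (`start_game`, `FakeRuntime`, `FakeLobby`, and the returned status strings) is hypothetical and only models the compensation logic described above; in the real platform these steps are asynchronous and span several services.

```python
class FakeRuntime:
    def __init__(self) -> None:
        self.containers = set()

    def start(self, game_id: str) -> str:
        container_id = f"container-{game_id}"
        self.containers.add(container_id)
        return container_id

    def remove(self, container_id: str) -> None:
        self.containers.discard(container_id)


class FakeLobby:
    def __init__(self, fail_persist: bool = False) -> None:
        self.fail_persist = fail_persist
        self.status = {}

    def persist_metadata(self, game_id: str, container_id: str) -> None:
        if self.fail_persist:
            raise RuntimeError("metadata persistence failed")
        self.status[game_id] = "starting"

    def set_status(self, game_id: str, status: str) -> None:
        self.status[game_id] = status


def start_game(game_id, runtime, lobby, register_with_game_master, notify_admin):
    """Start a game and compensate when a later step fails."""
    container_id = runtime.start(game_id)
    try:
        lobby.persist_metadata(game_id, container_id)
    except Exception:
        runtime.remove(container_id)  # do not leave a half-started game
        return "start_failed"
    try:
        register_with_game_master(game_id, container_id)
    except Exception:
        lobby.set_status(game_id, "paused")
        notify_admin(game_id)
        return "paused"
    lobby.set_status(game_id, "running")
    return "running"
```

A test for the first branch passes `FakeLobby(fail_persist=True)` and asserts that `runtime.containers` is empty afterward; a test for the second branch passes a registration callable that raises and asserts the `paused` status and the admin notification.
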
### I. Temporary vs final player removal flow
* Temporarily remove player after game start.
* Verify player can no longer send commands through platform.
* Verify engine still keeps the slot.
* Permanently remove the player, or block the player's account.
* Verify Game Master sends engine admin command to deactivate/remove the player.
### J. Notification routing flow
* Lobby emits invite/application/approval events.
* Notification Service sends push through gateway.
* Non-auth email notifications route through Notification Service to Mail Service.
* Auth-code emails remain direct `Auth / Session -> Mail`.
### K. Geo auxiliary flow
* Authenticated traffic generates geo observations.
* Suspicious multi-country pattern is detected.
* Current triggering request still succeeds.
* Auth / Session blocks the suspicious session.
* Next request from that session is rejected.
### L. Admin supervision flow
* System admin uses admin REST through gateway.
* Admin can view public and private games.
* Admin can inspect running-game runtime state.
* Admin can stop game, patch engine, and force next turn.
* Admin can block users and revoke sessions through appropriate downstream APIs.
## Ongoing Regression Policy
* Every time a new service is added, its service tests are mandatory before merging.
* Every new service boundary must add at least one inter-service integration suite against already implemented neighbors.
* Every bug found in integration or system testing must produce:
  * one narrow regression test at the lowest useful level;
  * and, if applicable, one broader integration or system scenario.
* The full system suite should stay intentionally limited to high-value vertical slices, not explode into a giant matrix.
## Practical Rule of Execution
* During early development:
  * run service tests on every change;
  * run inter-service tests for affected neighboring services on every branch;
  * run a reduced smoke subset of system tests in CI.
* During stabilization:
  * keep service and integration tests mandatory in CI;
  * expand system tests around the critical product flows only.
## Summary
The project-wide testing strategy is fixed as follows:
* first, **service tests** inside each component;
* then, as components appear, **inter-service integration tests** between real neighboring services;
* finally, after all major components are implemented, **full system tests** for complete end-to-end platform flows.
This order is mandatory for the project because the architecture contains several critical stateful and asynchronous seams:
* gateway verification and routing;
* auth/session projection into gateway cache;
* push delivery through gateway;
* Redis Streams event propagation;
* runtime job completion;
* lobby/game-master synchronization;
* geo after-the-fact protective actions.