TESTING.md
Purpose
This document defines the testing strategy for the Galaxy Game platform and provides a staged testing matrix aligned with the agreed service implementation order.
The strategy is built around the current architecture constraints:
- Edge Gateway is the single public ingress and owns the external transport, authenticated gRPC verification pipeline, routing, and push delivery.
- Auth / Session Service is the source of truth for challenges and device_session, but it must not become the hot-path dependency for every authenticated request.
- Geo Profile Service is asynchronous and auxiliary; it must not block the current request and only affects subsequent requests.
- Internal event propagation already exists as an architectural pattern through Redis-backed cache updates and pub/sub-style flows.
Global Testing Strategy
-
Start with service tests for each service in isolation.
-
As soon as a new service is integrated with already implemented services, add inter-service integration tests for that concrete boundary.
-
Only after all major components are implemented, add full system tests that exercise complete end-to-end platform flows.
-
Do not postpone all integration testing until the end.
-
Do not try to replace service tests with end-to-end tests.
-
Keep most tests deterministic and cheap to run.
-
Use real Redis in integration tests where Redis is part of the service contract.
-
Keep Mail Service stubbed in most integration and system tests, except for a small dedicated smoke suite for the real mail adapter.
-
Prefer fake or test-specific implementations for external side effects until the corresponding real service is intentionally introduced.
-
For every new service:
- first add service tests;
- then add inter-service tests against already implemented services;
- then add regression scenarios to the growing system test suite.
-
For asynchronous flows:
- test both successful delivery and delayed/eventual delivery;
- test duplicate event handling;
- test retry-safe and idempotent consumption;
- test observability of stuck or failed processing.
-
For synchronous flows:
- test happy path, validation failures, timeout propagation, dependency unavailability, and deterministic error mapping.
-
Every service with an external or trusted internal API must have contract tests in addition to behavioral tests.
-
Every service that publishes or consumes Redis Stream events must have schema/contract tests for those event payloads.
-
Full system tests should be small in number but broad in vertical coverage.
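The asynchronous-flow rules above (duplicate event handling, retry-safe and idempotent consumption) can be illustrated with a minimal sketch. The `IdempotentConsumer` class, the `event_id` field, and the in-memory set are hypothetical stand-ins chosen for illustration; a real consumer would track processed IDs in its own store.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentConsumer:
    """Hypothetical consumer that applies each event at most once."""

    seen_ids: set = field(default_factory=set)
    applied: list = field(default_factory=list)

    def handle(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        event_id = event["event_id"]
        if event_id in self.seen_ids:
            return False
        self.seen_ids.add(event_id)
        self.applied.append(event["payload"])
        return True


def test_duplicate_delivery_is_applied_once() -> None:
    consumer = IdempotentConsumer()
    event = {"event_id": "evt-1", "payload": "session_revoked"}
    assert consumer.handle(event) is True
    # A redelivery of the same event must not change state again.
    assert consumer.handle(event) is False
    assert consumer.applied == ["session_revoked"]
```

The same shape of test applies to any consumer in this plan: deliver the event twice and assert that the observable state changed once.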
Test Layer Definitions
Service tests
Service tests verify one component in isolation.
They include:
- domain/model tests;
- use-case/service-layer tests;
- adapter tests for storage, queues, clocks, IDs, and protocol encoding;
- API handler/controller tests;
- contract tests for DTOs and stable error surfaces;
- service-local integration tests with owned infrastructure such as Redis.
Inter-service integration tests
Inter-service integration tests verify one real boundary between two or more already implemented services.
They include:
- synchronous API compatibility;
- event publication and consumption;
- error propagation across service boundaries;
- cache/projection compatibility;
- retry and idempotency behavior across the seam;
- compatibility of internal authenticated context and domain decisions.
Full system tests
Full system tests verify complete user or admin flows through the real architecture.
They include:
- gateway ingress;
- authentication;
- user/profile state;
- game lifecycle;
- notifications and push;
- runtime orchestration;
- administrative operations;
- failure and recovery behavior across multiple services.
Test Environment Rules
-
Use an isolated Redis instance per integration test suite or per test worker.
-
Use a stub Mail Service by default.
-
Use fake/test doubles for not-yet-implemented downstream services.
-
Introduce real downstream services progressively as they are implemented.
-
Use a test engine container or test engine stub for Game Master and Runtime Manager tests before relying on a real production engine image.
-
Use deterministic test clocks where scheduling or expiration matters.
-
Make async tests wait on observable states, not arbitrary sleeps, whenever possible.
-
Keep one small smoke suite for:
- real Redis;
- real runtime backend path;
- real SMTP adapter later;
- real signed gateway request/response flow.
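The rules about deterministic clocks and waiting on observable states can be combined in one small helper. This is only a sketch: `FakeClock`, `wait_until`, and the `step` callback are hypothetical names, and a real suite might drive the system under test differently.

```python
from typing import Callable


class FakeClock:
    """Deterministic clock: time moves only when the test advances it."""

    def __init__(self, start: float = 0.0) -> None:
        self._now = start

    def now(self) -> float:
        return self._now

    def advance(self, seconds: float) -> None:
        self._now += seconds


def wait_until(
    condition: Callable[[], bool],
    clock: FakeClock,
    step: Callable[[], None],
    timeout: float = 5.0,
    tick: float = 0.1,
) -> bool:
    """Poll `condition` until it holds or `timeout` of fake time elapses.

    `step` drives the system under test by one unit of work per tick.
    """
    deadline = clock.now() + timeout
    while clock.now() < deadline:
        if condition():
            return True
        step()
        clock.advance(tick)
    return condition()


def test_projection_becomes_visible() -> None:
    clock = FakeClock()
    pending = ["session-1"]
    visible: list = []

    def deliver_one() -> None:
        # Simulates one unit of asynchronous delivery.
        if pending:
            visible.append(pending.pop(0))

    assert wait_until(lambda: "session-1" in visible, clock, deliver_one)
    assert clock.now() < 5.0
```

Because the clock is fake, the test never sleeps, and a timeout shows up as a deterministic `False` rather than a flaky delay.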
Recommended Service Implementation and Testing Order
The testing plan follows this service order:
1. Edge Gateway Service
2. Auth / Session Service
3. User Service
4. Mail Service
5. Notification Service
6. Game Lobby Service
7. Runtime Manager
8. Game Master
9. Admin Service
10. Geo Profile Service
11. Billing Service
1. Edge Gateway Service
Service tests
-
Public REST routing tests:
- GET /healthz
- GET /readyz
- mounted public auth routes
- wrong-method and not-found handling
- public route-class classification for auth, browser bootstrap, browser asset, and misc traffic
- isolation of browser/public-auth rate-limit buckets
- rejection of oversized public request bodies
- RemoteAddr-based public IP derivation that ignores forwarded proxy headers
- public rate-limit behavior
- stable projection of upstream public auth errors
- sensitive-field redaction in public-auth logs
- public OpenAPI contract validation
- admin /metrics availability only on the private admin listener
-
Authenticated gRPC envelope validation tests:
- missing required fields
- unsupported protocol_version
- parsed envelope attachment before delegate execution
- malformed payload_hash
- mismatched payload_hash
- invalid signature
- stale timestamp
- replay detection
- unknown session
- revoked session
-
Session cache behavior tests:
- cache hit
- cache miss
- malformed cached record
- read-through local-cache warming after first fallback lookup
- local hit skips fallback lookup
- cache invalidation/update handling
-
Response signing tests:
- signed unary response generation
- unary response fails closed when the response signer is unavailable
- signed bootstrap push event generation
- bootstrap push fails closed when the response signer is unavailable
- signed stream event generation
-
Routing tests:
- unrouted message_type
- downstream timeout mapping
- downstream availability mapping
- authenticated internal command context construction
- verified trace/span context propagation downstream
- graceful drain of in-flight unary requests on shutdown
- sensitive transport material redaction in authenticated logs
-
Push tests:
- SubscribeEvents binds user_id and device_session_id
- bootstrap server-time event is emitted
- user-targeted events fan out to all matching user sessions
- session-targeted events reach only the addressed session
- stream queue overflow closes only the affected stream
- revoked session closes matching streams only
- revoked-session stream reopen is rejected
- active streams close with deterministic status on gateway shutdown
-
Anti-abuse tests:
- IP/session/user/message-class buckets
- interaction between rate limits and verification order
- authenticated/public anti-abuse bucket isolation
- authenticated policy-hook input and reject mapping
-
Redis adapter tests:
- session cache lookup
- replay reservation
- client event stream consumption
- session event stream consumption
- subscriber start-from-tail semantics
- malformed-event drop/evict-and-continue behavior
- later-event-wins behavior for session snapshots
- subscriber shutdown interrupts blocking reads
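As an illustration of the envelope validation cases listed above, the sketch below checks required fields, protocol version, payload hash, timestamp freshness, and replay in a fixed order. Everything specific here is an assumption, not the gateway's actual contract: the `request_id` field, the SHA-256 hex encoding of `payload_hash`, the constants, the check order, and the rejection codes. Signature and session checks are deliberately left out.

```python
import hashlib
import hmac

SUPPORTED_PROTOCOL_VERSION = 1   # assumed value, for illustration only
MAX_CLOCK_SKEW_SECONDS = 30.0    # assumed freshness window


def verify_envelope(envelope: dict, payload: bytes, now: float, seen_ids: set) -> str:
    """Return "ok" or an illustrative rejection code."""
    required = ("protocol_version", "request_id", "timestamp", "payload_hash")
    if any(name not in envelope for name in required):
        return "missing_field"
    if envelope["protocol_version"] != SUPPORTED_PROTOCOL_VERSION:
        return "unsupported_protocol_version"
    try:
        claimed = bytes.fromhex(envelope["payload_hash"])
    except (TypeError, ValueError):
        return "malformed_payload_hash"
    if len(claimed) != hashlib.sha256().digest_size:
        return "malformed_payload_hash"
    if not hmac.compare_digest(claimed, hashlib.sha256(payload).digest()):
        return "payload_hash_mismatch"
    if abs(now - envelope["timestamp"]) > MAX_CLOCK_SKEW_SECONDS:
        return "stale_timestamp"
    if envelope["request_id"] in seen_ids:
        return "replay"
    seen_ids.add(envelope["request_id"])  # reserve only after other checks pass
    return "ok"


def test_rejections_are_deterministic() -> None:
    payload = b"move fleet"
    good = {
        "protocol_version": 1,
        "request_id": "req-1",
        "timestamp": 1000.0,
        "payload_hash": hashlib.sha256(payload).hexdigest(),
    }
    seen: set = set()
    assert verify_envelope(good, payload, 1000.0, seen) == "ok"
    assert verify_envelope(good, payload, 1000.0, seen) == "replay"
    bad_hash = {**good, "payload_hash": "zz"}
    assert verify_envelope(bad_hash, payload, 1000.0, seen) == "malformed_payload_hash"
    assert verify_envelope(good, b"tampered", 1000.0, seen) == "payload_hash_mismatch"
    late = {**good, "request_id": "req-2"}
    assert verify_envelope(late, payload, 2000.0, seen) == "stale_timestamp"
```

Returning one stable code per failure makes the "deterministic error mapping" requirement directly assertable, whatever the real codes turn out to be.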
Inter-service integration tests at this stage
-
Gateway <-> Redis
- session cache compatibility
- replay reservation semantics
- session update warms local cache without repeated fallback lookups
- revoked snapshot invalidates authenticated requests without fallback lookup
- client-event stream consumption for push fan-out
- session-event stream consumption for revoke propagation and push teardown
-
Gateway <-> stub Auth adapter
- public auth passthrough
- timeout/error projection
-
Gateway <-> fake downstream
- verified authenticated command routing
- signed response generation after downstream success
Regression tests to keep from this stage onward
- Authenticated request verification pipeline remains stable.
- Public auth routes remain mounted and deterministic.
- Public route classes and anti-abuse buckets remain isolated.
- Admin metrics stay off the public ingress.
- Push bootstrap event remains signed and schema-compatible.
- Push revoke and shutdown close streams with stable status mapping.
- Gateway logs remain free of sensitive request/auth material.
2. Auth / Session Service
Service tests
-
Challenge lifecycle tests:
- challenge creation
- TTL expiration
- resend throttling
- delivery_throttled challenge creation without UserDirectory or MailSender calls
- delivery_suppressed behavior for blocked subjects
- expiry grace-window transition from challenge_expired to challenge_not_found
- delivery state transitions
- invalid confirm attempt limits
- success-shaped send-email-code behavior
-
Confirm flow tests:
- valid challenge_id + code + client_public_key
- malformed client_public_key
- blocked user
- existing user
- creatable user
- short-window idempotent confirm retry
- projection repair on repeated confirm after prior publish failure
- same challenge plus different public key failure
- confirm-race cleanup of superseded sessions
- session-limit exceeded
-
Session lifecycle tests:
- create session
- revoke one session
- revoke all sessions
- block user/email and revoke implied sessions
- already_revoked, no_active_sessions, and already_blocked acknowledgement semantics
-
Projection tests:
- source-of-truth session write
- gateway KV snapshot write
- gateway session stream event publish
- repeated publish idempotency
- stored session reread before publish to avoid stale active projection
-
Public API tests:
- JSON decoding, input validation, and invalid-request mapping
- public error mapping
- stable success DTO shape
- end-to-end public HTTP send/confirm scenarios
- timeout mapping and invalid-success-payload rejection
- stable public OpenAPI validation and gateway contract parity
- stable public error examples
- trace/metric emission and sensitive-field log redaction
-
Internal API tests:
- GetSession
- ListUserSessions
- RevokeDeviceSession
- RevokeAllUserSessions
- BlockUser
- path/body validation and invalid-request mapping
- end-to-end internal HTTP read/revoke/block scenarios
- timeout mapping and invalid-success-payload rejection
- stable internal OpenAPI validation and frozen mutation DTO/enums
- trace/metric emission and sensitive-field log redaction
-
Redis adapter tests:
- challenge store
- session store
- config provider
- projection publisher
-
Runtime and architecture tests:
- public/internal HTTP server lifecycle
- intentional absence of /healthz, /readyz, and /metrics
- runtime wiring for stub|rest user-service and mail-service adapters
- startup fail-fast on Redis-backed ping failure
- storage-agnostic core for domain/service/ports layers
Inter-service integration tests with already implemented components
-
Gateway <-> Auth / Session
- public send-email-code
- public confirm-email-code
- upstream timeout handling
- public error passthrough
-
Auth / Session <-> Redis
- challenge persistence
- session persistence
- session projection compatibility
- duplicate publish keeps gateway cache canonical
-
Gateway <-> Auth / Session <-> Redis
- login creates session
- session projection becomes visible to gateway
- repeated confirm repairs a previously failed projection publish
- revoked session invalidates gateway authentication path
- revoked session closes gateway push stream
- malformed client public key keeps stable client-facing error
-
Auth / Session <-> stub Mail
- auth code send path
- suppression path
- explicit mail failure path
-
Auth / Session <-> Mail REST
- sent/suppressed/failure compatibility
- blocked/throttled sends skip mail delivery
-
Auth / Session <-> User REST
- resolve-by-email compatibility for public send
- ensure-user compatibility for confirm
- exists/block compatibility for internal revoke/block flows
Regression tests to keep from this stage onward
- confirm-email-code always returns a ready device_session_id.
- Gateway continues authenticating from cache rather than synchronous auth lookups.
- Confirm idempotency window behavior remains stable.
- Projection repair-on-retry remains safe after source-of-truth commits.
- Confirm-race cleanup does not leave multiple active winner sessions.
- Projection repair continues working after process restart.
- Redis reconnect on the same live process preserves recovery semantics.
- Expired challenges continue returning challenge_expired during grace and challenge_not_found after TTL cleanup.
- Large session-list and bulk-revoke paths remain stable.
- Concurrent confirm, revoke-all, and block flows do not leak active sessions.
- Session projection remains compatible with gateway expectations.
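The confirm idempotency and "same challenge plus different public key" rules can be pinned down with a fake like the one below. It is a sketch under assumptions: `FakeConfirmService`, `ConfirmError`, and the error strings are hypothetical, and the time limit of the idempotency window is left out to keep the example short.

```python
import uuid


class ConfirmError(Exception):
    """Illustrative error type; the real service defines its own error surface."""


class FakeConfirmService:
    def __init__(self, valid_codes: dict) -> None:
        self._valid_codes = valid_codes  # challenge_id -> expected code
        self._confirmed: dict = {}       # challenge_id -> (public_key, session_id)

    def confirm(self, challenge_id: str, code: str, client_public_key: str) -> str:
        if self._valid_codes.get(challenge_id) != code:
            raise ConfirmError("invalid_code")
        previous = self._confirmed.get(challenge_id)
        if previous is not None:
            key, session_id = previous
            if key != client_public_key:
                # Same challenge, different key: never hand out the session.
                raise ConfirmError("public_key_mismatch")
            return session_id  # idempotent retry returns the same session
        session_id = str(uuid.uuid4())
        self._confirmed[challenge_id] = (client_public_key, session_id)
        return session_id


def test_confirm_retry_is_idempotent() -> None:
    service = FakeConfirmService({"ch-1": "123456"})
    first = service.confirm("ch-1", "123456", "key-A")
    assert service.confirm("ch-1", "123456", "key-A") == first
    try:
        service.confirm("ch-1", "123456", "key-B")
    except ConfirmError as err:
        assert str(err) == "public_key_mismatch"
    else:
        raise AssertionError("expected a key mismatch rejection")
```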
3. User Service
Service tests
-
User creation and identity tests:
- create user
- find by email
- normalized email uniqueness
- generated default race_name for new users
- race_name uniqueness and confusable-substitution policy
- role assignment
- tariff/entitlement fields
-
Profile tests:
- allowed profile reads
- allowed profile edits
- forbidden profile edits
- self-service rejection for e-mail and declared_country mutations
- profile_update_block sanction gating for profile/settings writes
- settings reads/writes
- BCP 47 and IANA validation for settings values
-
Restriction/sanction tests:
- block flags
- user limits
- override fields
- declared current sanctions view
- effective sanction/limit snapshot shaping for downstream consumers
-
Entitlement tests:
- free user
- paid placeholder states
- default simultaneous-game limit and per-user overrides
- entitlement, sanction, and limit interaction rules
-
Internal/admin-oriented tests:
- resolve existing/creatable/blocked decision for auth
- ensure-by-email create-only registration_context semantics
- current declared_country read/write path
- exact lookup by user_id, normalized email, and race_name
- paginated filtered listing with deterministic ordering
-
Storage and API contract tests:
- public/trusted endpoints
- stable DTO mapping
- Redis persistence if used directly in v1
Inter-service integration tests with already implemented components
-
Auth / Session <-> User
- resolve existing user
- create new user during confirm
- blocked-by-policy outcome
-
Gateway <-> User
- authenticated profile read
- authenticated allowed profile update
- tariff and settings read paths
-
Gateway <-> Auth / Session <-> User
- first registration by email
- repeat login by same email
- blocked email/user behavior
Regression tests to keep from this stage onward
- User resolution outcomes remain stable for auth flow.
- User-facing profile APIs do not bypass auth/session rules.
- registration_context stays create-only and does not overwrite existing users.
- race_name uniqueness policy remains stable for self-service and auth-created users.
- User limit and sanction data stay compatible with downstream consumers.
4. Mail Service
Service tests
-
Mail command validation tests:
- recipient validation
- template selection
- payload rendering
-
Internal queue tests:
- enqueue
- dequeue
- retry
- permanent failure
- idempotent duplicate suppression where applicable
-
Delivery adapter tests:
- stub adapter behavior
- future SMTP adapter smoke behavior
-
Operational tests:
- queue backlog metrics
- dead-letter or failure recording behavior
- timeout handling
Inter-service integration tests with already implemented components
-
Auth / Session <-> Mail
- direct auth-code send
- explicit mail failure behavior
- suppression path still preserves correct auth semantics
-
Gateway <-> Auth / Session <-> Mail
- public auth flow still behaves correctly with mail delivery involved
-
Keep Mail Service stubbed in most broader suites.
-
Add only a small dedicated smoke suite for the real mail adapter.
Regression tests to keep from this stage onward
- Auth code mail remains a direct dependency of auth flow.
- Mail failures do not corrupt auth challenge/session state.
- Stub mail remains the default for most non-mail-focused suites.
5. Notification Service
Service tests
-
Event intake tests:
- accepted event types
- malformed event rejection
- idempotent duplicate handling
-
Routing decision tests:
- push only
- email only
- push and email
- discard/no-delivery cases
-
Rendering tests:
- event-to-notification mapping
- payload shaping for push
- payload shaping for email
-
Failure isolation tests:
- push failure does not corrupt email route decision
- email failure does not corrupt push route decision
- retriable delivery behavior
-
Redis/event bus tests:
- consume domain/integration events
- publish client-facing events for gateway
- enqueue mail commands for mail service
Inter-service integration tests with already implemented components
-
Notification <-> Gateway
- client-facing event publication and push delivery
- user-targeted vs session-targeted push routing
-
Notification <-> Mail
- non-auth email delivery
- retry/failure isolation
-
Lobby/other fake producers <-> Notification
- domain event intake compatibility
-
Assert explicitly that auth-code emails still bypass notification and go directly from auth to mail.
Regression tests to keep from this stage onward
- Notification stays delivery/orchestration-only and does not become source of truth.
- Non-auth notifications consistently go through notification service.
- Gateway push compatibility remains stable.
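The routing-decision and failure-isolation cases for this service lend themselves to a table-driven test. The sketch below is illustrative only: the event type names, the `ROUTES` table, and the `route` and `deliver` helpers are hypothetical and do not describe the real Notification Service configuration.

```python
# Hypothetical routing table: event type -> delivery channels.
ROUTES = {
    "game.invite_created": {"push", "email"},
    "game.turn_generated": {"push"},
    "game.finished": {"email"},
}


def route(event_type: str) -> frozenset:
    """Unknown event types fall through to the discard/no-delivery case."""
    return frozenset(ROUTES.get(event_type, ()))


def deliver(event_type: str, senders: dict) -> dict:
    """Attempt every routed channel; one channel failing must not stop the others."""
    results = {}
    for channel in sorted(route(event_type)):
        try:
            senders[channel]()
            results[channel] = "sent"
        except Exception:
            results[channel] = "failed"
    return results


def test_push_failure_does_not_affect_email() -> None:
    def failing_push() -> None:
        raise RuntimeError("push backend unavailable")

    sent_mail = []
    senders = {"push": failing_push, "email": lambda: sent_mail.append("mail")}

    assert deliver("game.invite_created", senders) == {"email": "sent", "push": "failed"}
    assert sent_mail == ["mail"]
    assert route("game.turn_generated") == frozenset({"push"})
    assert route("unknown.event") == frozenset()
```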
6. Game Lobby Service
Service tests
-
Game lifecycle tests:
- draft
- enrollment_open
- enrollment_closed
- ready_to_start
- starting
- running
- paused
- finished
- cancelled
-
Public/private game rules:
- public game creation by admin only
- private game creation entitlement checks
- visibility rules for private games
-
Invite lifecycle tests:
- invite code creation
- invite code redemption
- invite approval/rejection
- invite expiration if applicable later
-
Application and approval tests:
- public game application
- manual approval
- duplicate application handling
-
Membership tests:
- invited
- pending
- accepted
- removed
- blocked from party
-
User list/read-model tests:
- active games
- finished games
- pending applications
- invited games
-
Start-preparation tests:
- roster validation
- schedule validation
- engine version target validation
- readiness to start
-
Runtime snapshot import tests:
- current_turn
- runtime_status
- engine_health_summary
Inter-service integration tests with already implemented components
-
Gateway <-> Game Lobby
- authenticated platform-level command routing
- owner-only commands before start
-
Lobby <-> User
- entitlement checks for private game creation
- per-user simultaneous-game limits
- sanctions affecting join/create flows
-
Lobby <-> Notification
- invite events
- approval/rejection events
- game status change events at platform level
-
Lobby <-> Auth / Session
- authenticated context correctly propagated from gateway
-
Keep runtime launch boundaries stubbed until Runtime Manager exists.
Regression tests to keep from this stage onward
- Lobby remains source of truth for platform game metadata and membership.
- Lobby user-facing game lists remain independent from Game Master.
- Private-game visibility and invite semantics remain stable.
7. Runtime Manager
Service tests
-
Runtime job tests:
- start container
- stop container
- restart container
- patch container
- inspect/status
-
Invariant tests:
- one game -> one container
- one container -> one game
-
Monitoring tests:
- health probe collection
- health event publication
- container disappearance handling
- restart/patch result reporting
-
Failure tests:
- Docker API unavailable
- image missing
- startup timeout
- stop timeout
- patch failure
-
Event publication tests:
- runtime job completion events
- technical health events
- duplicate event safety
Inter-service integration tests with already implemented components
-
Lobby <-> Runtime Manager
- async start job request
- completion event consumption
- full fail-start path
-
Runtime Manager <-> Notification
- optional operational event routing if enabled
-
Use a fake or test runtime backend first, then a targeted smoke suite against a real local Docker backend.
Regression tests to keep from this stage onward
- Runtime Manager remains the only component talking to Docker API.
- Runtime job event contracts remain stable for Lobby and later Game Master.
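The "one game -> one container" and "one container -> one game" invariants amount to keeping a one-to-one mapping. A minimal sketch of how an invariant test could look, assuming a hypothetical in-memory `RuntimeRegistry` (the real Runtime Manager storage is not described here):

```python
class RuntimeRegistry:
    """Hypothetical registry that keeps games and containers in a 1:1 mapping."""

    def __init__(self) -> None:
        self._container_by_game: dict = {}
        self._game_by_container: dict = {}

    def bind(self, game_id: str, container_id: str) -> None:
        if game_id in self._container_by_game:
            raise ValueError(f"game {game_id} already has a container")
        if container_id in self._game_by_container:
            raise ValueError(f"container {container_id} already serves a game")
        self._container_by_game[game_id] = container_id
        self._game_by_container[container_id] = game_id

    def unbind(self, game_id: str) -> None:
        container_id = self._container_by_game.pop(game_id)
        del self._game_by_container[container_id]

    def container_for(self, game_id: str) -> str:
        return self._container_by_game[game_id]


def test_one_to_one_invariant() -> None:
    registry = RuntimeRegistry()
    registry.bind("game-1", "ctr-a")
    for game_id, container_id in (("game-1", "ctr-b"), ("game-2", "ctr-a")):
        try:
            registry.bind(game_id, container_id)
        except ValueError:
            pass
        else:
            raise AssertionError("invariant violation was accepted")
    # After a stop, both sides of the mapping are free again.
    registry.unbind("game-1")
    registry.bind("game-2", "ctr-a")
    assert registry.container_for("game-2") == "ctr-a"
```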
8. Game Master
Service tests
-
Runtime registry tests:
- register running game
- unregister/stop game
- runtime state transitions
-
Engine version registry tests:
- version registration
- patch compatibility policy
- version-specific options
-
Runtime metadata tests:
- current turn
- runtime status
- generation status
- engine health summary
- patch state
-
Membership/runtime mapping tests:
- user_id -> engine player UUID
- game-scoped engine identifiers
-
Scheduling tests:
- scheduled turn generation
- cutoff enforcement
- manual force-next-turn
- skip-next-scheduled-slot after manual generation
-
Failure tests:
- generation_failed
- engine_unreachable
- runtime recovery from engine errors
-
Post-start administrative tests:
- stop game
- patch engine
- temporary player removal at platform gate only
- final player removal/deactivation inside engine
-
Engine mediation tests:
- engine setup after lobby metadata persistence
- engine finish notification handling
Inter-service integration tests with already implemented components
-
Gateway <-> Game Master
- running-game command routing with game_id
- runtime-admin commands for running games
- system admin vs private-owner privileges where applicable
-
Game Master <-> Lobby
- running-game registration after successful container start
- membership lookup/cached authorization
- runtime snapshot backfill into lobby
- finished-game notification to lobby
-
Game Master <-> Runtime Manager
- patch/stop/restart jobs
- runtime health event consumption
-
Game Master <-> Notification
- new turn event publication
- game finished event publication
- generation failure admin notification
-
Game Master <-> test engine container
- command proxying
- status read
- setup call
- finish callback
Regression tests to keep from this stage onward
- Game Master remains the only service allowed to call game engine containers.
- Turn cutoff logic stays authoritative at platform level.
- Manual next-turn generation always suppresses the next scheduled slot.
- Runtime snapshot compatibility with Lobby remains stable.
9. Admin Service
Service tests
-
Admin API surface tests:
- admin-only route handling
- DTO validation
- aggregation/read models
-
Orchestration tests:
- forwards trusted operations to downstream services
- error aggregation and normalization
- partial failure handling for multi-step admin workflows
-
Role-handling tests:
- admin-only enforcement assumptions
- no accidental privilege leak into normal user flows
Inter-service integration tests with already implemented components
-
Gateway <-> Admin
- separate admin REST surface
- admin-authenticated request handling
-
Admin <-> User
- user restriction/sanction/admin reads
-
Admin <-> Lobby
- public game administration
- global read of private games
-
Admin <-> Game Master
- runtime administration
- global status reads
- patch/stop/force-next-turn
-
Admin <-> Auth / Session
- session revoke/block operations if exposed through admin workflows
-
Admin <-> Notification
- admin-generated notifications where needed
Regression tests to keep from this stage onward
- Admin Service remains orchestration/backend only.
- System admin capabilities remain separate from private-owner capabilities.
10. Geo Profile Service
Service tests
-
Ingest tests:
- enqueue authenticated observation
- ingest validation
- malformed FlatBuffers payload rejection
- required-scalar-field validation
- non-blocking acceptance
-
Worker pipeline tests:
- geo lookup
- geo lookup miss handling
- country aggregation
- usual_connection_country derivation
- suspicious multi-country detection
- review recommendation calculation
- queue retry-safe processing
-
State tests:
- durable country_review_recommended
- declared-country version history
- declared-country version lifecycle: recorded, applied, sync_failed
- session block action history
-
Admin/query API tests:
- list review candidates
- stable ordering and pagination for candidate queries
- read user geo profile
- grouping by device_session_id in review/read responses
- apply approved declared-country change
-
Queue and lag tests:
- backlog observability
- duplicate observation safety
- delayed processing behavior
- retry and failure observability
Inter-service integration tests with already implemented components
-
Gateway <-> Geo
- async observation publish from authenticated request context
- fail-open edge behavior when geo ingest is unavailable
-
Geo <-> Auth / Session
- suspicious session block request
- subsequent-request effect rather than current-request effect
-
Geo <-> User
- synchronous update of current declared_country
- no divergence between history and current value
-
Geo <-> Notification
- review-recommended event fan-out
- optional admin notification flow
-
Keep geo processing fail-open relative to gameplay in all integration tests.
Regression tests to keep from this stage onward
- Geo processing never blocks the current gameplay request.
- Review-recommended state remains queryable even when event/mail side effects fail.
- Session suspicion affects only later requests via auth/session.
- Geo owns history, while user service owns current effective declared country.
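The fail-open requirement can be asserted directly: a request must succeed whether or not the observation was accepted. The sketch below assumes a hypothetical bounded in-process queue (`GeoIngest`) standing in for whatever transport the real ingest uses, and a hypothetical `handle_request` standing in for the gateway path.

```python
import queue


class GeoIngest:
    """Hypothetical non-blocking ingest: drops observations instead of waiting."""

    def __init__(self, capacity: int) -> None:
        self._queue: queue.Queue = queue.Queue(maxsize=capacity)
        self.dropped = 0

    def observe(self, observation: dict) -> None:
        try:
            self._queue.put_nowait(observation)
        except queue.Full:
            # Fail open: count the drop for observability, never raise to the caller.
            self.dropped += 1

    def backlog(self) -> int:
        return self._queue.qsize()


def handle_request(ingest: GeoIngest, user_id: str, ip: str) -> str:
    """Stand-in for an authenticated request: geo is a side effect, not a gate."""
    ingest.observe({"user_id": user_id, "ip": ip})
    return "ok"


def test_request_succeeds_when_ingest_is_saturated() -> None:
    ingest = GeoIngest(capacity=1)
    assert handle_request(ingest, "user-1", "203.0.113.7") == "ok"
    assert handle_request(ingest, "user-1", "198.51.100.9") == "ok"
    assert ingest.backlog() == 1
    assert ingest.dropped == 1
```

Counting drops rather than swallowing them silently keeps the "backlog observability" requirement testable alongside the fail-open one.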
11. Billing Service
Service tests
-
Payment event intake tests:
- accepted event types
- malformed event rejection
- idempotent duplicate handling
-
Entitlement mapping tests:
- free
- monthly-paid
- annual-paid
- once-forever-paid
-
Lifecycle tests:
- activate paid entitlement
- expire renewable entitlement
- cancel paid entitlement
- preserve perpetual entitlement
-
Failure tests:
- unknown user
- invalid payment state
- downstream user update failure
Inter-service integration tests with already implemented components
-
Billing <-> User
- entitlement updates become current source of truth in user service
-
Billing <-> Notification
- optional billing-related user/admin notifications
-
Gateway <-> User regression:
- user-facing entitlement reads reflect billing-fed updates correctly
Regression tests to keep from this stage onward
- Other services never depend directly on billing for live entitlement decisions.
- User Service remains the source of truth for current entitlement.
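The entitlement lifecycle cases (expire renewable, preserve perpetual) can be expressed as a pure function under a deterministic clock. This is a sketch only: `effective_entitlement` and the 30-day and 365-day durations are assumptions for illustration, not the billing rules of the platform.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed durations; None marks a perpetual entitlement.
PLAN_DURATIONS = {
    "monthly-paid": timedelta(days=30),
    "annual-paid": timedelta(days=365),
    "once-forever-paid": None,
}


def effective_entitlement(plan: str, activated_at: Optional[datetime], now: datetime) -> str:
    """Return the plan in effect at `now`, falling back to "free" after expiry."""
    if plan == "free" or activated_at is None:
        return "free"
    duration = PLAN_DURATIONS[plan]
    if duration is None:
        return plan
    return plan if now < activated_at + duration else "free"


def test_renewable_expires_and_perpetual_is_preserved() -> None:
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    assert effective_entitlement("monthly-paid", start, start + timedelta(days=29)) == "monthly-paid"
    assert effective_entitlement("monthly-paid", start, start + timedelta(days=31)) == "free"
    assert effective_entitlement("annual-paid", start, start + timedelta(days=364)) == "annual-paid"
    far_future = start + timedelta(days=3650)
    assert effective_entitlement("once-forever-paid", start, far_future) == "once-forever-paid"
```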
Full System Tests
These tests are added only after all major components are implemented.
By default, they should use:
- real gateway;
- real auth/session;
- real user;
- real notification;
- real lobby;
- real runtime manager;
- real game master;
- real admin;
- real geo;
- real Redis;
- stub Mail Service by default;
- test engine container or stable test engine image.
A. Authentication and session lifecycle
-
Register/login via email code through gateway.
-
Confirm that device_session_id becomes usable through gateway without synchronous auth lookups on every request.
-
Confirm that repeated confirm-email-code within the idempotency window returns the same device_session_id.
-
Revoke one session and verify:
- authenticated requests fail for that session;
- only push streams bound to that session are closed.
-
Revoke all sessions of a user and verify all sessions are rejected afterward.
B. User profile and entitlement flow
- Read and update allowed user profile fields through gateway.
- Read tariff/entitlement and user limits through gateway.
- Verify that private-party creation entitlement decisions reflect current user-service state.
- Later, verify billing-fed entitlement changes become visible through user-service reads.
C. Public game lifecycle
- Admin creates a public game.
- Users see it in public lists.
- Users apply.
- Admin approves roster.
- Lobby validates readiness.
- Runtime Manager starts container.
- Lobby persists metadata.
- Game Master registers the running game and initializes engine.
- Game becomes visible as running in user lists.
D. Private game lifecycle
- Eligible user creates private game.
- Owner creates invite code.
- Another user redeems invite code and applies.
- Owner approves application.
- Owner starts game.
- Running registration completes.
- Only authorized users see the private game.
E. Running-game command and push flow
- Player sends valid game command before cutoff.
- Gateway authenticates and routes to Game Master.
- Game Master verifies access and forwards to engine.
- Scheduled turn generation occurs.
- Player receives lightweight push notification through gateway.
- Player separately fetches updated per-player game state.
F. Force-next-turn flow
- Running game has a fixed schedule.
- Owner or admin triggers manual next-turn generation.
- Current turn increments.
- Next scheduled slot is skipped.
- Subsequent scheduled generation happens only after the following valid slot.
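The force-next-turn flow above reduces to a small state rule: a manual generation increments the turn and marks the next scheduled slot as consumed. A minimal sketch, assuming a hypothetical `TurnScheduler` driven by explicit slot callbacks rather than a real timer:

```python
class TurnScheduler:
    """Hypothetical scheduler: a manual turn suppresses the next scheduled slot."""

    def __init__(self) -> None:
        self.current_turn = 0
        self._skip_next_slot = False

    def force_next_turn(self) -> None:
        self.current_turn += 1
        self._skip_next_slot = True

    def on_scheduled_slot(self) -> bool:
        """Return True if a turn was generated for this slot."""
        if self._skip_next_slot:
            self._skip_next_slot = False
            return False
        self.current_turn += 1
        return True


def test_manual_turn_skips_exactly_one_slot() -> None:
    scheduler = TurnScheduler()
    scheduler.force_next_turn()
    assert scheduler.current_turn == 1
    # The next scheduled slot is skipped and the turn does not change.
    assert scheduler.on_scheduled_slot() is False
    assert scheduler.current_turn == 1
    # The following valid slot generates normally.
    assert scheduler.on_scheduled_slot() is True
    assert scheduler.current_turn == 2
```

How repeated manual generations before a slot should behave is not stated in this document; the sketch skips a single slot in that case, which is an assumption.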
G. Runtime failure flow
- Scheduled turn generation fails.
- Game Master marks generation_failed.
- Lobby receives updated runtime snapshot.
- Only administrators are notified through notification flow.
- Users can still observe degraded problem state through status reads.
H. Start failure and recovery flow
- Lobby requests runtime start.
- Runtime Manager starts container.
- Simulate metadata persistence failure in Lobby.
- Verify container is removed and game is not left half-started.
- Simulate successful metadata persistence but Game Master registration failure.
- Verify game is marked paused and admin is notified.
I. Temporary vs final player removal flow
- Temporarily remove player after game start.
- Verify player can no longer send commands through platform.
- Verify engine still keeps the slot.
- Final-remove or account-block the player.
- Verify Game Master sends engine admin command to deactivate/remove the player.
J. Notification routing flow
- Lobby emits invite/application/approval events.
- Notification Service sends push through gateway.
- Non-auth email notifications route through Notification Service to Mail Service.
- Auth-code emails remain direct Auth / Session -> Mail.
K. Geo auxiliary flow
- Authenticated traffic generates geo observations.
- Suspicious multi-country pattern is detected.
- Current triggering request still succeeds.
- Auth / Session blocks the suspicious session.
- Next request from that session is rejected.
L. Admin supervision flow
- System admin uses admin REST through gateway.
- Admin can view public and private games.
- Admin can inspect running-game runtime state.
- Admin can stop game, patch engine, and force next turn.
- Admin can block users and revoke sessions through appropriate downstream APIs.
Ongoing Regression Policy
-
Every time a new service is added, its service tests are mandatory before merging.
-
Every new service boundary must add at least one inter-service integration suite against already implemented neighbors.
-
Every bug found in integration or system testing must produce:
- one narrow regression test at the lowest useful level;
- and, if applicable, one broader integration or system scenario.
-
The full system suite should stay intentionally limited to high-value vertical slices, not explode into a giant matrix.
Practical Rule of Execution
-
During early development:
- run service tests on every change;
- run inter-service tests for affected neighboring services on every branch;
- run a reduced smoke subset of system tests in CI.
-
During stabilization:
- keep service and integration tests mandatory in CI;
- expand system tests around the critical product flows only.
Summary
The project-wide testing strategy is fixed as follows:
- first, service tests inside each component;
- then, as components appear, inter-service integration tests between real neighboring services;
- finally, after all major components are implemented, full system tests for complete end-to-end platform flows.
This order is mandatory for the project because the architecture contains several critical stateful and asynchronous seams:
- gateway verification and routing;
- auth/session projection into gateway cache;
- push delivery through gateway;
- Redis Streams event propagation;
- runtime job completion;
- lobby/game-master synchronization;
- geo post-factum protective actions.