developer/galaxy-game

IliaDenisov 94b7b6ce06 chore: fix platform naming

2026-04-09 12:53:08 +02:00

TESTING.md

Purpose

This document defines the testing strategy for the Galaxy Game platform and provides a staged testing matrix aligned with the agreed service implementation order.

The strategy is built around the current architecture constraints:

Edge Gateway is the single public ingress and owns the external transport, authenticated gRPC verification pipeline, routing, and push delivery.
Auth / Session Service is the source of truth for challenges and device_session, but it must not become the hot-path dependency for every authenticated request.
Geo Profile Service is asynchronous and auxiliary; it must not block the current request and only affects subsequent requests.
Internal event propagation already exists as an architectural pattern through Redis-backed cache updates and pub/sub-style flows.

Global Testing Strategy

Start with service tests for each service in isolation.
As soon as a new service is integrated with already implemented services, add inter-service integration tests for that concrete boundary.
Only after all major components are implemented, add full system tests that exercise complete end-to-end platform flows.
Do not postpone all integration testing until the end.
Do not try to replace service tests with end-to-end tests.
Keep most tests deterministic and cheap to run.
Use real Redis in integration tests where Redis is part of the service contract.
Keep Mail Service stubbed in most integration and system tests, except for a small dedicated smoke suite for the real mail adapter.
Prefer fake or test-specific implementations for external side effects until the corresponding real service is intentionally introduced.
For every new service:
- first add service tests;
- then add inter-service tests against already implemented services;
- then add regression scenarios to the growing system test suite.
For asynchronous flows:
- test both successful delivery and delayed/eventual delivery;
- test duplicate event handling;
- test retry-safe and idempotent consumption;
- test observability of stuck or failed processing.
For synchronous flows:
- test happy path, validation failures, timeout propagation, dependency unavailability, and deterministic error mapping.
Every service with an external or trusted internal API must have contract tests in addition to behavioral tests.
Every service that publishes or consumes Redis Stream events must have schema/contract tests for those event payloads.
Full system tests should be small in number but broad in vertical coverage.
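
The asynchronous-flow expectations above (duplicate event handling, retry-safe and idempotent consumption) can be illustrated with a minimal Python sketch. The `IdempotentConsumer` class, the `event_id` field, and the in-memory `seen` set are hypothetical names chosen for illustration; a real consumer would more likely track processed IDs in Redis or another durable store.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class IdempotentConsumer:
    """Applies each event at most once, keyed by a hypothetical event_id."""
    handler: Callable[[Dict], None]
    seen: Set[str] = field(default_factory=set)

    def consume(self, event: Dict) -> bool:
        event_id = event["event_id"]
        if event_id in self.seen:
            return False  # duplicate delivery: acknowledge without re-applying
        self.handler(event)  # may raise; the ID is recorded only on success
        self.seen.add(event_id)
        return True


applied: List[str] = []
consumer = IdempotentConsumer(handler=lambda e: applied.append(e["event_id"]))
first = consumer.consume({"event_id": "evt-1"})
duplicate = consumer.consume({"event_id": "evt-1"})
```

Recording the ID only after the handler succeeds keeps a failed attempt retryable, which is the property the retry-safe tests would need to assert.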

Test Layer Definitions

Service tests

Service tests verify one component in isolation.

They include:

domain/model tests;
use-case/service-layer tests;
adapter tests for storage, queues, clocks, IDs, and protocol encoding;
API handler/controller tests;
contract tests for DTOs and stable error surfaces;
service-local integration tests with owned infrastructure such as Redis.

Inter-service integration tests

Inter-service integration tests verify one real boundary between two or more already implemented services.

They include:

synchronous API compatibility;
event publication and consumption;
error propagation across service boundaries;
cache/projection compatibility;
retry and idempotency behavior across the seam;
compatibility of internal authenticated context and domain decisions.

Full system tests

Full system tests verify complete user or admin flows through the real architecture.

They include:

gateway ingress;
authentication;
user/profile state;
game lifecycle;
notifications and push;
runtime orchestration;
administrative operations;
failure and recovery behavior across multiple services.

Test Environment Rules

Use an isolated Redis instance per integration test suite or per test worker.
Use a stub Mail Service by default.
Use fake/test doubles for not-yet-implemented downstream services.
Introduce real downstream services progressively as they are implemented.
Use a test engine container or test engine stub for Game Master and Runtime Manager tests before relying on a real production engine image.
Use deterministic test clocks where scheduling or expiration matters.
Make async tests wait on observable states, not arbitrary sleeps, whenever possible.
Keep one small smoke suite for:
- real Redis;
- real runtime backend path;
- real SMTP adapter later;
- real signed gateway request/response flow.
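
Two of the rules above, deterministic test clocks and waiting on observable state instead of sleeping, might look like the following Python sketch. `FakeClock`, `wait_until`, and the 300-second TTL are illustrative assumptions and not part of the project.

```python
import time
from typing import Callable


class FakeClock:
    """Deterministic clock: time moves only when the test advances it."""

    def __init__(self, start: float = 0.0) -> None:
        self._now = start

    def now(self) -> float:
        return self._now

    def advance(self, seconds: float) -> None:
        self._now += seconds


def wait_until(predicate: Callable[[], bool], timeout: float = 2.0,
               interval: float = 0.01) -> bool:
    """Poll an observable condition until it holds or the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()  # one final check at the deadline


clock = FakeClock(start=1_000.0)
expires_at = clock.now() + 300  # made-up TTL, for illustration only
clock.advance(301)
expired = clock.now() >= expires_at
```

Code under test would take the clock as a dependency, so expiration and scheduling tests never depend on wall-clock time; `wait_until` is reserved for genuinely asynchronous effects such as stream consumption.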

Recommended Service Implementation and Testing Order

The testing plan follows this service order:

Edge Gateway Service
Auth / Session Service
User Service
Mail Service
Notification Service
Game Lobby Service
Runtime Manager
Game Master
Admin Service
Geo Profile Service
Billing Service

1. Edge Gateway Service

Service tests

Public REST routing tests:
- GET /healthz
- GET /readyz
- mounted public auth routes
- wrong-method and not-found handling
- public route-class classification for auth, browser bootstrap, browser asset, and misc traffic
- isolation of browser/public-auth rate-limit buckets
- rejection of oversized public request bodies
- RemoteAddr-based public IP derivation that ignores forwarded proxy headers
- public rate-limit behavior
- stable projection of upstream public auth errors
- sensitive-field redaction in public-auth logs
- public OpenAPI contract validation
- admin /metrics availability only on the private admin listener
Authenticated gRPC envelope validation tests:
- missing required fields
- unsupported protocol_version
- parsed envelope attachment before delegate execution
- malformed payload_hash
- mismatched payload_hash
- invalid signature
- stale timestamp
- replay detection
- unknown session
- revoked session
Session cache behavior tests:
- cache hit
- cache miss
- malformed cached record
- read-through local-cache warming after first fallback lookup
- local hit skips fallback lookup
- cache invalidation/update handling
Response signing tests:
- signed unary response generation
- unary response fails closed when the response signer is unavailable
- signed bootstrap push event generation
- bootstrap push fails closed when the response signer is unavailable
- signed stream event generation
Routing tests:
- unrouted message_type
- downstream timeout mapping
- downstream availability mapping
- authenticated internal command context construction
- verified trace/span context propagation downstream
- graceful drain of in-flight unary requests on shutdown
- sensitive transport material redaction in authenticated logs
Push tests:
- SubscribeEvents binds user_id and device_session_id
- bootstrap server-time event is emitted
- user-targeted events fan out to all matching user sessions
- session-targeted events reach only the addressed session
- stream queue overflow closes only the affected stream
- revoked session closes matching streams only
- revoked-session stream reopen is rejected
- active streams close with deterministic status on gateway shutdown
Anti-abuse tests:
- IP/session/user/message-class buckets
- interaction between rate limits and verification order
- authenticated/public anti-abuse bucket isolation
- authenticated policy-hook input and reject mapping
Redis adapter tests:
- session cache lookup
- replay reservation
- client event stream consumption
- session event stream consumption
- subscriber start-from-tail semantics
- malformed-event drop/evict-and-continue behavior
- later-event-wins behavior for session snapshots
- subscriber shutdown interrupts blocking reads
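
As a rough illustration of the envelope validation cases listed above, the Python sketch below checks protocol version, payload hash, timestamp freshness, and replay in a fixed order. It is a sketch under stated assumptions: the field names other than `protocol_version` and `payload_hash`, the SHA-256 hex encoding, the 30-second window, and the rejection strings are all hypothetical, and signature and session checks are omitted because this document does not specify the scheme.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Set

MAX_SKEW_SECONDS = 30.0   # hypothetical freshness window
SUPPORTED_VERSIONS = {1}  # hypothetical supported protocol versions


@dataclass(frozen=True)
class Envelope:
    protocol_version: int
    request_id: str      # hypothetical replay key
    timestamp: float
    payload: bytes
    payload_hash: str    # assumed to be a lowercase hex SHA-256 digest


def _is_hex_sha256(value: str) -> bool:
    return len(value) == 64 and all(c in "0123456789abcdef" for c in value)


def verify(env: Envelope, now: float, replay_store: Set[str]) -> Optional[str]:
    """Return None when the envelope passes, else a rejection reason."""
    if env.protocol_version not in SUPPORTED_VERSIONS:
        return "unsupported_protocol_version"
    if not _is_hex_sha256(env.payload_hash):
        return "malformed_payload_hash"
    if hashlib.sha256(env.payload).hexdigest() != env.payload_hash:
        return "mismatched_payload_hash"
    if abs(now - env.timestamp) > MAX_SKEW_SECONDS:
        return "stale_timestamp"
    if env.request_id in replay_store:
        return "replay_detected"
    replay_store.add(env.request_id)  # reserve only after every check passes
    return None
```

Returning a single deterministic reason per failure makes each listed case assertable on its own, and reserving the replay key last means a rejected request does not consume a reservation.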

Inter-service integration tests at this stage

Gateway <-> Redis
- session cache compatibility
- replay reservation semantics
- session update warms local cache without repeated fallback lookups
- revoked snapshot invalidates authenticated requests without fallback lookup
- client-event stream consumption for push fan-out
- session-event stream consumption for revoke propagation and push teardown
Gateway <-> stub Auth adapter
- public auth passthrough
- timeout/error projection
Gateway <-> fake downstream
- verified authenticated command routing
- signed response generation after downstream success
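
The session cache behavior exercised above (read-through warming, local hits skipping the fallback, and session updates overriding the local entry) can be sketched in Python as follows. `SessionCache`, the `fallback` callable, and the record shape are hypothetical; the real gateway reads from Redis and a session event stream.

```python
from typing import Callable, Dict, Optional

Record = Dict[str, str]


class SessionCache:
    """Local cache in front of a slower fallback lookup (illustrative only)."""

    def __init__(self, fallback: Callable[[str], Optional[Record]]) -> None:
        self._local: Dict[str, Record] = {}
        self._fallback = fallback

    def get(self, device_session_id: str) -> Optional[Record]:
        if device_session_id in self._local:
            return self._local[device_session_id]  # local hit: no fallback call
        record = self._fallback(device_session_id)
        if record is not None:
            self._local[device_session_id] = record  # warm after first lookup
        return record

    def apply_update(self, device_session_id: str, record: Record) -> None:
        """Apply a snapshot received from the session event stream."""
        self._local[device_session_id] = record
```

A test can count fallback invocations to confirm that a revoked snapshot applied through `apply_update` takes effect without another fallback lookup.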

Regression tests to keep from this stage onward

Authenticated request verification pipeline remains stable.
Public auth routes remain mounted and deterministic.
Public route classes and anti-abuse buckets remain isolated.
Admin metrics stay off the public ingress.
Push bootstrap event remains signed and schema-compatible.
Push revoke and shutdown close streams with stable status mapping.
Gateway logs remain free of sensitive request/auth material.
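
The push expectations for the gateway (user-targeted fan-out, session-targeted delivery, and overflow closing only the affected stream) are sketched below in Python. The `Stream` class, its bounded queue, and the `fan_out` signature are assumptions made for illustration and do not describe the gateway's actual implementation.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class Stream:
    """One open push stream bound to a user and a device session."""

    def __init__(self, user_id: str, device_session_id: str, capacity: int) -> None:
        self.user_id = user_id
        self.device_session_id = device_session_id
        self.capacity = capacity
        self.queue: Deque[Dict] = deque()
        self.closed = False

    def offer(self, event: Dict) -> None:
        if self.closed:
            return
        if len(self.queue) >= self.capacity:
            self.closed = True  # overflow closes this stream only
            return
        self.queue.append(event)


def fan_out(streams: List[Stream], event: Dict, user_id: str,
            device_session_id: Optional[str] = None) -> None:
    """Deliver to one session when addressed, else to all of the user's sessions."""
    for stream in streams:
        if stream.user_id != user_id:
            continue
        if device_session_id is not None and stream.device_session_id != device_session_id:
            continue
        stream.offer(event)
```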

2. Auth / Session Service

Service tests

Challenge lifecycle tests:
- challenge creation
- TTL expiration
- resend throttling
- delivery_throttled challenge creation without UserDirectory or MailSender calls
- delivery_suppressed behavior for blocked subjects
- expiry grace-window transition from challenge_expired to challenge_not_found
- delivery state transitions
- invalid confirm attempt limits
- success-shaped send-email-code behavior
Confirm flow tests:
- valid challenge_id + code + client_public_key
- malformed client_public_key
- blocked user
- existing user
- creatable user
- short-window idempotent confirm retry
- projection repair on repeated confirm after prior publish failure
- same challenge plus different public key failure
- confirm-race cleanup of superseded sessions
- session-limit exceeded
Session lifecycle tests:
- create session
- revoke one session
- revoke all sessions
- block user/email and revoke implied sessions
- already_revoked, no_active_sessions, and already_blocked acknowledgement semantics
Projection tests:
- source-of-truth session write
- gateway KV snapshot write
- gateway session stream event publish
- repeated publish idempotency
- stored session reread before publish to avoid stale active projection
Public API tests:
- JSON decoding, input validation, and invalid-request mapping
- public error mapping
- stable success DTO shape
- end-to-end public HTTP send/confirm scenarios
- timeout mapping and invalid-success-payload rejection
- stable public OpenAPI validation and gateway contract parity
- stable public error examples
- trace/metric emission and sensitive-field log redaction
Internal API tests:
- GetSession
- ListUserSessions
- RevokeDeviceSession
- RevokeAllUserSessions
- BlockUser
- path/body validation and invalid-request mapping
- end-to-end internal HTTP read/revoke/block scenarios
- timeout mapping and invalid-success-payload rejection
- stable internal OpenAPI validation and frozen mutation DTO/enums
- trace/metric emission and sensitive-field log redaction
Redis adapter tests:
- challenge store
- session store
- config provider
- projection publisher
Runtime and architecture tests:
- public/internal HTTP server lifecycle
- intentional absence of /healthz, /readyz, and /metrics
- runtime wiring for stub|rest user-service and mail-service adapters
- startup fail-fast on Redis-backed ping failure
- storage-agnostic core for domain/service/ports layers
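
The expiry grace-window transition from `challenge_expired` to `challenge_not_found` lends itself to a table-style test over a pure function. The Python sketch below assumes a hypothetical `Challenge` record with `ttl` and `grace` fields and an `active` status name; only the two error names come from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Challenge:
    created_at: float
    ttl: float    # seconds during which the challenge can be confirmed
    grace: float  # extra seconds during which expiry is still reported


def challenge_status(challenge: Challenge, now: float) -> str:
    """Classify a challenge by age; 'active' is a made-up status name."""
    age = now - challenge.created_at
    if age < challenge.ttl:
        return "active"
    if age < challenge.ttl + challenge.grace:
        return "challenge_expired"
    return "challenge_not_found"
```

Passing `now` explicitly keeps the boundary cases (one second before TTL, exactly at TTL, exactly at the end of grace) deterministic.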

Inter-service integration tests with already implemented components

Gateway <-> Auth / Session
- public send-email-code
- public confirm-email-code
- upstream timeout handling
- public error passthrough
Auth / Session <-> Redis
- challenge persistence
- session persistence
- session projection compatibility
- duplicate publish keeps gateway cache canonical
Gateway <-> Auth / Session <-> Redis
- login creates session
- session projection becomes visible to gateway
- repeated confirm repairs a previously failed projection publish
- revoked session invalidates gateway authentication path
- revoked session closes gateway push stream
- malformed client public key keeps stable client-facing error
Auth / Session <-> stub Mail
- auth code send path
- suppression path
- explicit mail failure path
Auth / Session <-> Mail REST
- sent/suppressed/failure compatibility
- blocked/throttled sends skip mail delivery
Auth / Session <-> User REST
- resolve-by-email compatibility for public send
- ensure-user compatibility for confirm
- exists/block compatibility for internal revoke/block flows

Regression tests to keep from this stage onward

confirm-email-code always returns a ready device_session_id.
Gateway continues authenticating from cache rather than synchronous auth lookups.
Confirm idempotency window behavior remains stable.
Projection repair-on-retry remains safe after source-of-truth commits.
Confirm-race cleanup does not leave multiple active winner sessions.
Projection repair continues working after process restart.
Redis reconnect on the same live process preserves recovery semantics.
Expired challenges continue returning challenge_expired during grace and challenge_not_found after TTL cleanup.
Large session-list and bulk-revoke paths remain stable.
Concurrent confirm, revoke-all, and block flows do not leak active sessions.
Session projection remains compatible with gateway expectations.

3. User Service

Service tests

User creation and identity tests:
- create user
- find by email
- normalized email uniqueness
- generated default race_name for new users
- race_name uniqueness and confusable-substitution policy
- role assignment
- tariff/entitlement fields
Profile tests:
- allowed profile reads
- allowed profile edits
- forbidden profile edits
- self-service rejection for email and declared_country mutations
- profile_update_block sanction gating for profile/settings writes
- settings reads/writes
- BCP 47 and IANA validation for settings values
Restriction/sanction tests:
- block flags
- user limits
- override fields
- declared current sanctions view
- effective sanction/limit snapshot shaping for downstream consumers
Entitlement tests:
- free user
- paid placeholder states
- default simultaneous-game limit and per-user overrides
- entitlement, sanction, and limit interaction rules
Internal/admin-oriented tests:
- resolve existing/creatable/blocked decision for auth
- ensure-by-email create-only registration_context semantics
- current declared_country read/write path
- exact lookup by user_id, normalized email, and race_name
- paginated filtered listing with deterministic ordering
Storage and API contract tests:
- public/trusted endpoints
- stable DTO mapping
- Redis persistence if used directly in v1

Inter-service integration tests with already implemented components

Auth / Session <-> User
- resolve existing user
- create new user during confirm
- blocked-by-policy outcome
Gateway <-> User
- authenticated profile read
- authenticated allowed profile update
- tariff and settings read paths
Gateway <-> Auth / Session <-> User
- first registration by email
- repeat login by same email
- blocked email/user behavior

Regression tests to keep from this stage onward

User resolution outcomes remain stable for auth flow.
User-facing profile APIs do not bypass auth/session rules.
registration_context stays create-only and does not overwrite existing users.
race_name uniqueness policy remains stable for self-service and auth-created users.
User limit and sanction data stay compatible with downstream consumers.

4. Mail Service

Service tests

Mail command validation tests:
- recipient validation
- template selection
- payload rendering
Internal queue tests:
- enqueue
- dequeue
- retry
- permanent failure
- idempotent duplicate suppression where applicable
Delivery adapter tests:
- stub adapter behavior
- future SMTP adapter smoke behavior
Operational tests:
- queue backlog metrics
- dead-letter or failure recording behavior
- timeout handling

Inter-service integration tests with already implemented components

Auth / Session <-> Mail
- direct auth-code send
- explicit mail failure behavior
- suppression path still preserves correct auth semantics
Gateway <-> Auth / Session <-> Mail
- public auth flow still behaves correctly with mail delivery involved
Keep Mail Service stubbed in most broader suites.
Add only a small dedicated smoke suite for the real mail adapter.

Regression tests to keep from this stage onward

Auth code mail remains a direct dependency of auth flow.
Mail failures do not corrupt auth challenge/session state.
Stub mail remains the default for most non-mail-focused suites.

5. Notification Service

Service tests

Event intake tests:
- accepted event types
- malformed event rejection
- idempotent duplicate handling
Routing decision tests:
- push only
- email only
- push and email
- discard/no-delivery cases
Rendering tests:
- event-to-notification mapping
- payload shaping for push
- payload shaping for email
Failure isolation tests:
- push failure does not corrupt email route decision
- email failure does not corrupt push route decision
- retriable delivery behavior
Redis/event bus tests:
- consume domain/integration events
- publish client-facing events for gateway
- enqueue mail commands for mail service
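
The routing-decision and failure-isolation cases above could be tested against something like the following Python sketch. The event type names, the `ROUTES` table, and the result strings are invented for illustration; the real routing rules are not defined in this document.

```python
from typing import Callable, Dict, FrozenSet

# Hypothetical routing table: event type -> delivery channels.
ROUTES: Dict[str, FrozenSet[str]] = {
    "game.invite_created": frozenset({"push", "email"}),
    "game.turn_generated": frozenset({"push"}),
    "game.finished": frozenset({"email"}),
}


def decide(event_type: str) -> FrozenSet[str]:
    """Unknown event types map to no delivery (the discard case)."""
    return ROUTES.get(event_type, frozenset())


def deliver(event_type: str,
            senders: Dict[str, Callable[[str], None]]) -> Dict[str, str]:
    """Attempt each channel independently so one failure cannot block another."""
    results: Dict[str, str] = {}
    for channel in sorted(decide(event_type)):
        try:
            senders[channel](event_type)
            results[channel] = "sent"
        except Exception:
            results[channel] = "failed"  # a real service would schedule a retry
    return results
```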

Inter-service integration tests with already implemented components

Notification <-> Gateway
- client-facing event publication and push delivery
- user-targeted vs session-targeted push routing
Notification <-> Mail
- non-auth email delivery
- retry/failure isolation
Lobby/other fake producers <-> Notification
- domain event intake compatibility
Assert explicitly that auth-code emails still bypass notification and go directly from auth to mail.

Regression tests to keep from this stage onward

Notification stays delivery/orchestration-only and does not become source of truth.
Non-auth notifications consistently go through notification service.
Gateway push compatibility remains stable.

6. Game Lobby Service

Service tests

Game lifecycle tests:
- draft
- enrollment_open
- enrollment_closed
- ready_to_start
- starting
- running
- paused
- finished
- cancelled
Public/private game rules:
- public game creation by admin only
- private game creation entitlement checks
- visibility rules for private games
Invite lifecycle tests:
- invite code creation
- invite code redemption
- invite approval/rejection
- invite expiration if applicable later
Application and approval tests:
- public game application
- manual approval
- duplicate application handling
Membership tests:
- invited
- pending
- accepted
- removed
- blocked from party
User list/read-model tests:
- active games
- finished games
- pending applications
- invited games
Start-preparation tests:
- roster validation
- schedule validation
- engine version target validation
- readiness to start
Runtime snapshot import tests:
- current_turn
- runtime_status
- engine_health_summary

Inter-service integration tests with already implemented components

Gateway <-> Game Lobby
- authenticated platform-level command routing
- owner-only commands before start
Lobby <-> User
- entitlement checks for private game creation
- per-user simultaneous-game limits
- sanctions affecting join/create flows
Lobby <-> Notification
- invite events
- approval/rejection events
- game status change events at platform level
Lobby <-> Auth / Session
- authenticated context correctly propagated from gateway
Keep runtime launch boundaries stubbed until Runtime Manager exists.

Regression tests to keep from this stage onward

Lobby remains source of truth for platform game metadata and membership.
Lobby user-facing game lists remain independent from Game Master.
Private-game visibility and invite semantics remain stable.

7. Runtime Manager

Service tests

Runtime job tests:
- start container
- stop container
- restart container
- patch container
- inspect/status
Invariant tests:
- one game -> one container
- one container -> one game
Monitoring tests:
- health probe collection
- health event publication
- container disappearance handling
- restart/patch result reporting
Failure tests:
- Docker API unavailable
- image missing
- startup timeout
- stop timeout
- patch failure
Event publication tests:
- runtime job completion events
- technical health events
- duplicate event safety
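
The two invariants above (one game maps to one container, and one container maps to one game) amount to a one-to-one mapping. A minimal Python sketch of a registry that enforces it follows; `RuntimeRegistry` and its method names are hypothetical.

```python
from typing import Dict, Optional


class RuntimeRegistry:
    """Keeps game <-> container bindings strictly one-to-one (illustrative)."""

    def __init__(self) -> None:
        self._by_game: Dict[str, str] = {}
        self._by_container: Dict[str, str] = {}

    def bind(self, game_id: str, container_id: str) -> None:
        if game_id in self._by_game:
            raise ValueError(f"game {game_id} already has a container")
        if container_id in self._by_container:
            raise ValueError(f"container {container_id} already serves a game")
        self._by_game[game_id] = container_id
        self._by_container[container_id] = game_id

    def unbind(self, game_id: str) -> None:
        container_id = self._by_game.pop(game_id)
        del self._by_container[container_id]

    def container_for(self, game_id: str) -> Optional[str]:
        return self._by_game.get(game_id)
```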

Inter-service integration tests with already implemented components

Lobby <-> Runtime Manager
- async start job request
- completion event consumption
- full fail-start path
Runtime Manager <-> Notification
- optional operational event routing if enabled
Use a fake or test runtime backend first, then a targeted smoke suite against a real local Docker backend.

Regression tests to keep from this stage onward

Runtime Manager remains the only component talking to Docker API.
Runtime job event contracts remain stable for Lobby and later Game Master.

8. Game Master

Service tests

Runtime registry tests:
- register running game
- unregister/stop game
- runtime state transitions
Engine version registry tests:
- version registration
- patch compatibility policy
- version-specific options
Runtime metadata tests:
- current turn
- runtime status
- generation status
- engine health summary
- patch state
Membership/runtime mapping tests:
- user_id -> engine player UUID
- game-scoped engine identifiers
Scheduling tests:
- scheduled turn generation
- cutoff enforcement
- manual force-next-turn
- skip-next-scheduled-slot after manual generation
Failure tests:
- generation_failed
- engine_unreachable
- runtime recovery from engine errors
Post-start administrative tests:
- stop game
- patch engine
- temporary player removal at platform gate only
- final player removal/deactivation inside engine
Engine mediation tests:
- engine setup after lobby metadata persistence
- engine finish notification handling
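
The scheduling rule that manual generation skips the next scheduled slot can be pinned down with a small deterministic model. The Python sketch below assumes a fixed interval and a scheduler that is ticked regularly; `TurnScheduler` and its fields are hypothetical and do not describe Game Master's real scheduler.

```python
class TurnScheduler:
    """Fixed-interval turn schedule with manual force-next-turn (illustrative)."""

    def __init__(self, interval: float, first_slot: float) -> None:
        self.interval = interval
        self.next_slot = first_slot
        self.current_turn = 0

    def tick(self, now: float) -> bool:
        """Generate a scheduled turn if the next slot has been reached."""
        if now < self.next_slot:
            return False
        self.current_turn += 1
        self.next_slot += self.interval
        return True

    def force_next_turn(self) -> None:
        """Generate a turn immediately and suppress the upcoming scheduled slot."""
        self.current_turn += 1
        self.next_slot += self.interval
```

With an interval of 60 and a first slot at 60, forcing a turn at time 30 moves the next scheduled generation to 120, so a tick at 60 produces nothing.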

Inter-service integration tests with already implemented components

Gateway <-> Game Master
- running-game command routing with game_id
- runtime-admin commands for running games
- system admin vs private-owner privileges where applicable
Game Master <-> Lobby
- running-game registration after successful container start
- membership lookup/cached authorization
- runtime snapshot backfill into lobby
- finished-game notification to lobby
Game Master <-> Runtime Manager
- patch/stop/restart jobs
- runtime health event consumption
Game Master <-> Notification
- new turn event publication
- game finished event publication
- generation failure admin notification
Game Master <-> test engine container
- command proxying
- status read
- setup call
- finish callback

Regression tests to keep from this stage onward

Game Master remains the only service allowed to call game engine containers.
Turn cutoff logic stays authoritative at platform level.
Manual next-turn generation always suppresses the next scheduled slot.
Runtime snapshot compatibility with Lobby remains stable.

9. Admin Service

Service tests

Admin API surface tests:
- admin-only route handling
- DTO validation
- aggregation/read models
Orchestration tests:
- forwards trusted operations to downstream services
- error aggregation and normalization
- partial failure handling for multi-step admin workflows
Role-handling tests:
- admin-only enforcement assumptions
- no accidental privilege leak into normal user flows

Inter-service integration tests with already implemented components

Gateway <-> Admin
- separate admin REST surface
- admin-authenticated request handling
Admin <-> User
- user restriction/sanction/admin reads
Admin <-> Lobby
- public game administration
- global read of private games
Admin <-> Game Master
- runtime administration
- global status reads
- patch/stop/force-next-turn
Admin <-> Auth / Session
- session revoke/block operations if exposed through admin workflows
Admin <-> Notification
- admin-generated notifications where needed

Regression tests to keep from this stage onward

Admin Service remains orchestration/backend only.
System admin capabilities remain separate from private-owner capabilities.

10. Geo Profile Service

Service tests

Ingest tests:
- enqueue authenticated observation
- ingest validation
- malformed FlatBuffers payload rejection
- required-scalar-field validation
- non-blocking acceptance
Worker pipeline tests:
- geo lookup
- geo lookup miss handling
- country aggregation
- usual_connection_country derivation
- suspicious multi-country detection
- review recommendation calculation
- queue retry-safe processing
State tests:
- durable country_review_recommended
- declared-country version history
- declared-country version lifecycle: recorded, applied, sync_failed
- session block action history
Admin/query API tests:
- list review candidates
- stable ordering and pagination for candidate queries
- read user geo profile
- grouping by device_session_id in review/read responses
- apply approved declared-country change
Queue and lag tests:
- backlog observability
- duplicate observation safety
- delayed processing behavior
- retry and failure observability
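
For the worker pipeline cases above, a pure-function sketch in Python shows one way usual_connection_country derivation and suspicious multi-country detection could be made testable. The "most frequent country" rule and the `max_countries` threshold are assumptions for illustration; the actual derivation and detection rules are not specified here.

```python
from collections import Counter
from typing import List, Optional


def usual_connection_country(observed: List[str]) -> Optional[str]:
    """Assumed rule: the most frequently observed country, or None if no data."""
    if not observed:
        return None
    return Counter(observed).most_common(1)[0][0]


def is_suspicious(observed: List[str], max_countries: int = 2) -> bool:
    """Assumed rule: flag when more distinct countries appear than allowed."""
    return len(set(observed)) > max_countries
```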

Inter-service integration tests with already implemented components

Gateway <-> Geo
- async observation publish from authenticated request context
- fail-open edge behavior when geo ingest is unavailable
Geo <-> Auth / Session
- suspicious session block request
- subsequent-request effect rather than current-request effect
Geo <-> User
- synchronous update of current declared_country
- no divergence between history and current value
Geo <-> Notification
- review-recommended event fan-out
- optional admin notification flow
Keep geo processing fail-open relative to gameplay in all integration tests.

Regression tests to keep from this stage onward

Geo processing never blocks the current gameplay request.
Review-recommended state remains queryable even when event/mail side effects fail.
Session suspicion affects only later requests via auth/session.
Geo owns history, while user service owns current effective declared country.

11. Billing Service

Service tests

Payment event intake tests:
- accepted event types
- malformed event rejection
- idempotent duplicate handling
Entitlement mapping tests:
- free
- monthly-paid
- annual-paid
- once-forever-paid
Lifecycle tests:
- activate paid entitlement
- expire renewable entitlement
- cancel paid entitlement
- preserve perpetual entitlement
Failure tests:
- unknown user
- invalid payment state
- downstream user update failure
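
The lifecycle cases above distinguish renewable entitlements, which expire, from the perpetual one, which must be preserved. A minimal Python sketch follows; the `Entitlement` record and the fallback to `free` on expiry are assumptions, while the four plan names come from the entitlement mapping list.

```python
from dataclasses import dataclass
from typing import Optional

RENEWABLE_PLANS = {"monthly-paid", "annual-paid"}


@dataclass(frozen=True)
class Entitlement:
    plan: str                          # free, monthly-paid, annual-paid, once-forever-paid
    expires_at: Optional[float] = None


def effective_plan(entitlement: Entitlement, now: float) -> str:
    """Renewable plans lapse to 'free' after expiry (assumed); others persist."""
    lapsed = (
        entitlement.plan in RENEWABLE_PLANS
        and entitlement.expires_at is not None
        and now >= entitlement.expires_at
    )
    return "free" if lapsed else entitlement.plan
```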

Inter-service integration tests with already implemented components

Billing <-> User
- entitlement updates become current source of truth in user service
Billing <-> Notification
- optional billing-related user/admin notifications
Gateway <-> User regression
- user-facing entitlement reads reflect billing-fed updates correctly

Regression tests to keep from this stage onward

Other services never depend directly on billing for live entitlement decisions.
User Service remains the source of truth for current entitlement.

Full System Tests

These tests are added only after all major components are implemented.

By default, they should use:

real gateway;
real auth/session;
real user;
real notification;
real lobby;
real runtime manager;
real game master;
real admin;
real geo;
real Redis;
stub Mail Service by default;
test engine container or stable test engine image.

A. Authentication and session lifecycle

Register/login via email code through gateway.
Confirm that device_session_id becomes usable through gateway without synchronous auth lookups on every request.
Confirm that repeated confirm-email-code within the idempotency window returns the same device_session_id.
Revoke one session and verify:
- authenticated requests fail for that session;
- only push streams bound to that session are closed.
Revoke all sessions of a user and verify all sessions are rejected afterward.

B. User profile and entitlement flow

Read and update allowed user profile fields through gateway.
Read tariff/entitlement and user limits through gateway.
Verify that private-party creation entitlement decisions reflect current user-service state.
Later, verify billing-fed entitlement changes become visible through user-service reads.

C. Public game lifecycle

Admin creates a public game.
Users see it in public lists.
Users apply.
Admin approves roster.
Lobby validates readiness.
Runtime Manager starts container.
Lobby persists metadata.
Game Master registers the running game and initializes engine.
Game becomes visible as running in user lists.

D. Private game lifecycle

Eligible user creates private game.
Owner creates invite code.
Another user redeems invite code and applies.
Owner approves application.
Owner starts game.
Running registration completes.
Only authorized users see the private game.

E. Running-game command and push flow

Player sends valid game command before cutoff.
Gateway authenticates and routes to Game Master.
Game Master verifies access and forwards to engine.
Scheduled turn generation occurs.
Player receives lightweight push notification through gateway.
Player separately fetches updated per-player game state.

F. Force-next-turn flow

Running game has a fixed schedule.
Owner or admin triggers manual next-turn generation.
Current turn increments.
Next scheduled slot is skipped.
Subsequent scheduled generation happens only after the following valid slot.

G. Runtime failure flow

Scheduled turn generation fails.
Game Master marks generation_failed.
Lobby receives updated runtime snapshot.
Only administrators are notified through notification flow.
Users can still observe the degraded state through status reads.

H. Start failure and recovery flow

Lobby requests runtime start.
Runtime Manager starts container.
Simulate metadata persistence failure in Lobby.
Verify container is removed and game is not left half-started.
Simulate successful metadata persistence but Game Master registration failure.
Verify game is marked paused and admin is notified.
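
The two failure branches of this flow can be modeled as a small orchestration function with compensation steps, which makes the expected side effects easy to assert with fakes. In the Python sketch below, `start_game`, the collaborator method names, and the `start_failed` and `running` result strings are hypothetical; only the `paused` outcome and the container removal come from the flow above.

```python
from typing import Any, Callable


def start_game(game_id: str, runtime: Any, lobby: Any, game_master: Any,
               notify_admin: Callable[[str], None]) -> str:
    """Start a game and compensate when a later step fails (illustrative)."""
    container_id = runtime.start(game_id)
    try:
        lobby.persist_metadata(game_id, container_id)
    except Exception:
        runtime.remove(container_id)  # do not leave a half-started game
        return "start_failed"
    try:
        game_master.register(game_id, container_id)
    except Exception:
        lobby.mark_paused(game_id)    # metadata exists, so pause instead of removing
        notify_admin(game_id)
        return "paused"
    return "running"
```

A system test would assert the same outcomes through the real services, while a service-level test can drive this shape with fakes that fail on demand.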

I. Temporary vs final player removal flow

Temporarily remove player after game start.
Verify player can no longer send commands through platform.
Verify engine still keeps the slot.
Final-remove or account-block the player.
Verify Game Master sends engine admin command to deactivate/remove the player.

J. Notification routing flow

Lobby emits invite/application/approval events.
Notification Service sends push through gateway.
Non-auth email notifications route through Notification Service to Mail Service.
Auth-code emails remain direct Auth / Session -> Mail.

K. Geo auxiliary flow

Authenticated traffic generates geo observations.
Suspicious multi-country pattern is detected.
Current triggering request still succeeds.
Auth / Session blocks the suspicious session.
Next request from that session is rejected.

L. Admin supervision flow

System admin uses admin REST through gateway.
Admin can view public and private games.
Admin can inspect running-game runtime state.
Admin can stop game, patch engine, and force next turn.
Admin can block users and revoke sessions through appropriate downstream APIs.

Ongoing Regression Policy

Every time a new service is added, its service tests are mandatory before merging.
Every new service boundary must add at least one inter-service integration suite against already implemented neighbors.
Every bug found in integration or system testing must produce:
- one narrow regression test at the lowest useful level;
- and, if applicable, one broader integration or system scenario.
The full system suite should stay intentionally limited to high-value vertical slices rather than grow into an exhaustive matrix.

Practical Rule of Execution

During early development:
- run service tests on every change;
- run inter-service tests for affected neighboring services on every branch;
- run a reduced smoke subset of system tests in CI.
During stabilization:
- keep service and integration tests mandatory in CI;
- expand system tests around the critical product flows only.

Summary

The project-wide testing strategy is fixed as follows:

first, service tests inside each component;
then, as components appear, inter-service integration tests between real neighboring services;
finally, after all major components are implemented, full system tests for complete end-to-end platform flows.

This order is mandatory for the project because the architecture contains several critical stateful and asynchronous seams:

gateway verification and routing;
auth/session projection into gateway cache;
push delivery through gateway;
Redis Streams event propagation;
runtime job completion;
lobby/game-master synchronization;
geo post-factum protective actions.
