TESTING.md

Purpose

This document defines the testing strategy for the Galaxy Plus platform and provides a staged testing matrix aligned with the agreed service implementation order.

The strategy is built around the current architecture constraints:

  • Edge Gateway is the single public ingress and owns the external transport, the authenticated gRPC verification pipeline, routing, and push delivery.
  • Auth / Session Service is the source of truth for challenges and device_session, but it must not become the hot-path dependency for every authenticated request.
  • Geo Profile Service is asynchronous and auxiliary; it must not block the current request and only affects subsequent requests.
  • Internal event propagation already exists as an architectural pattern through Redis-backed cache updates and pub/sub-style flows.

Global Testing Strategy

  • Start with service tests for each service in isolation.

  • As soon as a new service is integrated with already implemented services, add inter-service integration tests for that concrete boundary.

  • Only after all major components are implemented, add full system tests that exercise complete end-to-end platform flows.

  • Do not postpone all integration testing until the end.

  • Do not try to replace service tests with end-to-end tests.

  • Keep most tests deterministic and cheap to run.

  • Use real Redis in integration tests where Redis is part of the service contract.

  • Keep Mail Service stubbed in most integration and system tests, except for a small dedicated smoke suite for the real mail adapter.

  • Prefer fake or test-specific implementations for external side effects until the corresponding real service is intentionally introduced.

  • For every new service:

    • first add service tests;
    • then add inter-service tests against already implemented services;
    • then add regression scenarios to the growing system test suite.
  • For asynchronous flows:

    • test both successful delivery and delayed/eventual delivery;
    • test duplicate event handling;
    • test retry-safe and idempotent consumption;
    • test observability of stuck or failed processing.
  • For synchronous flows:

    • test happy path, validation failures, timeout propagation, dependency unavailability, and deterministic error mapping.
  • Every service with an external or trusted internal API must have contract tests in addition to behavioral tests.

  • Every service that publishes or consumes Redis Streams events must have schema/contract tests for those event payloads.

  • Full system tests should be small in number but broad in vertical coverage.
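
The asynchronous-flow rules above (duplicate event handling, retry-safe and idempotent consumption) can be exercised against a consumer that remembers which events it has already applied. The following is a minimal sketch in Go, assuming every event carries a unique ID; `Event`, `Consumer`, and `Handle` are hypothetical names used for illustration and are not taken from any Galaxy Plus service.

```go
package main

import "fmt"

// Event is a hypothetical event shape; the ID field is an assumption.
type Event struct {
	ID      string
	Payload string
}

// Consumer applies each event at most once, keyed by event ID.
// It is not safe for concurrent use.
type Consumer struct {
	seen    map[string]bool
	Applied []string // payloads that were actually applied, in order
}

func NewConsumer() *Consumer {
	return &Consumer{seen: make(map[string]bool)}
}

// Handle returns true if the event was applied and false if it was
// recognized as a duplicate and skipped.
func (c *Consumer) Handle(e Event) bool {
	if c.seen[e.ID] {
		return false
	}
	c.seen[e.ID] = true
	c.Applied = append(c.Applied, e.Payload)
	return true
}

func main() {
	c := NewConsumer()
	events := []Event{
		{ID: "e1", Payload: "invite"},
		{ID: "e1", Payload: "invite"}, // redelivery of the same event
		{ID: "e2", Payload: "approve"},
	}
	for _, e := range events {
		fmt.Println(e.ID, c.Handle(e))
	}
	fmt.Println(len(c.Applied)) // 2
}
```

A duplicate-handling test then only needs to deliver the same event twice and assert that the observable state changed once.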

Test Layer Definitions

Service tests

Service tests verify one component in isolation.

They include:

  • domain/model tests;
  • use-case/service-layer tests;
  • adapter tests for storage, queues, clocks, IDs, and protocol encoding;
  • API handler/controller tests;
  • contract tests for DTOs and stable error surfaces;
  • service-local integration tests with owned infrastructure such as Redis.

Inter-service integration tests

Inter-service integration tests verify one real boundary between two or more already implemented services.

They include:

  • synchronous API compatibility;
  • event publication and consumption;
  • error propagation across service boundaries;
  • cache/projection compatibility;
  • retry and idempotency behavior across the seam;
  • compatibility of internal authenticated context and domain decisions.

Full system tests

Full system tests verify complete user or admin flows through the real architecture.

They include:

  • gateway ingress;
  • authentication;
  • user/profile state;
  • game lifecycle;
  • notifications and push;
  • runtime orchestration;
  • administrative operations;
  • failure and recovery behavior across multiple services.

Test Environment Rules

  • Use an isolated Redis instance per integration test suite or per test worker.

  • Use a stub Mail Service by default.

  • Use fake/test doubles for not-yet-implemented downstream services.

  • Introduce real downstream services progressively as they are implemented.

  • Use a test engine container or test engine stub for Game Master and Runtime Manager tests before relying on a real production engine image.

  • Use deterministic test clocks where scheduling or expiration matters.

  • Make async tests wait on observable states, not arbitrary sleeps, whenever possible.

  • Keep one small smoke suite for:

    • real Redis;
    • real runtime backend path;
    • real SMTP adapter later;
    • real signed gateway request/response flow.
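
Two of the rules above, deterministic test clocks and waiting on observable state instead of sleeping, can be sketched as small test helpers. This is an illustrative Go sketch only; `FakeClock`, `Expired`, and `Eventually` are hypothetical helper names, not existing project code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// FakeClock is a hypothetical manually advanced clock for tests.
type FakeClock struct{ now time.Time }

func (c *FakeClock) Now() time.Time          { return c.now }
func (c *FakeClock) Advance(d time.Duration) { c.now = c.now.Add(d) }

// Expired reports whether the deadline has been reached on the given clock.
func Expired(c *FakeClock, deadline time.Time) bool {
	return !c.Now().Before(deadline)
}

// Eventually polls cond until it returns true or the timeout elapses.
// It bounds the wait instead of relying on one fixed sleep.
func Eventually(cond func() bool, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return nil
		}
		if time.Now().After(deadline) {
			return errors.New("condition not met before timeout")
		}
		time.Sleep(interval)
	}
}

func main() {
	// Expiration is driven by the fake clock, not by wall time.
	clock := &FakeClock{now: time.Unix(0, 0)}
	ttl := clock.Now().Add(5 * time.Minute)
	fmt.Println(Expired(clock, ttl)) // false
	clock.Advance(5 * time.Minute)
	fmt.Println(Expired(clock, ttl)) // true

	// Wait for an asynchronous effect by observing its state.
	done := make(chan struct{})
	go func() { close(done) }()
	err := Eventually(func() bool {
		select {
		case <-done:
			return true
		default:
			return false
		}
	}, time.Second, time.Millisecond)
	fmt.Println(err) // <nil>
}
```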

The testing plan follows this service order:

  • Edge Gateway Service
  • Auth / Session Service
  • User Service
  • Mail Service
  • Notification Service
  • Game Lobby Service
  • Runtime Manager
  • Game Master
  • Admin Service
  • Geo Profile Service
  • Billing Service

1. Edge Gateway Service

Service tests

  • Public REST routing tests:

    • GET /healthz
    • GET /readyz
    • mounted public auth routes
    • wrong-method and not-found handling
    • public route-class classification for auth, browser bootstrap, browser asset, and misc traffic
    • isolation of browser/public-auth rate-limit buckets
    • rejection of oversized public request bodies
    • RemoteAddr-based public IP derivation that ignores forwarded proxy headers
    • public rate-limit behavior
    • stable projection of upstream public auth errors
    • sensitive-field redaction in public-auth logs
    • public OpenAPI contract validation
    • admin /metrics availability only on the private admin listener
  • Authenticated gRPC envelope validation tests:

    • missing required fields
    • unsupported protocol_version
    • parsed envelope attachment before delegate execution
    • malformed payload_hash
    • mismatched payload_hash
    • invalid signature
    • stale timestamp
    • replay detection
    • unknown session
    • revoked session
  • Session cache behavior tests:

    • cache hit
    • cache miss
    • malformed cached record
    • read-through local-cache warming after first fallback lookup
    • local hit skips fallback lookup
    • cache invalidation/update handling
  • Response signing tests:

    • signed unary response generation
    • unary response fails closed when the response signer is unavailable
    • signed bootstrap push event generation
    • bootstrap push fails closed when the response signer is unavailable
    • signed stream event generation
  • Routing tests:

    • unrouted message_type
    • downstream timeout mapping
    • downstream availability mapping
    • authenticated internal command context construction
    • verified trace/span context propagation downstream
    • graceful drain of in-flight unary requests on shutdown
    • sensitive transport material redaction in authenticated logs
  • Push tests:

    • SubscribeEvents binds user_id and device_session_id
    • bootstrap server-time event is emitted
    • user-targeted events fan out to all matching user sessions
    • session-targeted events reach only the addressed session
    • stream queue overflow closes only the affected stream
    • revoked session closes matching streams only
    • revoked-session stream reopen is rejected
    • active streams close with deterministic status on gateway shutdown
  • Anti-abuse tests:

    • IP/session/user/message-class buckets
    • interaction between rate limits and verification order
    • authenticated/public anti-abuse bucket isolation
    • authenticated policy-hook input and reject mapping
  • Redis adapter tests:

    • session cache lookup
    • replay reservation
    • client event stream consumption
    • session event stream consumption
    • subscriber start-from-tail semantics
    • malformed-event drop/evict-and-continue behavior
    • later-event-wins behavior for session snapshots
    • subscriber shutdown interrupts blocking reads

Inter-service integration tests at this stage

  • Gateway <-> Redis

    • session cache compatibility
    • replay reservation semantics
    • session update warms local cache without repeated fallback lookups
    • revoked snapshot invalidates authenticated requests without fallback lookup
    • client-event stream consumption for push fan-out
    • session-event stream consumption for revoke propagation and push teardown
  • Gateway <-> stub Auth adapter

    • public auth passthrough
    • timeout/error projection
  • Gateway <-> fake downstream

    • verified authenticated command routing
    • signed response generation after downstream success
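
The cache expectations in the Gateway <-> Redis list (a local cache warmed after the first fallback lookup, later hits skipping the fallback, and a revoked snapshot taking effect without a new lookup) can be illustrated with a small read-through cache. This is a sketch under assumptions, written in Go; `Session`, `Lookup`, and `LocalCache` are hypothetical names, and the fallback function stands in for whatever Redis read the gateway actually performs.

```go
package main

import "fmt"

// Session is a hypothetical cached session record.
type Session struct {
	ID      string
	Revoked bool
}

// Lookup is the fallback source, for example a Redis read (assumption).
type Lookup func(id string) (Session, bool)

// LocalCache is a read-through cache in front of a fallback lookup.
// It is not safe for concurrent use.
type LocalCache struct {
	entries  map[string]Session
	fallback Lookup
}

func NewLocalCache(fallback Lookup) *LocalCache {
	return &LocalCache{entries: make(map[string]Session), fallback: fallback}
}

// Get serves from the local map and consults the fallback only on a miss.
func (c *LocalCache) Get(id string) (Session, bool) {
	if s, ok := c.entries[id]; ok {
		return s, true
	}
	s, ok := c.fallback(id)
	if ok {
		c.entries[id] = s
	}
	return s, ok
}

// Apply stores a session snapshot delivered by an update event.
func (c *LocalCache) Apply(s Session) { c.entries[s.ID] = s }

func main() {
	calls := 0
	cache := NewLocalCache(func(id string) (Session, bool) {
		calls++
		return Session{ID: id}, true
	})
	cache.Get("s1")
	cache.Get("s1")
	fmt.Println(calls) // 1

	// A revoke snapshot replaces the entry without another fallback call.
	cache.Apply(Session{ID: "s1", Revoked: true})
	s, _ := cache.Get("s1")
	fmt.Println(s.Revoked, calls) // true 1
}
```

Counting fallback calls, as `main` does, is one way for a test to assert "without repeated fallback lookups" directly.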

Regression tests to keep from this stage onward

  • Authenticated request verification pipeline remains stable.
  • Public auth routes remain mounted and deterministic.
  • Public route classes and anti-abuse buckets remain isolated.
  • Admin metrics stay off the public ingress.
  • Push bootstrap event remains signed and schema-compatible.
  • Push revoke and shutdown close streams with stable status mapping.
  • Gateway logs remain free of sensitive request/auth material.

2. Auth / Session Service

Service tests

  • Challenge lifecycle tests:

    • challenge creation
    • TTL expiration
    • resend throttling
    • delivery_throttled challenge creation without UserDirectory or MailSender calls
    • delivery_suppressed behavior for blocked subjects
    • expiry grace-window transition from challenge_expired to challenge_not_found
    • delivery state transitions
    • invalid confirm attempt limits
    • success-shaped send-email-code behavior
  • Confirm flow tests:

    • valid challenge_id + code + client_public_key
    • malformed client_public_key
    • blocked user
    • existing user
    • creatable user
    • short-window idempotent confirm retry
    • projection repair on repeated confirm after prior publish failure
    • same challenge plus different public key failure
    • confirm-race cleanup of superseded sessions
    • session-limit exceeded
  • Session lifecycle tests:

    • create session
    • revoke one session
    • revoke all sessions
    • block user/email and revoke implied sessions
    • already_revoked, no_active_sessions, and already_blocked acknowledgement semantics
  • Projection tests:

    • source-of-truth session write
    • gateway KV snapshot write
    • gateway session stream event publish
    • repeated publish idempotency
    • stored session reread before publish to avoid stale active projection
  • Public API tests:

    • JSON decoding and unknown field rejection
    • public error mapping
    • stable success DTO shape
  • Internal API tests:

    • GetSession
    • ListUserSessions
    • RevokeDeviceSession
    • RevokeAllUserSessions
    • BlockUser
  • Redis adapter tests:

    • challenge store
    • session store
    • config provider
    • projection publisher

Inter-service integration tests with already implemented components

  • Gateway <-> Auth / Session

    • public send-email-code
    • public confirm-email-code
    • upstream timeout handling
    • public error passthrough
  • Auth / Session <-> Redis

    • challenge persistence
    • session persistence
    • session projection compatibility
  • Gateway <-> Auth / Session <-> Redis

    • login creates session
    • session projection becomes visible to gateway
    • repeated confirm repairs a previously failed projection publish
    • revoked session invalidates gateway authentication path
    • revoked session closes gateway push stream
  • Auth / Session <-> stub Mail

    • auth code send path
    • suppression path
    • explicit mail failure path

Regression tests to keep from this stage onward

  • confirm-email-code always returns a ready device_session_id.
  • Gateway continues authenticating from cache rather than synchronous auth lookups.
  • Confirm idempotency window behavior remains stable.
  • Projection repair-on-retry remains safe after source-of-truth commits.
  • Confirm-race cleanup does not leave multiple active winner sessions.
  • Session projection remains compatible with gateway expectations.

3. User Service

Service tests

  • User creation and identity tests:

    • create user
    • find by email
    • normalized email uniqueness
    • generated default race_name for new users
    • race_name uniqueness and confusable-substitution policy
    • role assignment
    • tariff/entitlement fields
  • Profile tests:

    • allowed profile reads
    • allowed profile edits
    • forbidden profile edits
    • self-service rejection for e-mail and declared_country mutations
    • profile_update_block sanction gating for profile/settings writes
    • settings reads/writes
    • BCP 47 and IANA validation for settings values
  • Restriction/sanction tests:

    • block flags
    • user limits
    • override fields
    • declared current sanctions view
    • effective sanction/limit snapshot shaping for downstream consumers
  • Entitlement tests:

    • free user
    • paid placeholder states
    • default simultaneous-game limit and per-user overrides
    • entitlement, sanction, and limit interaction rules
  • Internal/admin-oriented tests:

    • resolve existing/creatable/blocked decision for auth
    • ensure-by-email create-only registration_context semantics
    • current declared_country read/write path
    • exact lookup by user_id, normalized email, and race_name
    • paginated filtered listing with deterministic ordering
  • Storage and API contract tests:

    • public/trusted endpoints
    • stable DTO mapping
    • Redis persistence if used directly in v1

Inter-service integration tests with already implemented components

  • Auth / Session <-> User

    • resolve existing user
    • create new user during confirm
    • blocked-by-policy outcome
  • Gateway <-> User

    • authenticated profile read
    • authenticated allowed profile update
    • tariff and settings read paths
  • Gateway <-> Auth / Session <-> User

    • first registration by email
    • repeat login by same email
    • blocked email/user behavior

Regression tests to keep from this stage onward

  • User resolution outcomes remain stable for auth flow.
  • User-facing profile APIs do not bypass auth/session rules.
  • registration_context stays create-only and does not overwrite existing users.
  • race_name uniqueness policy remains stable for self-service and auth-created users.
  • User limit and sanction data stay compatible with downstream consumers.

4. Mail Service

Service tests

  • Mail command validation tests:

    • recipient validation
    • template selection
    • payload rendering
  • Internal queue tests:

    • enqueue
    • dequeue
    • retry
    • permanent failure
    • idempotent duplicate suppression where applicable
  • Delivery adapter tests:

    • stub adapter behavior
    • future SMTP adapter smoke behavior
  • Operational tests:

    • queue backlog metrics
    • dead-letter or failure recording behavior
    • timeout handling

Inter-service integration tests with already implemented components

  • Auth / Session <-> Mail

    • direct auth-code send
    • explicit mail failure behavior
    • suppression path still preserves correct auth semantics
  • Gateway <-> Auth / Session <-> Mail

    • public auth flow still behaves correctly with mail delivery involved
  • Keep Mail Service stubbed in most broader suites.

  • Add only a small dedicated smoke suite for the real mail adapter.

Regression tests to keep from this stage onward

  • Auth code mail remains a direct dependency of auth flow.
  • Mail failures do not corrupt auth challenge/session state.
  • Stub mail remains the default for most non-mail-focused suites.

5. Notification Service

Service tests

  • Event intake tests:

    • accepted event types
    • malformed event rejection
    • idempotent duplicate handling
  • Routing decision tests:

    • push only
    • email only
    • push and email
    • discard/no-delivery cases
  • Rendering tests:

    • event-to-notification mapping
    • payload shaping for push
    • payload shaping for email
  • Failure isolation tests:

    • push failure does not corrupt email route decision
    • email failure does not corrupt push route decision
    • retriable delivery behavior
  • Redis/event bus tests:

    • consume domain/integration events
    • publish client-facing events for gateway
    • enqueue mail commands for mail service

Inter-service integration tests with already implemented components

  • Notification <-> Gateway

    • client-facing event publication and push delivery
    • user-targeted vs session-targeted push routing
  • Notification <-> Mail

    • non-auth email delivery
    • retry/failure isolation
  • Lobby/other fake producers <-> Notification

    • domain event intake compatibility
  • Assert explicitly that auth-code emails still bypass notification and go directly from auth to mail.

Regression tests to keep from this stage onward

  • Notification stays delivery/orchestration-only and does not become source of truth.
  • Non-auth notifications consistently go through notification service.
  • Gateway push compatibility remains stable.

6. Game Lobby Service

Service tests

  • Game lifecycle tests:

    • draft
    • enrollment_open
    • enrollment_closed
    • ready_to_start
    • starting
    • running
    • paused
    • finished
    • cancelled
  • Public/private game rules:

    • public game creation by admin only
    • private game creation entitlement checks
    • visibility rules for private games
  • Invite lifecycle tests:

    • invite code creation
    • invite code redemption
    • invite approval/rejection
    • invite expiration if applicable later
  • Application and approval tests:

    • public game application
    • manual approval
    • duplicate application handling
  • Membership tests:

    • invited
    • pending
    • accepted
    • removed
    • blocked from party
  • User list/read-model tests:

    • active games
    • finished games
    • pending applications
    • invited games
  • Start-preparation tests:

    • roster validation
    • schedule validation
    • engine version target validation
    • readiness to start
  • Runtime snapshot import tests:

    • current_turn
    • runtime_status
    • engine_health_summary

Inter-service integration tests with already implemented components

  • Gateway <-> Game Lobby

    • authenticated platform-level command routing
    • owner-only commands before start
  • Lobby <-> User

    • entitlement checks for private game creation
    • per-user simultaneous-game limits
    • sanctions affecting join/create flows
  • Lobby <-> Notification

    • invite events
    • approval/rejection events
    • game status change events at platform level
  • Lobby <-> Auth / Session

    • authenticated context correctly propagated from gateway
  • Keep runtime launch boundaries stubbed until Runtime Manager exists.

Regression tests to keep from this stage onward

  • Lobby remains source of truth for platform game metadata and membership.
  • Lobby user-facing game lists remain independent from Game Master.
  • Private-game visibility and invite semantics remain stable.

7. Runtime Manager

Service tests

  • Runtime job tests:

    • start container
    • stop container
    • restart container
    • patch container
    • inspect/status
  • Invariant tests:

    • one game -> one container
    • one container -> one game
  • Monitoring tests:

    • health probe collection
    • health event publication
    • container disappearance handling
    • restart/patch result reporting
  • Failure tests:

    • Docker API unavailable
    • image missing
    • startup timeout
    • stop timeout
    • patch failure
  • Event publication tests:

    • runtime job completion events
    • technical health events
    • duplicate event safety

Inter-service integration tests with already implemented components

  • Lobby <-> Runtime Manager

    • async start job request
    • completion event consumption
    • full fail-start path
  • Runtime Manager <-> Notification

    • optional operational event routing if enabled
  • Use a fake or test runtime backend first, then a targeted smoke suite against a real local Docker backend.

Regression tests to keep from this stage onward

  • Runtime Manager remains the only component talking to Docker API.
  • Runtime job event contracts remain stable for Lobby and later Game Master.

8. Game Master

Service tests

  • Runtime registry tests:

    • register running game
    • unregister/stop game
    • runtime state transitions
  • Engine version registry tests:

    • version registration
    • patch compatibility policy
    • version-specific options
  • Runtime metadata tests:

    • current turn
    • runtime status
    • generation status
    • engine health summary
    • patch state
  • Membership/runtime mapping tests:

    • user_id -> engine player UUID
    • game-scoped engine identifiers
  • Scheduling tests:

    • scheduled turn generation
    • cutoff enforcement
    • manual force-next-turn
    • skip-next-scheduled-slot after manual generation
  • Failure tests:

    • generation_failed
    • engine_unreachable
    • runtime recovery from engine errors
  • Post-start administrative tests:

    • stop game
    • patch engine
    • temporary player removal at platform gate only
    • final player removal/deactivation inside engine
  • Engine mediation tests:

    • engine setup after lobby metadata persistence
    • engine finish notification handling

Inter-service integration tests with already implemented components

  • Gateway <-> Game Master

    • running-game command routing with game_id
    • runtime-admin commands for running games
    • system admin vs private-owner privileges where applicable
  • Game Master <-> Lobby

    • running-game registration after successful container start
    • membership lookup/cached authorization
    • runtime snapshot backfill into lobby
    • finished-game notification to lobby
  • Game Master <-> Runtime Manager

    • patch/stop/restart jobs
    • runtime health event consumption
  • Game Master <-> Notification

    • new turn event publication
    • game finished event publication
    • generation failure admin notification
  • Game Master <-> test engine container

    • command proxying
    • status read
    • setup call
    • finish callback

Regression tests to keep from this stage onward

  • Game Master remains the only service allowed to call game engine containers.
  • Turn cutoff logic stays authoritative at platform level.
  • Manual next-turn generation always suppresses the next scheduled slot.
  • Runtime snapshot compatibility with Lobby remains stable.

9. Admin Service

Service tests

  • Admin API surface tests:

    • admin-only route handling
    • DTO validation
    • aggregation/read models
  • Orchestration tests:

    • forwards trusted operations to downstream services
    • error aggregation and normalization
    • partial failure handling for multi-step admin workflows
  • Role-handling tests:

    • admin-only enforcement assumptions
    • no accidental privilege leak into normal user flows

Inter-service integration tests with already implemented components

  • Gateway <-> Admin

    • separate admin REST surface
    • admin-authenticated request handling
  • Admin <-> User

    • user restriction/sanction/admin reads
  • Admin <-> Lobby

    • public game administration
    • global read of private games
  • Admin <-> Game Master

    • runtime administration
    • global status reads
    • patch/stop/force-next-turn
  • Admin <-> Auth / Session

    • session revoke/block operations if exposed through admin workflows
  • Admin <-> Notification

    • admin-generated notifications where needed

Regression tests to keep from this stage onward

  • Admin Service remains orchestration/backend only.
  • System admin capabilities remain separate from private-owner capabilities.

10. Geo Profile Service

Service tests

  • Ingest tests:

    • enqueue authenticated observation
    • ingest validation
    • malformed FlatBuffers payload rejection
    • required-scalar-field validation
    • non-blocking acceptance
  • Worker pipeline tests:

    • geo lookup
    • geo lookup miss handling
    • country aggregation
    • usual_connection_country derivation
    • suspicious multi-country detection
    • review recommendation calculation
    • queue retry-safe processing
  • State tests:

    • durable country_review_recommended
    • declared-country version history
    • declared-country version lifecycle: recorded, applied, sync_failed
    • session block action history
  • Admin/query API tests:

    • list review candidates
    • stable ordering and pagination for candidate queries
    • read user geo profile
    • grouping by device_session_id in review/read responses
    • apply approved declared-country change
  • Queue and lag tests:

    • backlog observability
    • duplicate observation safety
    • delayed processing behavior
    • retry and failure observability

Inter-service integration tests with already implemented components

  • Gateway <-> Geo

    • async observation publish from authenticated request context
    • fail-open edge behavior when geo ingest is unavailable
  • Geo <-> Auth / Session

    • suspicious session block request
    • subsequent-request effect rather than current-request effect
  • Geo <-> User

    • synchronous update of current declared_country
    • no divergence between history and current value
  • Geo <-> Notification

    • review-recommended event fan-out
    • optional admin notification flow
  • Keep geo processing fail-open relative to gameplay in all integration tests.
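
The fail-open requirement can be made concrete with a non-blocking enqueue: the request path offers an observation to a bounded queue and proceeds whether or not the offer succeeds. The Go sketch below is an illustration under assumptions; `Observation`, `Ingest`, `Offer`, and `HandleRequest` are hypothetical names, and the real gateway may publish observations through a different transport.

```go
package main

import "fmt"

// Observation is a hypothetical geo observation.
type Observation struct {
	UserID string
	IP     string
}

// Ingest is a bounded, non-blocking intake queue.
// The Dropped counter is not safe for concurrent use in this sketch.
type Ingest struct {
	queue   chan Observation
	Dropped int
}

func NewIngest(capacity int) *Ingest {
	return &Ingest{queue: make(chan Observation, capacity)}
}

// Offer enqueues without blocking and reports whether it was accepted.
func (i *Ingest) Offer(o Observation) bool {
	select {
	case i.queue <- o:
		return true
	default:
		i.Dropped++
		return false
	}
}

// HandleRequest shows the fail-open rule: the outcome of Offer never
// changes the response to the current request.
func HandleRequest(i *Ingest, o Observation) string {
	i.Offer(o)
	return "ok"
}

func main() {
	in := NewIngest(1)
	fmt.Println(HandleRequest(in, Observation{UserID: "u1", IP: "203.0.113.7"}))
	// The queue is now full, yet the request still succeeds.
	fmt.Println(HandleRequest(in, Observation{UserID: "u1", IP: "203.0.113.8"}))
	fmt.Println(in.Dropped) // 1
}
```

Counting dropped observations keeps the backlog observable, which the queue and lag tests above also call for.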

Regression tests to keep from this stage onward

  • Geo processing never blocks the current gameplay request.
  • Review-recommended state remains queryable even when event/mail side effects fail.
  • Session suspicion affects only later requests via auth/session.
  • Geo owns history, while user service owns current effective declared country.

11. Billing Service

Service tests

  • Payment event intake tests:

    • accepted event types
    • malformed event rejection
    • idempotent duplicate handling
  • Entitlement mapping tests:

    • free
    • monthly-paid
    • annual-paid
    • once-forever-paid
  • Lifecycle tests:

    • activate paid entitlement
    • expire renewable entitlement
    • cancel paid entitlement
    • preserve perpetual entitlement
  • Failure tests:

    • unknown user
    • invalid payment state
    • downstream user update failure

Inter-service integration tests with already implemented components

  • Billing <-> User

    • entitlement updates become current source of truth in user service
  • Billing <-> Notification

    • optional billing-related user/admin notifications
  • Gateway <-> User regression

    • user-facing entitlement reads reflect billing-fed updates correctly

Regression tests to keep from this stage onward

  • Other services never depend directly on billing for live entitlement decisions.
  • User Service remains the source of truth for current entitlement.

Full System Tests

These tests are added only after all major components are implemented.

By default, they should use:

  • real gateway;
  • real auth/session;
  • real user;
  • real notification;
  • real lobby;
  • real runtime manager;
  • real game master;
  • real admin;
  • real geo;
  • real Redis;
  • stub Mail Service;
  • test engine container or stable test engine image.

A. Authentication and session lifecycle

  • Register/login via email code through gateway.

  • Confirm that device_session_id becomes usable through gateway without synchronous auth lookups on every request.

  • Confirm that repeated confirm-email-code within the idempotency window returns the same device_session_id.

  • Revoke one session and verify:

    • authenticated requests fail for that session;
    • only push streams bound to that session are closed.
  • Revoke all sessions of a user and verify all sessions are rejected afterward.

B. User profile and entitlement flow

  • Read and update allowed user profile fields through gateway.
  • Read tariff/entitlement and user limits through gateway.
  • Verify that private-party creation entitlement decisions reflect current user-service state.
  • Later, verify billing-fed entitlement changes become visible through user-service reads.

C. Public game lifecycle

  • Admin creates a public game.
  • Users see it in public lists.
  • Users apply.
  • Admin approves roster.
  • Lobby validates readiness.
  • Runtime Manager starts container.
  • Lobby persists metadata.
  • Game Master registers the running game and initializes engine.
  • Game becomes visible as running in user lists.

D. Private game lifecycle

  • Eligible user creates private game.
  • Owner creates invite code.
  • Another user redeems invite code and applies.
  • Owner approves application.
  • Owner starts game.
  • Running registration completes.
  • Only authorized users see the private game.

E. Running-game command and push flow

  • Player sends valid game command before cutoff.
  • Gateway authenticates and routes to Game Master.
  • Game Master verifies access and forwards to engine.
  • Scheduled turn generation occurs.
  • Player receives lightweight push notification through gateway.
  • Player separately fetches updated per-player game state.

F. Force-next-turn flow

  • Running game has a fixed schedule.
  • Owner or admin triggers manual next-turn generation.
  • Current turn increments.
  • Next scheduled slot is skipped.
  • Subsequent scheduled generation happens only after the following valid slot.
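
The force-next-turn steps above can be sketched as a small scheduler model: a manual generation increments the turn and marks the next scheduled slot as skipped, so the following scheduled generation happens one slot later. This Go sketch assumes evenly spaced slots starting at a fixed time; `Scheduler`, `NextSlot`, `ForceNextTurn`, and `OnSlot` are hypothetical names and not the Game Master API.

```go
package main

import (
	"fmt"
	"time"
)

// Scheduler models a fixed schedule with slots at Start + k*Interval.
type Scheduler struct {
	Start    time.Time
	Interval time.Duration
	Turn     int

	skip    time.Time // slot to skip after a manual generation
	hasSkip bool
}

// NextSlot returns the first scheduled slot strictly after t.
func (s *Scheduler) NextSlot(t time.Time) time.Time {
	if t.Before(s.Start) {
		return s.Start
	}
	n := t.Sub(s.Start)/s.Interval + 1
	return s.Start.Add(n * s.Interval)
}

// ForceNextTurn generates a turn now and marks the next slot as skipped.
func (s *Scheduler) ForceNextTurn(now time.Time) {
	s.Turn++
	s.skip, s.hasSkip = s.NextSlot(now), true
}

// OnSlot is called when a scheduled slot fires. It reports whether a
// turn was generated for that slot.
func (s *Scheduler) OnSlot(slot time.Time) bool {
	if s.hasSkip && slot.Equal(s.skip) {
		s.hasSkip = false
		return false
	}
	s.Turn++
	return true
}

func main() {
	start := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	s := &Scheduler{Start: start, Interval: time.Hour}

	s.ForceNextTurn(start.Add(30 * time.Minute)) // manual turn at 00:30

	generated := s.OnSlot(start.Add(time.Hour)) // 01:00 slot is skipped
	fmt.Println(generated, s.Turn)              // false 1

	generated = s.OnSlot(start.Add(2 * time.Hour)) // 02:00 slot runs
	fmt.Println(generated, s.Turn)                 // true 2
}
```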

G. Runtime failure flow

  • Scheduled turn generation fails.
  • Game Master marks generation_failed.
  • Lobby receives updated runtime snapshot.
  • Only administrators are notified through notification flow.
  • Users can still observe the degraded state through status reads.

H. Start failure and recovery flow

  • Lobby requests runtime start.
  • Runtime Manager starts container.
  • Simulate metadata persistence failure in Lobby.
  • Verify container is removed and game is not left half-started.
  • Simulate successful metadata persistence but Game Master registration failure.
  • Verify game is marked paused and admin is notified.

I. Temporary vs final player removal flow

  • Temporarily remove player after game start.
  • Verify player can no longer send commands through platform.
  • Verify engine still keeps the slot.
  • Perform final removal of the player, or block the player's account.
  • Verify Game Master sends engine admin command to deactivate/remove the player.

J. Notification routing flow

  • Lobby emits invite/application/approval events.
  • Notification Service sends push through gateway.
  • Non-auth email notifications route through Notification Service to Mail Service.
  • Auth-code emails remain direct Auth / Session -> Mail.

K. Geo auxiliary flow

  • Authenticated traffic generates geo observations.
  • Suspicious multi-country pattern is detected.
  • Current triggering request still succeeds.
  • Auth / Session blocks the suspicious session.
  • Next request from that session is rejected.

L. Admin supervision flow

  • System admin uses admin REST through gateway.
  • Admin can view public and private games.
  • Admin can inspect running-game runtime state.
  • Admin can stop game, patch engine, and force next turn.
  • Admin can block users and revoke sessions through appropriate downstream APIs.

Ongoing Regression Policy

  • Every time a new service is added, its service tests are mandatory before merging.

  • Every new service boundary must add at least one inter-service integration suite against already implemented neighbors.

  • Every bug found in integration or system testing must produce:

    • one narrow regression test at the lowest useful level;
    • and, if applicable, one broader integration or system scenario.
  • The full system suite should stay intentionally limited to high-value vertical slices, not explode into a giant matrix.

Practical Rule of Execution

  • During early development:

    • run service tests on every change;
    • run inter-service tests for affected neighboring services on every branch;
    • run a reduced smoke subset of system tests in CI.
  • During stabilization:

    • keep service and integration tests mandatory in CI;
    • expand system tests around the critical product flows only.

Summary

The project-wide testing strategy is fixed as follows:

  • first, service tests inside each component;
  • then, as components appear, inter-service integration tests between real neighboring services;
  • finally, after all major components are implemented, full system tests for complete end-to-end platform flows.

This order is mandatory for the project because the architecture contains several critical stateful and asynchronous seams:

  • gateway verification and routing;
  • auth/session projection into gateway cache;
  • push delivery through gateway;
  • Redis Streams event propagation;
  • runtime job completion;
  • lobby/game-master synchronization;
  • geo post-factum protective actions.