# Runtime and Components
The diagram below focuses on the deployed `galaxy/rtmanager` process
and its runtime dependencies. The current-state contract for every
listener, worker, and adapter lives in [../README.md](../README.md);
this document is the navigation aid that points at the right code path
and the right design-rationale record.
```mermaid
flowchart LR
subgraph Clients
GM["Game Master"]
Admin["Admin Service"]
Lobby["Game Lobby"]
end
subgraph RTM["Runtime Manager process"]
InternalHTTP["Internal HTTP listener\n:8096 /healthz /readyz + REST"]
StartJobs["startjobsconsumer"]
StopJobs["stopjobsconsumer"]
DockerEvents["dockerevents listener"]
HealthProbe["healthprobe worker"]
DockerInspect["dockerinspect worker"]
Reconcile["reconcile worker"]
Cleanup["containercleanup worker"]
Services["lifecycle services\n(start, stop, restart, patch, cleanupcontainer)"]
IntentPublisher["notification:intents publisher"]
ResultsPublisher["runtime:job_results publisher"]
HealthPublisher["runtime:health_events publisher"]
Telemetry["Logs, traces, metrics"]
end
Docker["Docker Daemon"]
Engine["galaxy-game-{game_id} container"]
Postgres["PostgreSQL\nschema rtmanager"]
Redis["Redis\nstreams + leases + offsets"]
LobbyHTTP["Lobby internal HTTP"]
Lobby -. runtime:start_jobs .-> StartJobs
Lobby -. runtime:stop_jobs .-> StopJobs
GM --> InternalHTTP
Admin --> InternalHTTP
StartJobs --> Services
StopJobs --> Services
InternalHTTP --> Services
Services --> Docker
Services --> Postgres
Services --> Redis
Services --> ResultsPublisher
Services --> HealthPublisher
Services --> IntentPublisher
Services -. GET diagnostic .-> LobbyHTTP
DockerEvents --> Docker
DockerInspect --> Docker
HealthProbe --> Engine
Reconcile --> Docker
Reconcile --> Postgres
Cleanup --> Postgres
Cleanup --> Services
DockerEvents --> HealthPublisher
DockerInspect --> HealthPublisher
HealthProbe --> HealthPublisher
HealthPublisher --> Redis
ResultsPublisher --> Redis
IntentPublisher --> Redis
StartJobs --> Redis
StopJobs --> Redis
InternalHTTP --> Postgres
Docker -->|create / start / stop / rm| Engine
Engine -. bind mount .- StateDir["host:\n<RTMANAGER_GAME_STATE_ROOT>/{game_id}"]
InternalHTTP --> Telemetry
Services --> Telemetry
StartJobs --> Telemetry
StopJobs --> Telemetry
DockerEvents --> Telemetry
HealthProbe --> Telemetry
DockerInspect --> Telemetry
Reconcile --> Telemetry
Cleanup --> Telemetry
```
Notes:
- `cmd/rtmanager` refuses startup when PostgreSQL is unreachable, when goose migrations fail, when Redis ping fails, when the Docker daemon ping fails, or when the configured Docker network is missing (the gate is sketched below this list). Lobby reachability is not verified at boot — the start service's diagnostic `GET /api/v1/internal/games/{game_id}` call is a no-op outside of debug logging (services.md §7).
- The reconciler runs synchronously once on startup before `app.App.Run` registers any other component, then re-runs periodically as a regular `Component`. The synchronous pass is why the events listener can never observe an orphaned container from a prior process that has no PG record (workers.md §17).
- A single internal HTTP listener exposes both probes (`/healthz`, `/readyz`) and the trusted REST surface for Game Master and Admin Service. There is no public listener — RTM does not face end users.
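The boot gate translates naturally into a sequence of fail-fast checks. The sketch below is illustrative only, assuming `database/sql`, go-redis v9, the Docker Engine SDK, and `pressly/goose/v3`; client construction, the migrations directory, and the exact SDK signatures (which vary by Docker SDK version) are assumptions, not RTM's actual code.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
	"github.com/pressly/goose/v3"
	"github.com/redis/go-redis/v9"
)

// bootChecks sketches the cmd/rtmanager boot gate described above:
// any failed dependency check aborts startup.
func bootChecks(ctx context.Context, db *sql.DB, rdb *redis.Client,
	docker *client.Client, dockerNetwork string) error {
	if err := db.PingContext(ctx); err != nil {
		return fmt.Errorf("postgres unreachable: %w", err)
	}
	if err := goose.Up(db, "migrations"); err != nil { // migrations dir is illustrative
		return fmt.Errorf("goose migrations failed: %w", err)
	}
	if err := rdb.Ping(ctx).Err(); err != nil {
		return fmt.Errorf("redis ping failed: %w", err)
	}
	if _, err := docker.Ping(ctx); err != nil {
		return fmt.Errorf("docker daemon ping failed: %w", err)
	}
	if _, err := docker.NetworkInspect(ctx, dockerNetwork, types.NetworkInspectOptions{}); err != nil {
		return fmt.Errorf("docker network %q missing: %w", dockerNetwork, err)
	}
	return nil
}
```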
## Listeners
| Listener | Default addr | Purpose |
|---|---|---|
| Internal HTTP | `:8096` | Probes (`/healthz`, `/readyz`) plus the trusted REST surface for Game Master and Admin Service |
Shared listener defaults from `RTMANAGER_INTERNAL_HTTP_*` (applied in the sketch below):

- read timeout: `5s`
- write timeout: `15s`
- idle timeout: `60s`
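Applied to Go's standard `net/http` server, these map onto the three `http.Server` timeout fields. A minimal sketch; the exact env-var names under the `RTMANAGER_INTERNAL_HTTP_*` prefix and the `getDuration` helper are assumptions:

```go
package main

import (
	"net/http"
	"os"
	"time"
)

// getDuration is a hypothetical helper: parse an env var as a
// time.Duration, falling back to the documented default when it is
// unset or malformed.
func getDuration(key string, def time.Duration) time.Duration {
	if v := os.Getenv(key); v != "" {
		if d, err := time.ParseDuration(v); err == nil {
			return d
		}
	}
	return def
}

// newInternalServer applies the shared listener defaults to the single
// internal HTTP server.
func newInternalServer(mux *http.ServeMux) *http.Server {
	return &http.Server{
		Addr:         os.Getenv("RTMANAGER_INTERNAL_HTTP_ADDR"), // :8096 by default
		Handler:      mux,
		ReadTimeout:  getDuration("RTMANAGER_INTERNAL_HTTP_READ_TIMEOUT", 5*time.Second),
		WriteTimeout: getDuration("RTMANAGER_INTERNAL_HTTP_WRITE_TIMEOUT", 15*time.Second),
		IdleTimeout:  getDuration("RTMANAGER_INTERNAL_HTTP_IDLE_TIMEOUT", 60*time.Second),
	}
}
```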
The listener is unauthenticated and assumes a trusted network segment.
The `X-Galaxy-Caller` request header carries an optional caller
identity (`gm` or `admin`) that the handler records as
`operation_log.op_source` (services.md §18).
Probe routes:
- `GET /healthz` — process liveness; returns `{"status":"ok"}` while the listener is up.
- `GET /readyz` — live-pings PostgreSQL primary, Redis master, and the Docker daemon, then asserts the configured Docker network exists. Returns `{"status":"ready"}` only when every check passes; otherwise returns `503` with the canonical error envelope.
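A minimal sketch of that `/readyz` contract. The check wiring, the per-request timeout, and the failure payload are illustrative; the real handler calls the pgx, go-redis, and Docker SDK ping methods, and the canonical error envelope is richer than the one shown:

```go
package main

import (
	"context"
	"encoding/json"
	"net/http"
	"time"
)

// readyCheck pairs a dependency name with its liveness probe; the
// Docker-network-exists assertion would be one more check in the slice.
type readyCheck struct {
	name string
	ping func(context.Context) error
}

// readyzHandler live-pings every dependency on each request; the first
// failure short-circuits into a 503.
func readyzHandler(checks []readyCheck) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
		defer cancel()
		w.Header().Set("Content-Type", "application/json")
		for _, c := range checks {
			if err := c.ping(ctx); err != nil {
				w.WriteHeader(http.StatusServiceUnavailable)
				json.NewEncoder(w).Encode(map[string]string{
					"status":       "unready",
					"failed_check": c.name, // illustrative envelope field
				})
				return
			}
		}
		json.NewEncoder(w).Encode(map[string]string{"status": "ready"})
	}
}
```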
## Background Workers
Every worker runs as an `app.Component` and is registered in the
order below by `internal/app/runtime.go`; a sketch of the component
contract follows the table and its accompanying note.
| Worker | Source | Trigger | Function |
|---|---|---|---|
| Start jobs consumer | `internal/worker/startjobsconsumer` | Redis `XREAD` on `runtime:start_jobs` | Decodes `{game_id, image_ref, requested_at_ms}` and invokes `startruntime.Service`; publishes the outcome to `runtime:job_results` |
| Stop jobs consumer | `internal/worker/stopjobsconsumer` | Redis `XREAD` on `runtime:stop_jobs` | Decodes `{game_id, reason, requested_at_ms}` and invokes `stopruntime.Service`; publishes the outcome to `runtime:job_results` |
| Docker events listener | `internal/worker/dockerevents` | Docker `/events` API filtered by `com.galaxy.owner=rtmanager` | Emits `runtime:health_events` for `container_exited`, `container_oom`, `container_disappeared`. Reconnects on transport errors with a fixed 5s backoff (workers.md §7) |
| Health probe worker | `internal/worker/healthprobe` | Periodic `RTMANAGER_PROBE_INTERVAL` | `GET {engine_endpoint}/healthz` for every running runtime; in-memory hysteresis emits `probe_failed` after `RTMANAGER_PROBE_FAILURES_THRESHOLD` consecutive failures and `probe_recovered` on the first success thereafter (workers.md §5–§6) |
| Docker inspect worker | `internal/worker/dockerinspect` | Periodic `RTMANAGER_INSPECT_INTERVAL` | Calls `InspectContainer` for every running runtime; emits `inspect_unhealthy` on `RestartCount` growth, unexpected status, or Docker `HEALTHCHECK=unhealthy` |
| Reconciler | `internal/worker/reconcile` | Synchronous startup pass + periodic `RTMANAGER_RECONCILE_INTERVAL` | Adopts unrecorded containers (`reconcile_adopt`), disposes records whose container vanished (`reconcile_dispose`), records observed exits (`observed_exited`); every mutation runs under the per-game lease (workers.md §14–§15) |
| Container cleanup | `internal/worker/containercleanup` | Periodic `RTMANAGER_CLEANUP_INTERVAL` | Lists `runtime_records` rows with `status=stopped AND last_op_at < now - retention`, delegates to `cleanupcontainer.Service` per game (workers.md §19) |
The events listener and the inspect worker do not emit
`container_started` — that event is owned by the start service
(workers.md §1). The events listener and the inspect
worker also do not emit `container_disappeared` autonomously when a
record is missing or stale; the conditional emission rules live in
workers.md §2 and §4.
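This document does not reproduce the component contract itself. The sketch below shows one plausible shape for it; the interface, the `errgroup` wiring, and the method names are assumptions rather than the real `internal/app` code. Per the boot notes above, the reconciler's synchronous pass completes before `app.App.Run` starts any of these components.

```go
// Package app sketch: how workers could plug into the process lifecycle.
package app

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// Component is one long-running worker: Run blocks until the context is
// cancelled, then returns.
type Component interface {
	Name() string
	Run(ctx context.Context) error
}

// App runs every registered Component concurrently and stops all of
// them when any one fails or the parent context ends.
type App struct {
	components []Component
}

func (a *App) Register(c Component) {
	a.components = append(a.components, c)
}

func (a *App) Run(ctx context.Context) error {
	g, ctx := errgroup.WithContext(ctx)
	for _, c := range a.components {
		c := c // capture for the goroutine
		g.Go(func() error { return c.Run(ctx) })
	}
	return g.Wait()
}
```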
## Lifecycle Services
The five lifecycle services are pure orchestrators called from both the stream consumers and the REST handlers. Each service owns the per-game lease for the duration of its operation.
| Service | Source | Triggers | Failure envelope |
|---|---|---|---|
| `startruntime` | `internal/service/startruntime` | `runtime:start_jobs`, `POST /api/v1/internal/runtimes/{id}/start` | `start_config_invalid`, `image_pull_failed`, `container_start_failed`, `conflict`, `service_unavailable`, `internal_error` (services.md §4) |
| `stopruntime` | `internal/service/stopruntime` | `runtime:stop_jobs`, `POST /api/v1/internal/runtimes/{id}/stop` | `conflict`, `service_unavailable`, `internal_error`, `not_found` (services.md §17) |
| `restartruntime` | `internal/service/restartruntime` | `POST /api/v1/internal/runtimes/{id}/restart` | Inherited from inner stop / start; lease covers both inner ops (services.md §12, §17) |
| `patchruntime` | `internal/service/patchruntime` | `POST /api/v1/internal/runtimes/{id}/patch` | `image_ref_not_semver`, `semver_patch_only`, plus inherited start/stop codes (services.md §14, §17) |
| `cleanupcontainer` | `internal/service/cleanupcontainer` | `DELETE /api/v1/internal/runtimes/{id}/container`, periodic cleanup worker | `not_found`, `conflict`, `service_unavailable`, `internal_error` (services.md §17) |
All services share three behaviours captured in services.md (the first two are sketched after this list):

- the per-game Redis lease (`rtmanager:game_lease:{game_id}`, TTL `RTMANAGER_GAME_LEASE_TTL_SECONDS`) is acquired by the service, not by the caller — which keeps consumer and REST callers symmetric (services.md §1);
- the canonical `Result` shape (`Outcome`, `ErrorCode`, `Record`, `ContainerID`, `EngineEndpoint`) is what consumers and REST handlers translate into job_results / HTTP responses (services.md §3);
- failures pass through one `operation_log` write before returning, and three of the failure codes (`start_config_invalid`, `image_pull_failed`, `container_start_failed`) also publish a `runtime.*` admin notification intent (services.md §4).
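A minimal sketch of the lease and the `Result` shape, assuming the go-redis client; the field types and the lease value are illustrative:

```go
package service

import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

// Result mirrors the canonical shape listed above; the field types are
// assumptions.
type Result struct {
	Outcome        string
	ErrorCode      string
	Record         any // the runtime_records row, left untyped here
	ContainerID    string
	EngineEndpoint string
}

// acquireGameLease sketches the per-game lease: a single SET ... NX PX
// on rtmanager:game_lease:{game_id}. It returns false when another
// operation currently holds the lease.
func acquireGameLease(ctx context.Context, rdb *redis.Client,
	gameID string, ttl time.Duration) (bool, error) {
	key := fmt.Sprintf("rtmanager:game_lease:%s", gameID)
	return rdb.SetNX(ctx, key, "locked", ttl).Result() // SET key locked NX PX <ttl>
}
```

Because the service acquires the lease itself, a consumer-delivered job and a REST call contend identically; neither path needs caller-side coordination.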
## Synchronous Upstream Client
| Client | Endpoint | Failure mapping |
|---|---|---|
| Game Lobby internal | `GET {RTMANAGER_LOBBY_INTERNAL_BASE_URL}/api/v1/internal/games/{game_id}` | Diagnostic-only in v1; the start service ignores the body and absorbs network failures with a debug log. Decision: services.md §7 |
The Lobby transport is the only synchronous outbound client RTM holds; every other interaction (Notification Service, Game Master, Admin Service) crosses an asynchronous boundary or is initiated by the peer. A sketch of the client follows.
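The struct layout, logger wiring, and method name below are assumptions; only the diagnostic-only semantics (body discarded, failures reduced to a debug log) come from this document and services.md §7.

```go
package lobby

import (
	"context"
	"fmt"
	"io"
	"log/slog"
	"net/http"
)

// Client holds RTM's only synchronous outbound transport. Field names
// are illustrative.
type Client struct {
	baseURL string       // RTMANAGER_LOBBY_INTERNAL_BASE_URL
	httpc   *http.Client // Timeout: RTMANAGER_LOBBY_INTERNAL_TIMEOUT
	log     *slog.Logger
}

// DiagnosticGame performs the v1 diagnostic GET: the response body is
// discarded and every failure collapses into a debug log.
func (c *Client) DiagnosticGame(ctx context.Context, gameID string) {
	url := fmt.Sprintf("%s/api/v1/internal/games/%s", c.baseURL, gameID)
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		c.log.Debug("lobby diagnostic request build failed", "game_id", gameID, "err", err)
		return
	}
	resp, err := c.httpc.Do(req)
	if err != nil {
		c.log.Debug("lobby diagnostic call failed", "game_id", gameID, "err", err)
		return
	}
	defer resp.Body.Close()
	io.Copy(io.Discard, resp.Body) // body is deliberately ignored in v1
}
```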
## Stream Offsets
Each consumer persists its position under a fixed label so that a process restart preserves stream progress.
| Stream | Offset key | Block timeout env |
|---|---|---|
| `runtime:start_jobs` | `rtmanager:stream_offsets:startjobs` | `RTMANAGER_STREAM_BLOCK_TIMEOUT` |
| `runtime:stop_jobs` | `rtmanager:stream_offsets:stopjobs` | `RTMANAGER_STREAM_BLOCK_TIMEOUT` |
The labels `startjobs` and `stopjobs` are stable identifiers — they
are decoupled from the underlying stream key. An operator who renames
a stream via `RTMANAGER_REDIS_START_JOBS_STREAM` /
`RTMANAGER_REDIS_STOP_JOBS_STREAM` does not lose the persisted offset.
Decision: workers.md §9. The consumer loop sketched below illustrates the mechanism.
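A sketch of the offset handling, assuming go-redis v9; the helper name, the initial `0-0` position, and the per-message offset write are illustrative details this document does not pin down:

```go
package worker

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

// consumeStream reads one stream with XREAD, persisting the last-seen
// entry ID under the stable label key so a restart resumes where it
// left off even if the stream key itself was renamed.
func consumeStream(ctx context.Context, rdb *redis.Client,
	stream, label string, handle func(redis.XMessage) error) error {
	offsetKey := "rtmanager:stream_offsets:" + label
	last, err := rdb.Get(ctx, offsetKey).Result()
	if err == redis.Nil {
		last = "0-0" // first run: start from the beginning
	} else if err != nil {
		return err
	}
	for ctx.Err() == nil {
		res, err := rdb.XRead(ctx, &redis.XReadArgs{
			Streams: []string{stream, last},
			Block:   5 * time.Second, // RTMANAGER_STREAM_BLOCK_TIMEOUT in the real worker
		}).Result()
		if err == redis.Nil {
			continue // block timeout elapsed with no new entries
		}
		if err != nil {
			return err
		}
		for _, msg := range res[0].Messages {
			if err := handle(msg); err != nil {
				return err
			}
			last = msg.ID
			if err := rdb.Set(ctx, offsetKey, last, 0).Err(); err != nil {
				return err
			}
		}
	}
	return ctx.Err()
}
```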
The `runtime:job_results`, `runtime:health_events`, and
`notification:intents` streams are outbound; RTM does not consume them
itself.
## Configuration Groups
The full env-var list with defaults lives in
[../README.md](../README.md) §Configuration. The groups below
summarise the structure:
- **Required** — `RTMANAGER_INTERNAL_HTTP_ADDR`, `RTMANAGER_POSTGRES_PRIMARY_DSN`, `RTMANAGER_REDIS_MASTER_ADDR`, `RTMANAGER_REDIS_PASSWORD`, `RTMANAGER_DOCKER_HOST`, `RTMANAGER_DOCKER_NETWORK`, `RTMANAGER_GAME_STATE_ROOT`.
- **Listener** — `RTMANAGER_INTERNAL_HTTP_*` timeouts.
- **Docker** — `RTMANAGER_DOCKER_HOST`, `RTMANAGER_DOCKER_API_VERSION`, `RTMANAGER_DOCKER_NETWORK`, `RTMANAGER_DOCKER_LOG_DRIVER`, `RTMANAGER_DOCKER_LOG_OPTS`, `RTMANAGER_IMAGE_PULL_POLICY`.
- **Container defaults** — `RTMANAGER_DEFAULT_CPU_QUOTA`, `RTMANAGER_DEFAULT_MEMORY`, `RTMANAGER_DEFAULT_PIDS_LIMIT`, `RTMANAGER_CONTAINER_STOP_TIMEOUT_SECONDS`, `RTMANAGER_CONTAINER_RETENTION_DAYS`, `RTMANAGER_ENGINE_STATE_MOUNT_PATH`, `RTMANAGER_ENGINE_STATE_ENV_NAME`, `RTMANAGER_GAME_STATE_DIR_MODE`, `RTMANAGER_GAME_STATE_OWNER_UID`, `RTMANAGER_GAME_STATE_OWNER_GID`.
- **PostgreSQL connectivity** — `RTMANAGER_POSTGRES_PRIMARY_DSN`, `RTMANAGER_POSTGRES_REPLICA_DSNS`, `RTMANAGER_POSTGRES_OPERATION_TIMEOUT`, `RTMANAGER_POSTGRES_MAX_OPEN_CONNS`, `RTMANAGER_POSTGRES_MAX_IDLE_CONNS`, `RTMANAGER_POSTGRES_CONN_MAX_LIFETIME`.
- **Redis connectivity** — `RTMANAGER_REDIS_MASTER_ADDR`, `RTMANAGER_REDIS_REPLICA_ADDRS`, `RTMANAGER_REDIS_PASSWORD`, `RTMANAGER_REDIS_DB`, `RTMANAGER_REDIS_OPERATION_TIMEOUT`.
- **Streams** — `RTMANAGER_REDIS_START_JOBS_STREAM`, `RTMANAGER_REDIS_STOP_JOBS_STREAM`, `RTMANAGER_REDIS_JOB_RESULTS_STREAM`, `RTMANAGER_REDIS_HEALTH_EVENTS_STREAM`, `RTMANAGER_NOTIFICATION_INTENTS_STREAM`, `RTMANAGER_STREAM_BLOCK_TIMEOUT`.
- **Health monitoring** — `RTMANAGER_INSPECT_INTERVAL`, `RTMANAGER_PROBE_INTERVAL`, `RTMANAGER_PROBE_TIMEOUT`, `RTMANAGER_PROBE_FAILURES_THRESHOLD`.
- **Reconciler / cleanup** — `RTMANAGER_RECONCILE_INTERVAL`, `RTMANAGER_CLEANUP_INTERVAL`.
- **Coordination** — `RTMANAGER_GAME_LEASE_TTL_SECONDS`.
- **Lobby internal client** — `RTMANAGER_LOBBY_INTERNAL_BASE_URL`, `RTMANAGER_LOBBY_INTERNAL_TIMEOUT`.
- **Process and logging** — `RTMANAGER_LOG_LEVEL`, `RTMANAGER_SHUTDOWN_TIMEOUT`.
- **Telemetry** — standard `OTEL_*`.
## Runtime Notes
- **Single-instance v1.** Multi-instance Runtime Manager with Redis Streams consumer groups is explicitly out of scope for the current iteration. The per-game lease serialises operations on one game across the consumer + REST entry points; cross-instance coordination is deferred until a real workload demands it.
- **Lease semantics.** `rtmanager:game_lease:{game_id}` is `SET ... NX PX <ttl>` with TTL `RTMANAGER_GAME_LEASE_TTL_SECONDS` (default `60s`). The lease is not renewed mid-operation in v1; long pulls of multi-GB images can therefore expire the lease before the operation finishes — the trade-off is documented in services.md §1. The reconciler honours the same lease around every drift mutation (workers.md §14).
- **Operation log is the source of truth.** Every lifecycle and reconcile mutation appends one row to `rtmanager.operation_log`. The `runtime:health_events` stream and the `notification:intents` emissions are best-effort — a publish failure logs at `Error` and proceeds, never rolling back the recorded operation (workers.md §8).
- **In-memory probe hysteresis.** The active HTTP probe keeps per-game `consecutiveFailures` and `failurePublished` counters in a mutex-guarded map (see the sketch after this list). State is non-persistent: a process restart that loses the counters re-establishes hysteresis from scratch, and state for a game that transitions through `stopped → running` is pruned at the start of every probe tick (workers.md §5).
- **Pull policy fallbacks.** `RTMANAGER_IMAGE_PULL_POLICY` accepts `if_missing` (default), `always`, and `never`. Image labels (`com.galaxy.cpu_quota`, `com.galaxy.memory`, `com.galaxy.pids_limit`) drive resource limits when present; the matching `RTMANAGER_DEFAULT_*` env vars supply the fallback when a label is absent or unparseable. Producers never pass limits.
- **State directory ownership.** RTM creates per-game state directories under `RTMANAGER_GAME_STATE_ROOT` with the configured mode and uid/gid, but never deletes them. Removing the directory is operator domain (backup tooling, a future Admin Service workflow). A cleanup that removes the container leaves the directory intact.
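A sketch of that hysteresis state machine. The counter names come from the note above; the struct layout and the `observe` helper are assumptions:

```go
package healthprobe

import "sync"

// probeState holds the per-game in-memory hysteresis counters.
type probeState struct {
	consecutiveFailures int
	failurePublished    bool
}

type hysteresis struct {
	mu        sync.Mutex
	threshold int // RTMANAGER_PROBE_FAILURES_THRESHOLD
	games     map[string]*probeState
}

// observe records one probe result and reports which event, if any,
// should be published: "probe_failed" after threshold consecutive
// failures, "probe_recovered" on the first success after a published
// failure, "" otherwise.
func (h *hysteresis) observe(gameID string, ok bool) string {
	h.mu.Lock()
	defer h.mu.Unlock()
	s := h.games[gameID]
	if s == nil {
		s = &probeState{}
		h.games[gameID] = s
	}
	if ok {
		s.consecutiveFailures = 0
		if s.failurePublished {
			s.failurePublished = false
			return "probe_recovered"
		}
		return ""
	}
	s.consecutiveFailures++
	if s.consecutiveFailures >= h.threshold && !s.failurePublished {
		s.failurePublished = true
		return "probe_failed"
	}
	return ""
}
```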