# Flows

This document collects the lifecycle and observability flows that
span Runtime Manager and its synchronous and asynchronous neighbours.
Narrative descriptions of the rules these flows enforce live in
[`../README.md`](../README.md); the diagrams here focus on the message
order across the boundary. Design-rationale records linked from each
section explain the *why*.

## Start (happy path)

```mermaid
sequenceDiagram
participant Lobby as Lobby publisher
participant Stream as runtime:start_jobs
participant Consumer as startjobsconsumer
participant Service as startruntime
participant Lease as Redis lease
participant Docker
participant PG as Postgres
participant Health as runtime:health_events
participant Results as runtime:job_results

Lobby->>Stream: XADD {game_id, image_ref, requested_at_ms}
Consumer->>Stream: XREAD
Consumer->>Service: Handle(game_id, image_ref, OpSourceLobbyStream, entry_id)
Service->>Lease: SET NX PX rtmanager:game_lease:{game_id}
Service->>PG: SELECT runtime_records WHERE game_id
Service->>Docker: PullImage(image_ref) per pull policy
Service->>Docker: InspectImage → resource limits
Service->>Service: prepareStateDir(<root>/{game_id})
Service->>Docker: ContainerCreate + ContainerStart
Service->>PG: Upsert runtime_records (status=running)
Service->>PG: INSERT operation_log (op_kind=start, outcome=success)
Service->>Health: XADD container_started
Service-->>Consumer: Result{Outcome=success, ContainerID, EngineEndpoint}
Consumer->>Results: XADD {outcome=success, container_id, engine_endpoint}
Service->>Lease: DEL rtmanager:game_lease:{game_id}
```

REST callers (Game Master, Admin Service) drive the same service
through `POST /api/v1/internal/runtimes/{game_id}/start`; the
diagram's last two arrows collapse to an HTTP `200` response carrying
the runtime record. Sources:
[`../README.md` §Lifecycles → Start](../README.md#start),
[`services.md` §3](services.md).

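The per-game lease step that brackets the diagram can be sketched as
follows. Only the key shape `rtmanager:game_lease:{game_id}` comes from
the diagram; the `Setter` interface and the timeout are illustrative
assumptions, not the real adapter surface.

```go
package main

import (
	"fmt"
	"time"
)

// leaseKey builds the per-game Redis lease key used by the start flow.
// The key shape is taken from the diagram.
func leaseKey(gameID string) string {
	return fmt.Sprintf("rtmanager:game_lease:%s", gameID)
}

// Setter is an illustrative slice of a Redis client: just enough to
// express the SET NX PX step. This is a sketch, not the real adapter.
type Setter interface {
	SetNX(key string, ttl time.Duration) (bool, error)
}

// acquireLease attempts the SET NX PX; a false result means another
// operation already holds the per-game lease.
func acquireLease(c Setter, gameID string, ttl time.Duration) (bool, error) {
	return c.SetNX(leaseKey(gameID), ttl)
}

func main() {
	fmt.Println(leaseKey("g-123"))
}
```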
## Start failure (image pull)

```mermaid
sequenceDiagram
participant Service as startruntime
participant Docker
participant PG as Postgres
participant Intents as notification:intents
participant Results as runtime:job_results

Service->>Docker: PullImage(image_ref)
Docker-->>Service: error
Service->>PG: INSERT operation_log (op_kind=start, outcome=failure, error_code=image_pull_failed)
Service->>Intents: XADD runtime.image_pull_failed {game_id, image_ref, error_code, error_message, attempted_at_ms}
Service-->>Service: Result{Outcome=failure, ErrorCode=image_pull_failed}
Service->>Results: XADD {outcome=failure, error_code=image_pull_failed}
```

The same shape applies to configuration-validation failures
(`start_config_invalid` from `EnsureNetwork(ErrNetworkMissing)`,
`prepareStateDir`, or an invalid `image_ref` shape) and to the Docker
create/start failure (`container_start_failed`); only the error code
and the matching `runtime.*` notification type differ. Three failure
codes do **not** raise an admin notification: `conflict`,
`service_unavailable`, and `internal_error`
([`services.md` §4](services.md)).

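The notification gate described above reduces to a pure filter. The
function names here are hypothetical; only the three silent codes and
the `runtime.<error_code>` intent naming come from this section.

```go
package main

import "fmt"

// raisesAdminNotification reports whether a start-failure error code
// emits a notification intent. Per the text, conflict,
// service_unavailable, and internal_error are deliberately silent;
// every other code maps to a matching runtime.* intent type.
func raisesAdminNotification(errorCode string) bool {
	switch errorCode {
	case "conflict", "service_unavailable", "internal_error":
		return false
	default:
		return true
	}
}

// intentType derives the runtime.* notification type from the error
// code, mirroring runtime.image_pull_failed in the diagram.
func intentType(errorCode string) string {
	return "runtime." + errorCode
}

func main() {
	fmt.Println(raisesAdminNotification("image_pull_failed"), intentType("image_pull_failed"))
}
```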
## Start failure (orphan / Upsert-after-Run rollback)

```mermaid
sequenceDiagram
participant Service as startruntime
participant Docker
participant PG as Postgres
participant Intents as notification:intents

Service->>Docker: ContainerCreate + ContainerStart
Docker-->>Service: container running
Service->>PG: Upsert runtime_records
PG-->>Service: error (transport / constraint)
Note over Service: container is now an orphan<br/>(running, no PG record)
Service->>Docker: Remove(container_id) [fresh background context]
Docker-->>Service: ok or logged failure
Service->>PG: INSERT operation_log (outcome=failure, error_code=container_start_failed)
Service->>Intents: XADD runtime.container_start_failed
Service-->>Service: Result{Outcome=failure, ErrorCode=container_start_failed}
```

The Docker adapter already removes the container when `Run` itself
fails after a successful `ContainerCreate`
([`adapters.md` §3](adapters.md)); the start service adds the
post-`Run` rollback for the `Upsert` path. A `Remove` failure is
logged but not propagated; the reconciler adopts surviving orphans on
its periodic pass ([`services.md` §5](services.md)).

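A minimal sketch of the rollback step, assuming a hypothetical
`Remover` slice of the Docker adapter. The fresh background context
and the log-don't-propagate policy mirror the diagram; the 30-second
timeout is an invented placeholder, not a documented value.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"
)

// Remover is an illustrative slice of the Docker adapter: just enough
// to express the rollback arrow in the diagram.
type Remover interface {
	Remove(ctx context.Context, containerID string) error
}

// rollbackOrphan removes a container whose runtime_records upsert
// failed. It uses a fresh background context (with an assumed timeout)
// so cancellation of the original request cannot strand the orphan,
// and it only logs a removal failure -- the reconciler adopts
// survivors on its periodic pass.
func rollbackOrphan(d Remover, containerID string) {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := d.Remove(ctx, containerID); err != nil {
		log.Printf("orphan rollback failed for %s: %v (reconciler will adopt)", containerID, err)
	}
}

// fakeDocker records removals so the sketch is runnable without Docker.
type fakeDocker struct{ removed []string }

func (f *fakeDocker) Remove(_ context.Context, id string) error {
	f.removed = append(f.removed, id)
	return nil
}

func main() {
	d := &fakeDocker{}
	rollbackOrphan(d, "c-42")
	fmt.Println(d.removed)
}
```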
## Stop

```mermaid
sequenceDiagram
participant Caller as Lobby / GM / Admin
participant Service as stopruntime
participant Lease as Redis lease
participant PG as Postgres
participant Docker
participant Results as runtime:job_results

Caller->>Service: stop(game_id, reason)
Service->>Lease: SET NX PX rtmanager:game_lease:{game_id}
Service->>PG: SELECT runtime_records WHERE game_id
alt status in {stopped, removed}
Service->>PG: INSERT operation_log (outcome=success, error_code=replay_no_op)
Service-->>Caller: success / replay_no_op
else status = running
Service->>Docker: ContainerStop(container_id, RTMANAGER_CONTAINER_STOP_TIMEOUT_SECONDS)
Docker-->>Service: ok
Service->>PG: UpdateStatus running→stopped (CAS by container_id)
Service->>PG: INSERT operation_log (op_kind=stop, outcome=success)
Service-->>Caller: success
end
Service->>Lease: DEL rtmanager:game_lease:{game_id}
```

Lobby callers receive the outcome through `runtime:job_results`; REST
callers receive an HTTP `200`. The `reason` enum
(`orphan_cleanup | cancelled | finished | admin_request | timeout`)
is recorded in `operation_log` and is otherwise opaque to the stop
service — RTM does not branch on the reason in v1
([`services.md` §15, §17](services.md)).

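The `alt` branches reduce to a small status classifier. A sketch: the
two labelled branches come from the diagram, while the fallthrough
result is a guess for statuses the diagram does not cover.

```go
package main

import "fmt"

// stopAction classifies a stop request by the current record status,
// per the alt branches: stopped/removed replays as a no-op, running
// proceeds to ContainerStop. The return values are illustrative
// labels, not the service's real result type; the default branch is
// an assumption.
func stopAction(status string) string {
	switch status {
	case "stopped", "removed":
		return "replay_no_op"
	case "running":
		return "container_stop"
	default:
		return "conflict" // assumed for statuses outside the diagram
	}
}

func main() {
	fmt.Println(stopAction("stopped"), stopAction("running"))
}
```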
## Restart

```mermaid
sequenceDiagram
participant Admin as GM / Admin
participant Service as restartruntime
participant Stop as stopruntime.Run
participant Start as startruntime.Run
participant Docker
participant PG as Postgres

Admin->>Service: POST /restart
Service->>PG: SELECT runtime_records WHERE game_id
Note over Service: capture current image_ref
Service->>Service: acquire per-game lease (held across both inner ops)
Service->>Stop: Run(game_id) [lease bypass]
Stop->>Docker: ContainerStop
Stop->>PG: UpdateStatus running→stopped
Service->>Docker: ContainerRemove
Service->>Start: Run(game_id, image_ref) [lease bypass]
Start->>Docker: PullImage / Run
Start->>PG: Upsert runtime_records (status=running)
Service->>PG: INSERT operation_log (op_kind=restart, outcome=success, source_ref=correlation_id)
Service-->>Admin: 200 {runtime_record}
Service->>Service: release lease
```

The lease is acquired by `restartruntime` and held across both inner
operations; `stopruntime.Run` and `startruntime.Run` are
lease-bypass entry points that skip the inner lease acquisition
([`services.md` §12](services.md)). The single `operation_log` row
uses `Input.SourceRef` as a correlation id linking the implicit stop
and start entries ([`services.md` §13](services.md)).

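The lease discipline can be sketched as a composition over
hypothetical acquire/release hooks; the real services use the Redis
lease and the bypass entry points described above, but the shape of
the control flow is the same.

```go
package main

import "fmt"

// restart sketches the lease discipline from the diagram:
// restartruntime acquires the per-game lease once, then calls
// lease-bypass variants of stop and start so the inner services skip
// their own SET NX. The function-valued parameters are illustrative
// stand-ins for the real services and the Redis lease adapter.
func restart(acquire func(string) bool, release func(string), stop, start func(string) error, gameID string) error {
	if !acquire(gameID) {
		return fmt.Errorf("conflict: lease already held for %s", gameID)
	}
	defer release(gameID) // held across both inner operations
	if err := stop(gameID); err != nil {
		return err
	}
	return start(gameID)
}

func main() {
	held := map[string]bool{}
	acquire := func(g string) bool {
		if held[g] {
			return false
		}
		held[g] = true
		return true
	}
	release := func(g string) { delete(held, g) }
	noop := func(string) error { return nil }
	fmt.Println(restart(acquire, release, noop, noop, "g-7"))
}
```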
## Patch

```mermaid
sequenceDiagram
participant Admin as GM / Admin
participant Service as patchruntime
participant Restart as restartruntime.Run

Admin->>Service: POST /patch {image_ref: "galaxy/game:1.4.2"}
Service->>Service: parse new image_ref + current image_ref
alt either ref not semver
Service-->>Admin: 422 image_ref_not_semver
else major or minor differ
Service-->>Admin: 422 semver_patch_only
else major.minor match, patch differs (or equal)
Service->>Restart: Run(game_id, new_image_ref)
Restart-->>Service: Result
Service-->>Admin: 200 {runtime_record}
end
```

The semver gate uses the tag fragment of the Docker reference; the
extraction strategy is recorded in [`services.md` §14](services.md).
The restart delegate already owns the lease, the inner stop/start,
the operation log, and the `runtime:health_events container_started`
emission ([`workers.md` §1](workers.md)).

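The gate's branches can be sketched as below. Extracting the tag via
the last `:` is a simplifying assumption (the real extraction strategy
lives in `services.md` §14), and digest references are ignored here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSemverTag splits the tag fragment of a Docker reference into
// major/minor/patch components. Using the last ':' also handles
// registry hosts with ports; this is a sketch, not the real parser.
func parseSemverTag(ref string) (maj, min, pat int, ok bool) {
	i := strings.LastIndex(ref, ":")
	if i < 0 {
		return 0, 0, 0, false
	}
	parts := strings.Split(ref[i+1:], ".")
	if len(parts) != 3 {
		return 0, 0, 0, false
	}
	nums := make([]int, 3)
	for j, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return 0, 0, 0, false
		}
		nums[j] = n
	}
	return nums[0], nums[1], nums[2], true
}

// patchGate mirrors the alt branches: non-semver refs are rejected,
// a major or minor change is rejected, and only the patch digit may
// move (an equal version is also allowed through).
func patchGate(current, next string) string {
	cM, cm, _, okC := parseSemverTag(current)
	nM, nm, _, okN := parseSemverTag(next)
	if !okC || !okN {
		return "image_ref_not_semver"
	}
	if cM != nM || cm != nm {
		return "semver_patch_only"
	}
	return "ok"
}

func main() {
	fmt.Println(patchGate("galaxy/game:1.4.1", "galaxy/game:1.4.2"))
}
```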
## Cleanup TTL

```mermaid
sequenceDiagram
participant Worker as containercleanup worker
participant PG as Postgres
participant Service as cleanupcontainer
participant Lease as Redis lease
participant Docker

loop every RTMANAGER_CLEANUP_INTERVAL
Worker->>PG: SELECT runtime_records WHERE status='stopped' AND last_op_at < now - retention
loop per game
Worker->>Service: cleanup(game_id, op_source=auto_ttl)
Service->>Lease: SET NX PX rtmanager:game_lease:{game_id}
Service->>PG: re-read runtime_records WHERE game_id
alt status = running
Service-->>Worker: refused / conflict
else status in {stopped, removed}
Service->>Docker: ContainerRemove(container_id)
Service->>PG: UpdateStatus stopped→removed (CAS)
Service->>PG: INSERT operation_log (op_kind=cleanup_container)
Service-->>Worker: success
end
Service->>Lease: DEL rtmanager:game_lease:{game_id}
end
end
```

Admin-driven cleanup follows the same path through
`DELETE /api/v1/internal/runtimes/{game_id}/container` with
`op_source=admin_rest` instead of `auto_ttl`. The host state directory
is **never** removed by this flow
([`../README.md` §Cleanup](../README.md#cleanup),
[`services.md` §17](services.md),
[`workers.md` §19](workers.md)).

## Reconcile drift adopt

```mermaid
sequenceDiagram
participant Reconciler as reconcile worker
participant Docker
participant PG as Postgres
participant Lease as Redis lease

Note over Reconciler: read pass (lockless)
Reconciler->>Docker: List({label=com.galaxy.owner=rtmanager})
Reconciler->>PG: ListByStatus(running)
Note over Reconciler: write pass (per-game lease)
loop per Docker container without matching record
Reconciler->>Lease: SET NX PX rtmanager:game_lease:{game_id}
Reconciler->>PG: re-read runtime_records WHERE game_id
alt record now exists
Reconciler-->>Reconciler: skip (state changed since read pass)
else record still missing
Reconciler->>PG: Upsert runtime_records (status=running, image_ref, started_at)
Reconciler->>PG: INSERT operation_log (op_kind=reconcile_adopt, op_source=auto_reconcile)
end
Reconciler->>Lease: DEL rtmanager:game_lease:{game_id}
end
```

The reconciler **never** stops or removes an unrecorded container —
operators may have started one manually for diagnostics. The
`reconcile_dispose` and `observed_exited` paths follow the same
read-pass / write-pass split, with `dispose` updating the orphaned
record to `removed` and emitting `container_disappeared`, and
`observed_exited` updating to `stopped` and emitting `container_exited`
([`../README.md` §Reconciliation](../README.md#reconciliation),
[`workers.md` §14–§16](workers.md)).

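The read-pass diff that feeds the write pass reduces to a set
difference. A sketch, with plain slices and maps standing in for the
labelled Docker list and the Postgres query; each candidate is still
re-checked under the per-game lease before the upsert, as the diagram
shows.

```go
package main

import "fmt"

// adoptCandidates computes the write-pass worklist from the lockless
// read pass: labelled Docker containers whose game_id has no running
// record. The per-game lease and the re-read happen later, so this
// list is only a snapshot of suspected drift.
func adoptCandidates(dockerGames []string, recordedGames map[string]bool) []string {
	var out []string
	for _, g := range dockerGames {
		if !recordedGames[g] {
			out = append(out, g)
		}
	}
	return out
}

func main() {
	fmt.Println(adoptCandidates([]string{"g-1", "g-2"}, map[string]bool{"g-2": true}))
}
```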
## Health probe hysteresis

```mermaid
sequenceDiagram
participant Worker as healthprobe worker
participant State as in-memory probe state
participant Engine as galaxy-game-{id}:8080
participant Health as runtime:health_events

loop every RTMANAGER_PROBE_INTERVAL
Worker->>Worker: ListByStatus(running)
Worker->>State: prune entries for games no longer running
loop per game (semaphore cap = 16)
Worker->>Engine: GET /healthz (RTMANAGER_PROBE_TIMEOUT)
alt success
State->>State: consecutiveFailures = 0
opt failurePublished was true
Worker->>Health: XADD probe_recovered {prior_failure_count}
State->>State: failurePublished = false
end
else failure
State->>State: consecutiveFailures++
opt consecutiveFailures == RTMANAGER_PROBE_FAILURES_THRESHOLD AND not failurePublished
Worker->>Health: XADD probe_failed {consecutive_failures, last_status, last_error}
State->>State: failurePublished = true
end
end
end
end
```

Hysteresis prevents a single transient failure from emitting a
`probe_failed` event, and prevents repeated emission while the failure
persists. State is non-persistent: a process restart re-establishes
the counters from scratch; a game's state is pruned when it transitions
out of the running list ([`workers.md` §5–§6](workers.md)).

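The per-game hysteresis can be sketched as a tiny state machine;
`threshold` stands in for `RTMANAGER_PROBE_FAILURES_THRESHOLD`, and
the returned strings are illustrative event names matching the
diagram's `XADD` arrows.

```go
package main

import "fmt"

// probeState mirrors the in-memory per-game counters from the diagram.
type probeState struct {
	consecutiveFailures int
	failurePublished    bool
}

// observe applies one probe result and returns the event to emit, if
// any: "probe_failed" exactly once when the failure threshold is
// crossed, "probe_recovered" on the first success after a published
// failure, and "" otherwise.
func (s *probeState) observe(success bool, threshold int) string {
	if success {
		s.consecutiveFailures = 0
		if s.failurePublished {
			s.failurePublished = false
			return "probe_recovered"
		}
		return ""
	}
	s.consecutiveFailures++
	if s.consecutiveFailures == threshold && !s.failurePublished {
		s.failurePublished = true
		return "probe_failed"
	}
	return ""
}

func main() {
	var s probeState
	fmt.Println(s.observe(false, 3), s.observe(false, 3), s.observe(false, 3), s.observe(true, 3))
}
```

Because the state lives only in memory, a worker restart silently
resets the counters, exactly as the closing paragraph describes.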