# Lifecycle Services

This document explains the design of the five lifecycle services
(`startruntime`, `stopruntime`, `restartruntime`, `patchruntime`,
`cleanupcontainer`) under [`../internal/service/`](../internal/service)
plus the per-handler REST glue under
[`../internal/api/internalhttp/`](../internal/api/internalhttp).

The current-state behaviour (lifecycle steps, failure tables, the
per-game lease semantics, the wire contracts) lives in
[`../README.md`](../README.md), the OpenAPI spec at
[`../api/internal-openapi.yaml`](../api/internal-openapi.yaml), and the
AsyncAPI spec at
[`../api/runtime-jobs-asyncapi.yaml`](../api/runtime-jobs-asyncapi.yaml).
This file records the *why*.

## 1. Per-game lease lives at the service layer

Every lifecycle service acquires `rtmanager:game_lease:{game_id}` via
[`ports.GameLeaseStore`](../internal/ports/gamelease.go) before doing
any work, and releases it on the way out:

- the lease primitive serialises operations on a single game across
  every entry point (stream consumers and REST handlers);
- holding the lease at the service layer keeps the consumer / REST
  callers symmetric — neither acquires the lease itself, both call
  the service the same way;
- the Redis-backed adapter
  ([`../internal/adapters/redisstate/gamelease/store.go`](../internal/adapters/redisstate/gamelease/store.go))
  uses `SET NX PX` on acquire, Lua compare-and-delete on release; a
  release whose caller-supplied token no longer matches is a silent
  no-op.

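The acquire and release semantics listed above can be sketched with an in-memory stand-in for the Redis adapter. The names `memLeaseStore`, `TryAcquire`, and `Release` are hypothetical, a mutex-guarded map replaces `SET NX PX` and the Lua script, and TTL expiry is left out to keep the sketch short.

```go
package main

import (
	"fmt"
	"sync"
)

// memLeaseStore is a hypothetical in-memory stand-in for the Redis adapter.
// It maps a game id to the token of the current lease holder.
type memLeaseStore struct {
	mu     sync.Mutex
	tokens map[string]string
}

func newMemLeaseStore() *memLeaseStore {
	return &memLeaseStore{tokens: make(map[string]string)}
}

// TryAcquire mimics SET NX: it succeeds only when no lease is held.
func (s *memLeaseStore) TryAcquire(gameID, token string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, held := s.tokens[gameID]; held {
		return false
	}
	s.tokens[gameID] = token
	return true
}

// Release mimics compare-and-delete: a token that no longer matches is a
// silent no-op, so a late caller cannot drop someone else's lease.
func (s *memLeaseStore) Release(gameID, token string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if cur, held := s.tokens[gameID]; held && cur == token {
		delete(s.tokens, gameID)
	}
}

func main() {
	store := newMemLeaseStore()
	fmt.Println(store.TryAcquire("g1", "tok-a")) // first caller wins
	fmt.Println(store.TryAcquire("g1", "tok-b")) // second caller is refused
	store.Release("g1", "tok-b")                 // wrong token: nothing happens
	fmt.Println(store.TryAcquire("g1", "tok-b")) // still held by tok-a
	store.Release("g1", "tok-a")
	fmt.Println(store.TryAcquire("g1", "tok-b")) // free again
}
```

The token comparison is what makes a release safe after the lease has changed hands: the late caller's release simply does nothing.
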
The lease key shape is `rtmanager:game_lease:{base64url(game_id)}` so
opaque game ids may contain any characters without leaking through
the key syntax.

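A minimal sketch of that key shape follows. The function name `leaseKey` is hypothetical, and the unpadded base64url variant is an assumption; the adapter may use the padded encoding instead.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// leaseKeyPrefix is the fixed namespace taken from the key shape above.
const leaseKeyPrefix = "rtmanager:game_lease:"

// leaseKey encodes the opaque game id so that characters such as ':' or
// whitespace cannot alter the key structure. Unpadded base64url is an
// assumption made for this sketch.
func leaseKey(gameID string) string {
	return leaseKeyPrefix + base64.RawURLEncoding.EncodeToString([]byte(gameID))
}

func main() {
	fmt.Println(leaseKey("abc"))
	fmt.Println(leaseKey("game:with/odd chars"))
}
```
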
The lease TTL is `RTMANAGER_GAME_LEASE_TTL_SECONDS` (default `60s`)
and is **not renewed mid-operation** in v1. A multi-GB image pull can
theoretically expire the lease before the start service finishes;
operators see this as a `reconcile_adopt` event later because the
container is created with the standard owner labels. A renewal helper
is deliberately deferred until a workload makes it necessary.

The reconciler ([`workers.md`](workers.md) §4) honours the same lease
around every drift mutation, which closes the
restart-vs-`reconcile_dispose` race documented in §6 below.

## 2. Health-events publisher lands with the start service

The start service publishes `container_started` after `docker run`
returns; the events listener intentionally does **not** duplicate
the event ([`workers.md`](workers.md) §1). Centralising the publisher
on the start service avoids a "who emits what" ambiguity and lets the
publisher be a thin port wrapper rather than a worker-specific
helper.

The publisher port lives next to the snapshot-upsert rule
([`adapters.md`](adapters.md) §8): one Publish call updates both
surfaces.

## 3. `Result`-shaped contract

`Service.Handle` returns `(Result, error)`. The Go-level `error` is
reserved for system-level / programmer faults (nil context, nil
service). All business outcomes flow through `Result`:

- `Outcome=success`, `ErrorCode=""` — fresh start succeeded;
- `Outcome=success`, `ErrorCode="replay_no_op"` — idempotent replay;
- `Outcome=failure`, `ErrorCode` set — business failure
  (`start_config_invalid` / `image_pull_failed` /
  `container_start_failed` / `conflict` / `service_unavailable` /
  `internal_error`).

The stream consumer uses `Outcome` and `ErrorCode` to populate
`runtime:job_results` directly; the REST handler maps `Outcome=failure`
plus `ErrorCode` to the matching HTTP status. Both callers are simpler
with this contract than with an `errors.Is`-driven sentinel taxonomy.

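To illustrate why the callers stay simple, here is a rough sketch of a `Result` value and a consumer-side conversion. The `jobResult` type, the `toJobResult` and `isReplay` helpers, and the exact field set are assumptions for illustration; only `Outcome`, `ErrorCode`, and `ErrorMessage` are named in this document.

```go
package main

import "fmt"

// Result mirrors the contract described above, trimmed to three fields.
type Result struct {
	Outcome      string // "success" or "failure"
	ErrorCode    string // "", "replay_no_op", or a stable failure code
	ErrorMessage string
}

// jobResult is a hypothetical wire payload for runtime:job_results.
type jobResult struct {
	GameID    string
	Outcome   string
	ErrorCode string
}

// toJobResult shows that the stream consumer needs no error inspection:
// it copies the two fields across unchanged.
func toJobResult(gameID string, res Result) jobResult {
	return jobResult{GameID: gameID, Outcome: res.Outcome, ErrorCode: res.ErrorCode}
}

// isReplay reports the idempotent-replay case.
func isReplay(res Result) bool {
	return res.Outcome == "success" && res.ErrorCode == "replay_no_op"
}

func main() {
	fresh := Result{Outcome: "success"}
	replay := Result{Outcome: "success", ErrorCode: "replay_no_op"}
	failed := Result{Outcome: "failure", ErrorCode: "image_pull_failed", ErrorMessage: "manifest unknown"}

	for _, res := range []Result{fresh, replay, failed} {
		fmt.Printf("%+v replay=%v\n", toJobResult("g1", res), isReplay(res))
	}
}
```
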
`ports.JobResult` and the two `JobOutcome*` string constants live in
the ports package next to `JobResultPublisher` so the wire shape is
defined exactly once. The constants are intentionally not aliases of
`operation.Outcome` — the audit-log enum is allowed to grow without
breaking the wire format.

## 4. Start service failure-mode mapping

| Failure | Error code | Notification intent |
| --- | --- | --- |
| Invalid input (empty fields, unknown op_source) | `start_config_invalid` | `runtime.start_config_invalid` |
| Lease busy | `conflict` | — |
| Existing record running with a different image_ref | `conflict` | — |
| Get returns a non-NotFound transport error | `internal_error` | — |
| `image_ref` shape rejected by `distribution/reference` | `start_config_invalid` | `runtime.start_config_invalid` |
| `EnsureNetwork` returns `ErrNetworkMissing` | `start_config_invalid` | `runtime.start_config_invalid` |
| `EnsureNetwork` returns any other error | `service_unavailable` | — |
| `PullImage` failure | `image_pull_failed` | `runtime.image_pull_failed` |
| `InspectImage` failure | `image_pull_failed` | `runtime.image_pull_failed` |
| `prepareStateDir` failure | `start_config_invalid` | `runtime.start_config_invalid` |
| `Run` failure | `container_start_failed` | `runtime.container_start_failed` |
| `Upsert` failure after successful Run | `container_start_failed` | `runtime.container_start_failed` |

Three error codes do **not** raise an admin notification: `conflict`,
`service_unavailable`, and `internal_error` are operational classes
(another caller is in flight, a dependency is down, an unclassified
fault) where the corrective action is not a configuration change. The
operator already sees them through telemetry and structured logs; an
email per occurrence would be noise.

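The notification column of the table reduces to a small lookup. The sketch below assumes a plain map keyed by error code; `notificationIntents` and `intentFor` are hypothetical names, and the real service may structure this differently.

```go
package main

import "fmt"

// notificationIntents lists the start failure codes that raise an admin
// notification, copied from the table above.
var notificationIntents = map[string]string{
	"start_config_invalid":   "runtime.start_config_invalid",
	"image_pull_failed":      "runtime.image_pull_failed",
	"container_start_failed": "runtime.container_start_failed",
}

// intentFor returns the notification intent for an error code. The second
// value is false for the operational classes (conflict,
// service_unavailable, internal_error), which stay silent.
func intentFor(errorCode string) (string, bool) {
	intent, ok := notificationIntents[errorCode]
	return intent, ok
}

func main() {
	for _, code := range []string{"image_pull_failed", "conflict", "internal_error"} {
		intent, ok := intentFor(code)
		fmt.Printf("%s -> %q notify=%v\n", code, intent, ok)
	}
}
```
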
## 5. Upsert-after-Run rollback

A `Run` that succeeded but whose `Upsert` failed leaves a running
container with no PG record. The service issues a best-effort
`docker.Remove(containerID)` in a fresh `context.Background()` (the
request context may already be cancelled) before recording the failure.
A Remove failure is logged but not propagated; the reconciler adopts
surviving orphans on its periodic pass.

The Docker adapter already removes the container when `Run` itself
returns an error after a successful `ContainerCreate` ([`adapters.md`](adapters.md) §3).
The service-layer rollback covers the additional post-`Run` Upsert
failure path.

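The rollback described in this section might look roughly like the sketch below. The `remover` interface, the `fakeDocker` type, and the function name are hypothetical, and the 10-second bound on the detached context is an assumption that this document does not state.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// remover is a hypothetical slice of the Docker port; only Remove is needed.
type remover interface {
	Remove(ctx context.Context, containerID string) error
}

// rollbackAfterUpsertFailure removes the just-started container on a context
// detached from the (possibly cancelled) request context. The timeout value
// is an assumption made for this sketch.
func rollbackAfterUpsertFailure(docker remover, containerID string, logf func(string, ...any)) {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	if err := docker.Remove(ctx, containerID); err != nil {
		// Best effort: log and move on; the reconciler handles survivors.
		logf("rollback remove failed for %s: %v", containerID, err)
	}
}

// fakeDocker records Remove calls and optionally returns an error.
type fakeDocker struct {
	removed []string
	err     error
}

func (f *fakeDocker) Remove(ctx context.Context, id string) error {
	if ctx.Err() != nil {
		return ctx.Err()
	}
	f.removed = append(f.removed, id)
	return f.err
}

func main() {
	d := &fakeDocker{err: errors.New("daemon busy")}
	rollbackAfterUpsertFailure(d, "c-123", func(format string, args ...any) {
		fmt.Printf(format+"\n", args...)
	})
	fmt.Println("remove attempted for:", d.removed)
}
```

Because the helper returns nothing, a failed Remove cannot mask the original `Upsert` error that the service is about to record.
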
## 6. Pre-existing record handling

Only `status=running` + same `image_ref` is a `replay_no_op`.
`running` + a different `image_ref` returns `failure / conflict` (use
`patch` to change the image of a running container).

Anything else (`stopped`, `removed`, missing record) proceeds with a
fresh start that ends in `Upsert`. `Upsert` overwrites verbatim and is
not bound by the transitions table, so installing a `running` record
over a `removed` row is permitted — the `removed` terminus rule lives
in `runtime.AllowedTransitions` (which guards `UpdateStatus`), not in
`Upsert`.

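The branching above fits in a few lines. This is a sketch under assumed names: the `record` type, the `decision*` labels, and `decideStart` are all hypothetical and only mirror the rule as stated.

```go
package main

import "fmt"

// record is a minimal stand-in for the runtime record.
type record struct {
	Status   string // "running", "stopped", or "removed"
	ImageRef string
}

// The decision labels are hypothetical names for the three branches.
const (
	decisionReplayNoOp = "replay_no_op"
	decisionConflict   = "conflict"
	decisionFreshStart = "fresh_start"
)

// decideStart applies the rule above; existing == nil means no record.
func decideStart(existing *record, imageRef string) string {
	if existing == nil || existing.Status != "running" {
		return decisionFreshStart
	}
	if existing.ImageRef == imageRef {
		return decisionReplayNoOp
	}
	return decisionConflict
}

func main() {
	running := &record{Status: "running", ImageRef: "engine:1.4.2"}
	fmt.Println(decideStart(running, "engine:1.4.2"))
	fmt.Println(decideStart(running, "engine:1.4.3"))
	fmt.Println(decideStart(&record{Status: "removed"}, "engine:1.4.2"))
	fmt.Println(decideStart(nil, "engine:1.4.2"))
}
```
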
`created_at` is preserved across re-starts: the start service reuses
`existing.CreatedAt` when the record was found, so the
"first time RTM saw the game" semantics from
[`postgres-migration.md`](postgres-migration.md) §9 hold even when the
start path goes through `Upsert` rather than through the runtime
adapter's `INSERT ... ON CONFLICT DO UPDATE` EXCLUDED list.

A residual `galaxy-game-{game_id}` container left over from a previous
start that was stopped but never cleaned up will fail at `docker run`
with a name conflict. The service surfaces that as
`container_start_failed`; cleanup plus the reconciler is the standard
remedy. A pre-emptive Remove inside the start service was rejected
because it would silently undo manual operator inspection on stopped
containers.

## 7. `LobbyInternalClient.GetGame` is best-effort

The fetch happens after the lease is acquired and before the Docker
work, with the configured `RTMANAGER_LOBBY_INTERNAL_TIMEOUT`.
`ErrLobbyUnavailable` and `ErrLobbyGameNotFound` are logged at
`debug`; the start operation continues either way. The fetched
`Status` and `TargetEngineVersion` enrich logs only — the start
envelope already carries the only required field (`image_ref`), and
the port docstring fixes the recoverable-failure contract.

## 8. `image_ref` validation

Validation uses `github.com/distribution/reference.ParseNormalizedNamed`
before any Docker round-trip. Rejected shapes surface as
`start_config_invalid` plus a `runtime.start_config_invalid` intent.
Daemon-side rejections after a valid parse (manifest unknown,
authentication required) surface as `image_pull_failed` plus a
`runtime.image_pull_failed` intent. The split keeps operator-actionable
configuration mistakes distinct from registry-side failures.

## 9. State-directory preparer is overrideable

`Dependencies.PrepareStateDir` is a `func(gameID string) (string, error)`
injection point that defaults to `os.MkdirAll` + `os.Chmod` +
`os.Chown` against `RTMANAGER_GAME_STATE_ROOT`. Tests override it to
point at a `t.TempDir()`-style fake without exercising the real
filesystem permissions (which require either matching uid/gid or
root). This is a deliberate non-port abstraction: the start service
does no other filesystem work and the cost of a new port for one
helper is not worth the indirection.

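A sketch of the injection point and its default, under several assumptions: the `0o750` mode, the explicit `uid`/`gid` parameters, the `withDefaults` helper, and the directory layout (`root/gameID`) are all illustrative and not taken from the real code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Dependencies carries the injection point; only the field discussed above
// is shown.
type Dependencies struct {
	PrepareStateDir func(gameID string) (string, error)
}

// defaultPrepareStateDir sketches the production default. Mode, ownership
// parameters, and directory layout are assumptions.
func defaultPrepareStateDir(root string, uid, gid int) func(string) (string, error) {
	return func(gameID string) (string, error) {
		dir := filepath.Join(root, gameID)
		if err := os.MkdirAll(dir, 0o750); err != nil {
			return "", err
		}
		// MkdirAll is subject to the umask, so the mode is set explicitly.
		if err := os.Chmod(dir, 0o750); err != nil {
			return "", err
		}
		if err := os.Chown(dir, uid, gid); err != nil {
			return "", err
		}
		return dir, nil
	}
}

// withDefaults fills the hook only when the caller left it nil.
func (d Dependencies) withDefaults(root string, uid, gid int) Dependencies {
	if d.PrepareStateDir == nil {
		d.PrepareStateDir = defaultPrepareStateDir(root, uid, gid)
	}
	return d
}

func main() {
	// A test-style override: no filesystem access, no ownership change.
	deps := Dependencies{PrepareStateDir: func(gameID string) (string, error) {
		return "/tmp/fake/" + gameID, nil
	}}.withDefaults("/var/lib/galaxy", 1000, 1000)

	dir, err := deps.PrepareStateDir("g1")
	fmt.Println(dir, err)
}
```
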
## 10. Container env: both `GAME_STATE_PATH` and `STORAGE_PATH`

Both names are accepted by the v1 engine. The start service always
sets both; the configured `RTMANAGER_ENGINE_STATE_ENV_NAME` controls
the primary. When the operator overrides the primary to `STORAGE_PATH`,
the deduplicating map collapses the two entries into one.

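The collapse falls out of using a map keyed by variable name. The sketch below assumes the pair is "configured primary plus `STORAGE_PATH`" with `GAME_STATE_PATH` as the default primary; `buildStateEnv` is a hypothetical name.

```go
package main

import "fmt"

// buildStateEnv sets the configured primary name and STORAGE_PATH to the
// same state path. Because the result is a map, a primary that is itself
// STORAGE_PATH collapses into a single entry.
func buildStateEnv(primaryName, statePath string) map[string]string {
	env := map[string]string{
		"STORAGE_PATH": statePath,
	}
	env[primaryName] = statePath
	return env
}

func main() {
	fmt.Println(buildStateEnv("GAME_STATE_PATH", "/var/lib/galaxy/g1")) // two entries
	fmt.Println(buildStateEnv("STORAGE_PATH", "/var/lib/galaxy/g1"))    // one entry
}
```
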
## 11. Wiring layer construction

`internal/app/wiring.go` is the single point that builds every
production store, adapter, and service from `config.Config`. The
struct exposes typed fields so handlers and workers can grab the
singletons without re-wiring; an `addCloser` slice releases adapter
resources (currently the Lobby HTTP client's idle-connection pool) at
runtime shutdown. The `runtimeRecordsProbe` adapter installed during
construction registers the `rtmanager.runtime_records_by_status`
gauge documented in [`../README.md` §Observability](../README.md).

The persistence-only `CountByStatus` method on the `runtimerecordstore`
adapter is **not** part of `ports.RuntimeRecordStore` because it is
only used by the gauge probe; widening the port for one caller would
force every adapter and test fake to grow with no benefit. The adapter
exposes it directly and the wiring composes a concrete-typed wrapper.

## 12. Shared lease across composed operations (restart, patch)

Restart and patch must hold the lease across the inner
`stop → docker rm → start` sequence, otherwise a concurrent stop or
restart could observe a half-recreated runtime.

`startruntime.Service` and `stopruntime.Service` therefore expose a
second public method:

```go
// Run executes the lifecycle assuming the per-game lease is already
// held by the caller. Reserved for orchestrator services that compose
// stop or start with another operation under a single outer lease.
// External callers must use Handle.
func (service *Service) Run(ctx context.Context, input Input) (Result, error)
```

`Handle` acquires the lease, defers its release, and calls `Run`.
Restart and patch acquire the outer lease themselves and call `Run`
on the inner services. The inner services record their own
`operation_log` entries, telemetry counters, health events, and admin
notification intents identically to a top-level `Handle`.

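The `Handle` / `Run` split can be sketched as follows. Everything besides the two method names and the `(Result, error)` return shape is an assumption: the `leaseStore` interface, the `fakeLeases` type, the `body` field, and the `demo` helper exist only to make the example runnable.

```go
package main

import (
	"context"
	"fmt"
)

// Input and Result are trimmed to the fields this sketch needs.
type Input struct{ GameID string }

type Result struct {
	Outcome   string
	ErrorCode string
}

// leaseStore is a hypothetical narrow view of the lease port.
type leaseStore interface {
	TryAcquire(ctx context.Context, gameID string) (token string, ok bool, err error)
	Release(ctx context.Context, gameID, token string)
}

// Service holds the lease store and the lifecycle body (a func here so the
// sketch stays small).
type Service struct {
	leases leaseStore
	body   func(ctx context.Context, in Input) (Result, error)
}

// Run executes the lifecycle assuming the caller already holds the lease.
func (s *Service) Run(ctx context.Context, in Input) (Result, error) {
	return s.body(ctx, in)
}

// Handle acquires the lease, defers its release, and delegates to Run.
func (s *Service) Handle(ctx context.Context, in Input) (Result, error) {
	token, ok, err := s.leases.TryAcquire(ctx, in.GameID)
	if err != nil {
		return Result{Outcome: "failure", ErrorCode: "service_unavailable"}, nil
	}
	if !ok {
		return Result{Outcome: "failure", ErrorCode: "conflict"}, nil
	}
	defer s.leases.Release(ctx, in.GameID, token)
	return s.Run(ctx, in)
}

// fakeLeases grants or refuses every acquire depending on the busy flag.
type fakeLeases struct {
	busy     bool
	released int
}

func (f *fakeLeases) TryAcquire(context.Context, string) (string, bool, error) {
	return "tok", !f.busy, nil
}

func (f *fakeLeases) Release(context.Context, string, string) { f.released++ }

// demo runs Handle once against a free lease and once against a busy one.
func demo() (free, busy Result, releases int) {
	ok := func(context.Context, Input) (Result, error) {
		return Result{Outcome: "success"}, nil
	}
	leases := &fakeLeases{}
	svc := &Service{leases: leases, body: ok}
	free, _ = svc.Handle(context.Background(), Input{GameID: "g1"})
	leases.busy = true
	busy, _ = svc.Handle(context.Background(), Input{GameID: "g1"})
	return free, busy, leases.released
}

func main() {
	fmt.Println(demo())
}
```

An orchestrator such as restart would take the outer lease once and then call `Run` on the stop and start services, so the lease is never released between the inner steps.
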
A typed `LeaseTicket` parameter (a small internal-package zero-size
struct that only the lease store can construct) was considered and
rejected for v1: only sister services in `internal/service/` ever call
`Run`, the docstring is loud about the precondition, and the pattern
can be tightened later without breaking the public surface that
consumers and handlers call.

## 13. Correlation id on `source_ref`

The outer restart and patch services reuse the existing
`Input.SourceRef` as a correlation key:

- when `Input.SourceRef` is non-empty (REST request id, stream entry
  id), all three entries — outer restart / patch + inner stop +
  inner start — share that value;
- when empty, the outer service generates a 32-byte base64url string
  via the same `NewToken` generator that produces lease tokens, and
  uses it as the correlation key for all three entries.

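The two cases above can be sketched like this. The `newToken` function is a stand-in for the real `NewToken` generator; reading "32-byte" as 32 random bytes before encoding, and using the unpadded base64url variant, are both assumptions.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// newToken is a stand-in for the NewToken generator: 32 random bytes,
// base64url-encoded without padding (both details are assumptions).
func newToken() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(buf), nil
}

// correlationKey reuses the caller's SourceRef when present and generates
// a fresh token otherwise. The same value is then written to the outer
// entry and to both inner entries.
func correlationKey(sourceRef string) (string, error) {
	if sourceRef != "" {
		return sourceRef, nil
	}
	return newToken()
}

func main() {
	supplied, _ := correlationKey("req-7f3a")
	generated, err := correlationKey("")
	if err != nil {
		fmt.Println("token generation failed:", err)
		return
	}
	fmt.Println(supplied)
	fmt.Println(len(generated), "characters generated")
}
```
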
The outer entry's `source_ref` keeps its dual semantics: actor ref
when the caller supplied one, generated correlation id otherwise. Pure
top-level operations (caller invokes start, stop, or cleanup directly)
keep the original meaning. Composed operations (restart, patch) use
the same value in three places to make audit queries trivial.

This is not the cleanest end-state — a dedicated `correlation_id`
column would carry the link without ambiguity — but it is the smallest
change that does not touch the schema. A future stage that adds the
column can rename the field and clear up the dual role in one move.

## 14. Semver validation for patch

`internal/service/patchruntime/semver.go` enforces the
patch-precondition (current and new `image_ref` parse as semver, share
major and minor):

- `extractSemverTag(imageRef)` parses with
  `github.com/distribution/reference.ParseNormalizedNamed`, casts to
  `reference.NamedTagged`, then validates the tag with
  `golang.org/x/mod/semver.IsValid` (after prepending `v` when the tag
  omits it). Failures map to `image_ref_not_semver`;
- `samePatchSeries(currentSemver, newSemver)` compares
  `semver.MajorMinor` of the two canonical strings; mismatch maps to
  `semver_patch_only`.

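To show the shape of the two pre-checks without pulling in external modules, the sketch below swaps in a deliberately simple standard-library stand-in. It is not the `x/mod/semver` logic the service uses: it works on bare tags rather than full image references, accepts only plain numeric `MAJOR.MINOR.PATCH` tags (with or without a leading `v`), and the names `majorMinor` and `checkPatch` are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorMinor extracts "MAJOR.MINOR" from a tag such as "1.4.2" or "v1.4.2".
// Only plain numeric MAJOR.MINOR.PATCH tags are accepted in this stand-in;
// pre-release and build suffixes are rejected.
func majorMinor(tag string) (string, bool) {
	parts := strings.Split(strings.TrimPrefix(tag, "v"), ".")
	if len(parts) != 3 {
		return "", false
	}
	for _, p := range parts {
		if _, err := strconv.ParseUint(p, 10, 64); err != nil {
			return "", false
		}
	}
	return parts[0] + "." + parts[1], true
}

// checkPatch returns "" when the patch is allowed, otherwise the error code
// named in the list above.
func checkPatch(currentTag, newTag string) string {
	cur, okCur := majorMinor(currentTag)
	next, okNext := majorMinor(newTag)
	if !okCur || !okNext {
		return "image_ref_not_semver"
	}
	if cur != next {
		return "semver_patch_only"
	}
	return ""
}

func main() {
	fmt.Printf("%q\n", checkPatch("1.4.2", "v1.4.9"))
	fmt.Printf("%q\n", checkPatch("1.4.2", "1.5.0"))
	fmt.Printf("%q\n", checkPatch("latest", "1.4.0"))
}
```
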
`golang.org/x/mod` is a direct require to avoid a transitive-version
surprise. `github.com/Masterminds/semver/v3` (also in the module
graph) was rejected to avoid two semver libraries on disk for the
same job; `x/mod/semver` already covers Lobby. A hand-rolled
`vMajor.Minor.Patch` parser was rejected as premature.

Pre-checks run before any inner stop or `docker rm`: a rejected patch
never disturbs the running runtime. Patch with
`new_image_ref == current_image_ref` proceeds through the recreate
flow unchanged (not `replay_no_op`: the inner start still runs); the
outer `op_kind=patch` entry records the no-op patch for audit.

## 15. `StopReason` placement

The reason enum mirrors `lobby/internal/ports/runtimemanager.go`
verbatim and lives at `internal/service/stopruntime/stopreason.go`.
The stream consumer and the REST handler import `stopruntime` for
the same enum the service requires.

Inner stop calls from restart and patch always pass
`StopReasonAdminRequest`. Restart and patch are platform-internal
recreate flows; `admin_request` is the closest semantic match in the
five-value vocabulary. The actor that originated the recreate (REST
request id, admin user id) flows through the `op_source` /
`source_ref` pair, not through the stop reason.

## 16. Error code centralisation

`internal/service/startruntime/errors.go` is the canonical home for
the stable error codes returned in `Result.ErrorCode`. The other four
services (`stopruntime`, `restartruntime`, `patchruntime`,
`cleanupcontainer`) import the constants from `startruntime` rather
than redeclaring them. The package comment of `errors.go` flags the
shared usage so future readers do not chase per-service declarations.

`start_config_invalid` is reserved for start because every start
validation failure also raises an admin notification intent. The
other services use the more general `invalid_request` for input
validation failures.

## 17. Stop / restart / patch / cleanup failure tables

### `stopruntime`

| Failure | Error code | Notes |
| --- | --- | --- |
| Invalid input | `invalid_request` | No notification intent. |
| Lease busy | `conflict` | Lease release skipped because acquire returned false. |
| Lease error | `service_unavailable` | Redis unreachable. |
| Record missing | `not_found` | |
| Status `stopped` / `removed` | success / `replay_no_op` | Idempotent re-stop. |
| `docker.Stop` returns `ErrContainerNotFound` | success | Record transitions `running → removed`, `container_disappeared` health event published. |
| `docker.Stop` other error | `service_unavailable` | Record untouched; caller may retry. |
| `UpdateStatus` returns `ErrConflict` (CAS race) | success / `replay_no_op` | The desired state was reached by another path (reconciler / restart). |
| `UpdateStatus` returns `ErrNotFound` | `not_found` | Record vanished mid-stop. |
| `UpdateStatus` other error | `internal_error` | |

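The three `UpdateStatus` rows of the table can be sketched as one classification function. The sentinel variables below are local stand-ins for the store's `ErrConflict` and `ErrNotFound`, and `classifyUpdateStatus` is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the record store's sentinel errors.
var (
	errConflict = errors.New("runtime record: conflict")
	errNotFound = errors.New("runtime record: not found")
)

// classifyUpdateStatus maps an UpdateStatus error to the (outcome, code)
// pair from the table above. A CAS conflict counts as success because the
// desired state was already reached by another path.
func classifyUpdateStatus(err error) (outcome, errorCode string) {
	switch {
	case err == nil:
		return "success", ""
	case errors.Is(err, errConflict):
		return "success", "replay_no_op"
	case errors.Is(err, errNotFound):
		return "failure", "not_found"
	default:
		return "failure", "internal_error"
	}
}

func main() {
	cases := []error{
		nil,
		fmt.Errorf("update status: %w", errConflict),
		fmt.Errorf("update status: %w", errNotFound),
		errors.New("connection reset"),
	}
	for _, err := range cases {
		outcome, code := classifyUpdateStatus(err)
		fmt.Printf("%v -> %s %q\n", err, outcome, code)
	}
}
```
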
### `restartruntime`

| Failure | Error code | Notes |
| --- | --- | --- |
| Invalid input | `invalid_request` | |
| Lease busy / lease error | `conflict` / `service_unavailable` | Same as stop. |
| Record missing | `not_found` | |
| Status `removed` | `conflict` | Image_ref may be empty; restart cannot proceed. |
| Inner stop fails | inner `ErrorCode` | Outer `ErrorMessage` prefixes "inner stop failed: ". |
| `docker.Remove` fails | `service_unavailable` | Inner stop already moved record to `stopped`; runtime stays in `stopped`. Admin must call `cleanup_container` before retrying restart. |
| Inner start fails | inner `ErrorCode` | Outer `ErrorMessage` prefixes "inner start failed: ". |

The post-stop `docker rm` failure is the only path that leaves the
runtime in a state from which the same operation cannot recover by
itself: a residual `galaxy-game-{game_id}` container blocks a fresh
inner start (the start service surfaces this as
`container_start_failed`). The runbook entry — "call cleanup, then
restart again" — is the standard remedy.

### `patchruntime`

| Failure | Error code | Notes |
| --- | --- | --- |
| Invalid input | `invalid_request` | |
| Lease busy / lease error | `conflict` / `service_unavailable` | |
| Record missing | `not_found` | |
| Status `removed` | `conflict` | |
| Current `image_ref` not parseable as semver tag | `image_ref_not_semver` | Pre-check; no inner ops fired. |
| New `image_ref` not parseable as semver tag | `image_ref_not_semver` | Pre-check; no inner ops fired. |
| Major / minor mismatch | `semver_patch_only` | Pre-check; no inner ops fired. |
| Inner stop / `docker rm` / inner start fails | inherits inner code | Same propagation as restart. |

### `cleanupcontainer`

| Failure | Error code | Notes |
| --- | --- | --- |
| Invalid input | `invalid_request` | |
| Lease busy / lease error | `conflict` / `service_unavailable` | |
| Record missing | `not_found` | |
| Status `removed` | success / `replay_no_op` | |
| Status `running` | `conflict` | Error message: "stop the runtime first". |
| Status `stopped` | proceed | |
| `docker.Remove` returns `ErrContainerNotFound` | success | Adapter swallows not-found into nil. |
| `docker.Remove` other error | `service_unavailable` | Record untouched; caller may retry. |
| `UpdateStatus` returns `ErrConflict` | success / `replay_no_op` | Race with reconciler dispose. |
| `UpdateStatus` returns `ErrNotFound` | `not_found` | |
| `UpdateStatus` other error | `internal_error` | |

## 18. REST handler conventions

The internal HTTP handlers under
[`../internal/api/internalhttp/handlers/`](../internal/api/internalhttp/handlers)
follow these rules:

- **`X-Galaxy-Caller` header.** The optional header carries the
  calling service identity (`gm` / `admin`); the handler records the
  value as `op_source` in `operation_log` (`gm_rest` / `admin_rest`).
  Missing or unknown values default to `admin_rest` because every
  audit-log query already filters on the cleanup endpoint
  (`op_source ∈ {auto_ttl, admin_rest}`); making the default match
  the most-restricted surface keeps existing dashboards correct when
  an unconfigured client hits the listener. The header is declared as
  a reusable parameter (`components.parameters.XGalaxyCallerHeader`)
  in the OpenAPI spec and is referenced from each runtime operation
  but not from `/healthz` and `/readyz`.
- **Error code → HTTP status mapping.** One canonical table in
  `handlers/common.go`:

  | ErrorCode | HTTP status |
  | --- | ---: |
  | (success, including `replay_no_op`) | 200 |
  | `invalid_request`, `start_config_invalid`, `image_ref_not_semver` | 400 |
  | `not_found` | 404 |
  | `conflict`, `semver_patch_only` | 409 |
  | `service_unavailable`, `docker_unavailable` | 503 |
  | `internal_error`, `image_pull_failed`, `container_start_failed` | 500 |

  `image_pull_failed` and `container_start_failed` are operational
  failures that originate inside RTM (registry / daemon problems),
  not client-side validation issues; they map to `500` so callers
  retry through their normal resilience paths instead of treating
  the call as a 4xx that must be fixed at the source.
  `docker_unavailable` is reserved for future producers; today the
  start service emits `service_unavailable` for Docker-daemon
  failures. Unknown error codes default to `500`.
- **List and Get bypass the service layer.** `internalListRuntimes`
  and `internalGetRuntime` read directly from
  `ports.RuntimeRecordStore`. Reads do not produce `operation_log`
  rows, do not change Docker state, do not need the per-game lease,
  and do not have a stream-side counterpart — none of the lifecycle
  service machinery is justified.
- **`RuntimeRecordStore.List(ctx)` returns every record regardless
  of status.** A single SELECT ordered by
  `(last_op_at DESC, game_id ASC)` — the same direction the
  `runtime_records_status_last_op_idx` index supports, so freshly
  active games surface first. Pagination is intentionally not
  modelled in v1; the working set is bounded by the games tracked
  by Lobby.
- **Per-handler service ports use `mockgen`.** The handler layer
  depends on five narrow interfaces — one per lifecycle service —
  declared in `handlers/services.go`. Production wiring passes the
  concrete `*<lifecycle>.Service` pointers (each satisfies the
  matching interface implicitly); tests pass the mockgen-generated
  mocks under `handlers/mocks/`.
- **Conformance test scope.** `internalhttp/conformance_test.go`
  drives every documented runtime operation against a real
  `internalhttp.Server` whose service deps are deterministic stubs.
  The test builds its router with
  `kin-openapi/routers/legacy.NewRouter` and calls both
  `openapi3filter.ValidateRequest` and
  `openapi3filter.ValidateResponse`, so requests and responses are
  each checked against the contract. The scope is happy-path only;
  the failure-path response shapes are validated by the per-handler
  tests.

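The header default and the status table from the rules above can be sketched as two small functions. The names `opSourceFromCaller` and `httpStatusFor` are hypothetical; the real table lives in `handlers/common.go` and may be structured differently.

```go
package main

import (
	"fmt"
	"net/http"
)

// opSourceFromCaller maps the X-Galaxy-Caller header value to op_source.
// "admin", a missing header, and unknown values all yield admin_rest.
func opSourceFromCaller(header string) string {
	if header == "gm" {
		return "gm_rest"
	}
	return "admin_rest"
}

// httpStatusFor maps a service result to an HTTP status following the
// table above. Unknown error codes fall through to 500.
func httpStatusFor(outcome, errorCode string) int {
	if outcome == "success" {
		return http.StatusOK // includes replay_no_op
	}
	switch errorCode {
	case "invalid_request", "start_config_invalid", "image_ref_not_semver":
		return http.StatusBadRequest
	case "not_found":
		return http.StatusNotFound
	case "conflict", "semver_patch_only":
		return http.StatusConflict
	case "service_unavailable", "docker_unavailable":
		return http.StatusServiceUnavailable
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	fmt.Println(opSourceFromCaller("gm"), opSourceFromCaller(""))
	fmt.Println(httpStatusFor("success", "replay_no_op"))
	fmt.Println(httpStatusFor("failure", "semver_patch_only"))
	fmt.Println(httpStatusFor("failure", "image_pull_failed"))
}
```
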