# Adapters
This document explains why the production adapters under
[`../internal/adapters/`](../internal/adapters) — Docker SDK,
Lobby internal HTTP client, notification-intent publisher, health-event
publisher, job-result publisher — are shaped the way they are. The
PostgreSQL stores and the Redis-coordination adapters are covered in
[`postgres-migration.md`](postgres-migration.md).
## 1. `mockgen` is the repo-wide convention for wide ports
The Docker port has nine methods plus eight value types in the
signatures, and most lifecycle services exercise nearly every method
pair (start, stop, restart, patch, cleanup, reconcile, events, probe).
A hand-rolled fake would either miss methods or balloon to a per-test
fixture.
`internal/adapters/docker/` therefore uses `go.uber.org/mock` mocks:
- `//go:generate` directives live next to the interface declaration in
`internal/ports/dockerclient.go`;
- generated code is committed under `internal/adapters/docker/mocks/`
(matching the `internal/adapters/postgres/jet/` discipline);
- `make -C rtmanager mocks` is the single command operators run after
a port-signature change.
The maintained `go.uber.org/mock` fork is preferred over the archived
`github.com/golang/mock`. This convention applies to wide / recorder
ports across the repository — Lobby uses the same pipeline for its
narrow recorder ports (`RuntimeManager`, `IntentPublisher`,
`GMClient`, `UserService`); see
[`../../ARCHITECTURE.md`](../../ARCHITECTURE.md) for the cross-service
rule.
The other two RTM ports (`LobbyInternalClient`,
`NotificationIntentPublisher`) keep inline `_test.go` fakes: small
surfaces, easy to fake by hand inside a single test file when needed.
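For contrast with the generated mocks, a hand-written fake for a narrow port can look like the sketch below. The `LobbyGameRecord` fields, the error text, and the `fakeLobby` name are assumptions made for illustration; only the port and method names come from this document.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// LobbyGameRecord and LobbyInternalClient are simplified stand-ins for the
// real port types; the field set here is an assumption.
type LobbyGameRecord struct {
	GameID string
}

var ErrLobbyGameNotFound = errors.New("lobby game not found")

type LobbyInternalClient interface {
	GetGame(ctx context.Context, gameID string) (LobbyGameRecord, error)
}

// fakeLobby is small enough to sit inside a single _test.go file.
type fakeLobby struct {
	games map[string]LobbyGameRecord // canned responses keyed by game id
	calls []string                   // game ids requested, in call order
}

func (f *fakeLobby) GetGame(_ context.Context, gameID string) (LobbyGameRecord, error) {
	f.calls = append(f.calls, gameID)
	rec, ok := f.games[gameID]
	if !ok {
		return LobbyGameRecord{}, ErrLobbyGameNotFound
	}
	return rec, nil
}

// Compile-time check that the fake satisfies the port.
var _ LobbyInternalClient = (*fakeLobby)(nil)

func main() {
	fake := &fakeLobby{games: map[string]LobbyGameRecord{"g1": {GameID: "g1"}}}
	_, err := fake.GetGame(context.Background(), "missing")
	fmt.Println(errors.Is(err, ErrLobbyGameNotFound), fake.calls)
}
```

A one-method fake like this stays readable; the same approach applied to the nine-method Docker port is where it stops scaling.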
## 2. `EngineEndpoint` is built inside the Docker adapter
The engine port is fixed at `8080`. Pushing it into `RunSpec` would
force the start service to know an engine implementation detail;
pushing it into config would give operators a knob that the engine
image does not honour. The Docker adapter exposes
`EnginePort = 8080` as a package constant and constructs
`RunResult.EngineEndpoint = "http://" + spec.Hostname + ":8080"`
itself.
The adapter also leaves `container.Config.ExposedPorts` empty: RTM
never publishes ports to the host. The user-defined Docker bridge
network gives every container in the network DNS access to the engine
via `galaxy-game-{game_id}:8080`.
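A minimal sketch of the endpoint construction, assuming a helper named `engineEndpoint` (hypothetical; the real adapter may inline the expression):

```go
package main

import (
	"fmt"
	"strconv"
)

// EnginePort is fixed by the engine image, so it is a constant rather than
// a RunSpec field or a config value.
const EnginePort = 8080

// engineEndpoint is a hypothetical helper; hostname is the container's name
// on the user-defined bridge network.
func engineEndpoint(hostname string) string {
	return "http://" + hostname + ":" + strconv.Itoa(EnginePort)
}

func main() {
	fmt.Println(engineEndpoint("galaxy-game-42"))
}
```

Deriving the string from the constant keeps the port in one place if the engine image ever changes it.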
## 3. `Run` removes the container on `ContainerStart` failure
`README.md §Lifecycles → Start` requires no orphan to remain after a
failed start path. If `ContainerCreate` succeeds but `ContainerStart`
fails, the adapter calls `ContainerRemove(force=true)` inside a fresh
`context.Background()` (with a 10s timeout) so the cleanup runs even
when the original ctx is already cancelled. The cleanup is best-effort:
a remove failure is silently discarded because the original start
failure is the actionable error returned to the caller.
The alternative — leaving rollback to the start service — would either
duplicate the same code in every caller or invite a service that forgets
to do it. Centralising the rule in the adapter keeps the port contract
simple. The start service adds an additional rollback layer for the
post-`Run` `Upsert` failure path; see [`services.md`](services.md) §5.
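The rollback rule can be sketched as follows. `containerAPI` and `fakeAPI` are hypothetical stand-ins for the Docker client, reduced to three calls; the point is the fresh cleanup context and the discarded remove error.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// containerAPI is a hypothetical three-call slice of the Docker client.
type containerAPI interface {
	Create(ctx context.Context, name string) (string, error)
	Start(ctx context.Context, id string) error
	Remove(ctx context.Context, id string, force bool) error
}

func run(ctx context.Context, api containerAPI, name string) (string, error) {
	id, err := api.Create(ctx, name)
	if err != nil {
		return "", fmt.Errorf("create container: %w", err)
	}
	if err := api.Start(ctx, id); err != nil {
		// The caller's ctx may already be cancelled, so cleanup gets its own.
		cleanupCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
		defer cancel()
		_ = api.Remove(cleanupCtx, id, true) // best-effort: remove error is dropped
		return "", fmt.Errorf("start container: %w", err)
	}
	return id, nil
}

// fakeAPI fails Start and records removals that arrive with a live context.
type fakeAPI struct {
	startErr error
	removed  []string
}

func (f *fakeAPI) Create(_ context.Context, name string) (string, error) {
	return "c-" + name, nil
}

func (f *fakeAPI) Start(context.Context, string) error { return f.startErr }

func (f *fakeAPI) Remove(ctx context.Context, id string, _ bool) error {
	if err := ctx.Err(); err != nil {
		return err // a cancelled context would leave the container behind
	}
	f.removed = append(f.removed, id)
	return nil
}

// demoRollback calls run with an already-cancelled caller context.
func demoRollback() ([]string, error) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	api := &fakeAPI{startErr: errors.New("start refused")}
	_, err := run(ctx, api, "galaxy-game-42")
	return api.removed, err
}

func main() {
	removed, err := demoRollback()
	fmt.Println(removed, err)
}
```

Had `run` reused the caller's context for `Remove`, the fake would have rejected the call and the container would have stayed behind.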
## 4. `RunSpec.Cmd` is optional
`ports.RunSpec` exposes an optional `Cmd []string`. Production callers
leave it `nil` so the engine image's own `CMD` runs;
`internal/adapters/docker/smoke_test.go` uses it to drive
`["/bin/sh","-c","sleep 60"]` against `alpine:3.21`.
The alternative — building a dedicated test image with a pre-baked
`sleep` command — would require an extra `Dockerfile` under testdata
and a build step inside the smoke test. The single new field is
documented as optional and ignored when empty; production behaviour is
unchanged.
## 5. `EventsListen` filters at the adapter boundary
The Docker `/events` API accepts a `filters` query parameter, but the
daemon treats it as a hint, not a guarantee. The adapter therefore
double-checks at the boundary: only `Type == events.ContainerEventType`
messages are passed through to the typed `<-chan ports.DockerEvent`.
Doing the filter at the SDK level would still require a defensive
recheck on the consumer side; consolidating the check in the adapter
keeps the contract crisp and the consumer free of Docker-internal type
discriminants.
The decoded event copies the actor's full `Attributes` map into
`DockerEvent.Labels`. Docker mixes container labels and runtime
attributes (`exitCode`, `image`, `name`, etc.) flat in the same map;
RTM consumers filter by the `com.galaxy.` prefix when they care about
labels, and the adapter extracts `exitCode` separately for `die`
events.
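The boundary check and attribute copy can be sketched as below. `rawEvent` is a flattened stand-in for the SDK's event message, and the `DockerEvent` field set and the `com.galaxy.game_id` label are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// rawEvent is a stand-in for the SDK's event message; the real type nests
// the id and attributes under an Actor field.
type rawEvent struct {
	Type       string
	Action     string
	ID         string
	Attributes map[string]string
}

// DockerEvent approximates the typed port event (field set is an assumption).
type DockerEvent struct {
	ContainerID string
	Action      string
	Labels      map[string]string
	ExitCode    int
	HasExitCode bool
}

const containerEventType = "container"

// decode returns false for anything that is not a container event.
func decode(raw rawEvent) (DockerEvent, bool) {
	if raw.Type != containerEventType {
		return DockerEvent{}, false
	}
	ev := DockerEvent{
		ContainerID: raw.ID,
		Action:      raw.Action,
		Labels:      make(map[string]string, len(raw.Attributes)),
	}
	for k, v := range raw.Attributes {
		ev.Labels[k] = v // copy so consumers never share the SDK's map
	}
	if raw.Action == "die" {
		if code, err := strconv.Atoi(raw.Attributes["exitCode"]); err == nil {
			ev.ExitCode, ev.HasExitCode = code, true
		}
	}
	return ev, true
}

// forward drops non-container events before they reach the typed channel.
func forward(in <-chan rawEvent) <-chan DockerEvent {
	out := make(chan DockerEvent)
	go func() {
		defer close(out)
		for raw := range in {
			if ev, ok := decode(raw); ok {
				out <- ev
			}
		}
	}()
	return out
}

func main() {
	in := make(chan rawEvent, 2)
	in <- rawEvent{Type: "network", Action: "connect"}
	in <- rawEvent{Type: "container", Action: "die", ID: "abc",
		Attributes: map[string]string{"exitCode": "137", "com.galaxy.game_id": "42"}}
	close(in)
	for ev := range forward(in) {
		fmt.Println(ev.ContainerID, ev.Action, ev.ExitCode, ev.Labels["com.galaxy.game_id"])
	}
}
```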
## 6. Lobby HTTP client error mapping
`ports.LobbyInternalClient.GetGame` fixes the following mapping:
- `200` → `LobbyGameRecord` decoded tolerantly (unknown fields
ignored);
- `404` → `ports.ErrLobbyGameNotFound`;
- transport, timeout, or any other non-2xx → `ports.ErrLobbyUnavailable`
wrapped with the original error so callers can `errors.Is` and still
log the cause.
The start service treats `ErrLobbyUnavailable` as recoverable: it
continues without the diagnostic data because the start envelope
already carries the only required field (`image_ref`). The client
mirrors `notification/internal/adapters/userservice/client.go`: cloned
`*http.Transport`, `otelhttp.NewTransport` wrap, per-request
`context.WithTimeout`, idempotent `Close()` releasing idle connections.
JSON decoding is tolerant: unknown fields in the success body do not
break the call, so additive changes to Lobby's `GameRecord` schema do
not require an RTM release.
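The status mapping and the tolerant decode can be sketched with the standard library alone. The URL layout (`/games/{id}`), the record fields, the 2s timeout, and the handling of a malformed success body are assumptions; the real client also wraps its transport with `otelhttp`, which this sketch omits.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

var (
	ErrLobbyGameNotFound = errors.New("lobby: game not found")
	ErrLobbyUnavailable  = errors.New("lobby: unavailable")
)

// LobbyGameRecord lists hypothetical fields; the real record may differ.
type LobbyGameRecord struct {
	GameID string `json:"game_id"`
	Status string `json:"status"`
}

// getGame uses a hypothetical URL layout (/games/{id}) and timeout.
func getGame(ctx context.Context, hc *http.Client, baseURL, gameID string) (LobbyGameRecord, error) {
	var rec LobbyGameRecord
	ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
	defer cancel()
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, baseURL+"/games/"+gameID, nil)
	if err != nil {
		return rec, fmt.Errorf("%w: %v", ErrLobbyUnavailable, err)
	}
	resp, err := hc.Do(req)
	if err != nil { // transport failure or timeout
		return rec, fmt.Errorf("%w: %v", ErrLobbyUnavailable, err)
	}
	defer resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusOK:
		// encoding/json skips unknown fields unless DisallowUnknownFields is set.
		if err := json.NewDecoder(resp.Body).Decode(&rec); err != nil {
			// Treating a malformed body as "unavailable" is an assumption.
			return LobbyGameRecord{}, fmt.Errorf("%w: decode: %v", ErrLobbyUnavailable, err)
		}
		return rec, nil
	case http.StatusNotFound:
		return rec, ErrLobbyGameNotFound
	default:
		return rec, fmt.Errorf("%w: status %d", ErrLobbyUnavailable, resp.StatusCode)
	}
}

// demo serves three canned responses and reports how each one is mapped.
func demo() (gameID string, notFound, unavailable bool) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch r.URL.Path {
		case "/games/g1":
			fmt.Fprint(w, `{"game_id":"g1","status":"running","added_later":true}`)
		case "/games/missing":
			http.NotFound(w, r)
		default:
			http.Error(w, "boom", http.StatusInternalServerError)
		}
	}))
	defer srv.Close()
	ctx := context.Background()
	rec, _ := getGame(ctx, srv.Client(), srv.URL, "g1")
	_, errMissing := getGame(ctx, srv.Client(), srv.URL, "missing")
	_, errBroken := getGame(ctx, srv.Client(), srv.URL, "broken")
	return rec.GameID, errors.Is(errMissing, ErrLobbyGameNotFound), errors.Is(errBroken, ErrLobbyUnavailable)
}

func main() {
	fmt.Println(demo())
}
```

The `added_later` field in the canned body stands for an additive schema change; the decode succeeds without it being declared on the struct.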
## 7. Notification publisher wrapper signature
The wrapper drops the entry id returned by
`notificationintent.Publisher.Publish` (rationale in
[`domain-and-ports.md`](domain-and-ports.md) §7). The adapter is a
thin shim:
- `NewPublisher(cfg)` constructs the inner publisher and forwards
validation;
- `Publish(ctx, intent)` calls the inner publisher and discards the
entry id.
The compile-time assertion `var _ ports.NotificationIntentPublisher =
(*Publisher)(nil)` lives in `publisher.go`.
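The shape of the shim can be sketched as below. `Intent`, `Config`, `innerPublisher`, and the stream name are hypothetical stand-ins; the inner `Publish` returning an entry id is taken from the description above, while its exact signature is an assumption.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Intent and Config are hypothetical stand-ins for the real types.
type Intent struct{ Kind string }
type Config struct{ Stream string }

// innerPublisher imitates a publisher whose Publish returns a stream entry id.
type innerPublisher struct {
	stream string
	seq    int
}

func newInner(cfg Config) (*innerPublisher, error) {
	if cfg.Stream == "" {
		return nil, errors.New("stream name must not be empty")
	}
	return &innerPublisher{stream: cfg.Stream}, nil
}

func (p *innerPublisher) Publish(_ context.Context, _ Intent) (string, error) {
	p.seq++
	return fmt.Sprintf("%d-0", p.seq), nil
}

// NotificationIntentPublisher is the port: no entry id in the signature.
type NotificationIntentPublisher interface {
	Publish(ctx context.Context, intent Intent) error
}

type Publisher struct{ inner *innerPublisher }

// NewPublisher forwards validation to the inner constructor.
func NewPublisher(cfg Config) (*Publisher, error) {
	inner, err := newInner(cfg)
	if err != nil {
		return nil, fmt.Errorf("new notification publisher: %w", err)
	}
	return &Publisher{inner: inner}, nil
}

func (p *Publisher) Publish(ctx context.Context, intent Intent) error {
	_, err := p.inner.Publish(ctx, intent) // entry id intentionally dropped
	return err
}

// Compile-time assertion that the wrapper satisfies the port.
var _ NotificationIntentPublisher = (*Publisher)(nil)

func main() {
	p, err := NewPublisher(Config{Stream: "notification:intents"})
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Publish(context.Background(), Intent{Kind: "game_started"}))
}
```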
## 8. Health-events publisher: snapshot upsert before stream XADD
Every emission goes through
`ports.HealthEventPublisher.Publish`, which both XADDs to
`runtime:health_events` and upserts `health_snapshots`. The snapshot
upsert runs **before** the XADD: a successful Publish always leaves
the snapshot store at least as fresh as the stream, and a partial
failure leaves the snapshot a best-effort lower bound. Reversing the
order would let consumers observe a stream entry whose
`health_snapshots` row reflects the prior observation — a misleading
inversion.
The `event_type → SnapshotStatus / SnapshotSource` mapping mirrors the
table in [`../README.md` §Health Monitoring](../README.md). In
particular, `container_started` collapses to `SnapshotStatusHealthy`
and `probe_recovered` does the same (rationale in
[`domain-and-ports.md`](domain-and-ports.md) §4).
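The ordering can be illustrated with a recorder that plays both the snapshot store and the stream. The interface and method names here are hypothetical, and returning early when the upsert fails is an assumption that the description above does not confirm.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

type HealthEvent struct{ GameID, EventType string }

var errStreamDown = errors.New("stream down")

type snapshotStore interface {
	UpsertSnapshot(ctx context.Context, ev HealthEvent) error
}

type eventStream interface {
	XAdd(ctx context.Context, ev HealthEvent) error
}

// recorder implements both interfaces and logs successful calls in order.
type recorder struct {
	log     []string
	xaddErr error
}

func (r *recorder) UpsertSnapshot(_ context.Context, ev HealthEvent) error {
	r.log = append(r.log, "upsert:"+ev.EventType)
	return nil
}

func (r *recorder) XAdd(_ context.Context, ev HealthEvent) error {
	if r.xaddErr != nil {
		return r.xaddErr
	}
	r.log = append(r.log, "xadd:"+ev.EventType)
	return nil
}

type publisher struct {
	snapshots snapshotStore
	stream    eventStream
}

func (p *publisher) Publish(ctx context.Context, ev HealthEvent) error {
	// Snapshot first, so the store is never staler than the stream.
	// Returning early on an upsert failure is an assumption of this sketch.
	if err := p.snapshots.UpsertSnapshot(ctx, ev); err != nil {
		return fmt.Errorf("upsert health snapshot: %w", err)
	}
	if err := p.stream.XAdd(ctx, ev); err != nil {
		return fmt.Errorf("xadd health event: %w", err)
	}
	return nil
}

// demo publishes one event and returns the recorded call order.
func demo(xaddErr error) ([]string, error) {
	rec := &recorder{xaddErr: xaddErr}
	p := &publisher{snapshots: rec, stream: rec}
	err := p.Publish(context.Background(), HealthEvent{GameID: "42", EventType: "container_started"})
	return rec.log, err
}

func main() {
	fmt.Println(demo(nil))
	fmt.Println(demo(errStreamDown))
}
```

In the failing case the snapshot has already been written when the XADD error surfaces, so no stream entry exists without a matching snapshot.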
## 9. Unit-test strategy
Both HTTP-backed adapters (Docker SDK, Lobby client) use
`httptest.Server` fixtures. The Docker SDK speaks HTTP under the hood
for both unix sockets and TCP, so adapter unit tests construct a
Docker client with `client.WithHost(server.URL)` and
`client.WithHTTPClient(server.Client())`, which lets table-driven
handlers fake every Docker API endpoint without touching the real
daemon. The Docker API version is pinned to `1.45`
(`client.WithVersion("1.45")`) so the URL prefix is stable across CI
machines whose daemon advertises a different default. Production
wiring (in `internal/app/bootstrap.go`) keeps API negotiation enabled.
The notification publisher uses `miniredis` directly because the
adapter's only side effect is an `XADD`, which `miniredis` reproduces
faithfully; this also matches every other Galaxy intent test.
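Why the pinned version matters for table-driven handlers can be shown without the Docker SDK. The sketch below uses a plain `net/http` client in place of the SDK client and a hypothetical `newFakeDaemon` helper; the route is the Engine API's container-inspect path under the `/v1.45` prefix.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

const apiVersion = "1.45"

// newFakeDaemon serves canned JSON bodies keyed by "METHOD path".
func newFakeDaemon(routes map[string]string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, ok := routes[r.Method+" "+r.URL.Path]
		if !ok {
			http.Error(w, `{"message":"no such route"}`, http.StatusNotFound)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		io.WriteString(w, body)
	}))
}

// get issues a versioned request the way a version-pinned client would.
func get(srv *httptest.Server, version, path string) (int, string, error) {
	resp, err := srv.Client().Get(srv.URL + "/v" + version + path)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body), err
}

// demo returns the HTTP status seen by a client pinned to the given version.
func demo(version string) int {
	srv := newFakeDaemon(map[string]string{
		"GET /v" + apiVersion + "/containers/abc/json": `{"Id":"abc","State":{"Running":true}}`,
	})
	defer srv.Close()
	status, _, err := get(srv, version, "/containers/abc/json")
	if err != nil {
		return 0
	}
	return status
}

func main() {
	fmt.Println(demo("1.45"), demo("1.44"))
}
```

A client that used a different version prefix would miss every route in the table, which is the instability the pin avoids.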
## 10. Docker smoke test
`internal/adapters/docker/smoke_test.go` runs on the default
`go test ./...` invocation and calls `t.Skip` unless the local daemon
is reachable (`/var/run/docker.sock` exists or `DOCKER_HOST` is set).
The covered sequence:
1. provision a temporary user-defined bridge network;
2. assert `EnsureNetwork` for present and missing names;
3. pull `alpine:3.21` (`PullPolicyIfMissing`);
4. subscribe to events;
5. run a sleep container with the full `RunSpec` field set;
6. observe a `start` event for the new container id;
7. inspect, stop, remove, and verify `ErrContainerNotFound` is
reported afterwards.
This is the production adapter's only end-to-end check that runs in
the default `go test` pass; the broader service-local integration
suite ([`integration-tests.md`](integration-tests.md)) is gated
behind `-tags=integration`.
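The skip guard can be sketched as a small predicate. `daemonLikelyReachable` and `fileExists` are hypothetical names, and the environment and filesystem lookups are injected so the rule can be exercised without a daemon; the real test may check reachability differently.

```go
package main

import (
	"fmt"
	"os"
)

// daemonLikelyReachable mirrors the skip rule: DOCKER_HOST is set or the
// default socket path exists.
func daemonLikelyReachable(getenv func(string) string, exists func(string) bool) bool {
	return getenv("DOCKER_HOST") != "" || exists("/var/run/docker.sock")
}

func fileExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	// In a test this would guard the body:
	//   if !daemonLikelyReachable(os.Getenv, fileExists) { t.Skip("docker daemon not reachable") }
	fmt.Println(daemonLikelyReachable(os.Getenv, fileExists))
}
```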