Domain and Ports
This document explains why the rtmanager domain layer
(../internal/domain/) and the port interfaces
(../internal/ports/) are shaped the way they are.
The current-state types and method signatures are the source of truth in
the code; this file records the rationale so future readers do not
re-litigate the same trade-offs.
For the surrounding behaviour see
../README.md, the SQL CHECK constraints in
../internal/adapters/postgres/migrations/00001_init.sql,
the wire contracts under ../api/, and
postgres-migration.md for the persistence
layer.
1. String-typed status enums
runtime.Status, operation.OpKind, operation.OpSource,
operation.Outcome, health.EventType, health.SnapshotStatus, and
health.SnapshotSource are all type X string.
The string approach wins on three counts:
- the SQL CHECK constraints already store the values as text, so a string domain type maps one-to-one with no codec layer;
- it matches Lobby (game.Status, membership.Status, application.Status), so reviewers do not switch encoding mental models when crossing service boundaries;
- IsKnown keeps the invariant cheap (a single switch); a type X uint8 with stringer-generated names would pay a constant lookup and make raw SQL columns harder to read in diagnostics.
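A minimal sketch of the pattern, assuming the three runtime status values named later in this document (running, stopped, removed). The constant names are illustrative and may not match the code under ../internal/domain/.

```go
package main

import "fmt"

// Status is a string-typed enum; its values are the same text the SQL
// CHECK constraint stores, so no codec layer is needed.
type Status string

// Constant names are assumptions made for this sketch.
const (
	StatusRunning Status = "running"
	StatusStopped Status = "stopped"
	StatusRemoved Status = "removed"
)

// IsKnown reports whether s is one of the declared values.
func (s Status) IsKnown() bool {
	switch s {
	case StatusRunning, StatusStopped, StatusRemoved:
		return true
	}
	return false
}

func main() {
	fmt.Println(Status("running").IsKnown())
	fmt.Println(Status("paused").IsKnown())
}
```

A value read straight from a text column converts with a plain Status(value) and is then checked with IsKnown.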
2. Plain string for CurrentContainerID and CurrentImageRef
The PostgreSQL columns are nullable. The domain model uses plain
string with empty == NULL and bridges the SQL nullability inside the
adapter. Pointer fields would force every consumer to dereference
defensively even though business logic rarely cares about the
NULL/empty distinction (removed records may legitimately carry either
form depending on whether the record passed through stopped first).
The adapter's job is to translate sql.NullString ⇄ string; the rest
of the codebase reads the field as a regular value.
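The bridge can be sketched as two small helpers. The function names toDomain and toColumn are hypothetical; only sql.NullString comes from the standard library.

```go
package main

import (
	"database/sql"
	"fmt"
)

// toDomain maps a nullable column to the plain-string domain field.
// NULL becomes the empty string.
func toDomain(ns sql.NullString) string {
	if !ns.Valid {
		return ""
	}
	return ns.String
}

// toColumn maps the domain field back to a nullable column.
// The empty string becomes NULL.
func toColumn(s string) sql.NullString {
	return sql.NullString{String: s, Valid: s != ""}
}

func main() {
	fmt.Printf("%q\n", toDomain(sql.NullString{}))
	fmt.Println(toColumn("abc123").Valid)
	fmt.Println(toColumn("").Valid)
}
```

With this shape, a value that is NULL in the database and a value that is an empty string both surface as "" in the domain model, which is the trade-off this section accepts.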
3. *time.Time for nullable timestamps
StartedAt, StoppedAt, RemovedAt retain pointer types. time.Time{}
is a real, comparable value in Go (IsZero only reports the canonical
zero time); mixing "missing" and "set to UTC zero" through plain
time.Time would invite bugs. The jet-generated model.RuntimeRecords
already declares the same fields as *time.Time, so the domain type
aligns with the persistence type and the adapter does not re-shape
pointers.
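The distinction can be illustrated with a reduced record type. Record, describeStart, and zeroStart are names invented for this sketch; the real runtime record carries more fields.

```go
package main

import (
	"fmt"
	"time"
)

// Record keeps only the field needed for the illustration.
type Record struct {
	StartedAt *time.Time // nil means the column is NULL
}

// describeStart separates a missing timestamp from a stored one,
// including a stored zero time.
func describeStart(r Record) string {
	switch {
	case r.StartedAt == nil:
		return "missing"
	case r.StartedAt.IsZero():
		return "set to zero time"
	default:
		return "set"
	}
}

// zeroStart returns a record whose timestamp is present but zero.
func zeroStart() Record {
	var t time.Time
	return Record{StartedAt: &t}
}

func main() {
	now := time.Now()
	fmt.Println(describeStart(Record{}))
	fmt.Println(describeStart(zeroStart()))
	fmt.Println(describeStart(Record{StartedAt: &now}))
}
```

With a plain time.Time field the first two cases would be indistinguishable.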
4. EventType and SnapshotStatus are deliberately distinct
runtime-health-asyncapi.yaml.EventType enumerates seven values; the
SQL CHECK on health_snapshots.status enumerates six. The two sets
overlap but are not identical:
- container_started is an event; the snapshot collapses it to healthy (a successful start is observed as the container being live, not as an ongoing event);
- probe_recovered is an event; it does not become a snapshot row of its own: the next inspect/probe overwrites the prior probe_failed with healthy.
Modelling them as one shared enum would require a separate "event vs
snapshot" boolean and invite accidental mismatches. Two distinct types
with explicit IsKnown matrices keep each surface honest at compile
time.
5. Inspect split into InspectImage + InspectContainer
Two narrow methods replace a single polymorphic Inspect. The surface
RTM exercises has two shapes:
- the start service inspects the image by reference to read resource limits from labels;
- the periodic inspect worker, the reconciler, and the events listener inspect containers by id to read state, health, restart count, and exit code.
The inputs differ (ref vs id), and the result types differ
(ImageInspect.Labels is the only field used at start time, while
ContainerInspect carries a dozen state fields). One polymorphic
method would either split internally on input type or return a tagged
union; either is messier than two narrow methods.
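A sketch of the two-method shape follows. The interface name DockerClient, the method signatures, the ContainerInspect field subset, the label key, and the in-memory fake are all assumptions made for illustration; the authoritative definitions live in ../internal/ports/.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// ImageInspect carries what the start path reads.
type ImageInspect struct {
	Labels map[string]string
}

// ContainerInspect shows a subset of the state fields named in the text.
type ContainerInspect struct {
	State        string
	Health       string
	RestartCount int
	ExitCode     int
}

// DockerClient is an illustrative port with two narrow methods.
type DockerClient interface {
	InspectImage(ctx context.Context, ref string) (ImageInspect, error)
	InspectContainer(ctx context.Context, id string) (ContainerInspect, error)
}

// fakeDocker is an in-memory stand-in used only for this example.
type fakeDocker struct {
	images     map[string]ImageInspect
	containers map[string]ContainerInspect
}

func (f fakeDocker) InspectImage(_ context.Context, ref string) (ImageInspect, error) {
	img, ok := f.images[ref]
	if !ok {
		return ImageInspect{}, errors.New("image not found: " + ref)
	}
	return img, nil
}

func (f fakeDocker) InspectContainer(_ context.Context, id string) (ContainerInspect, error) {
	c, ok := f.containers[id]
	if !ok {
		return ContainerInspect{}, errors.New("container not found: " + id)
	}
	return c, nil
}

// newFake builds a fake with one image and one container.
func newFake() fakeDocker {
	return fakeDocker{
		images: map[string]ImageInspect{
			// The label key is hypothetical.
			"engine:example": {Labels: map[string]string{"example.memory_limit": "512m"}},
		},
		containers: map[string]ContainerInspect{
			"c1": {State: "running", Health: "healthy"},
		},
	}
}

// memoryLabel mimics the start path: inspect by reference, read one label.
func memoryLabel(c DockerClient, ref string) (string, error) {
	img, err := c.InspectImage(context.Background(), ref)
	if err != nil {
		return "", err
	}
	return img.Labels["example.memory_limit"], nil
}

func main() {
	client := newFake()
	limit, err := memoryLabel(client, "engine:example")
	fmt.Println(limit, err)
	state, err := client.InspectContainer(context.Background(), "c1")
	fmt.Println(state.State, state.Health, err)
}
```

Each caller sees only the input and result type it needs, so neither path has to unwrap a tagged union.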
6. LobbyGameRecord is intentionally minimal
LobbyInternalClient.GetGame returns GameID, Status, and
TargetEngineVersion. The fetch is classified as ancillary diagnostics
because the start envelope already carries the only required field
(image_ref).
Anything more would invite RTM consumers to depend on Lobby's schema in
ways that violate the "RTM never resolves engine versions" rule.
Future fields are additive: each new field is opt-in to the consumer
and does not break existing call sites. The minimalism is also a hedge
against schema drift — Lobby's GameRecord is large and changes more
often than RTM needs to track.
7. NotificationIntentPublisher.Publish returns error, not (string, error)
Lobby's IntentPublisher.Publish returns the Redis Stream entry id so
business workflows that key on it (idempotency keys, audit
correlation) can capture it. RTM publishes admin-only failure intents
where the entry id has no consumer — failing starts do not loop back
to RTM, and notification routing keys on the producer-supplied
idempotency_key rather than the stream id. The adapter wraps
pkg/notificationintent.Publisher and discards the entry id at the
wrapper boundary.
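The wrapper boundary might look like the sketch below. The Intent fields, the entryPublisher interface standing in for pkg/notificationintent.Publisher, and the fake stream are assumptions; only the "return error, drop the entry id" shape is taken from the text.

```go
package main

import (
	"context"
	"fmt"
)

// Intent is a reduced, hypothetical payload.
type Intent struct {
	IdempotencyKey string
	Kind           string
}

// entryPublisher stands in for the shared publisher; its signature is assumed.
type entryPublisher interface {
	Publish(ctx context.Context, in Intent) (string, error)
}

// NotificationIntentPublisher is the RTM-side port: no entry id in the result.
type NotificationIntentPublisher interface {
	Publish(ctx context.Context, in Intent) error
}

// publisherAdapter wraps the shared publisher and narrows its result.
type publisherAdapter struct{ inner entryPublisher }

func (a publisherAdapter) Publish(ctx context.Context, in Intent) error {
	_, err := a.inner.Publish(ctx, in) // entry id deliberately discarded
	return err
}

var _ NotificationIntentPublisher = publisherAdapter{}

// fakeStream counts published intents and returns a made-up entry id.
type fakeStream struct{ count int }

func (f *fakeStream) Publish(_ context.Context, _ Intent) (string, error) {
	f.count++
	return fmt.Sprintf("0-%d", f.count), nil
}

// publishOnce sends one illustrative intent through the narrowed port.
func publishOnce(inner *fakeStream) error {
	var p NotificationIntentPublisher = publisherAdapter{inner: inner}
	return p.Publish(context.Background(), Intent{
		IdempotencyKey: "start-failed:game-1", // hypothetical key format
		Kind:           "runtime_start_failed", // hypothetical kind
	})
}

func main() {
	s := &fakeStream{}
	err := publishOnce(s)
	fmt.Println(err, s.count)
}
```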
8. Exactly four allowed runtime transitions
runtime.AllowedTransitions covers:
- running → stopped: graceful stop, observed exit, reconcile observed exited;
- running → removed: reconcile_dispose when the container vanished;
- stopped → running: restart and patch inner start;
- stopped → removed: cleanup TTL or admin DELETE.
Other pairs are intentionally rejected:
- running → running and stopped → stopped would mean Upsert overwrote state without a CAS guard. Idempotent re-start / re-stop never transitions; the service layer returns replay_no_op and the record is left untouched.
- removed → * is forbidden because removed is terminal. The reconciler creates fresh records with reconcile_adopt rather than resurrecting old ones.
Encoding the table this way means a future bug where a service tries to revive a removed record is rejected at the domain layer rather than the adapter, which keeps the failure mode close to the offending code.
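One way to encode such a table is a set keyed by (from, to) pairs. This is a sketch: the real runtime.AllowedTransitions may use a different representation, and the names transition and CanTransition are invented here.

```go
package main

import "fmt"

type Status string

const (
	StatusRunning Status = "running"
	StatusStopped Status = "stopped"
	StatusRemoved Status = "removed"
)

// transition is a (from, to) pair used as a map key.
type transition struct{ From, To Status }

// allowedTransitions lists the four permitted pairs; anything absent is rejected.
var allowedTransitions = map[transition]struct{}{
	{StatusRunning, StatusStopped}: {},
	{StatusRunning, StatusRemoved}: {},
	{StatusStopped, StatusRunning}: {},
	{StatusStopped, StatusRemoved}: {},
}

// CanTransition reports whether from → to is in the allowed set.
func CanTransition(from, to Status) bool {
	_, ok := allowedTransitions[transition{from, to}]
	return ok
}

func main() {
	fmt.Println(CanTransition(StatusRunning, StatusStopped))
	fmt.Println(CanTransition(StatusRunning, StatusRunning))
	fmt.Println(CanTransition(StatusRemoved, StatusRunning))
}
```

Because the set is closed, self-transitions and anything leaving removed fail the lookup without a dedicated rule.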
9. PullPolicy re-declared inside ports/dockerclient.go
The same enum exists as config.ImagePullPolicy. Importing
internal/config from the ports package would couple two unrelated
layers and risk an import cycle once the wiring layer pulls both in.
The runtime/wiring layer (in internal/app) is the single point that
translates between the two types: both are string-typed, the
value sets are identical, and the validation lives on each side
independently.
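The translation at the wiring layer can be as small as a conversion plus a validity check. In this sketch both types sit in one file for brevity, and the policy values ("if_missing", "always", "never") are hypothetical placeholders, since this document does not list the real value set.

```go
package main

import "fmt"

// ImagePullPolicy stands in for config.ImagePullPolicy.
type ImagePullPolicy string

// PullPolicy stands in for the type re-declared in ports/dockerclient.go.
type PullPolicy string

// Hypothetical values; the real set is defined in the code.
const (
	PullPolicyIfMissing PullPolicy = "if_missing"
	PullPolicyAlways    PullPolicy = "always"
	PullPolicyNever     PullPolicy = "never"
)

// IsKnown validates the port-side value independently of the config side.
func (p PullPolicy) IsKnown() bool {
	switch p {
	case PullPolicyIfMissing, PullPolicyAlways, PullPolicyNever:
		return true
	}
	return false
}

// toPortPullPolicy is the single translation point between the two types.
func toPortPullPolicy(c ImagePullPolicy) (PullPolicy, error) {
	p := PullPolicy(c)
	if !p.IsKnown() {
		return "", fmt.Errorf("unknown pull policy %q", string(c))
	}
	return p, nil
}

func main() {
	p, err := toPortPullPolicy("always")
	fmt.Println(p, err)
	_, err = toPortPullPolicy("sometimes")
	fmt.Println(err)
}
```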
10. Compile-time interface assertions live with adapters
Every interface has a var _ ports.X = (*Y)(nil) assertion, but the
assertion lives in the adapter package (e.g.
var _ ports.RuntimeRecordStore = (*Store)(nil) inside
internal/adapters/postgres/runtimerecordstore). Putting the
assertions in the port package would force the port package to import
its own implementations and create an obvious import cycle.
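For readers unfamiliar with the idiom, a single-file sketch follows. In the real layout the interface lives in the ports package and the assertion sits beside Store in the adapter package; the Exists method is hypothetical.

```go
package main

import "fmt"

// RuntimeRecordStore stands in for ports.RuntimeRecordStore.
// The Exists method is invented for this example.
type RuntimeRecordStore interface {
	Exists(gameID string) bool
}

// Store stands in for the adapter implementation.
type Store struct{ ids map[string]bool }

func (s *Store) Exists(gameID string) bool { return s.ids[gameID] }

// Compile-time check: the build fails if *Store stops satisfying the
// interface. The typed nil pointer allocates nothing at run time.
var _ RuntimeRecordStore = (*Store)(nil)

func main() {
	var store RuntimeRecordStore = &Store{ids: map[string]bool{"game-1": true}}
	fmt.Println(store.Exists("game-1"))
}
```

Since the assertion names both the interface and the concrete type, it has to live in a package that can import both, which is the adapter side.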
11. RunSpec.Validate lives on the request type
The Docker port carries a non-trivial request type (RunSpec) with
eight required fields and per-mount invariants. Putting Validate on
the request struct keeps the rule next to the type definition, mirrors
the pattern used by lobby/internal/ports/gmclient.go
(RegisterGameRequest.Validate), and lets the adapter call it as the
first defensive check before invoking the Docker SDK.
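A reduced sketch of the pattern, assuming only three illustrative fields and two mount rules. The real RunSpec has eight required fields and its own per-mount invariants, none of which are reproduced here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Mount is a hypothetical bind-mount description.
type Mount struct {
	Source string
	Target string
}

// RunSpec is cut down to a few assumed fields for illustration.
type RunSpec struct {
	Name     string
	ImageRef string
	Mounts   []Mount
}

// Validate checks required fields first, then each mount.
func (s RunSpec) Validate() error {
	if strings.TrimSpace(s.Name) == "" {
		return errors.New("run spec: name must not be empty")
	}
	if strings.TrimSpace(s.ImageRef) == "" {
		return errors.New("run spec: image ref must not be empty")
	}
	for i, m := range s.Mounts {
		if m.Source == "" || m.Target == "" {
			return fmt.Errorf("run spec: mount %d: source and target must not be empty", i)
		}
		if !strings.HasPrefix(m.Target, "/") {
			return fmt.Errorf("run spec: mount %d: target %q must be absolute", i, m.Target)
		}
	}
	return nil
}

func main() {
	ok := RunSpec{Name: "game-1", ImageRef: "engine:example",
		Mounts: []Mount{{Source: "/srv/game-1", Target: "/data"}}}
	fmt.Println(ok.Validate())

	bad := RunSpec{Name: "game-1", ImageRef: "engine:example",
		Mounts: []Mount{{Source: "/srv/game-1", Target: "data"}}}
	fmt.Println(bad.Validate())
}
```

An adapter following this pattern would call spec.Validate() and return early on error before touching the Docker SDK.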