# Domain and Ports

This document explains why the `rtmanager` domain layer
([`../internal/domain/`](../internal/domain)) and the port interfaces
([`../internal/ports/`](../internal/ports)) are shaped the way they are.
The current-state types and method signatures are the source of truth in
the code; this file records the rationale so future readers do not
re-litigate the same trade-offs.

For the surrounding behaviour see
[`../README.md`](../README.md), the SQL CHECK constraints in
[`../internal/adapters/postgres/migrations/00001_init.sql`](../internal/adapters/postgres/migrations/00001_init.sql),
the wire contracts under [`../api/`](../api), and
[`postgres-migration.md`](postgres-migration.md) for the persistence
layer.

## 1. String-typed status enums

`runtime.Status`, `operation.OpKind`, `operation.OpSource`,
`operation.Outcome`, `health.EventType`, `health.SnapshotStatus`, and
`health.SnapshotSource` are all `type X string`.

The string approach wins over an integer-backed enum on three counts:

- the columns guarded by the SQL CHECK constraints already store the
  values as `text`, so a string domain type maps one-to-one with no
  codec layer;
- it matches Lobby (`game.Status`, `membership.Status`,
  `application.Status`), so reviewers do not switch encoding mental
  models when crossing service boundaries;
- `IsKnown` keeps the invariant cheap (a single switch); a `type X uint8`
  with stringer-generated names would add a name lookup on every
  conversion and make raw SQL columns harder to read in diagnostics.

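A minimal sketch of the pattern, assuming the three `runtime.Status`
values named in section 8; the constant names are illustrative and may
not match the code:

```go
package main

import "fmt"

// Status mirrors the `type X string` shape described above.
type Status string

// Constant names are assumptions; the values come from section 8.
const (
	StatusRunning Status = "running"
	StatusStopped Status = "stopped"
	StatusRemoved Status = "removed"
)

// IsKnown reports whether s is one of the declared values.
func (s Status) IsKnown() bool {
	switch s {
	case StatusRunning, StatusStopped, StatusRemoved:
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(Status("running").IsKnown()) // true
	fmt.Println(Status("paused").IsKnown())  // false
}
```

Because the stored text and the Go value are the same string, a value
read from the database can be converted with `Status(raw)` and checked
with `IsKnown` without any intermediate table.
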
## 2. Plain `string` for `CurrentContainerID` and `CurrentImageRef`

The PostgreSQL columns are nullable. The domain model uses plain
`string`, with the empty string standing for NULL, and bridges the SQL
nullability inside the adapter. Pointer fields would force every
consumer to dereference defensively even though business logic rarely
cares about the NULL/empty distinction (removed records may legitimately
carry either form depending on whether the record passed through
`stopped` first).

The adapter's job is to translate `sql.NullString` ⇄ `string`; the rest
of the codebase reads the field as a regular value.

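The translation can be sketched as two small helpers. The function
names are hypothetical; only `sql.NullString` comes from the standard
library:

```go
package main

import (
	"database/sql"
	"fmt"
)

// fromNullString maps SQL NULL to the empty string.
func fromNullString(ns sql.NullString) string {
	if !ns.Valid {
		return ""
	}
	return ns.String
}

// toNullString maps the empty string back to SQL NULL.
func toNullString(s string) sql.NullString {
	return sql.NullString{String: s, Valid: s != ""}
}

func main() {
	fmt.Printf("%q\n", fromNullString(sql.NullString{}))       // ""
	fmt.Printf("%q\n", fromNullString(toNullString("abc123"))) // "abc123"
	fmt.Println(toNullString("").Valid)                        // false
}
```

The round trip is lossy by design: a non-NULL empty string comes back
as NULL, which is acceptable only because the domain treats the two
forms as equivalent.
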
## 3. `*time.Time` for nullable timestamps

`StartedAt`, `StoppedAt`, and `RemovedAt` retain pointer types.
`time.Time{}` is a real, comparable value in Go (`IsZero` only reports
the canonical zero time); conflating "missing" with "set to UTC zero" in
a plain `time.Time` would invite bugs. The jet-generated
`model.RuntimeRecords` already declares the same fields as `*time.Time`,
so the domain type aligns with the persistence type and the adapter does
not re-shape pointers.

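A short illustration of the distinction the pointer preserves. `Record`
and `describe` are hypothetical stand-ins, not the real domain types:

```go
package main

import (
	"fmt"
	"time"
)

// Record is a trimmed-down, hypothetical stand-in for the runtime record.
type Record struct {
	StartedAt *time.Time
	StoppedAt *time.Time
}

// describe separates "never set" (nil) from any concrete instant,
// including the zero time.
func describe(t *time.Time) string {
	switch {
	case t == nil:
		return "missing"
	case t.IsZero():
		return "set to zero time"
	default:
		return "set"
	}
}

func main() {
	now := time.Now()
	zero := time.Time{}
	rec := Record{StartedAt: &now}

	fmt.Println(describe(rec.StartedAt)) // set
	fmt.Println(describe(rec.StoppedAt)) // missing
	fmt.Println(describe(&zero))         // set to zero time
}
```

With a plain `time.Time` field the second and third cases would be
indistinguishable.
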
## 4. `EventType` and `SnapshotStatus` are deliberately distinct

`runtime-health-asyncapi.yaml.EventType` enumerates seven values; the
SQL CHECK on `health_snapshots.status` enumerates six. The two sets
overlap but are not identical:

- `container_started` is an *event*; the snapshot collapses it to
  `healthy` (a successful start is observed as the container being
  live, not as an ongoing event);
- `probe_recovered` is an *event*; it does not become a snapshot row of
  its own; the next inspect/probe overwrites the prior `probe_failed`
  with `healthy`.

Modelling them as one shared enum would require a separate "event vs
snapshot" boolean and invite accidental mismatches. Two distinct types,
each with its own explicit `IsKnown` set, keep the two surfaces from
being mixed up at compile time.

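A partial sketch of the two types, restricted to the values named
above. The constant names, the helper `snapshotStatusFor`, and the
assumption that a `probe_failed` event is stored as the `probe_failed`
status are all illustrative; the real enums carry more values:

```go
package main

import "fmt"

type EventType string
type SnapshotStatus string

// Only the values mentioned in this section; the real sets are larger.
const (
	EventContainerStarted EventType = "container_started"
	EventProbeFailed      EventType = "probe_failed" // assumed to exist as an event
	EventProbeRecovered   EventType = "probe_recovered"
)

const (
	SnapshotHealthy     SnapshotStatus = "healthy"
	SnapshotProbeFailed SnapshotStatus = "probe_failed"
)

// snapshotStatusFor returns the status an event would write, and false
// when the event produces no snapshot row of its own.
func snapshotStatusFor(e EventType) (SnapshotStatus, bool) {
	switch e {
	case EventContainerStarted:
		return SnapshotHealthy, true
	case EventProbeFailed:
		return SnapshotProbeFailed, true
	case EventProbeRecovered:
		// The next inspect/probe overwrites the row with healthy.
		return "", false
	default:
		return "", false
	}
}

func main() {
	events := []EventType{EventContainerStarted, EventProbeFailed, EventProbeRecovered}
	for _, e := range events {
		status, ok := snapshotStatusFor(e)
		fmt.Printf("%s -> %q (writes row: %t)\n", e, status, ok)
	}
}
```

Because the types differ, passing an `EventType` variable where a
`SnapshotStatus` is expected does not compile without an explicit
conversion.
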
## 5. `Inspect` split into `InspectImage` + `InspectContainer`

Two narrow methods replace a single polymorphic `Inspect`. The surface
RTM exercises has two shapes:

- the start service inspects the *image* by reference to read resource
  limits from labels;
- the periodic inspect worker, the reconciler, and the events listener
  inspect *containers* by id to read state, health, restart count, and
  exit code.

The inputs differ (ref vs id), and the result types differ
(`ImageInspect.Labels` is the only field used at start time, while
`ContainerInspect` carries a dozen state fields). One polymorphic
method would either split internally on input type or return a tagged
union; either is messier than two narrow methods.

## 6. `LobbyGameRecord` is intentionally minimal

`LobbyInternalClient.GetGame` returns `GameID`, `Status`, and
`TargetEngineVersion`. The fetch is classified as ancillary diagnostics
because the start envelope already carries the only required field
(`image_ref`).

Anything more would invite RTM consumers to depend on Lobby's schema in
ways that violate the "RTM never resolves engine versions" rule.
Future fields are additive: each new field is opt-in for the consumer
and does not break existing call sites. The minimalism is also a hedge
against schema drift: Lobby's `GameRecord` is large and changes more
often than RTM needs to track.

## 7. `NotificationIntentPublisher.Publish` returns `error`, not `(string, error)`

Lobby's `IntentPublisher.Publish` returns the Redis Stream entry id so
business workflows that key on it (idempotency keys, audit
correlation) can capture it. RTM publishes admin-only failure intents
where the entry id has no consumer: failing starts do not loop back
to RTM, and notification routing keys on the producer-supplied
`idempotency_key` rather than the stream id. The adapter wraps
`pkg/notificationintent.Publisher` and discards the entry id at the
wrapper boundary.

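The wrapper boundary might look roughly like this. `Intent`,
`entryIDPublisher`, `publisherAdapter`, `fakeStream`, and `publishOnce`
are hypothetical names, and the real
`pkg/notificationintent.Publisher` signature may differ:

```go
package main

import (
	"context"
	"fmt"
)

// Intent is a hypothetical payload carrying the producer-supplied key.
type Intent struct {
	IdempotencyKey string
}

// entryIDPublisher stands in for the wrapped publisher, which returns
// the stream entry id alongside the error.
type entryIDPublisher interface {
	Publish(ctx context.Context, intent Intent) (string, error)
}

// NotificationIntentPublisher is the RTM-facing port: error only.
type NotificationIntentPublisher interface {
	Publish(ctx context.Context, intent Intent) error
}

// publisherAdapter drops the entry id at the wrapper boundary.
type publisherAdapter struct {
	inner entryIDPublisher
}

func (a publisherAdapter) Publish(ctx context.Context, intent Intent) error {
	_, err := a.inner.Publish(ctx, intent)
	return err
}

// fakeStream is an in-memory stand-in for the real stream publisher.
type fakeStream struct{ calls int }

func (f *fakeStream) Publish(_ context.Context, _ Intent) (string, error) {
	f.calls++
	return fmt.Sprintf("%d-0", f.calls), nil
}

// publishOnce sends a single intent through the narrow port.
func publishOnce(p NotificationIntentPublisher, key string) error {
	return p.Publish(context.Background(), Intent{IdempotencyKey: key})
}

func main() {
	stream := &fakeStream{}
	err := publishOnce(publisherAdapter{inner: stream}, "start-failed:game-1")
	fmt.Println(err, stream.calls) // <nil> 1
}
```

Callers of the port never see the entry id, so nothing in RTM can start
depending on it by accident.
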
## 8. Exactly four allowed runtime transitions

`runtime.AllowedTransitions` covers:

- `running → stopped`: graceful stop, observed exit, reconcile
  observed exited;
- `running → removed`: `reconcile_dispose` when the container
  vanished;
- `stopped → running`: restart and patch inner start;
- `stopped → removed`: cleanup TTL or admin DELETE.

Other pairs are intentionally rejected:

- `running → running` and `stopped → stopped` would mean Upsert
  overwrote state without a CAS guard. Idempotent re-start / re-stop
  never transitions; the service layer returns `replay_no_op` and the
  record is left untouched.
- `removed → *` is forbidden because `removed` is terminal. The
  reconciler creates fresh records with `reconcile_adopt` rather than
  resurrecting old ones.

Encoding the table this way means a future bug where a service tries
to revive a removed record is rejected at the domain layer rather than
the adapter, which keeps the failure mode close to the offending code.

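One way to encode such a table is a set keyed by the `(from, to)` pair.
This is a sketch only: the constant names, the `transition` key type,
and `CanTransition` are assumptions, and the real
`runtime.AllowedTransitions` may be shaped differently:

```go
package main

import "fmt"

type Status string

const (
	StatusRunning Status = "running"
	StatusStopped Status = "stopped"
	StatusRemoved Status = "removed"
)

// transition is a comparable key for the allow-list.
type transition struct {
	from, to Status
}

// allowedTransitions lists exactly the four pairs described above.
var allowedTransitions = map[transition]bool{
	{from: StatusRunning, to: StatusStopped}: true,
	{from: StatusRunning, to: StatusRemoved}: true,
	{from: StatusStopped, to: StatusRunning}: true,
	{from: StatusStopped, to: StatusRemoved}: true,
}

// CanTransition reports whether from -> to is on the allow-list.
// Missing keys yield false, so every unlisted pair is rejected.
func CanTransition(from, to Status) bool {
	return allowedTransitions[transition{from: from, to: to}]
}

func main() {
	fmt.Println(CanTransition(StatusRunning, StatusStopped)) // true
	fmt.Println(CanTransition(StatusRunning, StatusRunning)) // false
	fmt.Println(CanTransition(StatusRemoved, StatusRunning)) // false
}
```

An allow-list makes the terminal nature of `removed` fall out of the
data: no entry starts from it, so no special case is needed.
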
## 9. `PullPolicy` re-declared inside `ports/dockerclient.go`

The same enum exists as `config.ImagePullPolicy`. Importing
`internal/config` from the ports package would couple two unrelated
layers and risk an import cycle once the wiring layer pulls both in.
The runtime/wiring layer (in `internal/app`) is the single point that
translates between the two types: both are `string`-typed, the
value sets are identical, and the validation lives on each side
independently.

## 10. Compile-time interface assertions live with adapters

Every interface has a `var _ ports.X = (*Y)(nil)` assertion, but the
assertion lives in the adapter package (e.g.
`var _ ports.RuntimeRecordStore = (*Store)(nil)` inside
`internal/adapters/postgres/runtimerecordstore`). Putting the
assertions in the port package would force the port package to import
its own implementations and create an obvious import cycle.

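A single-file sketch of the assertion idiom. The interface is declared
locally here so the example compiles on its own; its one method, `Get`,
and the map-backed `Store` are hypothetical:

```go
package main

import "fmt"

// RuntimeRecordStore stands in for ports.RuntimeRecordStore.
// The method set is an assumption made for this example.
type RuntimeRecordStore interface {
	Get(gameID string) (string, error)
}

// Store is a toy adapter backed by a map instead of PostgreSQL.
type Store struct {
	records map[string]string
}

func (s *Store) Get(gameID string) (string, error) {
	status, ok := s.records[gameID]
	if !ok {
		return "", fmt.Errorf("runtime record %q not found", gameID)
	}
	return status, nil
}

// Compile-time assertion: the build fails if *Store ever stops
// satisfying the port. It costs nothing at run time.
var _ RuntimeRecordStore = (*Store)(nil)

func main() {
	var store RuntimeRecordStore = &Store{records: map[string]string{"game-1": "running"}}
	status, err := store.Get("game-1")
	fmt.Println(status, err) // running <nil>
}
```

In the real layout the assertion sits next to `Store` in the adapter
package, which already imports `ports`, so no new dependency edge is
introduced.
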
## 11. `RunSpec.Validate` lives on the request type

The Docker port carries a non-trivial request type (`RunSpec`) with
eight required fields and per-mount invariants. Putting `Validate` on
the request struct keeps the rule next to the type definition, mirrors
the pattern used by `lobby/internal/ports/gmclient.go`
(`RegisterGameRequest.Validate`), and lets the adapter call it as the
first defensive check before invoking the Docker SDK.