chore(ci): tidy CI/dev infra — drop local-ci, lift migration rule, scope by galaxy.stack label

Five connected cleanups across the dev/CI infrastructure:

1. Drop tools/local-ci/. The standalone Gitea + act_runner stack was
   the legacy "offline workflow validator"; the per-stage CI gate now
   runs on gitea.lan and the directory was only retained as a
   fallback. Removing it leaves no operational dependency: backend,
   gateway, and game code have no references; documentation that
   pointed at it (CLAUDE.md, docs/ARCHITECTURE.md, ui/docs/testing.md,
   tools/dev-deploy/README.md, tools/local-dev/README.md) is updated
   in this same change. Historical "Verified on local-ci run N"
   markers in ui/PLAN.md are preserved unchanged.

2. Lift the pre-production single-migration rule. The rule forced
   every schema delta into 00001_init.sql and required a manual
   make clean-data wipe on every backward-incompatible change in
   tools/dev-deploy/. Future schema deltas now land as additive
   sequence-numbered files (00002_*.sql, …) that goose applies
   automatically on backend startup; 00001_init.sql becomes an
   immutable baseline. Authoring conventions live in
   backend/internal/postgres/migrations/README.md. The chain may be
   squashed back into a fresh 00001 as a deliberate one-time
   operation before the first production deployment.

3. Document the deployment cadence. The dev environment is
   single-tenant: pushes to feature/* run the test workflows
   (go-unit, ui-test, integration) only; dev-deploy.yaml fires on
   push to development. A workflow_dispatch override on
   dev-deploy.yaml lets a developer preview a feature branch on the
   shared dev environment before merge; the next merge into
   development overwrites the manual deploy idempotently.

4. Scope compose-managed resources by an explicit
   galaxy.stack=<local-dev|dev-deploy> label. Both compose files
   stamp the label on every service, network, and named volume.
   Makefiles in tools/local-dev/ and tools/dev-deploy/ filter their
   engine-cleanup operations by (stack-label AND engine OCI title)
   so they never touch unrelated workloads on the same daemon.
   dev-deploy.yaml gains a pre-`compose up` step that reaps stale
   exited/dead containers under the dev-deploy stack label.

5. Backend now stamps the same galaxy.stack=<value> label on every
   engine container it spawns, sourced from a new BACKEND_STACK_LABEL
   env var (empty → label not applied; legacy-safe). Both compose
   files set it to their stack name (local-dev / dev-deploy). The
   contract is recorded in docs/ARCHITECTURE.md under
   "Container labels". A package-level test in
   backend/internal/runtime exercises both the label-present and
   label-absent paths.

No tests intentionally regressed: go test ./backend/internal/{config,
runtime,dockerclient} is green, both compose files validate cleanly,
and the backend, gateway, and game modules all build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
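The file-naming rule from item 2 (sequence-numbered files on top of an immutable 00001_init.sql) lends itself to a mechanical check. The sketch below is illustrative only and is not part of this commit: `checkMigrationNames` and its regular expression are hypothetical, and it assumes five-digit prefixes with lowercase snake-case slugs. It does not require the sequence to be gap-free, since the conventions only ask for increasing, never-reused prefixes.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// migrationName matches the `0000N_short_description.sql` shape: a
// five-digit prefix, an underscore, and a lowercase snake-case slug.
// The exact pattern is an assumption made for this sketch.
var migrationName = regexp.MustCompile(`^(\d{5})_[a-z0-9]+(?:_[a-z0-9]+)*\.sql$`)

// checkMigrationNames is a hypothetical helper. It rejects malformed
// file names and reused prefixes, and requires the chain to start at
// the 00001 baseline. Gaps in the sequence are allowed.
func checkMigrationNames(names []string) error {
	// Zero-padded prefixes sort lexicographically in numeric order.
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)

	prev := 0
	for i, name := range sorted {
		m := migrationName.FindStringSubmatch(name)
		if m == nil {
			return fmt.Errorf("%s: want 0000N_short_description.sql", name)
		}
		n, err := strconv.Atoi(m[1])
		if err != nil {
			return fmt.Errorf("%s: bad prefix: %w", name, err)
		}
		switch {
		case i == 0 && n != 1:
			return fmt.Errorf("%s: chain must start at prefix 00001", name)
		case i > 0 && n == prev:
			return fmt.Errorf("%s: prefix %05d is already used", name, n)
		}
		prev = n
	}
	return nil
}
```

Called with the names of the embedded files, the helper returns nil for a chain such as `00001_init.sql`, `00002_add_index.sql` and an error as soon as two files share a prefix. Whether such a check belongs in the existing schema test or in CI is a design choice this commit does not make.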
@@ -129,6 +129,7 @@ fast.
 | `BACKEND_RUNTIME_CONTAINER_PIDS_LIMIT` | no | `256` | Engine container `--pids-limit`. |
 | `BACKEND_RUNTIME_CONTAINER_STATE_MOUNT` | no | `/var/lib/galaxy-game` | Absolute in-container path for the per-game state bind mount. |
 | `BACKEND_RUNTIME_STOP_GRACE_PERIOD` | no | `10s` | SIGTERM-to-SIGKILL grace period for engine container stop. |
+| `BACKEND_STACK_LABEL` | no | — | Optional value stamped as `galaxy.stack=<value>` on every engine container backend spawns. Lets host-side tooling (Makefile / CI) scope cleanup to one dev stack. Empty → label is not applied. |
 | `BACKEND_NOTIFICATION_ADMIN_EMAIL` | no | — | Recipient address for admin-channel notifications (`runtime.*` kinds). When empty, admin-channel routes are recorded as `skipped` and the catalog is partially silenced. |
 | `BACKEND_NOTIFICATION_WORKER_INTERVAL` | no | `5s` | Notification route worker scan interval. |
 | `BACKEND_NOTIFICATION_MAX_ATTEMPTS` | no | `8` | Notification route delivery attempts before dead-lettering. |
@@ -153,10 +154,10 @@ seeded `admin_accounts` ahead of time.
   before the HTTP listener opens. The startup path also issues a
   `CREATE SCHEMA IF NOT EXISTS backend` so a fresh database does not
   trip goose's bookkeeping table on the first migration.
-- Pre-production uses one migration file (`00001_init.sql`) covering
-  every backend domain (auth, user, admin, lobby, runtime, mail,
-  notification, geo). Future migrations are sequence-numbered and
-  additive.
+- Migrations are sequence-numbered (`0000N_*.sql`) and applied
+  additively. `00001_init.sql` is the historical baseline; every
+  schema change after it is a new file with a higher prefix. See
+  `internal/postgres/migrations/README.md` for the authoring rules.
 - Queries are written through `go-jet/jet/v2`. The generated code is in
   `internal/postgres/jet/backend/` and is committed; `internal/postgres/jet/jet.go`
   carries package metadata that survives regeneration.
@@ -28,10 +28,11 @@ test stack. The list mirrors the steady-state behaviour documented in
 ## Migrations
 
 `pressly/goose/v3` applies embedded migrations from
-`internal/postgres/migrations/`. The pre-production set ships as
-`00001_init.sql` plus additive numbered files. Backend always runs
-`CREATE SCHEMA IF NOT EXISTS backend` before goose so a fresh database
-does not trip the bookkeeping table on the first migration.
+`internal/postgres/migrations/`. Migrations are additive,
+sequence-numbered files (`00001_init.sql` is the baseline). Backend
+always runs `CREATE SCHEMA IF NOT EXISTS backend` before goose so a
+fresh database does not trip the bookkeeping table on the first
+migration.
 
 `internal/postgres/migrations_test.go` asserts that the migration
 produces the expected table set; adding a table without updating the
@@ -91,6 +91,7 @@ const (
 	envRuntimeContainerPIDsLimit  = "BACKEND_RUNTIME_CONTAINER_PIDS_LIMIT"
 	envRuntimeContainerStateMount = "BACKEND_RUNTIME_CONTAINER_STATE_MOUNT"
 	envRuntimeStopGracePeriod     = "BACKEND_RUNTIME_STOP_GRACE_PERIOD"
+	envRuntimeStackLabel          = "BACKEND_STACK_LABEL"
 
 	envNotificationAdminEmail     = "BACKEND_NOTIFICATION_ADMIN_EMAIL"
 	envNotificationWorkerInterval = "BACKEND_NOTIFICATION_WORKER_INTERVAL"
@@ -409,6 +410,14 @@ type RuntimeConfig struct {
 	// StopGracePeriod is the docker stop SIGTERM-to-SIGKILL grace period
 	// applied during stop / cancel / restart / patch.
 	StopGracePeriod time.Duration
+
+	// StackLabel is the optional value backend stamps as
+	// `galaxy.stack=<value>` on every engine container it spawns. It
+	// lets host-side tooling (Makefile, CI workflows) scope cleanup
+	// operations to a single dev stack without touching unrelated
+	// workloads on the same Docker daemon. When empty, the label is
+	// not applied.
+	StackLabel string
 }
 
 // DiplomailConfig bounds the diplomatic-mail subsystem. Both limits
@@ -705,6 +714,7 @@ func LoadFromEnv() (Config, error) {
 	if cfg.Runtime.StopGracePeriod, err = loadDuration(envRuntimeStopGracePeriod, cfg.Runtime.StopGracePeriod); err != nil {
 		return Config{}, err
 	}
+	cfg.Runtime.StackLabel = strings.TrimSpace(loadString(envRuntimeStackLabel, cfg.Runtime.StackLabel))
 
 	cfg.Notification.AdminEmail = loadString(envNotificationAdminEmail, cfg.Notification.AdminEmail)
 	if cfg.Notification.WorkerInterval, err = loadDuration(envNotificationWorkerInterval, cfg.Notification.WorkerInterval); err != nil {
@@ -1,26 +1,46 @@
 # Backend migrations
 
-Goose migrations embedded into the backend binary by `embed.go`. Applied
-at startup before any listener opens (see `internal/postgres`).
+Goose (`pressly/goose/v3`) migrations embedded into the backend binary
+by `embed.go`. Applied at startup before any listener opens — see
+`internal/postgres`.
 
-## Pre-production single-file rule
+## Authoring conventions
 
-**While the platform is not yet in production, every schema change goes
-into the existing `00001_init.sql` file** rather than a new
-`00002_*`-prefixed file. The intent is to keep the schema in one
-canonical place so reviewers and developers do not have to reconstruct
-the latest shape from a chain of incremental migrations.
+- Each schema change is a new file with a monotonically increasing
+  numeric prefix and a snake-case slug:
+  `0000N_short_description.sql`. Reuse of a prefix is forbidden once
+  the file is merged.
+- `00001_init.sql` is the historical baseline. Treat it as immutable
+  history; do not edit it to land new schema. Squashing the chain back
+  into a fresh `00001` is reserved for the explicit pre-production
+  cut-over.
+- Every file MUST contain both an `-- +goose Up` and `-- +goose Down`
+  section, even if Down is a single `DROP …` for the same artefacts.
+  Down migrations are exercised by the schema test and serve as the
+  documented rollback path.
+- Destructive changes (dropping columns/tables, renaming with data
+  loss) MUST be split into at least two migrations so the chain stays
+  rollable forward and backward without coordinated code+schema
+  windows:
+  1. add the new shape, dual-write the data, leave the old shape in
+     place;
+  2. once all readers have switched, drop the old shape in a follow-up
+     migration.
+- Migrations are applied automatically on backend startup, so a fresh
+  push to `development` plus the `dev-deploy.yaml` workflow brings the
+  long-lived dev database up to head without manual intervention.
+  `make -C tools/dev-deploy clean-data` is only needed when a developer
+  deliberately wants a fresh database.
+- The integration harness (`backend/internal/postgres/migrations_test.go`)
+  spins up a disposable Postgres per run and asserts the final table
+  set. When a migration adds or removes tables, update the expected
+  list in the same patch.
 
-Operationally this means that pulling a branch with schema changes
-requires a fresh database — the only consumer today is local development
-and integration tests, both of which spin up disposable Postgres
-instances.
+## Pre-production squash
 
-> **Remove this rule before the first production deployment.** From
-> that point on every schema change must be a new migration file with a
-> monotonically increasing prefix, and `00001_init.sql` becomes
-> immutable history.
-
-If you need to make a change, edit `00001_init.sql` directly. Down
-migrations should still be kept in sync (they live at the bottom of the
-file — currently a single `DROP SCHEMA backend CASCADE`).
+The chain may be squashed back into one clean `00001_init.sql` before
+the first production deployment. That is a deliberate, one-time
+operation; until then, additive numbered files are the rule. After the
+squash this file gets a short note that `00001_init.sql` represents
+the production baseline and the policy above continues to apply for
+every later migration.
@@ -537,10 +537,7 @@ func (s *Service) runStart(ctx context.Context, op OperationLog) error {
 		Env: map[string]string{
 			"GAME_STATE_PATH": statePath,
 		},
-		Labels: map[string]string{
-			"galaxy.game_id":        gameID.String(),
-			"galaxy.engine_version": version.Version,
-		},
+		Labels: s.engineLabels(gameID.String(), version.Version),
 		BindMounts: []dockerclient.BindMount{
 			{
 				HostPath: hostStatePath,
@@ -735,10 +732,7 @@ func (s *Service) runPatch(ctx context.Context, op OperationLog, target EngineVe
 		Env: map[string]string{
 			"GAME_STATE_PATH": statePath,
 		},
-		Labels: map[string]string{
-			"galaxy.game_id":        op.GameID.String(),
-			"galaxy.engine_version": target.Version,
-		},
+		Labels: s.engineLabels(op.GameID.String(), target.Version),
 		BindMounts: []dockerclient.BindMount{
 			{HostPath: hostStatePath, MountPath: s.deps.Config.ContainerStateMount},
 		},
@@ -938,6 +932,30 @@ func (s *Service) upsertRuntimeRecord(ctx context.Context, in runtimeRecordInser
 // containers attach to. Wired from cfg.Docker.Network through Deps.
 func (s *Service) dockerNetwork() string { return s.deps.DockerNetwork }
 
+// engineLabels returns the label set stamped on every engine container
+// spawned for gameID running engineVersion. The runtime adapter merges
+// `dockerclient.ManagedLabel` separately; this helper covers the
+// game-scoped labels plus an optional `galaxy.stack=<value>` from the
+// runtime config so host-side tooling can scope cleanup by dev stack
+// without touching unrelated workloads.
+func (s *Service) engineLabels(gameID, engineVersion string) map[string]string {
+	return engineLabels(gameID, engineVersion, s.deps.Config.StackLabel)
+}
+
+// engineLabels is the side-effect-free part of `(*Service).engineLabels`,
+// exposed at package scope so unit tests can exercise the labelling
+// rules without building a full Service.
+func engineLabels(gameID, engineVersion, stackLabel string) map[string]string {
+	labels := map[string]string{
+		"galaxy.game_id":        gameID,
+		"galaxy.engine_version": engineVersion,
+	}
+	if stackLabel != "" {
+		labels["galaxy.stack"] = stackLabel
+	}
+	return labels
+}
+
 // waitForEngineHealthz polls the engine `/healthz` endpoint until it
 // responds 2xx or until the timeout elapses. The Docker daemon
 // reports a container as `running` as soon as the entrypoint starts,
@@ -0,0 +1,51 @@
+package runtime
+
+import "testing"
+
+func TestEngineLabels(t *testing.T) {
+	t.Parallel()
+
+	cases := []struct {
+		name       string
+		gameID     string
+		version    string
+		stackLabel string
+		want       map[string]string
+	}{
+		{
+			name:       "stack label omitted when empty",
+			gameID:     "11111111-1111-1111-1111-111111111111",
+			version:    "0.1.0",
+			stackLabel: "",
+			want: map[string]string{
+				"galaxy.game_id":        "11111111-1111-1111-1111-111111111111",
+				"galaxy.engine_version": "0.1.0",
+			},
+		},
+		{
+			name:       "stack label included when set",
+			gameID:     "22222222-2222-2222-2222-222222222222",
+			version:    "0.2.3",
+			stackLabel: "dev-deploy",
+			want: map[string]string{
+				"galaxy.game_id":        "22222222-2222-2222-2222-222222222222",
+				"galaxy.engine_version": "0.2.3",
+				"galaxy.stack":          "dev-deploy",
+			},
+		},
+	}
+	for _, tc := range cases {
+		t.Run(tc.name, func(t *testing.T) {
+			t.Parallel()
+			got := engineLabels(tc.gameID, tc.version, tc.stackLabel)
+			if len(got) != len(tc.want) {
+				t.Fatalf("len(labels) = %d, want %d (got %v)", len(got), len(tc.want), got)
+			}
+			for k, v := range tc.want {
+				if got[k] != v {
+					t.Errorf("labels[%q] = %q, want %q", k, got[k], v)
+				}
+			}
+		})
+	}
+}
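To illustrate why the backend stamps `galaxy.stack` on engine containers, here is a minimal in-memory sketch of the selection rule a label-scoped reaper could apply: only containers that carry the matching stack label and are in the exited or dead state are picked. The `container` type and `staleInStack` function are hypothetical and stand in for whatever the Makefiles and `dev-deploy.yaml` actually run against the Docker daemon; `engineLabels` is copied from the diff so the block compiles on its own.

```go
package main

// stackLabelKey is the label key introduced by this commit.
const stackLabelKey = "galaxy.stack"

// container is a hypothetical, minimal view of one entry in a
// container listing; it is not a Docker API type.
type container struct {
	Name   string
	State  string // for example "running", "exited", "dead"
	Labels map[string]string
}

// engineLabels mirrors the package-level helper added in the diff.
func engineLabels(gameID, engineVersion, stackLabel string) map[string]string {
	labels := map[string]string{
		"galaxy.game_id":        gameID,
		"galaxy.engine_version": engineVersion,
	}
	if stackLabel != "" {
		labels[stackLabelKey] = stackLabel
	}
	return labels
}

// staleInStack is a hypothetical selector: it returns the names of
// containers that belong to the given stack and are exited or dead.
// An empty stack name selects nothing, so unlabelled (legacy)
// containers are never swept up by accident.
func staleInStack(all []container, stack string) []string {
	if stack == "" {
		return nil
	}
	var names []string
	for _, c := range all {
		if c.Labels[stackLabelKey] != stack {
			continue // other stack, or no stack label at all
		}
		if c.State == "exited" || c.State == "dead" {
			names = append(names, c.Name)
		}
	}
	return names
}
```

With one running and one exited container labelled `dev-deploy`, an exited container labelled `local-dev`, and a dead container with no stack label, `staleInStack(all, "dev-deploy")` selects only the exited `dev-deploy` container. The other three are left alone, which is the isolation property the label is meant to provide.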