From a9087691a3b108fbea0a2a936b952a28cefebbb0 Mon Sep 17 00:00:00 2001
From: Ilia Denisov
Date: Mon, 18 May 2026 23:32:42 +0200
Subject: [PATCH] =?UTF-8?q?chore(ci):=20tidy=20CI/dev=20infra=20=E2=80=94?=
 =?UTF-8?q?=20drop=20local-ci,=20lift=20migration=20rule,=20scope=20by=20g?=
 =?UTF-8?q?alaxy.stack=20label?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Five connected cleanups across the dev/CI infrastructure:

1. Drop tools/local-ci/. The standalone Gitea + act_runner stack was the
legacy "offline workflow validator"; the per-stage CI gate now runs on
gitea.lan and the directory was only retained as a fallback. Removing it
leaves no operational dependency: backend, gateway, and game code have
no references; documentation that pointed at it (CLAUDE.md,
docs/ARCHITECTURE.md, ui/docs/testing.md, tools/dev-deploy/README.md,
tools/local-dev/README.md) is updated in this same change. Historical
"Verified on local-ci run N" markers in ui/PLAN.md are preserved
unchanged.

2. Lift the pre-production single-migration rule. The rule forced every
schema delta into 00001_init.sql and required a manual make clean-data
wipe on every backward-incompatible change in tools/dev-deploy/. Future
schema deltas now land as additive sequence-numbered files
(00002_*.sql, …) that goose applies automatically on backend startup;
00001_init.sql becomes an immutable baseline. Authoring conventions live
in backend/internal/postgres/migrations/README.md. The chain may be
squashed back into a fresh 00001 as a deliberate one-time operation
before the first production deployment.

3. Document the deployment cadence. The dev environment is
single-tenant: pushes to feature/* run the test workflows (go-unit,
ui-test, integration) only; dev-deploy.yaml fires on push to development.
A workflow_dispatch override on dev-deploy.yaml lets a developer preview
a feature branch on the shared dev environment before merge; the next
merge into development overwrites the manual deploy idempotently.

4. Scope compose-managed resources by an explicit galaxy.stack= label.
Both compose files stamp the label on every service, network, and named
volume. Makefiles in tools/local-dev/ and tools/dev-deploy/ filter their
engine-cleanup operations by (stack-label AND engine OCI title) so they
never touch unrelated workloads on the same daemon. dev-deploy.yaml
gains a pre-`compose up` step that reaps stale exited/created/dead
containers under the dev-deploy stack label.

5. Backend now stamps the same galaxy.stack= label on every engine
container it spawns, sourced from a new BACKEND_STACK_LABEL env var
(empty → label not applied; legacy-safe). Both compose files set it to
their stack name (local-dev / dev-deploy). The contract is recorded in
docs/ARCHITECTURE.md under "Container labels". A package-level test in
backend/internal/runtime exercises both the label-present and
label-absent paths.

No tests intentionally regressed:
go test ./backend/internal/{config,runtime,dockerclient} is green, both
compose files validate cleanly, and the backend, gateway, and game
modules all build.

Co-Authored-By: Claude Opus 4.7 (1M context) --- .gitea/workflows/dev-deploy.yaml | 18 +++ CLAUDE.md | 46 +++++--- backend/README.md | 9 +- backend/docs/runbook.md | 9 +- backend/internal/config/config.go | 10 ++ .../internal/postgres/migrations/README.md | 60 ++++++---- backend/internal/runtime/service.go | 34 ++++-- .../internal/runtime/service_internal_test.go | 51 +++++++++ docs/ARCHITECTURE.md | 34 +++++- tools/dev-deploy/Makefile | 9 +- tools/dev-deploy/README.md | 36 ++++-- tools/dev-deploy/docker-compose.yml | 21 ++++ tools/local-ci/.gitignore | 1 - tools/local-ci/Makefile | 42 ------- tools/local-ci/README.md | 106 ------------------ tools/local-ci/bootstrap.sh | 86 -------------- tools/local-ci/config.yaml | 35 ------ tools/local-ci/docker-compose.override.yml | 16 --- tools/local-ci/docker-compose.yml | 78 ------------- tools/local-dev/Makefile | 23 +++- tools/local-dev/README.md | 25 +++-- tools/local-dev/docker-compose.yml | 17 +++ ui/docs/testing.md | 91 ++------------- 23 files changed, 325 insertions(+), 532 deletions(-) create mode 100644 backend/internal/runtime/service_internal_test.go delete mode 100644 tools/local-ci/.gitignore delete mode 100644 tools/local-ci/Makefile delete mode 100644 tools/local-ci/README.md delete mode 100755 tools/local-ci/bootstrap.sh delete mode 100644 tools/local-ci/config.yaml delete mode 100644 tools/local-ci/docker-compose.override.yml delete mode 100644 tools/local-ci/docker-compose.yml diff --git a/.gitea/workflows/dev-deploy.yaml b/.gitea/workflows/dev-deploy.yaml index 3eb6305..589cd7d 100644 --- a/.gitea/workflows/dev-deploy.yaml +++ b/.gitea/workflows/dev-deploy.yaml @@ -104,6 +104,24 @@ jobs: -v "${{ gitea.workspace }}/ui/frontend/build:/src:ro" \ alpine sh -c 'rm -rf /dst/* /dst/.??* 2>/dev/null; cp -a /src/. /dst/' + - name: Reap stray dev-deploy containers + run: | + # Remove any non-running compose-managed containers from + # earlier deploys before `compose up`. 
Filter by the stack + # label so we never touch unrelated workloads on the same + # daemon. Running containers (incl. engine instances backend + # spawned itself with the same label) are left intact — + # those are reattached by the backend reconciler on boot. + ids=$(docker ps -aq \ + --filter "label=galaxy.stack=dev-deploy" \ + --filter "status=exited" \ + --filter "status=created" \ + --filter "status=dead") + if [ -n "$ids" ]; then + echo "reaping: $ids" + docker rm -f $ids + fi + - name: Bring up the stack working-directory: tools/dev-deploy run: | diff --git a/CLAUDE.md b/CLAUDE.md index a58e51e..e1e330e 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -46,7 +46,7 @@ Branches: it auto-deploys to the dev environment via `dev-deploy.yaml` (reachable at `https://www.galaxy.lan` / `https://api.galaxy.lan`). - `feature/*` — short-lived branches off `development`. Merged back - via PR; only then do they reach the dev environment. + via PR; only then do they reach the dev environment automatically. Workflows in `.gitea/workflows/`: @@ -55,10 +55,24 @@ Workflows in `.gitea/workflows/`: | `go-unit.yaml` | push + PR matching Go paths | Fast Go unit tests. | | `ui-test.yaml` | push + PR matching `ui/**` | Vitest + Playwright. | | `integration.yaml` | PR to `development`/`main`; push to `development` | testcontainers integration suite. | -| `dev-deploy.yaml` | push to `development` | Build images + (re)deploy to `tools/dev-deploy/`. | +| `dev-deploy.yaml` | push to `development`; `workflow_dispatch` on any ref | Build images + (re)deploy to `tools/dev-deploy/`. | | `prod-build.yaml` | push to `main` | Build prod images and `docker save` into artifacts. | | `deploy-prod.yaml` | `workflow_dispatch` | Manual rollout (placeholder until prod host exists). | +### Deployment cadence + +The long-lived dev environment (`tools/dev-deploy/`) is single-tenant: +one live deployment, redeployed on every merge into `development`. 
+While a PR is open the dev environment stays on whatever was last +merged — pushes to `feature/*` only fire the test workflows +(`go-unit`, `ui-test`, `integration`), not `dev-deploy.yaml`. + +To preview an unmerged feature branch on the shared dev environment, +trigger `dev-deploy.yaml` manually from the Gitea UI +(**Actions → Deploy · Dev → Run workflow**) and pick the feature ref. +The deploy is idempotent: the next merge into `development` simply +overwrites whatever the manual dispatch left behind. + ## Per-stage CI gate Every completed stage from any `PLAN.md` (per-service or `ui/PLAN.md`) @@ -72,10 +86,6 @@ short version: 4. Only after every workflow that fired is `success` may the stage be marked done in the corresponding `PLAN.md`. -`tools/local-ci/` is now an opt-in fallback for testing workflow -changes without `gitea.lan` (offline iterations, runner-isolation -debugging). It is no longer required for the per-stage gate. - ## Decisions during stage implementation Stages from `PLAN.md` produce decisions. Those decisions never live in a @@ -102,18 +112,22 @@ The existing codebase of `galaxy/` may be modified or extended when a plan stage requires it. All such changes must be covered by new or updated tests and reflected in documentation when they affect documented behavior. -## Pre-production migration rule +## Migrations -The platform is not yet in production. Schema changes for `backend` go -into the existing `backend/internal/postgres/migrations/00001_init.sql` -file rather than into new `00002_*`-prefixed files. Local databases and -integration test harnesses are recreated from scratch on every pull. +Schema changes for `backend` go into a new `0000N_*.sql` file under +`backend/internal/postgres/migrations/` with a monotonically increasing +prefix. `00001_init.sql` is the historical baseline and stays +immutable; every subsequent change is its own additive migration with +matching Up/Down sides. 
`pressly/goose/v3` (embedded into the backend +binary) applies pending migrations on startup, so the long-lived dev +environment picks up schema deltas without a manual reset. -**This rule is removed before the first production deployment.** From -that point on every schema change becomes a new migration file with a -monotonically increasing prefix, and `00001_init.sql` becomes immutable -history. See `backend/internal/postgres/migrations/README.md` for -details. +Before the first production deployment the migration chain may be +squashed back into a single fresh `00001_init.sql` for a clean slate; +plan that work as an explicit task when it lands. See +`backend/internal/postgres/migrations/README.md` for the local +authoring conventions (file naming, transactional vs. non-transactional +sections, backward-compatible deletes, rollback expectations). ## Documentation discipline diff --git a/backend/README.md b/backend/README.md index 27505cf..a27452e 100644 --- a/backend/README.md +++ b/backend/README.md @@ -129,6 +129,7 @@ fast. | `BACKEND_RUNTIME_CONTAINER_PIDS_LIMIT` | no | `256` | Engine container `--pids-limit`. | | `BACKEND_RUNTIME_CONTAINER_STATE_MOUNT` | no | `/var/lib/galaxy-game` | Absolute in-container path for the per-game state bind mount. | | `BACKEND_RUNTIME_STOP_GRACE_PERIOD` | no | `10s` | SIGTERM-to-SIGKILL grace period for engine container stop. | +| `BACKEND_STACK_LABEL` | no | — | Optional value stamped as `galaxy.stack=` on every engine container backend spawns. Lets host-side tooling (Makefile / CI) scope cleanup to one dev stack. Empty → label is not applied. | | `BACKEND_NOTIFICATION_ADMIN_EMAIL` | no | — | Recipient address for admin-channel notifications (`runtime.*` kinds). When empty, admin-channel routes are recorded as `skipped` and the catalog is partially silenced. | | `BACKEND_NOTIFICATION_WORKER_INTERVAL` | no | `5s` | Notification route worker scan interval. 
| | `BACKEND_NOTIFICATION_MAX_ATTEMPTS` | no | `8` | Notification route delivery attempts before dead-lettering. | @@ -153,10 +154,10 @@ seeded `admin_accounts` ahead of time. before the HTTP listener opens. The startup path also issues a `CREATE SCHEMA IF NOT EXISTS backend` so a fresh database does not trip goose's bookkeeping table on the first migration. -- Pre-production uses one migration file (`00001_init.sql`) covering - every backend domain (auth, user, admin, lobby, runtime, mail, - notification, geo). Future migrations are sequence-numbered and - additive. +- Migrations are sequence-numbered (`0000N_*.sql`) and applied + additively. `00001_init.sql` is the historical baseline; every + schema change after it is a new file with a higher prefix. See + `internal/postgres/migrations/README.md` for the authoring rules. - Queries are written through `go-jet/jet/v2`. The generated code is in `internal/postgres/jet/backend/` and is committed; `internal/postgres/jet/jet.go` carries package metadata that survives regeneration. diff --git a/backend/docs/runbook.md b/backend/docs/runbook.md index 9d28e38..3ab5869 100644 --- a/backend/docs/runbook.md +++ b/backend/docs/runbook.md @@ -28,10 +28,11 @@ test stack. The list mirrors the steady-state behaviour documented in ## Migrations `pressly/goose/v3` applies embedded migrations from -`internal/postgres/migrations/`. The pre-production set ships as -`00001_init.sql` plus additive numbered files. Backend always runs -`CREATE SCHEMA IF NOT EXISTS backend` before goose so a fresh database -does not trip the bookkeeping table on the first migration. +`internal/postgres/migrations/`. Migrations are additive, +sequence-numbered files (`00001_init.sql` is the baseline). Backend +always runs `CREATE SCHEMA IF NOT EXISTS backend` before goose so a +fresh database does not trip the bookkeeping table on the first +migration. 
`internal/postgres/migrations_test.go` asserts that the migration produces the expected table set; adding a table without updating the diff --git a/backend/internal/config/config.go b/backend/internal/config/config.go index bd981ab..ee6cae9 100644 --- a/backend/internal/config/config.go +++ b/backend/internal/config/config.go @@ -91,6 +91,7 @@ const ( envRuntimeContainerPIDsLimit = "BACKEND_RUNTIME_CONTAINER_PIDS_LIMIT" envRuntimeContainerStateMount = "BACKEND_RUNTIME_CONTAINER_STATE_MOUNT" envRuntimeStopGracePeriod = "BACKEND_RUNTIME_STOP_GRACE_PERIOD" + envRuntimeStackLabel = "BACKEND_STACK_LABEL" envNotificationAdminEmail = "BACKEND_NOTIFICATION_ADMIN_EMAIL" envNotificationWorkerInterval = "BACKEND_NOTIFICATION_WORKER_INTERVAL" @@ -409,6 +410,14 @@ type RuntimeConfig struct { // StopGracePeriod is the docker stop SIGTERM-to-SIGKILL grace period // applied during stop / cancel / restart / patch. StopGracePeriod time.Duration + + // StackLabel is the optional value backend stamps as + // `galaxy.stack=` on every engine container it spawns. It + // lets host-side tooling (Makefile, CI workflows) scope cleanup + // operations to a single dev stack without touching unrelated + // workloads on the same Docker daemon. When empty, the label is + // not applied. + StackLabel string } // DiplomailConfig bounds the diplomatic-mail subsystem. 
Both limits @@ -705,6 +714,7 @@ func LoadFromEnv() (Config, error) { if cfg.Runtime.StopGracePeriod, err = loadDuration(envRuntimeStopGracePeriod, cfg.Runtime.StopGracePeriod); err != nil { return Config{}, err } + cfg.Runtime.StackLabel = strings.TrimSpace(loadString(envRuntimeStackLabel, cfg.Runtime.StackLabel)) cfg.Notification.AdminEmail = loadString(envNotificationAdminEmail, cfg.Notification.AdminEmail) if cfg.Notification.WorkerInterval, err = loadDuration(envNotificationWorkerInterval, cfg.Notification.WorkerInterval); err != nil { diff --git a/backend/internal/postgres/migrations/README.md b/backend/internal/postgres/migrations/README.md index fdb85cf..d131f49 100644 --- a/backend/internal/postgres/migrations/README.md +++ b/backend/internal/postgres/migrations/README.md @@ -1,26 +1,46 @@ # Backend migrations -Goose migrations embedded into the backend binary by `embed.go`. Applied -at startup before any listener opens (see `internal/postgres`). +Goose (`pressly/goose/v3`) migrations embedded into the backend binary +by `embed.go`. Applied at startup before any listener opens — see +`internal/postgres`. -## Pre-production single-file rule +## Authoring conventions -**While the platform is not yet in production, every schema change goes -into the existing `00001_init.sql` file** rather than a new -`00002_*`-prefixed file. The intent is to keep the schema in one -canonical place so reviewers and developers do not have to reconstruct -the latest shape from a chain of incremental migrations. +- Each schema change is a new file with a monotonically increasing + numeric prefix and a snake-case slug: + `0000N_short_description.sql`. Reuse of a prefix is forbidden once + the file is merged. +- `00001_init.sql` is the historical baseline. Treat it as immutable + history; do not edit it to land new schema. Squashing the chain back + into a fresh `00001` is reserved for the explicit pre-production + cut-over. 
+- Every file MUST contain both an `-- +goose Up` and `-- +goose Down` + section, even if Down is a single `DROP …` for the same artefacts. + Down migrations are exercised by the schema test and serve as the + documented rollback path. +- Destructive changes (dropping columns/tables, renaming with data + loss) MUST be split into at least two migrations so the chain stays + rollable forward and backward without coordinated code+schema + windows: + 1. add the new shape, dual-write the data, leave the old shape in + place; + 2. once all readers have switched, drop the old shape in a follow-up + migration. +- Migrations are applied automatically on backend startup, so a fresh + push to `development` plus the `dev-deploy.yaml` workflow brings the + long-lived dev database up to head without manual intervention. + `make -C tools/dev-deploy clean-data` is only needed when a developer + deliberately wants a fresh database. +- The integration harness (`backend/internal/postgres/migrations_test.go`) + spins up a disposable Postgres per run and asserts the final table + set. When a migration adds or removes tables, update the expected + list in the same patch. -Operationally this means that pulling a branch with schema changes -requires a fresh database — the only consumer today is local development -and integration tests, both of which spin up disposable Postgres -instances. +## Pre-production squash -> **Remove this rule before the first production deployment.** From -> that point on every schema change must be a new migration file with a -> monotonically increasing prefix, and `00001_init.sql` becomes -> immutable history. - -If you need to make a change, edit `00001_init.sql` directly. Down -migrations should still be kept in sync (they live at the bottom of the -file — currently a single `DROP SCHEMA backend CASCADE`). +The chain may be squashed back into one clean `00001_init.sql` before +the first production deployment. 
That is a deliberate, one-time +operation; until then, additive numbered files are the rule. After the +squash this file gets a short note that `00001_init.sql` represents +the production baseline and the policy above continues to apply for +every later migration. diff --git a/backend/internal/runtime/service.go b/backend/internal/runtime/service.go index d4495e1..a7d13e8 100644 --- a/backend/internal/runtime/service.go +++ b/backend/internal/runtime/service.go @@ -537,10 +537,7 @@ func (s *Service) runStart(ctx context.Context, op OperationLog) error { Env: map[string]string{ "GAME_STATE_PATH": statePath, }, - Labels: map[string]string{ - "galaxy.game_id": gameID.String(), - "galaxy.engine_version": version.Version, - }, + Labels: s.engineLabels(gameID.String(), version.Version), BindMounts: []dockerclient.BindMount{ { HostPath: hostStatePath, @@ -735,10 +732,7 @@ func (s *Service) runPatch(ctx context.Context, op OperationLog, target EngineVe Env: map[string]string{ "GAME_STATE_PATH": statePath, }, - Labels: map[string]string{ - "galaxy.game_id": op.GameID.String(), - "galaxy.engine_version": target.Version, - }, + Labels: s.engineLabels(op.GameID.String(), target.Version), BindMounts: []dockerclient.BindMount{ {HostPath: hostStatePath, MountPath: s.deps.Config.ContainerStateMount}, }, @@ -938,6 +932,30 @@ func (s *Service) upsertRuntimeRecord(ctx context.Context, in runtimeRecordInser // containers attach to. Wired from cfg.Docker.Network through Deps. func (s *Service) dockerNetwork() string { return s.deps.DockerNetwork } +// engineLabels returns the label set stamped on every engine container +// spawned for gameID running engineVersion. The runtime adapter merges +// `dockerclient.ManagedLabel` separately; this helper covers the +// game-scoped labels plus an optional `galaxy.stack=` from the +// runtime config so host-side tooling can scope cleanup by dev stack +// without touching unrelated workloads. 
+func (s *Service) engineLabels(gameID, engineVersion string) map[string]string { + return engineLabels(gameID, engineVersion, s.deps.Config.StackLabel) +} + +// engineLabels is the side-effect-free part of `(*Service).engineLabels`, +// exposed at package scope so unit tests can exercise the labelling +// rules without building a full Service. +func engineLabels(gameID, engineVersion, stackLabel string) map[string]string { + labels := map[string]string{ + "galaxy.game_id": gameID, + "galaxy.engine_version": engineVersion, + } + if stackLabel != "" { + labels["galaxy.stack"] = stackLabel + } + return labels +} + // waitForEngineHealthz polls the engine `/healthz` endpoint until it // responds 2xx or until the timeout elapses. The Docker daemon // reports a container as `running` as soon as the entrypoint starts, diff --git a/backend/internal/runtime/service_internal_test.go b/backend/internal/runtime/service_internal_test.go new file mode 100644 index 0000000..12a5420 --- /dev/null +++ b/backend/internal/runtime/service_internal_test.go @@ -0,0 +1,51 @@ +package runtime + +import "testing" + +func TestEngineLabels(t *testing.T) { + t.Parallel() + + cases := []struct { + name string + gameID string + version string + stackLabel string + want map[string]string + }{ + { + name: "stack label omitted when empty", + gameID: "11111111-1111-1111-1111-111111111111", + version: "0.1.0", + stackLabel: "", + want: map[string]string{ + "galaxy.game_id": "11111111-1111-1111-1111-111111111111", + "galaxy.engine_version": "0.1.0", + }, + }, + { + name: "stack label included when set", + gameID: "22222222-2222-2222-2222-222222222222", + version: "0.2.3", + stackLabel: "dev-deploy", + want: map[string]string{ + "galaxy.game_id": "22222222-2222-2222-2222-222222222222", + "galaxy.engine_version": "0.2.3", + "galaxy.stack": "dev-deploy", + }, + }, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + got := engineLabels(tc.gameID, tc.version, 
tc.stackLabel) + if len(got) != len(tc.want) { + t.Fatalf("len(labels) = %d, want %d (got %v)", len(got), len(tc.want), got) + } + for k, v := range tc.want { + if got[k] != v { + t.Errorf("labels[%q] = %q, want %q", k, got[k], v) + } + } + }) + } +} diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index 53665ae..a2dec3e 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -808,10 +808,17 @@ Workflows under `.gitea/workflows/`: | `go-unit.yaml` | push + PR matching Go paths | Fast Go unit tests. | | `ui-test.yaml` | push + PR matching `ui/**` | Vitest + Playwright. | | `integration.yaml` | PR to `development` / `main`; push to `development` | testcontainers integration suite. | -| `dev-deploy.yaml` | push to `development` | Build images, seed UI volume, `compose up` against `tools/dev-deploy/`. | +| `dev-deploy.yaml` | push to `development`; `workflow_dispatch` on any ref | Build images, seed UI volume, `compose up` against `tools/dev-deploy/`. | | `prod-build.yaml` | push to `main` | Build production images and persist `docker save` bundles as artifacts. | | `deploy-prod.yaml` | manual `workflow_dispatch` | Placeholder for the future SSH-based production rollout. | +Deployment cadence: the dev environment is single-tenant. Pushes to +`feature/*` branches run only the test workflows; `dev-deploy.yaml` +does not auto-fire. To preview a feature branch on the shared dev +environment, trigger `dev-deploy.yaml` manually from the Gitea UI +against the desired ref. The deploy is idempotent — the next merge +into `development` overwrites the manually deployed state. + Environments: - **`tools/local-dev/`** — single-developer playground. Bound to @@ -823,9 +830,28 @@ Environments: and are shipped to the production host via `docker save` → `ssh prod docker load` → `docker compose up -d`. -`tools/local-ci/` remains as an opt-in fallback runner for testing -workflow changes without `gitea.lan`. 
It is no longer part of the -per-stage CI gate; see `CLAUDE.md` for the gate definition. +### Container labels + +Every Docker resource Galaxy creates carries an opinionated label so +that host-side tooling (Makefiles, CI workflows, `preclean.sh`) can +scope its operations to Galaxy-owned objects and never touch unrelated +workloads on the shared daemon. + +| Label | Values | Set by | Used by | +|-------|--------|--------|---------| +| `galaxy.stack` | `local-dev`, `dev-deploy`, `integration` | `tools/{local-dev,dev-deploy}/docker-compose.yml` for compose-managed resources; backend reads `BACKEND_STACK_LABEL` and stamps engines it spawns. | `tools/{local-dev,dev-deploy}/Makefile`, `.gitea/workflows/dev-deploy.yaml`. | +| `galaxy.backend` | `1` | `backend/internal/dockerclient` adapter on every engine container. | `integration/scripts/preclean.sh`. | +| `galaxy.game_id` | `` | Backend on engine create. | Reconciler reattach loop. | +| `galaxy.engine_version` | `` | Backend on engine create. | Reconciler version checks. | +| `galaxy.test.kind` | `integration-image` | `integration/testenv/images.go` on local image builds. | `integration/scripts/preclean.sh` (filter for `docker rmi`). | +| `org.testcontainers` | `true` | `testcontainers-go` (automatic). | `integration/scripts/preclean.sh`. | + +The contract: any Makefile target, CI step, or script that issues +`docker rm` / `docker rmi` / `docker network rm` MUST scope itself via +one of the labels above. Compose-managed resources are additionally +scoped by their compose project name (`galaxy-dev`, `galaxy-local-dev`), +which Compose enforces on `docker compose up/down`; the labels make the +contract explicit and survive hand-rolled cleanup commands as well. ## 19. Deployment Topology (informational) diff --git a/tools/dev-deploy/Makefile b/tools/dev-deploy/Makefile index 4e154b8..e9f1260 100644 --- a/tools/dev-deploy/Makefile +++ b/tools/dev-deploy/Makefile @@ -4,6 +4,7 @@ REPO_ROOT := $(realpath $(CURDIR)/../..) 
ENGINE_IMAGE := galaxy-engine:dev +STACK_LABEL := galaxy.stack=dev-deploy ENGINE_LABEL := org.opencontainers.image.title=galaxy-game-engine # Game-state root lives under the invoking user's home by default so # `make up` works without sudo. Override `GALAXY_DEV_GAME_STATE_DIR` @@ -93,12 +94,14 @@ psql: clean-data: @echo "Stopping containers and engines, then wiping volumes + game-state…" - @ids=$$(docker ps -aq --filter label=$(ENGINE_LABEL)); \ + $(COMPOSE) down -v + @ids=$$(docker ps -aq \ + --filter "label=$(STACK_LABEL)" \ + --filter "label=$(ENGINE_LABEL)"); \ if [ -n "$$ids" ]; then \ - echo "stopping engine containers…"; \ + echo "stopping engine containers for $(STACK_LABEL)…"; \ docker rm -f $$ids >/dev/null; \ fi - $(COMPOSE) down -v @if [ -d "$(GALAXY_DEV_GAME_STATE_DIR)" ]; then \ echo "wiping $(GALAXY_DEV_GAME_STATE_DIR)…"; \ docker run --rm -v "$(GALAXY_DEV_GAME_STATE_DIR):/state" alpine sh -c 'rm -rf /state/*' 2>/dev/null \ diff --git a/tools/dev-deploy/README.md b/tools/dev-deploy/README.md index 5d3f68c..fb308a7 100644 --- a/tools/dev-deploy/README.md +++ b/tools/dev-deploy/README.md @@ -135,17 +135,20 @@ exec galaxy-mailpit wget -qO- localhost:8025/messages` and similar. ## Persistent state and schema changes The dev Postgres volume `galaxy-dev-postgres-data` survives redeploys. -Until the pre-production migration rule is lifted, every -backward-incompatible change to `backend/internal/postgres/migrations/00001_init.sql` -needs a manual wipe before the next deploy succeeds: +Schema deltas land as additive, sequence-numbered migration files +(`backend/internal/postgres/migrations/0000N_*.sql`) and `pressly/goose` +applies them on backend startup without operator action. 
+ +Use `make -C tools/dev-deploy clean-data` only when you deliberately +want a fresh database (debugging schema drift, exercising the +bootstrap path from scratch, etc.): ```sh make -C tools/dev-deploy clean-data make -C tools/dev-deploy up ``` -This is the same caveat as `tools/local-dev/`, just with a different -volume name. +The same volume-persistence model applies to `tools/local-dev/`. ## Make targets @@ -183,13 +186,30 @@ See [`KNOWN-ISSUES.md`](KNOWN-ISSUES.md) for symptoms that surface in the long-lived dev environment but are not yet fixed (currently: the sandbox game flipping to `cancelled` after a redispatch). +## Deployment cadence + +This environment is single-tenant: one live deployment, redeployed by +the `dev-deploy.yaml` workflow on every merge into `development`. PR +branches do not auto-deploy here — pushes to `feature/*` only run the +test workflows (`go-unit`, `ui-test`, `integration`). + +To put a feature branch on the shared dev environment before its PR +merges (e.g. to validate a UI flow against the real Caddy edge), run +the workflow manually: + +1. Push the branch (`git push gitea HEAD`). +2. Gitea UI → **Actions → Deploy · Dev → Run workflow**, pick the + feature ref. + +The deploy is idempotent — when the PR later merges into +`development`, the regular push trigger fires the same packaging and +healthcheck steps, overwriting whatever the manual dispatch left +behind. There is no separate state to clean up between the two paths. + ## Relationship to other infrastructure - `tools/local-dev/` — single-developer playground, host-port mapped, Vite dev server on the side. Recommended for active UI work. -- `tools/local-ci/` — Gitea + act runner for **fallback** workflow - testing without `gitea.lan`. Optional, not part of the per-stage CI - gate anymore. - `.gitea/workflows/dev-deploy.yaml` — the CI side of this stack: builds images, seeds the UI volume, runs `docker compose up -d` on every merge into `development`. 
The Makefile in this directory is diff --git a/tools/dev-deploy/docker-compose.yml b/tools/dev-deploy/docker-compose.yml index 0962b3e..dbc0cc1 100644 --- a/tools/dev-deploy/docker-compose.yml +++ b/tools/dev-deploy/docker-compose.yml @@ -22,6 +22,8 @@ services: image: postgres:16-alpine container_name: galaxy-dev-postgres restart: unless-stopped + labels: + galaxy.stack: dev-deploy environment: POSTGRES_USER: galaxy POSTGRES_PASSWORD: galaxy @@ -41,6 +43,8 @@ services: image: redis:7-alpine container_name: galaxy-dev-redis restart: unless-stopped + labels: + galaxy.stack: dev-deploy command: - redis-server - --requirepass @@ -62,6 +66,8 @@ services: image: axllent/mailpit:v1.21 container_name: galaxy-dev-mailpit restart: unless-stopped + labels: + galaxy.stack: dev-deploy networks: - galaxy-internal healthcheck: @@ -78,6 +84,8 @@ services: image: galaxy/backend:dev container_name: galaxy-dev-backend restart: unless-stopped + labels: + galaxy.stack: dev-deploy user: "0:0" depends_on: galaxy-postgres: @@ -94,6 +102,7 @@ services: BACKEND_SMTP_FROM: "galaxy-backend@galaxy.lan" BACKEND_SMTP_TLS_MODE: none BACKEND_DOCKER_NETWORK: galaxy-dev-internal + BACKEND_STACK_LABEL: dev-deploy BACKEND_GAME_STATE_ROOT: ${GALAXY_DEV_GAME_STATE_DIR} BACKEND_GEOIP_DB_PATH: /var/lib/galaxy/geoip.mmdb BACKEND_NOTIFICATION_ADMIN_EMAIL: admin@galaxy.lan @@ -152,6 +161,8 @@ services: image: galaxy/gateway:dev container_name: galaxy-dev-api restart: unless-stopped + labels: + galaxy.stack: dev-deploy depends_on: galaxy-backend: condition: service_healthy @@ -209,6 +220,8 @@ services: image: caddy:2.11.2-alpine container_name: galaxy-dev-caddy restart: unless-stopped + labels: + galaxy.stack: dev-deploy depends_on: galaxy-api: condition: service_healthy @@ -225,6 +238,8 @@ networks: name: galaxy-dev-internal driver: bridge internal: false + labels: + galaxy.stack: dev-deploy edge: name: ${GALAXY_EDGE_NETWORK:-edge} external: true @@ -232,7 +247,13 @@ networks: volumes: 
galaxy-dev-postgres-data: name: galaxy-dev-postgres-data + labels: + galaxy.stack: dev-deploy galaxy-dev-caddy-data: name: galaxy-dev-caddy-data + labels: + galaxy.stack: dev-deploy galaxy-dev-ui-dist: name: galaxy-dev-ui-dist + labels: + galaxy.stack: dev-deploy diff --git a/tools/local-ci/.gitignore b/tools/local-ci/.gitignore deleted file mode 100644 index 4c49bd7..0000000 --- a/tools/local-ci/.gitignore +++ /dev/null @@ -1 +0,0 @@ -.env diff --git a/tools/local-ci/Makefile b/tools/local-ci/Makefile deleted file mode 100644 index 82be826..0000000 --- a/tools/local-ci/Makefile +++ /dev/null @@ -1,42 +0,0 @@ -.PHONY: help up down logs status clean push - -.DEFAULT_GOAL := help - -COMPOSE := docker compose -GITEA_USER := galaxy -GITEA_PASS := galaxy-dev -REPO_NAME := galaxy -REMOTE_NAME := local-gitea -REPO_ROOT := $(realpath $(CURDIR)/../..) -GIT := git -C $(REPO_ROOT) -REMOTE_URL := http://$(GITEA_USER):$(GITEA_PASS)@localhost:3000/$(GITEA_USER)/$(REPO_NAME).git - -help: - @echo "Local Gitea CI for galaxy:" - @echo " make up Bring up Gitea + runner (idempotent)" - @echo " make down Stop both containers" - @echo " make logs Tail logs" - @echo " make status Show container status" - @echo " make push Push current branch to local Gitea" - @echo " make clean Stop and wipe all local state" - -up: - @./bootstrap.sh - -down: - $(COMPOSE) down - -logs: - $(COMPOSE) logs -f --tail=50 - -status: - $(COMPOSE) ps - -push: - @$(GIT) remote get-url $(REMOTE_NAME) >/dev/null 2>&1 || \ - $(GIT) remote add $(REMOTE_NAME) $(REMOTE_URL) - $(GIT) push $(REMOTE_NAME) HEAD - -clean: - $(COMPOSE) down -v - rm -f .env diff --git a/tools/local-ci/README.md b/tools/local-ci/README.md deleted file mode 100644 index 1115f01..0000000 --- a/tools/local-ci/README.md +++ /dev/null @@ -1,106 +0,0 @@ -# Local Gitea CI (fallback) - -> **Status:** fallback / opt-in. The primary CI target is now -> `gitea.lan` with its host-mode `act_runner`. 
The per-stage CI gate -> closes against `gitea.lan`, not against this stack. Use this -> directory when you want to validate `.gitea/workflows/*` without -> reaching `gitea.lan` — for example, when iterating on a workflow -> file from a flight without LAN access — or when isolating a runner -> issue from production-shaped infrastructure. - -Self-contained Gitea + Actions runner for verifying -`.gitea/workflows/*` honestly before pushing to `gitea.lan`. Runs -natively on arm64 (Apple Silicon) — every image below has an arm64 -variant, so Docker pulls the right architecture and the runner -executes workflow steps without QEMU emulation. - -## Prerequisites - -- Docker (Colima or Docker Desktop) -- `python3`, `curl`, `bash` — all built into macOS - -## First time - -```sh -make -C tools/local-ci up -``` - -This: - -1. brings up the Gitea container; -2. creates an admin user (`galaxy` / `galaxy-dev`); -3. creates the `galaxy/galaxy` repo; -4. fetches a runner registration token from the Gitea API; -5. brings up the runner with that token (the runner persists its - credentials in a Docker volume and ignores the token on subsequent - restarts). - -The script is idempotent — re-running it is safe. - -## Pushing a branch - -```sh -make -C tools/local-ci push -``` - -This adds a `local-gitea` remote on the first run and then pushes the -current `HEAD`. Equivalent manual flow: - -```sh -git remote add local-gitea \ - http://galaxy:galaxy-dev@localhost:3000/galaxy/galaxy.git -git push local-gitea HEAD -``` - -The Tier 1 workflow fires on `push` to any branch and the Tier 2 -workflow fires on tags matching `v*`. 
Watch runs at: - - - -## Operational targets - -| Target | What it does | -| ---------------- | -------------------------------------------- | -| `make up` | Bring up Gitea + runner (idempotent) | -| `make down` | Stop both containers (state preserved) | -| `make logs` | Tail logs from both containers | -| `make status` | Show container status | -| `make push` | Push current `HEAD` to local Gitea | -| `make clean` | Stop and wipe all local state (full reset) | - -## What's in the box - -| Component | Image | Role | -| ---------- | ---------------------------------- | ------------------------------------------- | -| Gitea | `gitea/gitea:1.23` | Server with SQLite backend | -| act_runner | `gitea/act_runner:0.6.1` | Single-capacity runner registered on boot | -| Workflow | `catthehacker/ubuntu:act-latest` | Image spawned per job (multi-arch) | - -The runner mounts the host Docker socket and spawns workflow -containers on the same Docker network as Gitea, so -`actions/checkout` reaches the server at `http://gitea:3000` from -inside spawned containers. - -## Caveats - -- Gitea's `ROOT_URL` is set to `http://gitea:3000/` so spawned - workflow containers reach the server through the compose network. - The web UI works at `http://localhost:3000` via port mapping, but - copy-paste URLs in the UI may show `gitea:3000` instead of - `localhost:3000`. Harmless for local dev; switch the host part by - hand when copying. -- The runner is single-capacity (`runner.capacity: 1` in - `config.yaml`). Concurrent jobs queue. Bump if you need parallel - jobs. -- First push from a fresh checkout uploads the full repo history - (~tens of MB). Subsequent pushes are deltas. -- `actions/upload-artifact@v4` requires Gitea ≥ 1.21 — we pin - `1.23` to stay above the cutoff. -- Workflow steps run as `root` inside the spawned container; this - matches the upstream catthehacker behaviour. Keep that in mind if - you add steps that touch host-mounted directories. 
-- On Apple Silicon the runner image and its catthehacker child run - natively as arm64. Some pre-built tools that ship in the image are - amd64-only and would fall back to QEMU; `setup-go`, `setup-node`, - and `pnpm/action-setup` all download arm64 binaries themselves, so - the workflow steps we care about stay native. diff --git a/tools/local-ci/bootstrap.sh b/tools/local-ci/bootstrap.sh deleted file mode 100755 index 7e81dc1..0000000 --- a/tools/local-ci/bootstrap.sh +++ /dev/null @@ -1,86 +0,0 @@ -#!/usr/bin/env bash -# Bring up Gitea, create the admin user and the galaxy/galaxy repo, -# fetch a runner registration token, bring up the runner. -# Idempotent — re-runnable. -set -euo pipefail - -cd "$(dirname "$0")" - -GITEA_USER=galaxy -GITEA_PASS=galaxy-dev -GITEA_EMAIL=galaxy@local -REPO_NAME=galaxy -GITEA_URL=http://localhost:3000 - -echo ">>> Bringing up Gitea..." -docker compose up -d gitea - -echo ">>> Waiting for Gitea API..." -for _ in $(seq 1 120); do - if curl -fsS "${GITEA_URL}/api/v1/version" >/dev/null 2>&1; then - echo "Gitea is up." - break - fi - sleep 1 -done - -if ! curl -fsS "${GITEA_URL}/api/v1/version" >/dev/null 2>&1; then - echo "Gitea did not come up within 120 seconds." >&2 - docker compose logs gitea | tail -30 >&2 - exit 1 -fi - -echo ">>> Creating admin user (idempotent)..." -docker compose exec -T gitea su git -c " - gitea admin user create \ - --username ${GITEA_USER} \ - --password ${GITEA_PASS} \ - --email ${GITEA_EMAIL} \ - --admin \ - --must-change-password=false 2>&1 || true -" - -echo ">>> Creating repo (idempotent)..." -HTTP_CODE=$(curl -s -o /dev/null -w '%{http_code}' \ - -u "${GITEA_USER}:${GITEA_PASS}" \ - -H "Content-Type: application/json" \ - -d "{\"name\":\"${REPO_NAME}\",\"private\":true,\"auto_init\":false}" \ - "${GITEA_URL}/api/v1/user/repos") -case "${HTTP_CODE}" in - 201) echo "Repo created." ;; - 409) echo "Repo already exists." ;; - *) - echo "Unexpected response (${HTTP_CODE}) creating repo." 
>&2 - exit 1 - ;; -esac - -echo ">>> Fetching runner registration token..." -RUNNER_TOKEN=$(curl -fsS \ - -u "${GITEA_USER}:${GITEA_PASS}" \ - "${GITEA_URL}/api/v1/admin/runners/registration-token" \ - | python3 -c "import json, sys; print(json.load(sys.stdin)['token'])") - -# act_runner uses RUNNER_TOKEN only on the first boot. After registration -# it persists credentials in the named runner-data volume (/data/.runner) -# and ignores the env token on subsequent restarts. Writing a fresh token -# every time is harmless. -echo "RUNNER_TOKEN=${RUNNER_TOKEN}" > .env - -echo ">>> Bringing up runner..." -docker compose up -d runner - -cat </dev/null || true - git push local-gitea HEAD - - open http://localhost:3000/${GITEA_USER}/${REPO_NAME}/actions - -Or use \`make push\` from this directory. -EOF diff --git a/tools/local-ci/config.yaml b/tools/local-ci/config.yaml deleted file mode 100644 index 8f34468..0000000 --- a/tools/local-ci/config.yaml +++ /dev/null @@ -1,35 +0,0 @@ -# act_runner configuration. -# -# The `ubuntu-latest` label is mapped to catthehacker/ubuntu:act-latest, -# which is multi-arch — Docker on Apple Silicon pulls the arm64 variant -# and runs it natively (no QEMU). The same image is what `act` uses -# locally, so workflows behave the same. - -log: - level: info - -runner: - file: /data/.runner - capacity: 1 - fetch_timeout: 5s - fetch_interval: 2s - labels: - - "ubuntu-latest:docker://catthehacker/ubuntu:act-latest" - -cache: - enabled: true - dir: /data/cache - -container: - # Spawned workflow containers join the same network as Gitea so - # actions/checkout and other steps can reach the server at - # http://gitea:3000. 
- network: galaxy-local-gitea-net - privileged: false - options: "" - workdir_parent: "" - valid_volumes: [] - force_pull: false - -host: - workdir_parent: "" diff --git a/tools/local-ci/docker-compose.override.yml b/tools/local-ci/docker-compose.override.yml deleted file mode 100644 index f1555ff..0000000 --- a/tools/local-ci/docker-compose.override.yml +++ /dev/null @@ -1,16 +0,0 @@ -# Local-only override: this developer's host already runs another -# Gitea instance bound to 0.0.0.0:3000 and 0.0.0.0:2222, so the -# default port mappings in docker-compose.yml conflict. Remap the -# local-ci Gitea to 13000 (HTTP) and 12222 (SSH) on the host. The -# in-network ports stay 3000 / 22 — runners and workflow containers -# keep reaching Gitea by hostname through the compose network. -# -# This file is intentionally NOT committed to the repo; it captures -# per-host port allocation. Use `make -C tools/local-ci push` only -# after pointing the `local-gitea` git remote at the override port. - -services: - gitea: - ports: !override - - "13000:3000" - - "12222:22" diff --git a/tools/local-ci/docker-compose.yml b/tools/local-ci/docker-compose.yml deleted file mode 100644 index 2586dcb..0000000 --- a/tools/local-ci/docker-compose.yml +++ /dev/null @@ -1,78 +0,0 @@ -# Local Gitea + Actions runner for verifying .gitea/workflows/*. -# Runs natively on arm64 (Apple Silicon) — every image below is multi-arch. -# -# Browser: http://localhost:3000 -# API: http://localhost:3000/api/v1 -# Push URL: http://galaxy:galaxy-dev@localhost:3000/galaxy/galaxy.git -# Actions: http://localhost:3000/galaxy/galaxy/actions -# -# `bootstrap.sh` (or `make up`) brings everything up and registers the -# runner. State persists in named Docker volumes; `make clean` wipes them. 
- -services: - gitea: - image: gitea/gitea:1.23 - container_name: galaxy-local-gitea - restart: unless-stopped - environment: - USER_UID: "1000" - USER_GID: "1000" - GITEA__database__DB_TYPE: sqlite3 - GITEA__database__PATH: /data/gitea/gitea.db - # ROOT_URL uses the in-network hostname so the runner and spawned - # workflow containers reach Gitea through the compose network. - # The browser still works at http://localhost:3000 via the port - # mapping below; UI-generated copy URLs may show "gitea:3000", - # which is harmless for local dev. - GITEA__server__ROOT_URL: http://gitea:3000/ - GITEA__server__SSH_PORT: "2222" - GITEA__actions__ENABLED: "true" - GITEA__security__INSTALL_LOCK: "true" - GITEA__service__DISABLE_REGISTRATION: "true" - ports: - - "3000:3000" - - "2222:22" - volumes: - - gitea-data:/data - networks: - - gitea-net - healthcheck: - test: - - CMD-SHELL - - wget -q -O- http://localhost:3000/api/v1/version >/dev/null || exit 1 - interval: 5s - timeout: 3s - retries: 30 - start_period: 5s - - runner: - image: gitea/act_runner:0.6.1 - container_name: galaxy-local-runner - restart: unless-stopped - depends_on: - gitea: - condition: service_healthy - environment: - CONFIG_FILE: /config/config.yaml - GITEA_INSTANCE_URL: http://gitea:3000 - # Provided by bootstrap.sh in the .env file. After the first - # successful registration, act_runner persists credentials in - # /data/.runner and ignores this token on subsequent restarts. 
- GITEA_RUNNER_REGISTRATION_TOKEN: ${RUNNER_TOKEN:-} - GITEA_RUNNER_NAME: galaxy-local - volumes: - - /var/run/docker.sock:/var/run/docker.sock - - runner-data:/data - - ./config.yaml:/config/config.yaml:ro - networks: - - gitea-net - -networks: - gitea-net: - name: galaxy-local-gitea-net - -volumes: - gitea-data: - name: galaxy-local-gitea-data - runner-data: - name: galaxy-local-runner-data diff --git a/tools/local-dev/Makefile b/tools/local-dev/Makefile index d4444c8..4981f23 100644 --- a/tools/local-dev/Makefile +++ b/tools/local-dev/Makefile @@ -5,9 +5,16 @@ COMPOSE := docker compose REPO_ROOT := $(realpath $(CURDIR)/../..) ENGINE_IMAGE := galaxy-engine:local-dev -# Label set by the engine `Dockerfile` runtime stage; used to find -# engine containers spawned by backend's runtime that fall outside -# `docker compose down`'s scope. +# Engine containers spawned by backend's runtime fall outside the +# compose project. We identify them by two labels: +# STACK_LABEL — backend stamps this on every engine it spawns from +# this stack (see BACKEND_STACK_LABEL env in the +# compose file); +# ENGINE_LABEL — image-level OCI title baked into the engine +# Dockerfile. +# Both filters together select exactly this stack's engine containers +# and never compose-managed services or unrelated workloads. +STACK_LABEL := galaxy.stack=local-dev ENGINE_LABEL := org.opencontainers.image.title=galaxy-game-engine help: @@ -65,9 +72,11 @@ clean: stop-engines # cascade the game to `cancelled`. We only remove them as part of # `clean`, where the whole DB is wiped anyway. stop-engines: - @ids=$$(docker ps -aq --filter label=$(ENGINE_LABEL)); \ + @ids=$$(docker ps -aq \ + --filter "label=$(STACK_LABEL)" \ + --filter "label=$(ENGINE_LABEL)"); \ if [ -n "$$ids" ]; then \ - echo "stopping engine containers…"; \ + echo "stopping engine containers for $(STACK_LABEL)…"; \ docker rm -f $$ids >/dev/null; \ fi @@ -87,7 +96,9 @@ stop-engines: # cycles. 
prune-broken-engines: @ids=""; \ - for cid in $$(docker ps -aq --filter label=$(ENGINE_LABEL) 2>/dev/null); do \ + for cid in $$(docker ps -aq \ + --filter "label=$(STACK_LABEL)" \ + --filter "label=$(ENGINE_LABEL)" 2>/dev/null); do \ state=$$(docker inspect -f '{{.State.Status}}' $$cid 2>/dev/null); \ case "$$state" in \ running|restarting) ;; \ diff --git a/tools/local-dev/README.md b/tools/local-dev/README.md index d15a405..428172b 100644 --- a/tools/local-dev/README.md +++ b/tools/local-dev/README.md @@ -15,10 +15,10 @@ This stack is **not** a CI gate (the per-stage CI gate now lives on the **long-lived dev environment** at [`tools/dev-deploy/`](../dev-deploy/README.md), which is redeployed on every merge into `development` and is reachable as -`https://www.galaxy.lan` / `https://api.galaxy.lan`. The three stacks -(`tools/local-dev/`, `tools/dev-deploy/`, and the fallback -`tools/local-ci/`) coexist on the same host because every name — -compose project, container, network, volume — is distinct. +`https://www.galaxy.lan` / `https://api.galaxy.lan`. The two stacks +(`tools/local-dev/` and `tools/dev-deploy/`) coexist on the same host +because every name — compose project, container, network, volume — is +distinct. ## Bring it up @@ -203,8 +203,8 @@ make status docker compose ps images built on alpine (so `wget` is available for the compose healthchecks). The build stage mirrors `backend/Dockerfile` and `gateway/Dockerfile` exactly. -- `Makefile` — wrapper over `docker compose` that keeps the muscle - memory close to `tools/local-ci/`'s Makefile. +- `Makefile` — wrapper over `docker compose` with thin targets for the + most common dev cycles. - `.env` — committed defaults for the compose `${VAR:-}` expansions. Edit per-developer or override via your shell. 
- `keys/gateway-response.pem`, `keys/gateway-response.pub` — dev-only @@ -290,12 +290,13 @@ make status docker compose ps ## Relationship to other infrastructure -- `tools/local-ci/` — Gitea + Actions runner, replays - `.gitea/workflows/*` against a pushed branch. Different stack, - different purpose; coexists with local-dev on the same machine. +- `tools/dev-deploy/` — long-lived dev environment redeployed on every + merge into `development`; reachable at `https://www.galaxy.lan` / + `https://api.galaxy.lan`. Distinct compose project, container names, + network and volumes. - `integration/testenv/` — testcontainers harness used by - `make -C integration integration`. Uses the same images - (`backend/Dockerfile`, `gateway/Dockerfile`) at production - defaults; do not confuse with this local-dev stack, which carries + `make -C integration integration`. Uses the canonical + `backend/Dockerfile` / `gateway/Dockerfile` at production defaults; + do not confuse with this local-dev stack, which carries alpine-runtime images for ergonomics and the dev-mode auth override. diff --git a/tools/local-dev/docker-compose.yml b/tools/local-dev/docker-compose.yml index ac6d724..9e52c39 100644 --- a/tools/local-dev/docker-compose.yml +++ b/tools/local-dev/docker-compose.yml @@ -19,11 +19,15 @@ # can log in without touching Mailpit. Real codes still arrive in # Mailpit; both paths coexist. 
+name: galaxy-local-dev + services: postgres: image: postgres:16-alpine container_name: galaxy-local-dev-postgres restart: unless-stopped + labels: + galaxy.stack: local-dev environment: POSTGRES_USER: galaxy POSTGRES_PASSWORD: galaxy @@ -45,6 +49,8 @@ services: image: redis:7-alpine container_name: galaxy-local-dev-redis restart: unless-stopped + labels: + galaxy.stack: local-dev command: - redis-server - --requirepass @@ -68,6 +74,8 @@ services: image: axllent/mailpit:v1.21 container_name: galaxy-local-dev-mailpit restart: unless-stopped + labels: + galaxy.stack: local-dev ports: - "${LOCAL_DEV_MAILPIT_PORT:-8025}:8025" networks: @@ -86,6 +94,8 @@ services: image: galaxy/backend:local-dev container_name: galaxy-local-dev-backend restart: unless-stopped + labels: + galaxy.stack: local-dev user: "0:0" depends_on: postgres: @@ -102,6 +112,7 @@ services: BACKEND_SMTP_FROM: "galaxy-backend@galaxy.local" BACKEND_SMTP_TLS_MODE: none BACKEND_DOCKER_NETWORK: galaxy-local-dev-net + BACKEND_STACK_LABEL: local-dev BACKEND_GAME_STATE_ROOT: /tmp/galaxy-game-state BACKEND_GEOIP_DB_PATH: /var/lib/galaxy/geoip.mmdb BACKEND_NOTIFICATION_ADMIN_EMAIL: admin@galaxy.local @@ -144,6 +155,8 @@ services: image: galaxy/gateway:local-dev container_name: galaxy-local-dev-gateway restart: unless-stopped + labels: + galaxy.stack: local-dev depends_on: backend: condition: service_healthy @@ -205,7 +218,11 @@ services: networks: galaxy-net: name: galaxy-local-dev-net + labels: + galaxy.stack: local-dev volumes: postgres-data: name: galaxy-local-dev-postgres-data + labels: + galaxy.stack: local-dev diff --git a/ui/docs/testing.md b/ui/docs/testing.md index 3fe103c..396b4ac 100644 --- a/ui/docs/testing.md +++ b/ui/docs/testing.md @@ -106,8 +106,6 @@ addition to the real Mailpit code; see for the full runbook (regenerating the dev keypair, switching the mode off, troubleshooting common boot issues). 
-The local-dev stack is independent from the local-ci stack below; -they bind different ports and can run side by side. ## Synthetic reports for visual testing (DEV) @@ -159,92 +157,19 @@ record in the parser's `README.md` that the new field cannot be derived from legacy text. This keeps the synthetic-mode coverage in step with the contract as the UI grows. -## Local CI verification +## CI verification -`tools/local-ci/` ships a self-contained Gitea + Actions runner via -docker-compose so workflow changes are exercised end-to-end on a real -runner before pushing to a remote Gitea instance. On Apple Silicon -the runner and every spawned workflow container are arm64-native -(no QEMU). Full runbook lives in -[`../../tools/local-ci/README.md`](../../tools/local-ci/README.md); -the cheat sheet below covers the operations needed when working a -phase that touches CI. +Workflow changes are exercised on the primary CI host (`gitea.lan`). +Push the branch (`git push gitea …`), then open the run in the Gitea +UI to inspect the status and logs. See `CLAUDE.md` (`## Per-stage CI +gate`) for the per-stage workflow. -### Bring up / push / tear down - -```sh -make -C tools/local-ci up # idempotent: gitea + runner + admin user + repo -make -C tools/local-ci push # add `local-gitea` remote (first call) and push HEAD -make -C tools/local-ci status # docker compose ps -make -C tools/local-ci logs # tail container logs -make -C tools/local-ci down # stop, keep state -make -C tools/local-ci clean # stop and wipe volumes for a fresh start -``` - -Default credentials baked in: `galaxy:galaxy-dev` (admin user, also -the owner of the `galaxy/galaxy` repo). Web UI on -; runs at -. - -### Inspect a run from the shell - -The Gitea Actions API is on `http://localhost:3000/api/v1` with basic -auth. Useful for verifying a workflow change without opening the -browser: - -```sh -# Latest workflow runs — `status` is a human-readable string here: -# "running" / "success" / "failure" / "cancelled". 
-curl -s -u galaxy:galaxy-dev \ - 'http://localhost:3000/api/v1/repos/galaxy/galaxy/actions/tasks?limit=5' \ - | python3 -m json.tool - -# Tight one-liner for the latest run only: -curl -s -u galaxy:galaxy-dev \ - 'http://localhost:3000/api/v1/repos/galaxy/galaxy/actions/tasks?limit=1' \ - | python3 -c 'import json, sys; r=json.load(sys.stdin)["workflow_runs"][0]; print(r["run_number"], r["status"], r["display_title"])' -``` - -Step-by-step workflow output is stored zstd-compressed under -`/data/gitea/actions_log/galaxy/galaxy//.log.zst` -inside the gitea container: - -```sh -docker compose -f tools/local-ci/docker-compose.yml exec -T gitea sh -c ' - apk add --quiet zstd - zstdcat /data/gitea/actions_log/galaxy/galaxy/01/1.log.zst -' | less -``` - -`` is the run number, zero-padded to two digits -(`01`, `02`, …); `` is the 1-based index of the job -inside that run (only `1` for the current single-job workflows). - -### Typical phase workflow - -When a phase changes anything under `.gitea/workflows/` or surfaces -new tests in CI: - -1. Local sanity first — run the affected commands directly - (`pnpm test`, `pnpm exec playwright test`, the targeted - `go test ./...` slice). -2. Commit and `make -C tools/local-ci push`. -3. Poll the API for the latest run; once it leaves `running`, - inspect status. On failure pull the log via the snippet above. -4. Fix and repeat. The runner is always-on; each push triggers a - fresh run (test cache is cleared by `-count=1` so a green run is - honest). - -### Quick syntax-only dry-run with `act` - -For a sub-second check that the workflow YAML is well-formed and -action references resolve, without pulling images and without -running anything: +For a sub-second syntax check of a workflow YAML without pulling +images or running anything: ```sh act -W .gitea/workflows/ui-test.yaml -n push ``` `act` doesn't honour Gitea-specific behaviours (artifact storage, -secrets, run triggers). 
Use it for syntax checks; fall back to the -local Gitea above for honest end-to-end verification. +secrets, run triggers); use it only for syntax checks.
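Reviewer note: the two-label scoping that `stop-engines` and `prune-broken-engines` rely on can be sketched in isolation. This is a minimal illustration, assuming the label values shown in the Makefile hunks above; `engine_filter_args` is a hypothetical helper and not part of this patch. It prints the command rather than running it, so it is safe on a host without Docker.

```sh
#!/bin/sh
# Sketch only: mirrors the (stack label AND engine OCI title) filter
# used by the Makefiles. engine_filter_args is a hypothetical helper.
ENGINE_LABEL="org.opencontainers.image.title=galaxy-game-engine"

engine_filter_args() {
  # $1 is the stack name, e.g. local-dev or dev-deploy.
  printf '%s %s' \
    "--filter label=galaxy.stack=$1" \
    "--filter label=${ENGINE_LABEL}"
}

# Print the composed command instead of executing it.
echo "docker ps -aq $(engine_filter_args local-dev)"
```

To list the matching containers for real, drop the `echo` and run `docker ps -aq $(engine_filter_args local-dev)`, leaving the substitution unquoted so the flags split into separate arguments. Containers that carry only one of the two labels, such as the compose-managed services, are not selected.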