chore(ci): tidy CI/dev infra — drop local-ci, lift migration rule, scope by galaxy.stack label
Tests · Go / test (push) Successful in 2m6s
Tests · Go / test (pull_request) Successful in 3m1s
Tests · Integration / integration (pull_request) Successful in 1m42s

Five connected cleanups across the dev/CI infrastructure:

1. Drop tools/local-ci/. The standalone Gitea + act_runner stack was
   the legacy "offline workflow validator"; the per-stage CI gate now
   runs on gitea.lan and the directory was only retained as a
   fallback. Removing it leaves no operational dependency: backend,
   gateway, and game code have no references; documentation that
   pointed at it (CLAUDE.md, docs/ARCHITECTURE.md, ui/docs/testing.md,
   tools/dev-deploy/README.md, tools/local-dev/README.md) is updated
   in this same change. Historical "Verified on local-ci run N"
   markers in ui/PLAN.md are preserved unchanged.

2. Lift the pre-production single-migration rule. The rule forced
   every schema delta into 00001_init.sql and required a manual
   make clean-data wipe on every backward-incompatible change in
   tools/dev-deploy/. Future schema deltas now land as additive
   sequence-numbered files (00002_*.sql, …) that goose applies
   automatically on backend startup; 00001_init.sql becomes an
   immutable baseline. Authoring conventions live in
   backend/internal/postgres/migrations/README.md. The chain may be
   squashed back into a fresh 00001 as a deliberate one-time
   operation before the first production deployment.

3. Document the deployment cadence. The dev environment is
   single-tenant: pushes to feature/* run the test workflows
   (go-unit, ui-test, integration) only; dev-deploy.yaml fires on
   push to development. A workflow_dispatch override on
   dev-deploy.yaml lets a developer preview a feature branch on the
   shared dev environment before merge; the next merge into
   development overwrites the manual deploy idempotently.

4. Scope compose-managed resources by an explicit
   galaxy.stack=<local-dev|dev-deploy> label. Both compose files
   stamp the label on every service, network, and named volume.
   Makefiles in tools/local-dev/ and tools/dev-deploy/ filter their
   engine-cleanup operations by (stack-label AND engine OCI title)
   so they never touch unrelated workloads on the same daemon.
   dev-deploy.yaml gains a pre-`compose up` step that reaps stale
   exited/dead containers under the dev-deploy stack label.

5. Backend now stamps the same galaxy.stack=<value> label on every
   engine container it spawns, sourced from a new BACKEND_STACK_LABEL
   env var (empty → label not applied; legacy-safe). Both compose
   files set it to their stack name (local-dev / dev-deploy). The
   contract is recorded in docs/ARCHITECTURE.md under
   "Container labels". A package-level test in
   backend/internal/runtime exercises both the label-present and
   label-absent paths.
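
For illustration, the label-present and label-absent paths from item 5
could look like the minimal Go sketch below. Only the galaxy.stack key
and the BACKEND_STACK_LABEL variable come from this change; the names
engineLabels and stackLabelKey are hypothetical and are not the
identifiers used in backend/internal/runtime.

```go
package main

import (
	"fmt"
	"os"
)

// stackLabelKey is the label key named in this commit message.
const stackLabelKey = "galaxy.stack"

// engineLabels is a hypothetical helper: it returns the labels to put on
// a spawned engine container. An empty stack value leaves the label off,
// which is the legacy-safe path described in item 5.
func engineLabels(stack string) map[string]string {
	labels := map[string]string{}
	if stack != "" {
		labels[stackLabelKey] = stack
	}
	return labels
}

func main() {
	// An unset and an empty BACKEND_STACK_LABEL behave the same way here.
	labels := engineLabels(os.Getenv("BACKEND_STACK_LABEL"))
	fmt.Println(labels)
}
```

Returning an empty map instead of nil keeps the caller free to add
further labels without a nil check; that is a choice made for this
sketch, not a statement about the real code.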

No test regressions:
go test ./backend/internal/{config,runtime,dockerclient} is green,
both compose files validate cleanly, and the backend, gateway, and
game modules all build.
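
The (stack-label AND engine OCI title) selection from item 4 can be
read as a simple predicate over a container's labels. The Go sketch
below is only an illustration of that rule: the Makefiles implement it
with docker ps filters, and isStackEngine and both constants are
hypothetical names introduced here.

```go
package main

import "fmt"

const (
	stackLabelKey  = "galaxy.stack"
	engineTitleKey = "org.opencontainers.image.title"
)

// isStackEngine reports whether the labels carry BOTH the given stack
// label and the given engine image title. A container that lacks either
// label never matches, even when the wanted value is empty.
func isStackEngine(labels map[string]string, stack, engineTitle string) bool {
	gotStack, hasStack := labels[stackLabelKey]
	gotTitle, hasTitle := labels[engineTitleKey]
	return hasStack && hasTitle && gotStack == stack && gotTitle == engineTitle
}

func main() {
	// Made-up containers standing in for what shares one Docker daemon.
	containers := []struct {
		name   string
		labels map[string]string
	}{
		{"engine-local", map[string]string{stackLabelKey: "local-dev", engineTitleKey: "galaxy-game-engine"}},
		{"engine-deploy", map[string]string{stackLabelKey: "dev-deploy", engineTitleKey: "galaxy-game-engine"}},
		{"postgres", map[string]string{stackLabelKey: "local-dev"}},
		{"unrelated", map[string]string{}},
	}
	for _, c := range containers {
		fmt.Printf("%s selected=%v\n", c.name, isStackEngine(c.labels, "local-dev", "galaxy-game-engine"))
	}
}
```

Requiring both labels is what keeps compose-managed services (stack
label only) and engines from the other stack out of the cleanup set.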

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ilia Denisov
2026-05-18 23:32:42 +02:00
parent 5eec7013ba
commit a9087691a3
23 changed files with 325 additions and 532 deletions
+17 -6
@@ -5,9 +5,16 @@
COMPOSE := docker compose
REPO_ROOT := $(realpath $(CURDIR)/../..)
ENGINE_IMAGE := galaxy-engine:local-dev
-# Label set by the engine `Dockerfile` runtime stage; used to find
-# engine containers spawned by backend's runtime that fall outside
-# `docker compose down`'s scope.
+# Engine containers spawned by backend's runtime fall outside the
+# compose project. We identify them by two labels:
+# STACK_LABEL — backend stamps this on every engine it spawns from
+# this stack (see BACKEND_STACK_LABEL env in the
+# compose file);
+# ENGINE_LABEL — image-level OCI title baked into the engine
+# Dockerfile.
+# Both filters together select exactly this stack's engine containers
+# and never compose-managed services or unrelated workloads.
+STACK_LABEL := galaxy.stack=local-dev
ENGINE_LABEL := org.opencontainers.image.title=galaxy-game-engine
help:
@@ -65,9 +72,11 @@ clean: stop-engines
# cascade the game to `cancelled`. We only remove them as part of
# `clean`, where the whole DB is wiped anyway.
stop-engines:
-@ids=$$(docker ps -aq --filter label=$(ENGINE_LABEL)); \
+@ids=$$(docker ps -aq \
+--filter "label=$(STACK_LABEL)" \
+--filter "label=$(ENGINE_LABEL)"); \
if [ -n "$$ids" ]; then \
-echo "stopping engine containers"; \
+echo "stopping engine containers for $(STACK_LABEL)"; \
docker rm -f $$ids >/dev/null; \
fi
@@ -87,7 +96,9 @@ stop-engines:
# cycles.
prune-broken-engines:
@ids=""; \
-for cid in $$(docker ps -aq --filter label=$(ENGINE_LABEL) 2>/dev/null); do \
+for cid in $$(docker ps -aq \
+--filter "label=$(STACK_LABEL)" \
+--filter "label=$(ENGINE_LABEL)" 2>/dev/null); do \
state=$$(docker inspect -f '{{.State.Status}}' $$cid 2>/dev/null); \
case "$$state" in \
running|restarting) ;; \
+13 -12
@@ -15,10 +15,10 @@ This stack is **not** a CI gate (the per-stage CI gate now lives on
the **long-lived dev environment** at
[`tools/dev-deploy/`](../dev-deploy/README.md), which is redeployed on
every merge into `development` and is reachable as
-`https://www.galaxy.lan` / `https://api.galaxy.lan`. The three stacks
-(`tools/local-dev/`, `tools/dev-deploy/`, and the fallback
-`tools/local-ci/`) coexist on the same host because every name —
-compose project, container, network, volume — is distinct.
+`https://www.galaxy.lan` / `https://api.galaxy.lan`. The two stacks
+(`tools/local-dev/` and `tools/dev-deploy/`) coexist on the same host
+because every name — compose project, container, network, volume — is
+distinct.
## Bring it up
@@ -203,8 +203,8 @@ make status docker compose ps
images built on alpine (so `wget` is available for the compose
healthchecks). The build stage mirrors `backend/Dockerfile` and
`gateway/Dockerfile` exactly.
-- `Makefile` — wrapper over `docker compose` that keeps the muscle
-memory close to `tools/local-ci/`'s Makefile.
+- `Makefile` — wrapper over `docker compose` with thin targets for the
+most common dev cycles.
- `.env` — committed defaults for the compose `${VAR:-}`
expansions. Edit per-developer or override via your shell.
- `keys/gateway-response.pem`, `keys/gateway-response.pub` — dev-only
@@ -290,12 +290,13 @@ make status docker compose ps
## Relationship to other infrastructure
-- `tools/local-ci/` — Gitea + Actions runner, replays
-`.gitea/workflows/*` against a pushed branch. Different stack,
-different purpose; coexists with local-dev on the same machine.
+- `tools/dev-deploy/` — long-lived dev environment redeployed on every
+merge into `development`; reachable at `https://www.galaxy.lan` /
+`https://api.galaxy.lan`. Distinct compose project, container names,
+network and volumes.
- `integration/testenv/` — testcontainers harness used by
-`make -C integration integration`. Uses the same images
-(`backend/Dockerfile`, `gateway/Dockerfile`) at production
-defaults; do not confuse with this local-dev stack, which carries
+`make -C integration integration`. Uses the canonical
+`backend/Dockerfile` / `gateway/Dockerfile` at production defaults;
+do not confuse with this local-dev stack, which carries
alpine-runtime images for ergonomics and the dev-mode auth
override.
+17
@@ -19,11 +19,15 @@
# can log in without touching Mailpit. Real codes still arrive in
# Mailpit; both paths coexist.
name: galaxy-local-dev
services:
postgres:
image: postgres:16-alpine
container_name: galaxy-local-dev-postgres
restart: unless-stopped
labels:
galaxy.stack: local-dev
environment:
POSTGRES_USER: galaxy
POSTGRES_PASSWORD: galaxy
@@ -45,6 +49,8 @@ services:
image: redis:7-alpine
container_name: galaxy-local-dev-redis
restart: unless-stopped
labels:
galaxy.stack: local-dev
command:
- redis-server
- --requirepass
@@ -68,6 +74,8 @@ services:
image: axllent/mailpit:v1.21
container_name: galaxy-local-dev-mailpit
restart: unless-stopped
labels:
galaxy.stack: local-dev
ports:
- "${LOCAL_DEV_MAILPIT_PORT:-8025}:8025"
networks:
@@ -86,6 +94,8 @@ services:
image: galaxy/backend:local-dev
container_name: galaxy-local-dev-backend
restart: unless-stopped
labels:
galaxy.stack: local-dev
user: "0:0"
depends_on:
postgres:
@@ -102,6 +112,7 @@ services:
BACKEND_SMTP_FROM: "galaxy-backend@galaxy.local"
BACKEND_SMTP_TLS_MODE: none
BACKEND_DOCKER_NETWORK: galaxy-local-dev-net
BACKEND_STACK_LABEL: local-dev
BACKEND_GAME_STATE_ROOT: /tmp/galaxy-game-state
BACKEND_GEOIP_DB_PATH: /var/lib/galaxy/geoip.mmdb
BACKEND_NOTIFICATION_ADMIN_EMAIL: admin@galaxy.local
@@ -144,6 +155,8 @@ services:
image: galaxy/gateway:local-dev
container_name: galaxy-local-dev-gateway
restart: unless-stopped
labels:
galaxy.stack: local-dev
depends_on:
backend:
condition: service_healthy
@@ -205,7 +218,11 @@ services:
networks:
galaxy-net:
name: galaxy-local-dev-net
labels:
galaxy.stack: local-dev
volumes:
postgres-data:
name: galaxy-local-dev-postgres-data
labels:
galaxy.stack: local-dev