chore(ci): tidy CI/dev infra — drop local-ci, lift migration rule, scope by galaxy.stack label

Five connected cleanups across the dev/CI infrastructure:

1. Drop tools/local-ci/. The standalone Gitea + act_runner stack was
   the legacy "offline workflow validator"; the per-stage CI gate now
   runs on gitea.lan and the directory was only retained as a
   fallback. Removing it leaves no operational dependency: backend,
   gateway, and game code have no references; documentation that
   pointed at it (CLAUDE.md, docs/ARCHITECTURE.md, ui/docs/testing.md,
   tools/dev-deploy/README.md, tools/local-dev/README.md) is updated
   in this same change. Historical "Verified on local-ci run N"
   markers in ui/PLAN.md are preserved unchanged.

2. Lift the pre-production single-migration rule. The rule forced
   every schema delta into 00001_init.sql and required a manual
   make clean-data wipe on every backward-incompatible change in
   tools/dev-deploy/. Future schema deltas now land as additive
   sequence-numbered files (00002_*.sql, …) that goose applies
   automatically on backend startup; 00001_init.sql becomes an
   immutable baseline. Authoring conventions live in
   backend/internal/postgres/migrations/README.md. The chain may be
   squashed back into a fresh 00001 as a deliberate one-time
   operation before the first production deployment.

3. Document the deployment cadence. The dev environment is
   single-tenant: pushes to feature/* run the test workflows
   (go-unit, ui-test, integration) only; dev-deploy.yaml fires on
   push to development. A workflow_dispatch override on
   dev-deploy.yaml lets a developer preview a feature branch on the
   shared dev environment before merge; the next merge into
   development overwrites the manual deploy idempotently.

4. Scope compose-managed resources by an explicit
   galaxy.stack=<local-dev|dev-deploy> label. Both compose files
   stamp the label on every service, network, and named volume.
   Makefiles in tools/local-dev/ and tools/dev-deploy/ filter their
   engine-cleanup operations by (stack-label AND engine OCI title)
   so they never touch unrelated workloads on the same daemon.
   dev-deploy.yaml gains a pre-`compose up` step that reaps stale
   exited/dead containers under the dev-deploy stack label.

5. Backend now stamps the same galaxy.stack=<value> label on every
   engine container it spawns, sourced from a new BACKEND_STACK_LABEL
   env var (empty → label not applied; legacy-safe). Both compose
   files set it to their stack name (local-dev / dev-deploy). The
   contract is recorded in docs/ARCHITECTURE.md under
   "Container labels". A package-level test in
   backend/internal/runtime exercises both the label-present and
   label-absent paths.
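
As a minimal sketch of the scoping described in items 4 and 5, the
helper below prints (without running it) a `docker ps` listing command
restricted to one stack's stale engine containers. The function name
reap_list_cmd, the title value galaxy-engine, and the reading of
"engine OCI title" as the org.opencontainers.image.title label are
assumptions for illustration; this is not the recipe the Makefiles or
dev-deploy.yaml actually use.

```sh
#!/bin/sh
# Hypothetical helper: print the command that would list stale
# containers belonging to a single stack. Nothing is executed against
# the Docker daemon; the command is only echoed for inspection.
reap_list_cmd() {
    stack="$1"   # galaxy.stack value, e.g. local-dev or dev-deploy
    title="$2"   # assumed engine image title (illustrative value)
    # Two label filters narrow the match to (stack AND title); the two
    # status filters select containers that are exited or dead.
    echo "docker ps -aq" \
        "--filter label=galaxy.stack=${stack}" \
        "--filter label=org.opencontainers.image.title=${title}" \
        "--filter status=exited --filter status=dead"
}

reap_list_cmd dev-deploy galaxy-engine
```

Printing the command first makes it easy to confirm the filter set
before piping the resulting IDs into a removal step; containers
without the galaxy.stack label are never selected.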

No tests regressed: go test ./backend/internal/{config,runtime,dockerclient}
is green, both compose files validate cleanly, and the backend,
gateway, and game modules all build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ilia Denisov
2026-05-18 23:32:42 +02:00
parent 5eec7013ba
commit a9087691a3
23 changed files with 325 additions and 532 deletions
+8 -83
@@ -106,8 +106,6 @@ addition to the real Mailpit code; see
 for the full runbook (regenerating the dev keypair, switching the
 mode off, troubleshooting common boot issues).
-The local-dev stack is independent from the local-ci stack below;
-they bind different ports and can run side by side.
 ## Synthetic reports for visual testing (DEV)
@@ -159,92 +157,19 @@ record in the parser's `README.md` that the new field cannot be
 derived from legacy text. This keeps the synthetic-mode coverage in
 step with the contract as the UI grows.
-## Local CI verification
+## CI verification
-`tools/local-ci/` ships a self-contained Gitea + Actions runner via
-docker-compose so workflow changes are exercised end-to-end on a real
-runner before pushing to a remote Gitea instance. On Apple Silicon
-the runner and every spawned workflow container are arm64-native
-(no QEMU). Full runbook lives in
-[`../../tools/local-ci/README.md`](../../tools/local-ci/README.md);
-the cheat sheet below covers the operations needed when working a
-phase that touches CI.
+Workflow changes are exercised on the primary CI host (`gitea.lan`).
+Push the branch (`git push gitea …`), then open the run in the Gitea
+UI to inspect the status and logs. See `CLAUDE.md` (`## Per-stage CI
+gate`) for the per-stage workflow.
-### Bring up / push / tear down
-```sh
-make -C tools/local-ci up # idempotent: gitea + runner + admin user + repo
-make -C tools/local-ci push # add `local-gitea` remote (first call) and push HEAD
-make -C tools/local-ci status # docker compose ps
-make -C tools/local-ci logs # tail container logs
-make -C tools/local-ci down # stop, keep state
-make -C tools/local-ci clean # stop and wipe volumes for a fresh start
-```
-Default credentials baked in: `galaxy:galaxy-dev` (admin user, also
-the owner of the `galaxy/galaxy` repo). Web UI on
-<http://localhost:3000>; runs at
-<http://localhost:3000/galaxy/galaxy/actions>.
-### Inspect a run from the shell
-The Gitea Actions API is on `http://localhost:3000/api/v1` with basic
-auth. Useful for verifying a workflow change without opening the
-browser:
-```sh
-# Latest workflow runs — `status` is a human-readable string here:
-# "running" / "success" / "failure" / "cancelled".
-curl -s -u galaxy:galaxy-dev \
-'http://localhost:3000/api/v1/repos/galaxy/galaxy/actions/tasks?limit=5' \
-| python3 -m json.tool
-# Tight one-liner for the latest run only:
-curl -s -u galaxy:galaxy-dev \
-'http://localhost:3000/api/v1/repos/galaxy/galaxy/actions/tasks?limit=1' \
-| python3 -c 'import json, sys; r=json.load(sys.stdin)["workflow_runs"][0]; print(r["run_number"], r["status"], r["display_title"])'
-```
-Step-by-step workflow output is stored zstd-compressed under
-`/data/gitea/actions_log/galaxy/galaxy/<run_padded>/<job_index>.log.zst`
-inside the gitea container:
-```sh
-docker compose -f tools/local-ci/docker-compose.yml exec -T gitea sh -c '
-apk add --quiet zstd
-zstdcat /data/gitea/actions_log/galaxy/galaxy/01/1.log.zst
-' | less
-```
-`<run_padded>` is the run number, zero-padded to two digits
-(`01`, `02`, …); `<job_index>` is the 1-based index of the job
-inside that run (only `1` for the current single-job workflows).
-### Typical phase workflow
-When a phase changes anything under `.gitea/workflows/` or surfaces
-new tests in CI:
-1. Local sanity first — run the affected commands directly
-(`pnpm test`, `pnpm exec playwright test`, the targeted
-`go test ./...` slice).
-2. Commit and `make -C tools/local-ci push`.
-3. Poll the API for the latest run; once it leaves `running`,
-inspect status. On failure pull the log via the snippet above.
-4. Fix and repeat. The runner is always-on; each push triggers a
-fresh run (test cache is cleared by `-count=1` so a green run is
-honest).
-### Quick syntax-only dry-run with `act`
-For a sub-second check that the workflow YAML is well-formed and
-action references resolve, without pulling images and without
-running anything:
+For a sub-second syntax check of a workflow YAML without pulling
+images or running anything:
 ```sh
 act -W .gitea/workflows/ui-test.yaml -n push
 ```
 `act` doesn't honour Gitea-specific behaviours (artifact storage,
-secrets, run triggers). Use it for syntax checks; fall back to the
-local Gitea above for honest end-to-end verification.
+secrets, run triggers); use it only for syntax checks.