Ilia Denisov a9087691a3
chore(ci): tidy CI/dev infra — drop local-ci, lift migration rule, scope by galaxy.stack label
Five connected cleanups across the dev/CI infrastructure:

1. Drop tools/local-ci/. The standalone Gitea + act_runner stack was
   the legacy "offline workflow validator"; the per-stage CI gate now
   runs on gitea.lan and the directory was only retained as a
   fallback. Removing it leaves no operational dependency: backend,
   gateway, and game code have no references; documentation that
   pointed at it (CLAUDE.md, docs/ARCHITECTURE.md, ui/docs/testing.md,
   tools/dev-deploy/README.md, tools/local-dev/README.md) is updated
   in this same change. Historical "Verified on local-ci run N"
   markers in ui/PLAN.md are preserved unchanged.

2. Lift the pre-production single-migration rule. The rule forced
   every schema delta into 00001_init.sql and required a manual
   make clean-data wipe on every backward-incompatible change in
   tools/dev-deploy/. Future schema deltas now land as additive
   sequence-numbered files (00002_*.sql, …) that goose applies
   automatically on backend startup; 00001_init.sql becomes an
   immutable baseline. Authoring conventions live in
   backend/internal/postgres/migrations/README.md. The chain may be
   squashed back into a fresh 00001 as a deliberate one-time
   operation before the first production deployment.

3. Document the deployment cadence. The dev environment is
   single-tenant: pushes to feature/* run the test workflows
   (go-unit, ui-test, integration) only; dev-deploy.yaml fires on
   push to development. A workflow_dispatch override on
   dev-deploy.yaml lets a developer preview a feature branch on the
   shared dev environment before merge; the next merge into
   development overwrites the manual deploy idempotently.

4. Scope compose-managed resources by an explicit
   galaxy.stack=<local-dev|dev-deploy> label. Both compose files
   stamp the label on every service, network, and named volume.
   Makefiles in tools/local-dev/ and tools/dev-deploy/ filter their
   engine-cleanup operations by (stack-label AND engine OCI title)
   so they never touch unrelated workloads on the same daemon.
   dev-deploy.yaml gains a pre-`compose up` step that reaps stale
   exited/dead containers under the dev-deploy stack label.

5. Backend now stamps the same galaxy.stack=<value> label on every
   engine container it spawns, sourced from a new BACKEND_STACK_LABEL
   env var (empty → label not applied; legacy-safe). Both compose
   files set it to their stack name (local-dev / dev-deploy). The
   contract is recorded in docs/ARCHITECTURE.md under
   "Container labels". A package-level test in
   backend/internal/runtime exercises both the label-present and
   label-absent paths.

No tests intentionally regressed: go test ./backend/internal/{config,
runtime,dockerclient} is green, both compose files validate cleanly,
and the backend, gateway, and game modules all build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 23:32:42 +02:00

tools/dev-deploy/ — long-lived Galaxy dev environment

A docker-compose stack that runs the Galaxy backend, gateway, supporting services, and a small Caddy in front of them, reachable through the host Caddy at https://www.galaxy.lan and https://api.galaxy.lan. Used by the dev-deploy.yaml Gitea Actions workflow as the canonical dev target on every merge into the development branch, and runnable by hand through this Makefile for local debugging of the deploy plumbing itself.

This stack is not the developer's primary playground for UI work — that role still belongs to tools/local-dev/, which is faster (Vite HMR, host-side dev server) and isolated to one developer. The two stacks coexist on the same host because every name is distinct:

                   tools/local-dev/                tools/dev-deploy/
Compose project    local-dev                       galaxy-dev
Container prefix   galaxy-local-dev-*              galaxy-dev-*
Network            galaxy-local-dev-net            galaxy-dev-internal, edge
Volumes            galaxy-local-dev-*              galaxy-dev-*
Host ports         5433/6380/8025/8080/9090        none (only edge network)
Game state         /tmp/galaxy-game-state          /var/lib/galaxy-dev/game-state
Engine image       galaxy-engine:local-dev         galaxy-engine:dev

Prerequisites

The host must already provide:

  • Docker daemon reachable as the user running make (member of the docker group, no sudo).

  • An external bridge network named edge (or the name set in GALAXY_EDGE_NETWORK, if overridden):

    docker network create edge
    
  • A host Caddy listening on :80/:443, attached to the edge network, and proxying www.galaxy.lan and api.galaxy.lan to galaxy-caddy:80. Example fragment for the host Caddyfile:

    www.galaxy.lan, api.galaxy.lan {
        tls internal
        reverse_proxy galaxy-caddy:80
    }
    
  • Game-state directory writable by the user running make. Default is ${HOME}/.galaxy-dev/game-state; make up creates it on demand. Override by exporting GALAXY_DEV_GAME_STATE_DIR (e.g. to /var/lib/galaxy-dev/game-state once the host is provisioned for it).
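
The prerequisites above can be checked before the first make up. The following is a minimal preflight sketch, not part of the Makefile: the function names (game_state_dir, edge_network, can_create_dir, preflight) are hypothetical, and it assumes the fallback network name is edge when GALAXY_EDGE_NETWORK is unset. It does not verify the host Caddy configuration.

```shell
#!/usr/bin/env bash
# Preflight sketch for the prerequisites above. All function names are
# hypothetical helpers, not targets or scripts shipped in this directory.

# Game-state directory: GALAXY_DEV_GAME_STATE_DIR wins, otherwise the
# documented default ${HOME}/.galaxy-dev/game-state.
game_state_dir() {
    printf '%s\n' "${GALAXY_DEV_GAME_STATE_DIR:-${HOME}/.galaxy-dev/game-state}"
}

# Edge network name: "edge" unless GALAXY_EDGE_NETWORK overrides it (assumed).
edge_network() {
    printf '%s\n' "${GALAXY_EDGE_NETWORK:-edge}"
}

# Succeeds if $1, or its nearest existing ancestor, is a writable directory,
# i.e. the current user could create $1 with mkdir -p.
can_create_dir() {
    local dir=$1
    while [ ! -e "$dir" ]; do
        dir=$(dirname -- "$dir")
    done
    [ -d "$dir" ] && [ -w "$dir" ]
}

# Runs all checks and reports every failure instead of stopping at the first.
preflight() {
    local rc=0
    docker info >/dev/null 2>&1 \
        || { echo "docker daemon not reachable as $(id -un)" >&2; rc=1; }
    docker network inspect "$(edge_network)" >/dev/null 2>&1 \
        || { echo "missing docker network: $(edge_network)" >&2; rc=1; }
    can_create_dir "$(game_state_dir)" \
        || { echo "game-state dir not writable: $(game_state_dir)" >&2; rc=1; }
    return "$rc"
}
```

Sourcing the file and running preflight prints one line per unmet prerequisite and returns non-zero if any check failed.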

Bring it up

make -C tools/dev-deploy up

up (re)builds the backend and gateway images (from the alpine-runtime Dockerfiles shared with ../local-dev/), makes sure the engine image galaxy-engine:dev exists, brings the stack up, and waits for healthchecks. It does not seed the UI volume; that is normally done by CI. The first time you run the stack by hand, seed the UI volume yourself:

make -C tools/dev-deploy seed-ui
make -C tools/dev-deploy up
make -C tools/dev-deploy health

seed-ui runs pnpm build in ui/frontend/, then copies the resulting build/ tree into the galaxy-dev-ui-dist volume. Subsequent CI deploys overwrite this volume automatically.

Daily flow

make -C tools/dev-deploy rebuild   # rebuild backend/gateway images + up
make -C tools/dev-deploy logs      # tail compose logs
make -C tools/dev-deploy health    # probe https://*.galaxy.lan
make -C tools/dev-deploy down      # stop, keep state

State persists in named volumes between up/down cycles, so the dev environment stays continuously usable across deploys from the development branch: games created last week survive into this week unless somebody runs make clean-data.

Logging in

The same dev-mode email-code override as tools/local-dev/ applies, and the dev-deploy compose ships with it enabled by default:

  1. Enter dev@galaxy.lan (or whatever BACKEND_DEV_SANDBOX_EMAIL resolves to) in the login form.
  2. Submit 123456 as the code. The docker-compose default for BACKEND_AUTH_DEV_FIXED_CODE is 123456, so the bcrypt-hashed email code stays a fallback. To force real Mailpit codes (e.g. for mail-flow QA), set BACKEND_AUTH_DEV_FIXED_CODE= (empty) in a local .env and run make rebuild.

The fixed-code override is rejected by production env loaders, so it cannot leak into the prod environment.

Networking

Browser
   │  https://www.galaxy.lan, https://api.galaxy.lan
   ▼
host-Caddy (:80, :443, TLS, attached to `edge` network)
   │  reverse_proxy *.galaxy.lan → galaxy-caddy:80
   ▼
galaxy-caddy  (networks: edge + galaxy-dev-internal)
   │  www.galaxy.lan → file_server /srv/galaxy-ui (volume galaxy-dev-ui-dist)
   │  api.galaxy.lan → reverse_proxy galaxy-api:8080
   ▼
galaxy-dev-internal
   ├─ galaxy-api      (gateway:   :8080 REST, :9090 gRPC)
   ├─ galaxy-backend  (backend:   :8080 HTTP, :8081 gRPC push)
   ├─ galaxy-postgres (postgres:  :5432)
   ├─ galaxy-redis    (redis:     :6379)
   ├─ galaxy-mailpit  (mailpit:   :8025 UI, :1025 SMTP)
   └─ engine containers (spawned by backend on demand)

The compose project deliberately exposes no host ports. Diagnostics that used to go through host ports such as localhost:8025 now run inside the container network, for example: docker compose -f tools/dev-deploy/docker-compose.yml exec galaxy-mailpit wget -qO- localhost:8025/messages

Persistent state and schema changes

The dev Postgres volume galaxy-dev-postgres-data survives redeploys. Schema deltas land as additive, sequence-numbered migration files (backend/internal/postgres/migrations/0000N_*.sql) and pressly/goose applies them on backend startup without operator action.
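
To illustrate the sequence-numbered naming, here is a small sketch that derives the next file name from the migrations already present. The helper name next_migration_name is hypothetical, and the five-digit width is inferred from 00001_init.sql; the authoritative authoring conventions are in backend/internal/postgres/migrations/README.md and take precedence over this sketch.

```shell
#!/usr/bin/env bash
# next_migration_name DIR SLUG  (hypothetical helper, not part of the repo)
# Prints the next sequence-numbered migration file name, e.g.
# 00002_add_index.sql, based on the highest five-digit prefix found in DIR.
next_migration_name() {
    local dir=$1 slug=$2 max=0 f base n
    for f in "$dir"/[0-9][0-9][0-9][0-9][0-9]_*.sql; do
        [ -e "$f" ] || continue      # the glob matched nothing
        base=${f##*/}                # strip the directory part
        n=$((10#${base%%_*}))        # force base 10 so "00008" is not read as octal
        [ "$n" -gt "$max" ] && max=$n
    done
    printf '%05d_%s.sql\n' "$((max + 1))" "$slug"
}
```

For example, next_migration_name backend/internal/postgres/migrations add_index prints the name to use for a new additive file; earlier files, including the 00001_init.sql baseline, are left untouched.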

Use make -C tools/dev-deploy clean-data only when you deliberately want a fresh database (debugging schema drift, exercising the bootstrap path from scratch, etc.):

make -C tools/dev-deploy clean-data
make -C tools/dev-deploy up

The same volume-persistence model applies to tools/local-dev/.

Make targets

make up             Build images, ensure engine image, bring stack up (waits for health)
make rebuild        Rebuild backend / gateway images (ignores cache), then up
make seed-ui        pnpm build + load build/ into galaxy-dev-ui-dist volume
make build-engine   Build galaxy-engine:dev (no-op if image already present)
make down           Stop containers, keep named volumes
make logs           Tail compose logs
make status         docker compose ps
make health         curl https://www.galaxy.lan + https://api.galaxy.lan/healthz
make psql           psql as galaxy@galaxy_backend
make clean-data     Stop everything and wipe volumes + game-state dir
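
For scripting around the health target, the same two URLs can be probed with retries. This is a sketch under assumptions, not the Makefile's actual implementation: probe and probe_all are hypothetical names, the retry counts are arbitrary, and it assumes the host already trusts the certificate served by the host Caddy (otherwise curl fails on TLS verification).

```shell
#!/usr/bin/env bash
# probe URL [ATTEMPTS] [DELAY_SECONDS]  (hypothetical helper)
# Returns 0 as soon as URL answers with a non-error HTTP status,
# 1 if every attempt fails.
probe() {
    local url=$1 attempts=${2:-10} delay=${3:-3} i
    for ((i = 1; i <= attempts; i++)); do
        if curl --fail --silent --show-error --max-time 5 \
                --output /dev/null "$url"; then
            echo "ok $url"
            return 0
        fi
        # Sleep between attempts, but not after the last one.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
    done
    echo "FAIL $url" >&2
    return 1
}

# Probes both public endpoints listed for `make health`; extra arguments
# (attempts, delay) are passed through to probe.
probe_all() {
    local rc=0
    probe https://www.galaxy.lan "$@" || rc=1
    probe https://api.galaxy.lan/healthz "$@" || rc=1
    return "$rc"
}
```

Calling probe_all 20 2 after make up waits up to roughly 40 seconds per endpoint between retries and returns non-zero if either endpoint never becomes healthy.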

Files

  • docker-compose.yml — six services: postgres, redis, mailpit, galaxy-backend, galaxy-api, galaxy-caddy. Reuses the alpine-runtime Dockerfiles from ../local-dev/ so the backend healthcheck can run wget. Reuses the dev keypair from ../local-dev/keys/.
  • Caddyfile.dev — the application-routing Caddy config, mounted into galaxy-caddy at /etc/caddy/Caddyfile.
  • Caddyfile.prod — placeholder for a future prod deployment; not used by this compose.
  • Makefile — wrapper over docker compose with helpers for engine, UI seeding, health probes, and full wipe.
  • .env.example — non-secret defaults for the compose ${VAR:-} expansions. Copy to .env if you want host-local overrides.

Known issues

See KNOWN-ISSUES.md for symptoms that surface in the long-lived dev environment but are not yet fixed (currently: the sandbox game flipping to cancelled after a redispatch).

Deployment cadence

This environment is single-tenant: one live deployment, redeployed by the dev-deploy.yaml workflow on every merge into development. PR branches do not auto-deploy here — pushes to feature/* only run the test workflows (go-unit, ui-test, integration).

To put a feature branch on the shared dev environment before its PR merges (e.g. to validate a UI flow against the real Caddy edge), run the workflow manually:

  1. Push the branch (git push gitea HEAD).
  2. Gitea UI → Actions → Deploy · Dev → Run workflow, pick the feature ref.

The deploy is idempotent — when the PR later merges into development, the regular push trigger fires the same packaging and healthcheck steps, overwriting whatever the manual dispatch left behind. There is no separate state to clean up between the two paths.

Relationship to other infrastructure

  • tools/local-dev/ — single-developer playground, host-port mapped, Vite dev server on the side. Recommended for active UI work.
  • .gitea/workflows/dev-deploy.yaml — the CI side of this stack: builds images, seeds the UI volume, runs docker compose up -d on every merge into development. The Makefile in this directory is what that workflow ultimately calls into.