Files
galaxy-game/.gitea/workflows/dev-deploy.yaml
T
Ilia Denisov f70258849f fix(dev-deploy): seed geoip onto a named volume
`docker restart galaxy-dev-backend` failed with "not a directory"
after every dev-deploy workflow run. Root cause: the compose file
bind-mounted the geoip database via a relative path
(`../../pkg/geoip/test-data/test-data/GeoIP2-Country-Test.mmdb`).
When the Gitea runner invoked `docker compose up`, the path
resolved against the runner's ephemeral workspace under
`/home/runner/.cache/act/<hash>/hostexecutor/...`. The bind source
baked into the running container therefore pointed at that
ephemeral path; the runner deleted the workspace once the workflow
finished, and any later `docker restart` could not remount.

Replace the bind with a named volume `galaxy-dev-geoip-data`,
seeded at deploy time:

- `tools/dev-deploy/docker-compose.yml`: mount
  `galaxy-dev-geoip-data:/var/lib/galaxy:ro` instead of a relative
  bind. Declare the volume in the top-level `volumes:` block.

- `.gitea/workflows/dev-deploy.yaml`: new `Seed geoip volume` step
  (placed right after the existing UI-volume seed) copies the
  fixture from `pkg/geoip/test-data/test-data/` into the named
  volume via an ephemeral alpine container, the same pattern UI
  seeding already uses.

- `tools/dev-deploy/Makefile`: new `seed-geoip` target performs
  the same copy from the persistent checkout. `up` and `rebuild`
  now depend on it, so a hand-run `make -C tools/dev-deploy up`
  populates the volume without operator action.

- `tools/dev-deploy/README.md`: updated the make-targets table to
  list `seed-geoip`.

- `tools/dev-deploy/KNOWN-ISSUES.md`: the entry for the restart
  failure is downgraded to a "fixed" postmortem; the symptom,
  cause, and where the fix lives are kept for future reference.

Verification on the dev host (this branch checked out):

  $ make -C tools/dev-deploy up                # populates the volume, brings stack healthy
  $ docker restart galaxy-dev-backend          # used to error "not a directory"
  $ until [ "$(docker inspect -f '{{.State.Health.Status}}' galaxy-dev-backend)" = "healthy" ]; do sleep 2; done
  $ echo "ok"                                   # backend up 6s, healthy

The pre-existing sandbox engine `galaxy-game-80f3ce86-...` survived
both `make up` and `docker restart` untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 01:59:38 +02:00

163 lines
6.0 KiB
YAML

name: Deploy · Dev
# Builds the Galaxy stack and (re)deploys it into the long-lived dev
# environment on the host running this Gitea Actions runner. Triggered
# on every merge into `development`. Branch protections on `development`
# guarantee the commit already passed `go-unit`, `ui-test`, and
# `integration` as part of the PR that produced this push, so this
# workflow does not re-run those tests — it focuses on packaging and
# rollout.
#
# `workflow_dispatch` is also accepted so a developer can deploy any
# branch (typically a feature branch under active review) into the
# shared dev environment from the Gitea Actions UI without waiting for
# the PR to merge first. The deploy job picks up whatever the chosen
# ref is — same packaging + healthcheck steps as the merge path.
on:
push:
branches:
- development
paths:
- 'backend/**'
- 'gateway/**'
- 'game/**'
- 'pkg/**'
- 'ui/**'
- 'go.work'
- 'go.work.sum'
- 'tools/dev-deploy/**'
- '.gitea/workflows/dev-deploy.yaml'
- '!**/*.md'
workflow_dispatch: {}
jobs:
deploy:
runs-on: ubuntu-latest
defaults:
run:
shell: bash
steps:
- name: Checkout
uses: actions/checkout@v4
with:
submodules: recursive
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version-file: go.work
cache: true
- name: Set up pnpm
uses: pnpm/action-setup@v4
with:
version: 11.0.7
- name: Set up Node
uses: actions/setup-node@v4
with:
node-version: 22
cache: pnpm
cache-dependency-path: ui/pnpm-lock.yaml
- name: Install UI dependencies
working-directory: ui
run: pnpm install --frozen-lockfile
- name: Build UI frontend
working-directory: ui/frontend
env:
VITE_GATEWAY_BASE_URL: https://api.galaxy.lan
# Surface the synthetic-report loader and similar dev-only
# affordances in the long-lived dev bundle. The prod build
# path (`prod-build.yaml`) leaves this flag unset so the
# production bundle keeps the same affordances stripped.
VITE_GALAXY_DEV_AFFORDANCES: "true"
run: |
# The response-signing public key is committed in
# `.env.development` alongside its private counterpart in
# `tools/local-dev/keys/`. Pull it from there at build time so
# the production-mode bundle ships the same key the dev
# gateway uses to sign.
export VITE_GATEWAY_RESPONSE_PUBLIC_KEY="$(grep -E '^VITE_GATEWAY_RESPONSE_PUBLIC_KEY=' .env.development | cut -d= -f2)"
pnpm build
- name: Build galaxy-engine image
working-directory: ${{ gitea.workspace }}
run: |
docker build \
-t galaxy-engine:dev \
-f game/Dockerfile \
.
- name: Build backend + gateway images
working-directory: tools/dev-deploy
run: |
docker compose build galaxy-backend galaxy-api
- name: Seed UI volume
run: |
docker volume create galaxy-dev-ui-dist >/dev/null
docker run --rm \
-v galaxy-dev-ui-dist:/dst \
-v "${{ gitea.workspace }}/ui/frontend/build:/src:ro" \
alpine sh -c 'rm -rf /dst/* /dst/.??* 2>/dev/null; cp -a /src/. /dst/'
- name: Seed geoip volume
run: |
# Copy the GeoIP test fixture into a named volume so the
# backend can mount it as /var/lib/galaxy. A bind-mount with
# a relative path would resolve against this runner's
# ephemeral workspace under /home/runner/.cache/act/<hash>/,
# which the runner deletes once the workflow ends — the next
# `docker restart galaxy-dev-backend` would then fail with
# "not a directory" because the mount source vanished.
docker volume create galaxy-dev-geoip-data >/dev/null
docker run --rm \
-v galaxy-dev-geoip-data:/dst \
-v "${{ gitea.workspace }}/pkg/geoip/test-data/test-data:/src:ro" \
alpine sh -c 'cp /src/GeoIP2-Country-Test.mmdb /dst/geoip.mmdb'
- name: Reap stray dev-deploy containers
run: |
# Remove any non-running compose-managed containers from
# earlier deploys before `compose up`. Filter by the stack
# label so we never touch unrelated workloads on the same
# daemon. Running containers (incl. engine instances backend
# spawned itself with the same label) are left intact —
# those are reattached by the backend reconciler on boot.
ids=$(docker ps -aq \
--filter "label=galaxy.stack=dev-deploy" \
--filter "status=exited" \
--filter "status=created" \
--filter "status=dead")
if [ -n "$ids" ]; then
echo "reaping: $ids"
docker rm -f $ids
fi
- name: Bring up the stack
working-directory: tools/dev-deploy
run: |
# Resolve in the shell, not in YAML expressions — `env.HOME`
# is empty at the workflow-evaluation stage.
export GALAXY_DEV_GAME_STATE_DIR="$HOME/.galaxy-dev/game-state"
mkdir -p "$GALAXY_DEV_GAME_STATE_DIR"
docker compose up -d --wait --remove-orphans
- name: Probe the stack
run: |
set -e
# Use --resolve so the probe goes through the same routing as
# a browser on the host: the host Caddy on :443 (which has
# `tls internal`) terminates and forwards into the edge
# network. We accept the host's internal CA via -k because
# the runner image has no reason to trust it.
curl -sk --max-time 10 https://api.galaxy.lan/healthz \
| tee /tmp/healthz
test -s /tmp/healthz
curl -sk --max-time 10 -o /dev/null -w '%{http_code}\n' \
https://www.galaxy.lan/ | tee /tmp/www_status
grep -qE '^(200|304)$' /tmp/www_status