f70258849f
`docker restart galaxy-dev-backend` failed with "not a directory"
after every dev-deploy workflow run. Root cause: the compose file
bind-mounted the geoip database via a relative path
(`../../pkg/geoip/test-data/test-data/GeoIP2-Country-Test.mmdb`).
When the Gitea runner invoked `docker compose up`, the path
resolved against the runner's ephemeral workspace under
`/home/runner/.cache/act/<hash>/hostexecutor/...`. The bind source
baked into the running container therefore pointed at that
ephemeral path; the runner deleted the workspace once the workflow
finished, and any later `docker restart` could not remount.
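
One way to confirm this failure mode on a host is to list a container's bind-mount sources and check which of them no longer exist. The sketch below is a minimal diagnostic, not part of the repository: `stale_bind_sources` is a hypothetical helper name, and it assumes the shell runs on the same machine as the Docker daemon, so that mount sources are visible as local paths.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): print every bind-mount source
# of a container that no longer exists on the host.
# Assumption: this shell runs on the same machine as the Docker daemon.
stale_bind_sources() {
  local container="$1" src
  docker inspect \
    -f '{{range .Mounts}}{{if eq .Type "bind"}}{{println .Source}}{{end}}{{end}}' \
    "$container" |
  while IFS= read -r src; do
    [ -n "$src" ] || continue              # skip blank lines from the template
    [ -e "$src" ] || echo "missing: $src"  # source path vanished from the host
  done
}

# Example invocation (requires a running Docker daemon):
#   stale_bind_sources galaxy-dev-backend
```

A container whose bind source sat in a deleted runner workspace would show up here as a `missing:` line, while named volumes are skipped because their mount type is `volume`.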
Replace the bind with a named volume `galaxy-dev-geoip-data`,
seeded at deploy time:
- `tools/dev-deploy/docker-compose.yml`: mount
`galaxy-dev-geoip-data:/var/lib/galaxy:ro` instead of a relative
bind. Declare the volume in the top-level `volumes:` block.
- `.gitea/workflows/dev-deploy.yaml`: new `Seed geoip volume` step
(placed right after the existing UI-volume seed) copies the
fixture from `pkg/geoip/test-data/test-data/` into the named
volume via an ephemeral alpine container, the same pattern UI
seeding already uses.
- `tools/dev-deploy/Makefile`: new `seed-geoip` target performs
the same copy from the persistent checkout. `up` and `rebuild`
now depend on it, so a hand-run `make -C tools/dev-deploy up`
populates the volume without operator action.
- `tools/dev-deploy/README.md`: the make-targets table now lists
`seed-geoip`.
- `tools/dev-deploy/KNOWN-ISSUES.md`: the entry for the restart
failure is downgraded to a "fixed" postmortem; the symptom,
cause, and where the fix lives are kept for future reference.
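
The seeding described in the list above comes down to one volume create plus one throwaway-container copy. A minimal sketch as a shell function follows. `seed_geoip_volume` and its argument are hypothetical names for illustration, and the actual Makefile recipe may differ. The up-front fixture check and the absolute-path resolution are additions of this sketch, not something this commit describes.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the seeding logic; the real Makefile recipe may differ.
# Usage: seed_geoip_volume <path-to-checkout>
seed_geoip_volume() {
  local checkout src
  # `docker run -v` needs an absolute host path for a bind mount.
  checkout="$(cd "$1" && pwd)" || return 1
  src="$checkout/pkg/geoip/test-data/test-data"

  # Assumption of this sketch: fail early if the fixture is absent.
  if [ ! -f "$src/GeoIP2-Country-Test.mmdb" ]; then
    echo "fixture not found under $src" >&2
    return 1
  fi

  # Creating a volume that already exists is a no-op, so reruns are safe.
  docker volume create galaxy-dev-geoip-data >/dev/null

  # Copy the fixture into the volume from a short-lived alpine container.
  docker run --rm \
    -v galaxy-dev-geoip-data:/dst \
    -v "$src:/src:ro" \
    alpine sh -c 'cp /src/GeoIP2-Country-Test.mmdb /dst/geoip.mmdb'
}
```

Because the bind mount here belongs to the ephemeral alpine container only, it disappears with that container; the long-lived backend sees just the named volume.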

Verification on the dev host (this branch checked out):

    $ make -C tools/dev-deploy up          # populates the volume, brings stack healthy
    $ docker restart galaxy-dev-backend    # used to error "not a directory"
    $ until [ "$(docker inspect -f '{{.State.Health.Status}}' galaxy-dev-backend)" = "healthy" ]; do sleep 2; done
    $ echo "ok"                            # backend up 6s, healthy

The pre-existing sandbox engine `galaxy-game-80f3ce86-...` survived
both `make up` and `docker restart` untouched.
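
The `until` loop in the verification steps waits indefinitely if the container never reports healthy. A bounded variant might look like the sketch below. `wait_healthy`, its 60-second default timeout, and its 2-second default interval are assumptions for illustration, not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a container's health status until it is "healthy"
# or a timeout (in seconds) elapses.
# Usage: wait_healthy <container> [timeout=60] [interval=2]
wait_healthy() {
  local container="$1" timeout="${2:-60}" interval="${3:-2}" waited=0 status
  while :; do
    status="$(docker inspect -f '{{.State.Health.Status}}' "$container" 2>/dev/null)"
    [ "$status" = "healthy" ] && return 0
    if [ "$waited" -ge "$timeout" ]; then
      echo "$container not healthy after ${waited}s (last status: ${status:-unknown})" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}

# Example invocation (requires a running Docker daemon):
#   docker restart galaxy-dev-backend && wait_healthy galaxy-dev-backend 60
```

The non-zero return on timeout lets a script or Makefile recipe fail visibly instead of hanging.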
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
163 lines
6.0 KiB
YAML
name: Deploy · Dev

# Builds the Galaxy stack and (re)deploys it into the long-lived dev
# environment on the host running this Gitea Actions runner. Triggered
# on every merge into `development`. Branch protections on `development`
# guarantee the commit already passed `go-unit`, `ui-test`, and
# `integration` as part of the PR that produced this push, so this
# workflow does not re-run those tests — it focuses on packaging and
# rollout.
#
# `workflow_dispatch` is also accepted so a developer can deploy any
# branch (typically a feature branch under active review) into the
# shared dev environment from the Gitea Actions UI without waiting for
# the PR to merge first. The deploy job picks up whatever the chosen
# ref is — same packaging + healthcheck steps as the merge path.

on:
  push:
    branches:
      - development
    paths:
      - 'backend/**'
      - 'gateway/**'
      - 'game/**'
      - 'pkg/**'
      - 'ui/**'
      - 'go.work'
      - 'go.work.sum'
      - 'tools/dev-deploy/**'
      - '.gitea/workflows/dev-deploy.yaml'
      - '!**/*.md'
  workflow_dispatch: {}

jobs:
  deploy:
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          submodules: recursive

      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version-file: go.work
          cache: true

      - name: Set up pnpm
        uses: pnpm/action-setup@v4
        with:
          version: 11.0.7

      - name: Set up Node
        uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: pnpm
          cache-dependency-path: ui/pnpm-lock.yaml

      - name: Install UI dependencies
        working-directory: ui
        run: pnpm install --frozen-lockfile

      - name: Build UI frontend
        working-directory: ui/frontend
        env:
          VITE_GATEWAY_BASE_URL: https://api.galaxy.lan
          # Surface the synthetic-report loader and similar dev-only
          # affordances in the long-lived dev bundle. The prod build
          # path (`prod-build.yaml`) leaves this flag unset so the
          # production bundle keeps the same affordances stripped.
          VITE_GALAXY_DEV_AFFORDANCES: "true"
        run: |
          # The response-signing public key is committed in
          # `.env.development` alongside its private counterpart in
          # `tools/local-dev/keys/`. Pull it from there at build time so
          # the production-mode bundle ships the same key the dev
          # gateway uses to sign.
          export VITE_GATEWAY_RESPONSE_PUBLIC_KEY="$(grep -E '^VITE_GATEWAY_RESPONSE_PUBLIC_KEY=' .env.development | cut -d= -f2)"
          pnpm build

      - name: Build galaxy-engine image
        working-directory: ${{ gitea.workspace }}
        run: |
          docker build \
            -t galaxy-engine:dev \
            -f game/Dockerfile \
            .

      - name: Build backend + gateway images
        working-directory: tools/dev-deploy
        run: |
          docker compose build galaxy-backend galaxy-api

      - name: Seed UI volume
        run: |
          docker volume create galaxy-dev-ui-dist >/dev/null
          docker run --rm \
            -v galaxy-dev-ui-dist:/dst \
            -v "${{ gitea.workspace }}/ui/frontend/build:/src:ro" \
            alpine sh -c 'rm -rf /dst/* /dst/.??* 2>/dev/null; cp -a /src/. /dst/'

      - name: Seed geoip volume
        run: |
          # Copy the GeoIP test fixture into a named volume so the
          # backend can mount it as /var/lib/galaxy. A bind-mount with
          # a relative path would resolve against this runner's
          # ephemeral workspace under /home/runner/.cache/act/<hash>/,
          # which the runner deletes once the workflow ends — the next
          # `docker restart galaxy-dev-backend` would then fail with
          # "not a directory" because the mount source vanished.
          docker volume create galaxy-dev-geoip-data >/dev/null
          docker run --rm \
            -v galaxy-dev-geoip-data:/dst \
            -v "${{ gitea.workspace }}/pkg/geoip/test-data/test-data:/src:ro" \
            alpine sh -c 'cp /src/GeoIP2-Country-Test.mmdb /dst/geoip.mmdb'

      - name: Reap stray dev-deploy containers
        run: |
          # Remove any non-running compose-managed containers from
          # earlier deploys before `compose up`. Filter by the stack
          # label so we never touch unrelated workloads on the same
          # daemon. Running containers (incl. engine instances backend
          # spawned itself with the same label) are left intact —
          # those are reattached by the backend reconciler on boot.
          ids=$(docker ps -aq \
            --filter "label=galaxy.stack=dev-deploy" \
            --filter "status=exited" \
            --filter "status=created" \
            --filter "status=dead")
          if [ -n "$ids" ]; then
            echo "reaping: $ids"
            docker rm -f $ids
          fi

      - name: Bring up the stack
        working-directory: tools/dev-deploy
        run: |
          # Resolve in the shell, not in YAML expressions — `env.HOME`
          # is empty at the workflow-evaluation stage.
          export GALAXY_DEV_GAME_STATE_DIR="$HOME/.galaxy-dev/game-state"
          mkdir -p "$GALAXY_DEV_GAME_STATE_DIR"
          docker compose up -d --wait --remove-orphans

      - name: Probe the stack
        run: |
          set -e
          # Use --resolve so the probe goes through the same routing as
          # a browser on the host: the host Caddy on :443 (which has
          # `tls internal`) terminates and forwards into the edge
          # network. We accept the host's internal CA via -k because
          # the runner image has no reason to trust it.
          curl -sk --max-time 10 https://api.galaxy.lan/healthz \
            | tee /tmp/healthz
          test -s /tmp/healthz
          curl -sk --max-time 10 -o /dev/null -w '%{http_code}\n' \
            https://www.galaxy.lan/ | tee /tmp/www_status
          grep -qE '^(200|304)$' /tmp/www_status