fix(dev-deploy): seed geoip onto a named volume

`docker restart galaxy-dev-backend` failed with "not a directory"
after every dev-deploy workflow run. Root cause: the compose file
bind-mounted the geoip database via a relative path
(`../../pkg/geoip/test-data/test-data/GeoIP2-Country-Test.mmdb`).
When the Gitea runner invoked `docker compose up`, the path
resolved against the runner's ephemeral workspace under
`/home/runner/.cache/act/<hash>/hostexecutor/...`. The bind source
baked into the running container therefore pointed at that
ephemeral path; the runner deleted the workspace once the workflow
finished, so any later `docker restart` had no source left to
remount.
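
The failure mode can be reproduced without Docker. The following is
a minimal sketch, not the runner's actual behaviour: a temporary
directory stands in for the ephemeral workspace, and `fixture.mmdb`
and the shortened directory layout are hypothetical stand-ins for
the real paths. It shows that a relative path resolved once to an
absolute one dangles as soon as the directory it was resolved
against is removed.

```sh
#!/bin/sh
# Hypothetical stand-in for the runner's ephemeral workspace.
workspace=$(mktemp -d)
mkdir -p "$workspace/tools/dev-deploy" "$workspace/pkg/geoip"
: > "$workspace/pkg/geoip/fixture.mmdb"

# Resolve the relative path the way a tool started from
# tools/dev-deploy would: relative to its working directory.
resolved=$(cd "$workspace/tools/dev-deploy" && cd ../../pkg/geoip && pwd -P)/fixture.mmdb

if [ -f "$resolved" ]; then before=present; else before=missing; fi

# The runner cleans up its workspace after the workflow ends.
rm -rf "$workspace"

if [ -e "$resolved" ]; then after=present; else after=missing; fi

echo "before cleanup: $before"
echo "after cleanup:  $after"
```

The absolute path recorded before cleanup is the analogue of the
bind source stored in the container's configuration: it is never
re-resolved, so a later restart sees only the dangling path.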

Replace the bind mount with a named volume, `galaxy-dev-geoip-data`,
seeded at deploy time:

- `tools/dev-deploy/docker-compose.yml`: mount
  `galaxy-dev-geoip-data:/var/lib/galaxy:ro` instead of a relative
  bind mount. Declare the volume in the top-level `volumes:` block.

- `.gitea/workflows/dev-deploy.yaml`: new `Seed geoip volume` step
  (placed right after the existing UI-volume seed) copies the
  fixture from `pkg/geoip/test-data/test-data/` into the named
  volume via an ephemeral alpine container, the same pattern UI
  seeding already uses.

- `tools/dev-deploy/Makefile`: new `seed-geoip` target performs
  the same copy from the persistent checkout. `up` and `rebuild`
  now depend on it, so a hand-run `make -C tools/dev-deploy up`
  populates the volume without operator action.

- `tools/dev-deploy/README.md`: the make-targets table now lists
  `seed-geoip`.

- `tools/dev-deploy/KNOWN-ISSUES.md`: the entry for the restart
  failure is downgraded to a "fixed" postmortem; the symptom,
  cause, and where the fix lives are kept for future reference.
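
The seed step shared by the workflow and the Makefile can be
sketched as a shell function. This is an illustration under stated
assumptions rather than the code in this commit: the function name
`seed_geoip`, the `DOCKER` override, and the fixture pre-check are
hypothetical additions; the volume name, paths, and `docker`
invocations are the ones used in the diff.

```sh
# seed_geoip REPO_ROOT: copy the GeoIP fixture into the named volume.
# DOCKER is a hypothetical override so the function can be dry-run
# against a stub; it defaults to the real docker CLI.
seed_geoip() {
    repo_root=$1
    src_dir="$repo_root/pkg/geoip/test-data/test-data"
    fixture="$src_dir/GeoIP2-Country-Test.mmdb"

    # Pre-check (an addition, not in the commit): fail early with a
    # clear message instead of failing inside the container.
    if [ ! -f "$fixture" ]; then
        echo "seed_geoip: fixture not found: $fixture" >&2
        return 1
    fi

    "${DOCKER:-docker}" volume create galaxy-dev-geoip-data >/dev/null || return 1
    "${DOCKER:-docker}" run --rm \
        -v galaxy-dev-geoip-data:/dst \
        -v "$src_dir:/src:ro" \
        alpine sh -c 'cp /src/GeoIP2-Country-Test.mmdb /dst/geoip.mmdb'
}
```

Called as `seed_geoip /path/to/checkout`, it performs the same two
steps as the `seed-geoip` target; the bind mount of the checkout is
only needed for the lifetime of the throwaway alpine container, so
it does not reintroduce the restart problem.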

Verification on the dev host (this branch checked out):

  $ make -C tools/dev-deploy up                # populates the volume, brings stack healthy
  $ docker restart galaxy-dev-backend          # used to error "not a directory"
  $ until [ "$(docker inspect -f '{{.State.Health.Status}}' galaxy-dev-backend)" = "healthy" ]; do sleep 2; done
  $ echo "ok"                                   # backend up 6s, healthy

The pre-existing sandbox engine `galaxy-game-80f3ce86-...` survived
both `make up` and `docker restart` untouched.
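
The `until` loop in the verification above spins forever if the
container never becomes healthy. A bounded variant might look like
the sketch below; the function names, the `HEALTH_PROBE` and
`WAIT_INTERVAL` hooks, and the 60-second default are assumptions
made for illustration, while the `docker inspect` format string is
the one used above.

```sh
# Default probe: print the container's health status.
health_status() {
    docker inspect -f '{{.State.Health.Status}}' "$1" 2>/dev/null
}

# wait_healthy NAME [TIMEOUT_SECONDS]
# Poll until NAME reports "healthy" or the timeout expires.
# HEALTH_PROBE (hypothetical hook) replaces the probe, e.g. in tests.
wait_healthy() {
    name=$1
    timeout=${2:-60}
    interval=${WAIT_INTERVAL:-2}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if [ "$("${HEALTH_PROBE:-health_status}" "$name")" = "healthy" ]; then
            echo "$name healthy after ${elapsed}s"
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "$name not healthy after ${timeout}s" >&2
    return 1
}
```

For example, `docker restart galaxy-dev-backend && wait_healthy
galaxy-dev-backend 60` exits non-zero if the backend does not
report healthy in time, which makes the check usable in scripts.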

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ilia Denisov
2026-05-19 01:59:38 +02:00
parent d19aa3aac5
commit f70258849f
5 changed files with 68 additions and 33 deletions
+15
@@ -104,6 +104,21 @@ jobs:
-v "${{ gitea.workspace }}/ui/frontend/build:/src:ro" \
alpine sh -c 'rm -rf /dst/* /dst/.??* 2>/dev/null; cp -a /src/. /dst/'
- name: Seed geoip volume
run: |
# Copy the GeoIP test fixture into a named volume so the
# backend can mount it as /var/lib/galaxy. A bind-mount with
# a relative path would resolve against this runner's
# ephemeral workspace under /home/runner/.cache/act/<hash>/,
# which the runner deletes once the workflow ends — the next
# `docker restart galaxy-dev-backend` would then fail with
# "not a directory" because the mount source vanished.
docker volume create galaxy-dev-geoip-data >/dev/null
docker run --rm \
-v galaxy-dev-geoip-data:/dst \
-v "${{ gitea.workspace }}/pkg/geoip/test-data/test-data:/src:ro" \
alpine sh -c 'cp /src/GeoIP2-Country-Test.mmdb /dst/geoip.mmdb'
- name: Reap stray dev-deploy containers
run: |
# Remove any non-running compose-managed containers from
+22 -27
@@ -162,9 +162,12 @@ redeploys can short-circuit the diagnostic loop.
## `docker restart galaxy-dev-backend` fails after the CI runner cleans up
**Status: fixed (2026-05-19).** Kept here as a postmortem in case
the symptom resurfaces in a different form.
### Symptom
`docker restart galaxy-dev-backend` from the host fails with:
`docker restart galaxy-dev-backend` from the host failed with:
```text
Error response from daemon: ... error mounting
@@ -172,36 +175,28 @@ Error response from daemon: ... error mounting
to rootfs at "/var/lib/galaxy/geoip.mmdb": ... not a directory
```
The container ends up `Exited (127)` and never comes back.
The container ended up `Exited (127)` and never came back.
### Cause
`tools/dev-deploy/docker-compose.yml` mounts the geoip database via
a path relative to the compose file
`tools/dev-deploy/docker-compose.yml` used to mount the geoip
database via a path relative to the compose file
(`../../pkg/geoip/test-data/test-data/GeoIP2-Country-Test.mmdb`). When
the `dev-deploy.yaml` Gitea runner invokes `docker compose up` it
resolves that relative path against the runner's ephemeral workspace
the `dev-deploy.yaml` Gitea runner invoked `docker compose up`, it
resolved that relative path against the runner's ephemeral workspace
under `/home/runner/.cache/act/<hash>/hostexecutor/tools/dev-deploy/`,
so the bind-mount source baked into the running container points at
that ephemeral path. The runner deletes the workspace once the
workflow ends, the source disappears, and the next `docker restart`
fails to remount it.
so the bind-mount source baked into the running container pointed at
that ephemeral path. The runner deleted the workspace once the
workflow ended, the source disappeared, and the next `docker restart`
failed to remount it.
### Workaround
### Fix
Bring the stack back up from a stable workspace, which re-binds the
mount source to the persistent checkout:
```sh
make -C tools/dev-deploy up
```
This restarts every service (including the broken `galaxy-dev-backend`)
with a stable source path.
### Status
Open. The clean fix is either to bake the geoip test fixture into
the backend image (no host bind-mount) or to copy it onto a named
volume during `dev-deploy.yaml` and bind that instead. Either change
removes the runner-workspace dependency entirely.
Replaced the bind-mount with a named volume,
`galaxy-dev-geoip-data`, seeded by the `dev-deploy.yaml` workflow
(and by the new `make seed-geoip` target) at deploy time. The
backend mounts the volume as `/var/lib/galaxy:ro`, so the mount
source is a Docker-managed volume that is independent of the runner
workspace and survives a `docker restart`. See
`.gitea/workflows/dev-deploy.yaml` ("Seed geoip volume" step) and
`tools/dev-deploy/Makefile` (`seed-geoip` target).
+18 -4
@@ -1,4 +1,4 @@
.PHONY: help up down rebuild logs status clean-data health psql build-engine seed-ui
.PHONY: help up down rebuild logs status clean-data health psql build-engine seed-ui seed-geoip
.DEFAULT_GOAL := help
@@ -18,10 +18,11 @@ COMPOSE := docker compose
help:
@echo "Long-lived Galaxy dev environment (https://*.galaxy.lan):"
@echo " make up Build images, ensure engine image, bring stack up"
@echo " make up Build images, ensure engine image, seed geoip, bring stack up"
@echo " make rebuild Force rebuild of backend / gateway images and bring up"
@echo " make build-engine Build $(ENGINE_IMAGE) from game/Dockerfile (no-op if present)"
@echo " make seed-ui Build ui/frontend and load into galaxy-dev-ui-dist volume"
@echo " make seed-geoip Copy GeoIP fixture into galaxy-dev-geoip-data volume"
@echo " make down Stop containers, keep named volumes"
@echo " make logs Tail all logs"
@echo " make status docker compose ps"
@@ -35,11 +36,11 @@ help:
@echo " - host Caddy proxying *.galaxy.lan into that network"
@echo " - game-state dir: $(GALAXY_DEV_GAME_STATE_DIR) (auto-created)"
up: build-engine
up: build-engine seed-geoip
mkdir -p "$(GALAXY_DEV_GAME_STATE_DIR)"
$(COMPOSE) up -d --wait
rebuild: build-engine
rebuild: build-engine seed-geoip
$(COMPOSE) build --no-cache galaxy-backend galaxy-api
mkdir -p "$(GALAXY_DEV_GAME_STATE_DIR)"
$(COMPOSE) up -d --wait
@@ -52,6 +53,19 @@ build-engine:
docker build -t $(ENGINE_IMAGE) -f $(REPO_ROOT)/game/Dockerfile $(REPO_ROOT); \
fi
# Copy the GeoIP fixture into a named volume the backend mounts as
# /var/lib/galaxy. Using a volume avoids a bind-mount that would
# resolve against an ephemeral workspace path when compose is driven
# from the Gitea runner (see tools/dev-deploy/KNOWN-ISSUES.md for the
# breakage that bind-mounts caused on `docker restart`).
seed-geoip:
@echo "seeding GeoIP fixture into galaxy-dev-geoip-data…"
docker volume create galaxy-dev-geoip-data >/dev/null
docker run --rm \
-v galaxy-dev-geoip-data:/dst \
-v $(REPO_ROOT)/pkg/geoip/test-data/test-data:/src:ro \
alpine sh -c 'cp /src/GeoIP2-Country-Test.mmdb /dst/geoip.mmdb'
# Build the UI frontend and load the resulting build/ directory into
# the named volume Caddy serves from. Used by the dev-deploy workflow
# and by anyone bringing the stack up by hand.
+2 -1
@@ -153,9 +153,10 @@ The same volume-persistence model applies to `tools/local-dev/`.
## Make targets
```text
make up Build images, ensure engine image, bring stack up (waits for health)
make up Build images, ensure engine image, seed geoip, bring stack up
make rebuild Rebuild backend / gateway images (ignores cache), then up
make seed-ui pnpm build + load build/ into galaxy-dev-ui-dist volume
make seed-geoip Copy pkg/geoip fixture into galaxy-dev-geoip-data volume
make build-engine Build galaxy-engine:dev (no-op if image already present)
make down Stop containers, keep named volumes
make logs Tail compose logs
+11 -1
@@ -144,7 +144,15 @@ services:
target: ${GALAXY_DEV_GAME_STATE_DIR}
bind:
create_host_path: true
- ../../pkg/geoip/test-data/test-data/GeoIP2-Country-Test.mmdb:/var/lib/galaxy/geoip.mmdb:ro
# The geoip database lives on a named volume seeded by the
# `dev-deploy.yaml` workflow (or by `make seed-geoip` when
# bringing the stack up by hand). A bind-mount with a relative
# path would resolve against the runner's ephemeral workspace
# under /home/runner/.cache/act/<hash>/, which the runner
# deletes after the workflow ends — and the next
# `docker restart galaxy-dev-backend` would then fail with
# "not a directory" because the mount source vanished.
- galaxy-dev-geoip-data:/var/lib/galaxy:ro
networks:
- galaxy-internal
healthcheck:
@@ -258,3 +266,5 @@ volumes:
name: galaxy-dev-caddy-data
galaxy-dev-ui-dist:
name: galaxy-dev-ui-dist
galaxy-dev-geoip-data:
name: galaxy-dev-geoip-data