galaxy-game

Author	SHA1	Message	Date
Ilia Denisov	e11092234c	feat(dev-deploy): expose Grafana + Mailpit UIs via Caddy; seed monitoring config Deploy wiring for the observability stack (the services and collector config landed in the previous commit): - Caddyfile.dev: route /grafana/* to galaxy-grafana:3000 (Caddy sub-path mode, Grafana keeps its own login) and /mailpit/* to galaxy-mailpit:8025 behind dev basic-auth, so the captured-mail UI (every message, relayed or not) and Grafana are reachable through the single dev origin. - dev-deploy.yaml: seed the monitoring config tree to a stable, reboot-surviving host path (GALAXY_DEV_MONITORING_DIR) before bringing the stack up, and inject the Grafana admin password from a Gitea secret (GALAXY_DEV_GRAFANA_ADMIN_PASSWORD; empty falls back to the compose default).	2026-06-01 05:46:19 +02:00
Ilia Denisov	7fb6a63c2b	feat(dev-deploy): relay Mailpit to Gmail (Stage 3) Keep Mailpit as the backend's SMTP submission point and turn on its relay so OTP/notification mail addressed to the owner reaches a real Gmail inbox, while everything else stays captured-only. - mailpit gains --smtp-relay-config + --smtp-relay-matching (default non-routable, so an unconfigured stack only captures); relay.conf is mounted from a new galaxy-dev-mailpit-config volume - tools/dev-deploy/mailpit/relay.conf.tmpl + a dev-deploy.yaml step that renders it from Gitea secrets (Gmail App Password, never committed) and seeds the volume; the GALAXY_DEV_MAIL_RELAY_MATCH var drives the relay-matching recipient - backend SMTP config unchanged (still -> galaxy-mailpit:1025) - dev-deploy README documents the relay + required secrets/vars Verified locally: compose config valid; the rendered relay.conf is accepted by mailpit v1.21.8 (relay + recipient-matching enabled). Real Gmail delivery is verified at the dev-deploy preview once the owner sets the secrets.	2026-05-31 22:44:32 +02:00
Ilia Denisov	0cae89cba2	refactor(dev): remove the dev-sandbox bootstrap everywhere Stage 1 of the dev-as-prod-mirror rework. The auto-provisioned "Dev Sandbox" game and dummy users are removed so the dev contour starts empty like prod; the separate legacy-report loader stays as the test-data path. - delete backend/internal/devsandbox (package + tests) - drop the bootstrap call + DevSandboxConfig (struct, Config field, BACKEND_DEV_SANDBOX_* env, defaults, loader, validation) - strip BACKEND_DEV_SANDBOX_* from dev-deploy + local-dev compose and .env.example; the generic engine-recycle / prune-broken-engines logic stays (it serves real games) - update tooling docs (dev-deploy README + KNOWN-ISSUES, local-dev README + Makefile) and stale comments; DeleteGame and InsertMembershipDirect remain (exercised by lobby integration tests) No app behaviour change beyond not auto-creating the sandbox game.	2026-05-31 22:28:03 +02:00
Ilia Denisov	eb549e6049	ci(ui-test): clean root-owned build artifacts so runner teardown succeeds In host-mode the ui-test job runs as root, so vite (test:pwa), svelte-kit and Playwright write build/, .svelte-kit/, test-results/ and playwright-report/ root-owned into the shared host workspace. The act_runner (non-root) then cannot remove them at teardown ("unlinkat ui/frontend/build: permission denied"), which spuriously marks this or a sibling job that inherits the dirty workspace as failed — it hit go-unit on the #83 merge even though every test passed. Add an `if: always()` step that removes those generated dirs while the job still has root, after the artifact uploads. Keeps the shared workspace clean for the runner's own teardown and for later jobs.	2026-05-31 12:15:32 +02:00
Ilia Denisov	658ab7f6e7	chore(fbs): pin flatc toolchain to 25.9.23 and guard codegen drift The committed FlatBuffers bindings were generated by flatc 25.x (the TS runtime is flatbuffers@25.9.23), but nothing pinned the compiler, so a regen on a box with an older flatc (Debian apt ships 23.5.26) silently churns output and flips nullable-scalar builder defaults. PR #82 hit this and shipped 5 report files from the wrong compiler. Unify the whole toolchain on 25.9.23 (the only version available as an npm package, a prebuilt flatc binary, and a Go tag) and make the bindings reproducible: - Downgrade the flatbuffers Go module 25.12.19 -> 25.9.23 (schema, transcoder, gateway, integration) so compiler and both runtimes match. - Regenerate every schema with flatc 25.9.23. The only resulting change is order/command-item.ts: the lone straggler still on the old optional-scalar builder default (cmd_applied/cmd_error_code: 0 -> null). Inert in practice — the TS side never builds those response-only fields (the engine sets them in Go); the reader is unchanged. - Pin the version in tooling: a flatc-check guard in ui/Makefile (fbs-ts) and a new pkg/schema/fbs/Makefile (fbs-go); both refuse a mismatched flatc and point at the release binary. Fix the stale apt install hint. - Add a path-filtered CI guard (.gitea/workflows/fbs-codegen.yaml) that regenerates with the pinned flatc and fails on any diff. - Document the pinned version and the regen commands in the schema README. No wire-format change: Go build/vet, transcoder roundtrip + engine tests, pnpm check and the full vitest suite (888) stay green.	2026-05-31 11:51:20 +02:00
Ilia Denisov	e038ea6154	fix(dev-deploy): recycle engine containers on galaxy-engine:dev SHA drift `backend`'s reconciler adopts pre-existing `galaxy-game-` containers without comparing their image SHA against the freshly-built `galaxy-engine:dev`, so a long-lived sandbox would otherwise keep serving the previous engine code after a redeploy. Issue #59 surfaced this: after the per-command-rejection fix was deployed via `workflow_dispatch`, the running sandbox container was still on the old image SHA and the browser kept seeing the 503/unavailable response. Adds a `Recycle engine containers on image drift` step right before `Reap stray dev-deploy containers`. The step compares the new `galaxy-engine:dev` SHA against every running `galaxy-game-` container and, on drift, stops the backend, removes the container, wipes the bind-mounted per-game state directory (Engine.Init() writes turn-0 over any pre-existing `turn-N` files — silent state corruption otherwise), and cascade-deletes the lobby `games` row. The `dev-sandbox` bootstrap on the next backend boot finds no live sandbox and provisions a fresh one on the new engine image. When the engine sources are unchanged, the BuildKit cache hits and the SHA stays the same — the recycle step is a no-op and the running games keep their state across the deploy. Verified end-to-end against the live dev environment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 10:47:25 +02:00
Ilia Denisov	8565942392	feat(deploy): single-origin path-based deployment + project site Serve the whole stack behind one host: site at /, game UI at /game/, gateway REST at /api + /healthz, Connect at /rpc (prefix stripped by the edge Caddy). The built artifact is domain-agnostic — the UI talks to the gateway same-origin via relative URLs, so the same bundle runs under any host with no rebuild and with CORS disabled. - Rename the Connect proto service galaxy.gateway.v1.EdgeGateway -> edge.v1.Gateway; regenerate Go + TS; public path /rpc/edge.v1.Gateway. - Move the game UI under base path /game (env BASE_PATH); make the manifest, service-worker scope, WASM loader, and all navigation base-aware via a withBase helper. - Relative API + /rpc Connect prefix; Vite dev proxy mirrors the strip. - Rewrite the edge Caddy (dev + prod) for path-based routing; empty CORS allow-lists (same-origin); single host. - New VitePress project site (site/): i18n en/ru with switcher, LaTeX math, minimal monospace theme; built and served at /. - dev-deploy compose/Makefile + CI (dev-deploy, prod-build, new site-build) build and seed the site; probes hit /, /game/, /healthz. - Sync docs (ARCHITECTURE, gateway README/openapi, dev-deploy & local-dev READMEs, CLAUDE.md, ui/PLAN). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 18:19:07 +02:00
Ilia Denisov	04c7f6e68a	feat(ui): installable offline PWA — service worker, manifest, icons (F5) Native SvelteKit service worker (src/service-worker.ts): a version-keyed cache precaches the app shell + build artefacts (incl. core.wasm) + static files; activate purges old caches; the gateway is never intercepted; navigations fall back to the cached shell offline. Adds static/manifest.webmanifest, a generated placeholder icon set (scripts/gen-pwa-icons.mjs — dependency-free pure-Node PNG encoder), and manifest / theme-color / apple-touch tags in app.html. Gated by Playwright against a production preview (playwright.pwa.config.ts + tests/pwa/pwa.spec.ts via `pnpm test:pwa`, wired into ui-test): manifest + installable icons, SW registration + a single version-keyed cache, and offline shell load. Lighthouse is not used — its PWA category was removed in v12. Docs: ui/docs/pwa-strategy.md (+ index); F5 marked done. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 15:46:42 +02:00
Ilia Denisov	b729036778	build(ui): build core.wasm in CI, stop committing the binary (F6) core.wasm and wasm_exec.js are no longer tracked (untracked + gitignored). A reusable composite action .gitea/actions/build-wasm installs TinyGo (actions/cache'd) and runs `make -C ui wasm`; it runs in all three frontend-building workflows — ui-test (before Playwright; Vitest uses the fake Core and needs no build), dev-deploy, and prod-build. ui-test gains a Go setup (TinyGo shells out to Go); the deploy workflows already had one. Docs: ui/docs/wasm-toolchain.md, ui/README.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 14:29:33 +02:00
Ilia Denisov	b24d53b82f	ci: install pnpm into a per-job dir to fix the host-runner setup race `pnpm/action-setup@v4` defaults to installing pnpm in the shared `~/setup-pnpm`. On the single host-mode runner $HOME is shared across concurrent jobs, so when two pnpm jobs overlap (e.g. a post-merge `dev-deploy` and `ui-test`, which sit in different concurrency groups) their self-installers race and one fails with `ENOTEMPTY ... rmdir '~/setup-pnpm/node_modules/.bin/store/v11/files'` before the tests even run. Point each step's `dest` at `${{ runner.temp }}/setup-pnpm` (a per-job isolated directory) so concurrent jobs never share the install location. The action still adds `dest` to PATH, so setup-node's pnpm cache and later `pnpm` calls are unaffected; the pnpm package store stays shared (safe — pnpm locks it). Applied to the three workflows that set up pnpm: ui-test, dev-deploy, prod-build. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 17:26:49 +02:00
Ilia Denisov	f70258849f	fix(dev-deploy): seed geoip onto a named volume `docker restart galaxy-dev-backend` failed with "not a directory" after every dev-deploy workflow run. Root cause: the compose file bind-mounted the geoip database via a relative path (`../../pkg/geoip/test-data/test-data/GeoIP2-Country-Test.mmdb`). When the Gitea runner invoked `docker compose up`, the path resolved against the runner's ephemeral workspace under `/home/runner/.cache/act/<hash>/hostexecutor/...`. The bind source baked into the running container therefore pointed at that ephemeral path; the runner deleted the workspace once the workflow finished, and any later `docker restart` could not remount. Replace the bind with a named volume `galaxy-dev-geoip-data`, seeded at deploy time: - `tools/dev-deploy/docker-compose.yml`: mount `galaxy-dev-geoip-data:/var/lib/galaxy:ro` instead of a relative bind. Declare the volume in the top-level `volumes:` block. - `.gitea/workflows/dev-deploy.yaml`: new `Seed geoip volume` step (placed right after the existing UI-volume seed) copies the fixture from `pkg/geoip/test-data/test-data/` into the named volume via an ephemeral alpine container, the same pattern UI seeding already uses. - `tools/dev-deploy/Makefile`: new `seed-geoip` target performs the same copy from the persistent checkout. `up` and `rebuild` now depend on it, so a hand-run `make -C tools/dev-deploy up` populates the volume without operator action. - `tools/dev-deploy/README.md`: updated the make-targets table to list `seed-geoip`. - `tools/dev-deploy/KNOWN-ISSUES.md`: the entry for the restart failure is downgraded to a "fixed" postmortem; the symptom, cause, and where the fix lives are kept for future reference. Verification on the dev host (this branch checked out): $ make -C tools/dev-deploy up # populates the volume, brings stack healthy $ docker restart galaxy-dev-backend # used to error "not a directory" $ until [ "$(docker inspect -f '{{.State.Health.Status}}' galaxy-dev-backend)" = "healthy" ]; do sleep 2; done $ echo "ok" # backend up 6s, healthy The pre-existing sandbox engine `galaxy-game-80f3ce86-...` survived both `make up` and `docker restart` untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-19 01:59:38 +02:00
Ilia Denisov	a9087691a3	chore(ci): tidy CI/dev infra — drop local-ci, lift migration rule, scope by galaxy.stack label Five connected cleanups across the dev/CI infrastructure: 1. Drop tools/local-ci/. The standalone Gitea + act_runner stack was the legacy "offline workflow validator"; the per-stage CI gate now runs on gitea.lan and the directory was only retained as a fallback. Removing it leaves no operational dependency: backend, gateway, and game code have no references; documentation that pointed at it (CLAUDE.md, docs/ARCHITECTURE.md, ui/docs/testing.md, tools/dev-deploy/README.md, tools/local-dev/README.md) is updated in this same change. Historical "Verified on local-ci run N" markers in ui/PLAN.md are preserved unchanged. 2. Lift the pre-production single-migration rule. The rule forced every schema delta into 00001_init.sql and required a manual make clean-data wipe on every backward-incompatible change in tools/dev-deploy/. Future schema deltas now land as additive sequence-numbered files (00002_.sql, …) that goose applies automatically on backend startup; 00001_init.sql becomes an immutable baseline. Authoring conventions live in backend/internal/postgres/migrations/README.md. The chain may be squashed back into a fresh 00001 as a deliberate one-time operation before the first production deployment. 3. Document the deployment cadence. The dev environment is single-tenant: pushes to feature/ run the test workflows (go-unit, ui-test, integration) only; dev-deploy.yaml fires on push to development. A workflow_dispatch override on dev-deploy.yaml lets a developer preview a feature branch on the shared dev environment before merge; the next merge into development overwrites the manual deploy idempotently. 4. Scope compose-managed resources by an explicit galaxy.stack=<local-dev|dev-deploy> label. Both compose files stamp the label on every service, network, and named volume. Makefiles in tools/local-dev/ and tools/dev-deploy/ filter their engine-cleanup operations by (stack-label AND engine OCI title) so they never touch unrelated workloads on the same daemon. dev-deploy.yaml gains a pre-`compose up` step that reaps stale exited/dead containers under the dev-deploy stack label. 5. Backend now stamps the same galaxy.stack=<value> label on every engine container it spawns, sourced from a new BACKEND_STACK_LABEL env var (empty → label not applied; legacy-safe). Both compose files set it to their stack name (local-dev / dev-deploy). The contract is recorded in docs/ARCHITECTURE.md under "Container labels". A package-level test in backend/internal/runtime exercises both the label-present and label-absent paths. No tests intentionally regressed: go test ./backend/internal/{config,runtime,dockerclient} is green, both compose files validate cleanly, and the backend, gateway, and game modules all build. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 23:32:42 +02:00
Ilia Denisov	81917acc3e	dev-deploy: enable Dev Sandbox bootstrap and synthetic-report loader Two long-standing dev-environment ergonomics had not survived the move from the bespoke local-dev stack to the CI-driven dev-deploy: 1. `BACKEND_DEV_SANDBOX_EMAIL` defaulted to an empty string in the dev-deploy compose, so the auto-provisioned "Dev Sandbox" game never appeared on `https://www.galaxy.lan`. Bake `dev@galaxy.lan` as the default — matches `.env.example` and lets a developer who logs in with that email find a ready-to-play game in the lobby. 2. The lobby's synthetic-report loader was gated on `import.meta.env.DEV`, which is true only for `vite dev` (the tools/local-dev path). The long-lived dev environment builds with `vite build` (production mode), so the section was always stripped from its bundle. Gate it on an explicit `VITE_GALAXY_DEV_AFFORDANCES` flag instead and set it both in `.env.development` (preserves `pnpm dev` behaviour) and in the `dev-deploy.yaml` build step. The `prod-build.yaml` build path leaves the flag unset, so production stays clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-16 21:46:24 +02:00
Ilia Denisov	859b157a59	auth dev-fixed-code bypasses attempts cap; dev-deploy gains manual dispatch Two problems showed up while trying to log into the long-lived dev environment with the dev-fixed code `123456`: 1. `ConfirmEmailCode` checked the per-challenge attempts ceiling before the dev-fixed-code override. A developer who burned past `ChallengeMaxAttempts` on an existing un-consumed challenge (easy to trigger when the throttle reuses one challenge_id) hit `ErrTooManyAttempts` and the UI rendered "code expired or already used" even though the fixed code was correct. Reorder so the dev-fixed-code branch runs first and bypasses both the bcrypt verify and the attempts gate. Production stays unaffected because production loaders refuse to set `DevFixedCode`. 2. `dev-deploy.yaml` only fires on push to `development`, so the matching docker-compose default change for `BACKEND_AUTH_DEV_FIXED_CODE` could not reach the running stack before this PR merged. Add `workflow_dispatch: {}` so a developer can deploy any branch — typically a feature branch under review — from the Gitea Actions UI without waiting for the merge. Covered by a new `TestConfirmEmailCodeDevFixedCodeBypassesAttemptsCeiling` integration test that burns through the ceiling with wrong codes then proves the dev-fixed code still produces a session. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-16 21:28:30 +02:00
Ilia Denisov	1a0e3e992f	ci/ui-test: queue runs in one bucket instead of cancelling `cancel-in-progress: true` killed run #73 even though it was the only ui-test in its concurrency group — Gitea appears to cancel the in-progress job on its own under that setting in some edge cases. Switch to a singleton group with `cancel-in-progress: false`. The new behaviour is simple queueing: only one ui-test workflow runs at a time across the repository, the rest wait. Vite-on-:5173 cannot collide because there is never a second ui-test alive. The wall-time hit is bounded — ui-test is ~2 minutes — and bursts are rare enough that queueing is cheap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 08:51:54 +02:00
Ilia Denisov	6e6186a571	ci/ui-test: key concurrency by head sha, not gitea.ref `gitea.ref` differs between push (`refs/heads/<branch>`) and pull_request (`refs/pull/N/head`) events even for the same commit, so the two parallel runs land in different concurrency groups and the Vite-on-:5173 collision is not suppressed. Switching the key to the head sha (`gitea.event.pull_request.head.sha || gitea.sha`) collapses both events into one bucket, leaving exactly one ui-test alive per commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 08:46:00 +02:00
Ilia Denisov	e3bb30201d	ci/ui-test: serialise per-ref + clear stale Vite before Playwright Two ui-test jobs cannot coexist on the same host: Playwright's `webServer` spec spawns `pnpm dev` on :5173, and on a host-mode runner the port lives in the host namespace shared by every job. ui-test #67 hit "Error: http://localhost:5173 is already used" because a parallel job's Vite still held the port. Two changes: 1. `concurrency: ui-test-${{ gitea.ref }}` with `cancel-in-progress: true`. New push/PR runs against the same ref kill any earlier ui-test before starting, so we never have two `pnpm dev`s alive at once. 2. `pkill -f 'vite dev' || true` plus `fuser -k 5173/tcp` right before Playwright. Defence in depth in case the concurrency cancellation does not reap the spawned shell promptly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 08:42:08 +02:00
Ilia Denisov	2a95bf4a50	ci: re-enable actions cache now that the runner serves it The Gitea Actions cache service now answers on 10.200.0.1:43513 (post nftables fix on the runner side). Turn `cache: true` and `cache: pnpm` back on so setup-go/setup-node can use it for cross-job tarball caching on top of the host-persistent caches we already rely on. The setup-* actions still tolerate the cache being unavailable, so this is reversible to `cache: false` if the service goes away again. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 07:39:39 +02:00
Ilia Denisov	8058f26397	ci: drop cache: setting in setup-go/setup-node `cache: true` (setup-go) and `cache: pnpm` (setup-node) make the actions push and pull tarballs through the Gitea Actions cache service at 192.168.0.222:43513. That endpoint currently does not answer, so every workflow burns minutes per run on reserveCache retries before the action gives up. In host-mode the real caches live under the runner user's $HOME (~/go/pkg/mod, ~/.cache/go-build, ~/.local/share/pnpm, ~/.cache/ms-playwright) and persist between jobs without any actions/cache plumbing. Switching cache: off avoids the zombie retries and uses the local disk caches the runner already has warm. Reviving the cache service is a separate TODO. Until then this is the simpler and faster baseline. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 06:39:22 +02:00
Ilia Denisov	9135991887	ci/ui-test: drop --with-deps now that runner is host-mode `playwright install --with-deps` shells out to `sudo apt-get install` for the system libraries that headless browsers need. In a job container that runs as root this is silent; on a host-mode runner the non-interactive sudo prompts for a password, fails three times, and the step exits 1. Drop --with-deps. The system .so libraries are installed once on the host via `pnpm exec playwright install-deps` (or the equivalent apt-get incantation); workflow runs only need to fetch the browser binaries themselves, which live under the runner user's home and need no privilege. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 01:59:45 +02:00
Ilia Denisov	4a88b24f4b	ci: drop GIT_SSL_NO_VERIFY now that runner is host-mode The act_runner now executes jobs natively on the host (no per-job container), so actions/checkout uses the host's system CA store, which already trusts the host-Caddy root CA. The workaround that disabled TLS verification for `git fetch` is no longer needed and just hides legitimate cert issues if they ever appear. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 01:04:11 +02:00
Ilia Denisov	9ebb2e7f0f	ci: rename workflows for Gitea UI readability Switches the `name:` field on every workflow to the bulleted style: Tests · Go (go-unit.yaml) Tests · UI (ui-test.yaml) Tests · Integration (integration.yaml) Deploy · Dev (dev-deploy.yaml) Build · Prod (prod-build.yaml) Deploy · Prod (deploy-prod.yaml) File names stay the same so existing path filters and any URL references continue to work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 00:22:53 +02:00
Ilia Denisov	0da360a644	dev-deploy: fix backend startup in CI Two bugs surfaced on the first real merge into development: 1. `${{ env.HOME }}` evaluates to empty string at the workflow stage, so GALAXY_DEV_GAME_STATE_DIR became `/.galaxy-dev/game-state`. Resolve in the shell instead of YAML. 2. The compose bind-mount of GeoIP2-Country-Test.mmdb referenced a path inside the runner's workspace volume, which the host Docker daemon cannot see — it created an empty directory and the backend crashed with "geoip database: is a directory" in a restart loop. Bake the file into the backend image so dev-deploy no longer needs a bind-mount; local-dev compose still mounts it on top for swap-in during development. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 00:22:16 +02:00
Ilia Denisov	c6c5f3c8dd	ci: skip TLS verify for actions/checkout on LAN Gitea The Gitea host serves https://gitea.iliadenisov.ru with a cert signed by host-Caddy's internal CA, which the runner-image's CA bundle does not trust. actions/checkout@v4 fails on `git fetch` as a result, so every workflow on gitea.lan has been failing — visible only now that we made gitea.lan the primary CI target. Sets GIT_SSL_NO_VERIFY=true on every workflow as a quick fix. Safe in practice because both endpoints sit on the same LAN. The long-term fix is to bake the Caddy root CA into the runner image and drop this env. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 23:43:51 +02:00
Ilia Denisov	f316952c12	ci: split workflows for linear development flow Reshapes .gitea/workflows/ around the new main ← development ← feature/* branching model: - go-unit.yaml — Go unit tests, runs on push/PR matching Go paths - ui-test.yaml — narrowed to Vitest + Playwright only (Go tests now live in go-unit.yaml) - integration.yaml — testcontainers suite, fires on PR to development/main and on push to development - dev-deploy.yaml — builds the stack and (re)deploys tools/dev-deploy/ on every merge into development - prod-build.yaml — builds prod images on push to main and uploads docker save bundles as artifacts (30-day retention) - deploy-prod.yaml — workflow_dispatch placeholder for the future SSH-based rollout ui-release.yaml is removed; its v* tag trigger is superseded by prod-build.yaml plus the manual deploy-prod entry point. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 23:26:46 +02:00
Ilia Denisov	39b7b2ef29	ci: skip docs-only triggers; document per-stage local-ci gate ui-test workflow gains a `!*/.md` negation so commits touching only markdown (READMEs, PLAN.md updates, topic docs) no longer kick off the full Go + Vitest + Playwright pipeline. Mixed commits keep triggering because at least one positive path (`ui/`, `gateway/`, …) still matches. Project CLAUDE.md adds a per-stage CI gate section so the local Gitea Actions runner is exercised at the close of every stage from any PLAN.md, with the push step pre-authorised. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 09:47:27 +02:00
Ilia Denisov	dc1c9b109c	phase 3	2026-05-07 09:40:37 +02:00
Ilia Denisov	1b5749bd31	fix: make ci green on a fresh runner Two issues surfaced by the first end-to-end ui-test.yaml run on a clean Linux runner that don't reproduce locally: - pkg/geoip tests load fixtures from the pkg/geoip/test-data git submodule (MaxMind-DB). actions/checkout@v4 does not fetch submodules by default, so the fixture path is missing on the runner. Both ui-test and ui-release workflows now check out with submodules: recursive. - pkg/util/TestWritable asserts that /usr/lib is not writable, which holds for unprivileged users but fails inside the catthehacker workflow container that runs as root. Skip that branch when os.Geteuid() == 0; the root-only "the writable dir is writable" branch still runs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 08:35:34 +02:00
Ilia Denisov	7450006ed3	phase 2: ui testing infrastructure Vitest + @testing-library/jest-dom matchers wired through tests/setup.ts. Playwright with four projects: chromium-desktop, webkit-desktop, chromium-mobile-iphone-13, chromium-mobile-pixel-5; traces and screenshots retained on failure. .gitea/workflows/ui-test.yaml runs Tier 1 on every push and pull request: monorepo Go service tests (backend with -p 1 to dodge testcontainer contention; gateway, game, every pkg/<name> module), pnpm install --frozen-lockfile, playwright install --with-deps, pnpm test, pnpm exec playwright test. Uploads playwright-report and test-results on failure. Integration suite stays gated behind make -C integration integration; deprecated client/ excluded. .gitea/workflows/ui-release.yaml mirrors Tier 1 on v* tag push and keeps commented placeholders for visual regression (Phase 33) and macOS iOS smoke (Phase 32). ui/docs/testing.md documents both tiers and the local invocations that mirror what CI runs. ui/PLAN.md Phase 2 marked done; Phase 3 gains a bullet to extend the go test command with ./ui/core/...; Phase 36 has the renamed release workflow path. tools/local-ci/ ships a self-contained docker-compose for verifying workflows against a local Gitea + arm64 act_runner before pushing to a real instance. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-07 08:24:44 +02:00

29 Commits