3d06f49f3ccc7503ee0eade82f93b34ef65e404c
395 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
3d06f49f3c |
fix(generator): drop incorrect distinctness assert in TestPlanetRandomName
`RandomName` builds the suffix as two independent `rand.Intn(1000)` calls, so the two 4-digit halves collide on ~0.1% of runs. The sub-test asserted `g[2] != g[3]`, which flakes whenever the same value lands twice: once per ~1000 sub-runs per class. Across the seven `PlanetClass` rows the integration suite hit it on `#199 go-unit.yaml` against `feature/subscribe-events-heartbeat` (`"0074"` collision).
Distinctness is not a property `RandomName` promises and is not load-bearing for callers: `game/internal/controller/generate_game.go` uses these names for planet labels and already tolerates duplicate names across planets, so collisions inside one name are no worse than collisions between names.
Drop the assert; keep the format and class-prefix checks, which are the actual contract.
Stress-tested with `-count=200`: 200 consecutive iterations × 7 classes = 1400 sub-runs without a single failure, where the prior version's flake probability would have surfaced ~once on average.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
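The collision arithmetic quoted in the `TestPlanetRandomName` fix above can be checked with a few lines of Go. This is a minimal sketch, not the project's code: `suffixHalves`, `collisionProbability`, and `expectedCollisions` are hypothetical names, and the `%04d` formatting is an assumption inferred from the `"0074"` collision the message mentions.

```go
package main

import (
	"fmt"
	"math/rand"
)

// suffixHalves is a hypothetical stand-in for the suffix logic the commit
// describes: two independent draws from rand.Intn(1000), each rendered as a
// zero-padded 4-digit string (the padding is an assumption).
func suffixHalves(r *rand.Rand) (string, string) {
	return fmt.Sprintf("%04d", r.Intn(1000)), fmt.Sprintf("%04d", r.Intn(1000))
}

// collisionProbability is the exact chance that two independent uniform draws
// from n values are equal: n matching pairs out of n*n possible pairs.
func collisionProbability(n int) float64 {
	return 1.0 / float64(n)
}

// expectedCollisions is the mean number of colliding sub-runs in a batch.
func expectedCollisions(n, subRuns int) float64 {
	return collisionProbability(n) * float64(subRuns)
}

func main() {
	r := rand.New(rand.NewSource(1))
	a, b := suffixHalves(r)
	fmt.Println("sample halves:", a, b)
	fmt.Printf("collision probability: %.4f\n", collisionProbability(1000))
	fmt.Printf("expected collisions in 200 x 7 sub-runs: %.1f\n", expectedCollisions(1000, 200*7))
}
```

With 1000 possible values per half, the per-sub-run collision rate is 1/1000, and 1400 sub-runs give an expected 1.4 collisions, which matches the "~once on average" figure in the message.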
|
|
08345606a5 |
Merge pull request 'fix(dev-deploy): explicit Cache-Control on the UI surface' (#18) from feature/caddy-cache-headers into development
Deploy · Dev / deploy (push) Successful in 27s
|
||
|
|
b85a9e1b9b |
fix(dev-deploy): explicit Cache-Control on the UI surface
Caddy's `file_server` did not set Cache-Control on the SvelteKit build, so browsers fell back to heuristic caching keyed off Last-Modified. On the long-lived dev environment the heuristic window leaves the previous deploy's `index.html` cached for minutes to hours, and Safari combined that with stale conditional requests into a visible multi-second freeze on every reload (the reproduction was "private window reloads instantly, normal window hangs; clearing Safari caches restores normal speed"). Push delivery itself works (heartbeat keeps the SubscribeEvents stream alive), but the bundle path stalls behind the browser revalidating a chain of stale chunks.
Mirror the standard SvelteKit cache split inside both Caddyfiles:
- `_app/immutable/*` (hash-named JS/CSS chunks Vite emits with content-addressed file names): `Cache-Control: public, max-age=31536000, immutable`. Safe to cache forever because the name changes whenever the content does, so the next deploy serves new files under new URLs.
- Everything else (`index.html` fallback via `try_files`, `env.js`, `version.json`, `core.wasm`, `wasm_exec.js`, `favicon.svg`): `Cache-Control: no-cache, must-revalidate`. The browser still uses the cached body when the ETag matches, but it always asks first; a fresh deploy reaches the user on the next reload without a manual cache clear.
Smoke-tested locally: a docker-run Caddy with this config returns the immutable header only for `/_app/immutable/*` and the no-cache header for `/index.html`, `/env.js`, and the SPA-fallback path `/some/route`. The Caddyfile passes `caddy validate` in both `Caddyfile.dev` and `Caddyfile.prod`; the pre-existing formatting warning on line 7 is untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
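The cache split described in the Cache-Control fix above is a two-way decision on the request path. The real change lives in the Caddyfiles; the Go sketch below only models the policy so it can be read and tested in isolation. `cacheControlFor` is a hypothetical helper, and the chunk file name in `main` is made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	immutablePolicy  = "public, max-age=31536000, immutable"
	revalidatePolicy = "no-cache, must-revalidate"
)

// cacheControlFor is a hypothetical helper mirroring the split in the commit:
// content-addressed chunks under /_app/immutable/ are cached forever, and
// every other path (including SPA-fallback routes) must revalidate.
func cacheControlFor(path string) string {
	if strings.HasPrefix(path, "/_app/immutable/") {
		return immutablePolicy
	}
	return revalidatePolicy
}

func main() {
	// The chunk name is invented; the other paths come from the commit message.
	paths := []string{"/_app/immutable/chunks/app.3f9a1c.js", "/index.html", "/env.js", "/some/route"}
	for _, p := range paths {
		fmt.Printf("%-40s %s\n", p, cacheControlFor(p))
	}
}
```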
|
|
8170abd5fa | Merge pull request 'feat(gateway): unsigned gateway.heartbeat keeps Safari push streams alive' (#17) from feature/subscribe-events-heartbeat into development | ||
|
|
14b65389ef |
feat(gateway): unsigned gateway.heartbeat keeps Safari push streams alive
Tests · UI / test (push) Successful in 2m35s
Tests · Go / test (push) Successful in 1m56s
Tests · UI / test (pull_request) Has been cancelled
Tests · Integration / integration (pull_request) Successful in 1m42s
Tests · Go / test (pull_request) Successful in 2m0s
Browser fetch-streaming layers close response bodies they consider
idle after roughly 15-30 s without incoming bytes. Safari is the
most aggressive, but the symptom matters everywhere: a quiet
SubscribeEvents stream (lobby, between turns, mailbox empty) gets
torn down by the browser, the EventStream singleton reconnects with
backoff, and any push event that fires inside the reconnect window
is lost because `push.Hub` queues are not persisted across
subscription closes. The user-visible failure mode is the
intermittent "Fetch API cannot load … due to access control checks"
console error (a misleading WebKit symptom — CORS headers are
actually present) plus missed turn-ready / mail-received toasts.
Server-side fix: a silence-based heartbeat at the
`authenticatedPushStreamService` wrapper layer. After the signed
`gateway.server_time` bootstrap event, gateway wraps the bound
stream with `heartbeatingStream`. Every tail Send (fan-out, future
variants) resets the silence timer; when the timer elapses, a
goroutine emits `gateway.heartbeat` with only `EventType` set —
everything else stays at proto3 defaults, so the wire frame is
~45 bytes amortised. A `sendMu` serialises the heartbeat goroutine
with tail Sends because grpc.ServerStream.Send is not goroutine-safe.
The heartbeat is intentionally UNSIGNED: heartbeats carry no
payload, dispatch to no handler on the client, and an injected
heartbeat causes no user-visible state change. TLS still
protects the wire and real events keep the signed envelope
unchanged. Documented in `docs/ARCHITECTURE.md` § 15 alongside the
per-scale bandwidth projection (100…100 000 clients × 15…60 s).
Config: new `GATEWAY_PUSH_HEARTBEAT_INTERVAL` (default `15s`,
`0s` disables). Telemetry: new
`gateway.push.heartbeats_sent{outcome}` counter so operators can
budget bandwidth and spot a sudden `outcome=error` bump as an
upstream-failing-before-flush signal.
Client (`ui/frontend/src/api/events.svelte.ts`): early `continue`
on `event.eventType === "gateway.heartbeat"` before `verifyEvent`,
`verifyPayloadHash`, or dispatch — empty signature would otherwise
trip SignatureError and reconnect. A leading heartbeat still flips
`connectionStatus` to `connected` and resets backoff, because
receiving one is proof the stream is healthy.
Tests:
- `push_heartbeat_test.go`: unit tests for the wrapper — zero
interval returns nil, heartbeat fires after silence, real Send
resets the timer, Stop / context-cancel halt the goroutine,
Send errors propagate.
- `server_test.go`: integration tests through the full gateway
pipeline — heartbeat fires after the configured silence window,
zero interval keeps the stream silent.
- `config_test.go`: default applied, env-override parsed,
negative value rejected.
- `events.test.ts`: heartbeat skipped before verification + not
dispatched to handlers; leading heartbeat still flips
`connectionStatus` to `connected`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
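The silence-based heartbeat described in the `gateway.heartbeat` commit above can be sketched as a wrapper around a send-only stream. This is an illustration under stated assumptions, not the gateway's implementation: `Event`, `sender`, `recorder`, and `runDemo` are hypothetical stand-ins, a channel plus `time.After` replaces whatever timer mechanism the real wrapper uses, and returning nil for a non-positive interval is inferred from the "zero interval returns nil" test note.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Event and sender are hypothetical stand-ins for the proto message and the
// gRPC server stream; only the shape this sketch needs is modelled.
type Event struct{ EventType string }
type sender interface{ Send(*Event) error }

type heartbeatingStream struct {
	inner    sender
	interval time.Duration
	sendMu   sync.Mutex    // serialises Sends: the inner stream is not goroutine-safe
	reset    chan struct{} // poked by every real Send to restart the silence window
	cancel   context.CancelFunc
	done     chan struct{}
}

// newHeartbeatingStream returns nil when the interval is not positive (disabled).
func newHeartbeatingStream(ctx context.Context, inner sender, interval time.Duration) *heartbeatingStream {
	if interval <= 0 {
		return nil
	}
	ctx, cancel := context.WithCancel(ctx)
	h := &heartbeatingStream{inner: inner, interval: interval,
		reset: make(chan struct{}, 1), cancel: cancel, done: make(chan struct{})}
	go h.loop(ctx)
	return h
}

func (h *heartbeatingStream) loop(ctx context.Context) {
	defer close(h.done)
	for {
		select {
		case <-ctx.Done():
			return
		case <-h.reset: // a real event went out; the next iteration starts a fresh window
		case <-time.After(h.interval): // silence elapsed: heartbeat with only EventType set
			h.sendMu.Lock()
			err := h.inner.Send(&Event{EventType: "gateway.heartbeat"})
			h.sendMu.Unlock()
			if err != nil {
				return // stop heartbeating on a broken stream
			}
		}
	}
}

// Send forwards a real event, then pokes the loop so the silence timer restarts.
func (h *heartbeatingStream) Send(e *Event) error {
	h.sendMu.Lock()
	err := h.inner.Send(e)
	h.sendMu.Unlock()
	select {
	case h.reset <- struct{}{}:
	default: // a reset is already pending
	}
	return err
}

// Stop cancels the heartbeat goroutine and waits for it to exit.
func (h *heartbeatingStream) Stop() { h.cancel(); <-h.done }

// recorder is a test double that counts sent events by type.
type recorder struct {
	mu     sync.Mutex
	counts map[string]int
}

func (r *recorder) Send(e *Event) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.counts[e.EventType]++
	return nil
}

// runDemo sends one real event, stays silent for waitMs, and reports the counts.
func runDemo(intervalMs, waitMs int) (heartbeats, events int, enabled bool) {
	rec := &recorder{counts: map[string]int{}}
	h := newHeartbeatingStream(context.Background(), rec, time.Duration(intervalMs)*time.Millisecond)
	if h == nil {
		return 0, 0, false
	}
	_ = h.Send(&Event{EventType: "game.turn.ready"})
	time.Sleep(time.Duration(waitMs) * time.Millisecond)
	h.Stop()
	return rec.counts["gateway.heartbeat"], rec.counts["game.turn.ready"], true
}

func main() {
	hb, ev, _ := runDemo(20, 110)
	fmt.Printf("heartbeats=%d events=%d\n", hb, ev)
}
```

The mutex is held only around the inner `Send`, so the heartbeat goroutine and the fan-out path never write to the stream concurrently, which is the property the commit attributes to `sendMu`.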
|
|
8f84075c4b | Merge pull request 'fix(battle-viewer): unblock synthetic-game battle load' (#16) from feature/synthetic-battle-loading-fix into development | ||
|
|
bde01b1ce2 |
fix(battle-viewer): unblock synthetic-game battle load
The Phase 28 ConnectRPC migration of the battle viewer added a guard in `lib/active-view/battle.svelte` that waits for the surrounding layout to publish a `GalaxyClient` before issuing the fetch. The in-game shell layout deliberately skips `galaxyClient.set(...)` on the synthetic branch (gateway is not reachable in synthetic mode), so for any battle opened from a synthetic-report game the viewer sat on "loading battle…" forever: `fetchBattle` was never called, so the synthetic-fixture short-circuit it carries was unreachable.
Let the guard skip synthetic ids: `fetchBattle` already resolves those through `lookupSyntheticBattle` and never touches the client, so its signature widens to `GalaxyClient | null` and the synthetic path passes `null`. The live path still waits for the handle as before; a `null` client on the live path now fails fast with a transport-level `BattleFetchError` instead of silently sitting on `loading`.
Tests:
- Existing "loading placeholder" smoke now uses a non-synthetic game id so it keeps asserting the live-path wait.
- Two new cases pin the synthetic behaviour: missing fixture → `battle-not-found`; registered fixture → `BattleViewer` mounts.
Docs:
- `docs/FUNCTIONAL.md` §6.5 still described the pre-Phase-28 raw REST path. Updated to the signed ConnectRPC command and noted the synthetic short-circuit. Russian mirror updated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
82bdb6777a |
Merge pull request 'fix(dev-deploy): seed geoip onto a named volume' (#15) from feature/dev-deploy-geoip-volume into development
Deploy · Dev / deploy (push) Successful in 28s
|
||
|
|
f70258849f |
fix(dev-deploy): seed geoip onto a named volume
`docker restart galaxy-dev-backend` failed with "not a directory"
after every dev-deploy workflow run. Root cause: the compose file
bind-mounted the geoip database via a relative path
(`../../pkg/geoip/test-data/test-data/GeoIP2-Country-Test.mmdb`).
When the Gitea runner invoked `docker compose up`, the path
resolved against the runner's ephemeral workspace under
`/home/runner/.cache/act/<hash>/hostexecutor/...`. The bind source
baked into the running container therefore pointed at that
ephemeral path; the runner deleted the workspace once the workflow
finished, and any later `docker restart` could not remount.
Replace the bind with a named volume `galaxy-dev-geoip-data`,
seeded at deploy time:
- `tools/dev-deploy/docker-compose.yml`: mount
`galaxy-dev-geoip-data:/var/lib/galaxy:ro` instead of a relative
bind. Declare the volume in the top-level `volumes:` block.
- `.gitea/workflows/dev-deploy.yaml`: new `Seed geoip volume` step
(placed right after the existing UI-volume seed) copies the
fixture from `pkg/geoip/test-data/test-data/` into the named
volume via an ephemeral alpine container, the same pattern UI
seeding already uses.
- `tools/dev-deploy/Makefile`: new `seed-geoip` target performs
the same copy from the persistent checkout. `up` and `rebuild`
now depend on it, so a hand-run `make -C tools/dev-deploy up`
populates the volume without operator action.
- `tools/dev-deploy/README.md`: updated the make-targets table to
list `seed-geoip`.
- `tools/dev-deploy/KNOWN-ISSUES.md`: the entry for the restart
failure is downgraded to a "fixed" postmortem; the symptom,
cause, and where the fix lives are kept for future reference.
Verification on the dev host (this branch checked out):
$ make -C tools/dev-deploy up # populates the volume, brings stack healthy
$ docker restart galaxy-dev-backend # used to error "not a directory"
$ until [ "$(docker inspect -f '{{.State.Health.Status}}' galaxy-dev-backend)" = "healthy" ]; do sleep 2; done
$ echo "ok" # backend up 6s, healthy
The pre-existing sandbox engine `galaxy-game-80f3ce86-...` survived
both `make up` and `docker restart` untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
d19aa3aac5 |
Merge pull request 'fix(integration): scope preclean to galaxy.stack=integration' (#14) from feature/preclean-stack-scope into development
Tests · Integration / integration (push) Successful in 1m36s
|
||
|
|
a338ebf058 |
fix(integration): scope preclean to galaxy.stack=integration
Tests · Integration / integration (pull_request) Successful in 1m37s
Root cause for the long-standing "Dev Sandbox flips to cancelled after dev-deploy" symptom in push-triggered cycles: when `integration.yaml` runs in parallel with `dev-deploy.yaml`, its `integration/scripts/preclean.sh` issues a `docker rm -f` over every container labelled `galaxy.backend=1`. That label is stamped by the backend's runtime adapter on every engine it spawns, including the engines living in the long-lived dev-deploy environment on the same Docker daemon. Each post-merge auto-deploy therefore had the integration preclean wipe the dev-sandbox engine, and the new backend's reconciler tick observed `container disappeared` and cascaded the sandbox into `cancelled`.
Fix:
- `integration/testenv/backend.go` now sets `BACKEND_STACK_LABEL=integration` on every backend-under-test, so the engines spawned by integration carry `galaxy.stack=integration` in addition to `galaxy.backend=1`. The backend support for this env was added in the previous CI tidy-up PR (#13).
- `integration/scripts/preclean.sh` gains a multi-label AND filter helper and uses it to scope engine cleanup to the combination `galaxy.backend=1 AND galaxy.stack=integration`. dev-deploy and local-dev engines carry different `galaxy.stack` values, so the AND match leaves them alone.
- `docs/ARCHITECTURE.md` "Container labels": refreshed to call out the AND-scoping rule and the new integration backend stamp.
- `tools/dev-deploy/KNOWN-ISSUES.md`: the sandbox-cancel entry gets an "Update" section recording the root cause and the fix; the status is downgraded to "partially fixed" because the solo `workflow_dispatch` reproduction (which does NOT trigger integration) remains unexplained.
- `tools/dev-deploy/KNOWN-ISSUES.md`: separately, document the `docker restart galaxy-dev-backend` failure caused by the runner-workspace bind-mount that surfaced while diagnosing this issue. Workaround: `make -C tools/dev-deploy up` from the persistent checkout. Real fix is a follow-up (bake fixture into image or copy to named volume).
Verification:
- `go build ./backend/... ./integration/...`: clean.
- `bash -n integration/scripts/preclean.sh`: syntax OK.
- Live AND-filter check on the dev host: `docker ps -aq --filter label=galaxy.backend=1 --filter label=galaxy.stack=integration` returns nothing while the dev-deploy engine `galaxy-game-80f3ce86-...` keeps running.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
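The AND-scoping rule from the preclean fix above comes down to requiring every wanted label to match before a container is eligible for removal. The actual helper is a shell function in `preclean.sh`; the Go sketch below only models the selection logic. `container`, `matchesAll`, `selectForCleanup`, and the container names in `sampleEngines` are hypothetical.

```go
package main

import "fmt"

// container is a hypothetical, minimal view of a Docker container.
type container struct {
	Name   string
	Labels map[string]string
}

// matchesAll reports whether c carries every wanted label with the wanted
// value (logical AND across labels).
func matchesAll(c container, want map[string]string) bool {
	for k, v := range want {
		if got, ok := c.Labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// selectForCleanup returns the names of the containers that match all labels.
func selectForCleanup(all []container, want map[string]string) []string {
	var names []string
	for _, c := range all {
		if matchesAll(c, want) {
			names = append(names, c.Name)
		}
	}
	return names
}

// sampleEngines models three engines on one daemon; the names are invented.
func sampleEngines() []container {
	return []container{
		{Name: "integration-engine", Labels: map[string]string{"galaxy.backend": "1", "galaxy.stack": "integration"}},
		{Name: "dev-deploy-engine", Labels: map[string]string{"galaxy.backend": "1", "galaxy.stack": "dev-deploy"}},
		{Name: "unstamped-engine", Labels: map[string]string{"galaxy.backend": "1"}},
	}
}

func main() {
	broad := map[string]string{"galaxy.backend": "1"}
	scoped := map[string]string{"galaxy.backend": "1", "galaxy.stack": "integration"}
	fmt.Println("old filter removes:", selectForCleanup(sampleEngines(), broad))
	fmt.Println("new filter removes:", selectForCleanup(sampleEngines(), scoped))
}
```

The single-label filter selects all three engines, which is the failure mode the commit describes; the two-label filter selects only the integration engine.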
|
|
f91cf6eb41 | Merge pull request 'chore(ci): tidy CI/dev infra — drop local-ci, lift migration rule' (#13) from feature/ci-tidy-up into development | ||
|
|
daed2690c1 |
fix(compose): keep galaxy.stack label on containers only
The previous commit stamped `galaxy.stack=<value>` on services, volumes, and networks. Putting it on volumes/networks changes their compose config-hash on every label revision, so `docker compose up` tries to recreate them — which on the long-lived dev environment either destroys the postgres data volume or deadlocks while trying to remove `galaxy-dev-internal` with containers still bound to it. Observed live: run #184 hung in compose recreate after the three stateful services were stopped, with no recovery. Containers alone are sufficient for the cleanup contract (we filter containers, not volumes or networks). Roll back the label on volumes and networks in both compose files and capture the rule in docs/ARCHITECTURE.md so the next contributor does not reintroduce it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
a9087691a3 |
chore(ci): tidy CI/dev infra — drop local-ci, lift migration rule, scope by galaxy.stack label
Five connected cleanups across the dev/CI infrastructure:
1. Drop tools/local-ci/. The standalone Gitea + act_runner stack was
the legacy "offline workflow validator"; the per-stage CI gate now
runs on gitea.lan and the directory was only retained as a
fallback. Removing it leaves no operational dependency: backend,
gateway, and game code have no references; documentation that
pointed at it (CLAUDE.md, docs/ARCHITECTURE.md, ui/docs/testing.md,
tools/dev-deploy/README.md, tools/local-dev/README.md) is updated
in this same change. Historical "Verified on local-ci run N"
markers in ui/PLAN.md are preserved unchanged.
2. Lift the pre-production single-migration rule. The rule forced
every schema delta into 00001_init.sql and required a manual
make clean-data wipe on every backward-incompatible change in
tools/dev-deploy/. Future schema deltas now land as additive
sequence-numbered files (00002_*.sql, …) that goose applies
automatically on backend startup; 00001_init.sql becomes an
immutable baseline. Authoring conventions live in
backend/internal/postgres/migrations/README.md. The chain may be
squashed back into a fresh 00001 as a deliberate one-time
operation before the first production deployment.
3. Document the deployment cadence. The dev environment is
single-tenant: pushes to feature/* run the test workflows
(go-unit, ui-test, integration) only; dev-deploy.yaml fires on
push to development. A workflow_dispatch override on
dev-deploy.yaml lets a developer preview a feature branch on the
shared dev environment before merge; the next merge into
development overwrites the manual deploy idempotently.
4. Scope compose-managed resources by an explicit
galaxy.stack=<local-dev|dev-deploy> label. Both compose files
stamp the label on every service, network, and named volume.
Makefiles in tools/local-dev/ and tools/dev-deploy/ filter their
engine-cleanup operations by (stack-label AND engine OCI title)
so they never touch unrelated workloads on the same daemon.
dev-deploy.yaml gains a pre-`compose up` step that reaps stale
exited/dead containers under the dev-deploy stack label.
5. Backend now stamps the same galaxy.stack=<value> label on every
engine container it spawns, sourced from a new BACKEND_STACK_LABEL
env var (empty → label not applied; legacy-safe). Both compose
files set it to their stack name (local-dev / dev-deploy). The
contract is recorded in docs/ARCHITECTURE.md under
"Container labels". A package-level test in
backend/internal/runtime exercises both the label-present and
label-absent paths.
No tests intentionally regressed: go test ./backend/internal/{config,
runtime,dockerclient} is green, both compose files validate cleanly,
and the backend, gateway, and game modules all build.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
5eec7013ba | Merge pull request 'chore(dev-deploy): KNOWN-ISSUES entry for sandbox-cancel after redispatch' (#12) from chore/dev-sandbox-cancel-todo into development | ||
|
|
49f614926a |
KNOWN-ISSUES: park sandbox-cancel; owner rejected host-side hypotheses
After the live investigation, the project owner confirms that none of the host-side cleanup paths apply: no docker prune cron, no manual `docker rm`, no `dockerd` restart in the window, and the engine binary does not crash while idling on API calls. Replace the host-side hypothesis list with a one-line note that they were considered and rejected, narrow the open suspicion to the `dev-deploy.yaml` job sequence (`docker build` + `docker compose build` + the alpine `docker run --rm` for UI seeding + `docker compose up -d --wait --remove-orphans`), and park the entry. Reopen if the symptom recurs with a fresh `docker events --since 0` capture armed before the deploy starts. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
cadb72b412 |
KNOWN-ISSUES: rule out compose orphan reap; narrow to host-side reap
A live `docker inspect` of an engine container and two redispatch
runs with `docker events` captured confirm:
- Engine has no `com.docker.compose.*` labels and `AutoRemove=false`,
so `--remove-orphans` cannot reap it.
- Two consecutive `dev-deploy.yaml` redispatches with an engine
already running emitted `die` / `destroy` events only for
`galaxy-dev-{backend,api,caddy}` — never for the engine.
- The reconciler tick that fires 60s after backend recreate
correctly matched the surviving engine in both cases
(`status=running` in both `games` and `runtime_records`).
- `runtime.Service` has no `Shutdown` that proactively removes
engine containers, so a graceful backend exit also leaves them
alone.
The repro window therefore needs a separate trigger that removed
the engine container outside of compose. The new hypotheses point
at host-side `docker prune` jobs, a `dockerd` restart that lost the
container, or an early `Engine.Init` failure that exited the engine
before `status=running` reached the runtime row. The investigation
list now leads with `journalctl -u docker` and the host crontab —
those are the cheapest checks to confirm or rule out next.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
5177fef2ef |
tools/dev-deploy: log the sandbox-cancellation TODO
Capture the diagnostic notes for the issue we hit after every `dev-deploy.yaml` redispatch: the freshly-bootstrapped "Dev Sandbox" game ends up `cancelled` ~15 minutes later, with the runtime reconciler reporting "container disappeared". The engine never shows up in `docker ps -a --filter label=galaxy-game-engine`, so either it never spawned or it was removed before any host-side snapshot. `KNOWN-ISSUES.md` records the symptom, the log excerpt, three working hypotheses (runtime spawn race, `--remove-orphans` interaction, engine `--rm` lifecycle), and the investigation checklist before opening an issue. The README gets a one-line pointer so future redeploys land on the doc immediately. No code change — this is the placeholder so the next person investigating the cancellation pattern does not have to rediscover the diagnostic from scratch. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
823be9d980 |
Phase 28: diplomatic mail UI
Merges feat/ui-stage-28 into development. |
||
|
|
2119f825d6 |
mail UI: dedupe broadcast fan-out and drop in-game admin compose
Two issues surfaced once the long-lived dev environment finally
reached the diplomail view:
1. `/sent` returns one row per recipient for broadcast and admin
fan-outs (so the admin tooling can render the materialised
audience). The list pane fed all rows into the stand-alone
bucket, so the `{#each entries as e (entryKey(e))}` key in
`thread-list.svelte` collapsed to the same `standalone:${id}`
for every recipient and Svelte 5 aborted the render with
`each_key_duplicate`. Dedupe stand-alones by `message_id` in
`buildEntries`.
2. The compose dialog exposed an `admin` kind toggle gated on
"owner of game". That was a Phase 28 plan decision, but admin
compose is an operator tool (server admin), not an in-game
action: game owners should not be able to broadcast
admin notifications. Drop the admin option, the audience
sub-toggles, and the admin path through `submit`. The
`MailStore.composeAdmin` wrapper and the backend RPC stay so
the future admin UI can call them.
Vitest covers the fan-out dedup with three rows sharing one
`message_id` collapsing to a single stand-alone entry.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
57e6c1d253 |
gateway: CORS allow-list for the authenticated Connect-Web surface
The public REST listener already exposes `GATEWAY_PUBLIC_HTTP_CORS_ALLOWED_ORIGINS`; the authenticated Connect-Web listener on the separate gRPC port had no equivalent. That worked in `tools/local-dev` (Vite proxy makes everything same-origin) and would work in production once UI and gateway share a single hostname, but the long-lived dev environment serves the UI from `https://www.galaxy.lan` and the gateway from `https://api.galaxy.lan`. Every `/galaxy.gateway.v1.EdgeGateway/*` fetch failed in the browser with the WebKit "Load failed" generic message because the response carried no `Access-Control-Allow-Origin` header. Lobby rendered as "[unknown] Load failed" with no game.
Mirror the public-REST CORS surface for the authenticated handler:
- new env `GATEWAY_AUTHENTICATED_GRPC_CORS_ALLOWED_ORIGINS`;
- new `AuthenticatedGRPCConfig.CORSAllowedOrigins` field;
- new `grpcapi.withCORS` middleware wrapping the Connect mux;
- dev-deploy stack sets the env to `https://www.galaxy.lan`.
The middleware speaks plain net/http (the Connect handler is mounted on a ServeMux, not gin), handles preflight 204 immediately, and exposes the Connect-Web header set the browser needs to read the response (`Grpc-Status`, `Grpc-Message`, `Connect-Protocol-Version`). Empty allow-list disables the middleware, so production stays at "single hostname" by default.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
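The behaviour the CORS commit above attributes to the middleware (allow-list match, immediate 204 on preflight, exposed Connect-Web headers, empty list disables it) can be sketched with plain `net/http`. This is not the gateway's `grpcapi.withCORS`: the function body, the allowed methods, and the choice to echo the requested headers are assumptions, and `probe` is a hypothetical helper for exercising the handler.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// withCORS is a sketch of an allow-list CORS middleware for a plain net/http
// handler. An empty allow-list returns next unchanged (middleware disabled).
func withCORS(allowedOrigins []string, next http.Handler) http.Handler {
	if len(allowedOrigins) == 0 {
		return next
	}
	allowed := make(map[string]bool, len(allowedOrigins))
	for _, o := range allowedOrigins {
		allowed[o] = true
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		origin := r.Header.Get("Origin")
		if origin == "" || !allowed[origin] {
			next.ServeHTTP(w, r) // no CORS headers are added for unknown origins
			return
		}
		h := w.Header()
		h.Set("Access-Control-Allow-Origin", origin)
		h.Add("Vary", "Origin")
		// Response headers the commit says the browser must be able to read.
		h.Set("Access-Control-Expose-Headers", "Grpc-Status, Grpc-Message, Connect-Protocol-Version")
		if r.Method == http.MethodOptions && r.Header.Get("Access-Control-Request-Method") != "" {
			// Preflight: answer immediately without reaching the wrapped mux.
			h.Set("Access-Control-Allow-Methods", "GET, POST, OPTIONS") // assumed method set
			// Assumption: echo the requested headers instead of hard-coding a list.
			h.Set("Access-Control-Allow-Headers", r.Header.Get("Access-Control-Request-Headers"))
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// probe sends one request through the middleware and returns the status code
// and the Access-Control-Allow-Origin header of the response.
func probe(allowed []string, method, origin string) (int, string) {
	mux := http.NewServeMux()
	mux.HandleFunc("/galaxy.gateway.v1.EdgeGateway/", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(method, "/galaxy.gateway.v1.EdgeGateway/ExecuteCommand", nil)
	if origin != "" {
		req.Header.Set("Origin", origin)
	}
	if method == http.MethodOptions {
		req.Header.Set("Access-Control-Request-Method", "POST")
	}
	rec := httptest.NewRecorder()
	withCORS(allowed, mux).ServeHTTP(rec, req)
	return rec.Code, rec.Header().Get("Access-Control-Allow-Origin")
}

func main() {
	allowed := []string{"https://www.galaxy.lan"}
	code, origin := probe(allowed, http.MethodOptions, "https://www.galaxy.lan")
	fmt.Println("preflight:", code, origin)
	code, origin = probe(nil, http.MethodPost, "https://www.galaxy.lan")
	fmt.Println("empty allow-list:", code, origin)
}
```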
|
|
4b2a949f12 |
dev-deploy Caddy: route Connect-Web traffic to gateway :9090
`api.galaxy.lan` was proxying every path to `galaxy-api:8080` (the public REST listener), so authenticated Connect-Web calls (`/galaxy.gateway.v1.EdgeGateway/ExecuteCommand`, `/galaxy.gateway.v1.EdgeGateway/SubscribeEvents`) collapsed to a 404 from the public route table — the lobby loaded the static bundle but every authenticated query failed silently. Split routing by path: `/galaxy.gateway.v1.EdgeGateway/*` goes to the authenticated listener on `:9090`, everything else stays on `:8080`. Mirrors the Vite dev-server proxy in `ui/frontend/vite.config.ts`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
81917acc3e |
dev-deploy: enable Dev Sandbox bootstrap and synthetic-report loader
Two long-standing dev-environment ergonomics had not survived the move from the bespoke local-dev stack to the CI-driven dev-deploy:
1. `BACKEND_DEV_SANDBOX_EMAIL` defaulted to an empty string in the dev-deploy compose, so the auto-provisioned "Dev Sandbox" game never appeared on `https://www.galaxy.lan`. Bake `dev@galaxy.lan` as the default: it matches `.env.example` and lets a developer who logs in with that email find a ready-to-play game in the lobby.
2. The lobby's synthetic-report loader was gated on `import.meta.env.DEV`, which is true only for `vite dev` (the tools/local-dev path). The long-lived dev environment builds with `vite build` (production mode), so the section was always stripped from its bundle. Gate it on an explicit `VITE_GALAXY_DEV_AFFORDANCES` flag instead and set it both in `.env.development` (preserves `pnpm dev` behaviour) and in the `dev-deploy.yaml` build step. The `prod-build.yaml` build path leaves the flag unset, so production stays clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
859b157a59 |
auth dev-fixed-code bypasses attempts cap; dev-deploy gains manual dispatch
Two problems showed up while trying to log into the long-lived dev
environment with the dev-fixed code `123456`:
1. `ConfirmEmailCode` checked the per-challenge attempts ceiling
*before* the dev-fixed-code override. A developer who burned past
`ChallengeMaxAttempts` on an existing un-consumed challenge (easy
to trigger when the throttle reuses one challenge_id) hit
`ErrTooManyAttempts` and the UI rendered "code expired or already
used" even though the fixed code was correct. Reorder so the
dev-fixed-code branch runs first and bypasses both the bcrypt
verify and the attempts gate. Production stays unaffected
because production loaders refuse to set `DevFixedCode`.
2. `dev-deploy.yaml` only fires on push to `development`, so the
matching docker-compose default change for
`BACKEND_AUTH_DEV_FIXED_CODE` could not reach the running stack
before this PR merged. Add `workflow_dispatch: {}` so a developer
can deploy any branch — typically a feature branch under review —
from the Gitea Actions UI without waiting for the merge.
Covered by a new `TestConfirmEmailCodeDevFixedCodeBypassesAttemptsCeiling`
integration test that burns through the ceiling with wrong codes
then proves the dev-fixed code still produces a session.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
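The reordering described in the dev-fixed-code commit above is easiest to see in a reduced form. The sketch below is not the project's `ConfirmEmailCode`: `confirmEmailCode`, `challenge`, and `config` are hypothetical, and a plain string comparison stands in for the bcrypt verification so the example stays dependency-free.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrTooManyAttempts = errors.New("too many attempts")
	ErrInvalidCode     = errors.New("invalid code")
)

// challenge and config are hypothetical, reduced shapes for this sketch.
type challenge struct {
	Attempts int
	Code     string // stands in for the stored bcrypt hash
}

type config struct {
	DevFixedCode         string // empty outside dev environments
	ChallengeMaxAttempts int
}

// confirmEmailCode checks the dev-fixed code first, so it bypasses both the
// attempts ceiling and the (stand-in) hash verification.
func confirmEmailCode(cfg config, ch *challenge, code string) error {
	if cfg.DevFixedCode != "" && code == cfg.DevFixedCode {
		return nil
	}
	if ch.Attempts >= cfg.ChallengeMaxAttempts {
		return ErrTooManyAttempts
	}
	ch.Attempts++
	if code != ch.Code { // the real service verifies a bcrypt hash here
		return ErrInvalidCode
	}
	return nil
}

// burnAttempts submits n wrong codes and returns the last error seen.
func burnAttempts(cfg config, ch *challenge, n int) error {
	var err error
	for i := 0; i < n; i++ {
		err = confirmEmailCode(cfg, ch, "000000")
	}
	return err
}

func main() {
	cfg := config{DevFixedCode: "123456", ChallengeMaxAttempts: 3}
	ch := &challenge{Code: "654321"}
	fmt.Println("after burning attempts:", burnAttempts(cfg, ch, 5))
	fmt.Println("dev-fixed code:", confirmEmailCode(cfg, ch, "123456"))
}
```

With the check order reversed (attempts gate first), the final call would return `ErrTooManyAttempts`, which is the symptom the commit reports.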
|
|
166baf4be0 |
battle-viewer e2e: mock user.games.battle ConnectRPC command
Phase 28 moved the battle fetch off the REST passthrough onto the signed envelope, so the Playwright spec's `page.route(...)` against the old REST path no longer intercepts anything and the viewer times out waiting for data. Update the spec to:
- Build a FlatBuffers `BattleReport` payload in `fixtures/battle-fbs.ts` (mirrors `report-fbs.ts`'s pattern).
- Add a `user.games.battle` case to the ExecuteCommand mock that decodes the FBS `GameBattleRequest`, returns the encoded report when the battle_id matches the seeded one, and surfaces a canonical `not_found` resultCode otherwise.
- Drop the obsolete REST route stubs.
- Drive the negative-path test with a real UUID that does not match the seeded one, so the gateway-side switch is the source of the 404 (the old `missing-uuid` literal was no longer a valid wire shape for the UUID decoder).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
ebd156ece2 |
battle-fetch: migrate to user.games.battle ConnectRPC command
Tests · UI / test (push) Has been cancelled
Tests · Go / test (pull_request) Successful in 2m6s
Tests · Go / test (push) Successful in 2m7s
Tests · Integration / integration (pull_request) Successful in 1m47s
Tests · UI / test (pull_request) Failing after 3m42s
The Phase 27 BattleViewer was the last UI surface still issuing raw
fetch() against the backend REST contract (`/api/v1/user/games/...
/battles/...`). The dev-deploy gateway never proxied that path, so
the viewer worked only in tools/local-dev/. Move it onto the signed
ConnectRPC channel every other authenticated surface already uses.
Wire pieces:
- FBS GameBattleRequest in pkg/schema/fbs/battle.fbs, regenerated
Go + TS bindings.
- MessageTypeUserGamesBattle constant + GameBattleRequest struct in
pkg/model/report/messages.go.
- pkg/transcoder/battle.go gains GameBattleRequestToPayload and
PayloadToGameBattleRequest helpers.
- gateway games_commands.go switches on the new message type and
GETs /api/v1/user/games/{id}/battles/{turn}/{battle_id}; the JSON
response is re-encoded as a FlatBuffers BattleReport before being
returned. 404 from backend surfaces as the canonical `not_found`
gateway error.
- ui/frontend/src/api/battle-fetch.ts now builds the FBS request,
calls GalaxyClient.executeCommand, and decodes the FBS response
into the existing UI shape (Record<string,string> race/ship maps,
string-form UUID). BattleFetchError carries an HTTP-style status
derived from the result code so the active-view's not_found branch
keeps working.
- battle.svelte pulls the GalaxyClient from the in-game shell
context. While the layout's boot Promise.all is in flight the
effect stays in `loading` until the client handle becomes
non-null.
- ui/Makefile FBS_INPUTS gains battle.fbs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
8bc75fd71b |
dev-deploy: default BACKEND_AUTH_DEV_FIXED_CODE to 123456
The long-lived dev environment now opts into the bcrypt-bypass on a fresh `up`/`rebuild` so a returning developer can sign in with `123456` even after the matching browser session was cleared (the real emailed code is single-use). Set the variable to an empty string in `.env` to force real Mailpit codes (mail-flow QA). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
1556d36511 |
Phase 28: mark stage done after CI gate green
Gitea runs at commit
|
||
|
|
6d0272b078 |
Phase 28 (Step 11): Vitest coverage for MailStore threading
`tests/mail-store.test.ts` exercises the `entries` derived rune with handcrafted inbox + sent fixtures:
- personal messages exchanged with one race collapse into a per-race thread with messages sorted oldest → newest;
- system mail (`sender_kind=system`) and admin notifications (`sender_kind=admin`) surface as stand-alone items even when a race-name snapshot is present;
- the caller's own paid-tier broadcasts (`broadcast_scope=game_broadcast`) render as stand-alone outgoing items;
- `unreadCount` counts inbox rows with `readAt === null`.
The store fields are mutated directly to avoid wiring a fake `GalaxyClient`; the underlying `$derived` rune fires whenever those fields change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
c48bc83890 |
Phase 28 (Step 10): docs — diplomail UI topic + FUNCTIONAL mirror
- `ui/docs/diplomail-ui.md`: new topic doc covering the wire surface, recipient-by-race-name decision, threading model, translation toggle, push events, badge, layout, and accessibility.
- `docs/FUNCTIONAL.md` §11.4 grows a paragraph that records the UI's per-race threading rule, the absent read-receipt UX, and the recipient-by-race-name compose path. Mirrored verbatim into `docs/FUNCTIONAL_ru.md`.
- `ui/PLAN.md` Phase 28 marked done with a "Decisions during stage" block matching the implementation plan, and the artifact list updated to the actual file set.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
db81bd8e08 |
Phase 28 (Steps 7+8): header unread badge + push/init wiring
Step 7 — header view-menu badge.
`view-menu.svelte` reads `mailStore.unreadCount` and renders an
inline pill next to the "diplomatic mail" entry whenever the
counter is non-zero. The badge styling matches the per-row dot in
`thread-list.svelte` so the two surfaces feel consistent.
Step 8 — push event handler + MailStore init in the in-game layout.
`routes/games/[id]/+layout.svelte`:
- registers a `diplomail.message.received` handler alongside the
existing `game.turn.ready` / `game.paused` ones, parses the
signed payload, calls `mailStore.applyPushEvent` to refresh the
inbox for the matching game, and raises a toast with a "view"
deep-link that navigates to `/games/:id/mail`;
- adds `mailStore.init({ client, cache, gameId })` to the boot
`Promise.all` so the inbox + sent lists are warm by the time the
view mounts, and the badge counter is populated before any user
interaction;
- disposes the new subscription in the `onDestroy` block so a game
switch does not leak handlers across navigations.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f7300f25a3 |
Phase 28 (Steps 6+9): mail active view + i18n keys
Step 6 — mail active view + subcomponents.
- `lib/active-view/mail.svelte` replaces the Phase 10 stub with the list / detail layout: two-pane on desktop, one-pane stack on mobile (CSS media query, no separate route).
- `lib/active-view/mail/thread-list.svelte` renders per-race threads collapsed to their last message plus stand-alone system / admin / outgoing-broadcast items, with unread badges.
- `lib/active-view/mail/thread-pane.svelte` is the chat-style transcript for one race; bodies render through `textContent`, per-message Show original / translation toggles flip the rendering when a translated body is present, and a persistent reply box at the bottom calls `mailStore.composePersonal`.
- `lib/active-view/mail/system-item-pane.svelte` renders one stand-alone item read-only with the same translation toggle.
- `lib/active-view/mail/compose.svelte` is the compose dialog: recipient race picker fed from `report.races[]`, kind toggle (personal / broadcast / admin), admin sub-toggle for target user / all and recipient-scope picker. Server-side enforces paid-tier and owner gating; the UI surfaces 403 inline.
- `lib/active-view/mail/system-titles.ts` keeps the keyword → i18n-title mapping for lifecycle-hook system mail so both the list and the detail pane pick the same canonical title.
Step 9 — i18n strings (en + ru).
`game.mail.*`, `game.view.mail.badge`, `game.events.mail_new.*`, `game.mail.system.*` keys added in lockstep across both locales covering compose labels / validation copy / per-system titles / translation toggle / reply / delete affordances.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
fdd5fd193d |
Phase 28 (Step 5): MailStore reactive state
Adds `src/lib/mail-store.svelte.ts` — the reactive store that
coordinates the in-game mail view. Responsibilities:
- holds the inbox and sent listings for the current game and fires
the initial parallel fetch (`fetchInbox` + `fetchSent`) on
`setGame`;
- exposes an `entries` derived rune that builds the unified list
pane: per-race threads merged from incoming + outgoing personal
messages, plus stand-alone items for system / admin / own
paid-tier broadcasts. Thread messages are sorted oldest → newest
for chat-style rendering; the list itself sorts newest-first by
the most-recent entry timestamp;
- derives `unreadCount` from `readAt === null` rows for the header
view-menu badge;
- imperative `markRead` / `softDelete` actions with optimistic
state flips and roll-back on RPC failure;
- compose actions for personal / paid-tier broadcast / owner-admin
sends;
- `applyPushEvent(gameId)` hook called by the layout when a
`diplomail.message.received` push frame arrives; refetches the
inbox without trusting the preview payload;
- persists the most recent message id under
`cache.diplomail/${gameId}/last-seen` so a returning session can
pre-paint the badge without a network round-trip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
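The "optimistic state flips and roll-back on RPC failure" behaviour of `markRead` can be illustrated as follows. This is a sketch under stated assumptions rather than the store's implementation: `InboxRow`, `markReadOptimistic`, and the injected `rpc` and `now` parameters are hypothetical names chosen for the example.

```typescript
// Hypothetical minimal row shape, for illustration only.
interface InboxRow {
  messageId: string;
  readAt: number | null;
}

// Flips the row to "read" before the RPC resolves, and restores the
// unread state if the RPC rejects. Returns whether the flip stuck.
async function markReadOptimistic(
  rows: InboxRow[],
  messageId: string,
  rpc: (messageId: string) => Promise<void>, // stand-in for the real mark-read call
  now: () => number = Date.now,
): Promise<boolean> {
  const row = rows.find((r) => r.messageId === messageId);
  if (!row || row.readAt !== null) return true; // unknown or already read: nothing to do
  row.readAt = now(); // optimistic flip: the unread badge drops immediately
  try {
    await rpc(messageId);
    return true;
  } catch {
    row.readAt = null; // roll back so the UI matches server state again
    return false;
  }
}
```

Injecting the RPC and the clock keeps the function testable without a fake client; a `softDelete` variant would follow the same flip, await, roll-back shape.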
|
||
|
|
7378d4c8ed |
Phase 28 (Step 4): UI api/diplomail.ts wrappers
Adds typed wrappers around `GalaxyClient.executeCommand` for the eight Phase 28 mail RPCs. Each wrapper builds the matching FlatBuffers request, decodes the response, and surfaces backend errors through a dedicated `MailError` (mirroring `LobbyError`). The compose helpers accept the recipient race name directly so the UI can feed it straight from `report.races[].name` without a membership lookup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
4cb03736de |
Phase 28 (Step 3): gateway translators for user.games.mail.*
Adds the gateway-side translation layer that maps the eight new
ConnectRPC mail commands onto backend's
`/api/v1/user/games/{game_id}/mail/*` REST endpoints.
- `gateway/internal/backendclient/mail_commands.go` defines
`ExecuteMailCommand` and one helper per command (inbox, sent,
message.get, send, broadcast, admin, read, delete). Each helper
decodes the FlatBuffers request envelope, issues the REST call
via the existing `*RESTClient.do`, decodes the JSON body, and
re-encodes a typed FlatBuffers response. Recipient identifiers
travel through unchanged so the new `recipient_race_name`
shortcut introduced in Step 1 reaches backend untouched.
- `routes.go` exposes a `MailRoutes` constructor and a matching
`mailCommandClient` implementing `downstream.Client`.
- `cmd/gateway/main.go` registers the new routes alongside the
existing user / lobby / game-engine routes.
- `mail_commands_test.go` covers the inbox, send-by-race-name, and
read-state paths end-to-end against an `httptest.Server`,
asserting request shapes (path, body, X-User-ID) and the
decoded FlatBuffers response.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
57d2286f5e |
Phase 28 (Step 3a): /sent returns full message detail per recipient
Phase 28's in-game mail UI threads sent messages by the recipient race name, so the bulk `/sent` endpoint now returns the same `UserMailMessageDetail` shape as `/inbox` — single sends contribute one row per message, broadcasts contribute one row per addressee and the UI collapses them by `message_id` into a stand-alone item.
- `Store.ListSent` / `Service.ListSent` switched from `[]Message` to `[]InboxEntry`. SQL grows an INNER JOIN with `diplomail_recipients`.
- Handler emits `userMailMessageDetailWire` items; the deprecated `userMailSentSummaryWire` is removed.
- `openapi.yaml`: `UserMailSentList.items` now reference `UserMailMessageDetail`; the standalone `UserMailSentSummary` schema is dropped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
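The client-side collapse of per-addressee `/sent` rows by `message_id` might look like the sketch below. The `SentRow` and `SentItem` shapes and the `collapseByMessageId` name are assumptions made for the example; the real `UserMailMessageDetail` carries more fields.

```typescript
// Hypothetical, trimmed-down view of one /sent row (one per addressee).
interface SentRow {
  messageId: string;
  recipientRaceName: string;
  body: string;
}

// One logical outgoing message with all of its addressees.
interface SentItem {
  messageId: string;
  body: string;
  recipients: string[];
}

function collapseByMessageId(rows: SentRow[]): SentItem[] {
  const byId = new Map<string, SentItem>();
  for (const row of rows) {
    const existing = byId.get(row.messageId);
    if (existing) {
      // Same broadcast, another addressee.
      existing.recipients.push(row.recipientRaceName);
    } else {
      byId.set(row.messageId, {
        messageId: row.messageId,
        body: row.body,
        recipients: [row.recipientRaceName],
      });
    }
  }
  // Map iteration preserves first-seen order of message ids.
  return Array.from(byId.values());
}
```

A single send yields an item with one recipient, so the same function covers both cases.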
||
|
|
fed282f2d2 |
Phase 28 (Step 2): FBS schemas + message-type constants for mail
Adds the wire schema for the eight `user.games.mail.*` ConnectRPC commands together with the shared payload types (`MailMessage`, `MailRecipientState`, `MailBroadcastReceipt`). Send-request tables carry the optional `recipient_race_name` introduced in Step 1.
Drops:
- `pkg/schema/fbs/diplomail.fbs` — schema sources;
- `pkg/schema/fbs/diplomail/*.go` — generated Go bindings (flatc `--go --go-module-name galaxy/schema/fbs`);
- `pkg/model/diplomail/diplomail.go` — message-type catalog used by the gateway router;
- `ui/frontend/src/proto/galaxy/fbs/diplomail/*.ts` — generated TS bindings consumed by the upcoming UI client wrapper;
- `ui/Makefile` `FBS_INPUTS` extended to pick the new schema up on the next `make -C ui fbs-ts` run.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
7b43ce5844 |
Phase 28 (Step 1): backend support for race-name mail send
Phase 28's in-game mail UI groups personal threads by the other party's race. To support that without an extra membership-listing RPC, the diplomail subsystem now:
- accepts `recipient_race_name` on `POST /messages` and `POST /admin` (target=user) as an alternative to `recipient_user_id`; the service resolves it via the existing `Memberships.ListMembers(gameID, "active")` and rejects with `forbidden` when the matching member is no longer active;
- snapshots `diplomail_messages.sender_race_name` at send time for every player sender (admin / system rows stay NULL). The UI keys per-race threading on this column.
Schema, openapi, README, and a focused e2e test for the new path (happy path + dual / missing / unknown / kicked errors) land in this commit; the gateway + UI legs follow in subsequent commits on this branch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
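The recipient resolution and its error cases (dual / missing / unknown / kicked) can be sketched as below. The backend is Go; this TypeScript version only illustrates the decision order. Only the `forbidden` code comes from the commit message: `invalid_request` and `not_found`, the `Member` shape, and the `resolveRecipient` name are hypothetical.

```typescript
// Hypothetical roster entry; the real membership model may differ.
interface Member {
  userId: string;
  raceName: string;
  status: "active" | "removed";
}

type Resolution =
  | { ok: true; userId: string }
  | { ok: false; code: "invalid_request" | "not_found" | "forbidden" };

function resolveRecipient(
  req: { recipientUserId?: string; recipientRaceName?: string },
  members: Member[],
): Resolution {
  const hasId = req.recipientUserId !== undefined;
  const hasRace = req.recipientRaceName !== undefined;
  // "dual" and "missing": exactly one identifier must be supplied.
  if (hasId === hasRace) return { ok: false, code: "invalid_request" };
  if (req.recipientUserId !== undefined) {
    // The user-id path is left unchanged by this commit; pass it through.
    return { ok: true, userId: req.recipientUserId };
  }
  const member = members.find((m) => m.raceName === req.recipientRaceName);
  if (!member) return { ok: false, code: "not_found" }; // "unknown" (assumed code)
  if (member.status !== "active") return { ok: false, code: "forbidden" }; // "kicked"
  return { ok: true, userId: member.userId };
}
```

Resolving on the server keeps the UI free of a membership lookup: it can send `report.races[].name` as-is.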
||
|
|
74c1e7ab24 | Merge pull request 'diplomail (Stage A→D): backend in-game diplomatic mail' (#10) from feature/diplomail-backend into development | ||
|
|
2d36b54b8d |
diplomail (Stage F): docs + edge-case tests + LibreTranslate recipe
Closes the documentation gaps from the freshly-audited diplomail implementation. FUNCTIONAL.md gains a §11 "Diplomatic mail" with the full user-facing story across all five stages, mirrored into FUNCTIONAL_ru.md as the project conventions require. A new backend/docs/diplomail-translator-setup.md captures the LibreTranslate operational recipe (Docker image, env wiring, manual smoke test, troubleshooting). The package README gains a "Multi-instance posture" note documenting the deliberate absence of FOR UPDATE in the worker pickup query — single-instance is safe today; multi-instance scaling will revisit the claim mechanism. Two small edge-case tests round things out: malformed LibreTranslate response bodies (single string, short array, empty array, missing field) must surface as errors so the worker falls back instead of crashing; and an empty translation queue must produce zero events on three consecutive Worker.Tick calls. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9f7c9099bc |
diplomail (Stage E): LibreTranslate client + async translation worker
Synchronous translation on read (Stage D) blocks the HTTP handler on translator I/O. Stage E switches to "send moments-fast, deliver when translated": recipients whose preferred_language differs from the detected body_lang are inserted with available_at=NULL, and an async worker turns them on once a LibreTranslate call materialises the cache row (or fails terminally after 5 retries). Schema delta on diplomail_recipients: available_at, translation_attempts, next_translation_attempt_at, plus a snapshot recipient_preferred_language so the worker queries do not need a join. Read paths (ListInbox, GetMessage, UnreadCount) filter on available_at IS NOT NULL. Push fan-out is moved from Service to the worker so the recipient only sees the toast when the inbox row is actually visible. Translator backend is now a configurable choice: empty BACKEND_DIPLOMAIL_TRANSLATOR_URL → noop (deliver original); populated → LibreTranslate HTTP client. Per-attempt timeout, max attempts, and worker interval all live in DiplomailConfig. The HTTP client itself is unit-tested via httptest (happy path, BCP47 normalisation, unsupported pair, 5xx, identical src/dst, missing URL); worker delivery + fallback paths are covered by the testcontainers-backed e2e tests in diplomail_e2e_test.go. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
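The per-recipient state transition the worker applies after each translation attempt can be sketched as a pure function. The real worker is Go and SQL-backed; this is only an illustration. The fixed `RETRY_DELAY_MS`, the `applyAttempt` name, reading "5 retries" as five attempts in total, and delivering the original body on terminal failure are all assumptions.

```typescript
// Mirrors the three columns named in the commit; camelCase is illustrative.
interface RecipientRow {
  availableAt: number | null; // null = not yet visible in the inbox
  translationAttempts: number;
  nextTranslationAttemptAt: number | null;
}

const MAX_ATTEMPTS = 5; // assumption: "5 retries" read as five attempts in total
const RETRY_DELAY_MS = 30_000; // assumption: the commit does not state the delay

function applyAttempt(row: RecipientRow, translated: boolean, now: number): RecipientRow {
  if (row.availableAt !== null) return row; // already delivered, nothing to do
  const attempts = row.translationAttempts + 1;
  if (translated || attempts >= MAX_ATTEMPTS) {
    // Success, or terminal failure: make the row visible either way.
    return { availableAt: now, translationAttempts: attempts, nextTranslationAttemptAt: null };
  }
  // Transient failure: stay hidden and schedule another attempt.
  return {
    availableAt: null,
    translationAttempts: attempts,
    nextTranslationAttemptAt: now + RETRY_DELAY_MS,
  };
}
```

Because read paths filter on `available_at IS NOT NULL`, flipping that one field is what makes the message appear, which is also the natural point to emit the push event.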
||
|
|
e22f4b7800 |
diplomail (Stage D): language detection + lazy translation cache
Replaces the LangUndetermined placeholder with whatlanggo-backed body detection on every send path, then adds a translation cache keyed on (message_id, target_lang) populated lazily on the per-message read endpoint. The noop translator that ships with Stage D returns engine="noop", which the service treats as "translation unavailable" — wiring a real backend (LibreTranslate HTTP client is the documented next step) is a one-file swap. GetMessage and ListInbox now accept a targetLang argument; the HTTP layer resolves the caller's accounts.preferred_language and forwards it. Inbox uses the cache only (never calls the translator) so bulk reads stay fast under future SaaS backends. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
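The lazy cache keyed on `(message_id, target_lang)` could be modelled as follows. The service itself is Go; this TypeScript sketch only illustrates the two read paths. `TranslationCache`, `peek`, `getOrTranslate`, and the synchronous `Translator` signature are hypothetical; only the `engine="noop"` convention is taken from the commit message.

```typescript
interface Translation {
  text: string;
  engine: string; // "noop" means "translation unavailable"
}

type Translator = (body: string, targetLang: string) => Translation;

class TranslationCache {
  private readonly rows = new Map<string, string>();
  private readonly translate: Translator;

  constructor(translate: Translator) {
    this.translate = translate;
  }

  private key(messageId: string, targetLang: string): string {
    return JSON.stringify([messageId, targetLang]);
  }

  // Inbox path: cache only, never calls the translator.
  peek(messageId: string, targetLang: string): string | undefined {
    return this.rows.get(this.key(messageId, targetLang));
  }

  // Per-message read path: fills the cache on first use.
  getOrTranslate(messageId: string, body: string, targetLang: string): string | undefined {
    const cached = this.peek(messageId, targetLang);
    if (cached !== undefined) return cached;
    const result = this.translate(body, targetLang);
    if (result.engine === "noop") return undefined; // unavailable: cache nothing
    this.rows.set(this.key(messageId, targetLang), result.text);
    return result.text;
  }
}
```

Splitting `peek` from `getOrTranslate` reflects the design choice in the commit: bulk reads stay fast because only the single-message endpoint can trigger translator I/O.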
||
|
|
362f92e520 |
diplomail (Stage C): paid-tier broadcast + multi-game + cleanup
Closes out the producer-side of the diplomail surface. Paid-tier players can fan out one personal message to the rest of the active roster (gated on entitlement_snapshots.is_paid). Site admins gain a multi-game broadcast (POST /admin/mail/broadcast with `selected` / `all_running` scopes) and the bulk-purge endpoint that wipes diplomail rows tied to games finished more than N years ago. An admin listing (GET /admin/mail/messages) rounds out the observability surface. EntitlementReader and GameLookup are new narrow deps wired from `*user.Service` and `*lobby.Service` in cmd/backend/main; the lobby service grows a one-off `ListFinishedGamesBefore` helper for the cleanup path (the cache evicts terminal-state games so the cache walk is not enough). Stage D will swap LangUndetermined for an actual body-language detector and add the translation cache. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
b3f24cc440 |
diplomail (Stage B): admin/owner sends + lifecycle hooks
Item 7 of the spec wants game-state and membership-state changes to land as durable inbox entries the affected players can re-read after the fact — push alone times out of the 5-minute ring buffer. Stage B adds the admin-kind send matrix (owner-driven via /user, site-admin driven via /admin) plus the lobby lifecycle hooks: paused / cancelled emit a broadcast system mail to active members, kick / ban emit a single-recipient system mail to the affected user (which they keep read access to even after the membership row is revoked, per item 8). Migration relaxes diplomail_messages_kind_sender_chk so an owner sending kind=admin keeps sender_kind=player; the new LifecyclePublisher dep on lobby.Service is wired through a thin adapter in cmd/backend/main, mirroring how lobby's notification publisher is plumbed today. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
535e27008f |
diplomail (Stage A): add in-game personal mail subsystem
Phase 28 of ui/PLAN.md needs a persistent player-to-player mail channel; the existing `mail` package is a transactional email outbox and the `notification` catalog is one-way platform events. Stage A lands the schema (diplomail_messages / _recipients / _translations), a single-recipient personal send/read/delete service path, a `diplomail.message.received` push kind plumbed through the notification pipeline, and an unread-counts endpoint that drives the lobby badge. Admin / system mail, lifecycle hooks, paid-tier broadcast, multi-game broadcast, bulk purge and language detection / translation cache come in stages B–D. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
77cb7c78b6 |
Merge pull request #9: ui-test singleton queue
Replaces per-sha cancel-in-progress (which fired spurious self-cancels) with a singleton queueing group. ui-test #74 (push) and #75 (pull_request) both green at ~2m, queue-not-cancel verified. |
||
|
|
1a0e3e992f |
ci/ui-test: queue runs in one bucket instead of cancelling
`cancel-in-progress: true` killed run #73 even though it was the only ui-test in its concurrency group — Gitea appears to cancel the in-progress job on its own under that setting in some edge cases. Switch to a singleton group with `cancel-in-progress: false`. The new behaviour is simple queueing: only one ui-test workflow runs at a time across the repository, the rest wait. Vite-on-:5173 cannot collide because there is never a second ui-test alive. The wall-time hit is bounded — ui-test is ~2 minutes — and bursts are rare enough that queueing is cheap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
faf598b2cd |
Merge pull request #8: Playwright tuning + concurrency for ui-test
Caps Playwright at 4 workers + 4 retries to absorb the host-mode flake budget, and serialises ui-test runs by head sha so push and pull_request events for the same commit cannot collide on Vite :5173. |
||
|
|
6e6186a571 |
ci/ui-test: key concurrency by head sha, not gitea.ref
`gitea.ref` differs between push (`refs/heads/<branch>`) and pull_request (`refs/pull/N/head`) events even for the same commit, so the two parallel runs land in different concurrency groups and the Vite-on-:5173 collision is not suppressed. Switching the key to the head sha (`gitea.event.pull_request.head.sha || gitea.sha`) collapses both events into one bucket, leaving exactly one ui-test alive per commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
e3bb30201d |
ci/ui-test: serialise per-ref + clear stale Vite before Playwright
Two ui-test jobs cannot coexist on the same host: Playwright's `webServer` spec spawns `pnpm dev` on :5173, and on a host-mode runner the port lives in the host namespace shared by every job. ui-test #67 hit "Error: http://localhost:5173 is already used" because a parallel job's Vite still held the port.
Two changes:
1. `concurrency: ui-test-${{ gitea.ref }}` with `cancel-in-progress: true`. New push/PR runs against the same ref kill any earlier ui-test before starting, so we never have two `pnpm dev`s alive at once.
2. `pkill -f 'vite dev' || true` plus `fuser -k 5173/tcp` right before Playwright. Defence in depth in case the concurrency cancellation does not reap the spawned shell promptly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |