Commit Graph

418 Commits

Author SHA1 Message Date
developer 44ed0a90eb Merge pull request 'docs(ui): finalize MVP plan structure + de-archaeologize topic docs' (#25) from feature/ui-finalization-plan into development 2026-05-21 21:25:06 +00:00
Ilia Denisov a89048f6c5 docs(ui): finalize MVP plan structure and de-archaeologize topic docs
MVP web client (Phases 1-30) is complete; reorganize planning + living docs around that.

- PLAN.md kept as the staged MVP record (1-30) with a status block + pointers; removed the 31-36 stages, regression scenarios, and deferred-TODO section (moved out); fixed a stale cross-machine plan path.

- ui/PLAN-finalize.md (new): active web-finalization plan in 8 stages (visual system, a11y, i18n, error UX, PWA, build hygiene, docs, owner manual-QA loop); absorbs former Phases 33 and 35.

- ui/ROADMAP.md (new): post-MVP (Wails, Capacitor, realistic projection, acceptance + regression scenarios) and triaged deferred follow-ups.

- ui/docs/README.md (new): grouped topic-doc index.

- De-archaeologized all 20 ui/docs topic docs + ui/README.md + ui/core/README.md: stripped Phase-N build history, rewritten as current-state; deferred work now points at ROADMAP.md / PLAN-finalize.md. Docs-only; no code change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 23:17:51 +02:00
developer 51865b8cf4 Merge pull request 'Phase 30 — Ship Class Calculator (goal-seek, reach circles, planet build)' (#24) from feature/ui-calculator into development
Deploy · Dev / deploy (push) Successful in 38s
Tests · Go / test (push) Successful in 2m6s
Tests · Integration / integration (push) Successful in 1m43s
Tests · UI / test (push) Successful in 2m1s
2026-05-21 19:47:16 +00:00
Ilia Denisov 4d3cfd11a3 docs(ui): mark Phase 30 (ship-class calculator) done in PLAN.md
Tests · Integration / integration (pull_request) Successful in 1m47s
Tests · Go / test (pull_request) Successful in 2m10s
Tests · UI / test (pull_request) Successful in 1m59s
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 21:40:45 +02:00
Ilia Denisov b1b87c8521 feat(ui-calculator): input validation, load caps, ceil display, modernization layout
Tests · Go / test (push) Successful in 2m26s
Tests · UI / test (push) Successful in 2m26s
- custom load capped at cargo capacity (error when exceeded); full load shows the cargo capacity; zero cargo pins load to empty and disables the toggle

- per-input red border + tooltip for every invalid value (blocks, techs, load, MAT, modernization target); no value may be negative; locking a speed is disabled when drive is zero

- display every computed number (results + goal-seek back-solved input) rounded up to 3 decimals via a shared pkg/calc Ceil3 bridged to wasm; engine keeps its own round-to-nearest util.Fixed*

- modernization total upgrade cost spans two columns (single line)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 21:24:40 +02:00
Ilia Denisov 3ea29cf8b5 fix(ui-calculator): keep calculator state long-lived; don't eject on planet click
Tests · UI / test (push) Successful in 1m59s
Move the calculator's inputs into a page-level calculatorState singleton so they survive the sidebar unmounting the tab on a tab switch (the inspector auto-opens on a planet click). ensureGame resets the design when the active game changes.

While on the calculator, a planet click no longer switches to the inspector — the calculator consumes the selection in its planet area / reach circles. Halve the reach-circle stroke width.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 20:29:08 +02:00
Ilia Denisov 9ae7b88b89 feat(ui): Phase 30 ship-class calculator with goal-seek and reach circles
Tests · UI / test (push) Successful in 2m14s
Tests · Go / test (push) Successful in 2m25s
Fuse the standalone ship-class designer (Phases 17/18) into a sidebar calculator: live mass/speed/attack/defence/bombing results, a planet build-rate readout, single-target goal-seek, a modernization-cost mode, and auto reach circles on the map for the selected planet.

pkg/calc becomes the single source for the new math (no mirroring): extract BombingPower from the engine model and the per-turn ship-production loop from controller.ProduceShip into pkg/calc (engine now delegates), and add inverse goal-seek solvers in pkg/calc/solve.go. Thin-bridge the combat, planet-build, and solver functions through ui/core/calc + ui/wasm and rebuild core.wasm.

Remove the standalone designer view/route; the ship-classes table and the view/bottom menus open the calculator via a shared request store.

Docs: rewrite ui/PLAN.md Phase 30, adjust Phase 34 (realistic forecast + CAP/COL ownership), add ui/docs/calculator-ux.md, extend calc-bridge.md, fix navigation.md; remove ui/CALCULATOR.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 20:04:07 +02:00
developer 00159ddf7c Merge pull request 'ci: per-job pnpm install dir to fix the host-runner setup-pnpm race' (#23) from feature/ci-pnpm-setup-isolation into development
Deploy · Dev / deploy (push) Successful in 40s
Tests · UI / test (push) Successful in 2m4s
2026-05-20 15:31:12 +00:00
Ilia Denisov b24d53b82f ci: install pnpm into a per-job dir to fix the host-runner setup race
Tests · UI / test (push) Waiting to run
Tests · UI / test (pull_request) Successful in 1m56s
`pnpm/action-setup@v4` defaults to installing pnpm in the shared
`~/setup-pnpm`. On the single host-mode runner $HOME is shared across
concurrent jobs, so when two pnpm jobs overlap (e.g. a post-merge
`dev-deploy` and `ui-test`, which sit in different concurrency groups)
their self-installers race and one fails with
`ENOTEMPTY ... rmdir '~/setup-pnpm/node_modules/.bin/store/v11/files'`
before the tests even run.

Point each step's `dest` at `${{ runner.temp }}/setup-pnpm` (a per-job
isolated directory) so concurrent jobs never share the install location.
The action still adds `dest` to PATH, so setup-node's pnpm cache and
later `pnpm` calls are unaffected; the pnpm package store stays shared
(safe — pnpm locks it). Applied to the three workflows that set up pnpm:
ui-test, dev-deploy, prod-build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 17:26:49 +02:00
developer 6572f5b59d Merge pull request 'fix(ui-map): inverse stencil mask for the visibility fog (Safari pan perf, stage 2)' (#22) from feature/ui-map-fog-inverse-mask into development
Deploy · Dev / deploy (push) Successful in 31s
Tests · UI / test (push) Successful in 1m58s
2026-05-20 15:02:03 +00:00
Ilia Denisov a08f4f55b0 fix(ui-map): cut the visibility fog with an inverse stencil mask (Safari pan perf, stage 2)
Tests · UI / test (push) Successful in 1m57s
Tests · UI / test (pull_request) Successful in 1m56s
Stage 1 (render-on-demand) removed the idle / whole-system freeze, but
panning a loaded map with "visible hyperspace" on stayed heavy in Safari:
the fog still cut its visibility holes by opaque overpaint — on KNNTS041
that is ~260 near-world-sized opaque circles blended over the fog every
rendered frame, a fill-rate cliff for Safari's WebGPU / Apple's tile-based
GPU.

Replace the overpaint with an INVERSE stencil mask: setVisibilityFog now
draws the FOG_COLOR rectangle(s) into fogLayer and collects the visibility
circles into one Graphics set as fogLayer.setMask({ mask, inverse: true }),
so the fog shows everywhere except the union of the circles. Per-frame cost
drops from dozens of blended opaque circle fills to one rect fill + a
stencil pass (no colour writes), which Apple's TBDR GPU handles cheaply,
and the fog stays fully vector — crisp at any zoom.

fogPaintOps and its unit tests are unchanged (the circle ops now feed the
mask instead of an overpaint). Verified with a high-contrast screenshot
during development (fog field with a correct circle-union hole) plus the
existing fog / render-on-demand e2e green on chromium + webkit.

Docs: renderer.md fog section + PLAN.md Phase 29 decision 9.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 16:53:54 +02:00
developer 44c18c3ef4 Merge pull request 'fix(ui-map): render-on-demand + drop pan inertia (Safari fog freeze, stage 1)' (#21) from feature/ui-map-render-on-demand into development
Deploy · Dev / deploy (push) Successful in 30s
Tests · UI / test (push) Successful in 1m57s
2026-05-20 14:43:57 +00:00
Ilia Denisov 51902b995f fix(ui-map): render-on-demand + drop pan inertia to stop the Safari fog freeze
Tests · UI / test (push) Successful in 1m55s
Tests · UI / test (pull_request) Successful in 2m4s
The Phase 29 visibility fog ("visible hyperspace") froze the whole UI on
large reports in Safari while staying smooth in Firefox. Root cause: the
fog is a layered overpaint (torus mode = 9 world-sized rects + 9xN
near-world-sized opaque circles, ~270 fills for KNNTS041) and Pixi's
continuous auto-render loop re-rasterised all of it every frame, even
while idle. Safari's WebGPU backend cannot sustain that fillrate, so the
main thread/compositor starved and the entire UI froze.

Stage 1 (vector-preserving, no rasterisation):

- Stop Pixi's auto-render loop (app.stop()) and paint on demand via a
  single Ticker.shared flush gated on viewport.dirty (camera) plus an
  internal requestRender() from every content mutation (fog / hide-set /
  extras / wrap mode / resize / pick overlay). An idle map now does zero
  GPU work per frame; plain hover paints nothing.
- Remove the decelerate (drag-inertia) plugin: a released drag stops
  instantly (owner request) and the viewport goes idle immediately.
- Expose RendererHandle.getRenderCount() / getMapRenderCount for
  deterministic e2e assertions.

Tests: new map-toggles e2e specs (idle map does not repaint; released
drag does not coast) green on all four Playwright projects incl. WebKit.
Docs: renderer.md (render-on-demand section; fog section corrected to the
current single-fogLayer model; FPS note) and PLAN.md Phase 29 decision 8.

If Safari pan is still heavy after this, stage 2 will cut the overpaint
itself with an inverse stencil mask of the circle union (kept vector).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 16:28:18 +02:00
developer 0da2f4b6fb Merge pull request #20 from feature/ui-map-toggles
Deploy · Dev / deploy (push) Successful in 30s
Tests · Integration / integration (push) Successful in 1m43s
Tests · Go / test (push) Successful in 2m5s
Tests · UI / test (push) Successful in 2m56s
Phase 29 — Map Toggles

Reviewed and verified visually on dev-deploy.
2026-05-19 22:37:29 +00:00
Ilia Denisov 53b892ae00 fix(ui-map): move fog overlay to a viewport-level layer below the copies
Tests · UI / test (push) Successful in 2m50s
Tests · UI / test (pull_request) Has been cancelled
Tests · Integration / integration (pull_request) Successful in 1m47s
Tests · Go / test (pull_request) Successful in 2m5s
Two regressions surfaced once visible-hyperspace toggled on a real
dev-deploy map:

1. On the zero-turn map the bg holes painted ON TOP of the planet
   glyphs — every LOCAL planet looked like a hollow circle of
   background colour instead of the planet pixel inside an
   unfogged area.
2. On a legacy report with a drive tech that pushes the visibility
   radius well past the world dimensions the bg circles overlapped
   to cover the entire viewport. Combined with the wrong z-order
   the result was a uniformly black canvas with every primitive
   hidden.

The per-copy implementation added the fog container via
`copy.addChildAt(container, 0)` and trusted Pixi v8 to insert the
container at the start of the copy's children. Whether by a Pixi
quirk or by some interaction with how `populatePrimitives` orders
its `c.addChild(g)` calls, the fog ended up rendering after every
primitive in practice — the symptoms above are a perfect match for
that ordering.

Restructured the fog rendering so the z-order is structural
rather than relying on `addChildAt`:

- A single `fogLayer: Container` is added to the viewport BEFORE
  the nine torus copies. Pixi renders viewport children in order,
  so the layer is guaranteed to paint first; every copy renders
  on top.
- `fogPaintOps` now emits world-space coordinates with wrap
  offsets baked in (9 fog rects + 9 bg circles per visibility
  entry in torus mode, 1 + N in no-wrap mode). The renderer
  populates `fogLayer` with one `Graphics` per op — no per-copy
  iteration on the fog side.
- The previous `fogGraphics: Container[]` closure state is gone.
  Each `setVisibilityFog` flip drops every child of `fogLayer`
  and rebuilds it. The dispose path drops the children
  eagerly before `app.destroy({children: true})` walks the tree.

The fog-paint-ops test exercises the new contract: the no-wrap
path keeps one rect + N circles, the torus path expands to nine
rects + nine wrapped circles per entry (including the seam-fix
case at x = 950).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 00:26:06 +02:00
Ilia Denisov 00e84579ca fix(ui-map): split fog overlay into per-shape Graphics + torus-wrap circles
Tests · UI / test (push) Successful in 3m23s
Two visible regressions in the in-game map's fog overlay surfaced
on dev-deploy:

1. With three LOCAL planets close together, only the last planet
   glyph stayed visible inside the bg holes — the other two were
   obscured. The previous implementation stacked the fog rectangle
   plus every bg circle onto a single `Graphics` via repeated
   `g.rect(...).fill(...).circle(...).fill(...)...`. Pixi v8's
   multi-shape Graphics is supported in theory, but in practice
   only the last shape's fill seems to land, dropping the earlier
   bg holes (and the planet glyphs on top look like they vanished
   along with their hole). Splitting each op onto its own
   `Graphics` inside a per-copy `Container` removes the ambiguity
   — one shape, one fill, one render pass.

2. A planet near the right world edge produced a "sector" — the
   bg circle painted into the area past the seam, but the
   neighbouring tile's fog rectangle then overpainted that bleed,
   leaving a quarter-circle hole. In torus mode each visibility
   circle is now drawn at the nine wrapped positions
   (`(dx, dy) ∈ {-1, 0, 1}²`); the wrapped copies in the
   neighbour-tile-aligned positions keep the hole continuous
   across the seam. No-wrap mode keeps a single emission per
   circle, because wrapped circles would leak into the visible
   world rectangle as unwanted holes.

The `fogPaintOps` helper now takes the wrap mode as a parameter;
`tests/fog-paint-ops.test.ts` covers the torus expansion
(nine-wrap product per circle, the seam-fix case at x = 950) and
re-asserts the no-wrap path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 00:04:35 +02:00
Ilia Denisov 7ade838df8 test(ui-map): unit-cover the fog overlay's layered-overpaint contract
Tests · UI / test (push) Successful in 2m49s
Lifted the Phase 29 fog draw sequence out of `setVisibilityFog`
into a pure `fogPaintOps` helper that returns an ordered list of
fill operations (one fog rect, then one background-coloured
circle per visibility entry). The renderer now dispatches each op
straight onto a Pixi `Graphics`; the indirection lets the layered-
overpaint contract be tested without booting Pixi.

`tests/fog-paint-ops.test.ts` covers: empty input → no ops; single
circle → fog rect + bg circle in that order; multiple circles → N
bg circles after the fog rect; overlapping circles emitted
independently (the rendering order unions them); zero / negative
world dimensions → no ops.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 23:42:39 +02:00
Ilia Denisov 37580b7699 fix(ui-map): repaint fog as layered overpaint; rename to visibleHyperspace
Tests · UI / test (push) Waiting to run
The Phase 29 fog overlay rendered as a handful of random arc
segments instead of a clean union of holes around LOCAL planets
— Pixi v8's `Graphics.cut()` does not reliably subtract multiple
overlapping circles from a base path.

Replaced the cut-based approach with a layered overpaint: a
fog-tinted rectangle fills the world, then opaque background-
coloured circles are painted on top for every visibility circle.
The natural rendering order unions overlapping circles for free —
no geometry, no `cut()` quirks, one extra fill per circle.

Renamed the toggle from `visibilityFog` to `visibleHyperspace`
across the store, i18n strings, popover, tests, and docs. The
overlay still implements the visual "fog" effect at the renderer
level (FOG_COLOR, setVisibilityFog, getMapFog); the toggle is
named after the player-facing concept it controls — the portion
of the map that is visible (intelligence/scan coverage) — rather
than the obscured part.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 23:39:39 +02:00
Ilia Denisov 2f4dc01d54 fix(ui-map): apply wrap-mode flips in place instead of remounting
Tests · UI / test (push) Successful in 2m51s
The previous logic re-mounted the renderer whenever
`store.wrapMode` flipped, because the `sameSnapshot` gate
included `handle.getMode() === mode`. Pixi 8 does not reliably
re-initialise an `Application` on the same canvas — the symptom
showed up as the chromium tab silently closing during the
Phase 29 wrap-mode e2e ("Target page, context or browser has
been closed").

The renderer already exposes an in-place `setMode` that swaps
the wrap-clamp / torus-copy visibility synchronously while
preserving the camera; the playground-map.spec.ts wrap toggle
has been driving it for several phases without issue. Drop
mode from the snapshot gate and route the change through
`handle.setMode(mode)` instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 22:33:38 +02:00
Ilia Denisov 7c46aa4bec fix(ui-e2e): tighten Phase 29 effect tracking + radio wiring
Tests · UI / test (push) Failing after 7m19s
Run #217 surfaced three independent bugs that survived the first
fixup pass:

1. `visibleHighBitCount` masked the id with `(prim.id >>> 0) & 0xf…`,
   but JS bitwise AND always returns a signed int32 — the mask had
   to be re-converted with `>>> 0` AFTER the AND, not before. Result
   was always 0 on the previous run, masking the next two bugs by
   making the persistence test's high-bit-count assertions a
   tautology.
2. `applyVisibilityState` was wrapped in `untrack`, so the
   `toggles.X` reads inside `computeHiddenIds` / `computeFogCircles`
   never landed in the effect's dependency set — toggling fog or any
   marker / group / kind flag did not re-run the effect, so the
   renderer never received the new hide / fog input. Explicit
   `void toggles.X` reads now live at the top of the effect so every
   key is tracked synchronously.
3. The wrap-mode radios fired on `onchange`, which Svelte 5
   suppresses on a re-activation of an already-checked input — the
   Playwright `.click()` flake on the second wrap test reflected the
   missed event. Switched to `onclick` and short-circuited when the
   target mode is already active.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 22:23:15 +02:00
Ilia Denisov 2528d63b51 fix(ui-e2e): Phase 29 map-toggles spec passes across all four projects
Tests · UI / test (push) Failing after 10m52s
Three independent bugs in `tests/e2e/map-toggles.spec.ts` made the
fresh-Phase-29 suite red on CI #216:

1. `visiblePlanets` filtered on `p.id < 1_000_000`, which JS interprets
   in signed space — high-bit-prefix primitives (cargo route 0x80…,
   battle 0xa0…, bombing 0xc0…) are stored as negative Numbers and
   leaked into the planet list. Filter switched to a `0 < id < 1e7`
   window that matches the engine planet-number range exactly.
2. The `visibleHighBitCount` helper now ToUint32-converts the id
   before masking so the bitmask comparison works regardless of
   whether the id is stored as positive or negative.
3. The fog and wrap-mode tests read the renderer state synchronously
   after the click — the Svelte effect re-runs asynchronously, so the
   tests saw stale state. Both now `waitForFunction` on the canonical
   "settled" signal: empty fog circles for the fog flip, and a new
   `getMapMode()` debug accessor for the wrap-mode remount.

Renderer side: registers a `MapModeProvider` next to the existing
camera / fog providers and exposes `getMapMode()` through the debug
surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 22:02:15 +02:00
Ilia Denisov 2bd1b54936 feat(ui): Phase 29 map visibility toggles
Tests · Go / test (push) Successful in 2m31s
Tests · UI / test (push) Failing after 8m7s
Adds the gear-icon popover on the map view with per-game persistence
of every category toggle plus the wrap-mode radio. Hide-by-id and
visibility-fog facilities land on the renderer so every flip applies
within one frame without a Pixi remount; the wrap-mode toggle keeps
its existing remount + camera-preserve path. A new server-side turn
force-resets every flag to defaults so a hidden category never makes
the player miss the next turn's news.

Also fixes the FligthDistance → FlightDistance typo in pkg/calc/race.go
(plus the single Go caller); the TS side keeps duplicating the formula
until a race-level WASM bridge lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 21:33:53 +02:00
developer 65c0fbb87d Merge pull request 'fix(generator): drop incorrect distinctness assert in TestPlanetRandomName' (#19) from feature/fix-flaky-planet-random-name into development
Deploy · Dev / deploy (push) Successful in 31s
Tests · Integration / integration (push) Successful in 1m35s
Tests · Go / test (push) Successful in 2m58s
2026-05-19 08:15:45 +00:00
Ilia Denisov 3d06f49f3c fix(generator): drop incorrect distinctness assert in TestPlanetRandomName
Tests · Go / test (push) Successful in 1m54s
Tests · Integration / integration (pull_request) Successful in 1m42s
Tests · Go / test (pull_request) Successful in 2m2s
`RandomName` builds the suffix as two independent `rand.Intn(1000)`
calls, so the two 4-digit halves collide on ~0.1% of runs. The
sub-test asserted `g[2] != g[3]`, which flakes whenever the same
value lands twice — once per ~1000 sub-runs per class, so across
the seven `PlanetClass` rows the integration suite hit it on
`#199 go-unit.yaml` against `feature/subscribe-events-heartbeat`
(`"0074"` collision).

Distinctness is not a property `RandomName` promises and is not
load-bearing for callers: `game/internal/controller/generate_game.go`
uses these names for planet labels and already tolerates duplicate
names across planets, so collisions inside one name are no worse
than collisions between names. Drop the assert; keep the format and
class-prefix checks, which are the actual contract.

Stress-tested with `-count=200`: 200 consecutive iterations × 7
classes = 1400 sub-runs without a single failure where the prior
version's flake probability would have surfaced ~once on average.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 10:13:25 +02:00
developer 08345606a5 Merge pull request 'fix(dev-deploy): explicit Cache-Control on the UI surface' (#18) from feature/caddy-cache-headers into development
Deploy · Dev / deploy (push) Successful in 27s
2026-05-19 08:11:40 +00:00
Ilia Denisov b85a9e1b9b fix(dev-deploy): explicit Cache-Control on the UI surface
Caddy's `file_server` did not set Cache-Control on the SvelteKit
build, so browsers fell back to heuristic caching keyed off
Last-Modified. On the long-lived dev environment the heuristic
window leaves the previous deploy's `index.html` cached for
minutes-to-hours, and Safari combined that with stale conditional
requests into a visible multi-second freeze on every reload (the
reproduction was "private window reloads instantly, normal window
hangs; clearing Safari caches restores normal speed"). Push
delivery itself works — heartbeat keeps the SubscribeEvents stream
alive — but the bundle path stalls behind the browser revalidating
a chain of stale chunks.

Mirror the standard SvelteKit cache split inside both Caddyfiles:

- `_app/immutable/*` — hash-named JS/CSS chunks Vite emits with
  content-addressed file names — `Cache-Control:
  public, max-age=31536000, immutable`. Safe to cache forever
  because the name changes whenever the content does, so the next
  deploy serves new files under new URLs.
- Everything else (`index.html` fallback via `try_files`,
  `env.js`, `version.json`, `core.wasm`, `wasm_exec.js`,
  `favicon.svg`) — `Cache-Control: no-cache, must-revalidate`.
  The browser still uses the cached body when the ETag matches,
  but it always asks first; a fresh deploy reaches the user on
  the next reload without a manual cache clear.

Smoke-tested locally: a docker-run Caddy with this config returns
the immutable header only for `/_app/immutable/*` and the
no-cache header for `/index.html`, `/env.js`, and the SPA-fallback
path `/some/route`. The Caddyfile passes `caddy validate` in
both `Caddyfile.dev` and `Caddyfile.prod`; the pre-existing
formatting warning on line 7 is untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 10:11:09 +02:00
developer 8170abd5fa Merge pull request 'feat(gateway): unsigned gateway.heartbeat keeps Safari push streams alive' (#17) from feature/subscribe-events-heartbeat into development
Deploy · Dev / deploy (push) Successful in 33s
Tests · Integration / integration (push) Successful in 1m36s
Tests · Go / test (push) Successful in 3m18s
Tests · UI / test (push) Successful in 2m28s
2026-05-19 07:35:35 +00:00
Ilia Denisov 14b65389ef feat(gateway): unsigned gateway.heartbeat keeps Safari push streams alive
Tests · UI / test (push) Successful in 2m35s
Tests · Go / test (push) Successful in 1m56s
Tests · UI / test (pull_request) Has been cancelled
Tests · Integration / integration (pull_request) Successful in 1m42s
Tests · Go / test (pull_request) Successful in 2m0s
Browser fetch-streaming layers close response bodies they consider
idle after roughly 15-30 s without incoming bytes. Safari is the
most aggressive, but the symptom matters everywhere: a quiet
SubscribeEvents stream (lobby, between turns, mailbox empty) gets
torn down by the browser, the EventStream singleton reconnects with
backoff, and any push event that fires inside the reconnect window
is lost because `push.Hub` queues are not persisted across
subscription closes. The user-visible failure mode is the
intermittent "Fetch API cannot load … due to access control checks"
console error (a misleading WebKit symptom — CORS headers are
actually present) plus missed turn-ready / mail-received toasts.

Server-side fix: a silence-based heartbeat at the
`authenticatedPushStreamService` wrapper layer. After the signed
`gateway.server_time` bootstrap event, gateway wraps the bound
stream with `heartbeatingStream`. Every tail Send (fan-out, future
variants) resets the silence timer; when the timer elapses, a
goroutine emits `gateway.heartbeat` with only `EventType` set —
everything else stays at proto3 defaults, so the wire frame is
~45 bytes amortised. A `sendMu` serialises the heartbeat goroutine
with tail Sends because grpc.ServerStream.Send is not goroutine-safe.

The heartbeat is intentionally UNSIGNED: heartbeats carry no
payload, dispatch to no handler on the client, and an injected
heartbeat trivially causes no user-visible state change. TLS still
protects the wire and real events keep the signed envelope
unchanged. Documented in `docs/ARCHITECTURE.md` § 15 alongside the
per-scale bandwidth projection (100…100 000 clients × 15…60 s).

Config: new `GATEWAY_PUSH_HEARTBEAT_INTERVAL` (default `15s`,
`0s` disables). Telemetry: new
`gateway.push.heartbeats_sent{outcome}` counter so operators can
budget bandwidth and spot a sudden `outcome=error` bump as an
upstream-failing-before-flush signal.

Client (`ui/frontend/src/api/events.svelte.ts`): early `continue`
on `event.eventType === "gateway.heartbeat"` before `verifyEvent`,
`verifyPayloadHash`, or dispatch — empty signature would otherwise
trip SignatureError and reconnect. A leading heartbeat still flips
`connectionStatus` to `connected` and resets backoff, because
receiving one is proof the stream is healthy.

Tests:
- `push_heartbeat_test.go`: unit tests for the wrapper — zero
  interval returns nil, heartbeat fires after silence, real Send
  resets the timer, Stop / context-cancel halt the goroutine,
  Send errors propagate.
- `server_test.go`: integration tests through the full gateway
  pipeline — heartbeat fires after the configured silence window,
  zero interval keeps the stream silent.
- `config_test.go`: default applied, env-override parsed,
  negative value rejected.
- `events.test.ts`: heartbeat skipped before verification + not
  dispatched to handlers; leading heartbeat still flips
  `connectionStatus` to `connected`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 09:29:29 +02:00
developer 8f84075c4b Merge pull request 'fix(battle-viewer): unblock synthetic-game battle load' (#16) from feature/synthetic-battle-loading-fix into development
Deploy · Dev / deploy (push) Successful in 34s
Tests · UI / test (push) Successful in 2m20s
2026-05-19 05:58:02 +00:00
Ilia Denisov bde01b1ce2 fix(battle-viewer): unblock synthetic-game battle load
Tests · UI / test (push) Successful in 2m18s
Tests · UI / test (pull_request) Waiting to run
The Phase 28 ConnectRPC migration of the battle viewer added a
guard in `lib/active-view/battle.svelte` that waits for the
surrounding layout to publish a `GalaxyClient` before issuing the
fetch. The in-game shell layout deliberately skips
`galaxyClient.set(...)` on the synthetic branch (gateway is not
reachable in synthetic mode), so for any battle opened from a
synthetic-report game the viewer sat on "loading battle…"
forever — `fetchBattle` was never called, so the synthetic-fixture
short-circuit it carries was unreachable.

Let the guard skip synthetic ids: `fetchBattle` already resolves
those through `lookupSyntheticBattle` and never touches the
client, so its signature widens to `GalaxyClient | null` and the
synthetic path passes `null`. The live path still waits for the
handle as before; a `null` client on the live path now fails
fast with a transport-level `BattleFetchError` instead of silently
sitting on `loading`.

Tests:
- Existing "loading placeholder" smoke now uses a non-synthetic
  game id so it keeps asserting the live-path wait.
- Two new cases pin the synthetic behaviour: missing fixture →
  `battle-not-found`; registered fixture → `BattleViewer` mounts.

Docs:
- `docs/FUNCTIONAL.md` §6.5 still described the pre-Phase-28
  raw REST path. Updated to the signed ConnectRPC command and
  noted the synthetic short-circuit. Russian mirror updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 07:52:26 +02:00
developer 82bdb6777a Merge pull request 'fix(dev-deploy): seed geoip onto a named volume' (#15) from feature/dev-deploy-geoip-volume into development
Deploy · Dev / deploy (push) Successful in 28s
2026-05-19 00:02:12 +00:00
Ilia Denisov f70258849f fix(dev-deploy): seed geoip onto a named volume
`docker restart galaxy-dev-backend` failed with "not a directory"
after every dev-deploy workflow run. Root cause: the compose file
bind-mounted the geoip database via a relative path
(`../../pkg/geoip/test-data/test-data/GeoIP2-Country-Test.mmdb`).
When the Gitea runner invoked `docker compose up`, the path
resolved against the runner's ephemeral workspace under
`/home/runner/.cache/act/<hash>/hostexecutor/...`. The bind source
baked into the running container therefore pointed at that
ephemeral path; the runner deleted the workspace once the workflow
finished, and any later `docker restart` could not remount.

Replace the bind with a named volume `galaxy-dev-geoip-data`,
seeded at deploy time:

- `tools/dev-deploy/docker-compose.yml`: mount
  `galaxy-dev-geoip-data:/var/lib/galaxy:ro` instead of a relative
  bind. Declare the volume in the top-level `volumes:` block.

- `.gitea/workflows/dev-deploy.yaml`: new `Seed geoip volume` step
  (placed right after the existing UI-volume seed) copies the
  fixture from `pkg/geoip/test-data/test-data/` into the named
  volume via an ephemeral alpine container, the same pattern UI
  seeding already uses.

- `tools/dev-deploy/Makefile`: new `seed-geoip` target performs
  the same copy from the persistent checkout. `up` and `rebuild`
  now depend on it, so a hand-run `make -C tools/dev-deploy up`
  populates the volume without operator action.

- `tools/dev-deploy/README.md`: updated the make-targets table to
  list `seed-geoip`.

- `tools/dev-deploy/KNOWN-ISSUES.md`: the entry for the restart
  failure is downgraded to a "fixed" postmortem; the symptom,
  cause, and where the fix lives are kept for future reference.

Verification on the dev host (this branch checked out):

  $ make -C tools/dev-deploy up                # populates the volume, brings stack healthy
  $ docker restart galaxy-dev-backend          # used to error "not a directory"
  $ until [ "$(docker inspect -f '{{.State.Health.Status}}' galaxy-dev-backend)" = "healthy" ]; do sleep 2; done
  $ echo "ok"                                   # backend up 6s, healthy

The pre-existing sandbox engine `galaxy-game-80f3ce86-...` survived
both `make up` and `docker restart` untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 01:59:38 +02:00
developer d19aa3aac5 Merge pull request 'fix(integration): scope preclean to galaxy.stack=integration' (#14) from feature/preclean-stack-scope into development
Tests · Integration / integration (push) Successful in 1m36s
2026-05-18 23:48:47 +00:00
Ilia Denisov a338ebf058 fix(integration): scope preclean to galaxy.stack=integration
Tests · Integration / integration (pull_request) Successful in 1m37s
Root cause for the long-standing "Dev Sandbox flips to cancelled
after dev-deploy" symptom in push-triggered cycles: when
`integration.yaml` runs in parallel with `dev-deploy.yaml`, its
`integration/scripts/preclean.sh` issues a `docker rm -f` over every
container labelled `galaxy.backend=1`. That label is stamped by the
backend's runtime adapter on every engine it spawns — including the
engines living in the long-lived dev-deploy environment on the same
Docker daemon. Each post-merge auto-deploy therefore had the
integration preclean wipe the dev-sandbox engine, and the new
backend's reconciler tick observed `container disappeared` and
cascaded the sandbox into `cancelled`.

Fix:

- `integration/testenv/backend.go` now sets
  `BACKEND_STACK_LABEL=integration` on every backend-under-test, so
  the engines spawned by integration carry
  `galaxy.stack=integration` in addition to `galaxy.backend=1`. The
  backend support for this env was added in the previous CI tidy-up
  PR (#13).

- `integration/scripts/preclean.sh` gains a multi-label AND filter
  helper and uses it to scope engine cleanup to the combination
  `galaxy.backend=1 AND galaxy.stack=integration`. dev-deploy and
  local-dev engines carry different `galaxy.stack` values, so the
  AND match leaves them alone.

- `docs/ARCHITECTURE.md` "Container labels" — refreshed to call out
  the AND-scoping rule and the new integration backend stamp.

- `tools/dev-deploy/KNOWN-ISSUES.md` — the sandbox-cancel entry
  gets an "Update" section recording the root cause and the fix; the
  status is downgraded to "partially fixed" because the solo
  `workflow_dispatch` reproduction (which does NOT trigger
  integration) remains unexplained.

- `tools/dev-deploy/KNOWN-ISSUES.md` — separately, document the
  `docker restart galaxy-dev-backend` failure caused by the
  runner-workspace bind-mount that surfaced while diagnosing this
  issue. Workaround: `make -C tools/dev-deploy up` from the
  persistent checkout. Real fix is a follow-up (bake fixture into
  image or copy to named volume).

Verification:

- `go build ./backend/... ./integration/...` — clean.
- `bash -n integration/scripts/preclean.sh` — syntax OK.
- Live AND-filter check on the dev host:
  `docker ps -aq --filter label=galaxy.backend=1 --filter label=galaxy.stack=integration`
  returns nothing while the dev-deploy engine
  `galaxy-game-80f3ce86-...` keeps running.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 01:37:55 +02:00
developer f91cf6eb41 Merge pull request 'chore(ci): tidy CI/dev infra — drop local-ci, lift migration rule' (#13) from feature/ci-tidy-up into development
Deploy · Dev / deploy (push) Successful in 28s
Tests · Go / test (push) Successful in 2m0s
Tests · Integration / integration (push) Successful in 1m40s
2026-05-18 23:10:21 +00:00
Ilia Denisov daed2690c1 fix(compose): keep galaxy.stack label on containers only
Tests · Integration / integration (pull_request) Successful in 1m41s
Tests · Go / test (pull_request) Successful in 2m0s
The previous commit stamped `galaxy.stack=<value>` on services,
volumes, and networks. Putting it on volumes/networks changes their
compose config-hash on every label revision, so `docker compose up`
tries to recreate them — which on the long-lived dev environment
either destroys the postgres data volume or deadlocks while trying
to remove `galaxy-dev-internal` with containers still bound to it.
Observed live: run #184 hung in compose recreate after the three
stateful services were stopped, with no recovery.

Containers alone are sufficient for the cleanup contract (we filter
containers, not volumes or networks). Roll back the label on volumes
and networks in both compose files and capture the rule in
docs/ARCHITECTURE.md so the next contributor does not reintroduce it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 01:00:21 +02:00
Ilia Denisov a9087691a3 chore(ci): tidy CI/dev infra — drop local-ci, lift migration rule, scope by galaxy.stack label
Tests · Go / test (push) Successful in 2m6s
Tests · Go / test (pull_request) Successful in 3m1s
Tests · Integration / integration (pull_request) Successful in 1m42s
Five connected cleanups across the dev/CI infrastructure:

1. Drop tools/local-ci/. The standalone Gitea + act_runner stack was
   the legacy "offline workflow validator"; the per-stage CI gate now
   runs on gitea.lan and the directory was only retained as a
   fallback. Removing it leaves no operational dependency: backend,
   gateway, and game code have no references; documentation that
   pointed at it (CLAUDE.md, docs/ARCHITECTURE.md, ui/docs/testing.md,
   tools/dev-deploy/README.md, tools/local-dev/README.md) is updated
   in this same change. Historical "Verified on local-ci run N"
   markers in ui/PLAN.md are preserved unchanged.

2. Lift the pre-production single-migration rule. The rule forced
   every schema delta into 00001_init.sql and required a manual
   make clean-data wipe on every backward-incompatible change in
   tools/dev-deploy/. Future schema deltas now land as additive
   sequence-numbered files (00002_*.sql, …) that goose applies
   automatically on backend startup; 00001_init.sql becomes an
   immutable baseline. Authoring conventions live in
   backend/internal/postgres/migrations/README.md. The chain may be
   squashed back into a fresh 00001 as a deliberate one-time
   operation before the first production deployment.

3. Document the deployment cadence. The dev environment is
   single-tenant: pushes to feature/* run the test workflows
   (go-unit, ui-test, integration) only; dev-deploy.yaml fires on
   push to development. A workflow_dispatch override on
   dev-deploy.yaml lets a developer preview a feature branch on the
   shared dev environment before merge; the next merge into
   development overwrites the manual deploy idempotently.

4. Scope compose-managed resources by an explicit
   galaxy.stack=<local-dev|dev-deploy> label. Both compose files
   stamp the label on every service, network, and named volume.
   Makefiles in tools/local-dev/ and tools/dev-deploy/ filter their
   engine-cleanup operations by (stack-label AND engine OCI title)
   so they never touch unrelated workloads on the same daemon.
   dev-deploy.yaml gains a pre-`compose up` step that reaps stale
   exited/dead containers under the dev-deploy stack label.

5. Backend now stamps the same galaxy.stack=<value> label on every
   engine container it spawns, sourced from a new BACKEND_STACK_LABEL
   env var (empty → label not applied; legacy-safe). Both compose
   files set it to their stack name (local-dev / dev-deploy). The
   contract is recorded in docs/ARCHITECTURE.md under
   "Container labels". A package-level test in
   backend/internal/runtime exercises both the label-present and
   label-absent paths.

No tests intentionally regressed: go test ./backend/internal/{config,
runtime,dockerclient} is green, both compose files validate cleanly,
and the backend, gateway, and game modules all build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 23:32:42 +02:00
developer 5eec7013ba Merge pull request 'chore(dev-deploy): KNOWN-ISSUES entry for sandbox-cancel after redispatch' (#12) from chore/dev-sandbox-cancel-todo into development 2026-05-16 21:17:37 +00:00
Ilia Denisov 49f614926a KNOWN-ISSUES: park sandbox-cancel; owner rejected host-side hypotheses
After the live investigation, the project owner confirms that none
of the host-side cleanup paths apply: no docker prune cron, no
manual `docker rm`, no `dockerd` restart in the window, and the
engine binary does not crash while idling on API calls.

Replace the host-side hypothesis list with a one-line note that
they were considered and rejected, narrow the open suspicion to
the `dev-deploy.yaml` job sequence (`docker build` + `docker
compose build` + the alpine `docker run --rm` for UI seeding +
`docker compose up -d --wait --remove-orphans`), and park the
entry. Reopen if the symptom recurs with a fresh
`docker events --since 0` capture armed before the deploy
starts.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 23:16:51 +02:00
Ilia Denisov cadb72b412 KNOWN-ISSUES: rule out compose orphan reap; narrow to host-side reap
Tests · UI / test (push) Successful in 2m36s
Tests · Go / test (push) Successful in 2m38s
A live `docker inspect` of an engine container and two redispatch
runs with `docker events` captured confirm:

- Engine has no `com.docker.compose.*` labels and `AutoRemove=false`,
  so `--remove-orphans` cannot reap it.
- Two consecutive `dev-deploy.yaml` redispatches with an engine
  already running emitted `die` / `destroy` events only for
  `galaxy-dev-{backend,api,caddy}` — never for the engine.
- The reconciler tick that fires 60s after backend recreate
  correctly matched the surviving engine in both cases
  (`status=running` in both `games` and `runtime_records`).
- `runtime.Service` has no `Shutdown` that proactively removes
  engine containers, so a graceful backend exit also leaves them
  alone.

The repro window therefore needs a separate trigger that removed
the engine container outside of compose. The new hypotheses point
at host-side `docker prune` jobs, a `dockerd` restart that lost the
container, or an early `Engine.Init` failure that exited the engine
before `status=running` reached the runtime row. The investigation
list now leads with `journalctl -u docker` and the host crontab —
those are the cheapest checks to confirm or rule out next.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 23:10:13 +02:00
Ilia Denisov 5177fef2ef tools/dev-deploy: log the sandbox-cancellation TODO
Capture the diagnostic notes for the issue we hit after every
`dev-deploy.yaml` redispatch: the freshly-bootstrapped "Dev Sandbox"
game ends up `cancelled` ~15 minutes later, with the runtime
reconciler reporting "container disappeared". The engine never
shows up in `docker ps -a --filter label=galaxy-game-engine`, so
either it never spawned or it was removed before any host-side
snapshot.

`KNOWN-ISSUES.md` records the symptom, the log excerpt, three
working hypotheses (runtime spawn race, `--remove-orphans`
interaction, engine `--rm` lifecycle), and the investigation
checklist before opening an issue. The README gets a one-line
pointer so future redeploys land on the doc immediately.

No code change — this is the placeholder so the next person
investigating the cancellation pattern does not have to
rediscover the diagnostic from scratch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 22:56:25 +02:00
developer 823be9d980 Phase 28: diplomatic mail UI
Deploy · Dev / deploy (push) Successful in 29s
Tests · Go / test (push) Successful in 2m5s
Tests · Integration / integration (push) Successful in 1m45s
Tests · UI / test (push) Successful in 2m31s
Merges feat/ui-stage-28 into development.
2026-05-16 20:56:16 +00:00
Ilia Denisov 2119f825d6 mail UI: dedupe broadcast fan-out and drop in-game admin compose
Tests · UI / test (push) Has been cancelled
Tests · Integration / integration (pull_request) Successful in 1m46s
Tests · Go / test (pull_request) Successful in 2m6s
Tests · UI / test (pull_request) Successful in 2m24s
Two issues surfaced once the long-lived dev environment finally
reached the diplomail view:

1. `/sent` returns one row per recipient for broadcast and admin
   fan-outs (so the admin tooling can render the materialised
   audience). The list pane fed all rows into the stand-alone
   bucket, so the `{#each entries as e (entryKey(e))}` key in
   `thread-list.svelte` collapsed to the same `standalone:${id}`
   for every recipient and Svelte 5 aborted the render with
   `each_key_duplicate`. Dedupe stand-alones by `message_id` in
   `buildEntries`.

2. The compose dialog exposed an `admin` kind toggle gated on
   "owner of game". That was a Phase 28 plan decision, but admin
   compose is an operator tool (server admin), not an in-game
   action — every game owner should not be able to broadcast
   admin notifications. Drop the admin option, the audience
   sub-toggles, and the admin path through `submit`. The
   `MailStore.composeAdmin` wrapper and the backend RPC stay so
   the future admin UI can call them.

Vitest covers the fan-out dedup with three rows sharing one
`message_id` collapsing to a single stand-alone entry.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 22:38:59 +02:00
Ilia Denisov 57e6c1d253 gateway: CORS allow-list for the authenticated Connect-Web surface
Tests · Go / test (push) Successful in 2m9s
Tests · Go / test (pull_request) Successful in 2m9s
Tests · Integration / integration (pull_request) Successful in 1m47s
Tests · UI / test (pull_request) Successful in 2m52s
The public REST listener already exposes
`GATEWAY_PUBLIC_HTTP_CORS_ALLOWED_ORIGINS`; the authenticated
Connect-Web listener on the separate gRPC port had no equivalent.
That worked in `tools/local-dev` (Vite proxy makes everything
same-origin) and would work in production once UI and gateway share
a single hostname, but the long-lived dev environment serves the
UI from `https://www.galaxy.lan` and the gateway from
`https://api.galaxy.lan` — every `/galaxy.gateway.v1.EdgeGateway/*`
fetch failed in the browser with the WebKit "Load failed" generic
message because the response carried no `Access-Control-Allow-Origin`
header. Lobby rendered as "[unknown] Load failed" with no game.

Mirror the public-REST CORS surface for the authenticated handler:

- new env `GATEWAY_AUTHENTICATED_GRPC_CORS_ALLOWED_ORIGINS`;
- new `AuthenticatedGRPCConfig.CORSAllowedOrigins` field;
- new `grpcapi.withCORS` middleware wrapping the Connect mux;
- dev-deploy stack sets the env to `https://www.galaxy.lan`.

The middleware speaks plain net/http (the Connect handler is mounted
on a ServeMux, not gin), handles preflight 204 immediately, and
exposes the Connect-Web header set the browser needs to read the
response (`Grpc-Status`, `Grpc-Message`, `Connect-Protocol-Version`).
Empty allow-list disables the middleware — production stays at
"single hostname" by default.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 22:15:11 +02:00
Ilia Denisov 4b2a949f12 dev-deploy Caddy: route Connect-Web traffic to gateway :9090
Tests · Integration / integration (pull_request) Successful in 1m44s
Tests · Go / test (pull_request) Successful in 2m6s
Tests · UI / test (pull_request) Successful in 2m27s
`api.galaxy.lan` was proxying every path to `galaxy-api:8080` (the
public REST listener), so authenticated Connect-Web calls
(`/galaxy.gateway.v1.EdgeGateway/ExecuteCommand`,
`/galaxy.gateway.v1.EdgeGateway/SubscribeEvents`) collapsed to a 404
from the public route table — the lobby loaded the static bundle
but every authenticated query failed silently.

Split routing by path: `/galaxy.gateway.v1.EdgeGateway/*` goes to
the authenticated listener on `:9090`, everything else stays on
`:8080`. Mirrors the Vite dev-server proxy in
`ui/frontend/vite.config.ts`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 22:03:55 +02:00
Ilia Denisov 81917acc3e dev-deploy: enable Dev Sandbox bootstrap and synthetic-report loader
Tests · UI / test (push) Has been cancelled
Tests · Integration / integration (pull_request) Successful in 1m47s
Tests · Go / test (pull_request) Successful in 2m4s
Tests · UI / test (pull_request) Successful in 2m23s
Two long-standing dev-environment ergonomics had not survived the
move from the bespoke local-dev stack to the CI-driven dev-deploy:

1. `BACKEND_DEV_SANDBOX_EMAIL` defaulted to an empty string in the
   dev-deploy compose, so the auto-provisioned "Dev Sandbox" game
   never appeared on `https://www.galaxy.lan`. Bake `dev@galaxy.lan`
   as the default — matches `.env.example` and lets a developer who
   logs in with that email find a ready-to-play game in the lobby.

2. The lobby's synthetic-report loader was gated on
   `import.meta.env.DEV`, which is true only for `vite dev` (the
   tools/local-dev path). The long-lived dev environment builds
   with `vite build` (production mode), so the section was always
   stripped from its bundle. Gate it on an explicit
   `VITE_GALAXY_DEV_AFFORDANCES` flag instead and set it both in
   `.env.development` (preserves `pnpm dev` behaviour) and in the
   `dev-deploy.yaml` build step. The `prod-build.yaml` build path
   leaves the flag unset, so production stays clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 21:46:24 +02:00
Ilia Denisov 859b157a59 auth dev-fixed-code bypasses attempts cap; dev-deploy gains manual dispatch
Tests · Go / test (pull_request) Successful in 2m9s
Tests · Go / test (push) Successful in 2m9s
Tests · Integration / integration (pull_request) Successful in 1m49s
Tests · UI / test (pull_request) Successful in 2m51s
Two problems showed up while trying to log into the long-lived dev
environment with the dev-fixed code `123456`:

1. `ConfirmEmailCode` checked the per-challenge attempts ceiling
   *before* the dev-fixed-code override. A developer who burned past
   `ChallengeMaxAttempts` on an existing un-consumed challenge (easy
   to trigger when the throttle reuses one challenge_id) hit
   `ErrTooManyAttempts` and the UI rendered "code expired or already
   used" even though the fixed code was correct. Reorder so the
   dev-fixed-code branch runs first and bypasses both the bcrypt
   verify and the attempts gate. Production stays unaffected
   because production loaders refuse to set `DevFixedCode`.

2. `dev-deploy.yaml` only fires on push to `development`, so the
   matching docker-compose default change for
   `BACKEND_AUTH_DEV_FIXED_CODE` could not reach the running stack
   before this PR merged. Add `workflow_dispatch: {}` so a developer
   can deploy any branch — typically a feature branch under review —
   from the Gitea Actions UI without waiting for the merge.

Covered by a new `TestConfirmEmailCodeDevFixedCodeBypassesAttemptsCeiling`
integration test that burns through the ceiling with wrong codes
then proves the dev-fixed code still produces a session.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 21:28:30 +02:00
Ilia Denisov 166baf4be0 battle-viewer e2e: mock user.games.battle ConnectRPC command
Tests · UI / test (push) Has been cancelled
Tests · Integration / integration (pull_request) Successful in 1m46s
Tests · Go / test (pull_request) Successful in 2m0s
Tests · UI / test (pull_request) Successful in 2m23s
Phase 28 moved the battle fetch off the REST passthrough onto the
signed envelope, so the Playwright spec's `page.route(...)` against
the old REST path no longer intercepts anything and the viewer
times out waiting for data. Update the spec to:

- Build a FlatBuffers `BattleReport` payload in
  `fixtures/battle-fbs.ts` (mirrors `report-fbs.ts`'s pattern).
- Add a `user.games.battle` case to the ExecuteCommand mock that
  decodes the FBS `GameBattleRequest`, returns the encoded report
  when the battle_id matches the seeded one, and surfaces a
  canonical `not_found` resultCode otherwise.
- Drop the obsolete REST route stubs.
- Drive the negative-path test with a real UUID that does not match
  the seeded one, so the gateway-side switch is the source of the
  404 (the old `missing-uuid` literal was no longer a valid wire
  shape for the UUID decoder).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 12:55:15 +02:00
Ilia Denisov ebd156ece2 battle-fetch: migrate to user.games.battle ConnectRPC command
Tests · UI / test (push) Has been cancelled
Tests · Go / test (pull_request) Successful in 2m6s
Tests · Go / test (push) Successful in 2m7s
Tests · Integration / integration (pull_request) Successful in 1m47s
Tests · UI / test (pull_request) Failing after 3m42s
The Phase 27 BattleViewer was the last UI surface still issuing raw
fetch() against the backend REST contract (`/api/v1/user/games/...
/battles/...`). The dev-deploy gateway never proxied that path, so
the viewer worked only in tools/local-dev/. Move it onto the signed
ConnectRPC channel every other authenticated surface already uses.

Wire pieces:
- FBS GameBattleRequest in pkg/schema/fbs/battle.fbs, regenerated
  Go + TS bindings.
- MessageTypeUserGamesBattle constant + GameBattleRequest struct in
  pkg/model/report/messages.go.
- pkg/transcoder/battle.go gains GameBattleRequestToPayload and
  PayloadToGameBattleRequest helpers.
- gateway games_commands.go switches on the new message type and
  GETs /api/v1/user/games/{id}/battles/{turn}/{battle_id}; the JSON
  response is re-encoded as a FlatBuffers BattleReport before being
  returned. 404 from backend surfaces as the canonical `not_found`
  gateway error.
- ui/frontend/src/api/battle-fetch.ts now builds the FBS request,
  calls GalaxyClient.executeCommand, and decodes the FBS response
  into the existing UI shape (Record<string,string> race/ship maps,
  string-form UUID). BattleFetchError carries an HTTP-style status
  derived from the result code so the active-view's not_found branch
  keeps working.
- battle.svelte pulls the GalaxyClient from the in-game shell
  context. While the layout's boot Promise.all is in flight the
  effect stays in `loading` until the client handle becomes
  non-null.
- ui/Makefile FBS_INPUTS gains battle.fbs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 12:41:54 +02:00
Ilia Denisov 8bc75fd71b dev-deploy: default BACKEND_AUTH_DEV_FIXED_CODE to 123456
The long-lived dev environment now opts into the bcrypt-bypass on a
fresh `up`/`rebuild` so a returning developer can sign in with `123456`
even after the matching browser session was cleared (the real emailed
code is single-use). Set the variable to an empty string in `.env` to
force real Mailpit codes (mail-flow QA).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 12:41:32 +02:00