Ilia Denisov b23649059f legacy-report: parse battles + envelope JSON output
Side activity on top of Phase 27: the legacy-report tool now extracts
the "Battle at (#N) Name" / "Battle Protocol" blocks the parser used
to skip. Both the per-battle summary (Report.Battle: []BattleSummary)
and the full BattleReport (rosters + protocol) flow through.

Parser:
- new sectionBattle / sectionBattleProtocol states, with handle()
  trapping the per-race "<Race> Groups" sub-headers so the roster
  stays attributed to the right race;
- parseBattleHeader extracts (planet, planetName) from
  "Battle at (#NN) <Name>";
- parseBattleRosterRow maps the 10-token row into
  BattleReportGroup; column 8 ("L") is NumberLeft, confirmed against
  KNNTS fixtures;
- parseBattleProtocolLine counts shots and builds
  BattleActionReport entries from the 8-token "X Y fires on A B :
  Destroyed|Shields" lines;
- flushPendingBattle finalises a battle on next "Battle at" or any
  top-level section change and appends both the summary and the
  full report;
- syntheticBattleID(idx) + syntheticBattleRaceID(name) synthesise
  stable UUIDs in dedicated namespaces so re-runs produce
  byte-identical JSON.

Parse() signature widens to (Report, []BattleReport, error); the
single caller — the CLI — is updated.

CLI emits a v1 envelope:
  { "version": 1, "report": <Report>, "battles": { <uuid>: <BR>, ... } }
Bare-Report JSONs still load on the UI side for backward compat.

UI synthetic loader: loadSyntheticReportFromJSON detects the v1
envelope, decodes the report as before, and forwards every battle
through registerSyntheticBattle so the Battle Viewer resolves any
UUID offline. Pre-envelope JSON files (no `version` field) still
load — the battle registry stays empty for them.

Docs: legacy-report README moves Battles from "Skipped" to
in-scope, documents the envelope and UUID namespaces;
docs/FUNCTIONAL.md §6.5 (and the ru mirror) note that synthetic
mode is now end-to-end via the envelope.

Tests:
- TestParseBattles covers two battles with full rosters,
  per-shot destroyed/shielded mapping, NumberLeft from column 8,
  deterministic UUIDs across re-parses, and proves a trailing
  top-level section still parses (battle state closes cleanly);
- smokeWant gains a battles count; runSmoke cross-checks
  BattleSummary ↔ BattleReport alignment (id/planet/shots);
- all six real-fixture smoke tests pinned to their `Battle at`
  counts (28, 79, 56, 30, 83, 57);
- Vitest covers the synthetic-report envelope path (battles
  forwarded, missing-battles tolerated, bare-Report backward
  compat);
- KNNTS041.json regenerated against the new parser (existing
  diff was stale w.r.t. Phase 23 anyway; this commit brings it
  in line with the v1 envelope).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 14:22:53 +02:00

# legacy-report-to-json
Converts legacy text-format Galaxy turn reports (the *dg* and *gplus*
engines that lived under `tools/local-dev/reports/`) into a JSON
envelope around [`pkg/model/report.Report`](../../../pkg/model/report)
plus full `BattleReport`s (Phase 27).
## Output envelope
```jsonc
{
  "version": 1,
  "report": { /* report.Report */ },
  "battles": { "<uuid>": { /* report.BattleReport */ }, ... }
}
```
`version: 1` lets the UI distinguish a current-format envelope from a
bare `Report` JSON. The synthetic-report loader accepts both:
pre-envelope synthetic JSON files still load, just without battle
fixtures. `battles` is omitted when the legacy file has no combat
events.
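
The detection rule can be illustrated in Go, although the real loader lives on the UI side and is not shown here. This is a minimal sketch: the `envelope` type and `splitEnvelope` are hypothetical names, and rejecting versions other than 1 is an assumption rather than documented behaviour.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope mirrors the v1 layout shown above (hypothetical type).
// Version is a pointer so a missing "version" key can be told apart from 0.
type envelope struct {
	Version *int                       `json:"version"`
	Report  json.RawMessage            `json:"report"`
	Battles map[string]json.RawMessage `json:"battles"`
}

// splitEnvelope returns the raw report JSON and the raw battles map.
// For a bare Report the whole input is the report and battles is empty.
func splitEnvelope(data []byte) (json.RawMessage, map[string]json.RawMessage, error) {
	var env envelope
	if err := json.Unmarshal(data, &env); err != nil {
		return nil, nil, err
	}
	if env.Version == nil {
		// Pre-envelope file: no battle fixtures available.
		return json.RawMessage(data), map[string]json.RawMessage{}, nil
	}
	if *env.Version != 1 {
		// Assumption: unknown versions are rejected.
		return nil, nil, fmt.Errorf("unsupported envelope version %d", *env.Version)
	}
	if env.Battles == nil {
		// "battles" is omitted when the legacy file has no combat events.
		env.Battles = map[string]json.RawMessage{}
	}
	return env.Report, env.Battles, nil
}

func main() {
	report, battles, err := splitEnvelope([]byte(`{"version":1,"report":{"turn":41}}`))
	fmt.Println(string(report), len(battles), err)
}
```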
The output is consumed by the **DEV-only synthetic-report loader** on
the UI client's lobby (`import.meta.env.DEV`). With it, the map view,
inspectors, and order-overlay can be exercised against rich game
states without playing many turns end-to-end against a real backend.
The tool is part of the synthetic-report parity rule documented in
[`ui/PLAN.md`](../../../ui/PLAN.md).
## Build / run
```sh
# from the repo root, with the Go workspace active
go run ./tools/local-dev/legacy-report/cmd/legacy-report-to-json \
  --in tools/local-dev/reports/dg/KNNTS041.REP \
  --out tools/local-dev/reports/dg/KNNTS041.json
```
`--in` reads `-` as stdin; `--out` defaults to stdout when empty or
`-`. The tool exits non-zero on any I/O or parse failure.
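
The `-` convention described above can be sketched as follows. The helper names (`usesStdin`, `usesStdout`, `openInput`, `openOutput`) are hypothetical and not taken from the tool's source; how the real CLI treats an empty `--in` is not documented here, so the sketch treats only `-` as stdin.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// usesStdin reports whether the --in value selects standard input.
func usesStdin(path string) bool { return path == "-" }

// usesStdout reports whether the --out value selects standard output.
func usesStdout(path string) bool { return path == "" || path == "-" }

// nopWriteCloser keeps os.Stdout open when the caller calls Close.
type nopWriteCloser struct{ io.Writer }

func (nopWriteCloser) Close() error { return nil }

// openInput opens the named file, or wraps stdin for "-".
func openInput(path string) (io.ReadCloser, error) {
	if usesStdin(path) {
		return io.NopCloser(os.Stdin), nil
	}
	return os.Open(path)
}

// openOutput creates the named file, or wraps stdout for "" and "-".
func openOutput(path string) (io.WriteCloser, error) {
	if usesStdout(path) {
		return nopWriteCloser{os.Stdout}, nil
	}
	return os.Create(path)
}

func main() {
	out, err := openOutput("-")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1) // non-zero exit on any I/O failure
	}
	defer out.Close()
	fmt.Fprintln(out, "{}")
}
```

Wrapping the standard streams in no-op closers lets the caller `Close` every handle uniformly without accidentally closing stdin or stdout.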
## Supported input variants
| Variant | Sample dir | Status |
| ------- | ------------------------------------- | ------------- |
| dg | `tools/local-dev/reports/dg/*.REP` | First-class |
| gplus | `tools/local-dev/reports/gplus/*.REP` | First-class |
| ng | `tools/local-dev/reports/ng/*.rep` | Not supported |
| lucky | `tools/local-dev/reports/lucky/*.rep` | Not supported |
dg uses CRLF line endings; gplus uses LF and tabs in section indentation.
Both are space-aligned tabular inside data blocks. The parser splits on
runs of whitespace (`strings.Fields`), so the same code handles both.
Pseudo-Cyrillic glyphs (`MbI`, `KAMA3`, `9IMA`) appear in some race
and ship class names but are stored as plain ASCII letter substitutions,
so no encoding conversion is needed.
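
A short demonstration of why whitespace splitting covers both variants. The sample rows are made up for illustration and `splitRow` is a hypothetical wrapper, not a function from the parser.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRow tokenises one data row on runs of whitespace.
// strings.Fields treats spaces, tabs and '\r' alike, so a trailing
// carriage return from a CRLF file never ends up inside a token.
func splitRow(line string) []string {
	return strings.Fields(line)
}

func main() {
	dg := "12  Terra   1000.00\r"    // dg style: space-aligned, CR left over from CRLF
	gplus := "\t12\tTerra  1000.00" // gplus style: tab indentation, LF already stripped
	fmt.Println(splitRow(dg))
	fmt.Println(splitRow(gplus))
}
```

Both calls yield the same three tokens, which is what lets one row parser serve both engines.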
## In-scope fields (current)
The parser only fills the subset of `report.Report` that the UI client
already decodes from server responses
(`decodeReport` in `ui/frontend/src/api/game-state.ts`):
| `report.Report` field | Source section in legacy file |
| --------------------- | ------------------------------------ |
| `Race` | `<Race> Report for Galaxy ...` line |
| `Turn` | same |
| `Width`, `Height` | `Size: N` (square galaxies) |
| `PlanetCount` | `Planets: N` |
| `VoteFor`, `Votes` | `Your vote:` block |
| `Player[]` | `Status of Players (total ...)` |
| `LocalPlanet[]` | `Your Planets` |
| `OtherPlanet[]` | `<Race> Planets` (one per race) |
| `UninhabitedPlanet[]` | `Uninhabited Planets` |
| `UnidentifiedPlanet[]`| `Unidentified Planets` |
| `LocalShipClass[]` | `Your Ship Types` |
| `OtherShipClass[]` | `<Race> Ship Types` (Phase 23) |
| `LocalScience[]` | `Your Sciences` (Phase 23) |
| `OtherScience[]` | `<Race> Sciences` (Phase 23) |
| `Bombing[]` | `Bombings` (Phase 23) |
| `ShipProduction[]` | `Ships In Production` (Phase 23) |
| `LocalGroup[]` | `Your Groups` (Phase 19) |
| `LocalFleet[]` | `Your Fleets` (Phase 19) |
| `IncomingGroup[]` | `Incoming Groups` (Phase 19) |
| `Battle[]` (summary) | `Battle at (#N) Name` headers + `Battle Protocol` (Phase 27 follow-up) |
The envelope's `battles` map carries the full `BattleReport`-s parsed
out of the same blocks: every roster row turns into a
`BattleReportGroup` (`Number`/`Tech`/`LoadType`/`LoadQuantity`/
`NumberLeft`/`InBattle`), every `... fires on ... : Destroyed|Shields`
line turns into a `BattleActionReport`. UUIDs are synthesised
deterministically — `syntheticBattleID(idx)` for the battle
identifier (per-report 0-based index, SHA1 namespace
`be01a000-0000-0000-0000-000000000002`) and
`syntheticBattleRaceID(name)` for `BattleReport.Races` entries (SHA1
namespace `be01a000-0000-0000-0000-000000000003`). Re-running the
converter on the same input file yields byte-identical JSON, so
synthetic-mode UI URLs (`/games/synthetic-…/battle/<uuid>?turn=N`)
stay stable across regenerations.
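
The protocol-line mapping mentioned above can be sketched like this. It assumes the line splits into exactly eight whitespace-separated tokens and that the two tokens on each side of `fires on` are a race name and a ship class; `battleAction` is a hypothetical stand-in for `BattleActionReport`, whose real fields are not shown in this document, and the sample names are invented.

```go
package main

import (
	"fmt"
	"strings"
)

// battleAction is a hypothetical stand-in for report.BattleActionReport.
type battleAction struct {
	Attacker      string
	AttackerClass string
	Defender      string
	DefenderClass string
	Destroyed     bool
}

// parseProtocolLine parses "X Y fires on A B : Destroyed|Shields".
// It returns false for any line that does not match that shape.
func parseProtocolLine(line string) (battleAction, bool) {
	f := strings.Fields(line)
	if len(f) != 8 || f[2] != "fires" || f[3] != "on" || f[6] != ":" {
		return battleAction{}, false
	}
	var destroyed bool
	switch f[7] {
	case "Destroyed":
		destroyed = true
	case "Shields":
		destroyed = false
	default:
		return battleAction{}, false
	}
	return battleAction{
		Attacker:      f[0],
		AttackerClass: f[1],
		Defender:      f[4],
		DefenderClass: f[5],
		Destroyed:     destroyed,
	}, true
}

func main() {
	action, ok := parseProtocolLine("Knights Lancer fires on Orcs Drone : Destroyed")
	fmt.Printf("%+v %v\n", action, ok)
}
```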
Players whose name in the legacy file ends with `_RIP` are emitted with
the suffix stripped and `Extinct: true`.
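
A minimal sketch of the suffix rule; `normalisePlayerName` is a hypothetical helper name and the sample race name is invented.

```go
package main

import (
	"fmt"
	"strings"
)

const ripSuffix = "_RIP"

// normalisePlayerName strips the legacy "_RIP" marker and reports
// whether the player should be emitted with Extinct set to true.
func normalisePlayerName(raw string) (name string, extinct bool) {
	if strings.HasSuffix(raw, ripSuffix) {
		return strings.TrimSuffix(raw, ripSuffix), true
	}
	return raw, false
}

func main() {
	name, extinct := normalisePlayerName("Orcs_RIP")
	fmt.Println(name, extinct)
}
```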
`LocalGroup.ID` is synthesised deterministically from the per-report
group index via `uuid.NewSHA1`, so re-running the converter on the same
input file yields byte-identical JSON.
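
Name-based SHA-1 UUIDs are what make the output reproducible: the same namespace and name always hash to the same identifier. The sketch below reimplements the RFC 4122 version-5 derivation with the standard library only, so it runs without the `uuid` package. It borrows the battle namespace quoted earlier in this section; encoding the index as decimal text is an assumption, so the resulting values need not match the tool's actual IDs.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// sha1UUID derives an RFC 4122 version-5 UUID from a namespace UUID and a name.
func sha1UUID(namespace, name string) (string, error) {
	ns, err := hex.DecodeString(strings.ReplaceAll(namespace, "-", ""))
	if err != nil {
		return "", err
	}
	if len(ns) != 16 {
		return "", fmt.Errorf("namespace %q is not a 16-byte UUID", namespace)
	}
	h := sha1.New()
	h.Write(ns)
	h.Write([]byte(name))
	var u [16]byte
	copy(u[:], h.Sum(nil))      // keep the first 16 bytes of the digest
	u[6] = (u[6] & 0x0f) | 0x50 // version 5
	u[8] = (u[8] & 0x3f) | 0x80 // RFC 4122 variant
	s := hex.EncodeToString(u[:])
	return s[0:8] + "-" + s[8:12] + "-" + s[12:16] + "-" + s[16:20] + "-" + s[20:32], nil
}

// battleNamespace is the battle-identifier namespace given in this README.
const battleNamespace = "be01a000-0000-0000-0000-000000000002"

// battleID hashes the 0-based battle index.
// Assumption: the index is encoded as decimal text before hashing.
func battleID(idx int) (string, error) {
	return sha1UUID(battleNamespace, strconv.Itoa(idx))
}

func main() {
	first, _ := battleID(0)
	again, _ := battleID(0)
	fmt.Println(first, first == again)
}
```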
`LocalGroup.Speed` is left at zero — the legacy "Your Groups" table does
not expose ship speed; the UI can derive it from `pkg/calc.Speed` if
ever required.
Origin / Range names that don't resolve against the parsed planet
tables (foreign-only knowledge the local player lacks) cause the entire
group / fleet / incoming row to be dropped — preferable to fabricating
a destination.
`ShipProduction.ProdUsed` is derived from the on-disk `Percent` and the
producing planet's material/resources via [`pkg/calc.ShipBuildCost`]
(the same helper the engine's `controller.ProduceShip` uses). The
legacy text format does not carry a `prod_used` column directly; the
derivation gives the cumulative production-equivalent of the build
progress so far. The real engine's `ProdUsed` is the per-turn
residual production poured into the partial ship, which is not
recoverable from a single legacy snapshot. The two numbers stay in
the same units and the same ballpark, which is good enough for the
synthetic-mode UI — live engine reports come over the FBS wire and
do not flow through this parser. A ships-in-production row pointing
at a planet that did not appear in `Your Planets` (which would be a
malformed legacy file) is dropped.
## Skipped sections (today)
These `report.Report` fields have no dedicated section in the legacy
text format, so the parser cannot populate them and emits them empty.
Each could become in-scope if a strong enough reason arises (see
"Adding a new field" below).
- `OtherGroup[]` — no top-level legacy section. Foreign groups appear
only inside battle rosters; the synthetic JSON emits
`otherGroup: []`.
- `UnidentifiedGroup[]` — no legacy section at all; synthetic JSON
emits `unidentifiedGroup: []`.
- Cargo routes — no dedicated section in the legacy text format; the
synthetic JSON emits `route: []`. The UI's overlay path
(`applyOrderOverlay`) supports running on top of an empty `routes`.
## Adding a new field
`ui/PLAN.md` carries a global rule: every UI phase that extends
`decodeReport` to read a new `report.Report` field also extends this
parser, in the same PR, to populate it from legacy text — or, if the
field cannot be derived, adds an entry to the **Skipped sections**
list above with a one-line explanation.
The Go side of the rule is enforced mechanically: this tool imports
`galaxy/model/report`, so any backwards-incompatible change to the
schema breaks the tool's compilation before the change ships.
When extending:
1. Identify the legacy section in `tools/local-dev/reports/dg/*.REP`
(and `gplus/*.REP`) that carries the field, using `game/rules.txt`
section "Отчет о результатах хода" as the column-layout reference.
2. Add a section to the state machine in `parser.go`
(`classifySection`, the `section` constants, the `parse*` methods).
3. Cover the new section with a unit test in `parser_test.go` (inline
minimal fixture) and update the smoke counts in
`TestParseDgKNNTS039` / `TestParseGplus40` so a future regression
that drops the section is caught.
4. Run `go test ./tools/local-dev/legacy-report/...`, then re-run the
CLI on `dg/KNNTS039.REP` and `gplus/40.REP` and visually skim the
JSON — the field should appear with sensible values.
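
Step 2 above can be pictured with a toy section state machine. This is a simplified sketch, not the code in `parser.go`: the constants and the matching rules are hypothetical, only four headers are recognised, and column-header lines are counted as ordinary rows.

```go
package main

import (
	"fmt"
	"strings"
)

type section int

const (
	sectionNone section = iota
	sectionYourPlanets
	sectionYourGroups
	sectionUninhabitedPlanets
	sectionOtherPlanets // "<Race> Planets"
)

// classifySection maps a header line to a section, or sectionNone for
// anything else. Exact headers are checked before the generic
// "<Race> Planets" pattern so they are not swallowed by it.
func classifySection(line string) section {
	header := strings.Join(strings.Fields(line), " ")
	switch {
	case header == "Your Planets":
		return sectionYourPlanets
	case header == "Your Groups":
		return sectionYourGroups
	case header == "Uninhabited Planets":
		return sectionUninhabitedPlanets
	case strings.HasSuffix(header, " Planets"):
		return sectionOtherPlanets
	}
	return sectionNone
}

// countRows walks the lines, switches state on every recognised header
// and counts the non-empty lines seen inside each section.
func countRows(lines []string) map[section]int {
	counts := map[section]int{}
	current := sectionNone
	for _, line := range lines {
		if strings.TrimSpace(line) == "" {
			continue
		}
		if s := classifySection(line); s != sectionNone {
			current = s
			continue
		}
		if current != sectionNone {
			counts[current]++
		}
	}
	return counts
}

func main() {
	lines := []string{"Your Planets\r", "1 Terra\r", "2 Luna\r", "", "\tOrcs Planets", "7 Mordor"}
	fmt.Println(countRows(lines))
}
```

A new section would add one constant, one `case` in the classifier, and one row parser dispatched from the current state.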
## Tests
```sh
go test ./tools/local-dev/legacy-report/...
```
Inline fixtures exercise the per-section row parsers; smoke tests
parse the real fixtures under `tools/local-dev/reports/dg/` and
`tools/local-dev/reports/gplus/` and assert top-level counts. The
current smoke set spans:
- **dg/KNNTS039 to KNNTS041** (KnightErrants saga); `041` is the only
one with `Incoming Groups`, exercising deferred name resolution.
- **dg/Killer031** — Killer engine variant with two `Your Fleets`
entries (`Fl1`, `F2`).
- **dg/Tancordia037** — the richest fixture: 311 local groups in
30 fleets, two incoming groups, "Incoming Groups" landing before
"Your Planets".
- **gplus/40.REP** — gplus variant; tabs in headers, pseudo-cyrillic
ship class names, single fleet, ten incoming groups.
Field-level fidelity is the inline tests' responsibility; the smoke
tests catch regressions where a refactor of the section classifier
silently drops a whole table.