R7: trip report + docs/tracker bake-back; mark R7 done
CI / changes (pull_request) Successful in 1s
CI / unit (pull_request) Successful in 9s
CI / integration (pull_request) Successful in 12s
CI / ui (pull_request) Successful in 37s
CI / gate (pull_request) Successful in 0s
CI / deploy (pull_request) Successful in 58s

- loadtest/REPORT-R7.md: the final stress-run report — method, the 500-player resource
  profile, the agreed tuning, the validation (transport_error 2.49% -> 0.72% at 3 gateway
  cores; the burst run showing connection-bound behavior), and the prod-sizing
  recommendation for Stage 18.
- loadtest/README.md: per-player transports, --cpus capping, docker_stats (was cAdvisor),
  the absolute BACKEND_DICT_DIR for ./loadtest/... , and report links.
- docs/TESTING.md + docs/ARCHITECTURE.md: observability now uses the otelcol docker_stats
  receiver (cAdvisor removed); links to both trip reports.
- CLAUDE.md: repo-layout line reflects docker_stats + per-service limits.
- PRERELEASE.md: R7 marked done in the tracker + heading; a Refinements entry recording
  the decisions, findings, applied tuning and validation.

This is the final pre-release hardening phase; Stage 18 (prod cutover) is next.
Ilia Denisov
2026-06-11 11:18:57 +02:00
parent f23da88028
commit 2a48df9b83
6 changed files with 257 additions and 21 deletions
+25 -11
@@ -36,17 +36,21 @@ container on the contour's docker network (this bypasses the host→gateway hair
 # from the repo root
 docker build -f loadtest/Dockerfile -t scrabble-loadtest .
-docker run --rm --name scrabble-loadtest --network scrabble-internal \
+docker run --rm --cpus=3 --name scrabble-loadtest --network scrabble-internal \
 -e POSTGRES_PASSWORD="$TEST_POSTGRES_PASSWORD" \
 scrabble-loadtest run
 ```
-Defaults assume the contour service names: `postgres:5432` and `gateway:8081`. The
-DAWGs are baked into the image (`/opt/dawg`, pinned to the dictionary release). Run with
+Each virtual player gets its own `edge.Client` (its own h2c connection), mirroring real
+clients rather than multiplexing every player over one transport. Defaults assume the
+contour service names: `postgres:5432` and `gateway:8081`. The DAWGs are baked into the
+image (`/opt/dawg`, pinned to the dictionary release). On a host shared with the contour,
+cap the harness (`--cpus=3`) so the contour keeps the spare cores. Run with
 `--name scrabble-loadtest` so the harness's own CPU/memory show up as a `scrabble-*`
-series in cAdvisor (keeping it separable from the system under test). Capture the
-resource baseline from the Grafana **Scrabble — Resources** dashboard
-(cAdvisor + postgres_exporter) while the run is in progress.
+series in the metrics (keeping it separable from the system under test). Capture the
+resource baseline from the Grafana **Scrabble — Resources** dashboard (the otelcol
+`docker_stats` receiver + postgres_exporter), or from `docker stats` directly, while the
+run is in progress.
 ## Commands & flags
@@ -80,15 +84,25 @@ DB wipe (`DROP SCHEMA backend CASCADE` + backend restart).
 ```sh
 go build ./loadtest/...
 go vet ./loadtest/...
-BACKEND_DICT_DIR=../scrabble-solver/dawg go test -count=1 ./loadtest/...
+BACKEND_DICT_DIR="$PWD/../scrabble-solver/dawg" go test -count=1 ./loadtest/...
 ```
 The DAWG-backed `moves` test runs only when `BACKEND_DICT_DIR` is set (as the engine
 tests use); the pure logic (hashing, board replay, rack build, move selection, report)
-runs unconditionally.
+runs unconditionally. Use an **absolute** path (here via `$PWD`): `go test ./loadtest/...`
+runs each package from its own directory, so a relative `BACKEND_DICT_DIR` would not
+resolve.
+## Trip reports
+The two stress passes are written up in the repo: the early pass in
+[`REPORT-R2.md`](REPORT-R2.md) and the final, tuned pass in
+[`REPORT-R7.md`](REPORT-R7.md).
 ## Caveat
-The harness shares the host CPU with the contour, so the early-pass resource baseline
-is read with the harness's own container series in mind; a cleaner number on separate
-hardware is future work. The moderate ramp keeps the generator from being the bottleneck.
+The harness shares the host CPU with the contour, so its own `scrabble-loadtest`
+container series is read alongside the system under test; capping it with `--cpus`
+keeps the contour's quota. Per-player transports (R7) removed the shared-transport
+artifact that inflated R2's `transport_error`, so the figures reflect the system. A
+fully isolated ceiling on separate hardware remains future work.
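
The README change above asks for an absolute `BACKEND_DICT_DIR` because `go test ./loadtest/...` runs each package from its own directory. A minimal sketch of how a caller might guard against a relative value follows. The helper name `abs_dict_dir` is hypothetical and not part of the repo; it only prefixes `$PWD` and does not normalize `..` segments.

```sh
# abs_dict_dir prints its argument as an absolute path.
# Hypothetical helper for illustration; not part of the repository.
# Paths that already start with "/" are passed through unchanged;
# anything else is anchored at the current working directory.
abs_dict_dir() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;
    *)  printf '%s\n' "$PWD/$1" ;;
  esac
}
```

Assuming the same sibling checkout as in the diff, it could be used from the repo root as `BACKEND_DICT_DIR="$(abs_dict_dir ../scrabble-solver/dawg)" go test -count=1 ./loadtest/...`, which matches the `$PWD`-based form the commit introduces.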