R2: stress harness + contour resource observability + early run #33

Merged
developer merged 4 commits from feature/r2-loadtest-observability into development 2026-06-09 23:01:30 +00:00

R2 — Stress harness + contour observability

Builds the reusable load-test harness, adds resource observability to the contour, and (next commit on this branch) runs the early pass + writes the trip report.

New scrabble/loadtest module

  • Seed (direct Postgres, schema backend): 1000 guest + 10000 durable accounts with pre-created sessions; token hash matches backend/internal/session (hex(sha256)), so seeded sessions resolve. Marker-tagged for cleanup.
  • Drive (edge protocol over h2c): real 2–4p games assembled via the invitation flow (no robots). Each player polls game.state, replays game.history, and submits a mid-ranked legal move generated locally by the embedded scrabble-solver (the edge carries no board, so the client rebuilds it from history). A fraction of players also exercise nudge/chat/check-word/draft/profile-update/stats, and each holds a Subscribe stream. Moderate ramp 50 → 200 → 500.
  • Gateway-hammer: exceeds the per-user limit to verify the limiter holds.
  • Report: per-op latency percentiles, throughput, result-code breakdown, event tally.
  • Go unit tests for the pure pieces (hashing, board replay vs board.Parse, rack build, mid-rank, report); DAWG-backed move test under BACKEND_DICT_DIR.
  • loadtest/Dockerfile (distroless, DAWGs baked) + loadtest/README.md.
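
For illustration, the hex(sha256) token hashing that the Seed step relies on can be sketched as follows. This is a minimal sketch, not the harness code: the function names hashToken and newToken, and the 32-byte random token format, are assumptions. Only the hex(sha256) scheme comes from the description above.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashToken returns the lowercase hex encoding of SHA-256(token).
// The name is hypothetical; the scheme is the hex(sha256) named in the PR.
func hashToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

// newToken returns a random 32-byte token, hex-encoded.
// The token format is an assumption made for this sketch.
func newToken() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

func main() {
	tok, err := newToken()
	if err != nil {
		panic(err)
	}
	// A seeder would store hashToken(tok) in the session row and hand
	// tok itself to the virtual player.
	fmt.Println(len(tok), len(hashToken(tok))) // prints "64 64"
}
```

Because the seeder and the backend would compute the same digest over the same token bytes, a session row inserted directly into Postgres resolves the same way as one created through the normal login path.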

Contour observability (deploy/)

  • cadvisor + postgres_exporter services (version-pinned), two Prometheus scrape jobs, and a new Scrabble — Resources Grafana dashboard.

CI / docs

  • ./loadtest/... added to the path filter + vet/build/test.
  • docs/TESTING.md, docs/ARCHITECTURE.md, project CLAUDE.md repo layout.

Locked decisions (owner interview)

Game assembly = invitations · scale = moderate (50/200/500, ~12 min/step) · pass bar = diagnostic · run model = one-shot container on scrabble-internal, in this PR.

Still to land on this branch

The early-pass run against the freshly-deployed contour + loadtest/REPORT-R2.md + the R2 done-marker in PRERELEASE.md.

developer added 1 commit 2026-06-09 21:46:06 +00:00
R2: load-test harness + contour resource observability
CI / changes (pull_request) Successful in 2s
CI / unit (pull_request) Successful in 9s
CI / integration (pull_request) Successful in 11s
CI / ui (pull_request) Successful in 38s
CI / gate (pull_request) Successful in 0s
CI / deploy (pull_request) Failing after 3s
aa137e3558
New scrabble/loadtest module (the pre-release stress harness): seeds 1000 guest +
10000 durable accounts with pre-created sessions directly in Postgres (token hash
matches backend/internal/session), drives virtual players through the edge protocol
(real 2-4p games assembled via invitations, mid-ranked legal moves generated locally
by the embedded scrabble-solver — the edge carries no board, so the client replays
history), plus nudge/chat/check-word/draft/profile/stats and a gateway-hammer that
verifies the rate limiter. Prints a trip-report summary (per-op latency percentiles,
result codes, live-event tally). Go unit tests cover the pure pieces; the DAWG-backed
move test runs under BACKEND_DICT_DIR.
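
One plausible reading of "mid-ranked" is sketched below: rank the solver's candidate moves by score and take the one in the middle of the ranking. The Move type, the midRanked name, and the median-by-score rule are all assumptions for illustration; the harness may define the rank differently.

```go
package main

import (
	"fmt"
	"sort"
)

// Move is a hypothetical candidate play returned by a solver.
type Move struct {
	Word  string
	Score int
}

// midRanked sorts a copy of the candidates by descending score and returns
// the one at the middle index. It reports false when there are no candidates.
func midRanked(moves []Move) (Move, bool) {
	if len(moves) == 0 {
		return Move{}, false
	}
	ranked := append([]Move(nil), moves...) // leave the caller's slice untouched
	sort.SliceStable(ranked, func(i, j int) bool {
		return ranked[i].Score > ranked[j].Score
	})
	return ranked[len(ranked)/2], true
}

func main() {
	candidates := []Move{{"CAT", 10}, {"QUARTZ", 30}, {"HOUSE", 20}}
	m, ok := midRanked(candidates)
	fmt.Println(m.Word, m.Score, ok) // prints "HOUSE 20 true"
}
```

Picking from the middle of the ranking keeps the games realistic in length: always playing the best move would end games quickly, while always playing the worst would drag them out.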

Contour: add cAdvisor + postgres_exporter + a 'Scrabble - Resources' Grafana
dashboard and the two Prometheus scrape jobs, for the R2/R7 stress-run resource
baseline.

CI: gate ./loadtest/... (path filter + vet/build/test). Docs: TESTING, ARCHITECTURE,
project CLAUDE repo layout.
developer added 1 commit 2026-06-09 21:57:37 +00:00
R2: drop ./loadtest from the backend/gateway/telegram image builds
CI / changes (pull_request) Successful in 1s
CI / unit (pull_request) Successful in 8s
CI / integration (pull_request) Successful in 13s
CI / ui (pull_request) Successful in 36s
CI / gate (pull_request) Successful in 0s
CI / deploy (pull_request) Successful in 1m8s
0c55574ddd
Adding the loadtest module to go.work (use ./loadtest + the scrabble/gateway
replace it needs) broke the other services' Docker builds: their reduced
workspace still referenced ./loadtest (not in their build context), failing with
'cannot load module loadtest: open loadtest/go.mod: no such file or directory'.
Each service Dockerfile now also -dropuse=./loadtest; backend and telegram (which
do not COPY ./gateway) additionally -dropreplace the loadtest-only scrabble/gateway
replace. Verified by building all three images plus loadtest locally.
developer added 2 commits 2026-06-09 22:47:27 +00:00
- display-name marker: letters-only 'Zzloadtest' (the editable-name validator
  forbids digits/colons), so profile.update resends the seeded name successfully.
- draft.save: rack_order is a string in the backend draft DTO (was sent as []),
  fixing the bad_request.
Both confirmed ok against the contour. chat_not_your_turn / nudge_own_turn are
by-design turn gates (backend/internal/social/chat.go), correctly exercised.
R2: early-pass trip report + mark R2 done
CI / changes (pull_request) Successful in 1s
CI / unit (pull_request) Successful in 9s
CI / integration (pull_request) Successful in 13s
CI / ui (pull_request) Successful in 37s
CI / gate (pull_request) Successful in 0s
CI / deploy (pull_request) Successful in 57s
a2265a122e
Ran the moderate early pass (50/200/500, 10 min/step) against the contour: ramped
clean to 500 players, 1.2 M edge calls, 48 870 plays, 2 798 games finished, no
crash/deadlock; cleanup removed all 11 000 seeded accounts. The per-user limiter held
under the gateway-hammer (99.97 % rejected, p99 2 ms).

Top finding: ~14 % transport_error on game.state at 500 players under CPU saturation
(backend/gateway/Postgres each ~1 core), amplified by the harness's single shared
http2.Transport (the harness itself peaked at 86 % of a core on the same host).
Observability finding: cAdvisor yields only the root cgroup on the contour host
(separate XFS /var/lib/docker); per-container metrics captured via docker stats; R7
should adopt the otelcol docker_stats receiver. Full report in loadtest/REPORT-R2.md;
PRERELEASE refinements logged; R2 marked done.
owner approved these changes 2026-06-09 22:59:48 +00:00
developer merged commit c23ac94c4e into development 2026-06-09 23:01:30 +00:00
developer deleted branch feature/r2-loadtest-observability 2026-06-09 23:01:30 +00:00

Reference: developer/scrabble-game#33