R2: load-test harness + contour resource observability
CI / changes (pull_request) Successful in 2s
CI / unit (pull_request) Successful in 9s
CI / integration (pull_request) Successful in 11s
CI / ui (pull_request) Successful in 38s
CI / gate (pull_request) Successful in 0s
CI / deploy (pull_request) Failing after 3s
New scrabble/loadtest module (the pre-release stress harness). It seeds 1000 guest + 10000 durable accounts with pre-created sessions directly in Postgres (token hash matches backend/internal/session) and drives virtual players through the edge protocol: real 2-4p games assembled via invitations, with mid-ranked legal moves generated locally by the embedded scrabble-solver (the edge carries no board, so the client replays history), plus nudge/chat/check-word/draft/profile/stats and a gateway-hammer that verifies the rate limiter. It prints a trip-report summary (per-op latency percentiles, result codes, live-event tally). Go unit tests cover the pure pieces; the DAWG-backed move test runs under BACKEND_DICT_DIR.

Contour: add cAdvisor + postgres_exporter + a 'Scrabble - Resources' Grafana dashboard and the two Prometheus scrape jobs, for the R2/R7 stress-run resource baseline.

CI: gate ./loadtest/... (path filter + vet/build/test).

Docs: TESTING, ARCHITECTURE, project CLAUDE repo layout.
@@ -0,0 +1,94 @@
# loadtest — R2 stress harness

Reusable load harness for the pre-release stress pass (`PRERELEASE.md` R2/R7). It
seeds a large account population with pre-created sessions, drives virtual players
through the **gateway edge protocol** in realistic games, hammers the rate limiter,
and prints a trip-report summary. It stays in the repo so the pass can be repeated.

## What it does

1. **Seed** (direct Postgres, schema `backend`): inserts `--durable` durable accounts
   (each with a confirmed email identity) and `--guest` guest accounts, plus an active
   `sessions` row per account, then hands the plaintext bearer tokens to the driver.
   Token hashes match `backend/internal/session` (`hex(sha256(token))`), so the seeded
   sessions resolve. Every row is tagged with the `lt:` marker for cleanup.
2. **Drive** (edge protocol over h2c): assembles real 2–4 player games via the
   invitation flow (`invitation.create` → `invitation.accept`, no robots), then runs
   each player's turn loop: poll `game.state`, replay `game.history`, generate a legal
   **mid-ranked** move with the embedded `scrabble-solver`, and `game.submit_play`
   (or pass/exchange). A fraction of turns exercise nudge / chat / check-word / draft /
   profile-update / stats. Each player also holds a live `Subscribe` stream. The
   moderate ramp is **50 → 200 → 500** concurrent players, ~12 min per step.
3. **Hammer**: drives `games.list` from one account far above the per-user rate limit
   to verify the limiter holds (`rate_limited` results) and measure its cost.
4. **Report**: per-operation latency percentiles, throughput, result-code breakdown,
   live-event tally, and the aggregate error rate.
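
The token handling in the Seed step can be sketched as follows. This is a minimal illustration, not the harness's code: `newToken` and `hashToken` are hypothetical names, and the 32-byte, hex-encoded token format is an assumption. Only the `hex(sha256(token))` form comes from the description above.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// newToken returns a random plaintext bearer token. The 32-byte length and
// hex encoding are assumptions for this sketch, not the harness's format.
func newToken() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// hashToken computes hex(sha256(token)), the form the seeded sessions row
// is described as storing.
func hashToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

func main() {
	token, err := newToken()
	if err != nil {
		panic(err)
	}
	// Only the hash would be written to the database; the plaintext token
	// is handed to the driver and not persisted.
	fmt.Println("token:", token)
	fmt.Println("hash: ", hashToken(token))
}
```

Because only the hash is stored, a later run cannot recover the tokens of an earlier one, which is consistent with `run` re-seeding every time.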

The driver runs the solver **locally** because the edge protocol carries no board: the
client reconstructs it from decoded history (the same invariant as the UI).

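
Reconstructing the board from history might look roughly like the sketch below. The `Placement`, `Play`, and `Board` types and the `Replay` function are hypothetical, as is the 15×15 board size; the real decoded history format is not shown in this README.

```go
package main

import "fmt"

// boardSize assumes a standard 15x15 board.
const boardSize = 15

// Placement is a hypothetical decoded history entry: one tile on one square.
type Placement struct {
	Row, Col int
	Letter   rune
}

// Play is one history entry; passes and exchanges carry no placements.
type Play struct {
	Placements []Placement
}

// Board holds the letter on each square, 0 for an empty square.
type Board [boardSize][boardSize]rune

// Replay rebuilds the board by applying every play in order.
func Replay(history []Play) (Board, error) {
	var b Board
	for i, play := range history {
		for _, p := range play.Placements {
			if p.Row < 0 || p.Row >= boardSize || p.Col < 0 || p.Col >= boardSize {
				return b, fmt.Errorf("play %d: square (%d,%d) is off the board", i, p.Row, p.Col)
			}
			if b[p.Row][p.Col] != 0 {
				return b, fmt.Errorf("play %d: square (%d,%d) is already occupied", i, p.Row, p.Col)
			}
			b[p.Row][p.Col] = p.Letter
		}
	}
	return b, nil
}

func main() {
	history := []Play{
		{Placements: []Placement{{7, 7, 'C'}, {7, 8, 'A'}, {7, 9, 'T'}}},
		{}, // a pass leaves the board unchanged
		{Placements: []Placement{{8, 8, 'T'}}},
	}
	b, err := Replay(history)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b[7][7:10]), string(b[8][8])) // prints "CAT T"
}
```

The rebuilt board, together with the player's rack, is what a local move generator would need as input.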
## Connection model

The harness reaches Postgres and the gateway directly, so run it as a one-shot
container on the contour's docker network (this bypasses the host→gateway hairpin):

```sh
# from the repo root
docker build -f loadtest/Dockerfile -t scrabble-loadtest .

docker run --rm --name scrabble-loadtest --network scrabble-internal \
  -e POSTGRES_PASSWORD="$TEST_POSTGRES_PASSWORD" \
  scrabble-loadtest run
```

Defaults assume the contour service names: `postgres:5432` and `gateway:8081`. The
DAWGs are baked into the image (`/opt/dawg`, pinned to the dictionary release). Run with
`--name scrabble-loadtest` so the harness's own CPU/memory show up as a `scrabble-*`
series in cAdvisor (keeping it separable from the system under test). Capture the
resource baseline from the Grafana **Scrabble — Resources** dashboard
(cAdvisor + postgres_exporter) while the run is in progress.
## Commands & flags

```
loadtest run [flags]       seed, drive the ramp + hammer, print the report
loadtest cleanup [flags]   delete everything the harness seeded (matched by the lt: marker)
```

Key `run` flags (env in parentheses):

| flag | default | meaning |
|------|---------|---------|
| `--gateway` (`LOADTEST_GATEWAY_URL`) | `http://gateway:8081` | gateway base URL |
| `--dsn` (`LOADTEST_DSN`) | from `POSTGRES_*` | backend Postgres DSN (schema `backend`) |
| `--dawg` (`LOADTEST_DAWG_DIR`) | `/dawg` (image: `/opt/dawg`) | committed `*.dawg` directory |
| `--durable` / `--guest` | `10000` / `1000` | accounts to seed |
| `--steps` | `50,200,500` | concurrent-player ramp steps |
| `--step-dur` | `12m` | hold time per step |
| `--games-per-player` | `0` (random 3–5) | target concurrent games per player |
| `--tick` | `800ms` | per-player op cadence (keeps a player under the per-user limit) |
| `--secondary-prob` | `0.08` | chance per tick of a non-move op |
| `--hammer-workers` / `--hammer-dur` | `20` / `15s` | gateway-hammer (0 workers disables) |
| `--reset` / `--cleanup` | `false` | delete harness rows before / after the run |

`run` re-seeds every time (plaintext tokens are never stored), so pass `--reset` to
clear a prior run's rows first. The authoritative hard reset of the contour remains the
DB wipe (`DROP SCHEMA backend CASCADE` + backend restart).

## Build & test

```sh
go build ./loadtest/...
go vet ./loadtest/...
BACKEND_DICT_DIR=../scrabble-solver/dawg go test -count=1 ./loadtest/...
```

The DAWG-backed `moves` test runs only when `BACKEND_DICT_DIR` is set (the same
convention the engine tests use); the pure logic (hashing, board replay, rack build,
move selection, report) runs unconditionally.

## Caveat

The harness shares the host CPU with the contour, so the early-pass resource baseline
should be read with the harness's own container series in mind; a cleaner number on
separate hardware is an R7 goal. The moderate ramp keeps the load generator from being
the bottleneck.