Stage 5: robot opponent (pool, seed-derived strategy, move driver, matchmaker substitution)

- internal/robot: durable kind='robot' account pool (migration 00004); every
  per-game and per-turn choice derived deterministically from the game seed
  (restart-stable FNV mix); a background move driver; margin targeting (band
  1-30, closest-to-band); right-skewed [2,90]min delays (median ~10m);
  opponent-anchored sleep with +/-3h drift; daytime nudge reply + proactive
  12h nudge; friend/chat blocked via profile toggles.
- engine.Candidates (decoded ranked plays); game.Candidates + RobotTurns;
  social.LastNudgeAt.
- matchmaker: 10s wait then robot substitution (reaper) + Poll delivery seam.
- config (BACKEND_ROBOT_DRIVE_INTERVAL, BACKEND_LOBBY_ROBOT_WAIT,
  BACKEND_LOBBY_REAPER_INTERVAL); main wiring + boot-time pool provisioning.
- metrics: robot account_stats (authoritative balance) + robot_games_finished_total
  OTel counter + per-finish log.
- docs: PLAN, ARCHITECTURE, FUNCTIONAL(+ru), TESTING, README; account.go comment.
- tests: robot strategy units, matchmaker reaper/Poll, engine.Candidates; inttest
  robot full-game / substitution / proactive-nudge.
Ilia Denisov
2026-06-02 21:02:20 +02:00
parent 12fc6e498e
commit 85baabe4ba
26 changed files with 1700 additions and 85 deletions
+42 -18
@@ -87,7 +87,8 @@ arrive from a platform rather than completing a mandatory registration).
a platform auto-provisions a durable account bound to that platform identity.
Concretely, platform and email identities share one `identities` table keyed by
a unique `(kind, external_id)`; email is an identity with `kind=email` and a
`confirmed` flag. The **email confirm-code flow** (Stage 4) binds an email to the
`confirmed` flag. A synthetic `kind='robot'` identity (Stage 5) backs each pooled
robot opponent (§7). The **email confirm-code flow** (Stage 4) binds an email to the
authenticated account: a 6-digit code (stored only as a SHA-256 hash, 15-minute
TTL, ≤ 5 attempts) is sent through a `Mailer` seam (an SMTP relay, or a
development log mailer when none is configured) and, once verified, attaches a
@@ -191,20 +192,37 @@ Key points:
## 7. Robot opponent
Substitutes for a human in 2-player auto-match when the pool yields no human
within 10 seconds. Designed to be indistinguishable from a person.
within 10 seconds (§8). It lives in `internal/robot` and plays as an ordinary
seated account through the game service, so only `internal/engine` imports the
solver. It is designed to be indistinguishable from a person.
The robot keeps **no per-game state**: every choice is derived deterministically
from the game's bag `seed` (a restart-stable FNV-1a mix), so a background driver
(`robot.Service.Run`, mirroring the turn-timeout sweeper) recomputes the same
behaviour on every scan and after a restart — the same philosophy as journal
replay. A pool of durable accounts — each a `kind='robot'` identity (§4),
provisioned at startup with chat and friend requests blocked — backs the
human-like name pool; those two profile toggles are all the friend/DM blocking
requires (there is no DM surface; chat is per-game).
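The seed-derived, stateless approach described above can be sketched as follows. This is only an illustration under stated assumptions: `mix`, `unit`, `playsToWin`, the label strings and the byte layout are hypothetical names chosen here, not the actual `internal/robot` API. Only the use of FNV-1a and the ≈ 0.40 threshold come from the text.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// mix derives a stable 64-bit value from the game seed, a purpose label and a
// turn number using FNV-1a. The algorithm and byte order are fixed, so the
// result is identical across processes and restarts.
// (Hypothetical helper: the real mix may combine its inputs differently.)
func mix(seed uint64, label string, turn uint32) uint64 {
	var buf [12]byte
	binary.BigEndian.PutUint64(buf[:8], seed)
	binary.BigEndian.PutUint32(buf[8:], turn)

	h := fnv.New64a()
	h.Write(buf[:])
	h.Write([]byte(label))
	return h.Sum64()
}

// unit maps a mixed value onto [0, 1) using its top 53 bits.
func unit(v uint64) float64 {
	return float64(v>>11) / float64(1<<53)
}

// playsToWin is the once-per-game decision: it depends on the seed alone, so
// the driver can recompute it on every scan instead of storing it.
func playsToWin(seed uint64) bool {
	return unit(mix(seed, "play-to-win", 0)) < 0.40
}

func main() {
	seed := uint64(0xC0FFEE) // arbitrary example seed
	// Recomputing the decision never changes it.
	fmt.Println(playsToWin(seed) == playsToWin(seed))
	// Per-turn choices use the turn number as an extra input.
	fmt.Println(unit(mix(seed, "delay", 1)), unit(mix(seed, "delay", 2)))
}
```

Because nothing is stored, a driver that crashes between scans loses nothing: the next scan derives the same values from the same seed.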
- **Balance**: at game start it decides once whether to play to win, with
`P(play-to-win) ≈ 0.40` (so the human wins ≈ 60%). Adaptive difficulty is
post-MVP.
- **Margin targeting**: each turn it picks from `GenerateMoves` a move that
keeps the resulting lead (when playing to win) or deficit (when playing to
  lose) small (≈ 1–20 points), rather than always the maximum.
`P(play-to-win) ≈ 0.40` (so the human wins ≈ 60%), derived from the seed.
Adaptive difficulty is post-MVP.
- **Margin targeting**: each turn it picks from the ranked candidates
(`engine.Candidates`) the move whose resulting lead (playing to win) or deficit
  (playing to lose) is closest to a small band (**1–30 points**), rather than
always the maximum; with no legal play it exchanges a full rack when the bag can
refill it, else passes.
- **Timing**: per-move delay sampled from a right-skewed distribution (short
  delays frequent), clamped to **[2, 90] minutes**; **sleeps 00:00–07:00** in
the opponent's profile timezone (fallback UTC); on a daytime nudge after 60
  minutes idle it replies within **2–10 minutes**; it proactively nudges the
human after 12 hours idle.
- Blocks friend requests and direct messages; uses a human-like name pool.
delays frequent, median ≈ 10 min), clamped to **[2, 90] minutes**; it
  **sleeps 00:00–07:00** anchored to the **opponent's** profile timezone with a
per-game drift of **±3 h** (fallback UTC), so its night overlaps the human's
rather than running anti-phase; on a daytime nudge it replies within
  **2–10 minutes**; it proactively nudges the human after **12 hours** idle
(subject to the once-per-hour chat limit).
- **Observability**: robot accounts accrue ordinary statistics (§9) — the
authoritative balance metric (target ≈ 40% robot wins) — and a
`robot_games_finished_total` OTel counter plus a per-finish log give a live view.
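The margin-targeting and sleep-window rules in the list above can be illustrated with a small sketch. Everything here is an assumption made for the example: the `Candidate` type, `pick`, `distToBand` and `asleep` are hypothetical, and the real `engine.Candidates` result and robot strategy may be shaped differently. The 1–30 band, the closest-to-band rule and the drifting 7-hour night window are taken from the text.

```go
package main

import "fmt"

// Candidate is a hypothetical stand-in for one ranked play.
type Candidate struct {
	Word  string
	Score int
}

// Target band for the resulting lead (or deficit), in points.
const bandLo, bandHi = 1, 30

// distToBand reports how far a margin lies outside the band (0 when inside).
func distToBand(margin int) int {
	switch {
	case margin < bandLo:
		return bandLo - margin
	case margin > bandHi:
		return margin - bandHi
	default:
		return 0
	}
}

// pick returns the index of the candidate whose resulting margin is closest to
// the band, or -1 when there are no candidates (the caller would then exchange
// or pass). lead is the robot's score minus the opponent's before the move.
// Ties keep the earlier, higher-ranked candidate.
func pick(cands []Candidate, lead int, toWin bool) int {
	best, bestDist := -1, 0
	for i, c := range cands {
		margin := lead + c.Score // lead after this play
		if !toWin {
			margin = -margin // measure the deficit instead
		}
		if d := distToBand(margin); best == -1 || d < bestDist {
			best, bestDist = i, d
		}
	}
	return best
}

// asleep reports whether localHour (0-23, in the opponent's timezone) falls in
// the 7-hour night window that starts at 00:00 shifted by driftHours. The
// modular arithmetic handles windows that wrap past midnight.
func asleep(localHour, driftHours int) bool {
	start := ((driftHours % 24) + 24) % 24
	return (localHour-start+24)%24 < 7
}

func main() {
	cands := []Candidate{{"QUARTZ", 62}, {"ZEBRA", 35}, {"TREE", 18}, {"AT", 4}}
	fmt.Println(pick(cands, -5, true), pick(cands, -5, false))
	fmt.Println(asleep(23, -3), asleep(5, -3))
}
```

With a drift of -3 h the window becomes 21:00–04:00, which is why the wrap past midnight has to be handled explicitly.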
## 8. Lobby & social
@@ -212,8 +230,10 @@ within 10 seconds. Designed to be indistinguishable from a person.
fixes the board language), pairing the next two humans into a two-player
auto-match with the seat order randomised for first-move fairness. The pool is
lost on restart (players re-queue) and is anonymous, so it does not consult
blocks. The 10 s wait and the **robot substitution** for a missing human are
added in Stage 5.
blocks. After **10 s** with no human a background reaper substitutes a pooled
robot (§7) and starts the game. A queued player learns of a pairing or a
substitution through the matchmaker's `Poll`, which serves as the interim delivery
seam until the live match-found notification (§10) arrives.
- **Friends**: a **request → accept** graph (one `friendships` table) — add by
friend list or internal ID now, by platform deep-link with Stage 8. Declining or
cancelling removes the pending request; blocking someone severs an existing
@@ -252,7 +272,8 @@ within 10 seconds. Designed to be indistinguishable from a person.
keys are application-generated **UUIDv7**.
- Tables: `accounts` (durable internal accounts; Stage 3 added the away-window
columns `away_start`/`away_end` and the hint wallet `hint_balance`),
`identities` (platform/email identities, unique `(kind, external_id)`),
`identities` (platform/email/robot identities, unique `(kind, external_id)`;
Stage 5's migration `00004` admits the `robot` kind),
`sessions` (revoke-only opaque-token hashes), the Stage 3 game tables
`games` (Stage 4 added the `dropout_tiles` disposition column), `game_players`,
`game_moves` (the move journal), `complaints` and `account_stats`, and the
@@ -301,9 +322,12 @@ does not cover.
Two channels: **platform-native push** (out-of-app, via the platform
side-service — your-turn, nudge) and the **in-app live stream** (chat,
opponent-moved, while the app is open). Backend emits notification intents;
delivery fans out to the appropriate channel. Stage 4 **persists** the
notification-worthy events (chat messages and nudges) but does not yet deliver
them: the gRPC stream to the gateway and the platform push arrive in Stage 6 / 8.
delivery fans out to the appropriate channel. A **match-found** event (a human
pairing or a robot substitution in auto-match, §8) belongs to the same fabric.
Stage 4 **persists** the notification-worthy events (chat messages and nudges) but
does not yet deliver them, and Stage 5's match-found has no live channel yet: the
gRPC stream to the gateway and the platform push arrive in Stage 6 / 8. Until then
a waiting client retrieves its started game by polling the matchmaker (`Poll`).
## 11. Observability
+8 -3
@@ -49,9 +49,14 @@ the bag or removed from play) is chosen when the game is created, and the leaver
rack is never shown to the others.
### Robot opponent *(Stage 5)*
Indistinguishable-from-human substitute in auto-match. Decides once whether to
play to win (~40%), targets a small score margin, plays with human-like timing
and a night sleep window, and nudges/answers nudges like a person.
When auto-match finds no human within ten seconds, a robot opponent takes the empty
seat so the game starts without waiting. It is meant to feel like a person: it
decides once per game whether to play to win (about 40% of the time, so the human
wins most games), aims for a close score rather than crushing or throwing the game,
and plays at a human pace — short thinking times for most moves, the occasional long
one, and a night-time pause that tracks the player's own day. It answers a nudge
within a few minutes and nudges back when the player has been away a long time. It
carries a human-like name and neither chats nor accepts friend requests.
### Social: friends, block, chat, nudge *(Stage 4)*
Send a friend request and have it accepted (decline or cancel withdraws it,
+8 -3
@@ -48,9 +48,14 @@ session token; the backend maps it to the internal
shown to the others.
### Robot opponent *(Stage 5)*
A stand-in for auto-match that is indistinguishable from a human. It decides once
whether to play to win (~40%), aims for a small score margin, moves with human-like
timing and a night sleep, and sends and receives nudges like a person.
If auto-match does not find a human within ten seconds, a robot opponent takes the
free seat and the game starts without waiting. It is designed to be indistinguishable
from a human: it decides once per game whether to play to win (about 40% of the time,
so the human wins most games), aims for a close score rather than a rout or a thrown
game, and moves at a human pace: mostly short thinking times, occasionally long ones,
and a night pause aligned with the player's day. It answers a nudge within a few
minutes and sends a nudge itself when the player has been gone a long time. It carries
a human-like name, does not chat, and does not accept friend requests.
### Social: friends, block, chat, nudge *(Stage 4)*
A friend request and its acceptance (declining or cancelling withdraws the request, removal –
+17 -8
@@ -32,21 +32,30 @@ tests or touching CI.
Postgres-backed integration tests in `inttest` (full lifecycle to a natural
end, **journal-replay equivalence**, the turn-timeout sweep with away-window
grace, resign win/loss and statistics, the hint allowance-then-wallet policy,
word-check and complaint capture, and per-game-lock serialisation). The robot
balance/margin regression tests arrive with Stage 5. Stage 4 adds the engine's
**multi-player drop-out** cases (continue after one resign, last-survivor win,
the tile-disposition bag effect) and a domain integration test for a 3-player
**timeout that continues**.
word-check and complaint capture, and per-game-lock serialisation). Stage 4 adds
the engine's **multi-player drop-out** cases (continue after one resign,
last-survivor win, the tile-disposition bag effect) and a domain integration test
for a 3-player **timeout that continues**. The engine also gains a `Candidates`
ranked/decoded test (Stage 5).
- **Social & lobby** *(Stage 4+)* – `backend/internal/social` unit-tests the chat
**content filter** (links/emails/phones plus obfuscated forms) and
`backend/internal/lobby` unit-tests the in-memory **matchmaker** (FIFO pairing,
cancel, per-variant pools) with a fake game creator. Postgres-backed `inttest`
covers the friend request/accept lifecycle with the block/toggle guards, the
per-user block (and its severing of friendships), chat post/list with the IP,
cancel, per-variant pools, plus the Stage 5 **robot substitution** reaper and
`Poll` delivery) with fake game-creator and robot-provider seams. Postgres-backed
`inttest` covers the friend request/accept lifecycle with the block/toggle guards,
the per-user block (and its severing of friendships), chat post/list with the IP,
content and block-visibility rules, the nudge turn/rate-limit rules, the
invitation flow (all-accept starts the game, decline cancels, lazy expiry,
inviter-only cancel), and the email confirm-code flow (request/confirm, taken
email, expiry and attempt-cap) with a fixture mailer.
- **Robot** *(Stage 5+)* – `backend/internal/robot` unit-tests the pure strategy:
the ≈ 40% play-to-win split over many seeds, the right-skewed move-delay
(bounds, ~10-min median, determinism), the margin selection (win/lose, in-band
and out-of-band fallbacks, no-play exchange/pass), the sleep window with drift
and the midnight wrap, and mix restart-stability. Postgres-backed `inttest`
drives a robot through a full auto-match to a natural end (asserting a robot
statistics row), the matchmaker substitution end-to-end (enqueue → reap →
`[human, robot]`, discoverable via `Poll`), and a proactive 12-hour nudge.
## Principles