Stage 5: robot opponent (pool, seed-derived strategy, move driver, matchmaker substitution)
- internal/robot: durable kind='robot' account pool (migration 00004); every per-game and per-turn choice derived deterministically from the game seed (restart-stable FNV mix); a background move driver; margin targeting (band 1-30, closest-to-band); right-skewed [2,90]min delays (median ~10m); opponent-anchored sleep with +/-3h drift; daytime nudge reply + proactive 12h nudge; friend/chat blocked via profile toggles. - engine.Candidates (decoded ranked plays); game.Candidates + RobotTurns; social.LastNudgeAt. - matchmaker: 10s wait then robot substitution (reaper) + Poll delivery seam. - config (BACKEND_ROBOT_DRIVE_INTERVAL, BACKEND_LOBBY_ROBOT_WAIT, BACKEND_LOBBY_REAPER_INTERVAL); main wiring + boot-time pool provisioning. - metrics: robot account_stats (authoritative balance) + robot_games_finished_total OTel counter + per-finish log. - docs: PLAN, ARCHITECTURE, FUNCTIONAL(+ru), TESTING, README; account.go comment. - tests: robot strategy units, matchmaker reaper/Poll, engine.Candidates; inttest robot full-game / substitution / proactive-nudge.
This commit is contained in:
+42
-18
@@ -87,7 +87,8 @@ arrive from a platform rather than completing a mandatory registration).
|
||||
a platform auto-provisions a durable account bound to that platform identity.
|
||||
Concretely, platform and email identities share one `identities` table keyed by
|
||||
a unique `(kind, external_id)`; email is an identity with `kind=email` and a
|
||||
`confirmed` flag. The **email confirm-code flow** (Stage 4) binds an email to the
|
||||
`confirmed` flag. A synthetic `kind='robot'` identity (Stage 5) backs each pooled
|
||||
robot opponent (§7). The **email confirm-code flow** (Stage 4) binds an email to the
|
||||
authenticated account: a 6-digit code (stored only as a SHA-256 hash, 15-minute
|
||||
TTL, ≤ 5 attempts) is sent through a `Mailer` seam (an SMTP relay, or a
|
||||
development log mailer when none is configured) and, once verified, attaches a
|
||||
@@ -191,20 +192,37 @@ Key points:
|
||||
## 7. Robot opponent
|
||||
|
||||
Substitutes for a human in 2-player auto-match when the pool yields no human
|
||||
within 10 seconds. Designed to be indistinguishable from a person.
|
||||
within 10 seconds (§8). It lives in `internal/robot` and plays as an ordinary
|
||||
seated account through the game service, so only `internal/engine` imports the
|
||||
solver. It is designed to be indistinguishable from a person.
|
||||
|
||||
The robot keeps **no per-game state**: every choice is derived deterministically
|
||||
from the game's bag `seed` (a restart-stable FNV-1a mix), so a background driver
|
||||
(`robot.Service.Run`, mirroring the turn-timeout sweeper) recomputes the same
|
||||
behaviour on every scan and after a restart — the same philosophy as journal
|
||||
replay. A pool of durable accounts — each a `kind='robot'` identity (§4),
|
||||
provisioned at startup with chat and friend requests blocked — backs the
|
||||
human-like name pool; those two profile toggles are all the friend/DM blocking
|
||||
requires (there is no DM surface; chat is per-game).
|
||||
|
||||
- **Balance**: at game start it decides once whether to play to win, with
|
||||
`P(play-to-win) ≈ 0.40` (so the human wins ≈ 60%). Adaptive difficulty is
|
||||
post-MVP.
|
||||
- **Margin targeting**: each turn it picks from `GenerateMoves` a move that
|
||||
keeps the resulting lead (when playing to win) or deficit (when playing to
|
||||
lose) small (≈ 1–20 points), rather than always the maximum.
|
||||
`P(play-to-win) ≈ 0.40` (so the human wins ≈ 60%), derived from the seed.
|
||||
Adaptive difficulty is post-MVP.
|
||||
- **Margin targeting**: each turn it picks from the ranked candidates
|
||||
(`engine.Candidates`) the move whose resulting lead (playing to win) or deficit
|
||||
(playing to lose) is closest to a small band (**1–30 points**), rather than
|
||||
always the maximum; with no legal play it exchanges a full rack when the bag can
|
||||
refill it, else passes.
|
||||
- **Timing**: per-move delay sampled from a right-skewed distribution (short
|
||||
delays frequent), clamped to **[2, 90] minutes**; **sleeps 00:00–07:00** in
|
||||
the opponent's profile timezone (fallback UTC); on a daytime nudge after 60
|
||||
minutes idle it replies within **2–10 minutes**; it proactively nudges the
|
||||
human after 12 hours idle.
|
||||
- Blocks friend requests and direct messages; uses a human-like name pool.
|
||||
delays frequent, median ≈ 10 min), clamped to **[2, 90] minutes**; it
|
||||
**sleeps 00:00–07:00** anchored to the **opponent's** profile timezone with a
|
||||
per-game drift of **±3 h** (fallback UTC), so its night overlaps the human's
|
||||
rather than running anti-phase; on a daytime nudge it replies within
|
||||
**2–10 minutes**; it proactively nudges the human after **12 hours** idle
|
||||
(subject to the once-per-hour chat limit).
|
||||
- **Observability**: robot accounts accrue ordinary statistics (§9) — the
|
||||
authoritative balance metric (target ≈ 40% robot wins) — and a
|
||||
`robot_games_finished_total` OTel counter plus a per-finish log give a live view.
|
||||
|
||||
## 8. Lobby & social
|
||||
|
||||
@@ -212,8 +230,10 @@ within 10 seconds. Designed to be indistinguishable from a person.
|
||||
fixes the board language), pairing the next two humans into a two-player
|
||||
auto-match with the seat order randomised for first-move fairness. The pool is
|
||||
lost on restart (players re-queue) and is anonymous, so it does not consult
|
||||
blocks. The 10 s wait and the **robot substitution** for a missing human are
|
||||
added in Stage 5.
|
||||
blocks. After **10 s** with no human a background reaper substitutes a pooled
|
||||
robot (§7) and starts the game. A queued player learns of a pairing or a
|
||||
substitution through the matchmaker's `Poll`, the interim delivery seam until the
|
||||
live match-found notification (§10).
|
||||
- **Friends**: a **request → accept** graph (one `friendships` table) — add by
|
||||
friend list or internal ID now, by platform deep-link with Stage 8. Declining or
|
||||
cancelling removes the pending request; blocking someone severs an existing
|
||||
@@ -252,7 +272,8 @@ within 10 seconds. Designed to be indistinguishable from a person.
|
||||
keys are application-generated **UUIDv7**.
|
||||
- Tables: `accounts` (durable internal accounts; Stage 3 added the away-window
|
||||
columns `away_start`/`away_end` and the hint wallet `hint_balance`),
|
||||
`identities` (platform/email identities, unique `(kind, external_id)`),
|
||||
`identities` (platform/email/robot identities, unique `(kind, external_id)`;
|
||||
Stage 5's migration `00004` admits the `robot` kind),
|
||||
`sessions` (revoke-only opaque-token hashes), the Stage 3 game tables
|
||||
`games` (Stage 4 added the `dropout_tiles` disposition column), `game_players`,
|
||||
`game_moves` (the move journal), `complaints` and `account_stats`, and the
|
||||
@@ -301,9 +322,12 @@ does not cover.
|
||||
Two channels: **platform-native push** (out-of-app, via the platform
|
||||
side-service — your-turn, nudge) and the **in-app live stream** (chat,
|
||||
opponent-moved, while the app is open). Backend emits notification intents;
|
||||
delivery fans out to the appropriate channel. Stage 4 **persists** the
|
||||
notification-worthy events (chat messages and nudges) but does not yet deliver
|
||||
them: the gRPC stream to the gateway and the platform push arrive in Stage 6 / 8.
|
||||
delivery fans out to the appropriate channel. A **match-found** event (a human
|
||||
pairing or a robot substitution in auto-match, §8) belongs to the same fabric.
|
||||
Stage 4 **persists** the notification-worthy events (chat messages and nudges) but
|
||||
does not yet deliver them, and Stage 5's match-found has no live channel yet: the
|
||||
gRPC stream to the gateway and the platform push arrive in Stage 6 / 8. Until then
|
||||
a waiting client retrieves its started game by polling the matchmaker (`Poll`).
|
||||
|
||||
## 11. Observability
|
||||
|
||||
|
||||
+8
-3
@@ -49,9 +49,14 @@ the bag or removed from play) is chosen when the game is created, and the leaver
|
||||
rack is never shown to the others.
|
||||
|
||||
### Robot opponent *(Stage 5)*
|
||||
Indistinguishable-from-human substitute in auto-match. Decides once whether to
|
||||
play to win (~40%), targets a small score margin, plays with human-like timing
|
||||
and a night sleep window, and nudges/answers nudges like a person.
|
||||
When auto-match finds no human within ten seconds, a robot opponent takes the empty
|
||||
seat so the game starts without waiting. It is meant to feel like a person: it
|
||||
decides once per game whether to play to win (about 40% of the time, so the human
|
||||
wins most games), aims for a close score rather than crushing or throwing the game,
|
||||
and plays at a human pace — short thinking times for most moves, the occasional long
|
||||
one, and a night-time pause that tracks the player's own day. It answers a nudge
|
||||
within a few minutes and nudges back when the player has been away a long time. It
|
||||
carries a human-like name and neither chats nor accepts friend requests.
|
||||
|
||||
### Social: friends, block, chat, nudge *(Stage 4)*
|
||||
Send a friend request and have it accepted (decline or cancel withdraws it,
|
||||
|
||||
@@ -48,9 +48,14 @@ session-токен; backend сопоставляет его с внутренн
|
||||
показывается остальным.
|
||||
|
||||
### Робот-соперник *(Stage 5)*
|
||||
Неотличимый от человека дублёр в авто-подборе. Один раз решает, играть ли на
|
||||
победу (~40%), целится в небольшой отрыв по очкам, ходит с человеческим
|
||||
таймингом и ночным сном, делает и принимает nudge как человек.
|
||||
Если авто-подбор не находит человека за десять секунд, свободное место занимает
|
||||
робот-соперник, и партия стартует без ожидания. Он задуман неотличимым от человека:
|
||||
один раз за партию решает, играть ли на победу (примерно в 40% случаев, так что
|
||||
человек выигрывает большинство партий), целится в близкий счёт, а не в разгром или
|
||||
поддавки, и ходит с человеческим темпом — чаще короткие раздумья, изредка долгие, и
|
||||
ночная пауза, подстроенная под день игрока. На nudge отвечает за несколько минут и
|
||||
сам шлёт nudge, когда игрок надолго пропал. Носит человекоподобное имя, не общается
|
||||
в чате и не принимает заявки в друзья.
|
||||
|
||||
### Социальное: друзья, блок, чат, nudge *(Stage 4)*
|
||||
Заявка в друзья и её принятие (отклонение или отмена снимают заявку, удаление —
|
||||
|
||||
+17
-8
@@ -32,21 +32,30 @@ tests or touching CI.
|
||||
Postgres-backed integration tests in `inttest` (full lifecycle to a natural
|
||||
end, **journal-replay equivalence**, the turn-timeout sweep with away-window
|
||||
grace, resign win/loss and statistics, the hint allowance-then-wallet policy,
|
||||
word-check and complaint capture, and per-game-lock serialisation). The robot
|
||||
balance/margin regression tests arrive with Stage 5. Stage 4 adds the engine's
|
||||
**multi-player drop-out** cases (continue after one resign, last-survivor win,
|
||||
the tile-disposition bag effect) and a domain integration test for a 3-player
|
||||
**timeout that continues**.
|
||||
word-check and complaint capture, and per-game-lock serialisation). Stage 4 adds
|
||||
the engine's **multi-player drop-out** cases (continue after one resign,
|
||||
last-survivor win, the tile-disposition bag effect) and a domain integration test
|
||||
for a 3-player **timeout that continues**. The engine also gains a `Candidates`
|
||||
ranked/decoded test (Stage 5).
|
||||
- **Social & lobby** *(Stage 4+)* — `backend/internal/social` unit-tests the chat
|
||||
**content filter** (links/emails/phones plus obfuscated forms) and
|
||||
`backend/internal/lobby` unit-tests the in-memory **matchmaker** (FIFO pairing,
|
||||
cancel, per-variant pools) with a fake game creator. Postgres-backed `inttest`
|
||||
covers the friend request/accept lifecycle with the block/toggle guards, the
|
||||
per-user block (and its severing of friendships), chat post/list with the IP,
|
||||
cancel, per-variant pools, plus the Stage 5 **robot substitution** reaper and
|
||||
`Poll` delivery) with fake game-creator and robot-provider seams. Postgres-backed
|
||||
`inttest` covers the friend request/accept lifecycle with the block/toggle guards,
|
||||
the per-user block (and its severing of friendships), chat post/list with the IP,
|
||||
content and block-visibility rules, the nudge turn/rate-limit rules, the
|
||||
invitation flow (all-accept starts the game, decline cancels, lazy expiry,
|
||||
inviter-only cancel), and the email confirm-code flow (request/confirm, taken
|
||||
email, expiry and attempt-cap) with a fixture mailer.
|
||||
- **Robot** *(Stage 5+)* — `backend/internal/robot` unit-tests the pure strategy:
|
||||
the ≈ 40% play-to-win split over many seeds, the right-skewed move-delay
|
||||
(bounds, ~10-min median, determinism), the margin selection (win/lose, in-band
|
||||
and out-of-band fallbacks, no-play exchange/pass), the sleep window with drift
|
||||
and the midnight wrap, and mix restart-stability. Postgres-backed `inttest`
|
||||
drives a robot through a full auto-match to a natural end (asserting a robot
|
||||
statistics row), the matchmaker substitution end-to-end (enqueue → reap →
|
||||
`[human, robot]`, discoverable via `Poll`), and a proactive 12-hour nudge.
|
||||
|
||||
## Principles
|
||||
|
||||
|
||||
Reference in New Issue
Block a user