feat: backend service
# Domain and Protocol Flows

This document collects the multi-step interactions inside `backend`
that span domain modules. Each section assumes the reader is familiar
with `../README.md` and `../../ARCHITECTURE.md`.

## Registration (send + confirm)

```mermaid
sequenceDiagram
    participant Client
    participant Gateway
    participant Auth
    participant User
    participant Geo
    participant Mail
    participant Mailpit as SMTP relay

    Client->>Gateway: POST /api/v1/public/auth/send-email-code<br/>body: {email}<br/>header: Accept-Language
    Gateway->>Auth: forward + Accept-Language
    Auth->>Auth: hash code (bcrypt cost 10)
    Auth->>Auth: persist auth_challenges row<br/>(stores preferred_language)
    Auth->>Mail: EnqueueLoginCode(email, code, ttl)
    Mail-->>Auth: delivery_id
    Auth-->>Gateway: 200 {challenge_id}
    Gateway-->>Client: 200 {challenge_id}
    Mail->>Mailpit: SMTP delivery (worker)

    Client->>Gateway: POST /api/v1/public/auth/confirm-email-code<br/>body: {challenge_id, code, client_public_key, time_zone}
    Gateway->>Auth: forward
    Auth->>Auth: SELECT FOR UPDATE auth_challenges<br/>(increment attempts, enforce ceiling)
    Auth->>Auth: bcrypt verify
    Auth->>User: EnsureByEmail(email, preferred_language, time_zone, source_ip)
    User->>User: insert account if missing<br/>(synth Player-XXXXXXXX)
    User->>Geo: SetDeclaredCountryAtRegistration(user_id, source_ip)
    User-->>Auth: user_id
    Auth->>Auth: SELECT FOR UPDATE again,<br/>mark consumed,<br/>insert device_session,<br/>cache write-through
    Auth-->>Gateway: 200 {device_session_id}
    Gateway-->>Client: 200 {device_session_id}
```

Re-confirming the same `challenge_id` returns the existing session and
clears the throttle window (the throttle reuses the latest unconsumed
challenge rather than dropping the request). `accounts.user_name` is
synthesised once and never overwritten on subsequent sign-ins, so the
same account always keeps the same handle.

## Authenticated request lifecycle

```mermaid
sequenceDiagram
    participant Client
    participant Gateway
    participant Backend HTTP
    participant Cache
    participant Domain
    participant Postgres

    Client->>Gateway: signed gRPC ExecuteCommand
    Gateway->>Gateway: verify signature, payload_hash,<br/>freshness, anti-replay
    Gateway->>Backend HTTP: GET /api/v1/internal/sessions/{id}
    Backend HTTP-->>Gateway: 200 {user_id, status:active}
    Gateway->>Backend HTTP: forward command<br/>as REST + X-User-ID
    Backend HTTP->>Cache: lookup
    Cache-->>Backend HTTP: hit / miss
    alt cache miss
        Backend HTTP->>Postgres: read
        Postgres-->>Backend HTTP: row
        Backend HTTP->>Cache: warm
    end
    Backend HTTP->>Domain: business logic
    Domain->>Postgres: write
    Domain->>Cache: write-through after commit
    Domain-->>Backend HTTP: result
    Backend HTTP-->>Gateway: JSON
    Gateway->>Gateway: encode FlatBuffers,<br/>sign response envelope
    Gateway-->>Client: signed gRPC response
```

`X-User-ID` is the sole identity input on the user surface. The geo
counter middleware calls `geo.IncrementCounterAsync` after the handler
returns successfully; the request itself does not block on the
increment.

## Lobby state machine and Race Name Directory

The lobby state machine is the closed transition graph below. Owner
endpoints (or admin overrides for public games whose owner is NULL)
drive forward transitions; the runtime callback is the only path that
flips `starting → running`. Every transition checks ownership, target
state, and idempotency.

```mermaid
stateDiagram-v2
    [*] --> draft
    draft --> enrollment_open: open-enrollment
    enrollment_open --> ready_to_start: ready-to-start (auto on min_players)
    ready_to_start --> starting: start
    starting --> running: runtime ack
    starting --> start_failed: runtime error
    start_failed --> ready_to_start: retry-start
    running --> paused: pause
    paused --> running: resume
    running --> finished: engine finish callback
    running --> cancelled: cancel
    paused --> cancelled: cancel
    starting --> cancelled: cancel
    enrollment_open --> cancelled: cancel
    ready_to_start --> cancelled: cancel
    draft --> cancelled: cancel
    cancelled --> [*]
    finished --> [*]
```

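One way to keep such a graph closed in code is a lookup table that contains exactly the edges of the diagram. The sketch below is illustrative only: the `State` type, the `transitions` table, and `Next` are hypothetical names, the action strings are copied from the edge labels, and the ownership and idempotency checks mentioned above are left out.

```go
package main

import "fmt"

// State is a lobby state name as it appears in the diagram.
type State string

// transitions mirrors the diagram: state -> action -> next state.
// Terminal states (finished, cancelled) have no outgoing edges.
var transitions = map[State]map[string]State{
	"draft":           {"open-enrollment": "enrollment_open", "cancel": "cancelled"},
	"enrollment_open": {"ready-to-start": "ready_to_start", "cancel": "cancelled"},
	"ready_to_start":  {"start": "starting", "cancel": "cancelled"},
	"starting":        {"runtime ack": "running", "runtime error": "start_failed", "cancel": "cancelled"},
	"start_failed":    {"retry-start": "ready_to_start"},
	"running":         {"pause": "paused", "engine finish callback": "finished", "cancel": "cancelled"},
	"paused":          {"resume": "running", "cancel": "cancelled"},
}

// Next returns the target state, or an error when the edge is not in the graph.
func Next(from State, action string) (State, error) {
	if to, ok := transitions[from][action]; ok {
		return to, nil
	}
	return from, fmt.Errorf("transition %q is not allowed from %q", action, from)
}

func main() {
	s, err := Next("draft", "open-enrollment")
	fmt.Println(s, err)

	_, err = Next("finished", "cancel")
	fmt.Println(err)
}
```

Any edge that is missing from the table is rejected, so adding a transition is a one-line change that stays easy to compare against the diagram.
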
The Race Name Directory has three tiers:

- **registered**: platform-unique, with a single live binding per
  canonical key.
- **reservation**: per-game; a user can hold the same canonical key
  in multiple active games concurrently.
- **pending_registration**: issued after a "capable finish"
  (`max_planets > initial AND max_population > initial`). The pending
  entry is promoted to `registered` if the user calls
  `POST /api/v1/user/lobby/race-names/register` within
  `BACKEND_LOBBY_PENDING_REGISTRATION_TTL` (default 30 days);
  otherwise the sweeper releases it.

Canonicalisation goes through
[`disciplinedware/go-confusables`](https://github.com/disciplinedware/go-confusables)
plus a small anti-fraud map (digit-letter substitution for common
look-alikes). Cross-user uniqueness across reservations and pending
registrations is enforced with a per-canonical advisory lock at write
time, since the composite primary key on `race_names` cannot express
that invariant on its own.

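As a rough sketch of the per-canonical lock, the canonical key can be hashed into the `bigint` key space that PostgreSQL's `pg_advisory_xact_lock` accepts. The code below is an assumption-heavy illustration: `canonicalKey` is a toy placeholder and does not call go-confusables, the `lookAlikes` map is invented, and FNV-1a is just one convenient hash; the real derivation is not described in this document.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// lookAlikes is an illustrative digit-to-letter map; the real anti-fraud
// map is not shown in this document.
var lookAlikes = strings.NewReplacer("0", "o", "1", "l", "3", "e", "5", "s")

// canonicalKey is a placeholder for the real pipeline (go-confusables plus
// the anti-fraud map): it only trims, lower-cases, and folds a few digits.
func canonicalKey(name string) string {
	return lookAlikes.Replace(strings.ToLower(strings.TrimSpace(name)))
}

// advisoryLockKey hashes the canonical key into the signed 64-bit space
// accepted by pg_advisory_xact_lock(bigint).
func advisoryLockKey(canonical string) int64 {
	h := fnv.New64a()
	h.Write([]byte(canonical)) // hash.Hash.Write never returns an error
	return int64(h.Sum64())
}

func main() {
	a, b := canonicalKey("Z0rg"), canonicalKey("zorg")
	fmt.Println(a == b, advisoryLockKey(a) == advisoryLockKey(b)) // prints "true true"

	// Inside the write transaction one would then run, for example:
	//   SELECT pg_advisory_xact_lock($1)  -- $1 = advisoryLockKey(canonical)
	// before checking for conflicting reservations and inserting.
}
```

A hash collision between two different canonical keys only makes unrelated writers wait for each other briefly; it does not weaken the uniqueness check performed while the lock is held.
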
## Mail outbox

```mermaid
sequenceDiagram
    participant Producer
    participant Mail
    participant Postgres
    participant Worker
    participant SMTP
    participant Admin

    Producer->>Mail: EnqueueLoginCode / EnqueueTemplate
    Mail->>Postgres: insert mail_payloads + mail_deliveries<br/>(unique on template_id, idempotency_key)
    Mail-->>Producer: delivery_id

    loop every BACKEND_MAIL_WORKER_INTERVAL
        Worker->>Postgres: SELECT FOR UPDATE SKIP LOCKED
        Postgres-->>Worker: row
        Worker->>SMTP: send via wneessen/go-mail
        alt success
            Worker->>Postgres: insert mail_attempts(success),<br/>mark delivery sent
        else transient
            Worker->>Postgres: insert mail_attempts(transient),<br/>schedule next_attempt_at + jitter
        else permanent or attempts >= MAX
            Worker->>Postgres: insert mail_attempts(permanent),<br/>move to mail_dead_letters
            Worker->>Admin: notification intent (mail.dead_lettered)
        end
    end
```

`mail_attempts.attempt_no` increases monotonically across the entire
history of a single delivery, including re-arms. A resend on a
`pending` / `retrying` / `dead_lettered` row re-arms the row; a resend
on a `sent` row returns `409 Conflict`.

## Notification fan-out

```mermaid
sequenceDiagram
    participant Producer
    participant Notif
    participant Postgres
    participant Push
    participant Mail

    Producer->>Notif: Submit(intent)
    Notif->>Notif: validate kind + payload
    Notif->>Postgres: INSERT notifications ON CONFLICT (kind, idempotency_key) DO NOTHING
    Notif->>Postgres: materialise notification_routes<br/>per channel from catalog
    Notif->>Push: PublishClientEvent(user_id, payload)
    Notif->>Mail: EnqueueTemplate(template_id, recipient,<br/>payload, route_id)
    Notif-->>Producer: ok (best-effort dispatch)

    loop every BACKEND_NOTIFICATION_WORKER_INTERVAL
        Postgres-->>Notif: routes still in pending / retrying
        Notif->>Push: retry push (or)
        Notif->>Mail: re-arm mail row
    end
```

`auth.login_code` bypasses notification entirely: auth writes the
delivery row directly so that the challenge commit is atomic with the
mail queue insert. Catalog entries that target administrators send
email to `BACKEND_NOTIFICATION_ADMIN_EMAIL`; if the variable is empty,
the route is recorded with `status='skipped'` and an operator log line
records the configuration miss.

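The empty-variable behaviour can be pictured with a short sketch. The `Route` struct, the `adminEmailRoute` helper, and the log wording are hypothetical; only the environment variable name and the `skipped` status come from the paragraph above, and `pending` is taken from the route states shown in the diagram.

```go
package main

import (
	"log"
	"os"
	"strings"
)

// Route is a hypothetical view of a notification_routes row.
type Route struct {
	Channel   string
	Recipient string
	Status    string
}

// adminEmailRoute builds the email route for an administrator-targeted
// catalog entry. An empty address yields a skipped route plus a log line.
func adminEmailRoute(adminEmail string, logf func(format string, args ...any)) Route {
	if strings.TrimSpace(adminEmail) == "" {
		logf("notification: BACKEND_NOTIFICATION_ADMIN_EMAIL is empty, admin email route skipped")
		return Route{Channel: "email", Status: "skipped"}
	}
	return Route{Channel: "email", Recipient: adminEmail, Status: "pending"}
}

func main() {
	route := adminEmailRoute(os.Getenv("BACKEND_NOTIFICATION_ADMIN_EMAIL"), log.Printf)
	log.Printf("route: %+v", route)
}
```

Recording the route as `skipped` instead of dropping it keeps the misconfiguration visible in the data as well as in the logs.
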
## Runtime job lifecycle

```mermaid
sequenceDiagram
    participant Lobby
    participant Runtime
    participant Workers
    participant Docker
    participant Engine
    participant Reconciler

    Lobby->>Runtime: StartGame(game_id)
    Runtime->>Workers: enqueue start job
    Runtime-->>Lobby: ack

    Workers->>Docker: pull / create / start engine container
    Docker-->>Workers: container id
    Workers->>Engine: POST /api/v1/admin/init
    Engine-->>Workers: ok / error
    Workers->>Runtime: write runtime_records (running or start_failed)
    Workers->>Lobby: OnRuntimeJobResult

    loop scheduler tick
        Workers->>Engine: PUT /api/v1/admin/turn
        Engine-->>Workers: snapshot
        Workers->>Runtime: persist runtime_records
        Workers->>Lobby: OnRuntimeSnapshot
    end

    Reconciler->>Docker: list containers labelled galaxy.backend=1
    alt missing recorded container
        Reconciler->>Runtime: mark removed
        Reconciler->>Lobby: OnRuntimeJobResult(removed)
    else unrecorded labelled container
        Reconciler->>Runtime: adopt
    end
```

Per-game serialisation is enforced by a `sync.Map` from `game_id` to
`*sync.Mutex` inside `runtime.Service`, so concurrent start / stop /
patch attempts on the same `game_id` cannot race. `runtime_operation_log`
records every operation for audit.

## Push gRPC

```mermaid
sequenceDiagram
    participant Backend
    participant Ring
    participant Gateway

    loop domain emits client_event / session_invalidation
        Backend->>Ring: append, allocate cursor
    end

    Gateway->>Backend: SubscribePush(GatewaySubscribeRequest{cursor?})
    alt cursor present and within ring TTL
        Backend->>Gateway: replay events newer than cursor
    else cursor missing or aged out
        Backend->>Gateway: stream from current head
    end

    loop event published
        Backend->>Gateway: PushEvent
    end

    Gateway->>Backend: same gateway_client_id reconnects
    Backend->>Backend: cancel previous stream (codes.Aborted)
    Backend->>Gateway: stream again
```

The cursor is a zero-padded decimal `uint64` minted by an in-process
counter. The backend resets the sequence after a restart, so cursors
are only meaningful within a single process lifetime. Per-connection
backpressure is drop-oldest, with a log line on each drop so that the
gateway side can correlate gaps.

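Both ideas, the padded cursor and drop-oldest backpressure, fit in a short sketch. The width of 20 digits, the `outbox` type, and the log wording are assumptions for illustration; the real ring and stream code are not part of this document.

```go
package main

import (
	"fmt"
	"log"
	"strconv"
	"sync/atomic"
)

// seq is the in-process counter; it starts from zero again after a restart.
var seq uint64

// nextCursor mints the next cursor. The width of 20 is an assumption: it is
// the number of decimal digits in the largest uint64, so string order and
// numeric order agree.
func nextCursor() string {
	return fmt.Sprintf("%020d", atomic.AddUint64(&seq, 1))
}

// parseCursor converts a cursor back to its numeric value.
func parseCursor(s string) (uint64, error) {
	return strconv.ParseUint(s, 10, 64)
}

// outbox is a hypothetical bounded per-connection queue with drop-oldest
// backpressure. limit must be positive; it is not safe for concurrent use.
type outbox struct {
	limit   int
	events  []string
	dropped int
}

// push appends a cursor, discarding and logging the oldest one when full.
func (o *outbox) push(cursor string) {
	if len(o.events) == o.limit {
		log.Printf("push: dropping oldest event cursor=%s", o.events[0])
		o.events = o.events[1:]
		o.dropped++
	}
	o.events = append(o.events, cursor)
}

func main() {
	o := &outbox{limit: 2}
	for i := 0; i < 3; i++ {
		o.push(nextCursor())
	}
	fmt.Println(o.events, o.dropped)
}
```

With a limit of two, the third push evicts the first cursor, and the log line carries the evicted cursor so a consumer can line it up with the gap it observes.
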