# Runtime and Components
The diagram below focuses on the deployed `galaxy/user` process and its
runtime dependencies.
```mermaid
flowchart LR
subgraph Callers
Auth["Auth / Session Service"]
Gateway["Edge Gateway"]
Lobby["Game Lobby Service"]
Geo["Geo Profile Service"]
Admin["Trusted admin callers"]
end
subgraph User["User Service process"]
InternalHTTP["Trusted internal HTTP listener\n/api/v1/internal/*"]
AdminHTTP["Optional admin HTTP listener\n/metrics"]
Services["Application services"]
Telemetry["Logs, traces, metrics"]
end
Redis["Redis\ndomain-events + lifecycle-events streams"]
Postgres["PostgreSQL\nuser schema (durable state)"]
Auth --> InternalHTTP
Gateway --> InternalHTTP
Lobby --> InternalHTTP
Geo --> InternalHTTP
Admin --> InternalHTTP
InternalHTTP --> Services
Services --> Postgres
Services --> Redis
InternalHTTP --> Telemetry
AdminHTTP --> Telemetry
```
## Listeners
`userservice` exposes two HTTP listeners:
| Listener | Default addr | Purpose |
| --- | --- | --- |
| Internal HTTP | `:8091` | Trusted business API under `/api/v1/internal/*` |
| Admin HTTP | disabled | Optional Prometheus metrics on `/metrics` |
Shared listener defaults:
- read-header timeout: `2s`
- read timeout: `10s`
- idle timeout: `1m`
The internal application timeout is configured separately through
`USERSERVICE_INTERNAL_HTTP_REQUEST_TIMEOUT`.
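
The shared defaults map directly onto fields of Go's `http.Server`. The sketch below is illustrative only: `newListener` is a hypothetical helper, and wrapping the handler in `http.TimeoutHandler` is an assumption about how the request timeout could be applied, not a description of the service's actual code. The `5s` request timeout is a placeholder value.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// Shared listener defaults as documented above.
const (
	defaultReadHeaderTimeout = 2 * time.Second
	defaultReadTimeout       = 10 * time.Second
	defaultIdleTimeout       = time.Minute
)

// newListener is a hypothetical helper. It builds an *http.Server with the
// shared defaults and, when requestTimeout is positive, bounds each request
// with http.TimeoutHandler (an assumed mechanism).
func newListener(addr string, h http.Handler, requestTimeout time.Duration) *http.Server {
	if requestTimeout > 0 {
		// TimeoutHandler replies 503 Service Unavailable once the deadline passes.
		h = http.TimeoutHandler(h, requestTimeout, "request timed out")
	}
	return &http.Server{
		Addr:              addr,
		Handler:           h,
		ReadHeaderTimeout: defaultReadHeaderTimeout,
		ReadTimeout:       defaultReadTimeout,
		IdleTimeout:       defaultIdleTimeout,
	}
}

func main() {
	// ":8091" is the documented default; the 5s value is a placeholder.
	srv := newListener(":8091", http.NotFoundHandler(), 5*time.Second)
	fmt.Println(srv.Addr, srv.ReadHeaderTimeout, srv.ReadTimeout, srv.IdleTimeout)
}
```

Keeping the request timeout separate from the connection-level timeouts lets slow business operations be bounded without changing how long idle or slow-reading connections are tolerated.
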
Intentional omissions:
- no public listener
- no authenticated edge gRPC listener
- no built-in `/healthz`
- no built-in `/readyz`
## Startup Wiring
`cmd/userservice` loads config, constructs logging and telemetry, and then
creates the runtime through `internal/app.NewRuntime`.
The runtime wires, in order:
- one shared `*redis.Client` opened through `pkg/redisconn` and verified with a Ping
- one PostgreSQL pool opened through `pkg/postgres`, instrumented with
`db.sql.connection.*` metrics, pinged, and migrated forward via the
embedded `internal/adapters/postgres/migrations` filesystem
- the PostgreSQL-backed user store from
`internal/adapters/postgres/userstore` (accounts, blocked-emails,
entitlement snapshot/history/lifecycle, sanction history/lifecycle,
limit history/lifecycle, listing index)
- two Redis Stream publishers
(`internal/adapters/redis/domainevents` for auxiliary domain events,
`internal/adapters/redis/lifecycleevents` for trusted user-lifecycle
events) sharing the same `*redis.Client`
- the trusted internal HTTP router
- the optional admin metrics listener
- service-local helpers for clock, IDs, and validation/policy adapters
Startup fails fast when Redis or PostgreSQL connectivity is unavailable, the
mandatory connection-topology environment variables are missing, the
embedded migration sequence cannot be applied, or configuration is otherwise
invalid. The HTTP listeners do not open until every dependency check passes.
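
The fail-fast behaviour can be pictured as an ordered list of checks that must all pass before any listener opens. This is a minimal sketch under assumed names: `step`, `runStartup`, and `errUnavailable` are hypothetical and do not come from `internal/app`.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable is a hypothetical sentinel used to simulate a failing dependency.
var errUnavailable = errors.New("dependency unavailable")

// step is one named startup check (hypothetical type).
type step struct {
	name string
	run  func() error
}

// runStartup executes the steps in order and stops at the first failure,
// so later steps never run against a half-initialised runtime.
func runStartup(steps []step) error {
	for _, s := range steps {
		if err := s.run(); err != nil {
			return fmt.Errorf("startup: %s: %w", s.name, err)
		}
	}
	return nil
}

func main() {
	ok := func() error { return nil }
	steps := []step{
		{"redis ping", ok},
		{"postgres ping", func() error { return errUnavailable }}, // simulated outage
		{"postgres migrations", ok},
	}
	if err := runStartup(steps); err != nil {
		fmt.Println("refusing to open listeners:", err)
		return
	}
	fmt.Println("dependencies ready; opening listeners")
}
```

Wrapping the error with the step name keeps the startup log specific about which dependency blocked the process.
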
## Storage Backends
Storage is split between two backends, per
[`../../ARCHITECTURE.md §Persistence Backends`](../../ARCHITECTURE.md).
PostgreSQL holds the source-of-truth durable state in the `user` schema:
- `accounts` (with `email` and `user_name` UNIQUE; `deleted_at` records the
Stage 22 soft-delete state)
- `blocked_emails` (one row per blocked address)
- `entitlement_records` plus the denormalised `entitlement_snapshots`
one-row-per-user current view
- `sanction_records` plus `sanction_active(user_id, sanction_code)`
- `limit_records` plus `limit_active(user_id, limit_code)`
Indexes carry the listing surface (`accounts(created_at DESC, user_id
DESC)`), reverse-lookup filters (`accounts(declared_country)`,
`entitlement_snapshots(plan_code, is_paid)`,
`entitlement_snapshots(ends_at) WHERE is_paid AND ends_at IS NOT NULL`,
`sanction_active(sanction_code)`, `limit_active(limit_code)`), and the
per-user history scans.
Redis hosts only the two Stream publishers
(`USERSERVICE_REDIS_DOMAIN_EVENTS_STREAM`,
`USERSERVICE_REDIS_LIFECYCLE_EVENTS_STREAM`). It does not store any
durable user state after Stage 3 of `PG_PLAN.md`.
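
To make the stream-shape settings concrete, the sketch below builds the argument list of a capped Redis `XADD` command. It is an illustration only: `xaddArgs`, the stream name, and the field names are hypothetical, and the use of approximate trimming (`~`) is an assumption; the real publishers go through the shared `*redis.Client`.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// xaddArgs builds the arguments of an XADD command. When maxLen is positive
// the stream is capped with approximate trimming ("MAXLEN ~ n"), which is an
// assumed choice. "*" asks Redis to assign the entry ID.
func xaddArgs(stream string, maxLen int64, fields map[string]string) []string {
	args := []string{"XADD", stream}
	if maxLen > 0 {
		args = append(args, "MAXLEN", "~", strconv.FormatInt(maxLen, 10))
	}
	args = append(args, "*")

	// Sort field names so the output is deterministic.
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		args = append(args, k, fields[k])
	}
	return args
}

func main() {
	// Stream and field names are placeholders, not the service's schema.
	args := xaddArgs("user:domain_events", 1000, map[string]string{
		"event_type": "example",
		"user_id":    "user-123",
	})
	fmt.Println(args)
}
```

The stream name would come from `USERSERVICE_REDIS_DOMAIN_EVENTS_STREAM` or `USERSERVICE_REDIS_LIFECYCLE_EVENTS_STREAM`, and the cap from the matching `_MAX_LEN` variable.
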
Decision records: see
[`postgres-migration.md`](postgres-migration.md) for the schema and
storage decisions.
## Configuration Groups
Required for all process starts:
- `USERSERVICE_REDIS_MASTER_ADDR`
- `USERSERVICE_REDIS_PASSWORD`
- `USERSERVICE_POSTGRES_PRIMARY_DSN`
Core process config:
- `USERSERVICE_SHUTDOWN_TIMEOUT`
- `USERSERVICE_LOG_LEVEL`
Internal HTTP config:
- `USERSERVICE_INTERNAL_HTTP_ADDR`
- `USERSERVICE_INTERNAL_HTTP_READ_HEADER_TIMEOUT`
- `USERSERVICE_INTERNAL_HTTP_READ_TIMEOUT`
- `USERSERVICE_INTERNAL_HTTP_IDLE_TIMEOUT`
- `USERSERVICE_INTERNAL_HTTP_REQUEST_TIMEOUT`
Admin HTTP config:
- `USERSERVICE_ADMIN_HTTP_ADDR`
- `USERSERVICE_ADMIN_HTTP_READ_HEADER_TIMEOUT`
- `USERSERVICE_ADMIN_HTTP_READ_TIMEOUT`
- `USERSERVICE_ADMIN_HTTP_IDLE_TIMEOUT`
Redis connectivity (consumed by `pkg/redisconn`):
- `USERSERVICE_REDIS_REPLICA_ADDRS` (optional, comma-separated)
- `USERSERVICE_REDIS_DB`
- `USERSERVICE_REDIS_OPERATION_TIMEOUT`
Stream-shape (kept service-local):
- `USERSERVICE_REDIS_DOMAIN_EVENTS_STREAM`
- `USERSERVICE_REDIS_DOMAIN_EVENTS_STREAM_MAX_LEN`
- `USERSERVICE_REDIS_LIFECYCLE_EVENTS_STREAM`
- `USERSERVICE_REDIS_LIFECYCLE_EVENTS_STREAM_MAX_LEN`
PostgreSQL connectivity (consumed by `pkg/postgres`):
- `USERSERVICE_POSTGRES_REPLICA_DSNS` (optional, comma-separated)
- `USERSERVICE_POSTGRES_OPERATION_TIMEOUT`
- `USERSERVICE_POSTGRES_MAX_OPEN_CONNS`
- `USERSERVICE_POSTGRES_MAX_IDLE_CONNS`
- `USERSERVICE_POSTGRES_CONN_MAX_LIFETIME`
The retired Redis variables `USERSERVICE_REDIS_ADDR`,
`USERSERVICE_REDIS_USERNAME`, `USERSERVICE_REDIS_TLS_ENABLED`, and
`USERSERVICE_REDIS_KEYSPACE_PREFIX` produce a startup error from
`pkg/redisconn` if set; unset them before starting the service.
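
The required and retired variable rules can be expressed as a single pre-flight check. The sketch below is not the `pkg/redisconn` implementation: `validateEnv` is a hypothetical function, and it only tests whether a variable is present, since the document does not say whether an empty value counts as missing.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Variables documented as required for every process start.
var required = []string{
	"USERSERVICE_REDIS_MASTER_ADDR",
	"USERSERVICE_REDIS_PASSWORD",
	"USERSERVICE_POSTGRES_PRIMARY_DSN",
}

// Variables documented as retired; setting any of them is an error.
var retired = []string{
	"USERSERVICE_REDIS_ADDR",
	"USERSERVICE_REDIS_USERNAME",
	"USERSERVICE_REDIS_TLS_ENABLED",
	"USERSERVICE_REDIS_KEYSPACE_PREFIX",
}

// validateEnv reports every missing required variable and every retired
// variable that is still set. lookup has the signature of os.LookupEnv so
// the check can be exercised without touching the real environment.
func validateEnv(lookup func(string) (string, bool)) error {
	var problems []string
	for _, name := range required {
		if _, ok := lookup(name); !ok {
			problems = append(problems, name+" is required")
		}
	}
	for _, name := range retired {
		if _, ok := lookup(name); ok {
			problems = append(problems, name+" is retired; unset it")
		}
	}
	if len(problems) > 0 {
		return fmt.Errorf("invalid configuration: %s", strings.Join(problems, "; "))
	}
	return nil
}

func main() {
	if err := validateEnv(os.LookupEnv); err != nil {
		fmt.Println("would refuse to start:", err)
		return
	}
	fmt.Println("configuration ok")
}
```

Collecting all problems before returning lets an operator fix the environment in one pass rather than one variable per restart.
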
Telemetry:
- `OTEL_SERVICE_NAME`
- `OTEL_TRACES_EXPORTER`
- `OTEL_METRICS_EXPORTER`
- `OTEL_EXPORTER_OTLP_PROTOCOL`
- `OTEL_EXPORTER_OTLP_TRACES_PROTOCOL`
- `OTEL_EXPORTER_OTLP_METRICS_PROTOCOL`
- `USERSERVICE_OTEL_STDOUT_TRACES_ENABLED`
- `USERSERVICE_OTEL_STDOUT_METRICS_ENABLED`
## Runtime Notes
- The service remains internal REST only; gateway owns external authenticated
gRPC and FlatBuffers.
- Gateway self-service traffic reaches this service over REST/JSON after
gateway-side authentication and FlatBuffers transcoding.
- Current direct synchronous callers are `Auth / Session Service`,
`Edge Gateway`, `Game Lobby Service`, `Geo Profile Service`, and trusted
admin callers.
- Domain-event publication is auxiliary. The event stream and its consumers
must never become the source of truth for current account state, and a
failing consumer does not change that.
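
One way to read "publication is auxiliary" is a commit-then-publish ordering in which a publish failure is reported but does not undo or fail the committed change. The sketch below rests on that assumption; `applyChange` and `errStreamDown` are hypothetical names, and the service's real error handling may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// errStreamDown is a hypothetical sentinel simulating a stream outage.
var errStreamDown = errors.New("stream unavailable")

// applyChange persists first and only then attempts to publish. A persist
// error is returned to the caller; a publish error is handed to
// onPublishError and otherwise ignored, because the durable store (not the
// stream) holds the current state.
func applyChange(persist, publish func() error, onPublishError func(error)) error {
	if err := persist(); err != nil {
		return err // nothing was committed, so nothing is published
	}
	if err := publish(); err != nil {
		onPublishError(err)
	}
	return nil
}

func main() {
	err := applyChange(
		func() error { return nil },           // durable write succeeds
		func() error { return errStreamDown }, // event publication fails
		func(e error) { fmt.Println("publish failed (non-fatal):", e) },
	)
	fmt.Println("operation error:", err)
}
```
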