feat: use postgres
# PostgreSQL Migration

PG_PLAN.md §3 migrated `galaxy/user` from a Redis-only durable store to the
steady-state split codified in `ARCHITECTURE.md §Persistence Backends`:
PostgreSQL is the source of truth for table-shaped business state, and Redis
keeps only the two streams that publish auxiliary domain events
(`user:domain_events`) and trusted user-lifecycle events
(`user:lifecycle_events`).

This document records the schema decisions and the non-obvious agreements
behind them. Use it together with the migration script
(`internal/adapters/postgres/migrations/00001_init.sql`) and the runtime
wiring (`internal/app/runtime.go`).

## Outcomes

- Schema `user` (provisioned externally) holds the durable state: `accounts`,
  `blocked_emails`, `entitlement_records`, `entitlement_snapshots`,
  `sanction_records`, `sanction_active`, `limit_records`, `limit_active`.
- The runtime opens one PostgreSQL pool via `pkg/postgres.OpenPrimary`,
  applies embedded goose migrations strictly before any HTTP listener
  becomes ready, and exits non-zero when migration or ping fails.
- The runtime opens one shared `*redis.Client` via
  `pkg/redisconn.NewMasterClient` and passes it to both stream publishers
  (`internal/adapters/redis/domainevents`,
  `internal/adapters/redis/lifecycleevents`); the publishers no longer hold
  their own connection-topology fields.
- `internal/adapters/redis/userstore/` and the entire
  `internal/adapters/redisstate/` package are removed. The Redis Lua scripts,
  WATCH/MULTI optimistic-concurrency loops, and ZSET indexes are gone.
- Configuration drops `USERSERVICE_REDIS_USERNAME`,
  `USERSERVICE_REDIS_TLS_ENABLED`, and `USERSERVICE_REDIS_KEYSPACE_PREFIX`.
  `USERSERVICE_REDIS_ADDR` is replaced by
  `USERSERVICE_REDIS_MASTER_ADDR` + optional
  `USERSERVICE_REDIS_REPLICA_ADDRS`. Postgres-specific knobs live under
  `USERSERVICE_POSTGRES_*` per the architectural rule.

## Decisions

### 1. One schema, externally-provisioned role

**Decision.** The `user` schema and the matching `userservice` role are
created outside the migration sequence (in tests, by
`integration/internal/harness/postgres_container.go::EnsureRoleAndSchema`;
in production, by an ops init script not in scope for this stage). The
embedded migration `00001_init.sql` contains only DDL for tables and
indexes and assumes it runs as the schema owner with `search_path=user`.

**Why.** Mixing role creation, schema creation, and table DDL into one
script forces every consumer of the migration to run as a superuser. The
schema-per-service architectural rule
(`ARCHITECTURE.md §Persistence Backends`) lines up neatly with the
operational split: ops provisions roles and schemas; the service applies
schema-scoped migrations.

### 2. `entitlement_snapshots` stays denormalised

**Decision.** A dedicated `entitlement_snapshots` table holds exactly one
row per `user_id`, mirroring the current effective fields (`plan_code`,
`is_paid`, `starts_at`, `ends_at`, `source`, `actor_*`, `reason_code`,
`updated_at`). Lifecycle operations (`Grant`, `Extend`, `Revoke`,
`RepairExpired`) write the history row and the snapshot row inside one
transaction.

**Why.** The lobby-eligibility hot path reads exactly one row per user; a
JOIN over `entitlement_records` to compute the current segment would add
latency and wire-format complexity. Keeping the snapshot denormalised
matches the previous Redis shape, where the hot read returned a
pre-materialised JSON blob, which preserves the existing service-layer
contract and the public REST envelope.

### 3. `sanction_active` / `limit_active` are the source of truth for "active"

**Decision.** The active state of a sanction or a user-specific limit is
expressed by a small dedicated table (`sanction_active`, `limit_active`)
whose primary key is `(user_id, code)`. Each row references the matching
history record by `record_id`. Lifecycle operations maintain both tables
inside one transaction.

**Why.** The lobby-eligibility hot path needs to enumerate active
sanctions/limits without scanning the full history. Encoding "active"
as a partial index on `removed_at IS NULL` would still require dedup,
because a user can apply, remove, and re-apply the same code. Two narrow
tables let the same predicates that the Redis adapter encoded as
`active` keys remain index-only.

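The apply/remove/re-apply cycle is the crux, so here is an in-memory analogue (all type names and the `chat_ban` code are hypothetical): history only ever grows, while the `(user_id, code)`-keyed active map holds at most one entry pointing at the latest record:

```go
package main

import "fmt"

// Hypothetical in-memory analogue of sanction_records / sanction_active.
type activeKey struct{ UserID, Code string }

type sanctionRecord struct {
	ID      int64
	UserID  string
	Code    string
	Removed bool
}

type policyState struct {
	records []sanctionRecord    // full history, append-only
	active  map[activeKey]int64 // (user_id, code) -> record_id
	nextID  int64
}

func (s *policyState) apply(userID, code string) int64 {
	s.nextID++
	s.records = append(s.records, sanctionRecord{ID: s.nextID, UserID: userID, Code: code})
	s.active[activeKey{userID, code}] = s.nextID // at most one active row per key
	return s.nextID
}

func (s *policyState) remove(userID, code string) {
	if id, ok := s.active[activeKey{userID, code}]; ok {
		s.records[id-1].Removed = true // UPDATE ... SET removed_at = now()
		delete(s.active, activeKey{userID, code})
	}
}

func main() {
	st := &policyState{active: map[activeKey]int64{}}
	st.apply("u1", "chat_ban")
	st.remove("u1", "chat_ban")
	id := st.apply("u1", "chat_ban") // re-apply: two history rows, one active row
	fmt.Println(len(st.records), st.active[activeKey{"u1", "chat_ban"}] == id)
}
```

A partial index on `removed_at IS NULL` alone could not express this "latest wins" shape without deduplication at read time, which is exactly the point the decision makes.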
### 4. Eligibility flags are computed predicates, not stored columns

**Decision.** No `can_login`, `can_create_private_game`, or `can_join_game`
columns or indexes exist. The admin listing surface (and the lobby
eligibility snapshot) compute these from `entitlement_snapshots` and
`sanction_active` at read time.

**Why.** Stage 21 expanded the eligibility marker catalogue and Stage 22
added `permanent_block`. Each addition would have required schema work
plus a backfill if eligibility flags were materialised columns. Computed
predicates push that complexity into one place — the SQL query — and
keep the schema small.

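A compute-at-read predicate can be sketched as a pure function. The `login_ban` and `matchmaking_ban` codes and the paid-gating rule below are invented for illustration (only `permanent_block` appears in the document; the real rules live in the Stage 21/22 marker catalogue and the SQL read path):

```go
package main

import "fmt"

type eligibility struct {
	CanLogin             bool
	CanCreatePrivateGame bool
	CanJoinGame          bool
}

// computeEligibility derives the flags from the snapshot (isPaid) and the
// set of active sanction codes, instead of reading stored columns.
// The specific codes and rules here are hypothetical.
func computeEligibility(isPaid bool, activeSanctions map[string]bool) eligibility {
	blocked := activeSanctions["permanent_block"]
	return eligibility{
		CanLogin:             !blocked && !activeSanctions["login_ban"],
		CanCreatePrivateGame: !blocked && isPaid,
		CanJoinGame:          !blocked && !activeSanctions["matchmaking_ban"],
	}
}

func main() {
	e := computeEligibility(true, map[string]bool{"login_ban": true})
	fmt.Printf("%+v\n", e)
}
```

Adding a new marker means touching this one derivation (or its SQL equivalent), not the schema plus a backfill.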
### 5. Atomic flows use explicit `BEGIN … COMMIT` with per-row `FOR UPDATE`

**Decision.** Composite operations (`AuthDirectoryStore.{Resolve,
Ensure, Block*}`, `EntitlementLifecycleStore.{Grant, Extend, Revoke,
RepairExpired}`, `PolicyLifecycleStore.{ApplySanction, RemoveSanction,
SetLimit, RemoveLimit}`) execute inside `store.withTx` and acquire row
locks with `SELECT … FOR UPDATE` on the rows they intend to mutate.
Optimistic-replacement guards (`Expected*Record`, `Expected*Snapshot`)
are validated against the locked rows before the write goes through;
mismatches surface as `ports.ErrConflict`.

**Why.** PostgreSQL's default `READ COMMITTED` isolation plus row-level
locks give us the serialisation property the previous Redis
WATCH/MULTI loops achieved, without requiring the application to retry
on optimistic-failure errors. The explicit `FOR UPDATE` keeps intent
visible; ad-hoc CTE patterns would obscure the locking shape.

### 6. Query layer is `go-jet/jet/v2`

**Decision.** All `userstore` packages build SQL through the jet
builder API (`pgtable.<Table>.INSERT/SELECT/UPDATE/DELETE` plus the
`pg.AND/OR/SET/...` DSL). `cmd/jetgen` (invoked via `make jet`) brings
up a transient PostgreSQL container, applies the embedded migrations,
and runs `github.com/go-jet/jet/v2/generator/postgres.GenerateDB`
against the provisioned schema; the generated table/model code lives
under `internal/adapters/postgres/jet/user/{model,table}/*.go` and is
committed to the repo, so build consumers do not need Docker.
Statements are run through the `database/sql` API
(`stmt.Sql() → db.Exec/Query/QueryRow`); manual `rowScanner` helpers
preserve domain-type marshalling.

**Why.** This aligns with `PG_PLAN.md` §Library stack ("Query layer:
`github.com/go-jet/jet/v2` (PostgreSQL dialect). Generated code lives
under each service `internal/adapters/postgres/jet/`, regenerated via
a `make jet` target and committed to the repo"). Constructs the jet
builder does not cover natively (`FOR UPDATE`, keyset-pagination
row comparison, partial `UNIQUE ... WHERE` in `CREATE INDEX`) are
expressed through the per-DSL helpers (`.FOR(pg.UPDATE())`, `OR/AND`
expansion of `(created_at, user_id) < (…)`). The ports contract and the
schema do not change.

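The `OR/AND` expansion of the row comparison can be made concrete. The helper below is a sketch (not the service's actual code), but the cursor columns and the comparison direction are the ones this document describes; `<` on both columns matches the `(created_at DESC, user_id DESC)` keyset order:

```go
package main

import "fmt"

// rowLess expands the row comparison (a, b) < ($1, $2) into the
// equivalent OR/AND form, since the builder lacks native row
// comparison. Generalised to any number of cursor columns.
func rowLess(cols []string, firstArg int) string {
	// (a, b) < (x, y)  <=>  a < x OR (a = x AND b < y)
	out := ""
	for i := range cols {
		clause := ""
		for j := 0; j < i; j++ {
			clause += fmt.Sprintf("%s = $%d AND ", cols[j], firstArg+j)
		}
		clause += fmt.Sprintf("%s < $%d", cols[i], firstArg+i)
		if i > 0 {
			out += " OR "
		}
		out += "(" + clause + ")"
	}
	return out
}

func main() {
	fmt.Println(rowLess([]string{"created_at", "user_id"}, 1))
	// (created_at < $1) OR (created_at = $1 AND user_id < $2)
}
```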
### 7. Redis publishers share one `*redis.Client`

**Decision.** `internal/app/runtime.go` constructs one
`redisconn.NewMasterClient(cfg.Redis.Conn)` and passes it to both
`domainevents.New(client, cfg)` and `lifecycleevents.New(client, cfg)`.
The publishers no longer carry connection-topology fields and no longer
close the client; the runtime owns it.

**Why.** Each subsequent PG_PLAN stage (Mail, Notification, Lobby)
ships a similar duo of stream publishers; sharing one client is the
shape we want all stages to converge on. Per-publisher clients
multiplied TCP connections, ping points, and OpenTelemetry
instrumentation hooks for no functional benefit.

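The ownership shape can be sketched without the Redis dependency. `streamClient` stands in for the shared `*redis.Client`, and the publisher shape and payloads are illustrative; only the stream names come from the document. The publishers hold the injected client but never construct or close it:

```go
package main

import "fmt"

// streamClient stands in for the shared *redis.Client.
type streamClient interface {
	XAdd(stream, payload string) error
}

// fakeClient records appended entries so the sharing is observable.
type fakeClient struct{ added []string }

func (c *fakeClient) XAdd(stream, payload string) error {
	c.added = append(c.added, stream+":"+payload)
	return nil
}

// publisher receives the client; it does not own or close it.
type publisher struct {
	client streamClient
	stream string
}

func (p publisher) Publish(payload string) error {
	return p.client.XAdd(p.stream, payload)
}

func main() {
	client := &fakeClient{} // the runtime constructs exactly one client
	domain := publisher{client: client, stream: "user:domain_events"}
	lifecycle := publisher{client: client, stream: "user:lifecycle_events"}
	domain.Publish("example-domain-event")
	lifecycle.Publish("example-lifecycle-event")
	fmt.Println(len(client.added)) // both publishers wrote through one client
}
```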
### 8. Mandatory Redis password in tests as well

**Decision.** Unit tests for the publishers configure
`miniredis.RequireAuth("integration")` and pass a matching password
through their direct `redis.NewClient(...)` construction. The runtime
contract test (`runtime_contract_test.go::newRuntimeContractHarness`)
does the same and additionally boots a Postgres container.

**Why.** The architectural rule forbids password-less Redis
connections; carrying the constraint into the tests prevents the rule
from drifting.

### 9. Listing surface keeps storage-thin pagination

**Decision.** `UserListStore.ListUserIDs` paginates only on
`(created_at DESC, user_id DESC)` with keyset cursors carried by the
opaque page token. Filter-matrix evaluation (paid_state,
declared_country, sanction_code, limit_code, can_*) is performed by
the service-layer `adminusers.Lister`, which loads each candidate
through the per-user loader. This mirrors the previous Redis
behaviour exactly.

**Why.** Pushing the filter matrix into SQL is desirable — it eliminates
candidate over-fetching — but doing it without changing the public
`UserListStore.ListUserIDs` contract (which returns a page of
`UserID`, not full records) requires a JOIN-driven query. That work
is a non-breaking optimisation and is intentionally deferred so this
stage focuses on the storage cut-over rather than throughput
improvements. The page-token wire format is preserved bit-for-bit so
already-issued tokens keep working.

## Cross-References

- `PG_PLAN.md §3` (Stage 3 — User Service migration / pilot).
- `ARCHITECTURE.md §Persistence Backends`.
- `internal/adapters/postgres/migrations/00001_init.sql` and
  `internal/adapters/postgres/migrations/migrations.go`.
- `internal/adapters/postgres/userstore/{store,accounts,blocked_emails,
  auth_directory,entitlement_store,policy_store,list_store,page_token,
  helpers}.go` plus the testcontainers-backed unit suite under
  `userstore/{harness,store}_test.go`.
- `internal/adapters/postgres/jet/user/{model,table}/*.go` (committed
  generated code) plus `cmd/jetgen/main.go` and the `make jet`
  Makefile target that regenerate it.
- `internal/config/config.go` (`PostgresConfig`, `RedisConfig` reshape).
- `internal/app/runtime.go` (PG pool open + migration + shared Redis
  client wiring).
- `internal/adapters/redis/{domainevents,lifecycleevents}/publisher.go`
  (refactored to accept the shared `*redis.Client`).
- `runtime_contract_test.go::startPostgresForContractTest` (shows the
  inline Postgres bootstrap used by the existing runtime contract).