# PostgreSQL Migration

PG_PLAN.md §3 migrated `galaxy/user` from a Redis-only durable store to the steady-state split codified in `ARCHITECTURE.md §Persistence Backends`: PostgreSQL is the source of truth for table-shaped business state, and Redis keeps only the two streams that publish auxiliary domain events (`user:domain_events`) and trusted user-lifecycle events (`user:lifecycle_events`).

This document records the schema decisions and the non-obvious agreements behind them. Use it together with the migration script (`internal/adapters/postgres/migrations/00001_init.sql`) and the runtime wiring (`internal/app/runtime.go`).

## Outcomes

- Schema `user` (provisioned externally) holds the durable state: `accounts`, `blocked_emails`, `entitlement_records`, `entitlement_snapshots`, `sanction_records`, `sanction_active`, `limit_records`, `limit_active`.
- The runtime opens one PostgreSQL pool via `pkg/postgres.OpenPrimary`, applies embedded goose migrations strictly before any HTTP listener becomes ready, and exits non-zero when migration or ping fails.
- The runtime opens one shared `*redis.Client` via `pkg/redisconn.NewMasterClient` and passes it to both stream publishers (`internal/adapters/redis/domainevents`, `internal/adapters/redis/lifecycleevents`); the publishers no longer hold their own connection topology fields.
- `internal/adapters/redis/userstore/` and the entire `internal/adapters/redisstate/` package are removed. The Redis Lua scripts, Watch/Multi optimistic-concurrency loops, and ZSET indexes are gone.
- Configuration drops `USERSERVICE_REDIS_USERNAME`, `USERSERVICE_REDIS_TLS_ENABLED`, and `USERSERVICE_REDIS_KEYSPACE_PREFIX`. `USERSERVICE_REDIS_ADDR` is replaced by `USERSERVICE_REDIS_MASTER_ADDR` + optional `USERSERVICE_REDIS_REPLICA_ADDRS`. Postgres-specific knobs live under `USERSERVICE_POSTGRES_*` per the architectural rule.

## Decisions

### 1. One schema, externally-provisioned role

**Decision.** The `user` schema and the matching `userservice` role are created outside the migration sequence (in tests, by `integration/internal/harness/postgres_container.go::EnsureRoleAndSchema`; in production, by an ops init script not in scope for this stage). The embedded migration `00001_init.sql` only contains DDL for tables and indexes and assumes it runs as the schema owner with `search_path=user`.

**Why.** Mixing role creation, schema creation, and table DDL into one script forces every consumer of the migration to run as a superuser. The schema-per-service architectural rule (`ARCHITECTURE.md §Persistence Backends`) lines up neatly with the operational split: ops provisions roles and schemas, the service applies schema-scoped migrations.

### 2. `entitlement_snapshots` stays denormalised

**Decision.** A dedicated `entitlement_snapshots` table holds exactly one row per `user_id` mirroring the current effective fields (`plan_code`, `is_paid`, `starts_at`, `ends_at`, `source`, `actor_*`, `reason_code`, `updated_at`). Lifecycle operations (`Grant`, `Extend`, `Revoke`, `RepairExpired`) write the history row and the snapshot row inside one transaction.

**Why.** The lobby-eligibility hot path reads exactly one row per user; a JOIN over `entitlement_records` to compute the current segment would add latency and wire-format complexity. Keeping the snapshot denormalised matches the previous Redis shape where the hot read returned a pre-materialised JSON blob, which preserves the existing service-layer contract and the public REST envelope.

### 3. `sanction_active` / `limit_active` are the source of truth for "active"

**Decision.** The active state of a sanction or a user-specific limit is expressed by a small dedicated table (`sanction_active`, `limit_active`) whose primary key is `(user_id, code)`. Each row references the matching history record by `record_id`.
Lifecycle operations maintain both tables inside one transaction.

**Why.** The lobby-eligibility hot path needs to enumerate active sanctions/limits without scanning the full history. Encoding "active" as a partial index on `removed_at IS NULL` would still require dedup because a user can apply, remove, and re-apply the same code. Two narrow tables let the same predicates that the Redis adapter encoded as `active` keys remain index-only.

### 4. Eligibility flags are computed predicates, not stored columns

**Decision.** No `can_login`, `can_create_private_game`, `can_join_game` columns or indexes exist. The admin listing surface (and the lobby eligibility snapshot) compute these from `entitlement_snapshots` and `sanction_active` at read time.

**Why.** Stage 21 expanded the eligibility marker catalogue and Stage 22 added `permanent_block`. Each addition would have required schema work plus a backfill if eligibility flags were materialised columns. Computed predicates push that complexity into one place, the SQL query, and keep the schema small.

### 5. Atomic flows use explicit `BEGIN … COMMIT` with per-row `FOR UPDATE`

**Decision.** Composite operations (`AuthDirectoryStore.{Resolve, Ensure, Block*}`, `EntitlementLifecycleStore.{Grant, Extend, Revoke, RepairExpired}`, `PolicyLifecycleStore.{ApplySanction, RemoveSanction, SetLimit, RemoveLimit}`) execute inside `store.withTx` and acquire row locks with `SELECT … FOR UPDATE` on the rows they intend to mutate. Optimistic-replacement guards (`Expected*Record`, `Expected*Snapshot`) are validated against the locked rows before the write goes through; mismatches surface as `ports.ErrConflict`.

**Why.** PostgreSQL's default `READ COMMITTED` isolation plus row-level locks gives us the serialisation property the previous Redis WATCH/MULTI loops achieved without needing the application to retry on optimistic-failure errors. The explicit `FOR UPDATE` keeps intent visible; ad-hoc CTE patterns would obscure the locking shape.
### 6. Query layer is `go-jet/jet/v2`

**Decision.** All `userstore` packages build SQL through the jet builder API (`pgtable..INSERT/SELECT/UPDATE/DELETE` plus the `pg.AND/OR/SET/...` DSL). `cmd/jetgen` (invoked via `make jet`) brings up a transient PostgreSQL container, applies the embedded migrations, and runs `github.com/go-jet/jet/v2/generator/postgres.GenerateDB` against the provisioned schema; the generated table/model code lives under `internal/adapters/postgres/jet/user/{model,table}/*.go` and is committed to the repo, so build consumers do not need Docker. Statements are run through the `database/sql` API (`stmt.Sql() → db.Exec/Query/QueryRow`); manual `rowScanner` helpers preserve domain-type marshalling.

**Why.** Aligns with `PG_PLAN.md` §Library stack ("Query layer: `github.com/go-jet/jet/v2` (PostgreSQL dialect). Generated code lives under each service `internal/adapters/postgres/jet/`, regenerated via a `make jet` target and committed to the repo"). Constructs the jet builder does not cover natively (`FOR UPDATE`, keyset-pagination row-comparison, partial UNIQUE WHERE in `CREATE INDEX`) are expressed through the per-DSL helpers (`.FOR(pg.UPDATE())`, `OR/AND` expansion of `(created_at, user_id) < (…)`). The ports contract and the schema do not change.

### 7. Redis publishers share one `*redis.Client`

**Decision.** `internal/app/runtime.go` constructs one `redisconn.NewMasterClient(cfg.Redis.Conn)` and passes it to both `domainevents.New(client, cfg)` and `lifecycleevents.New(client, cfg)`. The publishers no longer carry connection-topology fields and no longer close the client; the runtime owns it.

**Why.** Each subsequent PG_PLAN stage (Mail, Notification, Lobby) ships a similar duo of stream publishers; sharing one client is the shape we want all stages to converge on. Per-publisher clients multiplied TCP connections, ping points, and OpenTelemetry instrumentation hooks for no functional benefit.

### 8. Mandatory Redis password in tests as well

**Decision.** Unit tests for the publishers configure `miniredis.RequireAuth("integration")` and pass a matching password through their direct `redis.NewClient(...)` construction. The runtime contract test (`runtime_contract_test.go::newRuntimeContractHarness`) does the same plus boots a Postgres container.

**Why.** The architectural rule forbids password-less Redis connections; carrying the constraint into tests prevents the rule from drifting.

### 9. Listing surface keeps storage-thin pagination

**Decision.** `UserListStore.ListUserIDs` paginates only on `(created_at DESC, user_id DESC)` with keyset cursors carried by the opaque page token. Filter matrix evaluation (paid_state, declared_country, sanction_code, limit_code, can_*) is performed by the service-layer `adminusers.Lister`, which loads each candidate through the per-user loader. This mirrors the previous Redis behaviour exactly.

**Why.** Pushing the filter matrix into SQL is desirable because it eliminates candidate over-fetching, but doing it without changing the public `UserListStore.ListUserIDs` contract (which returns a page of `UserID`, not full records) requires a JOIN-driven query. That work is a non-breaking optimisation and is intentionally deferred so this stage focuses on the storage cut-over rather than throughput improvements. The page-token wire format is preserved bit-for-bit so already-issued tokens keep working.

## Cross-References

- `PG_PLAN.md §3` (Stage 3: User Service migration / pilot).
- `ARCHITECTURE.md §Persistence Backends`.
- `internal/adapters/postgres/migrations/00001_init.sql` and `internal/adapters/postgres/migrations/migrations.go`.
- `internal/adapters/postgres/userstore/{store,accounts,blocked_emails,auth_directory,entitlement_store,policy_store,list_store,page_token,helpers}.go` plus the testcontainers-backed unit suite under `userstore/{harness,store}_test.go`.
- `internal/adapters/postgres/jet/user/{model,table}/*.go` (committed generated code) plus `cmd/jetgen/main.go` and the `make jet` Makefile target that regenerate it.
- `internal/config/config.go` (`PostgresConfig`, `RedisConfig` reshape).
- `internal/app/runtime.go` (PG pool open + migration + shared Redis client wiring).
- `internal/adapters/redis/{domainevents,lifecycleevents}/publisher.go` (refactored to accept the shared `*redis.Client`).
- `runtime_contract_test.go::startPostgresForContractTest` (shows the inline Postgres bootstrap used by the existing runtime contract).