feat: use postgres

Ilia Denisov
2026-04-26 20:34:39 +02:00
committed by GitHub
parent 48b0056b49
commit fe829285a6
365 changed files with 29223 additions and 24049 deletions
+101 -48
@@ -155,7 +155,9 @@ Intentional runtime omissions in v1:
Required:
- `NOTIFICATION_REDIS_ADDR`
- `NOTIFICATION_REDIS_MASTER_ADDR`
- `NOTIFICATION_REDIS_PASSWORD`
- `NOTIFICATION_POSTGRES_PRIMARY_DSN`
- `NOTIFICATION_USER_SERVICE_BASE_URL`
Primary configuration groups:
@@ -168,12 +170,18 @@ Primary configuration groups:
- `NOTIFICATION_INTERNAL_HTTP_READ_HEADER_TIMEOUT` with default `2s`
- `NOTIFICATION_INTERNAL_HTTP_READ_TIMEOUT` with default `10s`
- `NOTIFICATION_INTERNAL_HTTP_IDLE_TIMEOUT` with default `1m`
- Redis connectivity:
- `NOTIFICATION_REDIS_USERNAME`
- `NOTIFICATION_REDIS_PASSWORD`
- Redis connectivity (master address, optional replica addresses, and
password; the deprecated `NOTIFICATION_REDIS_ADDR`,
`NOTIFICATION_REDIS_USERNAME`, and `NOTIFICATION_REDIS_TLS_ENABLED`
environment variables are rejected at startup):
- `NOTIFICATION_REDIS_REPLICA_ADDRS` (optional, comma-separated)
- `NOTIFICATION_REDIS_DB`
- `NOTIFICATION_REDIS_TLS_ENABLED`
- `NOTIFICATION_REDIS_OPERATION_TIMEOUT`
- PostgreSQL connectivity:
- `NOTIFICATION_POSTGRES_REPLICA_DSNS` (optional, comma-separated)
- `NOTIFICATION_POSTGRES_OPERATION_TIMEOUT`
- `NOTIFICATION_POSTGRES_MAX_OPEN_CONNS`
- `NOTIFICATION_POSTGRES_MAX_IDLE_CONNS`
- `NOTIFICATION_POSTGRES_CONN_MAX_LIFETIME`
- stream names:
- `NOTIFICATION_INTENTS_STREAM` with default `notification:intents`
- `NOTIFICATION_INTENTS_READ_BLOCK_TIMEOUT` with default `2s`
@@ -186,9 +194,13 @@ Primary configuration groups:
- `NOTIFICATION_ROUTE_BACKOFF_MIN` with default `1s`
- `NOTIFICATION_ROUTE_BACKOFF_MAX` with default `5m`
- `NOTIFICATION_ROUTE_LEASE_TTL` with default `5s`
- `NOTIFICATION_DEAD_LETTER_TTL` with default `720h`
- `NOTIFICATION_RECORD_TTL` with default `720h`
- `NOTIFICATION_IDEMPOTENCY_TTL` with default `168h`
- retention (periodic SQL retention worker; replaces the previous
`NOTIFICATION_DEAD_LETTER_TTL` and `NOTIFICATION_RECORD_TTL` Redis-EXPIRE
knobs):
- `NOTIFICATION_RECORD_RETENTION` with default `720h`
- `NOTIFICATION_MALFORMED_INTENT_RETENTION` with default `2160h`
- `NOTIFICATION_CLEANUP_INTERVAL` with default `1h`
- `User Service` enrichment:
- `NOTIFICATION_USER_SERVICE_TIMEOUT` with default `1s`
- administrator routing:
@@ -472,52 +484,90 @@ Materialization rules:
The service-local aggregate notification status is derived from routes and is
not a separate durable source of truth.
## Redis Logical Model
## Persistence Model
Durable storage is split between PostgreSQL (table-shaped business state)
and Redis (streams, runtime coordination). The architectural rules live in
[`ARCHITECTURE.md §Persistence Backends`](../ARCHITECTURE.md#persistence-backends);
the per-service decision record is
[`docs/postgres-migration.md`](docs/postgres-migration.md).
### PostgreSQL durable state
The service owns the `notification` schema. Migrations are embedded in the
binary (`internal/adapters/postgres/migrations`) and applied at startup via
`pkg/postgres.RunMigrations` strictly before any HTTP listener becomes
ready. Every time-valued column is `timestamptz`; the adapter normalizes
values to UTC on bind and scan.
| Table | Frozen columns |
| --- | --- |
| `records` | `notification_id`, `notification_type`, `producer`, `audience_kind`, `recipient_user_ids` (jsonb), `payload_json`, `idempotency_key`, `request_fingerprint`, `request_id`, `trace_id`, `occurred_at`, `accepted_at`, `updated_at`, `idempotency_expires_at`; `UNIQUE (producer, idempotency_key)` |
| `routes` | `notification_id`, `route_id`, `channel`, `recipient_ref`, `status`, `attempt_count`, `max_attempts`, `next_attempt_at`, `resolved_email`, `resolved_locale`, `last_error_classification`, `last_error_message`, `last_error_at`, `created_at`, `updated_at`, `published_at`, `dead_lettered_at`, `skipped_at`; PRIMARY KEY `(notification_id, route_id)` |
| `dead_letters` | `notification_id`, `route_id`, `channel`, `recipient_ref`, `final_attempt_count`, `max_attempts`, `failure_classification`, `failure_message`, `recovery_hint`, `created_at`; PRIMARY KEY `(notification_id, route_id)` cascading from `routes` |
| `malformed_intents` | `stream_entry_id`, `notification_type`, `producer`, `idempotency_key`, `failure_code`, `failure_message`, `raw_fields` (jsonb), `recorded_at` |
Storage rules:
- durable records are stored as strict JSON blobs
- timestamps are stored in Unix milliseconds
- dynamic Redis key segments are base64url-encoded
- `notification:route_schedule` is one shared sorted set for both `push` and
`email`
- the durable `records` row IS the idempotency reservation; the
`(producer, idempotency_key)` UNIQUE constraint surfaces conflicts as
`acceptintent.ErrConflict`
- `next_attempt_at` is non-NULL only while the route is a scheduling
candidate (`status=pending|failed`); the partial index `routes_due_idx`
drives the publishers' `ListDueRoutes` scan
- `payload_json` stores the canonical normalized JSON string used for
idempotency fingerprinting; `recipient_user_ids` is JSONB and omitted
for `audience_kind=admin_email`
- terminal transitions clear `next_attempt_at` and stamp the appropriate
terminal column (`published_at` / `dead_lettered_at` / `skipped_at`)
- record-level retention deletes cascade to `routes` and `dead_letters`
via `ON DELETE CASCADE`
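
The first storage rule above (the `records` row doubling as the idempotency reservation) can be illustrated with a minimal in-memory sketch. It is not the service's adapter code: `recordStore`, `insert`, and `errConflict` are hypothetical names, with `errConflict` standing in for `acceptintent.ErrConflict`, and a Go map stands in for the `UNIQUE (producer, idempotency_key)` constraint.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict is a hypothetical stand-in for acceptintent.ErrConflict.
var errConflict = errors.New("idempotency conflict")

// recordKey mirrors the UNIQUE (producer, idempotency_key) constraint.
type recordKey struct {
	producer       string
	idempotencyKey string
}

// recordStore is an in-memory stand-in for the records table.
type recordStore struct {
	rows map[recordKey]string // value: notification_id
}

func newRecordStore() *recordStore {
	return &recordStore{rows: make(map[recordKey]string)}
}

// insert behaves like INSERT INTO records: the row itself is the
// reservation, so a second row with the same key is rejected.
func (s *recordStore) insert(producer, idempotencyKey, notificationID string) error {
	k := recordKey{producer, idempotencyKey}
	if _, taken := s.rows[k]; taken {
		return errConflict
	}
	s.rows[k] = notificationID
	return nil
}

func main() {
	s := newRecordStore()
	fmt.Println(s.insert("billing", "key-1", "n-1")) // <nil>
	fmt.Println(s.insert("billing", "key-1", "n-2")) // idempotency conflict
	fmt.Println(s.insert("lobby", "key-1", "n-3"))   // <nil>
}
```

Because the reservation and the record are one row, there is no separate idempotency key to keep in sync with the record. What a caller does after a conflict (for example, comparing `request_fingerprint`) is not modeled here.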
### Redis runtime-coordination state
| Logical artifact | Redis key |
| --- | --- |
| `notification_record` | `notification:records:<notification_id>` |
| `notification_route` | `notification:routes:<notification_id>:<route_id>` |
| temporary route lease | `notification:route_leases:<notification_id>:<route_id>` |
| `notification_idempotency_record` | `notification:idempotency:<producer>:<idempotency_key>` |
| `notification_dead_letter_entry` | `notification:dead_letters:<notification_id>:<route_id>` |
| malformed intent record | `notification:malformed_intents:<stream_entry_id>` |
| stream offset record | `notification:stream_offsets:<stream>` |
| ingress stream | `notification:intents` |
| route schedule sorted set | `notification:route_schedule` |
| Record | Frozen fields |
| --- | --- |
| `notification_record` | `notification_id`, `notification_type`, `producer`, `audience_kind`, normalized `recipient_user_ids`, normalized `payload_json`, `idempotency_key`, `request_fingerprint`, optional `request_id`, optional `trace_id`, `occurred_at_ms`, `accepted_at_ms`, `updated_at_ms` |
| `notification_route` | `notification_id`, `route_id`, `channel`, `recipient_ref`, `status`, `attempt_count`, `max_attempts`, `next_attempt_at_ms`, optional `resolved_email`, optional `resolved_locale`, optional `last_error_classification`, optional `last_error_message`, optional `last_error_at_ms`, `created_at_ms`, `updated_at_ms`, optional `published_at_ms`, optional `dead_lettered_at_ms`, optional `skipped_at_ms` |
| `notification_idempotency_record` | `producer`, `idempotency_key`, `notification_id`, `request_fingerprint`, `created_at_ms`, `expires_at_ms` |
| `notification_dead_letter_entry` | `notification_id`, `route_id`, `channel`, `recipient_ref`, `final_attempt_count`, `max_attempts`, `failure_classification`, `failure_message`, `created_at_ms`, optional `recovery_hint` |
| malformed intent record | `stream_entry_id`, optional `notification_type`, optional `producer`, optional `idempotency_key`, `failure_code`, `failure_message`, `raw_fields_json`, `recorded_at_ms` |
| stream offset record | `stream`, `last_processed_entry_id`, `updated_at_ms` |
Storage rules:
`notification_record.recipient_user_ids` stores a normalized array of unique
`user_id` values and is omitted for `audience_kind=admin_email`.
`notification_record.payload_json` stores the canonical normalized JSON string
used for idempotency fingerprinting.
Temporary route lease keys store one opaque worker token and use
`NOTIFICATION_ROUTE_LEASE_TTL`; they are service-local coordination state
rather than durable records.
`notification:route_schedule` stores one member per scheduled route where score
= `next_attempt_at_ms` and member = full Redis route key with encoded dynamic
segments.
Newly accepted publishable routes enter the schedule immediately with
`status=pending` and `next_attempt_at_ms = accepted_at_ms`.
`failed` routes remain scheduled for retry.
`published`, `dead_letter`, and `skipped` are absent from the schedule.
Only the current lease holder may finalize one due publication attempt.
- dynamic Redis key segments are base64url-encoded
- temporary route lease keys store one opaque worker token and use
`NOTIFICATION_ROUTE_LEASE_TTL`; they are service-local coordination
state rather than durable records, retained on Redis as a per-replica
exclusivity hint atop the SQL claim
- stream offset records persist plain-XREAD consumer progress for
`notification:intents` and never expire
- the outbound streams `gateway:client-events` and `mail:delivery_commands`
remain Redis Streams owned by Gateway and Mail Service respectively;
Notification Service emits one entry through `XADD` before committing
the route's PostgreSQL state transition
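
As an illustration of the base64url rule for dynamic key segments, the sketch below builds a route lease key. The helper names `encodeSegment` and `routeLeaseKey` are hypothetical, and the unpadded encoding is an assumption: the text above does not say whether `=` padding is kept.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeSegment base64url-encodes one dynamic key segment.
// Assumption: unpadded encoding (RawURLEncoding); the padded variant
// would be base64.URLEncoding.
func encodeSegment(s string) string {
	return base64.RawURLEncoding.EncodeToString([]byte(s))
}

// routeLeaseKey builds notification:route_leases:<notification_id>:<route_id>
// with both dynamic segments encoded.
func routeLeaseKey(notificationID, routeID string) string {
	return "notification:route_leases:" +
		encodeSegment(notificationID) + ":" + encodeSegment(routeID)
}

func main() {
	// The route_id itself contains ":", which the encoding hides.
	fmt.Println(routeLeaseKey("n-1", "push:user-1"))
}
```

Encoding matters because a `route_id` such as `push:user-1` contains the `:` delimiter. The base64url alphabet has no `:`, so the encoded key always splits into the same four parts.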
### Publisher claim and lease coordination
`Push` and `Email` publishers share the same scheduling pattern:
- `routes_due_idx` (the partial index on `next_attempt_at`) replaces the
former `notification:route_schedule` ZSET; the SQL query
`SELECT notification_id, route_id FROM routes WHERE next_attempt_at IS
NOT NULL AND next_attempt_at <= now() ORDER BY next_attempt_at ASC LIMIT
N` returns the next due batch
- `push` publishers filter for `route_id` prefix `push:`; `email`
publishers filter for prefix `email:` so the two workers do not contend
- `push` and `email` replicas coordinate through
`notification:route_leases:<notification_id>:<route_id>` with
`NOTIFICATION_ROUTE_LEASE_TTL`
- only the current lease holder finalizes a due publication attempt;
the durable transition is a `Complete*` SQL transaction that uses
optimistic concurrency on `routes.updated_at`, so a stale lease holder
cannot overwrite a fresher row state
- newly accepted publishable routes enter the partial index immediately
with `status=pending` and `next_attempt_at = accepted_at`
- `failed` routes remain in the partial index for retry
- `published`, `dead_letter`, and `skipped` clear `next_attempt_at` and
drop out of the index
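
The claim rules above can be sketched in memory. This is a simplified model, not the service's SQL adapter: the `route` struct, `at`, `dueRoutes`, and `completePublished` are hypothetical names, and applying the prefix filter inside the same function as the due scan is an assumption about where that filter runs.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

// route is an in-memory stand-in for one row of the routes table.
type route struct {
	notificationID string
	routeID        string
	status         string
	nextAttemptAt  *time.Time // nil once the route is terminal
	updatedAt      time.Time
}

// at is a hypothetical helper that builds a UTC instant from Unix seconds.
func at(sec int64) time.Time { return time.Unix(sec, 0).UTC() }

// dueRoutes mirrors the due-batch query plus the per-channel prefix filter:
// next_attempt_at IS NOT NULL AND next_attempt_at <= now, oldest first, LIMIT n.
func dueRoutes(rows []*route, prefix string, now time.Time, limit int) []*route {
	var due []*route
	for _, r := range rows {
		if r.nextAttemptAt == nil || r.nextAttemptAt.After(now) {
			continue
		}
		if !strings.HasPrefix(r.routeID, prefix) {
			continue
		}
		due = append(due, r)
	}
	sort.SliceStable(due, func(i, j int) bool {
		return due[i].nextAttemptAt.Before(*due[j].nextAttemptAt)
	})
	if len(due) > limit {
		due = due[:limit]
	}
	return due
}

// completePublished applies the terminal transition only if the row still
// carries the updated_at value the worker read, in the spirit of
// UPDATE ... WHERE updated_at = <expected>.
func completePublished(r *route, expectedUpdatedAt, now time.Time) bool {
	if !r.updatedAt.Equal(expectedUpdatedAt) {
		return false // a fresher write won; the stale holder backs off
	}
	r.status = "published"
	r.nextAttemptAt = nil // the row drops out of the partial index
	r.updatedAt = now
	return true
}

func main() {
	t10, t20 := at(10), at(20)
	rows := []*route{
		{notificationID: "n-1", routeID: "push:u1", status: "pending", nextAttemptAt: &t20, updatedAt: at(5)},
		{notificationID: "n-2", routeID: "email:u1", status: "failed", nextAttemptAt: &t10, updatedAt: at(5)},
		{notificationID: "n-3", routeID: "push:u2", status: "pending", nextAttemptAt: &t10, updatedAt: at(5)},
	}
	for _, r := range dueRoutes(rows, "push:", at(15), 10) {
		fmt.Println(r.notificationID, r.routeID)
	}
}
```

In this model the Redis lease only reduces duplicate work between replicas. The `updated_at` comparison is what keeps a worker whose lease has expired from overwriting a newer row state.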
## Retry And Dead-Letter Policy
@@ -550,12 +600,15 @@ Rules:
Retention rules:
- `notification_record` and `notification_route` use
`NOTIFICATION_RECORD_TTL`
- `notification_idempotency_record` uses `NOTIFICATION_IDEMPOTENCY_TTL`
- `notification_dead_letter_entry` and malformed intent records use
`NOTIFICATION_DEAD_LETTER_TTL`
- stream offset records do not use TTL
- `records` use `NOTIFICATION_RECORD_RETENTION`: the periodic SQL retention
worker deletes rows once the configured window has passed, and
`ON DELETE CASCADE` removes the dependent `routes` and `dead_letters` rows
- the per-record idempotency window (`records.idempotency_expires_at`)
uses `NOTIFICATION_IDEMPOTENCY_TTL`
- `malformed_intents` use `NOTIFICATION_MALFORMED_INTENT_RETENTION`
(independent retention pass)
- the retention worker runs once per `NOTIFICATION_CLEANUP_INTERVAL`
- stream offset records do not expire
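
A minimal sketch of how the retention windows above translate into deletion cutoffs, using the documented defaults. The type and function names are hypothetical, and the choice of timestamp columns (`accepted_at` for `records`, `recorded_at` for `malformed_intents`) is an assumption, since the rules above do not name the column the worker compares.

```go
package main

import (
	"fmt"
	"time"
)

// retentionConfig holds the three documented knobs.
type retentionConfig struct {
	recordRetention          time.Duration // NOTIFICATION_RECORD_RETENTION
	malformedIntentRetention time.Duration // NOTIFICATION_MALFORMED_INTENT_RETENTION
	cleanupInterval          time.Duration // NOTIFICATION_CLEANUP_INTERVAL
}

// defaultRetention returns the documented defaults: 720h, 2160h, 1h.
func defaultRetention() retentionConfig {
	return retentionConfig{
		recordRetention:          720 * time.Hour,
		malformedIntentRetention: 2160 * time.Hour,
		cleanupInterval:          time.Hour,
	}
}

// at is a hypothetical helper that builds a UTC instant from Unix seconds.
func at(sec int64) time.Time { return time.Unix(sec, 0).UTC() }

// recordExpired reports whether a records row is past its window.
// Assumption: the worker compares accepted_at against now - retention.
func (c retentionConfig) recordExpired(acceptedAt, now time.Time) bool {
	return acceptedAt.Before(now.Add(-c.recordRetention))
}

// malformedExpired applies the independent malformed-intent window.
// Assumption: the worker compares recorded_at against now - retention.
func (c retentionConfig) malformedExpired(recordedAt, now time.Time) bool {
	return recordedAt.Before(now.Add(-c.malformedIntentRetention))
}

func main() {
	const day = int64(24 * 3600)
	cfg := defaultRetention()
	now := at(100 * day)
	fmt.Println(cfg.recordExpired(at(60*day), now))    // true: 40 days old, window is 30 days
	fmt.Println(cfg.malformedExpired(at(60*day), now)) // false: 40 days old, window is 90 days
	fmt.Println(cfg.cleanupInterval)                   // 1h0m0s
}
```

Unlike a Redis `EXPIRE`, rows are not removed at the exact moment the window ends. In this model a row can outlive its window by up to one `NOTIFICATION_CLEANUP_INTERVAL`.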
## Observability