feat: notification service
@@ -0,0 +1,206 @@
# Runtime and Components

The diagram below focuses on the deployed `galaxy/notification` process and
its runtime dependencies.

```mermaid
flowchart LR
    subgraph Producers
        GM["Game Master"]
        Lobby["Game Lobby"]
        Geo["Geo Profile Service"]
    end

    subgraph Notify["Notification Service process"]
        Probe["Private probe HTTP listener\n/healthz /readyz"]
        Consumer["Notification intent consumer"]
        Accept["Intent acceptance service"]
        Push["Push route publisher"]
        Email["Email route publisher"]
        Telemetry["Logs, traces, metrics"]
    end

    User["User Service"]
    Gateway["Edge Gateway\nclient-event stream consumer"]
    Mail["Mail Service\ncommand stream consumer"]
    Redis["Redis\nstate + streams + schedules"]

    GM --> Redis
    Lobby --> Redis
    Geo --> Redis
    Consumer --> Redis
    Consumer --> Accept
    Accept --> User
    Accept --> Redis
    Push --> Redis
    Email --> Redis
    Push --> Gateway
    Email --> Mail
    Probe --> Telemetry
    Consumer --> Telemetry
    Push --> Telemetry
    Email --> Telemetry
```

## Listener

`notification` exposes exactly one HTTP listener:

| Listener | Default addr | Purpose |
| --- | --- | --- |
| Internal probe HTTP | `:8092` | Private liveness and readiness probes |

Shared listener defaults:

- read-header timeout: `2s`
- read timeout: `10s`
- idle timeout: `1m`

Probe routes:

- `GET /healthz` returns `{"status":"ok"}`
- `GET /readyz` returns `{"status":"ready"}`
- `/readyz` is process-local after successful startup and does not perform a
  live Redis ping per request

Intentional omissions:

- no public listener
- no operator API
- no `/metrics` route

## Startup Wiring

`cmd/notification` loads config, constructs logging, and builds the runtime
through `internal/app.NewRuntime`.

The runtime wires:

- Redis client with startup connectivity check
- `User Service` HTTP client for recipient enrichment
- private probe HTTP server
- plain `XREAD` intent consumer
- `push` route publisher for `Gateway`
- `email` route publisher for `Mail Service`
- Redis-backed accepted-intent, route, idempotency, malformed-intent,
  dead-letter, stream-offset, and schedule stores
- OpenTelemetry traces and metrics exporters

Startup fails fast on invalid configuration or unavailable Redis.

## Background Components

### Intent consumer

- reads one plain `XREAD` stream, default `notification:intents`
- starts from stored offset or `0-0`
- advances offset only after durable acceptance or durable malformed-intent
  recording
- stops without offset advancement when `User Service` enrichment has a
  temporary failure

### Acceptance service

- validates the normalized intent envelope
- applies idempotency rules for `(producer, idempotency_key)`
- enriches user-targeted recipients before durable route write
- materializes route slots for `push` and `email`
- stores malformed-intent records for invalid payloads, idempotency conflicts,
  and unresolved users

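One common way to implement idempotency for a `(producer, idempotency_key)` pair is to store a fingerprint of the first payload seen under that pair. The sketch below assumes that approach; the document does not say how the service defines a conflict, so treating "same key, different payload" as the conflict case is an assumption, and `idemStore`, `outcome`, and `fingerprint` are hypothetical names backed here by an in-memory map rather than Redis.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// idemKey is the documented idempotency scope.
type idemKey struct {
	Producer string
	Key      string
}

type outcome string

const (
	accepted  outcome = "accepted"  // first time this key is seen
	duplicate outcome = "duplicate" // same key, same payload: safe to ignore
	conflict  outcome = "conflict"  // same key, different payload (assumed rule)
)

// idemStore is an in-memory stand-in for the Redis-backed idempotency store.
type idemStore struct {
	seen map[idemKey]string
}

func fingerprint(payload string) string {
	sum := sha256.Sum256([]byte(payload))
	return hex.EncodeToString(sum[:])
}

// check classifies an intent and remembers the first payload for its key.
func (s *idemStore) check(producer, key, payload string) outcome {
	k := idemKey{Producer: producer, Key: key}
	fp := fingerprint(payload)
	prev, ok := s.seen[k]
	switch {
	case !ok:
		s.seen[k] = fp
		return accepted
	case prev == fp:
		return duplicate
	default:
		return conflict
	}
}

func main() {
	s := &idemStore{seen: make(map[idemKey]string)}
	fmt.Println(s.check("lobby", "k1", `{"type":"a"}`))
	fmt.Println(s.check("lobby", "k1", `{"type":"a"}`))
	fmt.Println(s.check("lobby", "k1", `{"type":"b"}`))
	fmt.Println(s.check("geo", "k1", `{"type":"b"}`))
}
```

Because the producer is part of the key, two producers can reuse the same `idempotency_key` value without colliding.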
### Push publisher

- scans `notification:route_schedule`
- processes only scheduled route IDs beginning with `push:`
- coordinates replicas with temporary route leases
- publishes Gateway client events with `XADD MAXLEN ~`
- omits `device_session_id` so Gateway fans out to all active streams for the
  target user

### Email publisher

- scans `notification:route_schedule`
- processes only scheduled route IDs beginning with `email:`
- coordinates replicas with temporary route leases
- publishes Mail Service generic commands with plain `XADD`
- always uses `payload_mode=template`

## Configuration Groups

Required:

- `NOTIFICATION_REDIS_ADDR`
- `NOTIFICATION_USER_SERVICE_BASE_URL`

Core process config:

- `NOTIFICATION_SHUTDOWN_TIMEOUT`
- `NOTIFICATION_LOG_LEVEL`

Internal HTTP config:

- `NOTIFICATION_INTERNAL_HTTP_ADDR` with default `:8092`
- `NOTIFICATION_INTERNAL_HTTP_READ_HEADER_TIMEOUT` with default `2s`
- `NOTIFICATION_INTERNAL_HTTP_READ_TIMEOUT` with default `10s`
- `NOTIFICATION_INTERNAL_HTTP_IDLE_TIMEOUT` with default `1m`

Redis connectivity:

- `NOTIFICATION_REDIS_USERNAME`
- `NOTIFICATION_REDIS_PASSWORD`
- `NOTIFICATION_REDIS_DB`
- `NOTIFICATION_REDIS_TLS_ENABLED`
- `NOTIFICATION_REDIS_OPERATION_TIMEOUT`
- `NOTIFICATION_INTENTS_STREAM`
- `NOTIFICATION_INTENTS_READ_BLOCK_TIMEOUT`
- `NOTIFICATION_GATEWAY_CLIENT_EVENTS_STREAM`
- `NOTIFICATION_GATEWAY_CLIENT_EVENTS_STREAM_MAX_LEN`
- `NOTIFICATION_MAIL_DELIVERY_COMMANDS_STREAM`

Retry and retention:

- `NOTIFICATION_PUSH_RETRY_MAX_ATTEMPTS`
- `NOTIFICATION_EMAIL_RETRY_MAX_ATTEMPTS`
- `NOTIFICATION_ROUTE_BACKOFF_MIN`
- `NOTIFICATION_ROUTE_BACKOFF_MAX`
- `NOTIFICATION_ROUTE_LEASE_TTL`
- `NOTIFICATION_DEAD_LETTER_TTL`
- `NOTIFICATION_RECORD_TTL`
- `NOTIFICATION_IDEMPOTENCY_TTL`

User enrichment:

- `NOTIFICATION_USER_SERVICE_TIMEOUT` with default `1s`

Administrator routing:

- `NOTIFICATION_ADMIN_EMAILS_GEO_REVIEW_RECOMMENDED`
- `NOTIFICATION_ADMIN_EMAILS_GAME_GENERATION_FAILED`
- `NOTIFICATION_ADMIN_EMAILS_LOBBY_RUNTIME_PAUSED_AFTER_START`
- `NOTIFICATION_ADMIN_EMAILS_LOBBY_APPLICATION_SUBMITTED`

Telemetry:

- `OTEL_SERVICE_NAME`
- `OTEL_TRACES_EXPORTER`
- `OTEL_METRICS_EXPORTER`
- `OTEL_EXPORTER_OTLP_PROTOCOL`
- `OTEL_EXPORTER_OTLP_TRACES_PROTOCOL`
- `OTEL_EXPORTER_OTLP_METRICS_PROTOCOL`
- `NOTIFICATION_OTEL_STDOUT_TRACES_ENABLED`
- `NOTIFICATION_OTEL_STDOUT_METRICS_ENABLED`

## Runtime Notes

- `Notification Service` does not create or own notification audiences; it
  trusts producers to publish concrete user recipients.
- Administrator recipients are type-specific configuration, not a global list.
- A missing user is treated as a producer input defect.
- A temporary `User Service` outage pauses stream progress for the affected
  entry and allows replay after restart.
- Go producers use `galaxy/notificationintent` to build compatible intents.
- Producers append intents with plain `XADD`; producer-side publish failure is
  notification degradation and must not roll back already committed source
  business state.
- Dead-letter replay is performed by publishing a new compatible intent with a
  new `idempotency_key`.
