# Runtime and Components

The diagram below focuses on the deployed `galaxy/notification` process and
its runtime dependencies.

```mermaid
flowchart LR
    subgraph Producers
        GM["Game Master"]
        Lobby["Game Lobby"]
        Geo["Geo Profile Service"]
    end

    subgraph Notify["Notification Service process"]
        Probe["Private probe HTTP listener\n/healthz /readyz"]
        Consumer["Notification intent consumer"]
        Accept["Intent acceptance service"]
        Push["Push route publisher"]
        Email["Email route publisher"]
        Telemetry["Logs, traces, metrics"]
    end

    User["User Service"]
    Gateway["Edge Gateway\nclient-event stream consumer"]
    Mail["Mail Service\ncommand stream consumer"]
    Redis["Redis\nstate + streams + schedules"]

    GM --> Redis
    Lobby --> Redis
    Geo --> Redis
    Consumer --> Redis
    Consumer --> Accept
    Accept --> User
    Accept --> Redis
    Push --> Redis
    Email --> Redis
    Push --> Gateway
    Email --> Mail
    Probe --> Telemetry
    Consumer --> Telemetry
    Push --> Telemetry
    Email --> Telemetry
```

## Listener

`notification` exposes exactly one HTTP listener:

| Listener | Default addr | Purpose |
| --- | --- | --- |
| Internal probe HTTP | `:8092` | Private liveness and readiness probes |

Listener defaults:

- read-header timeout: `2s`
- read timeout: `10s`
- idle timeout: `1m`

Probe routes:

- `GET /healthz` returns `{"status":"ok"}`
- `GET /readyz` returns `{"status":"ready"}`
- `/readyz` is process-local after successful startup and does not perform a
  live Redis ping per request

Intentional omissions:

- no public listener
- no operator API
- no `/metrics` route
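
The listener defaults and probe routes above can be sketched with the standard library alone. This is a minimal illustration, not the service's actual code: `newProbeMux`, `newProbeServer`, `staticJSON`, and `probe` are hypothetical names, and the `405` response for non-`GET` methods is an assumption the document does not state.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// staticJSON returns a handler that answers GET with a fixed JSON body.
// Rejecting other methods with 405 is an assumption of this sketch.
func staticJSON(body string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			w.Header().Set("Allow", http.MethodGet)
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, body)
	}
}

// newProbeMux registers only the two probe routes. Neither handler talks to
// Redis, so readiness stays process-local.
func newProbeMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", staticJSON(`{"status":"ok"}`))
	mux.HandleFunc("/readyz", staticJSON(`{"status":"ready"}`))
	return mux
}

// newProbeServer applies the documented default timeouts.
func newProbeServer(addr string) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           newProbeMux(),
		ReadHeaderTimeout: 2 * time.Second,
		ReadTimeout:       10 * time.Second,
		IdleTimeout:       time.Minute,
	}
}

// probe calls a handler in-process and returns the status code and body.
func probe(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	srv := newProbeServer(":8092")
	// Exercise the handler without opening a socket; /metrics is not registered.
	for _, path := range []string{"/healthz", "/readyz", "/metrics"} {
		code, body := probe(srv.Handler, path)
		fmt.Println(path, code, body)
	}
}
```

In a real process the server would be started with `srv.ListenAndServe()` only after startup wiring has succeeded, which is what lets `/readyz` answer without checking dependencies on each request.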

## Startup Wiring

`cmd/notification` loads config, constructs logging, and builds the runtime
through `internal/app.NewRuntime`.

The runtime wires:

- Redis client with startup connectivity check
- `User Service` HTTP client for recipient enrichment
- private probe HTTP server
- plain `XREAD` intent consumer
- `push` route publisher for `Gateway`
- `email` route publisher for `Mail Service`
- Redis-backed accepted-intent, route, idempotency, malformed-intent,
  dead-letter, stream-offset, and schedule stores
- OpenTelemetry traces and metrics exporters

Startup fails fast on invalid configuration or unavailable Redis.
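
Failing fast on invalid configuration could look like the sketch below. It is illustrative only: the `Config` struct, `loadConfig`, and the choice to reject non-positive durations are assumptions, and only a handful of the service's environment variables are covered. The variable names and defaults are the ones listed in this document.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"time"
)

// Config holds a small, illustrative subset of the settings.
type Config struct {
	RedisAddr          string
	UserServiceBaseURL string
	InternalHTTPAddr   string
	ReadHeaderTimeout  time.Duration
	UserServiceTimeout time.Duration
}

// loadConfig reads settings through getenv and reports every problem at once.
func loadConfig(getenv func(string) string) (Config, error) {
	var errs []error

	required := func(name string) string {
		v := getenv(name)
		if v == "" {
			errs = append(errs, fmt.Errorf("%s is required", name))
		}
		return v
	}
	duration := func(name string, def time.Duration) time.Duration {
		v := getenv(name)
		if v == "" {
			return def
		}
		d, err := time.ParseDuration(v)
		// Rejecting zero and negative values is an assumption of this sketch.
		if err != nil || d <= 0 {
			errs = append(errs, fmt.Errorf("%s: invalid duration %q", name, v))
			return def
		}
		return d
	}

	cfg := Config{
		RedisAddr:          required("NOTIFICATION_REDIS_ADDR"),
		UserServiceBaseURL: required("NOTIFICATION_USER_SERVICE_BASE_URL"),
		InternalHTTPAddr:   getenv("NOTIFICATION_INTERNAL_HTTP_ADDR"),
		ReadHeaderTimeout:  duration("NOTIFICATION_INTERNAL_HTTP_READ_HEADER_TIMEOUT", 2*time.Second),
		UserServiceTimeout: duration("NOTIFICATION_USER_SERVICE_TIMEOUT", time.Second),
	}
	if cfg.InternalHTTPAddr == "" {
		cfg.InternalHTTPAddr = ":8092"
	}
	return cfg, errors.Join(errs...)
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		// Exit before any listener or consumer is started.
		fmt.Fprintln(os.Stderr, "invalid configuration:", err)
		os.Exit(1)
	}
	fmt.Printf("probe listener on %s\n", cfg.InternalHTTPAddr)
}
```

Passing `getenv` as a parameter keeps the loader testable without touching the process environment.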

## Background Components

### Intent consumer

- reads one stream with plain `XREAD`, default `notification:intents`
- starts from the stored offset or `0-0`
- advances the offset only after durable acceptance or durable malformed-intent
  recording
- stops without offset advancement when `User Service` enrichment has a
  temporary failure
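
The offset rule above can be modelled without Redis. The following sketch uses hypothetical names (`outcome`, `entry`, `consume`) and an in-memory batch in place of an `XREAD` reply; it only shows when the offset moves and when the consumer stops.

```go
package main

import "fmt"

// outcome is the result of handling one stream entry.
type outcome int

const (
	accepted           outcome = iota // intent durably accepted
	recordedMalformed                 // malformed-intent record durably stored
	enrichmentTempFail                // temporary User Service failure
)

// entry stands in for one stream entry returned by XREAD.
type entry struct {
	ID      string
	Payload string
}

// consume handles entries in order. The offset moves only past entries that
// were durably accepted or durably recorded as malformed. On a temporary
// enrichment failure it stops and leaves the offset on the last durable entry,
// so the failed entry is read again later.
func consume(offset string, batch []entry, handle func(entry) outcome) (string, bool) {
	for _, e := range batch {
		switch handle(e) {
		case accepted, recordedMalformed:
			offset = e.ID
		case enrichmentTempFail:
			return offset, true
		}
	}
	return offset, false
}

func main() {
	batch := []entry{
		{ID: "1-0", Payload: "valid"},
		{ID: "2-0", Payload: "malformed"},
		{ID: "3-0", Payload: "user-service-down"},
		{ID: "4-0", Payload: "valid"},
	}
	handle := func(e entry) outcome {
		switch e.Payload {
		case "malformed":
			return recordedMalformed
		case "user-service-down":
			return enrichmentTempFail
		default:
			return accepted
		}
	}
	offset, stopped := consume("0-0", batch, handle)
	fmt.Println("offset:", offset, "stopped:", stopped)
}
```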

### Acceptance service

- validates the normalized intent envelope
- applies idempotency rules for `(producer, idempotency_key)`
- enriches user-targeted recipients before durable route write
- materializes route slots for `push` and `email`
- stores malformed-intent records for invalid payloads, idempotency conflicts,
  and unresolved users
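
One plausible reading of the idempotency rule is sketched below. The document does not spell out what counts as a conflict, so this example assumes that a repeated `(producer, idempotency_key)` with an identical payload is a harmless duplicate and that the same key with a different payload is a conflict. The in-memory map stands in for the Redis-backed idempotency store, and all type and function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// idemKey is the idempotency scope named in the text.
type idemKey struct{ Producer, Key string }

type decision string

const (
	acceptNew decision = "accept"
	duplicate decision = "duplicate" // same key, same payload (assumed no-op)
	conflict  decision = "conflict"  // same key, different payload (assumed)
)

// idemStore remembers a payload fingerprint per key.
type idemStore struct {
	seen map[idemKey][sha256.Size]byte
}

func newIdemStore() *idemStore {
	return &idemStore{seen: make(map[idemKey][sha256.Size]byte)}
}

// check classifies an intent and records the fingerprint on first sight.
func (s *idemStore) check(producer, key string, payload []byte) decision {
	k := idemKey{Producer: producer, Key: key}
	fp := sha256.Sum256(payload)
	prev, ok := s.seen[k]
	switch {
	case !ok:
		s.seen[k] = fp
		return acceptNew
	case prev == fp:
		return duplicate
	default:
		return conflict
	}
}

func main() {
	store := newIdemStore()
	fmt.Println(store.check("lobby", "k1", []byte(`{"n":1}`)))
	fmt.Println(store.check("lobby", "k1", []byte(`{"n":1}`)))
	fmt.Println(store.check("lobby", "k1", []byte(`{"n":2}`)))
	fmt.Println(store.check("geo", "k1", []byte(`{"n":2}`)))
}
```

Scoping the key by producer means two producers can reuse the same `idempotency_key` value without colliding.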

### Push publisher

- scans `notification:route_schedule`
- processes only scheduled route IDs beginning with `push:`
- coordinates replicas with temporary route leases
- publishes Gateway client events with `XADD MAXLEN ~`
- omits `device_session_id` so Gateway fans out to all active streams for the
  target user
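
The prefix filter and the shape of the publish command can be sketched as plain string handling. The helper names, the stream name `gateway:client_events`, and the `user_id` and `event_type` fields are hypothetical; only the `push:` prefix, the `XADD ... MAXLEN ~` form, and the omission of `device_session_id` come from the text above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// routesForChannel keeps only scheduled route IDs with the given prefix,
// such as "push:" or "email:".
func routesForChannel(prefix string, scheduled []string) []string {
	var out []string
	for _, id := range scheduled {
		if strings.HasPrefix(id, prefix) {
			out = append(out, id)
		}
	}
	return out
}

// field is one name/value pair of a stream entry.
type field struct{ Name, Value string }

// gatewayXAddArgs builds the argument list for
//
//	XADD <stream> MAXLEN ~ <maxLen> * <field> <value> ...
//
// The "~" asks Redis for approximate trimming. device_session_id is never
// emitted, so the consumer can fan out per user.
func gatewayXAddArgs(stream string, maxLen int64, fields []field) []string {
	args := []string{"XADD", stream, "MAXLEN", "~", strconv.FormatInt(maxLen, 10), "*"}
	for _, f := range fields {
		if f.Name == "device_session_id" {
			continue
		}
		args = append(args, f.Name, f.Value)
	}
	return args
}

func main() {
	scheduled := []string{"push:r1", "email:r2", "push:r3"}
	fmt.Println(routesForChannel("push:", scheduled))

	args := gatewayXAddArgs("gateway:client_events", 10000, []field{
		{Name: "user_id", Value: "u-42"},
		{Name: "event_type", Value: "notification"},
	})
	fmt.Println(strings.Join(args, " "))
}
```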

### Email publisher

- scans `notification:route_schedule`
- processes only scheduled route IDs beginning with `email:`
- coordinates replicas with temporary route leases
- publishes Mail Service generic commands with plain `XADD`
- always uses `payload_mode=template`
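
Both publishers coordinate replicas with temporary route leases. The sketch below models that idea in memory so the expiry behaviour is easy to see. It is not the service's implementation: the real leases live in Redis, and the `leaseTable` type, the replica names, and the 30-second TTL are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// lease records which replica currently owns a route and until when.
type lease struct {
	owner   string
	expires time.Time
}

// leaseTable is an in-memory stand-in for a shared lease store.
type leaseTable struct {
	mu     sync.Mutex
	leases map[string]lease
}

func newLeaseTable() *leaseTable {
	return &leaseTable{leases: make(map[string]lease)}
}

// tryAcquire grants the lease when the route is free, when the previous lease
// has expired, or when the caller already owns it (which renews the lease).
func (t *leaseTable) tryAcquire(routeID, owner string, now time.Time, ttl time.Duration) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if cur, held := t.leases[routeID]; held && cur.owner != owner && now.Before(cur.expires) {
		return false
	}
	t.leases[routeID] = lease{owner: owner, expires: now.Add(ttl)}
	return true
}

func main() {
	table := newLeaseTable()
	now := time.Now()
	ttl := 30 * time.Second // hypothetical NOTIFICATION_ROUTE_LEASE_TTL value

	fmt.Println(table.tryAcquire("email:r1", "replica-a", now, ttl))
	// A second replica is refused while the lease is live.
	fmt.Println(table.tryAcquire("email:r1", "replica-b", now.Add(time.Second), ttl))
	// After the TTL has passed, another replica may take over the route.
	fmt.Println(table.tryAcquire("email:r1", "replica-b", now.Add(time.Minute), ttl))
}
```

Because a lease expires on its own, a replica that crashes mid-publish does not block the route forever; another replica picks it up once the TTL has elapsed.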

## Configuration Groups

Required:

- `NOTIFICATION_REDIS_ADDR`
- `NOTIFICATION_USER_SERVICE_BASE_URL`

Core process config:

- `NOTIFICATION_SHUTDOWN_TIMEOUT`
- `NOTIFICATION_LOG_LEVEL`

Internal HTTP config:

- `NOTIFICATION_INTERNAL_HTTP_ADDR` with default `:8092`
- `NOTIFICATION_INTERNAL_HTTP_READ_HEADER_TIMEOUT` with default `2s`
- `NOTIFICATION_INTERNAL_HTTP_READ_TIMEOUT` with default `10s`
- `NOTIFICATION_INTERNAL_HTTP_IDLE_TIMEOUT` with default `1m`

Redis connectivity:

- `NOTIFICATION_REDIS_USERNAME`
- `NOTIFICATION_REDIS_PASSWORD`
- `NOTIFICATION_REDIS_DB`
- `NOTIFICATION_REDIS_TLS_ENABLED`
- `NOTIFICATION_REDIS_OPERATION_TIMEOUT`
- `NOTIFICATION_INTENTS_STREAM`
- `NOTIFICATION_INTENTS_READ_BLOCK_TIMEOUT`
- `NOTIFICATION_GATEWAY_CLIENT_EVENTS_STREAM`
- `NOTIFICATION_GATEWAY_CLIENT_EVENTS_STREAM_MAX_LEN`
- `NOTIFICATION_MAIL_DELIVERY_COMMANDS_STREAM`

Retry and retention:

- `NOTIFICATION_PUSH_RETRY_MAX_ATTEMPTS`
- `NOTIFICATION_EMAIL_RETRY_MAX_ATTEMPTS`
- `NOTIFICATION_ROUTE_BACKOFF_MIN`
- `NOTIFICATION_ROUTE_BACKOFF_MAX`
- `NOTIFICATION_ROUTE_LEASE_TTL`
- `NOTIFICATION_DEAD_LETTER_TTL`
- `NOTIFICATION_RECORD_TTL`
- `NOTIFICATION_IDEMPOTENCY_TTL`
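
The document names a minimum and a maximum route backoff but not the curve between them. As an illustration only, the sketch below assumes the delay doubles per attempt and is clamped to the configured bounds; the function name `backoff` and the example bounds are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// backoff returns the delay before the given attempt (1-based).
// Doubling per attempt is an assumption of this sketch; the result never
// exceeds hi, and the halving check avoids integer overflow.
func backoff(attempt int, lo, hi time.Duration) time.Duration {
	d := lo
	for i := 1; i < attempt; i++ {
		if d >= hi/2 {
			return hi
		}
		d *= 2
	}
	if d > hi {
		return hi
	}
	return d
}

func main() {
	lo, hi := time.Second, time.Minute // hypothetical BACKOFF_MIN and BACKOFF_MAX
	for attempt := 1; attempt <= 8; attempt++ {
		fmt.Printf("attempt %d: wait %s\n", attempt, backoff(attempt, lo, hi))
	}
}
```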

User enrichment:

- `NOTIFICATION_USER_SERVICE_TIMEOUT` with default `1s`

Administrator routing:

- `NOTIFICATION_ADMIN_EMAILS_GEO_REVIEW_RECOMMENDED`
- `NOTIFICATION_ADMIN_EMAILS_GAME_GENERATION_FAILED`
- `NOTIFICATION_ADMIN_EMAILS_LOBBY_RUNTIME_PAUSED_AFTER_START`
- `NOTIFICATION_ADMIN_EMAILS_LOBBY_APPLICATION_SUBMITTED`

Telemetry:

- `OTEL_SERVICE_NAME`
- `OTEL_TRACES_EXPORTER`
- `OTEL_METRICS_EXPORTER`
- `OTEL_EXPORTER_OTLP_PROTOCOL`
- `OTEL_EXPORTER_OTLP_TRACES_PROTOCOL`
- `OTEL_EXPORTER_OTLP_METRICS_PROTOCOL`
- `NOTIFICATION_OTEL_STDOUT_TRACES_ENABLED`
- `NOTIFICATION_OTEL_STDOUT_METRICS_ENABLED`

## Runtime Notes

- `Notification Service` does not create or own notification audiences; it
  trusts producers to publish concrete user recipients.
- Administrator recipients are type-specific configuration, not a global list.
- A missing user is treated as a producer input defect.
- A temporary `User Service` outage pauses stream progress at the affected
  entry; the entry is replayed after restart.
- Go producers use `galaxy/notificationintent` to build compatible intents.
- Producers append intents with plain `XADD`. A producer-side publish failure
  counts as notification degradation and must not roll back already committed
  source business state.
- Dead-letter replay is performed by publishing a new compatible intent with a
  new `idempotency_key`.
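
The last two notes can be illustrated with a small, dependency-free sketch. It does not use `galaxy/notificationintent`, whose API is not described here; `finishAction`, `replayIntent`, and the map-based intent with a `notification_type` field are hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

var errStreamUnavailable = errors.New("intent stream unavailable")

// finishAction commits business state first and publishes afterwards.
// A publish failure is reported as degradation, never as an error that
// would make the caller undo the committed state.
func finishAction(commit, publish func() error) (degraded bool, err error) {
	if err := commit(); err != nil {
		return false, err
	}
	if err := publish(); err != nil {
		fmt.Println("notification degraded:", err)
		return true, nil
	}
	return false, nil
}

// replayIntent copies a dead-lettered intent and gives the copy a new
// idempotency key, so acceptance treats it as a fresh intent.
func replayIntent(dead map[string]string, newKey string) map[string]string {
	out := make(map[string]string, len(dead))
	for k, v := range dead {
		out[k] = v
	}
	out["idempotency_key"] = newKey
	return out
}

func main() {
	degraded, err := finishAction(
		func() error { return nil },                  // business state committed
		func() error { return errStreamUnavailable }, // XADD failed
	)
	fmt.Println("degraded:", degraded, "err:", err)

	dead := map[string]string{
		"producer":          "lobby",
		"idempotency_key":   "k-1",
		"notification_type": "example", // hypothetical field
	}
	fmt.Println(replayIntent(dead, "k-1-replay")["idempotency_key"])
}
```

Reusing the original `idempotency_key` on replay would likely be caught by the acceptance service's idempotency rules, which is why the replayed intent carries a new one.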