# Runtime and Components

The diagram below focuses on the deployed `galaxy/mail` process and its runtime
dependencies.
```mermaid
flowchart LR
subgraph Callers
Auth["Auth / Session Service"]
Notify["Notification Service"]
Ops["Trusted operators"]
end
subgraph Mail["Mail Service process"]
InternalHTTP["Trusted internal HTTP listener\n/api/v1/internal/*"]
Consumer["Redis Stream command consumer"]
Scheduler["Attempt scheduler"]
Workers["Attempt worker pool"]
Cleanup["SQL retention worker"]
Services["Application services"]
Templates["Immutable template catalog"]
Telemetry["Logs, traces, metrics"]
end
Redis["Redis\nstate + streams + indexes"]
Provider["SMTP or stub provider"]
Auth --> InternalHTTP
Ops --> InternalHTTP
Notify --> Redis
InternalHTTP --> Services
Consumer --> Services
Scheduler --> Services
Workers --> Services
Cleanup --> Services
Services --> Templates
Services --> Redis
Services --> Provider
InternalHTTP --> Telemetry
Consumer --> Telemetry
Scheduler --> Telemetry
Workers --> Telemetry
```
## Listener

`mail` exposes exactly one HTTP listener:

| Listener | Default addr | Purpose |
| --- | --- | --- |
| Internal HTTP | `:8080` | Trusted intake, operator reads, and resend |

Shared listener defaults:

- read-header timeout: `2s`
- read timeout: `10s`
- idle timeout: `1m`

Intentional omissions:

- no public listener
- no `/healthz`
- no `/readyz`
- no `/metrics`
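The listener defaults above map directly onto the timeout fields of Go's `net/http.Server`. The following is a minimal sketch under that assumption; `newInternalServer` and the placeholder handler are hypothetical names, not taken from the `mail` codebase.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newInternalServer is a hypothetical constructor that applies the
// documented listener defaults to a standard http.Server.
func newInternalServer(addr string, handler http.Handler) *http.Server {
	return &http.Server{
		Addr:              addr,
		Handler:           handler,
		ReadHeaderTimeout: 2 * time.Second,
		ReadTimeout:       10 * time.Second,
		IdleTimeout:       time.Minute,
	}
}

func main() {
	mux := http.NewServeMux()
	// Only the internal prefix is registered; /healthz, /readyz, and
	// /metrics are intentionally absent. The handler body is a placeholder.
	mux.HandleFunc("/api/v1/internal/", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusNotImplemented)
	})

	srv := newInternalServer(":8080", mux)
	// ListenAndServe is omitted so the sketch exits immediately.
	fmt.Println(srv.Addr, srv.ReadHeaderTimeout, srv.ReadTimeout, srv.IdleTimeout)
}
```

In a real process the three durations would presumably come from the `MAIL_INTERNAL_HTTP_*` variables rather than from literals.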
## Startup Wiring

`cmd/mail` loads config, constructs logging, and builds the runtime through
`internal/app.NewRuntime`.

The runtime wires:

- Redis clients for state access and blocking stream consumption
- filesystem-backed template catalog
- provider adapter selected by `MAIL_SMTP_MODE`
- acceptance, render, execution, operator-read, and resend services
- internal HTTP server
- command consumer
- scheduler
- attempt worker pool
- cleanup worker

Before startup completes, the process performs bounded `PING` checks for both
Redis clients and validates the template catalog. Startup fails fast on invalid
configuration or unavailable Redis.
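A bounded startup check of this kind can be sketched with a per-client deadline. Everything named here is an assumption made for illustration: the `pinger` interface, `checkDependencies`, and `fakeClient` are hypothetical, and a real Redis client would be adapted to the interface rather than faked.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// pinger is a hypothetical interface; a real Redis client would satisfy it
// through a thin adapter around its own PING call.
type pinger interface {
	Ping(ctx context.Context) error
}

// checkDependencies pings each client under its own deadline and returns
// the first failure so that startup can abort early.
func checkDependencies(timeout time.Duration, clients map[string]pinger) error {
	for name, client := range clients {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		err := client.Ping(ctx)
		cancel()
		if err != nil {
			return fmt.Errorf("startup check %q failed: %w", name, err)
		}
	}
	return nil
}

// fakeClient stands in for a Redis client and answers after a fixed latency.
type fakeClient struct{ latency time.Duration }

func (f fakeClient) Ping(ctx context.Context) error {
	select {
	case <-time.After(f.latency):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	clients := map[string]pinger{
		"state":  fakeClient{latency: time.Millisecond},
		"stream": fakeClient{latency: time.Second}, // simulates an unreachable Redis
	}
	if err := checkDependencies(50*time.Millisecond, clients); err != nil {
		fmt.Println("startup aborted:", err)
		return
	}
	fmt.Println("dependencies reachable")
}
```

The deadline is what keeps the check bounded: an unreachable dependency costs at most one timeout per client instead of hanging the process.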
## Background Components

### Command consumer

- reads a single stream with plain `XREAD` (no consumer group)
- starts from the stored offset, or from `0-0` when no offset is stored
- advances the offset only after durable command acceptance or durable
  malformed-command recording
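The offset rule above is what gives the consumer at-least-once behavior: an entry that was not durably recorded is read again. The sketch below models this with an in-memory stand-in. The `store` type, the `failNext` switch, and the rule that an empty payload counts as malformed are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// entry is one stream record; ID mirrors a Redis stream entry ID.
type entry struct {
	ID      string
	Payload string
}

// store is a hypothetical in-memory stand-in for the durable state.
type store struct {
	offset    string
	accepted  []string
	malformed []string
	failNext  bool // simulates one transient persistence failure
}

var errUnavailable = errors.New("store unavailable")

// persist durably records the entry as either accepted or malformed.
// Treating an empty payload as malformed is an assumption of this sketch.
func (s *store) persist(e entry) error {
	if s.failNext {
		s.failNext = false
		return errUnavailable
	}
	if e.Payload == "" {
		s.malformed = append(s.malformed, e.ID)
	} else {
		s.accepted = append(s.accepted, e.ID)
	}
	return nil
}

// startOffset returns the stored offset, or "0-0" when none exists yet.
func startOffset(s *store) string {
	if s.offset == "" {
		return "0-0"
	}
	return s.offset
}

// consume handles a batch in order and advances the offset only after each
// entry has been durably recorded. On failure it stops, so the same entry
// is delivered again by the next read.
func consume(s *store, batch []entry) error {
	for _, e := range batch {
		if err := s.persist(e); err != nil {
			return err
		}
		s.offset = e.ID
	}
	return nil
}

func main() {
	s := &store{}
	fmt.Println("start from:", startOffset(s))
	batch := []entry{{ID: "1-0", Payload: "send-mail"}, {ID: "2-0", Payload: ""}}
	if err := consume(s, batch); err != nil {
		fmt.Println("stopped:", err)
	}
	fmt.Println("offset:", s.offset, "accepted:", s.accepted, "malformed:", s.malformed)
}
```

Because a failed entry may be replayed, the acceptance path has to tolerate duplicates, which is presumably one role of `MAIL_IDEMPOTENCY_TTL`.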
### Scheduler

- polls due work every `250ms`
- recovers stale claims every `30s`
- derives the stale-claim recovery deadline as `MAIL_SMTP_TIMEOUT` + `30s`
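The two intervals and the derived deadline can be sketched as follows. The function names are hypothetical, the `15s` timeout is an example value, and reading the recovery deadline as "how long a claim may be held before it counts as stale" is one plausible interpretation rather than confirmed behavior.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

const (
	duePollInterval  = 250 * time.Millisecond // how often due work is polled
	recoveryInterval = 30 * time.Second       // how often stale claims are recovered
	recoveryGrace    = 30 * time.Second       // slack added on top of MAIL_SMTP_TIMEOUT
)

// recoveryDeadline is how long a claim may be held before it counts as stale.
func recoveryDeadline(smtpTimeout time.Duration) time.Duration {
	return smtpTimeout + recoveryGrace
}

// isStale reports whether a claim held for heldFor should be recovered.
func isStale(heldFor, smtpTimeout time.Duration) bool {
	return heldFor > recoveryDeadline(smtpTimeout)
}

// runScheduler drives both periodic duties until stop is closed.
func runScheduler(stop <-chan struct{}, pollDue, recoverStale func()) {
	due := time.NewTicker(duePollInterval)
	defer due.Stop()
	recovery := time.NewTicker(recoveryInterval)
	defer recovery.Stop()
	for {
		select {
		case <-stop:
			return
		case <-due.C:
			pollDue()
		case <-recovery.C:
			recoverStale()
		}
	}
}

func main() {
	smtpTimeout := 15 * time.Second // assumed value of MAIL_SMTP_TIMEOUT
	fmt.Println("recovery deadline:", recoveryDeadline(smtpTimeout))
	fmt.Println("40s claim stale:", isStale(40*time.Second, smtpTimeout))
	fmt.Println("50s claim stale:", isStale(50*time.Second, smtpTimeout))

	// Run the loop briefly to show the due-work poll firing.
	ctx, cancel := context.WithTimeout(context.Background(), 600*time.Millisecond)
	defer cancel()
	polls := 0
	runScheduler(ctx.Done(), func() { polls++ }, func() {})
	fmt.Println("due polls in 600ms:", polls)
}
```

Tying the deadline to the SMTP timeout means a worker that is still inside a legitimate send attempt is never treated as stale.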
### Attempt worker pool

- processes only already claimed work items
- concurrency is controlled by `MAIL_ATTEMPT_WORKER_CONCURRENCY`
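A fixed-size pool that drains already claimed items is commonly written with a channel and a `sync.WaitGroup`. This is a sketch of that general pattern, not the service's implementation: `workItem`, `runPool`, and the fallback to a single worker are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// workItem is a hypothetical handle to one already claimed attempt.
type workItem struct{ DeliveryID string }

// runPool starts `concurrency` workers that drain the claimed channel and
// returns once the channel is closed and every worker has finished.
func runPool(concurrency int, claimed <-chan workItem, execute func(workItem)) {
	if concurrency < 1 {
		concurrency = 1 // assumption: fall back to one worker on bad input
	}
	var wg sync.WaitGroup
	for i := 0; i < concurrency; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for item := range claimed {
				execute(item)
			}
		}()
	}
	wg.Wait()
}

func main() {
	ids := []string{"d-1", "d-2", "d-3", "d-4"}
	claimed := make(chan workItem, len(ids))
	for _, id := range ids {
		claimed <- workItem{DeliveryID: id}
	}
	close(claimed)

	results := make(chan string, len(ids))
	// 2 stands in for MAIL_ATTEMPT_WORKER_CONCURRENCY.
	runPool(2, claimed, func(item workItem) { results <- item.DeliveryID })
	close(results)
	for id := range results {
		fmt.Println("executed", id)
	}
}
```

The workers never claim anything themselves. They only receive items that the scheduler has already claimed, which matches the split described above.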
### SQL retention worker

- periodically deletes `deliveries` rows whose retention window has elapsed;
  the delete cascades to `attempts`, `dead_letters`, `delivery_payloads`, and
  `delivery_recipients`
- periodically deletes expired `malformed_commands` rows
- runs an immediate first pass at startup, then on `MAIL_CLEANUP_INTERVAL`
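The "immediate first pass, then on an interval" schedule can be sketched with a ticker. `runRetention` is a hypothetical name, and the `pass` callback stands in for the SQL deletes, which are not shown because the table schema is not part of this document.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runRetention performs one pass immediately and then one per interval
// until stop is closed. pass is where expired rows would be deleted.
func runRetention(stop <-chan struct{}, interval time.Duration, pass func()) {
	pass() // immediate first pass at startup
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			pass()
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 120*time.Millisecond)
	defer cancel()
	passes := 0
	// 50ms stands in for MAIL_CLEANUP_INTERVAL so the demo finishes quickly.
	runRetention(ctx.Done(), 50*time.Millisecond, func() { passes++ })
	fmt.Println("passes:", passes)
}
```

Running the first pass before the ticker starts means a freshly restarted process does not wait a full interval before cleaning up rows that expired while it was down.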
## Configuration Groups

Required for all starts:

- `MAIL_REDIS_MASTER_ADDR`
- `MAIL_REDIS_PASSWORD`
- `MAIL_POSTGRES_PRIMARY_DSN`
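A fail-fast check for the required variables above might look like the sketch below. `missingRequired` is a hypothetical helper, and treating an empty or whitespace-only value as missing is an assumption. The lookup function is injected so the check can run against `os.Getenv` in the process and against a map in tests.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// requiredVars lists the variables documented as required for all starts.
var requiredVars = []string{
	"MAIL_REDIS_MASTER_ADDR",
	"MAIL_REDIS_PASSWORD",
	"MAIL_POSTGRES_PRIMARY_DSN",
}

// missingRequired returns the required variables for which lookup yields an
// empty or whitespace-only value.
func missingRequired(lookup func(string) string) []string {
	var missing []string
	for _, name := range requiredVars {
		if strings.TrimSpace(lookup(name)) == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	if missing := missingRequired(os.Getenv); len(missing) > 0 {
		// A real process would exit non-zero here; the sketch only reports.
		fmt.Println("missing required configuration:", strings.Join(missing, ", "))
		return
	}
	fmt.Println("required configuration present")
}
```

Reporting every missing variable at once, rather than stopping at the first, saves a restart cycle per variable during initial deployment.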
Core process config:

- `MAIL_SHUTDOWN_TIMEOUT`
- `MAIL_LOG_LEVEL`

Internal HTTP config:

- `MAIL_INTERNAL_HTTP_ADDR`
- `MAIL_INTERNAL_HTTP_READ_HEADER_TIMEOUT`
- `MAIL_INTERNAL_HTTP_READ_TIMEOUT`
- `MAIL_INTERNAL_HTTP_IDLE_TIMEOUT`

Redis connectivity (`pkg/redisconn` shape):

- `MAIL_REDIS_MASTER_ADDR`
- `MAIL_REDIS_REPLICA_ADDRS`
- `MAIL_REDIS_PASSWORD`
- `MAIL_REDIS_DB`
- `MAIL_REDIS_OPERATION_TIMEOUT`
- `MAIL_REDIS_COMMAND_STREAM`

PostgreSQL connectivity (`pkg/postgres` shape):

- `MAIL_POSTGRES_PRIMARY_DSN`
- `MAIL_POSTGRES_REPLICA_DSNS`
- `MAIL_POSTGRES_OPERATION_TIMEOUT`
- `MAIL_POSTGRES_MAX_OPEN_CONNS`
- `MAIL_POSTGRES_MAX_IDLE_CONNS`
- `MAIL_POSTGRES_CONN_MAX_LIFETIME`

SMTP provider:

- `MAIL_SMTP_MODE`
- `MAIL_SMTP_ADDR`
- `MAIL_SMTP_USERNAME`
- `MAIL_SMTP_PASSWORD`
- `MAIL_SMTP_FROM_EMAIL`
- `MAIL_SMTP_FROM_NAME`
- `MAIL_SMTP_TIMEOUT`
- `MAIL_SMTP_INSECURE_SKIP_VERIFY`

Templates and workers:

- `MAIL_TEMPLATE_DIR`
- `MAIL_ATTEMPT_WORKER_CONCURRENCY`
- `MAIL_STREAM_BLOCK_TIMEOUT`
- `MAIL_OPERATOR_REQUEST_TIMEOUT`
- `MAIL_IDEMPOTENCY_TTL`
- `MAIL_DELIVERY_RETENTION`
- `MAIL_MALFORMED_COMMAND_RETENTION`
- `MAIL_CLEANUP_INTERVAL`

Telemetry:

- `OTEL_SERVICE_NAME`
- `OTEL_TRACES_EXPORTER`
- `OTEL_METRICS_EXPORTER`
- `OTEL_EXPORTER_OTLP_PROTOCOL`
- `OTEL_EXPORTER_OTLP_TRACES_PROTOCOL`
- `OTEL_EXPORTER_OTLP_METRICS_PROTOCOL`
- `MAIL_OTEL_STDOUT_TRACES_ENABLED`
- `MAIL_OTEL_STDOUT_METRICS_ENABLED`
## Runtime Notes

- `MAIL_REDIS_COMMAND_STREAM` is the only Redis key override that currently
  changes runtime behavior; durable mail state otherwise lives in PostgreSQL
- `MAIL_SMTP_INSECURE_SKIP_VERIFY` is a local-development escape hatch for
self-signed SMTP capture only and should remain disabled in production
- the SQL retention worker is the only periodic durable cleanup; PostgreSQL
indexes are maintained by the engine
- template catalog parsing is eager and immutable
- auth deliveries in `MAIL_SMTP_MODE=stub` surface as `suppressed`
- auth deliveries in `MAIL_SMTP_MODE=smtp` surface as `queued` and later move
through normal attempt execution
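The last two notes amount to a small mapping from provider mode to the initial status of an auth delivery. The sketch below captures only that mapping. The type and function names are hypothetical, and rejecting any other mode is an assumption based on the fail-fast handling of invalid configuration.

```go
package main

import "fmt"

// smtpMode mirrors the two documented values of MAIL_SMTP_MODE.
type smtpMode string

const (
	modeStub smtpMode = "stub"
	modeSMTP smtpMode = "smtp"
)

// initialAuthStatus returns the status an auth delivery surfaces with right
// after acceptance. Rejecting unknown modes is an assumption of this sketch.
func initialAuthStatus(mode smtpMode) (string, error) {
	switch mode {
	case modeStub:
		return "suppressed", nil
	case modeSMTP:
		return "queued", nil
	default:
		return "", fmt.Errorf("unsupported MAIL_SMTP_MODE %q", mode)
	}
}

func main() {
	for _, mode := range []smtpMode{modeStub, modeSMTP, "sendgrid"} {
		status, err := initialAuthStatus(mode)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", mode, status)
	}
}
```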