Mail Service
Mail Service is the internal e-mail delivery service of Galaxy.
Canonical contracts:
Purpose
Mail Service owns durable intake, rendering, execution, retry, audit, and
operator recovery for outbound e-mail.
It does not decide whether a business event should become e-mail. That
decision belongs to Notification Service.
Responsibility Boundaries
Mail Service is responsible for:
- direct auth-code mail intake from Auth / Session Service
- async generic mail intake from Notification Service
- validation of recipient envelope, payload shape, locale, and attachments
- deterministic template rendering for template-mode deliveries
- provider execution through stub or smtp
- retry scheduling, dead-letter escalation, and operator-visible audit state
- trusted operator reads and resend by clone creation
Mail Service is not responsible for:
- end-user authentication or authorization
- notification preference ownership
- deciding whether non-auth mail should be sent at all
- direct calls from Geo Profile Service
- hot-reloading templates or editing template catalog state at runtime
Cross-service routing rules:
- Auth / Session Service -> Mail Service is synchronous trusted REST
- Notification Service -> Mail Service is asynchronous Redis Streams
- Geo Profile Service must route optional admin e-mail through Notification Service, not directly to Mail Service
- auth-code delivery remains a direct Auth / Session Service -> Mail Service flow and does not pass through Notification Service
Runtime Surface
cmd/mail starts one internal-only process with:
- one trusted internal HTTP listener on MAIL_INTERNAL_HTTP_ADDR
- one async command consumer
- one attempt scheduler
- one attempt worker pool
- one cleanup worker
The service has no public ingress and no dedicated admin listener.
Intentional runtime omissions:
- no /healthz
- no /readyz
- no /metrics
Operational behavior:
- startup performs bounded Redis connectivity checks and fails fast on invalid runtime configuration
- the template catalog is parsed once at startup and kept immutable for the lifetime of the process
- template changes require process restart
- operator handlers execute under MAIL_OPERATOR_REQUEST_TIMEOUT
Configuration
Required for all starts:
MAIL_REDIS_ADDR
Primary configuration groups:
- process and logging:
  - MAIL_SHUTDOWN_TIMEOUT
  - MAIL_LOG_LEVEL
- internal HTTP:
  - MAIL_INTERNAL_HTTP_ADDR
  - MAIL_INTERNAL_HTTP_READ_HEADER_TIMEOUT
  - MAIL_INTERNAL_HTTP_READ_TIMEOUT
  - MAIL_INTERNAL_HTTP_IDLE_TIMEOUT
- Redis connectivity:
  - MAIL_REDIS_USERNAME
  - MAIL_REDIS_PASSWORD
  - MAIL_REDIS_DB
  - MAIL_REDIS_TLS_ENABLED
  - MAIL_REDIS_OPERATION_TIMEOUT
  - MAIL_REDIS_COMMAND_STREAM
- SMTP provider:
  - MAIL_SMTP_MODE=stub|smtp
  - MAIL_SMTP_ADDR
  - MAIL_SMTP_USERNAME
  - MAIL_SMTP_PASSWORD
  - MAIL_SMTP_FROM_EMAIL
  - MAIL_SMTP_FROM_NAME
  - MAIL_SMTP_TIMEOUT
  - MAIL_SMTP_INSECURE_SKIP_VERIFY
- template catalog:
  - MAIL_TEMPLATE_DIR
- worker and operator behavior:
  - MAIL_ATTEMPT_WORKER_CONCURRENCY
  - MAIL_STREAM_BLOCK_TIMEOUT
  - MAIL_OPERATOR_REQUEST_TIMEOUT
- OpenTelemetry:
  - OTEL_SERVICE_NAME
  - OTEL_TRACES_EXPORTER
  - OTEL_METRICS_EXPORTER
  - OTEL_EXPORTER_OTLP_PROTOCOL
  - OTEL_EXPORTER_OTLP_TRACES_PROTOCOL
  - OTEL_EXPORTER_OTLP_METRICS_PROTOCOL
  - MAIL_OTEL_STDOUT_TRACES_ENABLED
  - MAIL_OTEL_STDOUT_METRICS_ENABLED
Defaults worth knowing:
- MAIL_INTERNAL_HTTP_ADDR=:8080
- MAIL_SMTP_MODE=stub
- MAIL_SMTP_TIMEOUT=15s
- Additional SMTP note:
  - MAIL_SMTP_INSECURE_SKIP_VERIFY=false by default; it is intended only for local self-signed SMTP capture or similar non-production environments
- MAIL_TEMPLATE_DIR=templates
- MAIL_ATTEMPT_WORKER_CONCURRENCY=4
- MAIL_STREAM_BLOCK_TIMEOUT=2s
- MAIL_OPERATOR_REQUEST_TIMEOUT=5s
- MAIL_SHUTDOWN_TIMEOUT=5s
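The required variable and the defaults above could be loaded roughly as follows. This is a minimal sketch, not the service's actual loader: the Config struct, LookupFunc, and LoadConfig are hypothetical names, only a subset of the variables is covered, and it assumes durations use Go's time.ParseDuration syntax (which the documented defaults such as 15s happen to match).

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Config is a hypothetical subset of the documented settings.
type Config struct {
	RedisAddr        string
	InternalHTTPAddr string
	SMTPMode         string
	SMTPTimeout      time.Duration
	OperatorTimeout  time.Duration
}

// LookupFunc has the same shape as os.LookupEnv so callers can inject values.
type LookupFunc func(key string) (string, bool)

// valueOr returns the configured value or the documented default.
func valueOr(lookup LookupFunc, key, def string) string {
	if v, ok := lookup(key); ok && v != "" {
		return v
	}
	return def
}

// durationOr parses a duration value or falls back to the default.
func durationOr(lookup LookupFunc, key string, def time.Duration) (time.Duration, error) {
	v, ok := lookup(key)
	if !ok || v == "" {
		return def, nil
	}
	d, err := time.ParseDuration(v)
	if err != nil {
		return 0, fmt.Errorf("%s: %w", key, err)
	}
	return d, nil
}

// LoadConfig applies defaults and fails fast on invalid configuration.
func LoadConfig(lookup LookupFunc) (Config, error) {
	cfg := Config{
		RedisAddr:        valueOr(lookup, "MAIL_REDIS_ADDR", ""),
		InternalHTTPAddr: valueOr(lookup, "MAIL_INTERNAL_HTTP_ADDR", ":8080"),
		SMTPMode:         valueOr(lookup, "MAIL_SMTP_MODE", "stub"),
	}
	if cfg.RedisAddr == "" {
		return Config{}, errors.New("MAIL_REDIS_ADDR is required")
	}
	if cfg.SMTPMode != "stub" && cfg.SMTPMode != "smtp" {
		return Config{}, fmt.Errorf("MAIL_SMTP_MODE: unsupported value %q", cfg.SMTPMode)
	}
	var err error
	if cfg.SMTPTimeout, err = durationOr(lookup, "MAIL_SMTP_TIMEOUT", 15*time.Second); err != nil {
		return Config{}, err
	}
	if cfg.OperatorTimeout, err = durationOr(lookup, "MAIL_OPERATOR_REQUEST_TIMEOUT", 5*time.Second); err != nil {
		return Config{}, err
	}
	return cfg, nil
}

func main() {
	// A real process would pass os.LookupEnv; a map keeps the example hermetic.
	env := map[string]string{"MAIL_REDIS_ADDR": "localhost:6379"}
	cfg, err := LoadConfig(func(k string) (string, bool) { v, ok := env[k]; return v, ok })
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Injecting the lookup function keeps the loader testable without mutating the process environment.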
Current implementation caveats:
- MAIL_REDIS_COMMAND_STREAM is effective for the async command consumer
- MAIL_REDIS_ATTEMPT_SCHEDULE_KEY and MAIL_REDIS_DEAD_LETTER_PREFIX are parsed but the Redis adapters still use the fixed keys mail:attempt_schedule and mail:dead_letters:<delivery_id>
- MAIL_IDEMPOTENCY_TTL, MAIL_DELIVERY_TTL, and MAIL_ATTEMPT_TTL are parsed but the Redis adapters still enforce fixed retentions of 7d, 30d, and 90d
Stable Input Contracts
1. Auth delivery REST
Route:
POST /api/v1/internal/login-code-deliveries
Headers:
- required Idempotency-Key
Request body:
- email
- code
- locale
Stable success outcomes:
- sent
- suppressed
Important semantics:
- sent means the request was durably accepted into the internal mail-delivery pipeline
- sent does not mean that SMTP delivery has already completed
- new durable auth deliveries surface as:
  - queued in MAIL_SMTP_MODE=smtp
  - suppressed in MAIL_SMTP_MODE=stub
- duplicate replays with the same normalized request return the same stable outcome
- mismatched replays on the same (source, idempotency_key) return 409 conflict
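A caller of this route might look like the sketch below. The route, the Idempotency-Key header, the body fields, and the 409 conflict come from the contract above; everything else is an assumption made for illustration: the response shape {"outcome": "..."}, the 200 success status, and the names SendLoginCode, newFakeMailService, and runDemo. The fake server is a test double that only imitates the replay rules, not the real service.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
)

// ErrIdempotencyConflict reports a mismatched replay (HTTP 409).
var ErrIdempotencyConflict = errors.New("idempotency conflict")

// LoginCodeRequest mirrors the documented request body fields.
type LoginCodeRequest struct {
	Email  string `json:"email"`
	Code   string `json:"code"`
	Locale string `json:"locale"`
}

// SendLoginCode posts one request and returns the outcome string.
// Assumption: success is 200 with a JSON body {"outcome": "sent"|"suppressed"}.
func SendLoginCode(ctx context.Context, baseURL, key string, in LoginCodeRequest) (string, error) {
	body, err := json.Marshal(in)
	if err != nil {
		return "", err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		baseURL+"/api/v1/internal/login-code-deliveries", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Idempotency-Key", key)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode == http.StatusConflict {
		return "", ErrIdempotencyConflict
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	var out struct {
		Outcome string `json:"outcome"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Outcome, nil
}

// newFakeMailService is a test double that imitates the documented replay rules.
func newFakeMailService() *httptest.Server {
	var mu sync.Mutex
	seen := map[string]string{} // idempotency key -> first request body
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		key := r.Header.Get("Idempotency-Key")
		raw, _ := io.ReadAll(r.Body)
		if key == "" {
			http.Error(w, "missing Idempotency-Key", http.StatusBadRequest)
			return
		}
		mu.Lock()
		prev, replay := seen[key]
		if !replay {
			seen[key] = string(raw)
		}
		mu.Unlock()
		if replay && prev != string(raw) {
			http.Error(w, "conflict", http.StatusConflict)
			return
		}
		fmt.Fprint(w, `{"outcome":"sent"}`)
	}))
}

// runDemo sends a request, replays it, then replays it with a different code.
func runDemo() (first, replay string, conflict error) {
	srv := newFakeMailService()
	defer srv.Close()
	ctx := context.Background()
	req := LoginCodeRequest{Email: "user@example.com", Code: "123456", Locale: "en"}
	first, _ = SendLoginCode(ctx, srv.URL, "key-1", req)
	replay, _ = SendLoginCode(ctx, srv.URL, "key-1", req)
	req.Code = "654321" // same key, different payload
	_, conflict = SendLoginCode(ctx, srv.URL, "key-1", req)
	return first, replay, conflict
}

func main() {
	first, replay, conflict := runDemo()
	fmt.Println(first, replay, conflict)
}
```

Because an identical replay returns the same outcome, a caller can retry with the same key after a timeout without risking a second delivery.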
2. Async generic command intake
Ingress stream:
mail:delivery_commands
Stable envelope fields:
- delivery_id
- source
- payload_mode
- idempotency_key
- requested_at_ms
- request_id
- trace_id
- payload_json
Contract rules:
- async source is fixed to notification
- supported payload_mode values are rendered and template
- Notification Service uses only payload_mode=template for notification-generated mail, even though the generic async contract keeps both rendered and template
- notification-owned template_id values are identical to the notification_type vocabulary, for example game.turn.ready and lobby.membership.approved
- the real Notification Service -> Mail Service integration suite verifies template-mode handoff for notification-owned mail
- requested_at_ms stores the publisher-side original request timestamp in Unix milliseconds
- request_id and trace_id are observability-only metadata and do not participate in idempotency fingerprinting
- malformed commands are metered, logged, and recorded as dedicated malformed-command entries
- malformed commands do not create a durable delivery record
- stream offset advances only after durable acceptance or durable malformed-command recording
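The envelope rules above suggest a validation step along these lines. It is a sketch under assumptions: stream entries are modelled as a flat map of strings, ValidateEnvelope is a hypothetical name, and request_id and trace_id are treated as optional because the contract describes them as observability-only. A non-nil error corresponds to the malformed-command path.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// requiredFields lists the envelope fields this sketch treats as mandatory.
var requiredFields = []string{
	"delivery_id", "source", "payload_mode",
	"idempotency_key", "requested_at_ms", "payload_json",
}

// ValidateEnvelope returns nil for a well-formed command envelope.
func ValidateEnvelope(fields map[string]string) error {
	for _, name := range requiredFields {
		if fields[name] == "" {
			return fmt.Errorf("missing field %q", name)
		}
	}
	// The async source is fixed to notification.
	if fields["source"] != "notification" {
		return fmt.Errorf("unsupported source %q", fields["source"])
	}
	switch fields["payload_mode"] {
	case "rendered", "template":
	default:
		return fmt.Errorf("unsupported payload_mode %q", fields["payload_mode"])
	}
	// requested_at_ms is a Unix timestamp in milliseconds.
	ms, err := strconv.ParseInt(fields["requested_at_ms"], 10, 64)
	if err != nil || ms <= 0 {
		return fmt.Errorf("invalid requested_at_ms %q", fields["requested_at_ms"])
	}
	if !json.Valid([]byte(fields["payload_json"])) {
		return fmt.Errorf("payload_json is not valid JSON")
	}
	return nil
}

func main() {
	cmd := map[string]string{
		"delivery_id":     "d-1",
		"source":          "notification",
		"payload_mode":    "template",
		"idempotency_key": "k-1",
		"requested_at_ms": "1700000000000",
		"payload_json":    `{"template_id":"game.turn.ready"}`,
	}
	fmt.Println("valid:", ValidateEnvelope(cmd) == nil)
	cmd["source"] = "auth"
	fmt.Println("after changing source:", ValidateEnvelope(cmd))
}
```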
3. Trusted operator REST
Routes:
- GET /api/v1/internal/deliveries
- GET /api/v1/internal/deliveries/{delivery_id}
- GET /api/v1/internal/deliveries/{delivery_id}/attempts
- POST /api/v1/internal/deliveries/{delivery_id}/resend
List filters:
- recipient
- status
- source
- template_id
- idempotency_key
- from_created_at_ms
- to_created_at_ms
- limit
- cursor
Stable list behavior:
- ordering is created_at_ms DESC, then delivery_id DESC
- cursor is an opaque base64url encoding of created_at_ms:delivery_id
- idempotency_key without source matches across all stable sources
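The cursor format described above can be illustrated with a small encode/decode pair. This is a sketch, not the service's code: EncodeCursor and DecodeCursor are hypothetical names, and the unpadded base64url variant (base64.RawURLEncoding) is an assumption, since the contract does not say whether padding is used. Clients should still treat the cursor as opaque.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// EncodeCursor packs the sort position of the last returned delivery.
func EncodeCursor(createdAtMs int64, deliveryID string) string {
	raw := strconv.FormatInt(createdAtMs, 10) + ":" + deliveryID
	return base64.RawURLEncoding.EncodeToString([]byte(raw))
}

// DecodeCursor reverses EncodeCursor and rejects malformed input.
func DecodeCursor(cursor string) (int64, string, error) {
	raw, err := base64.RawURLEncoding.DecodeString(cursor)
	if err != nil {
		return 0, "", fmt.Errorf("decode cursor: %w", err)
	}
	// Split on the first colon only: the timestamp is digits, the ID may contain colons.
	tsPart, id, ok := strings.Cut(string(raw), ":")
	if !ok || id == "" {
		return 0, "", errors.New("cursor must encode created_at_ms:delivery_id")
	}
	ts, err := strconv.ParseInt(tsPart, 10, 64)
	if err != nil {
		return 0, "", fmt.Errorf("cursor timestamp: %w", err)
	}
	return ts, id, nil
}

func main() {
	c := EncodeCursor(1700000000000, "d-42")
	ts, id, err := DecodeCursor(c)
	fmt.Println(c, ts, id, err)
}
```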
Stable resend rules:
- resend is clone-only
- resend is allowed only for terminal delivery states
- resend creates a new delivery with source=operator_resend
- resend clones preserve audit history of the original instead of mutating it
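The resend rules can be sketched as a pure function over a delivery value. The Delivery struct, the ResendOf field, and CloneForResend are hypothetical. The terminal set (sent, suppressed, failed, dead_letter) and the queued status of the clone are inferences from the status descriptions in this document, not stated rules.

```go
package main

import (
	"errors"
	"fmt"
)

// Delivery is a hypothetical, trimmed-down delivery record.
type Delivery struct {
	ID        string
	Source    string
	Status    string
	Recipient string
	ResendOf  string // hypothetical back-reference to the original delivery
}

// terminalStatuses is inferred from the documented status meanings.
var terminalStatuses = map[string]bool{
	"sent": true, "suppressed": true, "failed": true, "dead_letter": true,
}

// ErrNotTerminal is returned when the original delivery is still in flight.
var ErrNotTerminal = errors.New("resend requires a terminal delivery")

// CloneForResend builds a new delivery and leaves the original untouched.
func CloneForResend(orig Delivery, newID string) (Delivery, error) {
	if !terminalStatuses[orig.Status] {
		return Delivery{}, ErrNotTerminal
	}
	clone := orig // value copy: the original keeps its audit history
	clone.ID = newID
	clone.Source = "operator_resend"
	clone.Status = "queued" // assumption: clones enter the pipeline like new deliveries
	clone.ResendOf = orig.ID
	return clone, nil
}

func main() {
	orig := Delivery{ID: "d-1", Source: "notification", Status: "dead_letter", Recipient: "user@example.com"}
	clone, err := CloneForResend(orig, "d-2")
	fmt.Printf("%+v %v\n", clone, err)
	fmt.Printf("original unchanged: %+v\n", orig)
}
```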
Delivery Model
Source vocabulary
Stable mail_delivery.source values:
- authsession
- notification
- operator_resend
Payload modes
Stable mail_delivery.payload_mode values:
- rendered
- template
Rules:
- rendered stores final subject, text_body, and optional html_body
- template stores template_id, canonical locale, and strict JSON-object template_variables
- raw attachment bodies are stored separately from the delivery audit record
Delivery statuses
Stable operator-visible mail_delivery.status values:
- queued
- rendered
- sending
- sent
- suppressed
- failed
- dead_letter
Status meanings:
- queued: durable intake completed and the next attempt is scheduled
- rendered: template content has been materialized
- sending: one worker currently owns the active attempt
- sent: provider accepted the envelope
- suppressed: delivery was intentionally skipped as a successful business outcome
- failed: terminal failure without dead-letter escalation
- dead_letter: retry budget was exhausted and operator follow-up is required
Stable transition rules:
- newly accepted durable deliveries surface as queued or suppressed
- queued -> rendered is used only for payload_mode=template
- queued|rendered -> sending happens on successful claim
- sending -> sent|suppressed|failed|queued|dead_letter depends on provider classification and retry policy
The internal type delivery.StatusAccepted still exists in code, but it is
not part of the stable public delivery-status vocabulary and is not emitted by
the current runtime.
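The stable transition rules can be expressed as a small guard function. The sketch below encodes only the transitions listed above; the Status type, its constants, and CanTransition are hypothetical names, and the real runtime may apply further checks.

```go
package main

import "fmt"

// Status is a hypothetical type for the stable delivery-status vocabulary.
type Status string

const (
	Queued     Status = "queued"
	Rendered   Status = "rendered"
	Sending    Status = "sending"
	Sent       Status = "sent"
	Suppressed Status = "suppressed"
	Failed     Status = "failed"
	DeadLetter Status = "dead_letter"
)

// CanTransition reports whether from -> to is one of the listed transitions.
func CanTransition(from, to Status, payloadMode string) bool {
	switch from {
	case Queued:
		if to == Rendered {
			// queued -> rendered is used only for template-mode deliveries.
			return payloadMode == "template"
		}
		return to == Sending
	case Rendered:
		return to == Sending
	case Sending:
		switch to {
		case Sent, Suppressed, Failed, Queued, DeadLetter:
			return true
		}
	}
	// sent, suppressed, failed, and dead_letter have no outgoing transitions here.
	return false
}

func main() {
	fmt.Println(CanTransition(Queued, Rendered, "template")) // template deliveries render first
	fmt.Println(CanTransition(Queued, Rendered, "rendered")) // pre-rendered payloads skip this step
	fmt.Println(CanTransition(Sending, Queued, "template"))  // retry path
	fmt.Println(CanTransition(Sent, Queued, "template"))     // terminal state
}
```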
Attempt statuses
Stable mail_attempt.status values:
- scheduled
- in_progress
- render_failed
- provider_accepted
- provider_rejected
- transport_failed
- timed_out
Rules:
- there is at most one active in_progress attempt per delivery
- render_failed means template rendering failed before provider execution
- provider_accepted ends the delivery as sent
- provider_rejected is used for:
  - provider-side suppression ending in suppressed
  - permanent provider failure ending in failed
- retryable paths are expressed through:
  - transport_failed
  - timed_out
Template and Locale Policy
Template layout:
- <template_id>/<locale>/subject.tmpl
- <template_id>/<locale>/text.tmpl
- optional <template_id>/<locale>/html.tmpl
Required auth fallback files:
- auth.login_code/en/subject.tmpl
- auth.login_code/en/text.tmpl
Notification-owned English template directories are frozen by
../notification/README.md and the service-local
Notification Service docs.
auth.login_code remains the required auth template family for the direct
Auth / Session Service -> Mail Service flow and is not part of the
notification-owned template set.
Rendering rules:
- the process loads the full catalog at startup
- exact locale match is attempted first
- the only fallback locale is en
- there are no intermediate reductions such as fr-CA -> fr -> en
- locale_fallback_used=true is stored durably when fallback is applied
- subject and text use text/template
- optional HTML uses html/template
- missing required variables and template lookup failures are classified into stable render-failure codes
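The locale and rendering rules above can be sketched with the standard library. ResolveLocale and RenderText are hypothetical names, the catalog is reduced to a set of available locales, and the missingkey=error option is one plausible way to surface missing variables; the source does not say how the service detects them.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"text/template"
)

const fallbackLocale = "en"

// ResolveLocale tries the exact locale first, then en, with no intermediate reduction.
// The second result mirrors the locale_fallback_used flag.
func ResolveLocale(available map[string]bool, requested string) (string, bool, error) {
	if available[requested] {
		return requested, false, nil
	}
	if available[fallbackLocale] {
		return fallbackLocale, true, nil
	}
	return "", false, errors.New("template has no en fallback")
}

// RenderText renders a subject or text template and fails on missing variables.
func RenderText(name, source string, vars map[string]interface{}) (string, error) {
	tmpl, err := template.New(name).Option("missingkey=error").Parse(source)
	if err != nil {
		return "", fmt.Errorf("parse %s: %w", name, err)
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, vars); err != nil {
		return "", fmt.Errorf("render %s: %w", name, err)
	}
	return buf.String(), nil
}

func main() {
	locales := map[string]bool{"en": true, "fr": true}
	// fr-CA is not reduced to fr; it falls back straight to en.
	locale, fallbackUsed, err := ResolveLocale(locales, "fr-CA")
	fmt.Println(locale, fallbackUsed, err)

	subject, err := RenderText("subject", "Your code is {{.code}}", map[string]interface{}{"code": "123456"})
	fmt.Println(subject, err)

	_, err = RenderText("subject", "Your code is {{.code}}", map[string]interface{}{})
	fmt.Println("missing variable error:", err != nil)
}
```

Optional HTML bodies would go through html/template instead, which has the same Parse and Execute shape but escapes values for HTML contexts.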
Redis Logical Model
Primary keys:
- mail:deliveries:<delivery_id>
- mail:attempts:<delivery_id>:<attempt_no>
- mail:idempotency:<source>:<idempotency_key>
- mail:dead_letters:<delivery_id>
- mail:delivery_payloads:<delivery_id>
- mail:malformed_commands:<stream_entry_id>
- mail:stream_offsets:<stream>
Scheduling and ingress keys:
- mail:delivery_commands
- mail:attempt_schedule
Operator indexes:
- mail:idx:recipient:<email>
- mail:idx:status:<status>
- mail:idx:source:<source>
- mail:idx:template:<template_id>
- mail:idx:idempotency:<source>:<idempotency_key>
- mail:idx:created_at
- mail:idx:malformed_command:created_at
Storage rules:
- dynamic Redis key segments are base64url-encoded
- durable records are stored as strict JSON blobs
- timestamps are stored in Unix milliseconds
- raw attachment payloads are separated from audit metadata
- malformed async commands are stored idempotently by stream_entry_id
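Encoding dynamic segments keeps caller-supplied values from colliding with the colon separator in key names. The sketch below shows the idea for two of the documented key shapes; the function names are hypothetical and the unpadded base64url variant is an assumption, since the storage rules do not specify padding.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// encodeSegment makes an arbitrary string safe to embed between ':' separators.
func encodeSegment(s string) string {
	return base64.RawURLEncoding.EncodeToString([]byte(s))
}

// DeliveryKey builds mail:deliveries:<delivery_id>.
func DeliveryKey(deliveryID string) string {
	return "mail:deliveries:" + encodeSegment(deliveryID)
}

// IdempotencyKey builds mail:idempotency:<source>:<idempotency_key>.
func IdempotencyKey(source, key string) string {
	return "mail:idempotency:" + encodeSegment(source) + ":" + encodeSegment(key)
}

// ParseIdempotencyKey recovers the original source and key from a Redis key.
func ParseIdempotencyKey(redisKey string) (string, string, error) {
	// The base64url alphabet has no ':', so splitting on it is unambiguous.
	parts := strings.Split(redisKey, ":")
	if len(parts) != 4 || parts[0] != "mail" || parts[1] != "idempotency" {
		return "", "", errors.New("not an idempotency key")
	}
	source, err := base64.RawURLEncoding.DecodeString(parts[2])
	if err != nil {
		return "", "", fmt.Errorf("source segment: %w", err)
	}
	key, err := base64.RawURLEncoding.DecodeString(parts[3])
	if err != nil {
		return "", "", fmt.Errorf("key segment: %w", err)
	}
	return string(source), string(key), nil
}

func main() {
	k := IdempotencyKey("notification", "order:42/retry")
	source, key, err := ParseIdempotencyKey(k)
	fmt.Println(k)
	fmt.Println(source, key, err)
	fmt.Println(DeliveryKey("d-1"))
}
```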
Current fixed retentions:
- idempotency: 7d
- deliveries and payload audit: 30d
- attempts and dead letters: 90d
- malformed commands: 90d
Provider, Retry, and Failure Policy
Provider modes:
- stub
- smtp
SMTP rules:
- outbound SMTP requires STARTTLS
- servers without STARTTLS support are treated as permanent failure
- SMTP authentication is enabled only when both username and password are set
Retry ladder:
- attempt 1 -> 2: 1m
- attempt 2 -> 3: 5m
- attempt 3 -> 4: 30m
- after attempt 4: dead_letter
Failure handling:
- retryable provider failures become transport_failed or timed_out, then either reschedule or escalate to dead_letter
- permanent provider failures become failed
- render failures become failed with render_failed
- stale claimed work is recovered after MAIL_SMTP_TIMEOUT + 30s
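The retry ladder and the stale-claim window reduce to a little arithmetic, sketched below. NextStep and StaleClaimAfter are hypothetical names, and the sketch covers only the retryable path (transport_failed or timed_out); permanent and render failures end in failed without consulting the ladder.

```go
package main

import (
	"fmt"
	"time"
)

// retryLadder holds the delay before attempts 2, 3, and 4.
var retryLadder = []time.Duration{time.Minute, 5 * time.Minute, 30 * time.Minute}

// NextStep returns the delivery status and delay after a retryable failure
// of the given 1-based attempt number.
func NextStep(failedAttempt int) (string, time.Duration, error) {
	if failedAttempt < 1 {
		return "", 0, fmt.Errorf("attempt numbers start at 1, got %d", failedAttempt)
	}
	if failedAttempt <= len(retryLadder) {
		return "queued", retryLadder[failedAttempt-1], nil
	}
	// After attempt 4 the retry budget is exhausted.
	return "dead_letter", 0, nil
}

// StaleClaimAfter is the age at which claimed work is considered abandoned.
func StaleClaimAfter(smtpTimeout time.Duration) time.Duration {
	return smtpTimeout + 30*time.Second
}

func main() {
	for attempt := 1; attempt <= 4; attempt++ {
		status, delay, _ := NextStep(attempt)
		fmt.Printf("attempt %d failed -> %s after %s\n", attempt, status, delay)
	}
	// With the default MAIL_SMTP_TIMEOUT=15s the recovery window is 45s.
	fmt.Println("stale after:", StaleClaimAfter(15*time.Second))
}
```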
Observability
The runtime exports telemetry through configured OpenTelemetry exporters only.
Main signals:
- mail.delivery.accepted_auth
- mail.delivery.accepted_generic
- mail.delivery.suppressed
- mail.delivery.status_transitions
- mail.attempt.outcomes
- mail.delivery.dead_letters
- mail.template.locale_fallback
- mail.attempt_schedule.depth
- mail.attempt_schedule.oldest_age_ms
- mail.provider.send.duration_ms
- mail.stream_commands.malformed
Additional behavior:
- internal HTTP uses otelhttp
- Redis clients use redisotel
- structured logs include otel_trace_id and otel_span_id when available
Verification
Relevant commands:
- cd mail && go test ./...
- cd integration && go test ./authsessionmail/...
- cd integration && go test ./gatewayauthsessionmail/...
Extended references: