Stand up a production-mirror monitoring stack in the long-lived dev
contour. Everything runs on galaxy-dev-internal with no published host
ports and is reachable only via the in-repo galaxy-dev-caddy:
- Prometheus scrapes backend:9100, gateway:9191, node-exporter and
cadvisor (30s interval, 15d retention); Loki (7d retention) + promtail
(Docker service discovery by the galaxy.stack=dev-deploy label) handle
logs; Tempo (3d retention) stores traces.
- Backend and gateway now export OTLP traces to Tempo over plaintext
gRPC on the internal network (OTEL_EXPORTER_OTLP_INSECURE).
- Grafana is provisioned as code (Prometheus/Loki/Tempo datasources plus
a starter dashboard) and served under /grafana/ via Caddy in sub-path
mode; the admin password comes from the
GALAXY_DEV_GRAFANA_ADMIN_PASSWORD secret.
- Expose the Mailpit capture UI under /mailpit/ (Caddy basic-auth +
MP_WEBROOT) so every captured message is readable regardless of relay.
- dev-deploy.yaml seeds the monitoring config to a stable host path that
survives reboots, and injects the Grafana admin secret.
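
For illustration, the scrape setup from the first bullet might look
roughly like the sketch below. The job names and the node-exporter and
cadvisor ports (their upstream defaults, 9100 and 8080) are assumptions,
not copied from the actual config. The 15d retention would be set with
the --storage.tsdb.retention.time flag on the Prometheus command line
rather than in this file.

```yaml
# prometheus.yml (sketch; job names are hypothetical)
global:
  scrape_interval: 30s            # interval stated in this change

scrape_configs:
  - job_name: backend
    static_configs:
      - targets: ["backend:9100"]
  - job_name: gateway
    static_configs:
      - targets: ["gateway:9191"]
  - job_name: node-exporter
    static_configs:
      - targets: ["node-exporter:9100"]   # assumed: exporter default port
  - job_name: cadvisor
    static_configs:
      - targets: ["cadvisor:8080"]        # assumed: cadvisor default port
```
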
Per-service memory limits keep the footprint within budget. All
collector config lives under tools/dev-deploy/monitoring/ for dev/prod
parity.
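
A rough sketch of how the tracing variables and a memory limit could
appear on one service in the compose file. The Tempo hostname, port 4317
(the conventional OTLP gRPC port), the value "true", the 256m limit and
the external network declaration are all assumptions for illustration:

```yaml
# compose fragment (sketch; values marked "assumed" are hypothetical)
services:
  backend:
    networks:
      - galaxy-dev-internal
    environment:
      OTEL_EXPORTER_OTLP_ENDPOINT: "http://tempo:4317"   # assumed host/port
      OTEL_EXPORTER_OTLP_INSECURE: "true"                # plaintext gRPC
    mem_limit: 256m                                      # assumed budget
    # no `ports:` entry, so nothing is published on the host

networks:
  galaxy-dev-internal:
    external: true                                       # assumed
```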