2 Commits

Author SHA1 Message Date
Ilia Denisov 814eae0802 docs: observability stack + the single /_gm gate for Grafana/Mailpit
Tests · Go / test (pull_request) Successful in 1m56s
Tests · Integration / integration (pull_request) Successful in 1m41s
Tests · UI / test (pull_request) Successful in 3m23s
- ARCHITECTURE §17: the dev (production-mirror) collection stack
  (Prometheus / Loki / Tempo / promtail / node-exporter / cAdvisor) and
  the single /_gm Basic Auth gate fronting Grafana and the Mailpit UI.
- tools/dev-deploy/monitoring/README.md (new): services, what is
  collected, Grafana-behind-the-gate access, config delivery, tuning.
- tools/dev-deploy/README.md: an Observability section; the Mailpit UI
  under /_gm/mailpit/; Networking diagram and Files list updated.
- FUNCTIONAL §10.2.1 (+ ru mirror): the operator console nav links to
  Grafana and Mailpit under the same /_gm gate, one sign-in for all.
2026-06-01 06:37:24 +02:00
Ilia Denisov 84a0ccb23f feat(dev-deploy): full observability stack (Prometheus/Grafana/Loki/Tempo)
Stand up a production-mirror monitoring stack in the long-lived dev
contour, all on galaxy-dev-internal with no host ports (reached only via
the in-repo galaxy-dev-caddy):

- Prometheus scrapes backend:9100, gateway:9191, node-exporter and
  cadvisor (30s interval, 15d retention); Loki (7d) + promtail (Docker
  service discovery by the galaxy.stack=dev-deploy label) for logs;
  Tempo (3d) for traces.
- Backend and gateway now export OTLP traces to Tempo over plaintext
  gRPC on the internal network (OTEL_EXPORTER_OTLP_INSECURE).
- Grafana provisioned as code (Prometheus/Loki/Tempo datasources plus a
  starter dashboard), served under /grafana/ via Caddy sub-path mode;
  admin password from the GALAXY_DEV_GRAFANA_ADMIN_PASSWORD secret.
- Expose the Mailpit capture UI under /mailpit/ (Caddy basic-auth +
  MP_WEBROOT) so every captured message is readable regardless of relay.
- dev-deploy.yaml seeds the monitoring config to a stable host path that
  survives reboots, and injects the Grafana admin secret.
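
For orientation, the Prometheus part of the list above could look roughly
like the fragment below. This is a minimal sketch, not the file from the
repository: the job names, the node-exporter and cadvisor hostnames, and
their ports (the upstream defaults, 9100 and 8080) are assumptions. Only
backend:9100, gateway:9191, the 30s interval and the 15d retention come
from this commit message.

```yaml
# Hypothetical prometheus.yml sketch; job names are illustrative.
global:
  scrape_interval: 30s          # interval stated in the commit message

scrape_configs:
  - job_name: backend
    static_configs:
      - targets: ["backend:9100"]
  - job_name: gateway
    static_configs:
      - targets: ["gateway:9191"]
  - job_name: node-exporter
    static_configs:
      - targets: ["node-exporter:9100"]   # assumed host, default port
  - job_name: cadvisor
    static_configs:
      - targets: ["cadvisor:8080"]        # assumed host, default port

# Retention is not set in this file. Prometheus takes it as a
# command-line flag, for example: --storage.tsdb.retention.time=15d
```

All targets are plain service names because every container sits on the
same internal Docker network, so nothing has to be published on the host.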

Per-service memory limits keep the footprint within budget. All
collector config lives under tools/dev-deploy/monitoring/ for dev/prod
parity.
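
As an illustration of the memory limits and the Grafana sub-path setup, a
Compose service entry might be shaped like the sketch below. The image tag,
the 256m limit, and the way the secret reaches the container are
assumptions; the network name, the galaxy.stack=dev-deploy label, the
/grafana/ sub-path, and the secret name are taken from this commit message.

```yaml
# Hypothetical Compose fragment; values marked "assumed" are not from the repo.
services:
  grafana:
    image: grafana/grafana              # assumed image, tag omitted
    labels:
      galaxy.stack: dev-deploy          # label promtail uses for discovery
    environment:
      # Serve Grafana under a sub-path behind the reverse proxy.
      GF_SERVER_ROOT_URL: "%(protocol)s://%(domain)s:%(http_port)s/grafana/"
      GF_SERVER_SERVE_FROM_SUB_PATH: "true"
      # Assumed wiring: the secret is passed through as an environment variable.
      GF_SECURITY_ADMIN_PASSWORD: "${GALAXY_DEV_GRAFANA_ADMIN_PASSWORD}"
    mem_limit: 256m                     # assumed value, per-service cap
    networks:
      - galaxy-dev-internal             # no "ports:" entry, so no host ports

networks:
  galaxy-dev-internal:
    external: true                      # assumed to be created elsewhere
```

Leaving out a ports: section is what keeps the service reachable only
through the Caddy container on the shared network.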
2026-05-31 23:39:06 +02:00