Stage 16: deploy infra & test contour #17

Merged
developer merged 4 commits from feature/stage-16-deploy-test-contour into development 2026-06-05 15:00:46 +00:00
Owner

Stage 16 — Deploy infra & test contour

Builds the test contour and the deploy machinery.

  • Backend + gateway multi-stage distroless Dockerfiles; the gateway embeds the UI (go:embed) and serves the SPA at / and /telegram/ (a node stage in the gateway image bakes in the real build).
  • Root deploy/docker-compose.yml: backend + gateway + Postgres + Telegram connector (VPN sidecar) + OTel Collector + Prometheus (15d retention) + Tempo (72h retention) + Grafana, fronted by a caddy that owns a single /_gm Basic-Auth (admin console + Grafana sub-path). Inter-service traffic stays on a private network; only caddy is attached to the external edge network. The connector-scoped compose is retired.
  • New metrics: backend accounts_created_total{kind} (robots excluded) and an in-memory gateway active_users{window=24h,7d} gauge.
  • CI reshaped: a single .gitea/workflows/ci.yaml (unit/integration/ui + a gated deploy) on the new feature/* -> development -> master model; the deploy job automatically rolls out the test contour on a pull request into, or a push to, development.

Verified locally: gofmt/vet/build clean, full unit + integration (testcontainers Postgres) green, both images build, compose/caddy/otelcol/prometheus configs validated.

Owner setup required before the deploy job can go green

  1. TEST_ secrets/variables in Gitea (see deploy/.env.example): secrets TEST_POSTGRES_PASSWORD, TEST_AWG_CONF, TEST_GM_BASICAUTH_HASH (caddy hash-password bcrypt), TEST_GRAFANA_ADMIN_PASSWORD, TEST_TELEGRAM_BOT_TOKEN_EN/_RU; vars TEST_TELEGRAM_MINIAPP_URL, TEST_TELEGRAM_GAME_CHANNEL_ID_EN/_RU, TEST_VITE_*, TEST_CADDY_SITE_ADDRESS (:80), TEST_GRAFANA_ROOT_URL.
  2. A host-caddy route <test domain> -> scrabble:80 on the runner host (the in-compose caddy's edge alias is scrabble).
  3. Branch protection: master's required status checks move from Tests · Go / test + Tests · Integration / integration to CI / unit, CI / integration, CI / ui (old names will never report). Decide whether development is protected too.
developer added 1 commit 2026-06-05 09:42:49 +00:00
Stage 16: deploy infra & test contour
CI / unit (pull_request) Successful in 9s
CI / integration (pull_request) Successful in 11s
CI / ui (pull_request) Successful in 19s
CI / deploy (pull_request) Failing after 1s
8700fbfae1
- backend + gateway multi-stage distroless Dockerfiles; the gateway embeds and
  serves the SPA at / and /telegram/ via go:embed (committed dist placeholder,
  real build baked in by the image's node stage)
- deploy/docker-compose.yml: backend + gateway + Postgres + Telegram connector
  (VPN sidecar) + OTel Collector + Prometheus (15d) + Tempo (72h) + Grafana,
  fronted by a caddy owning a single /_gm Basic-Auth (admin console + Grafana
  subpath); inter-service on a private network, only caddy on the edge network
- new metrics: backend accounts_created_total{kind} (robots excluded) and an
  in-memory gateway active_users{window=24h,7d} gauge
- CI: single .gitea/workflows/ci.yaml (unit/integration/ui + a gated test-contour
  deploy) on the new feature/* -> development -> master branch model; the old
  go-unit/integration/ui-test workflows are folded in; the connector-scoped
  compose is retired (superseded by deploy/)
- docs: ARCHITECTURE §11/§12/§13, root + gateway READMEs, CLAUDE.md branching,
  PLAN.md (stage 16 done + refinements + Stage 17 forward-notes)
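
The accounts_created_total{kind} counter with robots excluded can be pictured as below. This is a sketch under assumptions: the struct, its methods, and the kind values used in the usage note are hypothetical, and the real backend presumably records the metric through its telemetry library rather than a hand-rolled map.

```go
package main

import "sync"

// accountsCreated is a stand-in for a counter with a "kind" label.
// The names are illustrative; they do not come from the backend code.
type accountsCreated struct {
	mu     sync.Mutex
	byKind map[string]uint64
}

func newAccountsCreated() *accountsCreated {
	return &accountsCreated{byKind: make(map[string]uint64)}
}

// Observe increments the series for kind, unless the new account is a
// robot, in which case nothing is counted at all.
func (c *accountsCreated) Observe(kind string, robot bool) {
	if robot {
		return
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.byKind[kind]++
}

// Value returns the current count for kind (zero if never observed).
func (c *accountsCreated) Value(kind string) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.byKind[kind]
}
```

For example, two calls to Observe("telegram", false) followed by Observe("telegram", true) would leave Value("telegram") at 2, since the robot account is skipped ("telegram" is a made-up kind value).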
Author
Owner

CI on 8700fbf: unit / integration / ui are green — the code is validated.

The deploy job fails as expected (not a code issue): the runner job aborts at docker compose interpolation with "required variable TELEGRAM_MINIAPP_URL is missing", because the TEST_ Gitea secrets/variables are not set yet. The deploy job did start, which confirms the auto-deploy trigger is wired correctly; it will go green once the owner setup in the PR description is in place (the TEST_ set, the host-caddy scrabble:80 route, and the renamed branch-protection checks CI / unit|integration|ui).

developer added 1 commit 2026-06-05 10:01:31 +00:00
Stage 16: deploy/README.md — full environment-variable reference
CI / unit (pull_request) Successful in 9s
CI / integration (pull_request) Successful in 10s
CI / ui (pull_request) Successful in 20s
CI / deploy (pull_request) Successful in 20s
ee8d4fd85e
- deploy/README.md documents the services, how to run it locally and in CI, and
  every variable: required (the four :? ones + ≥1 bot token) and optional with
  defaults, marked secret-vs-variable and with the TEST_/PROD_ Gitea mapping;
  plus the fixed internal wiring and the host-side setup.
- ci.yaml maps the remaining POSTGRES_DB/USER, DICT_VERSION and LOG_LEVEL (unset
  renders empty -> the compose ":-" defaults apply), so every documented var is
  per-contour overridable.
- .env.example points at the README for the full reference.
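
The two interpolation forms mentioned above behave differently when CI renders an unset variable as an empty string. The sketch below mimics them in Go for illustration only: withDefault and required are hypothetical helpers, not part of Compose or of this repository, and the "info" default in the usage note is an assumed value.

```go
package main

import "fmt"

// withDefault mimics Compose's ${NAME:-def}: def applies when NAME is
// unset OR set to the empty string, which is why an empty value rendered
// by CI still falls back to the compose default.
func withDefault(env map[string]string, name, def string) string {
	if v := env[name]; v != "" {
		return v
	}
	return def
}

// required mimics Compose's ${NAME:?msg}: an unset or empty NAME aborts
// interpolation with an error instead of falling back to anything.
func required(env map[string]string, name string) (string, error) {
	if v := env[name]; v != "" {
		return v, nil
	}
	return "", fmt.Errorf("required variable %s is missing", name)
}
```

With env = {"LOG_LEVEL": ""}, withDefault(env, "LOG_LEVEL", "info") yields the default, while required(env, "TELEGRAM_MINIAPP_URL") returns an error of the same shape as the one that failed the first deploy run.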
developer added 1 commit 2026-06-05 14:44:11 +00:00
Stage 16: connector test-env via UseTestEnvironment; pin it in the test contour
CI / unit (pull_request) Successful in 8s
CI / integration (pull_request) Successful in 10s
CI / ui (pull_request) Successful in 20s
CI / deploy (pull_request) Successful in 30s
0ea35fe991
- bot.New now selects Telegram's test environment with the library's native
  tgbot.UseTestEnvironment() instead of a token += "/test" hack (functionally
  identical URL /bot<token>/test/METHOD, but idiomatic) + a bot test asserting
  the getMe path for both test and prod.
- ci.yaml pins TELEGRAM_TEST_ENV=true for the test contour (it IS the test
  environment) instead of a TEST_TELEGRAM_TEST_ENV variable: removes the
  confusing double-TEST, telegram-specific, prefixed operator knob and the
  secret-vs-variable footgun. Prod (Stage 17) leaves it false.
- deploy/README.md + PLAN.md updated.
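
To make the URL shape concrete: in Telegram's test environment the method path gains a /test segment between the token and the method name. The helper below only illustrates that shape; methodURL is a hypothetical function, not the library's code and not the connector's, and the token is a placeholder.

```go
package main

// methodURL builds a Bot API URL for method. With testEnv set, "/test" is
// inserted after the token, giving /bot<token>/test/METHOD. Appending
// "/test" to the token string (the retired hack) produces the same URL.
func methodURL(token, method string, testEnv bool) string {
	u := "https://api.telegram.org/bot" + token
	if testEnv {
		u += "/test"
	}
	return u + "/" + method
}
```

For the placeholder token 123:ABC, methodURL("123:ABC", "getMe", true) gives https://api.telegram.org/bot123:ABC/test/getMe, and passing false drops the /test segment.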
developer added 1 commit 2026-06-05 14:57:19 +00:00
Stage 16: insert Stage 17 (test-contour verification); renumber prod deploy to 18
CI / unit (pull_request) Successful in 9s
CI / integration (pull_request) Successful in 12s
CI / ui (pull_request) Successful in 20s
CI / deploy (pull_request) Successful in 21s
efbaf657c6
- PLAN.md: new Stage 17 "Test-contour verification & defect fixes" (exercise the
  deployed contour end-to-end and fix what it surfaces — connector liveness check,
  path-conditional CI); the former prod-deploy stage becomes Stage 18.
- Renumber every "Stage 17" prod-deploy reference to "Stage 18" across docs,
  compose, Caddyfile, ci.yaml and CLAUDE.md; the post-Stage-14 split range is now
  "Stages 15–18".
owner approved these changes 2026-06-05 14:59:45 +00:00
developer merged commit dce3edacee into development 2026-06-05 15:00:46 +00:00
developer deleted branch feature/stage-16-deploy-test-contour 2026-06-05 15:00:46 +00:00
Reference: developer/scrabble-game#17