# deploy

The full Scrabble contour: `backend` + `gateway` + Postgres + the Telegram connector (with a VPN sidecar) + the observability stack (OTel Collector → Prometheus + Tempo → Grafana), fronted by a **caddy** that owns a single `/_gm` Basic-Auth (the admin console + Grafana). Topology and the decision record are in [`../docs/ARCHITECTURE.md`](../docs/ARCHITECTURE.md) §13; this file is the operational reference for **every environment variable**.

## Services

| Service | Image | Role |
| --- | --- | --- |
| `caddy` | `caddy:2-alpine` | Edge proxy (alias `scrabble` on `edge`): single `/_gm` Basic-Auth → admin console + Grafana; everything else → gateway. TLS per `CADDY_SITE_ADDRESS`. |
| `gateway` | built (`gateway/Dockerfile`) | Public edge; serves the embedded landing at `/` and the game SPA at `/app/` + `/telegram/`; Connect-RPC edge. |
| `backend` | built (`backend/Dockerfile`) | Domain service; bakes in the DAWG dictionaries; runs migrations at boot. |
| `postgres` | `postgres:17-alpine` | Database (named volume, `pg_isready` healthcheck). |
| `vpn` + `telegram` | sidecar + built (`platform/telegram/Dockerfile`) | Telegram connector; egresses through the AmneziaWG sidecar; internal gRPC at `telegram:9091`. |
| `otelcol` | `otel/opentelemetry-collector-contrib` | OTLP/gRPC `:4317` → Prometheus scrape (`:9464`) + Tempo. |
| `prometheus` | `prom/prometheus` | Metrics, 15d retention. |
| `tempo` | `grafana/tempo` | Traces, 72h retention. |
| `grafana` | `grafana/grafana` | Dashboards (provisioned), anonymous-admin behind caddy's `/_gm/grafana`. |

Networking: inter-service traffic is on the private `internal` network (project-scoped DNS); only `caddy` joins the shared external `edge` network so the host caddy can reach it at `scrabble:80`. `edge` must already exist on the host (`docker network create edge`).

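
Because `edge` has to exist before `docker compose up`, creating it idempotently can save a failed first run. The following is a minimal sketch, not something this repo ships: `ensure_network` is a hypothetical helper name, and it relies only on the standard `docker network inspect` and `docker network create` subcommands.

```sh
# Sketch: create an external Docker network only when it is missing.
# `ensure_network` is a hypothetical helper, not part of this repo.
ensure_network() {
  name="$1"
  # `docker network inspect` exits non-zero when the network does not exist.
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network '$name' already exists"
  else
    # `docker network create` prints the new network id; discard it.
    docker network create "$name" >/dev/null && echo "network '$name' created"
  fi
}
```

Calling `ensure_network edge` before `docker compose up -d --build` makes the bring-up safe to repeat on a fresh host.
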
## Run it

**Locally:** copy the template, fill in the required values, and bring it up:

```sh
cp deploy/.env.example deploy/.env   # then edit deploy/.env
docker network create edge           # once, if it does not exist
cd deploy && docker compose up -d --build
```

**In CI** (the test contour): the `deploy` job in `.gitea/workflows/ci.yaml` maps the Gitea **`TEST_`-prefixed** secrets/variables onto the unprefixed names below and runs `docker compose up -d --build` on the runner host. Stage 18 (prod) maps the **`PROD_`** set the same way. So a Gitea secret named `TEST_POSTGRES_PASSWORD` feeds the compose's `POSTGRES_PASSWORD`, etc.

The deploy job also **seeds the config files** (`caddy`, `otelcol`, `prometheus`, `tempo`, `grafana`) to a stable host path (`$HOME/.scrabble-deploy`) and sets `SCRABBLE_CONFIG_DIR` to it before `up`. The runner's checkout is an ephemeral act workspace that is removed after the job, so binding config straight from it would leave the long-lived containers with dangling mounts (Grafana would log `no such file or directory`). Locally `SCRABBLE_CONFIG_DIR` defaults to `.`, so the compose binds from this directory.

## Required variables

`docker compose` aborts immediately if any of these is unset or empty (they use `:?`):

| Variable | Gitea kind | Purpose |
| --- | --- | --- |
| `POSTGRES_PASSWORD` | secret | Postgres password (also embedded in `BACKEND_POSTGRES_DSN`). |
| `AWG_CONF` | secret | AmneziaWG config for the VPN sidecar (the connector's only egress). **Must not contain a `DNS=` line**: it hijacks the shared netns's resolv.conf and stops the connector from resolving `otelcol` (telemetry export). Without that line, Docker's resolver handles both `otelcol` and `api.telegram.org`. |
| `GM_BASICAUTH_HASH` | secret | bcrypt hash gating `/_gm` (admin console + Grafana). Generate with `docker run --rm caddy:2-alpine caddy hash-password --plaintext ''`. |
| `TELEGRAM_MINIAPP_URL` | variable | The Mini App URL the connector hands out in deep links / buttons. |

**Plus at least one bot token:** `TELEGRAM_BOT_TOKEN_EN` or `TELEGRAM_BOT_TOKEN_RU` (secrets). Compose cannot express "one of", so they default to empty, but the connector **fails at boot** if both are empty.

## Optional variables (with defaults)

| Variable | Gitea kind | Default | Purpose |
| --- | --- | --- | --- |
| `POSTGRES_DB` | variable | `scrabble` | Database name. |
| `POSTGRES_USER` | variable | `scrabble` | Database user. |
| `DICT_VERSION` | variable | `v1.0.0` | `scrabble-dictionary` release tag baked into the backend image (build-arg). |
| `LOG_LEVEL` | variable | `info` | Shared log level for backend / gateway / connector (`debug\|info\|warn\|error`). |
| `CADDY_SITE_ADDRESS` | variable | `:80` | Caddy site address. Test: `:80` (host caddy terminates TLS). Prod: a domain, so caddy does its own ACME. |
| `GM_BASICAUTH_USER` | variable | `gm` | Username for the `/_gm` Basic-Auth. |
| `GRAFANA_ROOT_URL` | variable | `/_gm/grafana/` | Grafana root URL (sub-path serving). Set the full `https:///_gm/grafana/` behind a real domain. |
| `GRAFANA_ADMIN_PASSWORD` | secret | `admin` | Grafana admin password. Low impact (the login form is disabled, access is anonymous-admin behind caddy) but set it anyway. |
| `TELEGRAM_GAME_CHANNEL_ID_EN` | variable | _(empty)_ | English game-channel id; empty/`0` disables channel posts. |
| `TELEGRAM_GAME_CHANNEL_ID_RU` | variable | _(empty)_ | Russian game-channel id; empty/`0` disables channel posts. |
| `TELEGRAM_TEST_ENV` | _pinned_ | `false` | `true` routes the bot through Telegram's test environment (`.../bot/test/METHOD`). **The CI test contour pins this to `true` in `ci.yaml`** (the contour is the test environment), so it is not a Gitea variable. Set it in `.env` for a local run; prod (Stage 18) leaves it `false`. |
| `TELEGRAM_API_BASE_URL` | variable | _(empty)_ | Override the Bot API host (a mock/self-hosted server); empty = `https://api.telegram.org`. |
| `GATEWAY_DEFAULT_SUPPORTED_LANGUAGES` | variable | `en,ru` | Variant-gating set for non-Telegram logins (web/email/guest). |
| `VITE_TELEGRAM_BOT_ID` | variable | _(empty)_ | UI build-arg: numeric bot id for the web Login Widget. |
| `VITE_TELEGRAM_LINK` | variable | _(empty)_ | UI build-arg: deep-link base for share-to-Telegram (e.g. `https://t.me//`). |
| `VITE_TELEGRAM_LINK_EN` | variable | _(empty)_ | UI build-arg: the landing "Play in Telegram" link for the **English** bot (e.g. `https://t.me/Scrabble_Game`). |
| `VITE_TELEGRAM_LINK_RU` | variable | _(empty)_ | UI build-arg: the landing "Play in Telegram" link for the **Russian** bot (e.g. `https://t.me/Erudit_Game`). |
| `VITE_GATEWAY_URL` | variable | _(empty)_ | UI build-arg: gateway origin; empty = same-origin (the usual single-origin deploy). |

The five `VITE_*` variables are **build-args** baked into the gateway image at build time, so changing them requires a rebuild (`--build`), not just a restart.

## Fixed internal wiring (not operator-set)

These are hard-wired in `docker-compose.yml` (no `${...}`), pointing the services at each other on the `internal` network. They are listed here so they are not mistaken for missing config: `BACKEND_POSTGRES_DSN` (→ `postgres`, `search_path=backend`), `GATEWAY_BACKEND_HTTP_URL`/`_GRPC_ADDR` (→ `backend`), `GATEWAY_CONNECTOR_ADDR`/`BACKEND_CONNECTOR_ADDR` (→ `telegram:9091`), and all three services' `*_OTEL_*_EXPORTER=otlp` → `OTEL_EXPORTER_OTLP_ENDPOINT=http://otelcol:4317` (`_INSECURE=true`).

The connector shares the VPN sidecar's netns: routing to the collector's internal IP is fine (connected route), but its `AWG_CONF` must **not** set a `DNS=` directive. That directive hijacks resolv.conf and breaks resolving `otelcol` ("produced zero addresses"); without it the netns uses Docker's resolver, which resolves both `otelcol` and `api.telegram.org`.

`GATEWAY_ADMIN_*` is intentionally **unset**: caddy owns `/_gm` in the contour.

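
The `DNS=` constraint on `AWG_CONF` is easy to violate when pasting a config exported from a client, so a pre-flight check before `up` may be worth having. This is a minimal sketch under stated assumptions: `awg_conf_has_dns` is a hypothetical helper, and it assumes the config uses WireGuard-style `Key = Value` lines with `#` comments and case-insensitive keys.

```sh
# Sketch: detect a DNS directive in an AmneziaWG config passed as $1.
# `awg_conf_has_dns` is a hypothetical helper, not part of this repo.
# Assumes WireGuard-style "Key = Value" lines with "#" comments.
awg_conf_has_dns() {
  # Match "DNS", optional spaces, then "=" at the start of a line
  # (case-insensitive). A commented-out "# DNS = ..." line does not match.
  printf '%s\n' "$1" | grep -Eiq '^[[:space:]]*DNS[[:space:]]*='
}
```

A deploy script could then run `if awg_conf_has_dns "$AWG_CONF"; then echo "AWG_CONF must not set DNS=" >&2; exit 1; fi` before starting the contour.
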
## Host-side setup (outside this repo)

- **`edge` network** must exist on the host (`docker network create edge`).
- **Host caddy** route ` → scrabble:80` (the in-compose caddy serves HTTP in the test contour; the host caddy terminates TLS). Not needed on prod, where the contour caddy owns TLS (set `CADDY_SITE_ADDRESS` to the domain).
- **Branch protection** requires the single status check `CI / gate` (Stage 17). The `unit` / `integration` / `ui` jobs are path-conditional (they skip when their code did not change), and the always-running `gate` job aggregates them (passing when each succeeded or was skipped), so a skipped job never blocks a merge. See [`../CLAUDE.md`](../CLAUDE.md) "Branching & CI".

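
The `gate` rule in the last bullet (pass when every upstream job either succeeded or was skipped) can be sketched as a small shell predicate. This illustrates the rule only and is not the contents of `ci.yaml`: `gate_passes` is a hypothetical helper, and the result strings assume the Actions-style values `success`, `skipped`, `failure`, and `cancelled`.

```sh
# Sketch of the gate rule: pass only if every result is "success" or "skipped".
# `gate_passes` is a hypothetical helper; the result strings are assumed to be
# the Actions-style job results, not values confirmed from ci.yaml.
gate_passes() {
  for result in "$@"; do
    case "$result" in
      success|skipped) ;;   # acceptable outcomes
      *) return 1 ;;        # failure, cancelled, or anything unexpected blocks
    esac
  done
  return 0
}
```

For example, `gate_passes success skipped success` returns 0, while `gate_passes success failure skipped` returns 1, which is why a path-skipped job never blocks a merge but a failed one always does.
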