Files
scrabble-game/deploy/README.md
T
Ilia Denisov 0ea35fe991
CI / unit (pull_request) Successful in 8s
CI / integration (pull_request) Successful in 10s
CI / ui (pull_request) Successful in 20s
CI / deploy (pull_request) Successful in 30s
Stage 16: connector test-env via UseTestEnvironment; pin it in the test contour
- bot.New now selects Telegram's test environment with the library's native
  tgbot.UseTestEnvironment() instead of a token += "/test" hack (functionally
  identical URL /bot<token>/test/METHOD, but idiomatic) + a bot test asserting
  the getMe path for both test and prod.
- ci.yaml pins TELEGRAM_TEST_ENV=true for the test contour (it IS the test
  environment) instead of a TEST_TELEGRAM_TEST_ENV variable: removes the
  confusing double-TEST, telegram-specific, prefixed operator knob and the
  secret-vs-variable footgun. Prod (Stage 17) leaves it false.
- deploy/README.md + PLAN.md updated.
2026-06-05 16:44:10 +02:00

103 lines
6.7 KiB
Markdown

# deploy
The full Scrabble contour: `backend` + `gateway` + Postgres + the Telegram
connector (with a VPN sidecar) + the observability stack (OTel Collector →
Prometheus + Tempo → Grafana), fronted by a **caddy** that owns a single `/_gm`
Basic-Auth (the admin console + Grafana). Topology and the decision record are in
[`../docs/ARCHITECTURE.md`](../docs/ARCHITECTURE.md) §13; this file is the
operational reference for **every environment variable**.
## Services
| Service | Image | Role |
| --- | --- | --- |
| `caddy` | `caddy:2-alpine` | Edge proxy (alias `scrabble` on `edge`): single `/_gm` Basic-Auth → admin console + Grafana; everything else → gateway. TLS per `CADDY_SITE_ADDRESS`. |
| `gateway` | built (`gateway/Dockerfile`) | Public edge; serves the embedded SPA at `/` and `/telegram/`; Connect-RPC edge. |
| `backend` | built (`backend/Dockerfile`) | Domain service; bakes in the DAWG dictionaries; runs migrations at boot. |
| `postgres` | `postgres:17-alpine` | Database (named volume, `pg_isready` healthcheck). |
| `vpn` + `telegram` | sidecar + built (`platform/telegram/Dockerfile`) | Telegram connector; egresses through the AmneziaWG sidecar; internal gRPC at `telegram:9091`. |
| `otelcol` | `otel/opentelemetry-collector-contrib` | OTLP/gRPC `:4317` → Prometheus scrape (`:9464`) + Tempo. |
| `prometheus` | `prom/prometheus` | Metrics, 15d retention. |
| `tempo` | `grafana/tempo` | Traces, 72h retention. |
| `grafana` | `grafana/grafana` | Dashboards (provisioned), anonymous-admin behind caddy's `/_gm/grafana`. |
Networking: inter-service traffic is on the private `internal` network
(project-scoped DNS); only `caddy` joins the shared external `edge` network so the
host caddy can reach it at `scrabble:80`. `edge` must already exist on the host
(`docker network create edge`).
## Run it
**Locally** — copy the template, fill the required values, bring it up:
```sh
cp deploy/.env.example deploy/.env # then edit deploy/.env
docker network create edge # once, if it does not exist
cd deploy && docker compose up -d --build
```
**In CI** (the test contour) — `.gitea/workflows/ci.yaml`'s `deploy` job maps the
Gitea **`TEST_`-prefixed** secrets/variables onto the unprefixed names below and
runs `docker compose up -d --build` on the runner host. Stage 17 (prod) maps the
**`PROD_`** set the same way. So a Gitea secret named `TEST_POSTGRES_PASSWORD`
feeds the compose's `POSTGRES_PASSWORD`, etc.
## Required variables
`docker compose` aborts immediately if any of these is unset (they use `:?`):
| Variable | Gitea kind | Purpose |
| --- | --- | --- |
| `POSTGRES_PASSWORD` | secret | Postgres password (also embedded in `BACKEND_POSTGRES_DSN`). |
| `AWG_CONF` | secret | AmneziaWG config for the VPN sidecar (the connector's only egress). |
| `GM_BASICAUTH_HASH` | secret | bcrypt hash gating `/_gm` (admin console + Grafana). Generate with `docker run --rm caddy:2-alpine caddy hash-password --plaintext '<pw>'`. |
| `TELEGRAM_MINIAPP_URL` | variable | The Mini App URL the connector hands out in deep links / buttons. |
**Plus at least one bot token**`TELEGRAM_BOT_TOKEN_EN` or `TELEGRAM_BOT_TOKEN_RU`
(secrets). Compose cannot express "one of", so they default to empty, but the
connector **fails at boot** if both are empty.
## Optional variables (with defaults)
| Variable | Gitea kind | Default | Purpose |
| --- | --- | --- | --- |
| `POSTGRES_DB` | variable | `scrabble` | Database name. |
| `POSTGRES_USER` | variable | `scrabble` | Database user. |
| `DICT_VERSION` | variable | `v1.0.0` | `scrabble-dictionary` release tag baked into the backend image (build-arg). |
| `LOG_LEVEL` | variable | `info` | Shared log level for backend / gateway / connector (`debug\|info\|warn\|error`). |
| `CADDY_SITE_ADDRESS` | variable | `:80` | Caddy site address. Test: `:80` (host caddy terminates TLS). Prod: a domain, so caddy does its own ACME. |
| `GM_BASICAUTH_USER` | variable | `gm` | Username for the `/_gm` Basic-Auth. |
| `GRAFANA_ROOT_URL` | variable | `/_gm/grafana/` | Grafana root URL (sub-path serving). Set the full `https://<domain>/_gm/grafana/` behind a real domain. |
| `GRAFANA_ADMIN_PASSWORD` | secret | `admin` | Grafana admin password. Low impact (the login form is disabled, access is anonymous-admin behind caddy) but set it anyway. |
| `TELEGRAM_GAME_CHANNEL_ID_EN` | variable | _(empty)_ | English game-channel id; empty/`0` disables channel posts. |
| `TELEGRAM_GAME_CHANNEL_ID_RU` | variable | _(empty)_ | Russian game-channel id; empty/`0` disables channel posts. |
| `TELEGRAM_TEST_ENV` | _pinned_ | `false` | `true` routes the bot through Telegram's test environment (`.../bot<token>/test/METHOD`). **The CI test contour pins this to `true` in `ci.yaml`** (the contour is the test environment) — it is not a Gitea variable. Set it in `.env` for a local run; prod (Stage 17) leaves it `false`. |
| `TELEGRAM_API_BASE_URL` | variable | _(empty)_ | Override the Bot API host (a mock/self-hosted server); empty = `https://api.telegram.org`. |
| `GATEWAY_DEFAULT_SUPPORTED_LANGUAGES` | variable | `en,ru` | Variant-gating set for non-Telegram logins (web/email/guest). |
| `VITE_TELEGRAM_BOT_ID` | variable | _(empty)_ | UI build-arg: numeric bot id for the web Login Widget. |
| `VITE_TELEGRAM_LINK` | variable | _(empty)_ | UI build-arg: deep-link base for share-to-Telegram (e.g. `https://t.me/<bot>/<app>`). |
| `VITE_GATEWAY_URL` | variable | _(empty)_ | UI build-arg: gateway origin; empty = same-origin (the usual single-origin deploy). |
The three `VITE_*` are **build-args** baked into the gateway image at build time, so
changing them requires a rebuild (`--build`), not just a restart.
## Fixed internal wiring (not operator-set)
These are hard-wired in `docker-compose.yml` (no `${...}`), pointing the services
at each other on the `internal` network — listed here so they are not mistaken for
missing config: `BACKEND_POSTGRES_DSN` (→ `postgres`, `search_path=backend`),
`GATEWAY_BACKEND_HTTP_URL`/`_GRPC_ADDR` (→ `backend`),
`GATEWAY_CONNECTOR_ADDR`/`BACKEND_CONNECTOR_ADDR` (→ `telegram:9091`), the three
services' `*_OTEL_*_EXPORTER=otlp` + `OTEL_EXPORTER_OTLP_ENDPOINT=http://otelcol:4317`
(`_INSECURE=true`). `GATEWAY_ADMIN_*` is intentionally **unset** — caddy owns `/_gm`
in the contour.
## Host-side setup (outside this repo)
- **`edge` network** must exist on the host (`docker network create edge`).
- **Host caddy** route `<domain> → scrabble:80` (the in-compose caddy serves HTTP
in the test contour; the host caddy terminates TLS). Not needed on prod, where the
contour caddy owns TLS (set `CADDY_SITE_ADDRESS` to the domain).
- **Branch protection** required-status-check names are `CI / unit`,
`CI / integration`, `CI / ui` (see [`../CLAUDE.md`](../CLAUDE.md) "Branching & CI").