# deploy

The full Scrabble contour: `backend` + `gateway` + Postgres + the Telegram
connector (with a VPN sidecar) + the observability stack (OTel Collector →
Prometheus + Tempo → Grafana), fronted by a **caddy** that owns a single `/_gm`
Basic-Auth (the admin console + Grafana). Topology and the decision record are in
[`../docs/ARCHITECTURE.md`](../docs/ARCHITECTURE.md) §13; this file is the
operational reference for **every environment variable**.

## Services

| Service | Image | Role |
| --- | --- | --- |
| `caddy` | `caddy:2-alpine` | Edge proxy (alias `scrabble` on `edge`): single `/_gm` Basic-Auth → admin console + Grafana; everything else → gateway. TLS per `CADDY_SITE_ADDRESS`. |
| `gateway` | built (`gateway/Dockerfile`) | Public edge; serves the embedded landing at `/` and the game SPA at `/app/` + `/telegram/`; Connect-RPC edge. |
| `backend` | built (`backend/Dockerfile`) | Domain service; bakes in the DAWG dictionaries; runs migrations at boot. |
| `postgres` | `postgres:17-alpine` | Database (named volume, `pg_isready` healthcheck). |
| `vpn` + `telegram` | sidecar + built (`platform/telegram/Dockerfile`) | Telegram connector; egresses through the AmneziaWG sidecar; internal gRPC at `telegram:9091`. |
| `otelcol` | `otel/opentelemetry-collector-contrib` | OTLP/gRPC `:4317` → Prometheus scrape (`:9464`) + Tempo. |
| `prometheus` | `prom/prometheus` | Metrics, 15d retention. |
| `tempo` | `grafana/tempo` | Traces, 72h retention. |
| `grafana` | `grafana/grafana` | Dashboards (provisioned), anonymous-admin behind caddy's `/_gm/grafana`. |

Networking: inter-service traffic is on the private `internal` network
(project-scoped DNS); only `caddy` joins the shared external `edge` network so the
host caddy can reach it at `scrabble:80`. `edge` must already exist on the host
(`docker network create edge`).

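
The `edge` prerequisite can be made repeatable, so a second run does not fail on a network that already exists. A minimal sketch, assuming a POSIX shell; the function name `ensure_network` is hypothetical and not part of this repo:

```sh
# Hypothetical helper (not part of this repo): create a Docker network only
# when it does not already exist, so the step is safe to repeat.
ensure_network() {
  name="$1"
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network $name already exists"
  else
    docker network create "$name" >/dev/null && echo "network $name created"
  fi
}
```

Calling `ensure_network edge` before `docker compose up` would cover the host-side requirement described above.
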
## Run it

**Locally**: copy the template, fill the required values, bring it up:

```sh
cp deploy/.env.example deploy/.env   # then edit deploy/.env
docker network create edge           # once, if it does not exist
cd deploy && docker compose up -d --build
```

**In CI** (the test contour): `.gitea/workflows/ci.yaml`'s `deploy` job maps the
Gitea **`TEST_`-prefixed** secrets/variables onto the unprefixed names below and
runs `docker compose up -d --build` on the runner host. Stage 18 (prod) maps the
**`PROD_`** set the same way. So a Gitea secret named `TEST_POSTGRES_PASSWORD`
feeds the compose's `POSTGRES_PASSWORD`, etc.

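
How `ci.yaml` performs this mapping is not shown in this file. Purely as an illustration of the naming rule, the following sketch re-exports every prefixed variable under its unprefixed name; `map_prefixed_env` is a hypothetical helper, and a POSIX shell is assumed:

```sh
# Hypothetical helper (not taken from ci.yaml): re-export every variable whose
# name starts with the given prefix under the unprefixed name, for example
# TEST_POSTGRES_PASSWORD -> POSTGRES_PASSWORD.
map_prefixed_env() {
  prefix="$1"
  # Names are taken from `env` output and restricted to identifier characters,
  # so the eval below only ever sees a plain variable name.
  for name in $(env | sed -n "s/^\(${prefix}[A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p"); do
    eval "value=\${$name}"            # read the value from the variable itself
    export "${name#"$prefix"}=$value" # strip the prefix and export
  done
}
```

Under this sketch, `map_prefixed_env TEST_` would fit the test contour and `map_prefixed_env PROD_` the Stage 18 one.
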
The deploy job also **seeds the config files** (`caddy`, `otelcol`, `prometheus`,
`tempo`, `grafana`) to a stable host path (`$HOME/.scrabble-deploy`) and sets
`SCRABBLE_CONFIG_DIR` to it before `up`. The runner's checkout is an ephemeral act
workspace that is removed after the job, so binding config straight from it would
leave the bind mounts dangling in the long-lived containers (Grafana would log
`no such file or directory`). Locally `SCRABBLE_CONFIG_DIR` defaults to `.`, so the
compose binds from this directory.

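
The seeding step amounts to copying the config out of the checkout before the workspace disappears. A rough sketch follows, assuming each service's config is a same-named file or directory under the source directory; that layout and the `seed_config` name are assumptions, and the real step is whatever `ci.yaml` defines:

```sh
# Hypothetical sketch of the seeding step. Assumption: each service's config
# is a same-named entry (file or directory) under the source directory.
seed_config() {
  src="$1"
  dest="$2"
  mkdir -p "$dest" || return 1
  for entry in caddy otelcol prometheus tempo grafana; do
    [ -e "$src/$entry" ] || continue   # skip entries this checkout lacks
    rm -rf "${dest:?}/$entry"          # drop the stale copy from the last deploy
    cp -R "$src/$entry" "$dest/$entry" || return 1
  done
}
```

A deploy step could then run `seed_config . "$HOME/.scrabble-deploy"` from this directory and export `SCRABBLE_CONFIG_DIR="$HOME/.scrabble-deploy"` before `docker compose up`.
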
## Required variables

`docker compose` aborts immediately if any of these is unset or empty (they use `:?`):

| Variable | Gitea kind | Purpose |
| --- | --- | --- |
| `POSTGRES_PASSWORD` | secret | Postgres password (also embedded in `BACKEND_POSTGRES_DSN`). |
| `AWG_CONF` | secret | AmneziaWG config for the VPN sidecar (the connector's only egress). **Must not contain a `DNS=` line**: it hijacks the shared netns's resolv.conf and stops the connector from resolving `otelcol` (telemetry export). Without that line, Docker's resolver handles both `otelcol` and `api.telegram.org`. |
| `GM_BASICAUTH_HASH` | secret | bcrypt hash gating `/_gm` (admin console + Grafana). Generate with `docker run --rm caddy:2-alpine caddy hash-password --plaintext '<pw>'`. |
| `TELEGRAM_MINIAPP_URL` | variable | The Mini App URL the connector hands out in deep links / buttons. |

**Plus at least one bot token**: `TELEGRAM_BOT_TOKEN_EN` or `TELEGRAM_BOT_TOKEN_RU`
(secrets). Compose cannot express "one of", so they default to empty, but the
connector **fails at boot** if both are empty.

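
Because the "one of" rule only surfaces when the connector boots, a preflight check before `up` can catch it earlier. A minimal sketch, assuming the variables are already loaded into the shell; `check_required_env` is a hypothetical helper, not something this repo ships:

```sh
# Hypothetical preflight (not part of this repo): mirror compose's `:?` checks
# and add the "at least one bot token" rule that compose cannot express.
check_required_env() {
  missing=""
  for name in POSTGRES_PASSWORD AWG_CONF GM_BASICAUTH_HASH TELEGRAM_MINIAPP_URL; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  if [ -z "${TELEGRAM_BOT_TOKEN_EN:-}" ] && [ -z "${TELEGRAM_BOT_TOKEN_RU:-}" ]; then
    missing="$missing TELEGRAM_BOT_TOKEN_EN|TELEGRAM_BOT_TOKEN_RU"
  fi
  if [ -n "$missing" ]; then
    echo "missing:$missing" >&2
    return 1
  fi
}
```

The function returns non-zero and lists the offending names on stderr, so it can gate the `docker compose up` call in a wrapper script.
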
## Optional variables (with defaults)

| Variable | Gitea kind | Default | Purpose |
| --- | --- | --- | --- |
| `POSTGRES_DB` | variable | `scrabble` | Database name. |
| `POSTGRES_USER` | variable | `scrabble` | Database user. |
| `DICT_VERSION` | variable | `v1.0.0` | `scrabble-dictionary` release tag baked into the backend image (build-arg). |
| `LOG_LEVEL` | variable | `info` | Shared log level for backend / gateway / connector (`debug\|info\|warn\|error`). |
| `CADDY_SITE_ADDRESS` | variable | `:80` | Caddy site address. Test: `:80` (host caddy terminates TLS). Prod: a domain, so caddy does its own ACME. |
| `GM_BASICAUTH_USER` | variable | `gm` | Username for the `/_gm` Basic-Auth. |
| `GRAFANA_ROOT_URL` | variable | `/_gm/grafana/` | Grafana root URL (sub-path serving). Set the full `https://<domain>/_gm/grafana/` behind a real domain. |
| `GRAFANA_ADMIN_PASSWORD` | secret | `admin` | Grafana admin password. Low impact (the login form is disabled, access is anonymous-admin behind caddy) but set it anyway. |
| `TELEGRAM_GAME_CHANNEL_ID_EN` | variable | _(empty)_ | English game-channel id; empty/`0` disables channel posts. |
| `TELEGRAM_GAME_CHANNEL_ID_RU` | variable | _(empty)_ | Russian game-channel id; empty/`0` disables channel posts. |
| `TELEGRAM_TEST_ENV` | _pinned_ | `false` | `true` routes the bot through Telegram's test environment (`.../bot<token>/test/METHOD`). **The CI test contour pins this to `true` in `ci.yaml`** (the contour is the test environment); it is not a Gitea variable. Set it in `.env` for a local run; prod (Stage 18) leaves it `false`. |
| `TELEGRAM_API_BASE_URL` | variable | _(empty)_ | Override the Bot API host (a mock/self-hosted server); empty = `https://api.telegram.org`. |
| `GATEWAY_DEFAULT_SUPPORTED_LANGUAGES` | variable | `en,ru` | Variant-gating set for non-Telegram logins (web/email/guest). |
| `VITE_TELEGRAM_BOT_ID` | variable | _(empty)_ | UI build-arg: numeric bot id for the web Login Widget. |
| `VITE_TELEGRAM_LINK` | variable | _(empty)_ | UI build-arg: deep-link base for share-to-Telegram (e.g. `https://t.me/<bot>/<app>`). |
| `VITE_TELEGRAM_LINK_EN` | variable | _(empty)_ | UI build-arg: the landing "Play in Telegram" link for the **English** bot (e.g. `https://t.me/Scrabble_Game`). |
| `VITE_TELEGRAM_LINK_RU` | variable | _(empty)_ | UI build-arg: the landing "Play in Telegram" link for the **Russian** bot (e.g. `https://t.me/Erudit_Game`). |
| `VITE_GATEWAY_URL` | variable | _(empty)_ | UI build-arg: gateway origin; empty = same-origin (the usual single-origin deploy). |

The five `VITE_*` variables are **build-args** baked into the gateway image at build
time, so changing them requires a rebuild (`--build`), not just a restart.

## Fixed internal wiring (not operator-set)

These are hard-wired in `docker-compose.yml` (no `${...}`), pointing the services
at each other on the `internal` network. They are listed here so they are not
mistaken for missing config: `BACKEND_POSTGRES_DSN` (→ `postgres`, `search_path=backend`),
`GATEWAY_BACKEND_HTTP_URL`/`_GRPC_ADDR` (→ `backend`),
`GATEWAY_CONNECTOR_ADDR`/`BACKEND_CONNECTOR_ADDR` (→ `telegram:9091`), and all three
services' `*_OTEL_*_EXPORTER=otlp` → `OTEL_EXPORTER_OTLP_ENDPOINT=http://otelcol:4317`
(`_INSECURE=true`). The connector shares the VPN sidecar's netns: routing to the
collector's internal IP is fine (connected route), but its `AWG_CONF` must **not**
set a `DNS=` directive. That directive hijacks resolv.conf and breaks resolving
`otelcol` ("produced zero addresses"); without it the netns uses Docker's resolver,
which resolves both `otelcol` and `api.telegram.org`. `GATEWAY_ADMIN_*` is
intentionally **unset**: caddy owns `/_gm` in the contour.

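
The `DNS=` constraint is easy to violate when pasting a generated client config, so it may be worth checking before the secret is stored. A small sketch, assuming the config text is available in a shell variable; `awg_conf_sets_dns` is a hypothetical name:

```sh
# Hypothetical check: succeed (exit 0) when the AmneziaWG config text contains
# a DNS directive, matched case-insensitively at the start of a line.
# Commented-out lines (starting with '#') do not match.
awg_conf_sets_dns() {
  printf '%s\n' "$1" | grep -Eiq '^[[:space:]]*DNS[[:space:]]*='
}
```

For example, `if awg_conf_sets_dns "$AWG_CONF"; then echo "remove the DNS= line" >&2; fi` would flag a config that breaks `otelcol` resolution in the shared netns.
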
## Host-side setup (outside this repo)

- **`edge` network** must exist on the host (`docker network create edge`).
- **Host caddy** route `<domain> → scrabble:80` (the in-compose caddy serves HTTP
  in the test contour; the host caddy terminates TLS). Not needed on prod, where the
  contour caddy owns TLS (set `CADDY_SITE_ADDRESS` to the domain).
- **Branch protection** requires the single status check `CI / gate` (Stage 17).
  The `unit` / `integration` / `ui` jobs are path-conditional (they skip when their
  code did not change), and the always-running `gate` job aggregates them (passing
  when each succeeded or was skipped), so a skipped job never blocks a merge. See
  [`../CLAUDE.md`](../CLAUDE.md) "Branching & CI".

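
The aggregation rule of the `gate` job can be sketched as a small function. The result strings `success` and `skipped` are an assumption modelled on the usual Actions-style job results, and the actual logic lives in `ci.yaml`:

```sh
# Hypothetical sketch of the gate rule: pass only if every upstream job result
# is "success" or "skipped"; any other value blocks the merge.
gate() {
  for result in "$@"; do
    case "$result" in
      success|skipped) ;;
      *) echo "gate: blocking result '$result'" >&2; return 1 ;;
    esac
  done
}
```

With this rule, `gate success skipped success` passes while `gate success failure skipped` fails, which is why a path-skipped job never blocks a merge.
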