feat: use postgres

Ilia Denisov
2026-04-26 20:34:39 +02:00
committed by GitHub
parent 48b0056b49
commit fe829285a6
365 changed files with 29223 additions and 24049 deletions
+109
@@ -0,0 +1,109 @@
# Decision: Redis configuration shape
PG_PLAN.md §7. Captures the standing rules adopted by Edge Gateway when it
joined the project-wide Redis topology defined in
`ARCHITECTURE.md §Persistence Backends`.
## Context
Gateway intentionally stays Redis-only. Everything it stores in Redis is
either TTL-bounded or runtime-coordination state:
- the session cache is a read-through projection of authsession's
source-of-truth session records (rebuildable via re-authentication);
- the replay store is a short-lived `SETNX` reservation namespace per
authenticated request (`GATEWAY_REPLAY_REDIS_RESERVE_TIMEOUT`);
- the session-events stream is a runtime fan-out of session lifecycle
updates;
- the client-events stream is a runtime push fan-out.
Stage 7 brought gateway in line with the steady-state rules established in
Stage 0: every Galaxy service uses one master plus zero-or-more replicas
with a mandatory password, no TLS, and no Redis ACL username; the connection
is configured by the shared `pkg/redisconn` helper.
## Decisions
### One shared `*redis.Client` owned by the runtime
`cmd/gateway/main.go` constructs a single `*redis.Client` via
`internal/redisclient.NewClient`, attaches OpenTelemetry tracing and metrics
via `internal/redisclient.InstrumentClient`, performs one bounded `PING`
via `internal/redisclient.Ping`, and registers `client.Close` for shutdown.
The session cache, replay store, session-events subscriber, and
client-events subscriber all receive this same client.
Adapters no longer build or own a Redis client. Their `Config` structs hold
only behavior settings (key prefix, stream name, per-subsystem timeouts).
Adapter constructors take `(*redis.Client, …)`. The stream subscribers'
`Close`/`Shutdown` methods became no-ops; the runtime's context cancellation
unblocks the `XRead` loop and the runtime closes the shared client.
### One env-var prefix for the connection
Connection topology is loaded from a single `GATEWAY_REDIS_*` group via
`redisconn.LoadFromEnv("GATEWAY")`:
- `GATEWAY_REDIS_MASTER_ADDR` (required)
- `GATEWAY_REDIS_REPLICA_ADDRS` (optional, comma-separated; currently
unused, reserved for future read-routing)
- `GATEWAY_REDIS_PASSWORD` (required)
- `GATEWAY_REDIS_DB` (default `0`)
- `GATEWAY_REDIS_OPERATION_TIMEOUT` (default `250ms`)
Per-subsystem behavior env vars keep their existing prefixes — they do not
describe connection topology, only namespace and timing:
- `GATEWAY_SESSION_CACHE_REDIS_KEY_PREFIX`,
`GATEWAY_SESSION_CACHE_REDIS_LOOKUP_TIMEOUT`
- `GATEWAY_REPLAY_REDIS_KEY_PREFIX`,
`GATEWAY_REPLAY_REDIS_RESERVE_TIMEOUT`
- `GATEWAY_SESSION_EVENTS_REDIS_STREAM`,
`GATEWAY_SESSION_EVENTS_REDIS_READ_BLOCK_TIMEOUT`
- `GATEWAY_CLIENT_EVENTS_REDIS_STREAM`,
`GATEWAY_CLIENT_EVENTS_REDIS_READ_BLOCK_TIMEOUT`
### Retired env vars (hard removal)
The following variables are no longer read or honored:
- `GATEWAY_SESSION_CACHE_REDIS_ADDR` — replaced by
`GATEWAY_REDIS_MASTER_ADDR`.
- `GATEWAY_SESSION_CACHE_REDIS_USERNAME` — Redis ACL not used.
- `GATEWAY_SESSION_CACHE_REDIS_PASSWORD` — replaced by
`GATEWAY_REDIS_PASSWORD`.
- `GATEWAY_SESSION_CACHE_REDIS_DB` — replaced by `GATEWAY_REDIS_DB`.
- `GATEWAY_SESSION_CACHE_REDIS_TLS_ENABLED` — TLS disabled by policy.
`pkg/redisconn.LoadFromEnv` rejects `GATEWAY_REDIS_TLS_ENABLED` and
`GATEWAY_REDIS_USERNAME` at startup with a clear error pointing to
`ARCHITECTURE.md §Persistence Backends`.
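A startup guard of this kind might look like the following sketch. The `rejectDeprecated` function, the `deprecatedSuffixes` list, and the error wording are hypothetical, and treating a variable that is set but empty as present is an assumption; the real detector lives in `pkg/redisconn` and may behave differently.

```go
package main

import (
	"fmt"
	"os"
)

// deprecatedSuffixes lists the two canonical-form variables that the text
// above says are rejected at startup.
var deprecatedSuffixes = []string{"TLS_ENABLED", "USERNAME"}

// rejectDeprecated returns an error if any <prefix>_REDIS_<suffix> variable
// is present. Only the canonical form is checked, so compound legacy names
// such as GATEWAY_SESSION_CACHE_REDIS_USERNAME pass through unnoticed.
func rejectDeprecated(prefix string, lookup func(string) (string, bool)) error {
	for _, suffix := range deprecatedSuffixes {
		name := prefix + "_REDIS_" + suffix
		if _, ok := lookup(name); ok {
			return fmt.Errorf("%s is no longer supported; see ARCHITECTURE.md §Persistence Backends", name)
		}
	}
	return nil
}

func main() {
	if err := rejectDeprecated("GATEWAY", os.LookupEnv); err != nil {
		fmt.Fprintln(os.Stderr, "startup aborted:", err)
		os.Exit(1)
	}
	fmt.Println("no deprecated Redis variables set")
}
```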
> **Compound legacy prefixes (`GATEWAY_SESSION_CACHE_REDIS_USERNAME` etc.)
> are not actively rejected.** `pkg/redisconn`'s deprecated-env detector
> only watches the canonical `GATEWAY_REDIS_*` form. The compound legacy
> vars become silently inert. The architecture rule explicitly accepts this
> ("no backward-compat shim — fresh project, no production deploys to
> migrate"); operators upgrading should remove the variables from their
> deployment manifests.
### Telemetry
`redisconn.Instrument` wires `redisotel.InstrumentTracing` (with
`WithDBStatement(false)`) and `redisotel.InstrumentMetrics`. This is the
first gateway release that emits Redis tracing and connection-pool metrics;
downstream dashboards will start populating without further changes.
## Consequences
- Gateway test code that previously constructed a Redis client per adapter
must now construct one client and pass it to every adapter under test
(see `internal/session/redis_test.go`, `internal/replay/redis_test.go`,
`internal/events/subscriber_test.go`,
`internal/events/client_subscriber_test.go`).
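- Operators must set `GATEWAY_REDIS_PASSWORD`. A passwordless local Redis
is still acceptable as long as a placeholder password is supplied to the
binary: in Redis 6 and later, a server without `requirepass` leaves the
`default` user in `nopass` mode, which accepts any password presented
through the username form (`AUTH default <password>`, or the equivalent
`AUTH` clause of `HELLO`). The legacy single-argument `AUTH <password>`
is rejected in that configuration, so this relies on the client using
the username form.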
- The integration test harness passes `GATEWAY_REDIS_PASSWORD =
"integration"` alongside `GATEWAY_REDIS_MASTER_ADDR` (see
`integration/internal/harness/gatewayservice.go`).
+17 -14
@@ -7,25 +7,28 @@ readiness, shutdown, and push or revoke incidents.
 Before starting the process, confirm:
-- `GATEWAY_SESSION_CACHE_REDIS_ADDR` points to the Redis deployment used for
-session lookup and both internal event streams.
+- `GATEWAY_REDIS_MASTER_ADDR` and `GATEWAY_REDIS_PASSWORD` point to the Redis
+deployment used for session lookup, replay reservations, session-events
+consumption, and client-events fan-out. Optional read replicas may be
+listed in `GATEWAY_REDIS_REPLICA_ADDRS` (currently unused; reserved for
+future read-routing).
 - `GATEWAY_SESSION_EVENTS_REDIS_STREAM` and
-`GATEWAY_CLIENT_EVENTS_REDIS_STREAM` reference existing Redis Stream keys or
-the names publishers will use.
+`GATEWAY_CLIENT_EVENTS_REDIS_STREAM` reference existing Redis Stream keys
+or the names publishers will use.
 - `GATEWAY_RESPONSE_SIGNER_PRIVATE_KEY_PEM_PATH` points to a readable PKCS#8
 PEM-encoded Ed25519 private key.
-- the configured Redis ACL, DB, TLS, and key-prefix settings match the target
-environment.
+- the configured Redis DB and key-prefix settings match the target
+environment. Per `ARCHITECTURE.md §Persistence Backends`, Redis traffic is
+password-protected and TLS is disabled by policy; the deprecated
+`GATEWAY_REDIS_TLS_ENABLED` and `GATEWAY_REDIS_USERNAME` variables are no
+longer accepted and cause a hard fail at startup.
-At startup the process performs bounded `PING` checks for:
+At startup the process opens one shared `*redis.Client` (instrumented via
+OpenTelemetry tracing and metrics) and performs one bounded `PING`. The
+session cache, replay store, session-events subscriber, and client-events
+subscriber all use that client.
-- the Redis-backed session cache adapter;
-- the replay store;
-- the session event subscriber;
-- the client event subscriber.
-Startup fails fast if any of those checks fail or if the signer key cannot be
-loaded.
+Startup fails fast if the ping fails or if the signer key cannot be loaded.
 Expected listener state after a healthy start: