galaxy-game/gateway/docs/redis-config.md
2026-04-26 20:34:39 +02:00

Decision: Redis configuration shape

PG_PLAN.md §7. This document captures the standing rules that Edge Gateway adopted when it joined the project-wide Redis topology defined in ARCHITECTURE.md §Persistence Backends.

Context

Gateway intentionally stays Redis-only. Every piece of gateway state held in Redis is either TTL-bounded or runtime-coordination state:

  • the session cache is a read-through projection of authsession's source-of-truth session records (rebuildable via re-authentication);
  • the replay store is a short-lived SETNX reservation namespace per authenticated request (GATEWAY_REPLAY_REDIS_RESERVE_TIMEOUT);
  • the session-events stream is a runtime fan-out of session lifecycle updates;
  • the client-events stream is a runtime push fan-out.
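The replay-store bullet relies on set-if-absent-with-expiry semantics, which in Redis is a single `SET key value NX PX <ms>` command. The sketch below models those semantics in memory so it runs without a Redis server. Every name in it (`replayStore`, `Reserve`, `manualClock`, `exampleTTL`) is hypothetical, and the 30-second lifetime is an arbitrary value chosen for illustration, not the gateway's setting.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// manualClock is a hand-advanced clock so the example is deterministic.
type manualClock struct{ t time.Time }

func (c *manualClock) Now() time.Time          { return c.t }
func (c *manualClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

// exampleTTL is an arbitrary reservation lifetime chosen for this sketch.
const exampleTTL = 30 * time.Second

// replayStore is an in-memory stand-in for a Redis key namespace written
// with set-if-absent plus expiry. It is not the gateway's adapter.
type replayStore struct {
	mu      sync.Mutex
	expires map[string]time.Time
	ttl     time.Duration
	now     func() time.Time
}

func newReplayStore(ttl time.Duration, now func() time.Time) *replayStore {
	return &replayStore{expires: make(map[string]time.Time), ttl: ttl, now: now}
}

// Reserve returns true only for the first caller that claims requestID
// while no unexpired reservation exists for it.
func (s *replayStore) Reserve(requestID string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	now := s.now()
	if exp, ok := s.expires[requestID]; ok && now.Before(exp) {
		return false // still reserved: treat the request as a replay
	}
	s.expires[requestID] = now.Add(s.ttl)
	return true
}

func main() {
	clock := &manualClock{}
	store := newReplayStore(exampleTTL, clock.Now)

	fmt.Println(store.Reserve("req-1")) // first sight of the request
	fmt.Println(store.Reserve("req-1")) // same request inside the TTL
	clock.Advance(exampleTTL)
	fmt.Println(store.Reserve("req-1")) // after the reservation expired
}
```

Because each reservation expires on its own, losing this state costs nothing durable, which is consistent with the TTL-bounded framing above.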

Stage 7 brought gateway in line with the steady-state rules established in Stage 0: every Galaxy service uses one master plus zero-or-more replicas with a mandatory password, no TLS, and no Redis ACL username; the connection is configured by the shared pkg/redisconn helper.

Decisions

One shared *redis.Client owned by the runtime

cmd/gateway/main.go constructs a single *redis.Client via internal/redisclient.NewClient, attaches OpenTelemetry tracing and metrics via internal/redisclient.InstrumentClient, performs one bounded PING via internal/redisclient.Ping, and registers client.Close for shutdown. The session cache, replay store, session-events subscriber, and client-events subscriber all receive this same client.

Adapters no longer build or own a Redis client. Their Config structs hold only behavior settings (key prefix, stream name, per-subsystem timeouts). Adapter constructors take (*redis.Client, …). The stream subscribers' Close/Shutdown methods became no-ops; the runtime's context cancellation unblocks the XRead loop and the runtime closes the shared client.
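The ownership rule can be illustrated with a small standalone sketch. It assumes a hypothetical `closer` interface standing in for `*redis.Client`, and the adapter types, constructors, key prefix, and stream name are all invented for the example; only the shape (adapters borrow the client, the runtime closes it once) comes from the text above.

```go
package main

import "fmt"

// closer stands in for *redis.Client; only Close matters in this sketch.
type closer interface{ Close() error }

// countingClient is a fake client that records how often it was closed.
type countingClient struct{ closed int }

func (c *countingClient) Close() error { c.closed++; return nil }

// sessionCacheConfig holds behavior settings only, no connection details.
type sessionCacheConfig struct{ KeyPrefix string }

// sessionCache is a hypothetical adapter that borrows the shared client.
type sessionCache struct {
	client closer
	cfg    sessionCacheConfig
}

func newSessionCache(c closer, cfg sessionCacheConfig) *sessionCache {
	return &sessionCache{client: c, cfg: cfg}
}

// subscriberConfig holds behavior settings only, no connection details.
type subscriberConfig struct{ Stream string }

// subscriber is a hypothetical stream adapter that borrows the shared client.
type subscriber struct {
	client closer
	cfg    subscriberConfig
}

func newSubscriber(c closer, cfg subscriberConfig) *subscriber {
	return &subscriber{client: c, cfg: cfg}
}

// Close is a no-op: the adapter does not own the client.
func (s *subscriber) Close() error { return nil }

func main() {
	client := &countingClient{} // constructed once by the runtime

	cache := newSessionCache(client, sessionCacheConfig{KeyPrefix: "example:session:"})
	sub := newSubscriber(client, subscriberConfig{Stream: "example:session-events"})
	fmt.Println("adapters:", cache.cfg.KeyPrefix, sub.cfg.Stream)

	_ = sub.Close()
	fmt.Println("closes after adapter shutdown:", client.closed)

	_ = client.Close() // the runtime closes the shared client exactly once
	fmt.Println("closes after runtime shutdown:", client.closed)
}
```

Keeping `Close` on the adapter as a no-op lets existing shutdown code keep calling it without risking a double close of the shared client.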

One env-var prefix for the connection

Connection topology is loaded from a single GATEWAY_REDIS_* group via redisconn.LoadFromEnv("GATEWAY"):

  • GATEWAY_REDIS_MASTER_ADDR (required)
  • GATEWAY_REDIS_REPLICA_ADDRS (optional, comma-separated; currently unused, reserved for future read-routing)
  • GATEWAY_REDIS_PASSWORD (required)
  • GATEWAY_REDIS_DB (default 0)
  • GATEWAY_REDIS_OPERATION_TIMEOUT (default 250ms)
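As a rough illustration of how a prefix-based loader could map these variables to a struct, here is a standard-library-only sketch. It is not the `pkg/redisconn` implementation: `redisConfig` and `loadRedisConfig` are hypothetical names, and the validation behavior (for example, treating an empty value as unset) is an assumption. The variable names and defaults are taken from the list above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

// redisConfig mirrors the connection variables listed above (hypothetical type).
type redisConfig struct {
	MasterAddr       string
	ReplicaAddrs     []string
	Password         string
	DB               int
	OperationTimeout time.Duration
}

// loadRedisConfig reads <prefix>_REDIS_* through getenv. Empty values are
// treated as unset, which is an assumption of this sketch.
func loadRedisConfig(prefix string, getenv func(string) string) (redisConfig, error) {
	key := func(name string) string { return prefix + "_REDIS_" + name }

	// Defaults from the list above: DB 0, operation timeout 250ms.
	cfg := redisConfig{DB: 0, OperationTimeout: 250 * time.Millisecond}

	if cfg.MasterAddr = getenv(key("MASTER_ADDR")); cfg.MasterAddr == "" {
		return redisConfig{}, fmt.Errorf("%s is required", key("MASTER_ADDR"))
	}
	if cfg.Password = getenv(key("PASSWORD")); cfg.Password == "" {
		return redisConfig{}, fmt.Errorf("%s is required", key("PASSWORD"))
	}
	if raw := getenv(key("REPLICA_ADDRS")); raw != "" {
		for _, addr := range strings.Split(raw, ",") {
			if addr = strings.TrimSpace(addr); addr != "" {
				cfg.ReplicaAddrs = append(cfg.ReplicaAddrs, addr)
			}
		}
	}
	if raw := getenv(key("DB")); raw != "" {
		db, err := strconv.Atoi(raw)
		if err != nil {
			return redisConfig{}, fmt.Errorf("%s: %w", key("DB"), err)
		}
		cfg.DB = db
	}
	if raw := getenv(key("OPERATION_TIMEOUT")); raw != "" {
		d, err := time.ParseDuration(raw)
		if err != nil {
			return redisConfig{}, fmt.Errorf("%s: %w", key("OPERATION_TIMEOUT"), err)
		}
		cfg.OperationTimeout = d
	}
	return cfg, nil
}

func main() {
	cfg, err := loadRedisConfig("GATEWAY", os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	// The password is deliberately left out of the output.
	fmt.Printf("master=%s replicas=%v db=%d timeout=%s\n",
		cfg.MasterAddr, cfg.ReplicaAddrs, cfg.DB, cfg.OperationTimeout)
}
```

Passing `getenv` as a parameter keeps the loader testable with a plain map instead of the process environment.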

Per-subsystem behavior env vars keep their existing prefixes — they do not describe connection topology, only namespace and timing:

  • GATEWAY_SESSION_CACHE_REDIS_KEY_PREFIX, GATEWAY_SESSION_CACHE_REDIS_LOOKUP_TIMEOUT
  • GATEWAY_REPLAY_REDIS_KEY_PREFIX, GATEWAY_REPLAY_REDIS_RESERVE_TIMEOUT
  • GATEWAY_SESSION_EVENTS_REDIS_STREAM, GATEWAY_SESSION_EVENTS_REDIS_READ_BLOCK_TIMEOUT
  • GATEWAY_CLIENT_EVENTS_REDIS_STREAM, GATEWAY_CLIENT_EVENTS_REDIS_READ_BLOCK_TIMEOUT

Retired env vars (hard removal)

The following variables are no longer read or honored:

  • GATEWAY_SESSION_CACHE_REDIS_ADDR — replaced by GATEWAY_REDIS_MASTER_ADDR.
  • GATEWAY_SESSION_CACHE_REDIS_USERNAME — Redis ACL not used.
  • GATEWAY_SESSION_CACHE_REDIS_PASSWORD — replaced by GATEWAY_REDIS_PASSWORD.
  • GATEWAY_SESSION_CACHE_REDIS_DB — replaced by GATEWAY_REDIS_DB.
  • GATEWAY_SESSION_CACHE_REDIS_TLS_ENABLED — TLS disabled by policy.

pkg/redisconn.LoadFromEnv rejects GATEWAY_REDIS_TLS_ENABLED and GATEWAY_REDIS_USERNAME at startup with a clear error pointing to ARCHITECTURE.md §Persistence Backends.

Compound legacy prefixes (GATEWAY_SESSION_CACHE_REDIS_USERNAME etc.) are not actively rejected. pkg/redisconn's deprecated-env detector only watches the canonical GATEWAY_REDIS_* form. The compound legacy vars become silently inert. The architecture rule explicitly accepts this ("no backward-compat shim — fresh project, no production deploys to migrate"); operators upgrading should remove the variables from their deployment manifests.
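The difference between rejected and inert variables can be sketched as follows. This is a standard-library-only illustration, not the `pkg/redisconn` detector: `checkRejectedEnv` and `rejectedSuffixes` are hypothetical names, and treating a variable that is set to an empty string as present is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// rejectedSuffixes lists the canonical settings that policy forbids.
var rejectedSuffixes = []string{"TLS_ENABLED", "USERNAME"}

// checkRejectedEnv fails if <prefix>_REDIS_<suffix> is present for any
// forbidden suffix. Only the canonical form is checked, so compound legacy
// names such as GATEWAY_SESSION_CACHE_REDIS_USERNAME pass through unnoticed.
func checkRejectedEnv(prefix string, lookup func(string) (string, bool)) error {
	for _, suffix := range rejectedSuffixes {
		name := prefix + "_REDIS_" + suffix
		if _, ok := lookup(name); ok {
			return fmt.Errorf("%s is not supported; see ARCHITECTURE.md §Persistence Backends", name)
		}
	}
	return nil
}

func main() {
	if err := checkRejectedEnv("GATEWAY", os.LookupEnv); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("redis env ok")
}
```

A deployment that still sets only the compound legacy names would therefore start normally, which is why removing them from manifests is left to operators.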

Telemetry

redisconn.Instrument wires redisotel.InstrumentTracing (with WithDBStatement(false)) and redisotel.InstrumentMetrics. This is the first gateway release that emits Redis tracing and connection-pool metrics; downstream dashboards will start populating without further changes.

Consequences

  • Gateway test code that previously constructed a Redis client per adapter must now construct one client and pass it to every adapter under test (see internal/session/redis_test.go, internal/replay/redis_test.go, internal/events/subscriber_test.go, internal/events/client_subscriber_test.go).
  • Operators must set GATEWAY_REDIS_PASSWORD. A passwordless local Redis is still acceptable as long as a placeholder password is supplied to the binary; Redis without requirepass accepts AUTH unconditionally.
  • The integration test harness passes GATEWAY_REDIS_PASSWORD = "integration" alongside GATEWAY_REDIS_MASTER_ADDR (see integration/internal/harness/gatewayservice.go).