# Edge Gateway

## Purpose

`Edge Gateway` is the only public ingress for Galaxy Plus clients.
It terminates the external transport and security boundary, enforces edge
policies, and routes verified requests to internal services.

The gateway does not implement domain-specific business logic.
Business validation, authorization, ownership checks, and state transitions
remain inside downstream services.

## Trust Boundary

The gateway sits between untrusted external clients and trusted internal
services.

The gateway is responsible for:

- parsing external transport requests;
- classifying public REST traffic;
- authenticating protected gRPC traffic;
- loading session state from cache;
- verifying request freshness and anti-replay constraints;
- applying edge rate limits and anti-abuse policy;
- building an authenticated internal command context;
- routing verified commands to internal services;
- maintaining authenticated push delivery connections.

The gateway is not responsible for:

- deciding whether a user is allowed to execute a business action;
- validating domain invariants;
- storing the source-of-truth session record;
- implementing business idempotency.

## Transport Matrix

The gateway exposes two external transport classes.

| Transport | Audience | Authentication | Payload format | Primary use |
| --- | --- | --- | --- | --- |
| REST/JSON | Public, unauthenticated traffic | No device session auth | JSON | Public auth commands, health checks, browser/bootstrap traffic |
| gRPC over HTTP/2 | Authenticated clients only | Required | FlatBuffers payload inside protobuf control envelope | Verified commands and push delivery |

### Public REST Surface

The public REST surface is used for commands that must work before a device
session exists and for browser-originated traffic that may share the same edge.

Stable public endpoints:

- `POST /api/v1/public/auth/send-email-code`
- `POST /api/v1/public/auth/confirm-email-code`
- `GET /healthz`
- `GET /readyz`

In addition to the fixed endpoints above, the gateway may front browser
bootstrap or asset traffic through a pluggable public handler or proxy.
That traffic belongs to dedicated public route classes and must not share rate
limit buckets or abuse counters with the public auth API.

### Authenticated gRPC Surface

All authenticated client requests use HTTP/2 and gRPC.

The public gRPC service exposes two methods:

- `ExecuteCommand(ExecuteCommandRequest) returns (ExecuteCommandResponse)`
- `SubscribeEvents(SubscribeEventsRequest) returns (stream GatewayEvent)`

`ExecuteCommand` is a generic unary RPC.
The gateway routes the request downstream by `message_type` after transport
verification succeeds.

`SubscribeEvents` is an authenticated server-streaming RPC.
It binds the stream to `user_id` and `device_session_id` and starts by sending
a service event that includes the current server time in milliseconds.

## Envelope and Payload Model

The authenticated transport uses a split contract:

- gRPC control messages are protobuf-based;
- business payload bytes are FlatBuffers;
- signatures are computed over canonical envelope fields and a hash of raw
  FlatBuffers bytes.

The gateway treats `payload_bytes` as opaque business data.
It verifies integrity and forwards verified bytes downstream without rewriting
them.
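
The split contract above can be illustrated with a minimal Go sketch. It is not the gateway's actual wire format: SHA-256 as the payload hash, the Go field types, and the length-prefixed field order in `SigningInput` are all assumptions made for illustration, since this document does not fix them.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// Envelope lists the signed control fields. Field names follow this document;
// the Go types and the canonical encoding below are assumptions.
type Envelope struct {
	ProtocolVersion string
	DeviceSessionID string
	MessageType     string
	TimestampMs     int64
	RequestID       string
	PayloadHash     []byte
}

// HashPayload returns the SHA-256 digest of the raw payload bytes (assumed algorithm).
func HashPayload(payload []byte) []byte {
	sum := sha256.Sum256(payload)
	return sum[:]
}

// PayloadHashMatches reports whether hash equals the digest of payload.
func PayloadHashMatches(payload, hash []byte) bool {
	return bytes.Equal(HashPayload(payload), hash)
}

// SigningInput serializes the envelope fields in a fixed order, each with a
// 4-byte big-endian length prefix, so field boundaries cannot be shifted.
func SigningInput(e Envelope) []byte {
	var buf bytes.Buffer
	put := func(b []byte) {
		var n [4]byte
		binary.BigEndian.PutUint32(n[:], uint32(len(b)))
		buf.Write(n[:])
		buf.Write(b)
	}
	var ts [8]byte
	binary.BigEndian.PutUint64(ts[:], uint64(e.TimestampMs))
	put([]byte(e.ProtocolVersion))
	put([]byte(e.DeviceSessionID))
	put([]byte(e.MessageType))
	put(ts[:])
	put([]byte(e.RequestID))
	put(e.PayloadHash)
	return buf.Bytes()
}

func main() {
	payload := []byte("raw FlatBuffers bytes")
	e := Envelope{ProtocolVersion: "1", DeviceSessionID: "ds-1", MessageType: "example.command",
		TimestampMs: 1700000000000, RequestID: "req-1", PayloadHash: HashPayload(payload)}
	fmt.Println("hash ok:", PayloadHashMatches(payload, e.PayloadHash))
	fmt.Println("signing input bytes:", len(SigningInput(e)))
}
```

Because the signature covers only the hash of the payload, the gateway can verify integrity without parsing the FlatBuffers content.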

### ExecuteCommandRequest

Required fields:

- `protocol_version`
- `device_session_id`
- `message_type`
- `timestamp_ms`
- `request_id`
- `payload_bytes`
- `payload_hash`
- `signature`

Optional fields:

- `trace_id`

### ExecuteCommandResponse

Required fields:

- `protocol_version`
- `request_id`
- `timestamp_ms`
- `result_code`
- `payload_bytes`
- `payload_hash`
- `signature`

### SubscribeEventsRequest

The stream open request reuses the authenticated request model.
It contains the same authentication fields as the unary request and either an
empty payload or a minimal connect payload.

Required fields:

- `protocol_version`
- `device_session_id`
- `message_type`
- `timestamp_ms`
- `request_id`
- `payload_hash`
- `signature`

Optional fields:

- `payload_bytes`
- `trace_id`

### GatewayEvent

Every stream event is a client-facing signed server message.

Required fields:

- `event_type`
- `event_id`
- `timestamp_ms`
- `payload_bytes`
- `payload_hash`
- `signature`

Optional fields:

- `request_id`
- `trace_id`

## Verification and Routing Pipeline

The gateway applies the same strict verification order to all authenticated
gRPC ingress, both `ExecuteCommand` calls and `SubscribeEvents` stream opens.

1. Parse the control envelope and validate required fields.
2. Check whether `protocol_version` is supported.
3. Resolve `device_session_id` through `SessionCache`.
4. Reject unknown or revoked sessions.
5. Verify that `payload_hash` matches raw `payload_bytes`.
6. Verify the client signature using the public key from session cache.
7. Verify that `timestamp_ms` is inside the accepted freshness window.
8. Verify anti-replay by checking `device_session_id + request_id`.
9. Apply authenticated rate limit and edge policy checks.
10. Build the authenticated internal command context.
11. Route the command downstream by `message_type`.

No downstream business service should receive a request that has not passed
this full verification pipeline.
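
One way to keep the order explicit is to model the pipeline as an ordered list of named checks that stops at the first failure. The sketch below is illustrative only: `Request`, `Step`, `Verify`, and the step names are hypothetical, and every check is a stub.

```go
package main

import (
	"errors"
	"fmt"
)

// Request stands in for the parsed control envelope (hypothetical type).
type Request struct {
	DeviceSessionID string
	RequestID       string
}

// Step is one named verification stage; a non-nil error rejects the request.
type Step struct {
	Name  string
	Check func(*Request) error
}

// ErrRejected is a placeholder rejection used by the stub checks below.
var ErrRejected = errors.New("rejected")

// Verify runs the steps in order and stops at the first failure, so later
// stages never see a request that failed an earlier one.
func Verify(r *Request, steps []Step) error {
	for _, s := range steps {
		if err := s.Check(r); err != nil {
			return fmt.Errorf("%s: %w", s.Name, err)
		}
	}
	return nil
}

// pass is a stub check that always accepts.
func pass(*Request) error { return nil }

func main() {
	steps := []Step{
		{"parse_envelope", pass},
		{"protocol_version", pass},
		{"resolve_session", pass},
		{"session_status", pass},
		{"payload_hash", pass},
		{"signature", pass},
		{"freshness", pass},
		{"anti_replay", func(*Request) error { return ErrRejected }},
		{"rate_limit", pass},
	}
	err := Verify(&Request{DeviceSessionID: "ds-1", RequestID: "req-1"}, steps)
	fmt.Println(err, errors.Is(err, ErrRejected))
}
```

Context building and routing would run only after `Verify` returns `nil`, which keeps unverified requests away from downstream services.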

## Internal Authenticated Contract

Downstream services should receive an internal authenticated command rather than
raw external gRPC transport data.

The minimum authenticated context is:

- `user_id`
- `device_session_id`
- `message_type`
- verified `payload_bytes`
- `request_id`
- optional `trace_id`
- optional client metadata needed for logs and tracing

Downstream services may trust that the gateway has already performed transport
authentication, freshness verification, and anti-replay checks.
They must still perform business authorization and domain validation.
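
As a rough illustration, the minimum context could be carried in a struct such as the one below. The type name `AuthenticatedCommand`, the Go field types, and the `Validate` helper are assumptions; the field set mirrors the list above.

```go
package main

import (
	"errors"
	"fmt"
)

// AuthenticatedCommand is a hypothetical carrier for the minimum context.
type AuthenticatedCommand struct {
	UserID          string
	DeviceSessionID string
	MessageType     string
	PayloadBytes    []byte            // verified bytes, forwarded without rewriting
	RequestID       string
	TraceID         string            // optional; empty when absent
	ClientMeta      map[string]string // optional; for logs and tracing only
}

// Validate guards against handing an incomplete context to a downstream service.
func (c AuthenticatedCommand) Validate() error {
	if c.UserID == "" || c.DeviceSessionID == "" || c.MessageType == "" || c.RequestID == "" {
		return errors.New("authenticated command is missing a required field")
	}
	return nil
}

func main() {
	cmd := AuthenticatedCommand{
		UserID:          "u-1",
		DeviceSessionID: "ds-1",
		MessageType:     "example.command",
		PayloadBytes:    []byte{0x01},
		RequestID:       "req-1",
	}
	fmt.Println(cmd.Validate() == nil)
}
```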

## Session Model

The Auth / Session Service is the source of truth for device session state.
The gateway is designed to authenticate the hot path from cache.

Expected session fields available to the gateway:

- `device_session_id`
- `user_id`
- client public key
- session status
- revoke metadata
- optional client metadata

### Session Cache

`SessionCache` provides the fast path for:

- session existence checks;
- `device_session_id -> user_id`;
- access to the client public key used for signature verification;
- revoked versus active status checks.

Cache updates are event-driven.
TTL is allowed only as a safety net and must not replace invalidation events.
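
The following in-memory sketch shows the intended split between event-driven updates and the TTL safety net. All names (`MemorySessionCache`, `Apply`, `Get`, `FakeClock`) are hypothetical, and a production cache would likely be shared or backed by an external store.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Session holds the cached fields the gateway needs on the hot path.
type Session struct {
	DeviceSessionID string
	UserID          string
	PublicKey       []byte
	Revoked         bool
}

type entry struct {
	session  Session
	storedAt time.Time
}

// FakeClock is a manually advanced clock used to demonstrate the TTL path.
type FakeClock struct{ t time.Time }

func (c *FakeClock) Now() time.Time          { return c.t }
func (c *FakeClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

// MemorySessionCache is updated by events; ttl only bounds staleness.
type MemorySessionCache struct {
	mu   sync.RWMutex
	ttl  time.Duration
	now  func() time.Time
	data map[string]entry
}

func NewMemorySessionCache(ttl time.Duration, now func() time.Time) *MemorySessionCache {
	return &MemorySessionCache{ttl: ttl, now: now, data: make(map[string]entry)}
}

// Apply stores the state carried by a session update or revoke event.
func (c *MemorySessionCache) Apply(s Session) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[s.DeviceSessionID] = entry{session: s, storedAt: c.now()}
}

// Get returns the cached session. Entries older than ttl count as misses.
func (c *MemorySessionCache) Get(id string) (Session, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.data[id]
	if !ok || c.now().Sub(e.storedAt) > c.ttl {
		return Session{}, false
	}
	return e.session, true
}

func main() {
	clk := &FakeClock{}
	cache := NewMemorySessionCache(time.Minute, clk.Now)
	cache.Apply(Session{DeviceSessionID: "ds-1", UserID: "u-1"})
	cache.Apply(Session{DeviceSessionID: "ds-1", UserID: "u-1", Revoked: true}) // revoke event
	s, ok := cache.Get("ds-1")
	fmt.Println(ok, s.Revoked)
	clk.Advance(2 * time.Minute)
	_, ok = cache.Get("ds-1")
	fmt.Println(ok)
}
```

Keeping a revoked entry in the cache, rather than deleting it, lets the gateway distinguish a revoked session from an unknown one.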

### Revocation Behavior

When a device session is revoked:

1. the Auth / Session Service updates the source of truth;
2. it publishes a session update or revoke event;
3. the gateway invalidates or updates `SessionCache`;
4. new unary gRPC requests for that session are rejected;
5. active `SubscribeEvents` streams for that session are closed.

## Public Anti-Abuse Model

The public REST layer must distinguish between public auth operations and
browser-originated traffic that may burst during a normal first page load.

The gateway uses these public route classes:

- `public_auth`
- `browser_bootstrap`
- `browser_asset`
- `public_misc`

### Public Auth

`public_auth` includes `send-email-code` and `confirm-email-code`.
This class uses stricter limits and abuse scoring because it directly touches
account and session creation flows.

Controls include:

- per-IP and per-identity rate limits;
- request body size limits;
- method allow-lists;
- malformed request counters;
- elevated logging and security telemetry for repeated failures.

### Browser Bootstrap and Asset Traffic

`browser_bootstrap` and `browser_asset` use separate coarse-grained budgets.
They may exhibit bursty behavior during the first load and therefore must not
be treated as hostile based on burst pattern alone.

This traffic is still constrained by:

- dedicated rate limits;
- method allow-lists;
- body size limits where request bodies are expected;
- protocol and path validation;
- independent abuse telemetry.

The gateway must not merge these buckets or counters with `public_auth`.
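
Bucket isolation can be sketched by keying every counter on the route class as well as the client. The example uses a fixed-window counter as a simple stand-in, since this document does not name a rate-limiting algorithm; `ClassLimiter`, its methods, and the limit values are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// RouteClass names one of the public route classes.
type RouteClass string

const (
	PublicAuth       RouteClass = "public_auth"
	BrowserBootstrap RouteClass = "browser_bootstrap"
	BrowserAsset     RouteClass = "browser_asset"
	PublicMisc       RouteClass = "public_misc"
)

// bucketKey includes the class, so counters are never shared across classes.
type bucketKey struct {
	class  RouteClass
	client string
}

// ClassLimiter counts requests per (class, client) within the current window.
type ClassLimiter struct {
	mu     sync.Mutex
	limits map[RouteClass]int // max requests per window, per client
	counts map[bucketKey]int
}

func NewClassLimiter(limits map[RouteClass]int) *ClassLimiter {
	return &ClassLimiter{limits: limits, counts: make(map[bucketKey]int)}
}

// Allow records the request and reports whether it fits the class budget.
// A class without a configured limit has budget 0 and is denied (fail closed).
func (l *ClassLimiter) Allow(class RouteClass, client string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	k := bucketKey{class, client}
	if l.counts[k] >= l.limits[class] {
		return false
	}
	l.counts[k]++
	return true
}

// Reset starts a new window; a caller would invoke it from a periodic timer.
func (l *ClassLimiter) Reset() {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.counts = make(map[bucketKey]int)
}

func main() {
	l := NewClassLimiter(map[RouteClass]int{PublicAuth: 2, BrowserAsset: 100})
	for i := 0; i < 3; i++ {
		fmt.Println("public_auth:", l.Allow(PublicAuth, "203.0.113.7"))
	}
	fmt.Println("browser_asset:", l.Allow(BrowserAsset, "203.0.113.7"))
}
```

Exhausting the `public_auth` budget for a client leaves that client's `browser_asset` budget untouched, which is the isolation property required above.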

## Push Delivery Model

The v1 push channel is a gRPC server stream.
Long-polling is intentionally out of scope for the first version.

Expected stream behavior:

1. the client opens `SubscribeEvents`;
2. the gateway applies the full authenticated ingress verification pipeline;
3. the stream is bound to `user_id` and `device_session_id`;
4. the first service event includes `server_time_ms`;
5. client-facing events from internal pub/sub are fanned out to matching active
   streams;
6. revoke events close affected streams.
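
Steps 3, 5, and 6 can be sketched with an in-process hub that uses buffered channels in place of real gRPC streams. The names `Hub`, `Stream`, `Register`, `Publish`, and `CloseSession` are hypothetical, matching by `user_id` is one possible fan-out rule, and skipping a full buffer is an assumed backpressure policy.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a simplified client-facing event.
type Event struct {
	EventType string
	Payload   []byte
}

// Stream represents one active subscription bound to an identity.
type Stream struct {
	UserID          string
	DeviceSessionID string
	Events          chan Event
}

// Hub tracks active streams and delivers events to them.
type Hub struct {
	mu      sync.Mutex
	streams map[*Stream]struct{}
}

func NewHub() *Hub { return &Hub{streams: make(map[*Stream]struct{})} }

// Register binds a new stream to user_id and device_session_id.
func (h *Hub) Register(userID, sessionID string, buffer int) *Stream {
	s := &Stream{UserID: userID, DeviceSessionID: sessionID, Events: make(chan Event, buffer)}
	h.mu.Lock()
	defer h.mu.Unlock()
	h.streams[s] = struct{}{}
	return s
}

// Publish fans an event out to every active stream of the user and returns
// how many streams accepted it. Streams with a full buffer are skipped.
func (h *Hub) Publish(userID string, ev Event) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	delivered := 0
	for s := range h.streams {
		if s.UserID != userID {
			continue
		}
		select {
		case s.Events <- ev:
			delivered++
		default: // buffer full: skip rather than block the hub
		}
	}
	return delivered
}

// CloseSession closes and removes every stream of a revoked device session.
func (h *Hub) CloseSession(sessionID string) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	closed := 0
	for s := range h.streams {
		if s.DeviceSessionID == sessionID {
			close(s.Events)
			delete(h.streams, s)
			closed++
		}
	}
	return closed
}

func main() {
	h := NewHub()
	phone := h.Register("u-1", "ds-phone", 4)
	h.Register("u-1", "ds-tablet", 4)
	fmt.Println("delivered:", h.Publish("u-1", Event{EventType: "example.event"}))
	fmt.Println("closed:", h.CloseSession("ds-phone"))
	fmt.Println("buffered before close:", len(phone.Events))
}
```

Both `Publish` and `CloseSession` hold the same mutex, so the hub never sends on a channel it has already closed.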

## Recommended Package Layout

The initial package layout should keep transport, policy, and downstream
adapters separate:

- `cmd/gateway`
- `internal/app`
- `internal/config`
- `internal/restapi`
- `internal/grpcapi`
- `internal/authn`
- `internal/session`
- `internal/replay`
- `internal/ratelimit`
- `internal/downstream`
- `internal/push`
- `internal/events`
- `internal/clock`

## Key Interfaces

The gateway should be built around explicit consumer-side interfaces.

### SessionCache

Provides cached session lookup by `device_session_id`.
Returns enough data to verify signatures and identify the authenticated user.

### ReplayStore

Tracks recently seen `request_id` values per device session and rejects replayed
requests inside the accepted freshness window.
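
A minimal in-memory version of this idea is shown below. `MemoryReplayStore` and `CheckAndRecord` are hypothetical names, time is passed in as milliseconds to match `timestamp_ms`, and the linear purge on every call is a simplification that a real store would replace with expiring keys.

```go
package main

import (
	"fmt"
	"sync"
)

// replayKey scopes a request_id to its device session.
type replayKey struct{ sessionID, requestID string }

// MemoryReplayStore remembers (session, request) pairs for one freshness window.
type MemoryReplayStore struct {
	mu       sync.Mutex
	windowMs int64
	seen     map[replayKey]int64 // first-seen time in milliseconds
}

func NewMemoryReplayStore(windowMs int64) *MemoryReplayStore {
	return &MemoryReplayStore{windowMs: windowMs, seen: make(map[replayKey]int64)}
}

// CheckAndRecord returns true the first time a pair is seen inside the window
// and false for a replay. Entries older than the window are purged first;
// requests that old are expected to fail the freshness check anyway.
func (s *MemoryReplayStore) CheckAndRecord(sessionID, requestID string, nowMs int64) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	for k, at := range s.seen {
		if nowMs-at > s.windowMs {
			delete(s.seen, k)
		}
	}
	k := replayKey{sessionID, requestID}
	if _, dup := s.seen[k]; dup {
		return false
	}
	s.seen[k] = nowMs
	return true
}

func main() {
	store := NewMemoryReplayStore(30_000)
	fmt.Println(store.CheckAndRecord("ds-1", "req-1", 1_000))
	fmt.Println(store.CheckAndRecord("ds-1", "req-1", 2_000)) // same pair again
	fmt.Println(store.CheckAndRecord("ds-2", "req-1", 2_000)) // other session
}
```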

### RateLimiter

Applies independent policies for:

- public REST route classes;
- authenticated gRPC requests by IP;
- authenticated gRPC requests by session;
- authenticated gRPC requests by user;
- authenticated gRPC requests by message class.

### PublicTrafficClassifier

Maps incoming public REST requests to one of the public route classes so that
limits and anti-abuse counters remain isolated.
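
A path-based classifier might look like the sketch below. Only the `/api/v1/public/auth/` prefix comes from this document; the `/assets/` prefix, the bootstrap paths, and the choice to place health checks in `public_misc` are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// RouteClass names one of the public route classes.
type RouteClass string

const (
	PublicAuth       RouteClass = "public_auth"
	BrowserBootstrap RouteClass = "browser_bootstrap"
	BrowserAsset     RouteClass = "browser_asset"
	PublicMisc       RouteClass = "public_misc"
)

// Classify maps a request path to exactly one public route class.
func Classify(path string) RouteClass {
	switch {
	case strings.HasPrefix(path, "/api/v1/public/auth/"):
		return PublicAuth
	case strings.HasPrefix(path, "/assets/"): // hypothetical asset prefix
		return BrowserAsset
	case path == "/" || path == "/index.html": // hypothetical bootstrap paths
		return BrowserBootstrap
	default:
		return PublicMisc // includes /healthz and /readyz in this sketch
	}
}

func main() {
	for _, p := range []string{"/api/v1/public/auth/send-email-code", "/assets/app.js", "/", "/healthz"} {
		fmt.Println(p, "=>", Classify(p))
	}
}
```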

### AuthServiceClient

Handles public auth commands and session-related updates exchanged with the
Auth / Session Service.

### DownstreamRouter

Resolves the target downstream service or adapter by `message_type`.
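
In its simplest form, resolution is an exact-match table lookup. The sketch below assumes string service names as targets; `Router`, `Resolve`, `ErrNoRoute`, and the example message types are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoRoute is returned when no downstream target is registered.
var ErrNoRoute = errors.New("no downstream route")

// Router maps message_type values to downstream target names.
type Router struct{ routes map[string]string }

// NewRouter copies the table so later changes to the input map have no effect.
func NewRouter(routes map[string]string) *Router {
	table := make(map[string]string, len(routes))
	for k, v := range routes {
		table[k] = v
	}
	return &Router{routes: table}
}

// Resolve returns the target for messageType or an error wrapping ErrNoRoute.
func (r *Router) Resolve(messageType string) (string, error) {
	target, ok := r.routes[messageType]
	if !ok {
		return "", fmt.Errorf("%w for message_type %q", ErrNoRoute, messageType)
	}
	return target, nil
}

func main() {
	r := NewRouter(map[string]string{"example.command": "example-service"})
	fmt.Println(r.Resolve("example.command"))
	_, err := r.Resolve("unknown.command")
	fmt.Println(errors.Is(err, ErrNoRoute))
}
```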

### DownstreamClient

Executes a verified authenticated command against a downstream internal service
and returns response payload bytes plus a stable result code.

### EventSubscriber

Subscribes to internal pub/sub topics used for:

- session cache updates;
- revocations;
- client-facing event delivery.

### PushHub

Tracks active `SubscribeEvents` streams, binds them to authenticated identities,
and delivers events to the correct connections.

### ResponseSigner

Signs unary responses and stream events so clients can verify server-originated
messages.

### Clock

Provides current server time and supports consistent freshness-window checks.
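
Injecting the clock keeps freshness checks deterministic in tests. In the sketch below, `Clock`, `SystemClock`, `FixedClock`, and `IsFresh` are hypothetical names, and the symmetric window, which tolerates timestamps slightly in the future to absorb clock skew, is an assumption rather than a stated requirement.

```go
package main

import (
	"fmt"
	"time"
)

// Clock returns the current server time in milliseconds since the Unix epoch.
type Clock interface{ NowMs() int64 }

// SystemClock reads the wall clock.
type SystemClock struct{}

func (SystemClock) NowMs() int64 { return time.Now().UnixMilli() }

// FixedClock always returns the same instant; useful in tests.
type FixedClock int64

func (c FixedClock) NowMs() int64 { return int64(c) }

// IsFresh reports whether timestampMs lies within windowMs of the clock's
// current time, in either direction.
func IsFresh(c Clock, timestampMs, windowMs int64) bool {
	delta := c.NowMs() - timestampMs
	if delta < 0 {
		delta = -delta
	}
	return delta <= windowMs
}

func main() {
	var clock Clock = SystemClock{}
	fmt.Println(IsFresh(clock, clock.NowMs(), 30_000))
	fmt.Println(IsFresh(FixedClock(100_000), 60_000, 30_000))
}
```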

## Error Model and Observability

The gateway should expose stable edge-level error classes instead of leaking
internal implementation details.

Minimum error categories:

- malformed request;
- unsupported protocol;
- unknown session;
- revoked session;
- invalid signature;
- stale request;
- replay detected;
- rate limited;
- downstream unavailable;
- internal error.
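
One possible shape for these categories is a typed reason plus an error wrapper whose message exposes only the stable reason. The `RejectReason` string values and the `EdgeError` type are assumptions; the category list itself comes from above.

```go
package main

import (
	"errors"
	"fmt"
)

// RejectReason is a stable, client-visible error category.
type RejectReason string

const (
	MalformedRequest      RejectReason = "malformed_request"
	UnsupportedProtocol   RejectReason = "unsupported_protocol"
	UnknownSession        RejectReason = "unknown_session"
	RevokedSession        RejectReason = "revoked_session"
	InvalidSignature      RejectReason = "invalid_signature"
	StaleRequest          RejectReason = "stale_request"
	ReplayDetected        RejectReason = "replay_detected"
	RateLimited           RejectReason = "rate_limited"
	DownstreamUnavailable RejectReason = "downstream_unavailable"
	InternalError         RejectReason = "internal_error"
)

// EdgeError carries a stable reason for clients and keeps the cause for logs.
type EdgeError struct {
	Reason RejectReason
	Cause  error
}

// Error returns only the stable reason, so internal details do not leak.
func (e *EdgeError) Error() string { return string(e.Reason) }

// Unwrap exposes the cause to errors.Is and errors.As for internal handling.
func (e *EdgeError) Unwrap() error { return e.Cause }

// errDialFailed is an example internal failure that must stay out of responses.
var errDialFailed = errors.New("dial tcp 10.0.0.5:9000: connection refused")

func main() {
	var err error = &EdgeError{Reason: DownstreamUnavailable, Cause: errDialFailed}
	fmt.Println(err)
	fmt.Println(errors.Is(err, errDialFailed))
}
```

The same `Reason` value can serve as the reject-reason label in metrics and audit events.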

Observability requirements:

- stable correlation identifiers, including `request_id` and optional `trace_id`;
- structured logs;
- security audit events for rejects and abuse signals;
- metrics keyed by route class, message type, result code, and reject reason;
- no logging of secrets, raw private material, or raw signatures.

## Non-Goals

The gateway is not a business authorization layer and must not grow into a
domain coordinator.

The gateway must not:

- implement business ownership checks;
- validate domain state transitions;
- replace the Auth / Session Service as the session source of truth;
- degrade into a synchronous pass-through that reloads session state for every
  authenticated request.