gateway readme and plan
+489
-5
@@ -1,9 +1,493 @@
# Edge Gateway Implementation Plan

## Summary

This plan breaks implementation into small, reviewable phases.
Each phase has a single primary goal, clear deliverables, explicit dependencies,
acceptance criteria, and focused tests.

The intended v1 architecture is:

- unauthenticated public ingress over REST/JSON;
- authenticated ingress over gRPC on HTTP/2;
- FlatBuffers payloads for authenticated business commands;
- protobuf-based gRPC control envelopes;
- authenticated server-streaming push through gRPC;
- separate public traffic classes and isolated anti-abuse counters.

## Assumptions and Defaults

- `message_type` is the stable downstream routing key.
- `protocol_version` covers transport and envelope compatibility, not business
  payload schema compatibility.
- FlatBuffers are used for business payload bytes only.
- Browser bootstrap and asset traffic are within gateway scope, even when backed
  by a pluggable proxy or handler.
- Long-polling is out of scope for v1.

## Phase 1. Module Skeleton

Goal: create the runnable gateway process skeleton.

Artifacts:

- `cmd/gateway`
- `internal/app`
- base configuration types
- startup and shutdown wiring

Dependencies: none.

Acceptance criteria:

- the process starts with config;
- the process shuts down cleanly on signal;
- lifecycle wiring is testable.

Targeted tests:

- startup with valid config;
- shutdown without leaked goroutines.

## Phase 2. Public REST Server

Goal: add the unauthenticated HTTP server shell.

Artifacts:

- public REST listener
- `GET /healthz`
- `GET /readyz`
- base error serialization
- request classification hook

Dependencies: Phase 1.

Acceptance criteria:

- health endpoints respond deterministically;
- public requests are classified at least into `public_auth` and `browser_*`.

Targeted tests:

- health endpoint responses;
- request classification smoke tests.

## Phase 3. Public Auth REST Handlers

Goal: expose unauthenticated auth commands through REST/JSON.

Artifacts:

- `POST /api/v1/public/auth/send-email-code`
- `POST /api/v1/public/auth/confirm-email-code`
- request and response DTOs
- adapter calls into `AuthServiceClient`

Dependencies: Phase 2.

Acceptance criteria:

- no session authentication is required for these routes;
- handlers delegate only through the auth service adapter.

Targeted tests:

- success and validation errors for both routes;
- no session lookup on public auth paths.

## Phase 4. Public Traffic Classification

Goal: isolate public traffic into stable anti-abuse classes.

Artifacts:

- `PublicTrafficClassifier`
- classes `public_auth`, `browser_bootstrap`, `browser_asset`, `public_misc`
- isolated rate-limit bucket keys

Dependencies: Phase 2.

Acceptance criteria:

- browser traffic does not share buckets with public auth;
- auth counters remain unaffected by asset bursts.

Targeted tests:

- per-class routing tests;
- bucket isolation tests.

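A minimal sketch of how the classifier and its bucket keys could fit together. The four class names come from this plan; the `/assets/` prefix, the `/` and `/index.html` bootstrap paths, and the `BucketKey` format are illustrative assumptions, not part of the documented contract.

```go
package main

import (
	"fmt"
	"strings"
)

// RouteClass names one isolated anti-abuse class for public REST traffic.
type RouteClass string

const (
	ClassPublicAuth       RouteClass = "public_auth"
	ClassBrowserBootstrap RouteClass = "browser_bootstrap"
	ClassBrowserAsset     RouteClass = "browser_asset"
	ClassPublicMisc       RouteClass = "public_misc"
)

// Classify maps a request path to a route class.
// The asset prefix and bootstrap paths are assumptions for illustration.
func Classify(path string) RouteClass {
	switch {
	case strings.HasPrefix(path, "/api/v1/public/auth/"):
		return ClassPublicAuth
	case strings.HasPrefix(path, "/assets/"):
		return ClassBrowserAsset
	case path == "/" || path == "/index.html":
		return ClassBrowserBootstrap
	default:
		return ClassPublicMisc
	}
}

// BucketKey always embeds the class, so two classes can never share a
// rate-limit bucket even for the same client IP.
func BucketKey(class RouteClass, clientIP string) string {
	return string(class) + "|" + clientIP
}

func main() {
	paths := []string{"/api/v1/public/auth/send-email-code", "/assets/app.js", "/", "/healthz"}
	for _, p := range paths {
		c := Classify(p)
		fmt.Println(p, "->", c, BucketKey(c, "203.0.113.7"))
	}
}
```

Because the class is part of every key, an asset burst from one IP only touches the `browser_asset` bucket for that IP and leaves the `public_auth` bucket untouched.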
## Phase 5. Public REST Anti-Abuse

Goal: add coarse protection to unauthenticated REST traffic.

Artifacts:

- body size limits
- method allow-lists
- malformed request counters
- per-class rate-limit thresholds

Dependencies: Phase 4.

Acceptance criteria:

- first-load browser bursts are not marked hostile because of burst pattern
  alone;
- malformed or oversized requests are rejected predictably.

Targeted tests:

- bootstrap burst stays outside auth abuse counters;
- invalid methods and oversized bodies are rejected.

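The method allow-list and body size limit could be sketched as a small `net/http` middleware. The `guard` and `statusFor` names and the 1 KiB limit are hypothetical; the plan does not fix concrete thresholds.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxBodyBytes is an assumed limit chosen only for this example.
const maxBodyBytes = 1 << 10

// guard rejects disallowed methods and oversized bodies before the
// wrapped handler runs, so rejections are uniform across routes.
func guard(allowedMethod string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != allowedMethod {
			w.Header().Set("Allow", allowedMethod)
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		body, err := io.ReadAll(http.MaxBytesReader(w, r.Body, maxBodyBytes))
		if err != nil {
			status := http.StatusBadRequest
			var tooLarge *http.MaxBytesError
			if errors.As(err, &tooLarge) {
				status = http.StatusRequestEntityTooLarge
			}
			http.Error(w, http.StatusText(status), status)
			return
		}
		// Hand the already-read body to the next handler.
		r.Body = io.NopCloser(bytes.NewReader(body))
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one in-memory request through the guard and returns
// the resulting HTTP status code.
func statusFor(method string, bodySize int) int {
	h := guard(http.MethodPost, http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusNoContent)
	}))
	body := strings.NewReader(strings.Repeat("x", bodySize))
	req := httptest.NewRequest(method, "/api/v1/public/auth/send-email-code", body)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(http.MethodGet, 0), statusFor(http.MethodPost, 10), statusFor(http.MethodPost, 2048))
}
```

Reading the body inside the guard keeps the "oversized" decision in one place; a malformed request counter could be incremented on either rejection branch.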
## Phase 6. gRPC Server and Public Contracts

Goal: bring up the authenticated transport, gRPC over HTTP/2.

Artifacts:

- gRPC listener
- protobuf service definitions
- `ExecuteCommand`
- `SubscribeEvents`

Dependencies: Phase 1.

Acceptance criteria:

- unary and server-streaming RPCs are reachable;
- the server runs only over HTTP/2.

Targeted tests:

- unary transport smoke test;
- stream transport smoke test.

## Phase 7. Envelope Parsing and Protocol Gate

Goal: validate the gRPC control envelope before security checks continue.

Artifacts:

- envelope parser
- required-field validation
- protocol version gate

Dependencies: Phase 6.

Acceptance criteria:

- unsupported or malformed envelopes are rejected before routing.

Targeted tests:

- missing field rejection;
- unsupported `protocol_version` rejection.

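Required-field validation and the protocol gate might look like the sketch below. The Go struct mirrors the documented envelope field names, but the field types, the `"v1"` version string, and the error values are assumptions; the real types would come from the generated protobuf code.

```go
package main

import (
	"errors"
	"fmt"
)

// ExecuteCommandRequest is a hand-written stand-in for the generated
// protobuf message; field types are assumed.
type ExecuteCommandRequest struct {
	ProtocolVersion string
	DeviceSessionID string
	MessageType     string
	TimestampMS     int64
	RequestID       string
	PayloadBytes    []byte
	PayloadHash     []byte
	Signature       []byte
	TraceID         string // optional
}

var (
	ErrMalformedRequest    = errors.New("malformed request")
	ErrUnsupportedProtocol = errors.New("unsupported protocol")
)

// supportedVersions is a hypothetical allow-list of protocol versions.
var supportedVersions = map[string]bool{"v1": true}

// ValidateEnvelope checks required fields first, then the protocol
// version, before any session or signature work happens.
func ValidateEnvelope(req *ExecuteCommandRequest) error {
	if req == nil {
		return ErrMalformedRequest
	}
	missing := req.ProtocolVersion == "" || req.DeviceSessionID == "" ||
		req.MessageType == "" || req.TimestampMS <= 0 || req.RequestID == "" ||
		len(req.PayloadBytes) == 0 || len(req.PayloadHash) == 0 || len(req.Signature) == 0
	if missing {
		return ErrMalformedRequest
	}
	if !supportedVersions[req.ProtocolVersion] {
		return ErrUnsupportedProtocol
	}
	return nil
}

// sampleRequest returns a structurally valid request with dummy values.
func sampleRequest() *ExecuteCommandRequest {
	return &ExecuteCommandRequest{
		ProtocolVersion: "v1", DeviceSessionID: "sess-1", MessageType: "demo.ping",
		TimestampMS: 1700000000000, RequestID: "req-1",
		PayloadBytes: []byte{1}, PayloadHash: []byte{2}, Signature: []byte{3},
	}
}

func main() {
	req := sampleRequest()
	fmt.Println(ValidateEnvelope(req))
	req.ProtocolVersion = "v9"
	fmt.Println(ValidateEnvelope(req))
}
```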
## Phase 8. Session Cache Lookup

Goal: resolve authenticated identity from cache.

Artifacts:

- `SessionCache`
- session lookup pipeline
- revoked versus active session handling

Dependencies: Phase 7.

Acceptance criteria:

- unknown and revoked sessions are blocked before signature verification.

Targeted tests:

- cache hit with active session;
- cache miss reject;
- revoked session reject.

## Phase 9. Payload Hash and Signing Input

Goal: verify payload integrity before signature verification.

Artifacts:

- `payload_hash` verification
- canonical signing input builder

Dependencies: Phase 8.

Acceptance criteria:

- changing payload bytes or envelope fields breaks the signing input.

Targeted tests:

- payload hash mismatch reject;
- canonical bytes differ when signed fields change.

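One possible shape for the hash check and the canonical signing input builder. The plan does not name a hash function or a canonical encoding, so SHA-256, the field order, and the length-prefixed layout below are all assumptions made for illustration.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// Envelope holds the signed fields of an authenticated request.
type Envelope struct {
	ProtocolVersion, DeviceSessionID, MessageType, RequestID string
	TimestampMS                                              int64
	PayloadBytes, PayloadHash                                []byte
}

// HashMatches reports whether payload_hash equals SHA-256(payload_bytes).
// SHA-256 is an assumed choice.
func HashMatches(e Envelope) bool {
	sum := sha256.Sum256(e.PayloadBytes)
	return bytes.Equal(sum[:], e.PayloadHash)
}

// SigningInput serializes the signed fields in a fixed order. Each field
// is length-prefixed so adjacent fields cannot be shifted into each other.
func SigningInput(e Envelope) []byte {
	var buf bytes.Buffer
	put := func(b []byte) {
		var n [4]byte
		binary.BigEndian.PutUint32(n[:], uint32(len(b)))
		buf.Write(n[:])
		buf.Write(b)
	}
	var ts [8]byte
	binary.BigEndian.PutUint64(ts[:], uint64(e.TimestampMS))

	put([]byte(e.ProtocolVersion))
	put([]byte(e.DeviceSessionID))
	put([]byte(e.MessageType))
	put(ts[:])
	put([]byte(e.RequestID))
	put(e.PayloadHash)
	return buf.Bytes()
}

// NewEnvelope builds a sample envelope with a correct payload hash.
func NewEnvelope(messageType, requestID string, payload []byte) Envelope {
	sum := sha256.Sum256(payload)
	return Envelope{
		ProtocolVersion: "v1", DeviceSessionID: "sess-1",
		MessageType: messageType, RequestID: requestID,
		TimestampMS: 1700000000000, PayloadBytes: payload, PayloadHash: sum[:],
	}
}

func main() {
	e := NewEnvelope("demo.ping", "req-1", []byte("hello"))
	fmt.Println(HashMatches(e))
	fmt.Printf("%x\n", SigningInput(e))
}
```

Signing the hash rather than the raw payload keeps the signing input small while still binding the signature to the exact payload bytes, provided the hash check runs first.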
## Phase 10. Client Signature Verification

Goal: authenticate the request origin using the session public key.

Artifacts:

- signature verifier
- deterministic auth reject mapping

Dependencies: Phase 9.

Acceptance criteria:

- wrong key and invalid signature produce stable rejects.

Targeted tests:

- success case with valid signature;
- bad signature reject;
- wrong-key reject.

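A sketch of the verifier with a single stable reject value. The plan does not specify the signature scheme; Ed25519 from the Go standard library is used here purely as an assumption, and `demoKey` and `Sign` are hypothetical helpers that exist only to exercise the verifier.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"errors"
	"fmt"
)

// ErrInvalidSignature is the one reject value for every signature failure,
// so wrong-key and bad-signature cases look identical to the caller.
var ErrInvalidSignature = errors.New("invalid signature")

// VerifyClientSignature checks sig over signingInput with the session's
// public key. The length check avoids the panic ed25519.Verify raises
// for keys of the wrong size.
func VerifyClientSignature(pub ed25519.PublicKey, signingInput, sig []byte) error {
	if len(pub) != ed25519.PublicKeySize || !ed25519.Verify(pub, signingInput, sig) {
		return ErrInvalidSignature
	}
	return nil
}

// demoKey derives a deterministic key pair from a one-byte seed pattern.
// Deterministic keys are for examples and tests only.
func demoKey(seedByte byte) (ed25519.PublicKey, ed25519.PrivateKey) {
	priv := ed25519.NewKeyFromSeed(bytes.Repeat([]byte{seedByte}, ed25519.SeedSize))
	return priv.Public().(ed25519.PublicKey), priv
}

// Sign plays the client side in this example.
func Sign(priv ed25519.PrivateKey, msg []byte) []byte {
	return ed25519.Sign(priv, msg)
}

func main() {
	pub, priv := demoKey(1)
	otherPub, _ := demoKey(2)
	msg := []byte("canonical signing input")
	sig := Sign(priv, msg)

	fmt.Println(VerifyClientSignature(pub, msg, sig))
	fmt.Println(VerifyClientSignature(otherPub, msg, sig))
}
```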
## Phase 11. Freshness and Anti-Replay

Goal: enforce transport freshness and replay protection.

Artifacts:

- timestamp freshness window
- `ReplayStore`
- replay reservation and rejection logic

Dependencies: Phase 10.

Acceptance criteria:

- stale requests and duplicate `request_id` values are rejected.

Targeted tests:

- stale timestamp reject;
- replay reject for same session and request ID;
- distinct sessions do not collide.

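The freshness check and replay reservation can be sketched with an in-memory store. The 30-second window, the symmetric tolerance for clock skew, and the in-process map are assumptions; a real deployment would likely need a store shared by all gateway instances.

```go
package main

import (
	"fmt"
	"sync"
)

// freshnessWindowMS is an assumed value; the plan does not fix the window.
const freshnessWindowMS = 30_000

// Fresh reports whether timestampMS is within the window around nowMS,
// tolerating skew in both directions.
func Fresh(nowMS, timestampMS int64) bool {
	d := nowMS - timestampMS
	if d < 0 {
		d = -d
	}
	return d <= freshnessWindowMS
}

// ReplayStore remembers (session, request ID) pairs until they can no
// longer pass the freshness check.
type ReplayStore struct {
	mu   sync.Mutex
	seen map[string]int64 // key -> expiry time in ms
}

func NewReplayStore() *ReplayStore {
	return &ReplayStore{seen: make(map[string]int64)}
}

// Reserve returns true the first time a pair is seen and false on replay.
func (s *ReplayStore) Reserve(sessionID, requestID string, nowMS int64) bool {
	key := sessionID + "\x00" + requestID
	s.mu.Lock()
	defer s.mu.Unlock()
	for k, expiry := range s.seen { // linear eviction is fine for a sketch
		if expiry <= nowMS {
			delete(s.seen, k)
		}
	}
	if _, replayed := s.seen[key]; replayed {
		return false
	}
	// A timestamp may be up to one window in the future and stays fresh
	// for one more window after that, hence two windows of retention.
	s.seen[key] = nowMS + 2*freshnessWindowMS
	return true
}

func main() {
	store := NewReplayStore()
	fmt.Println(store.Reserve("sess-1", "req-1", 1000))
	fmt.Println(store.Reserve("sess-1", "req-1", 2000))
	fmt.Println(store.Reserve("sess-2", "req-1", 2000))
}
```

Keying on the session plus the request ID is what lets two sessions reuse the same `request_id` without colliding.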
## Phase 12. Authenticated Rate Limits and Policy

Goal: apply edge policy after transport authenticity is established.

Artifacts:

- rate-limit keys for IP, session, user, and message class
- authenticated policy evaluation hook

Dependencies: Phase 11.

Acceptance criteria:

- authenticated buckets are independent from public REST buckets.

Targeted tests:

- per-dimension throttling;
- bucket isolation from public traffic.

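Per-dimension keys and bucket isolation could be sketched as follows. The fixed-window counter, the `grpc|...` key format, and the limits are assumptions chosen for brevity; the plan does not prescribe a limiting algorithm.

```go
package main

import (
	"fmt"
	"sync"
)

// Limiter is a fixed-window counter keyed by an opaque bucket key.
type Limiter struct {
	mu       sync.Mutex
	limit    int
	windowMS int64
	start    map[string]int64
	count    map[string]int
}

func NewLimiter(limit int, windowMS int64) *Limiter {
	return &Limiter{
		limit: limit, windowMS: windowMS,
		start: make(map[string]int64), count: make(map[string]int),
	}
}

// Allow consumes one unit from the bucket for key, opening a new window
// when the previous one has elapsed.
func (l *Limiter) Allow(key string, nowMS int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if started, ok := l.start[key]; !ok || nowMS-started >= l.windowMS {
		l.start[key], l.count[key] = nowMS, 0
	}
	if l.count[key] >= l.limit {
		return false
	}
	l.count[key]++
	return true
}

// AuthenticatedKeys returns one bucket key per dimension. The "grpc|"
// prefix keeps these keys disjoint from any public REST bucket key.
func AuthenticatedKeys(ip, sessionID, userID, messageClass string) []string {
	return []string{
		"grpc|ip|" + ip,
		"grpc|session|" + sessionID,
		"grpc|user|" + userID,
		"grpc|class|" + messageClass,
	}
}

func main() {
	lim := NewLimiter(2, 1000)
	for _, key := range AuthenticatedKeys("203.0.113.7", "sess-1", "user-1", "chat") {
		fmt.Println(key, lim.Allow(key, 0))
	}
}
```

A request passes only if every dimension allows it. Note that this sketch consumes units one key at a time, so a rejection on a later key does not refund earlier ones.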
## Phase 13. Internal Authenticated Command and Routing

Goal: forward only verified context to downstream services.

Artifacts:

- `AuthenticatedCommand`
- `DownstreamRouter`
- `DownstreamClient`

Dependencies: Phase 12.

Acceptance criteria:

- downstream services receive verified context only;
- raw transport details do not leak as authoritative input.

Targeted tests:

- route selection by `message_type`;
- downstream receives the expected authenticated context.

## Phase 14. Signed Unary Responses

Goal: return verifiable server responses to authenticated clients.

Artifacts:

- response envelope builder
- payload hash generation
- `ResponseSigner`

Dependencies: Phase 13.

Acceptance criteria:

- unary responses always carry the original `request_id`, `payload_hash`, and
  server signature.

Targeted tests:

- response correlation test;
- server signature generation test.

## Phase 15. Session Update and Revocation Events

Goal: keep gateway session state current without synchronous hot-path lookups.

Artifacts:

- `EventSubscriber`
- session update handlers
- session revoke handlers

Dependencies: Phase 8.

Acceptance criteria:

- session updates change gateway behavior without per-request sync calls to the
  auth service.

Targeted tests:

- cache update from event;
- revocation event invalidates cached session.

## Phase 16. Authenticated Push Stream

Goal: open a verified server-streaming channel for client-facing delivery.

Artifacts:

- `SubscribeEvents` handler
- stream binding to `user_id` and `device_session_id`
- initial server time event

Dependencies: Phase 15.

Acceptance criteria:

- the stream opens only after the full auth pipeline succeeds.

Targeted tests:

- authorized stream open;
- rejected stream open for invalid session;
- first event contains server time.

## Phase 17. Event Fan-Out

Goal: deliver client-facing events from internal pub/sub to active streams.

Artifacts:

- `PushHub`
- event fan-out logic
- user and session targeting rules

Dependencies: Phase 16.

Acceptance criteria:

- events are delivered to the correct active streams only.

Targeted tests:

- single-session delivery;
- multi-device delivery for one user;
- unrelated sessions do not receive the event.

## Phase 18. Revocation-Driven Stream Teardown

Goal: terminate active delivery channels when a session is revoked.

Artifacts:

- stream teardown on revoke
- connection cleanup logic

Dependencies: Phase 17.

Acceptance criteria:

- revocation blocks new unary requests and closes active streams for the same
  session.

Targeted tests:

- revoke closes active stream;
- revoked session cannot reopen the stream.

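Fan-out (Phase 17) and revocation teardown (Phase 18) could share one hub, sketched here with channels standing in for gRPC streams. The `Event` shape, the "empty session ID targets every session of the user" rule, the buffer size, and the drop-on-full policy are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a client-facing event. An empty DeviceSessionID targets every
// active session of the user (assumed targeting rule).
type Event struct {
	UserID          string
	DeviceSessionID string
	Payload         []byte
}

// PushHub tracks active streams by user_id and device_session_id.
type PushHub struct {
	mu      sync.RWMutex
	streams map[string]map[string]chan Event
}

func NewPushHub() *PushHub {
	return &PushHub{streams: make(map[string]map[string]chan Event)}
}

// Subscribe registers a stream after the auth pipeline has succeeded.
func (h *PushHub) Subscribe(userID, sessionID string) <-chan Event {
	ch := make(chan Event, 16)
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.streams[userID] == nil {
		h.streams[userID] = make(map[string]chan Event)
	}
	h.streams[userID][sessionID] = ch
	return ch
}

// Publish delivers ev to matching streams and returns how many accepted it.
func (h *PushHub) Publish(ev Event) int {
	h.mu.RLock()
	defer h.mu.RUnlock()
	delivered := 0
	for sessionID, ch := range h.streams[ev.UserID] {
		if ev.DeviceSessionID != "" && ev.DeviceSessionID != sessionID {
			continue
		}
		select {
		case ch <- ev:
			delivered++
		default: // slow consumer: drop instead of blocking the hub
		}
	}
	return delivered
}

// CloseSession tears down the stream of a revoked session.
func (h *PushHub) CloseSession(userID, sessionID string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if ch, ok := h.streams[userID][sessionID]; ok {
		close(ch)
		delete(h.streams[userID], sessionID)
	}
}

func main() {
	hub := NewPushHub()
	hub.Subscribe("user-1", "sess-1")
	hub.Subscribe("user-1", "sess-2")
	fmt.Println(hub.Publish(Event{UserID: "user-1"}))
	hub.CloseSession("user-1", "sess-1")
	fmt.Println(hub.Publish(Event{UserID: "user-1"}))
}
```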
## Phase 19. Observability and Shutdown Hardening

Goal: make the service operable in production.

Artifacts:

- structured logs
- metrics
- trace propagation
- timeout budgets
- graceful shutdown for unary and streaming traffic

Dependencies: Phase 18.

Acceptance criteria:

- shutdown is deterministic;
- logs and metrics expose stable edge outcomes without leaking secrets.

Targeted tests:

- shutdown closes listeners and active streams;
- secret and signature values are not logged.

## Phase 20. Acceptance Pass

Goal: reconcile implementation, documentation, and regression coverage.

Artifacts:

- updated README and PLAN
- final protocol and interface review
- focused regression test run

Dependencies: Phases 1 through 19.

Acceptance criteria:

- implementation matches documented contracts and ordering guarantees;
- docs describe the actual gateway behavior.

Targeted tests:

- run focused package tests for gateway packages;
- rerun cross-cutting regression scenarios.

## Cross-Cutting Regression Scenarios

- `send_email_code` and `confirm_email_code` are available without session auth
  and are still limited by public auth policy.
- Public browser bootstrap and asset bursts do not increase auth abuse counters
  and are not rejected as hostile because of intensity alone.
- Any gRPC command without a valid session is rejected before routing.
- Unknown and revoked sessions are handled predictably and consistently where
  policy requires identical behavior.
- Signature verification fails when `payload_bytes`, `payload_hash`,
  `message_type`, `request_id`, or the signing key changes.
- `payload_hash` is verified before downstream execution.
- Requests outside the freshness window are rejected.
- Reused `request_id` values are rejected within the session replay window.
- Public REST and authenticated gRPC traffic use independent buckets and
  independent abuse telemetry.
- Downstream services receive `AuthenticatedCommand`, not raw REST or gRPC
  transport requests.
- Unary responses preserve `request_id` correlation and are server-signed.
- Streaming connections open only after the auth pipeline and close on revoke.
- Session cache updates from events change gateway behavior without synchronous
  auth-service lookups per request.
- Graceful shutdown terminates unary and streaming traffic cleanly.

@@ -1 +1,414 @@
# Edge Gateway

## Purpose

`Edge Gateway` is the only public ingress for Galaxy Plus clients.
It terminates the external transport and security boundary, enforces edge
policies, and routes verified requests to internal services.

The gateway does not implement domain-specific business logic.
Business validation, authorization, ownership checks, and state transitions
remain inside downstream services.

## Trust Boundary

The gateway sits between untrusted external clients and trusted internal
services.

The gateway is responsible for:

- parsing external transport requests;
- classifying public REST traffic;
- authenticating protected gRPC traffic;
- loading session state from cache;
- verifying request freshness and anti-replay constraints;
- applying edge rate limits and anti-abuse policy;
- building an authenticated internal command context;
- routing verified commands to internal services;
- maintaining authenticated push delivery connections.

The gateway is not responsible for:

- deciding whether a user is allowed to execute a business action;
- validating domain invariants;
- storing the source-of-truth session record;
- implementing business idempotency.

## Transport Matrix

The gateway exposes two external transport classes.

| Transport | Audience | Authentication | Payload format | Primary use |
| --- | --- | --- | --- | --- |
| REST/JSON | Public, unauthenticated traffic | No device session auth | JSON | Public auth commands, health checks, browser/bootstrap traffic |
| gRPC over HTTP/2 | Authenticated clients only | Required | FlatBuffers payload inside protobuf control envelope | Verified commands and push delivery |

### Public REST Surface

The public REST surface is used for commands that must work before a device
session exists and for browser-originated traffic that may share the same edge.

Stable public endpoints:

- `POST /api/v1/public/auth/send-email-code`
- `POST /api/v1/public/auth/confirm-email-code`
- `GET /healthz`
- `GET /readyz`

In addition to the fixed endpoints above, the gateway may front browser
bootstrap or asset traffic through a pluggable public handler or proxy.
That traffic belongs to dedicated public route classes and must not share rate
limit buckets or abuse counters with the public auth API.

### Authenticated gRPC Surface

All authenticated client requests use HTTP/2 and gRPC.

The public gRPC service exposes two methods:

- `ExecuteCommand(ExecuteCommandRequest) returns (ExecuteCommandResponse)`
- `SubscribeEvents(SubscribeEventsRequest) returns (stream GatewayEvent)`

`ExecuteCommand` is a generic unary RPC.
The gateway routes the request downstream by `message_type` after transport
verification succeeds.

`SubscribeEvents` is an authenticated server-streaming RPC.
It binds the stream to `user_id` and `device_session_id` and starts by sending
a service event that includes the current server time in milliseconds.

## Envelope and Payload Model

The authenticated transport uses a split contract:

- gRPC control messages are protobuf-based;
- business payload bytes are FlatBuffers;
- signatures are computed over canonical envelope fields and a hash of raw
  FlatBuffers bytes.

The gateway treats `payload_bytes` as opaque business data.
It verifies integrity and forwards verified bytes downstream without rewriting
them.

### ExecuteCommandRequest

Required fields:

- `protocol_version`
- `device_session_id`
- `message_type`
- `timestamp_ms`
- `request_id`
- `payload_bytes`
- `payload_hash`
- `signature`

Optional fields:

- `trace_id`

### ExecuteCommandResponse

Required fields:

- `protocol_version`
- `request_id`
- `timestamp_ms`
- `result_code`
- `payload_bytes`
- `payload_hash`
- `signature`

### SubscribeEventsRequest

The stream open request reuses the authenticated request model.
It contains the same authentication fields as the unary request and either an
empty payload or a minimal connect payload.

Required fields:

- `protocol_version`
- `device_session_id`
- `message_type`
- `timestamp_ms`
- `request_id`
- `payload_hash`
- `signature`

Optional fields:

- `payload_bytes`
- `trace_id`

### GatewayEvent

Every stream event is a client-facing signed server message.

Required fields:

- `event_type`
- `event_id`
- `timestamp_ms`
- `payload_bytes`
- `payload_hash`
- `signature`

Optional fields:

- `request_id`
- `trace_id`

## Verification and Routing Pipeline

The gateway applies the same strict verification order to every authenticated
gRPC request, unary or streaming.

1. Parse the control envelope and validate required fields.
2. Check whether `protocol_version` is supported.
3. Resolve `device_session_id` through `SessionCache`.
4. Reject unknown or revoked sessions.
5. Verify that `payload_hash` matches raw `payload_bytes`.
6. Verify the client signature using the public key from session cache.
7. Verify that `timestamp_ms` is inside the accepted freshness window.
8. Verify anti-replay by checking `device_session_id + request_id`.
9. Apply authenticated rate limit and edge policy checks.
10. Build the authenticated internal command context.
11. Route the command downstream by `message_type`.

No downstream business service should receive a request that has not passed
this full verification pipeline.

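The strict ordering can be made explicit by running the checks as an ordered list that stops at the first failure. This is a sketch of the control flow only: `Step`, `RunPipeline`, `demoSteps`, and the step names are hypothetical, and the real checks would replace the stub functions.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is one named verification stage.
type Step struct {
	Name  string
	Check func() error
}

// ErrRejected is a placeholder reject used by the stub checks.
var ErrRejected = errors.New("rejected")

// RunPipeline runs steps in order and stops at the first failure, so a
// later check never runs on a request that failed an earlier one.
// It returns the names of the steps that were actually executed.
func RunPipeline(steps []Step) ([]string, error) {
	var ran []string
	for _, s := range steps {
		ran = append(ran, s.Name)
		if err := s.Check(); err != nil {
			return ran, fmt.Errorf("%s: %w", s.Name, err)
		}
	}
	return ran, nil
}

// demoSteps builds stub checks in the documented order; the step named
// failAt fails, every other step passes.
func demoSteps(failAt string) []Step {
	names := []string{
		"envelope", "protocol_version", "session", "payload_hash",
		"signature", "freshness", "replay", "rate_limit",
	}
	steps := make([]Step, 0, len(names))
	for _, name := range names {
		name := name // capture per iteration for older Go versions
		steps = append(steps, Step{Name: name, Check: func() error {
			if name == failAt {
				return ErrRejected
			}
			return nil
		}})
	}
	return steps
}

func main() {
	ran, err := RunPipeline(demoSteps("session"))
	fmt.Println(ran, err)
}
```

With an unknown session the run stops at the third step, which is the behavior required by "unknown and revoked sessions are blocked before signature verification".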
## Internal Authenticated Contract

Downstream services should receive an internal authenticated command rather than
raw external gRPC transport data.

The minimum authenticated context is:

- `user_id`
- `device_session_id`
- `message_type`
- verified `payload_bytes`
- `request_id`
- optional `trace_id`
- optional client metadata needed for logs and tracing

Downstream services may trust that the gateway has already performed transport
authentication, freshness verification, and anti-replay checks.
They must still perform business authorization and domain validation.

## Session Model

The Auth / Session Service is the source of truth for device session state.
The gateway is designed to authenticate the hot path from cache.

Expected session fields available to the gateway:

- `device_session_id`
- `user_id`
- client public key
- session status
- revoke metadata
- optional client metadata

### Session Cache

`SessionCache` provides the fast path for:

- session existence checks;
- `device_session_id -> user_id`;
- access to the client public key used for signature verification;
- revoked versus active status checks.

Cache updates are event-driven.
TTL is allowed only as a safety net and must not replace invalidation events.

### Revocation Behavior

When a device session is revoked:

1. the Auth / Session Service updates the source of truth;
2. it publishes a session update or revoke event;
3. the gateway invalidates or updates `SessionCache`;
4. new unary gRPC requests for that session are rejected;
5. active `SubscribeEvents` streams for that session are closed.

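An event-driven cache along these lines could be sketched as below. The `Session` struct, the `Apply` method, and the choice to keep a revoked record rather than delete it are assumptions; keeping the record is one way to tell "revoked" apart from "unknown".

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Session is the cached subset of the source-of-truth record.
type Session struct {
	DeviceSessionID string
	UserID          string
	PublicKey       []byte
	Revoked         bool
}

var (
	ErrUnknownSession = errors.New("unknown session")
	ErrRevokedSession = errors.New("revoked session")
)

// SessionCache answers hot-path lookups without calling the auth service.
type SessionCache struct {
	mu       sync.RWMutex
	sessions map[string]Session
}

func NewSessionCache() *SessionCache {
	return &SessionCache{sessions: make(map[string]Session)}
}

// Apply handles a session update or revoke event by overwriting the entry.
func (c *SessionCache) Apply(s Session) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.sessions[s.DeviceSessionID] = s
}

// Lookup returns the active session or a stable reject error.
func (c *SessionCache) Lookup(deviceSessionID string) (Session, error) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	s, ok := c.sessions[deviceSessionID]
	switch {
	case !ok:
		return Session{}, ErrUnknownSession
	case s.Revoked:
		return Session{}, ErrRevokedSession
	}
	return s, nil
}

func main() {
	cache := NewSessionCache()
	cache.Apply(Session{DeviceSessionID: "sess-1", UserID: "user-1"})
	s, err := cache.Lookup("sess-1")
	fmt.Println(s.UserID, err)

	cache.Apply(Session{DeviceSessionID: "sess-1", UserID: "user-1", Revoked: true})
	_, err = cache.Lookup("sess-1")
	fmt.Println(err)
}
```

The same revoke event that updates the cache would also trigger the stream teardown described in step 5.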
## Public Anti-Abuse Model

The public REST layer must distinguish between public auth operations and
browser-originated traffic that may burst during a normal first page load.

The gateway uses these public route classes:

- `public_auth`
- `browser_bootstrap`
- `browser_asset`
- `public_misc`

### Public Auth

`public_auth` includes `send-email-code` and `confirm-email-code`.
This class uses stricter limits and abuse scoring because it directly touches
account and session creation flows.

Controls include:

- per-IP and per-identity rate limits;
- request body size limits;
- method allow-lists;
- malformed request counters;
- elevated logging and security telemetry for repeated failures.

### Browser Bootstrap and Asset Traffic

`browser_bootstrap` and `browser_asset` use separate coarse-grained budgets.
They may exhibit bursty behavior during the first load and therefore must not
be treated as hostile based on burst pattern alone.

This traffic is still constrained by:

- dedicated rate limits;
- method allow-lists;
- body size limits where request bodies are expected;
- protocol and path validation;
- independent abuse telemetry.

The gateway must not merge these buckets or counters with `public_auth`.

## Push Delivery Model

The v1 push channel is a gRPC server stream.
Long-polling is intentionally out of scope for the first version.

Expected stream behavior:

1. the client opens `SubscribeEvents`;
2. the gateway applies the full authenticated ingress verification pipeline;
3. the stream is bound to `user_id` and `device_session_id`;
4. the first service event includes `server_time_ms`;
5. client-facing events from internal pub/sub are fanned out to matching active
   streams;
6. revoke events close affected streams.

## Recommended Package Layout

The initial package layout should keep transport, policy, and downstream
adapters separate:

- `cmd/gateway`
- `internal/app`
- `internal/config`
- `internal/restapi`
- `internal/grpcapi`
- `internal/authn`
- `internal/session`
- `internal/replay`
- `internal/ratelimit`
- `internal/downstream`
- `internal/push`
- `internal/events`
- `internal/clock`

## Key Interfaces

The gateway should be built around explicit consumer-side interfaces.

### SessionCache

Provides cached session lookup by `device_session_id`.
Returns enough data to verify signatures and identify the authenticated user.

### ReplayStore

Tracks recently seen `request_id` values per device session and rejects replayed
requests inside the accepted freshness window.

### RateLimiter

Applies independent policies for:

- public REST route classes;
- authenticated gRPC requests by IP;
- authenticated gRPC requests by session;
- authenticated gRPC requests by user;
- authenticated gRPC requests by message class.

### PublicTrafficClassifier

Maps incoming public REST requests to one of the public route classes so that
limits and anti-abuse counters remain isolated.

### AuthServiceClient

Handles public auth commands and session-related updates exchanged with the
Auth / Session Service.

### DownstreamRouter

Resolves the target downstream service or adapter by `message_type`.

### DownstreamClient

Executes a verified authenticated command against a downstream internal service
and returns response payload bytes plus a stable result code.

### EventSubscriber

Subscribes to internal pub/sub topics used for:

- session cache updates;
- revocations;
- client-facing event delivery.

### PushHub

Tracks active `SubscribeEvents` streams, binds them to authenticated identities,
and delivers events to the correct connections.

### ResponseSigner

Signs unary responses and stream events so clients can verify server-originated
messages.

### Clock

Provides current server time and supports consistent freshness-window checks.

## Error Model and Observability

The gateway should expose stable edge-level error classes instead of leaking
internal implementation details.

Minimum error categories:

- malformed request;
- unsupported protocol;
- unknown session;
- revoked session;
- invalid signature;
- stale request;
- replay detected;
- rate limited;
- downstream unavailable;
- internal error.

Observability requirements:

- stable correlation identifiers, including `request_id` and optional `trace_id`;
- structured logs;
- security audit events for rejects and abuse signals;
- metrics keyed by route class, message type, result code, and reject reason;
- no logging of secrets, raw private material, or raw signatures.

## Non-Goals

The gateway is not a business authorization layer and must not grow into a
domain coordinator.

The gateway must not:

- implement business ownership checks;
- validate domain state transitions;
- replace the Auth / Session Service as the session source of truth;
- degrade into a synchronous pass-through that reloads session state for every
  authenticated request.