diff --git a/geoprofile/PLAN.md b/geoprofile/PLAN.md new file mode 100644 index 0000000..a6db87f --- /dev/null +++ b/geoprofile/PLAN.md @@ -0,0 +1,679 @@ +# Implementation Plan for Geo Profile Service + +## Planning Principles + +This plan is aligned with the agreed architecture and is written for an experienced developer implementing an internal microservice in a trusted environment. + +Execution priorities: + +- Keep the edge path non-blocking. +- Keep the service boundary narrow. +- Build append/update-only ingest first. +- Preserve clear ownership split with `User Service`. +- Defer threshold tuning until after the basic data model is working. +- Avoid unnecessary infrastructure on the first iteration. + +## Stage 01 — Freeze Service Vocabulary and Contracts + +Goal: + +- Remove naming ambiguity before any implementation begins. + +Tasks: + +- Choose the final service name used in repository, configuration, and docs. +- Freeze the country-related domain terms: + - `declared_country` + - `observed_country` + - `usual_connection_country` + - `country_review_recommended` +- Freeze cross-service ownership rules. +- Write a short internal ADR describing why the latest `declared_country` lives in `User Service` while version history lives in Geo Profile Service. +- Write a short internal ADR describing why the edge path is async FlatBuffers instead of request-response RPC. + +Exit criteria: + +- No domain term remains overloaded or unclear. +- No service boundary question remains unresolved. + +## Stage 02 — Define the Minimal Domain Model + +Goal: + +- Describe the persistent state before choosing transport or storage details. + +Tasks: + +- Define conceptual entities and their relationships. +- Freeze mandatory fields for: + - country observation + - per-session country ranking + - review flag state + - declared country version history + - session block request log +- Decide which timestamps are mandatory on each entity. 
+- Decide whether optional hashed IP storage exists at all in v1. +- Decide whether declared-country version records need explicit lifecycle state: + - `recorded` + - `applied` + - `sync_failed` + +Recommended minimal entities: + +- `country_observation` +- `device_session_country_score` +- `user_review_state` +- `declared_country_version` +- `session_block_action` + +Exit criteria: + +- The storage layer can be designed directly from the domain model. +- The model reflects all agreed semantics and no extra features. + +## Stage 03 — Design the Ingest Message Schema + +Goal: + +- Freeze the binary contract from `Edge Service` to Geo Profile Service. + +Tasks: + +- Create the FlatBuffers schema for the async ingest message. +- Limit message fields to: + - `user_id` + - `device_session_id` + - `ip_address` +- Define allowed field types and byte layout. +- Define message versioning strategy for future backward-compatible additions. +- Decide how schema version is represented. +- Define receiver behavior for malformed messages. + +Important constraints: + +- No protobuf wrapper. +- No business reply payload. +- No external validation of identifiers. +- Only schema-level validation on receipt. + +Exit criteria: + +- `Edge Service` and Geo Profile Service can generate compatible FlatBuffers code. +- Message evolution path exists without breaking v1. + +## Stage 04 — Choose and Implement the Async Ingest Transport + +Goal: + +- Implement the simplest possible binary ingress path that does not behave like normal RPC. + +Tasks: + +- Choose the concrete transport for internal binary publication. +- Recommended default: + - internal HTTP endpoint + - `application/octet-stream` + - FlatBuffers body + - empty response body + - status-only acknowledgement +- Implement the receiver endpoint in Geo Profile Service. +- Implement an async publisher client in `Edge Service`. +- Ensure the edge client publishes out-of-band from the main request execution path. 
+- Ensure the edge ignores publication failures for request progression. +- Add metrics for publish attempts, successes, and failures. + +Important note: + +- The edge path must remain operational even if Geo Profile Service is completely unavailable. + +Exit criteria: + +- The edge can publish authenticated observations without blocking the main API flow. +- Transport failures do not change edge business behavior. + +## Stage 05 — Build the Internal Durable Queue + +Goal: + +- Decouple acceptance of ingress messages from their processing. + +Tasks: + +- Select the simplest queue implementation inside the service. +- Prefer a durable queue over an in-memory-only queue. +- Implement enqueue-on-receive behavior. +- Implement worker dequeue behavior. +- Define queue item lifecycle: + - accepted + - processing + - processed + - failed +- Define retry strategy for worker failures. +- Define dead-letter or failure-handling strategy if retries are exhausted. +- Add queue metrics: + - depth + - oldest item age + - processing rate + - failure count + +Recommended starting point: + +- Database-backed queue table or similarly simple durable append structure. + +Exit criteria: + +- Geo Profile Service can accept messages quickly and process them later. +- Worker failures do not lose already accepted work silently. + +## Stage 06 — Add Local Geo-IP Resolution + +Goal: + +- Resolve country from IP locally and cheaply. + +Tasks: + +- Choose the Geo-IP database for v1. +- Add a loader for the local country database. +- Implement lookup adapter for IP to country. +- Define how unknown, invalid, or non-resolvable IPs are handled. +- Add a periodic database refresh job. +- Add health signals for Geo-IP database presence and age. + +Design constraints: + +- Country only. +- No external network lookup during request processing. +- No Geo-IP version persistence with each observation. + +Exit criteria: + +- Workers can resolve country from IP locally. 
+- Geo-IP database refresh is operationally manageable. + +## Stage 07 — Persist Observation Facts + +Goal: + +- Materialize `observed_country` as stored domain facts. + +Tasks: + +- Implement the observation persistence model. +- Store at minimum: + - `user_id` + - `device_session_id` + - `observed_country` + - observation time +- Decide whether observations are stored as full facts, time-bucketed facts, or a hybrid model. +- Keep storage bounded and suitable for later aggregation. +- Add read support needed for internal recalculation and admin inspection. + +Constraints: + +- Do not turn this into a raw per-request IP audit log. +- Prefer country-level facts over low-level network data. + +Exit criteria: + +- The service stores enough observed-country history to support ranking and review. + +## Stage 08 — Implement Per-Session Country Ranking + +Goal: + +- Maintain ranked countries per `device_session_id`. + +Tasks: + +- Define the initial scoring algorithm using recent activities with decay. +- Implement score update on each processed observation. +- Persist ranked country scores per `device_session_id`. +- Define how ties are handled. +- Define how stale scores decay or are compacted over time. +- Expose enough state for later admin inspection. + +Important constraints: + +- No active-day model in v1. +- No heavy analytics pipeline. +- Keep updates cheap enough for continuous background processing. + +Exit criteria: + +- Each `device_session_id` has a current ranked country list. +- Ranking is stable and cheap to update. + +## Stage 09 — Compute usual_connection_country + +Goal: + +- Derive a current per-session representative country from the ranking. + +Tasks: + +- Define the selection rule for the top country. +- Decide whether a minimum score or minimum margin is needed before setting a value. +- Persist the current `usual_connection_country` per `device_session_id`. +- Add recalculation hooks when session country scores change. 
+- Add tests for common drift scenarios: + - one stable country + - gradual shift over time + - alternating countries + - sparse activity + +Exit criteria: + +- `usual_connection_country` can be read directly without recomputing the full score set every time. + +## Stage 10 — Implement Review Recommendation State + +Goal: + +- Persist and expose `country_review_recommended`. + +Tasks: + +- Define the initial rule that sets the review flag. +- Persist review state at user level. +- Detect transitions from `false` to `true`. +- Ensure repeated writes do not keep re-emitting the same transition indefinitely. +- Add API access for reading the flag. +- Add background recalculation entry points if the rule changes later. + +Design requirement: + +- Review state must live in storage and be queryable even if event delivery fails. + +Exit criteria: + +- The flag is durable, queryable, and transition-aware. + +## Stage 11 — Publish Review Events and Optional Email + +Goal: + +- Add auxiliary notifications for review-worthy users. + +Tasks: + +- Define the event payload for `country_review_recommended=true`. +- Implement event publication on transition to `true`. +- Implement configuration-driven email notification through `Mail Service`. +- Add notification deduplication or transition-only logic to prevent spam. +- Add failure metrics for both event publication and mail send. + +Important constraints: + +- The event bus is not the authoritative source of truth. +- Email is optional and non-blocking for business correctness. + +Exit criteria: + +- Review transitions can notify administrators without becoming a dependency for state correctness. + +## Stage 12 — Implement Suspicious Multi-Country Session Detection + +Goal: + +- Detect suspicious short-window cross-country behavior across sessions of the same user. + +Tasks: + +- Define the initial heuristic for suspicious mixed-country windows. +- Decide which session becomes the target of blocking when a conflict appears. 
+- Implement detection logic using stored observations and/or per-session summaries. +- Add persistence for suspicion evidence or at least action logs. +- Keep the heuristic configurable, not hard-coded deep in the codebase. + +Important constraints: + +- The current triggering request is allowed to continue. +- Only suspicious `device_session_id` values are blocked. +- The entire user account is never blocked by this service. + +Exit criteria: + +- The service can identify suspicious session patterns and produce a block action request. + +## Stage 13 — Integrate Session Blocking with Auth / Session Service + +Goal: + +- Make suspicious session handling operational. + +Tasks: + +- Define the internal API contract for session blocking. +- Implement the client toward `Auth / Session Service`. +- Ensure block requests are idempotent. +- Record block requests and outcomes locally for inspection. +- Add retry or failure-handling policy for temporary downstream failures. +- Add metrics for block attempts, successes, and failures. + +Exit criteria: + +- Geo Profile Service can request blocking of suspicious sessions and track the result. + +## Stage 14 — Implement Declared Country Version History + +Goal: + +- Add versioned history of `declared_country` inside Geo Profile Service. + +Tasks: + +- Define the version record schema. +- Persist all approved changes as immutable version records. +- Add actor metadata needed for internal audit: + - who triggered the change + - when it happened + - optional reason or comment +- Implement version lifecycle state if adopted: + - `recorded` + - `applied` + - `sync_failed` +- Add read support for history in admin APIs. + +Important constraint: + +- Version history is owned only by Geo Profile Service. + +Exit criteria: + +- The service can preserve the full change history independently from `User Service`. 
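The Stage 14 tasks above can be sketched roughly as follows. This is a minimal illustration in Python, assuming an in-memory list stands in for real storage; the names `DeclaredCountryVersion` and `VersionHistory`, the per-user version counter, and the two-letter country code format are assumptions made for the example, not decisions taken by the plan.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)  # frozen: a stored version record is never mutated
class DeclaredCountryVersion:
    user_id: str
    version: int             # hypothetical 1-based counter per user
    declared_country: str    # assumed to be a two-letter country code
    actor: str               # who triggered the change
    changed_at: datetime     # when it happened
    reason: Optional[str] = None  # optional reason or comment


class VersionHistory:
    """In-memory stand-in for the version table (illustration only)."""

    def __init__(self) -> None:
        self._records: List[DeclaredCountryVersion] = []

    def for_user(self, user_id: str) -> List[DeclaredCountryVersion]:
        # Records are only ever appended, so list order is version order.
        return [r for r in self._records if r.user_id == user_id]

    def append(self, user_id: str, declared_country: str, actor: str,
               reason: Optional[str] = None,
               now: Optional[datetime] = None) -> DeclaredCountryVersion:
        record = DeclaredCountryVersion(
            user_id=user_id,
            version=len(self.for_user(user_id)) + 1,
            declared_country=declared_country,
            actor=actor,
            changed_at=now or datetime.now(timezone.utc),
            reason=reason,
        )
        self._records.append(record)
        return record
```

A caller would append one record per approved change and read the chain back with `for_user`; attempting to assign to a field of a stored record raises an error because the dataclass is frozen.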
+ +## Stage 15 — Implement Current Country Sync to User Service + +Goal: + +- Keep the latest effective `declared_country` centralized in `User Service`. + +Tasks: + +- Define the internal REST contract to update current `declared_country` in `User Service`. +- Implement synchronous update from Geo Profile Service. +- Ensure that a history version does not become effective until the sync succeeds. +- Implement failure handling and status persistence when sync fails. +- Add retry tooling or operator visibility for failed syncs. + +Design requirement: + +- No other service should bypass this write path. + +Exit criteria: + +- Approved changes update both version history and current user state without silent divergence. + +## Stage 16 — Build the Internal Read APIs + +Goal: + +- Expose the minimum trusted JSON REST API required for operations and admin tooling. + +Tasks: + +- Implement review-candidate listing endpoint. +- Support at least: + - `country_review_recommended=true` + - pagination + - stable ordering +- Implement user geo-profile endpoint. +- Group returned data by `device_session_id`. +- Include: + - review flag + - per-session ranked countries + - `usual_connection_country` + - observation summaries + - declared country history + - block-action history if useful +- Add authentication and authorization appropriate for trusted internal callers. + +Exit criteria: + +- Admin tools can list users for review and inspect full geo-related user state. + +## Stage 17 — Build the Internal Command API for Country Change Application + +Goal: + +- Expose the internal command path for approved `declared_country` changes. + +Tasks: + +- Implement the trusted internal command endpoint. +- Accept the approved new country and actor metadata. +- Write the new version record. +- Synchronize current value into `User Service`. +- Return success only if the change is fully applied. +- Return a recoverable failure state if sync fails. 
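The command flow in the Stage 17 tasks (write the version, synchronize, report success only when fully applied) might look like the sketch below. It is an illustration under stated assumptions: `apply_country_change`, the `VersionState` enum, the dict-based record, and the `sync_to_user_service` callable are hypothetical names, and the lifecycle state is kept as a mutable field purely for brevity.

```python
from enum import Enum
from typing import Callable, Dict, List


class VersionState(Enum):
    RECORDED = "recorded"
    APPLIED = "applied"
    SYNC_FAILED = "sync_failed"


def apply_country_change(
    versions: List[Dict],
    sync_to_user_service: Callable[[str, str], None],  # hypothetical client call
    user_id: str,
    new_declared_country: str,
    actor: str,
) -> Dict:
    """Record the version first, then sync; only a successful sync makes it effective."""
    record = {
        "user_id": user_id,
        "declared_country": new_declared_country,
        "actor": actor,
        "state": VersionState.RECORDED,
    }
    versions.append(record)  # step 1: write the new version to own storage
    try:
        # Step 2: synchronous update of the current value in User Service.
        sync_to_user_service(user_id, new_declared_country)
    except Exception:
        # Recoverable failure: the version stays recorded but not effective.
        record["state"] = VersionState.SYNC_FAILED
        return record
    record["state"] = VersionState.APPLIED
    return record
```

The caller would treat only `VersionState.APPLIED` as success and surface `VersionState.SYNC_FAILED` to retry tooling or operators.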
+ +Clarification: + +- Public user-facing request creation is outside this service boundary unless explicitly added later. +- This command API is for internal orchestration of approved changes. + +Exit criteria: + +- Admin or internal orchestration can apply a country change through one controlled path. + +## Stage 18 — Add Admin-Oriented Data Shaping + +Goal: + +- Make the returned data useful for manual decisions without overloading the API consumer. + +Tasks: + +- Shape user geo-profile responses around manual review needs. +- Include compact ranked-country views per session. +- Include enough timestamps to understand temporal drift. +- Include current review recommendation state. +- Include declared-country version chain in a readable order. +- Avoid leaking unnecessary low-level network data. + +Exit criteria: + +- The admin interface can render useful country history and session separation without extra joins. + +## Stage 19 — Add Observability and Operational Controls + +Goal: + +- Make the service operable in production before traffic ramps up. + +Tasks: + +- Add metrics for every critical path: + - ingest publish receipt + - queue depth and lag + - worker throughput + - Geo-IP lookup failures + - ranking updates + - review-flag transitions + - block requests + - user-service sync failures + - mail and event failures +- Add structured logs with correlation identifiers where possible. +- Add readiness and liveness endpoints. +- Add dashboards and alerts for: + - queue lag + - persistent sync failures + - spike in suspicious session blocks + - Geo-IP database stale age + +Exit criteria: + +- Production operation does not depend on manual log-grepping. + +## Stage 20 — Add Test Coverage in Increasing Layers + +Goal: + +- Validate the service incrementally, from pure logic up to full integration. 
+ +Tasks: + +- Add unit tests for: + - Geo-IP lookup adapter + - ranking logic + - `usual_connection_country` selection + - review recommendation logic + - suspicious session detection +- Add storage tests for: + - observation persistence + - version history + - queue behavior +- Add integration tests for: + - edge-style ingest acceptance + - worker processing + - `User Service` sync behavior + - `Auth / Session Service` block calls + - event and mail side effects +- Add failure-path tests: + - malformed FlatBuffers payload + - queue retry + - Geo-IP lookup miss + - `User Service` sync failure + - block-request downstream failure + +Exit criteria: + +- The highest-risk logic and all external integrations are covered. + +## Stage 21 — Add Data Migration and Backfill Strategy + +Goal: + +- Prepare for safe rollout in an existing microservice environment. + +Tasks: + +- Create initial database migrations. +- Define zero-data bootstrap behavior for new users and sessions. +- Define how existing users with already populated `declared_country` in `User Service` appear in Geo Profile Service before any version history exists. +- Decide whether an initial synthetic version record is needed for current production users. +- Add operational scripts for repair and backfill if required. + +Exit criteria: + +- The service can be introduced without corrupting current user country state. + +## Stage 22 — Roll Out in Shadow Mode + +Goal: + +- Validate the service behavior before relying on its outputs operationally. + +Tasks: + +- Deploy Geo Profile Service without enabling admin actions or session blocking. +- Publish ingest data from edge asynchronously. +- Process observations and compute derived state silently. +- Observe queue behavior, lookup correctness, score stability, and storage growth. +- Compare resulting data shape against expected real traffic behavior. 
+- Tune thresholds for: + - review recommendation + - suspicious mixed-country detection + - score decay + +Exit criteria: + +- The service behaves sanely on production-shaped traffic without affecting users. + +## Stage 23 — Enable Review Workflow + +Goal: + +- Turn on the first real consumer-facing internal functionality. + +Tasks: + +- Enable review-candidate listing in the admin interface. +- Enable user geo-profile rendering. +- Enable approved country-change application path. +- Keep session blocking disabled if needed for a staged rollout. +- Verify that `User Service` stays consistent with declared-country version history. + +Exit criteria: + +- Administrators can inspect users and apply country changes safely. + +## Stage 24 — Enable Suspicious Session Blocking + +Goal: + +- Turn on the account-protection part of the service. + +Tasks: + +- Enable session-block command emission to `Auth / Session Service`. +- Start with conservative thresholds. +- Monitor false positives closely. +- Add temporary operational kill-switches for the detection path. +- Verify that only suspicious sessions are blocked and not entire accounts. + +Exit criteria: + +- The service can protect accounts without destabilizing the rest of the platform. + +## Stage 25 — Stabilize and Simplify + +Goal: + +- Remove accidental complexity after the first complete iteration. + +Tasks: + +- Review actual queue backlog behavior. +- Review observation retention cost. +- Review whether optional hashed IP storage is still unnecessary. +- Review scoring tunability versus implementation complexity. +- Remove dead code and speculative abstractions. +- Freeze the v1 API once real consumers are stable. + +Exit criteria: + +- The service remains small, understandable, and aligned with its original narrow purpose. 
+ +## Delivery Sequence Summary + +Recommended delivery order: + +- Domain vocabulary and ownership +- Domain model +- FlatBuffers schema +- Async ingest transport +- Internal durable queue +- Geo-IP lookup +- Observation persistence +- Session ranking +- `usual_connection_country` +- Review state +- Event and mail notifications +- Suspicious-session detection +- Session blocking integration +- Declared-country versioning +- Sync to `User Service` +- Admin read API +- Country-change command API +- Observability +- Tests +- Shadow rollout +- Review enablement +- Blocking enablement +- Cleanup + +## Final Acceptance Criteria + +The implementation may be considered complete for v1 when all of the following are true: + +- `Edge Service` publishes authenticated country observations asynchronously without affecting request processing. +- Geo Profile Service resolves and stores `observed_country`. +- The service maintains per-`device_session_id` country ranking and `usual_connection_country`. +- `country_review_recommended` is durable, queryable, and not event-dependent. +- Admin tooling can fetch review candidates and per-user geo profiles. +- Approved `declared_country` changes are versioned in Geo Profile Service and synchronized into `User Service`. +- Suspicious sessions can be blocked through `Auth / Session Service`. +- Optional email and event notifications work without becoming correctness dependencies. +- The service is observable and operable under real traffic. diff --git a/geoprofile/README.md b/geoprofile/README.md new file mode 100644 index 0000000..a591ac4 --- /dev/null +++ b/geoprofile/README.md @@ -0,0 +1,929 @@ +# Geo Profile Service + +## Context and Purpose + +Geo Profile Service is an internal trusted microservice responsible for collecting and processing country-level connection signals for authenticated users. 
+ +The service exists to solve four related problems: + +- Record the observed country of authenticated requests based on local Geo-IP lookup. +- Maintain per-`device_session_id` country statistics and derive a `usual_connection_country`. +- Support administrative review workflows around user country changes. +- Detect suspicious multi-country session behavior and request blocking of suspicious sessions through `Auth / Session Service`. + +The service is intentionally narrow in scope. It does not own authentication, user identity validation, or user-facing profile reads for the latest country value. + +## Explicit Non-Goals + +The following are intentionally out of scope for this service: + +- Region-level or city-level geolocation. +- VPN, proxy, anonymizer, or hosting-provider detection. +- Automatic change of `declared_country` based on observed metrics. +- Immediate blocking of the same request that triggered suspicion. +- Global source-of-truth ownership for the current user country. +- Direct exposure of storage to other services. +- Strong audit reproducibility of historical Geo-IP lookup results by storing Geo-IP database versions. + +## Place in the Existing Microservice System + +The service is embedded into an already existing trusted microservice environment and integrates with: + +- `Edge Service` +- `Auth / Session Service` +- `User Service` +- `Mail Service` +- Internal event bus + +`Edge Service` is the producer of authenticated connection observations. + +`User Service` remains the centralized owner of the latest effective `declared_country` value for normal user profile reads. + +`Auth / Session Service` remains the owner of session lifecycle and session blocking. + +`Mail Service` is used only for optional administrative notifications. + +The event bus is used only as an auxiliary notification channel and not as the authoritative source of business state. 
+ +## Responsibility Boundaries + +Geo Profile Service owns: + +- Geo-IP lookup at country level using a local database. +- Storage of `observed_country` as a fact of observation. +- Per-`device_session_id` country aggregation. +- Computation of `usual_connection_country`. +- Computation and storage of `country_review_recommended`. +- Version history of `declared_country`. +- Internal administrative read APIs for geo-related user state. +- Internal command API to apply approved `declared_country` changes. +- Detection of suspicious cross-country session patterns. +- Session block requests toward `Auth / Session Service`. + +Geo Profile Service does not own: + +- Validation of `user_id` and `device_session_id` against external services. +- Public user profile reads for the latest country value. +- Authentication or authorization of end users. +- Final enforcement of session blocking. +- Delivery guarantees of auxiliary event notifications. +- Formal administrative SLA or rigid approval policies. + +## Semantic Model + +The service works with four core country-related concepts. + +### declared_country + +`declared_country` is the user-declared country. + +Properties: + +- It is a user-facing business attribute. +- The latest effective value is stored in `User Service`. +- The full version history is stored in Geo Profile Service. +- It is never changed automatically by metrics. +- It changes only through a controlled command path and administrative approval. + +### observed_country + +`observed_country` is the country derived from Geo-IP for a specific authenticated request. + +Properties: + +- It is an observation fact, not a truth claim about residence. +- It is tied to `user_id`, `device_session_id`, and observation time. +- It is derived on the server side from the source IP seen at the trusted edge. +- It is used as input into country aggregation and anomaly detection. 
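As an illustration of the `observed_country` properties above, the sketch below turns one ingest message into an observation fact. It assumes Python, a caller-supplied `lookup_country` function standing in for the local Geo-IP database, and a record that deliberately omits the IP address; `CountryObservation` and `observe` are hypothetical names, and dropping unresolvable IPs is only one possible handling.

```python
import ipaddress
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass(frozen=True)
class CountryObservation:
    user_id: str
    device_session_id: str
    observed_country: str   # country only; no region, city, or raw IP
    observed_at: datetime


def observe(
    user_id: str,
    device_session_id: str,
    ip_address: str,
    lookup_country: Callable[[str], Optional[str]],  # stand-in for local Geo-IP
    now: Optional[datetime] = None,
) -> Optional[CountryObservation]:
    """Return an observation fact, or None if the IP is invalid or unresolvable."""
    try:
        ip = ipaddress.ip_address(ip_address)  # shape check only
    except ValueError:
        return None
    country = lookup_country(str(ip))
    if country is None:
        return None
    return CountryObservation(
        user_id=user_id,
        device_session_id=device_session_id,
        observed_country=country,
        observed_at=now or datetime.now(timezone.utc),
    )
```

Neither `user_id` nor `device_session_id` is validated here, which matches the rule that those checks belong to the trusted authentication and session layer.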
+ +### usual_connection_country + +`usual_connection_country` is the computed most typical country of network egress for a given `device_session_id`. + +Properties: + +- It is not interpreted as country of residence. +- It is calculated per `device_session_id`, not globally per account. +- It is derived from recent weighted observations with decay over time. +- It is expected to drift naturally as usage patterns change. + +### country_review_recommended + +`country_review_recommended` is an internal service flag that indicates that the accumulated observations justify administrative review. + +Properties: + +- It does not trigger automatic country change. +- It is stored durably in the service state. +- It is readable through the service API. +- Transition to `true` may also emit an event and optionally send email. + +## Data Ownership Rules + +The split ownership model is intentional. + +- `User Service` owns the latest effective `declared_country`. +- Geo Profile Service owns the history of `declared_country` changes. +- Geo Profile Service owns `observed_country`, `usual_connection_country`, and `country_review_recommended`. + +This means Geo Profile Service is the owner of the country-change process, but `User Service` is the owner of the currently effective denormalized value used by the rest of the system. + +To avoid divergence: + +- No service other than Geo Profile Service should directly mutate the current `declared_country` in `User Service`. +- Geo Profile Service must write the new version in its own storage first. +- Geo Profile Service must then synchronously update the current value in `User Service`. +- A version should become effective only after the `User Service` update succeeds. + +## High-Level Architecture + +```mermaid +flowchart LR + Client[Client] --> Edge[Edge Service] + Edge --> Auth[Auth / Session Service] + Auth --> Edge + + Edge -. 
async flatbuffers ingest .-> Geo[Geo Profile Service] + + Geo --> User[User Service] + Geo --> Mail[Mail Service] + Geo --> Bus[Event Bus] + Geo --> Auth + + AdminUI[Admin Interface] --> Edge + Edge --> Geo + Edge --> User +``` + +## Ingress Processing Model + +The hot path from `Edge Service` to Geo Profile Service is intentionally asynchronous and non-blocking for the edge. + +Design rules: + +- `Edge Service` publishes a minimal FlatBuffers message after user authentication. +- The message contains only: + + - `user_id` + - `device_session_id` + - `ip_address` +- No protobuf wrapper is used. +- No business response is required from Geo Profile Service. +- The edge does not depend on this service for normal request continuation. +- Failures are treated as observability signals, not as reasons to change gateway behavior. + +This design explicitly prioritizes low infrastructure complexity and low overhead on the hottest path over strict RPC semantics. + +## Ingress Transport Contract + +The ingress path is not modeled as conventional request-response RPC. + +Recommended transport shape: + +- Internal binary HTTP endpoint or similarly simple internal binary transport. +- `application/octet-stream` body encoded as FlatBuffers. +- Minimal acknowledgement such as `202 Accepted` with empty body. +- The acknowledgement is not part of business logic. +- The edge client should publish asynchronously and ignore service availability for request progression. + +The service must only validate: + +- FlatBuffers message integrity. +- Presence of required scalar fields. +- Basic field shape constraints. + +The service must not validate: + +- Whether `user_id` exists. +- Whether `device_session_id` belongs to the user. +- Whether the session is still valid. + +Those concerns belong to the already trusted authentication/session layer. + +## Internal Queue and Worker Pipeline + +Geo Profile Service must process ingress data in its own queue and worker flow.
+ +```mermaid +flowchart LR + E[Edge Service] -. async flatbuffers publish .-> I[Ingest Receiver] + I --> Q[Internal Ingest Queue] + Q --> W[Processing Worker] + W --> G[Geo-IP Resolver] + G --> A[Observation Aggregator] + A --> U[usual_connection_country Calculator] + A --> R[country_review_recommended Evaluator] + A --> S[Session Suspicion Detector] + S --> B[Block Session Command] +``` + +The internal queue exists to decouple network acceptance from CPU and storage work. + +Required properties: + +- The network-facing ingest step is append/update-only. +- The worker can process observations independently from the ingest receiver. +- Expensive logic must not run inline on the network acceptance step. +- Queue backlog and processing latency must be observable. + +A simple durable internal queue is preferred over a complex broker dependency for this part of the system. + +## Service Interface Model + +The service interface is intentionally divided into commands, queries, and events. + +This split exists to preserve the architectural rules already fixed above: + +- Hot-path ingest is asynchronous and write-oriented. +- Administrative reads use trusted internal JSON REST APIs. +- State-changing administrative operations follow one controlled command path. +- Events are auxiliary notifications and never the only representation of business state. + +## Commands + +Commands change service state or trigger downstream effects. + +### Ingest Connection Observation + +Purpose: + +- Accept an authenticated country observation from `Edge Service`. 
+ +Caller: + +- `Edge Service` + +Transport: + +- Internal binary transport +- FlatBuffers payload +- Async publication +- No business response + +Payload: + +- `user_id` +- `device_session_id` +- `ip_address` + +Effects: + +- Enqueue observation for processing +- Eventually resolve `observed_country` +- Update per-session country statistics +- Potentially update `usual_connection_country` +- Potentially set `country_review_recommended` +- Potentially request session blocking through `Auth / Session Service` + +Important behavior: + +- This command must not block edge request processing. +- Failure to send or process is an observability concern, not a gateway correctness concern. + +### Apply Approved Declared Country Change + +Purpose: + +- Record a new approved version of `declared_country` and synchronize the current value into `User Service`. + +Caller: + +- Trusted internal administrative workflow +- Administrative interface backend +- Internal orchestration component + +Transport: + +- Trusted internal JSON REST API + +Input shape: + +- `user_id` +- `new_declared_country` +- actor identity or actor type +- optional reason or comment +- optional correlation metadata + +Effects: + +- Create immutable declared-country version record in Geo Profile Service +- Synchronize latest effective value to `User Service` +- Mark version as effective only after sync succeeds + +Important behavior: + +- Geo Profile Service is the owner of this mutation workflow. +- No bypass write path to `User Service` should exist for this field. + +### Request Suspicious Session Block + +Purpose: + +- Ask `Auth / Session Service` to block suspicious `device_session_id` values. 
+ +Caller: + +- Internal processing worker inside Geo Profile Service + +Transport: + +- Trusted internal API call from Geo Profile Service to `Auth / Session Service` + +Input shape: + +- `user_id` +- one or more suspicious `device_session_id` +- reason or code for block trigger +- optional evidence reference + +Effects: + +- Session block request is sent to `Auth / Session Service` +- Local action log is written in Geo Profile Service + +Important behavior: + +- Current triggering request is not interrupted. +- The effect is expected on subsequent requests. + +## Queries + +Queries return internal state and never mutate business state. + +### List Review Candidates + +Purpose: + +- Return `user_id` values matching review-related filters. + +Caller: + +- Administrative interface +- Internal operational tooling + +Transport: + +- Trusted internal JSON REST API + +Initial supported filter: + +- `country_review_recommended=true` + +Expected response characteristics: + +- Pagination +- Stable ordering +- Ability to extend filter set later without changing the conceptual API class + +### Read User Geo Profile + +Purpose: + +- Return the geo-related internal state of a single user for manual review or investigation. + +Caller: + +- Administrative interface +- Internal operational tooling + +Transport: + +- Trusted internal JSON REST API + +Response should include, at minimum: + +- `user_id` +- current `country_review_recommended` +- per-`device_session_id` country ranking +- per-`device_session_id` `usual_connection_country` +- observation summaries grouped by `device_session_id` +- declared-country version history +- suspicious-session indicators if present +- session-block action history if useful for operations + +### Read Service Health and Operational State + +Purpose: + +- Expose service-operability information for internal monitoring. 
+ +Caller: + +- Monitoring systems +- Internal operators + +Transport: + +- Internal HTTP endpoints + +Response may include: + +- readiness state +- liveness state +- queue lag indicators +- Geo-IP database status +- downstream integration health summaries + +This query group is operational, not business-facing. + +## Events + +Events are emitted as auxiliary notifications. They are not sources of truth. + +### Country Review Recommended + +Meaning: + +- `country_review_recommended` transitioned from `false` to `true` for a user. + +Producer: + +- Geo Profile Service + +Consumers: + +- Administrative workflow automation +- Internal notification consumers +- Optional future downstream internal systems + +Delivery channel: + +- Internal event bus + +Guarantees: + +- Best effort only, unless the underlying bus is later upgraded +- Loss of event must not lose the actual business state + +Durable state counterpart: + +- The current review flag must remain available through Geo Profile Service query APIs + +### Optional Admin Email Notification + +Meaning: + +- Administrative email generated because a user entered review-recommended state + +Producer: + +- Geo Profile Service via `Mail Service` + +Consumers: + +- Administrators + +This is operationally useful but never required for correctness. + +## Data Entities + +This section defines the core logical entities of the service. These are domain entities, not mandatory final physical table names. + +### Country Observation + +Represents a stored observation fact derived from one authenticated request. 
+ +Required logical fields: + +- `user_id` +- `device_session_id` +- `observed_country` +- observation timestamp + +Optional implementation fields: + +- obfuscated or hashed IP representation +- internal ingestion metadata +- processing metadata + +Role in the system: + +- Source data for rankings +- Source data for suspicious-session detection +- Source data for review recommendations + +### Device Session Country Score + +Represents the weighted ranking of countries for one `device_session_id`. + +Required logical fields: + +- `device_session_id` +- `country_code` +- current score +- last contribution timestamp +- optional rank or ordering marker + +Role in the system: + +- Maintains the rolling per-session country distribution +- Supports direct derivation of `usual_connection_country` + +### Device Session Geo State + +Represents the current derived geographic state of one `device_session_id`. + +Required logical fields: + +- `device_session_id` +- current `usual_connection_country` +- last observation timestamp +- summary metadata needed by admin APIs + +Role in the system: + +- Read-optimized representation of session-level geo state +- Allows admin APIs to avoid recomputing from raw observations on each read + +### User Review State + +Represents the current review-related state for one user. + +Required logical fields: + +- `user_id` +- `country_review_recommended` +- last evaluation timestamp +- optional reason code or explanation marker + +Role in the system: + +- Durable source for review filtering +- Source of truth for admin API candidate listing +- State backing for auxiliary event emission + +### Declared Country Version + +Represents one immutable version of the declared country. 
+ +Required logical fields: + +- `user_id` +- version identifier +- `declared_country` +- version creation timestamp +- actor identity or actor type +- optional reason or comment +- version status + +Suggested version statuses: + +- `recorded` +- `applied` +- `sync_failed` + +Role in the system: + +- Immutable history of approved country changes +- Separation between local history and currently effective external value + +### Session Block Action + +Represents a record of a suspicious-session block request. + +Required logical fields: + +- `user_id` +- `device_session_id` +- action timestamp +- reason code +- result status + +Role in the system: + +- Operational trace of protection actions +- Support for troubleshooting and admin inspection + +## Geo-IP Source + +The service uses a locally stored free Geo-IP country database. + +Requirements: + +- No per-request calls to external Geo-IP services. +- The database must be actively maintained and not abandoned. +- The service only needs country-level lookup. +- Database refresh is handled internally on a schedule. + +The Geo-IP database version is not stored with each observation, by explicit design choice. + +## Observation Storage Strategy + +The service does not keep a full raw IP log for every API request. + +The primary stored signal is the derived country observation and its aggregates. + +Recommended storage model: + +- Store observation facts at country level. +- Aggregate per `device_session_id`. +- Keep enough history to compute ranking and review decisions. +- Retain no raw IP by default. +- Allow optional obfuscated or hashed IP retention only if later justified by operational needs. + +A one-year observation horizon is acceptable as a starting point, subject to real data volume. + +## Derived Statistics Model + +The service computes a weighted ranking of countries per `device_session_id`. + +Baseline principles: + +- More recent observations carry more weight. +- Older observations decay over time.
+- The calculation is based on recent activities, not active calendar days. +- The scoring model must remain computationally cheap and tunable. + +The service must maintain, at minimum: + +- Ranked observed countries for each `device_session_id` +- Current `usual_connection_country` for each `device_session_id` +- Sufficient ranking data for administrative inspection + +The precise scoring formula is configurable and intentionally left outside this document. + +## Suspicious Session Logic + +The service must detect suspicious multi-country behavior across multiple sessions of the same user. + +The intended interpretation is: + +- Slow geographic drift over larger time spans is normal. +- Simultaneous or near-simultaneous active usage from conflicting countries is suspicious. +- Suspicion targets sessions, not the entire account. + +Important trade-off: + +- The request that caused the suspicion is allowed to proceed. +- Session blocking is requested asynchronously afterward. +- The next request from the blocked session should be rejected by `Auth / Session Service`. + +```mermaid +flowchart TD + O[New processed observation] --> D{Conflicting country pattern + across user sessions?} + D -- No --> N[No block action] + D -- Yes --> C[Select suspicious device sessions] + C --> A[Call Auth / Session Service block API] + A --> X[Subsequent requests get rejected] +``` + +Exact threshold tuning is configuration-driven and may evolve without changing the service boundary. + +## Country Review Recommendation Logic + +The recommendation workflow is durable and queryable. + +Key rules: + +- `country_review_recommended` is stored as service state. +- Transition to `true` must not be represented only by an event. +- Administrative systems must be able to retrieve candidates via service API. +- Auxiliary notifications exist only to reduce polling latency. 
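The key rules above (state first, notification second) can be sketched as follows. This is a minimal illustration, not the service's actual code: `ReviewStateStore`, `recommend_review`, and the `publish_event` callback are hypothetical names, and an in-memory dict stands in for durable storage.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ReviewStateStore:
    """In-memory stand-in for the durable user review state (hypothetical)."""

    flags: Dict[str, bool] = field(default_factory=dict)

    def set_recommended(self, user_id: str) -> bool:
        """Persist the flag; return True only on a false -> true transition."""
        was_set = self.flags.get(user_id, False)
        self.flags[user_id] = True
        return not was_set

    def list_candidates(self) -> List[str]:
        """Sorted output gives the stable ordering that pagination needs."""
        return sorted(u for u, flagged in self.flags.items() if flagged)


def recommend_review(
    store: ReviewStateStore,
    user_id: str,
    publish_event: Callable[[str], None],
) -> bool:
    """Store the flag first, then notify on a best-effort basis."""
    transitioned = store.set_recommended(user_id)
    if transitioned:
        try:
            publish_event(user_id)
        except Exception:
            # Assumption: a real service would log and meter this failure.
            # The stored flag is unaffected, so polling still finds the user.
            pass
    return transitioned
```

Because the flag is written before the event is attempted, a failed or lost notification only delays discovery until the next poll of the candidate list.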
+ +```mermaid +flowchart LR + P[Processed observations] --> F{Review criteria met?} + F -- No --> K[Keep existing state] + F -- Yes --> T[Set country_review_recommended=true] + T --> API[Expose via internal REST API] + T --> BUS[Publish event bus notification] + T --> MAIL[Optionally send admin email] +``` + +If event delivery fails, the recommendation state still exists and remains observable through the API. + +## Administrative Read API + +The service exposes a trusted internal JSON REST API for administrative and operational reads. + +### Review Candidate Query Endpoint + +Purpose: + +- Return `user_id` values for users requiring review. + +Initial required filter set: + +- `country_review_recommended=true` + +Expected characteristics: + +- Pagination support +- Stable ordering +- Simple extension path for additional filters later + +### Geo Profile Query Endpoint + +Purpose: + +- Return the internal geo profile of a specific user for administrative inspection. + +The response should include, at minimum: + +- `user_id` +- current review flag +- per-`device_session_id` country ranking +- per-`device_session_id` `usual_connection_country` +- observation summaries +- declared country version history +- suspicious session markers if present +- enough information for manual administrative decision-making + +The profile is grouped by `device_session_id`, because that is the primary aggregation boundary. + +## Declared Country Change Command API + +Geo Profile Service exposes an internal trusted command API to apply approved `declared_country` changes. + +The command path must behave as follows: + +- Record a new declared country version in Geo Profile Service storage. +- Synchronously update the current `declared_country` value in `User Service`. +- Mark the new version effective only if the `User Service` update succeeds. 
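The three steps of the command path can be sketched as below. This is an illustrative outline under stated assumptions: `DeclaredCountryVersion`, `apply_declared_country_change`, and the `sync_to_user_service` callback are hypothetical names, the version history is a plain list rather than real storage, and the status strings follow the `recorded` / `applied` / `sync_failed` statuses suggested for declared-country versions.

```python
from dataclasses import dataclass
from typing import Callable, List

RECORDED = "recorded"
APPLIED = "applied"
SYNC_FAILED = "sync_failed"


@dataclass
class DeclaredCountryVersion:
    """One version record; only `status` changes after creation."""

    user_id: str
    version: int
    declared_country: str
    status: str = RECORDED


def apply_declared_country_change(
    history: List[DeclaredCountryVersion],
    user_id: str,
    new_declared_country: str,
    sync_to_user_service: Callable[[str, str], None],
) -> DeclaredCountryVersion:
    """Record the version, sync it, and mark it effective only on success."""
    next_version = sum(1 for v in history if v.user_id == user_id) + 1
    record = DeclaredCountryVersion(user_id, next_version, new_declared_country)
    history.append(record)  # Step 1: record locally before any remote call.
    try:
        # Step 2: synchronous update of the current value in User Service.
        sync_to_user_service(user_id, new_declared_country)
    except Exception:
        # The record stays in history but is visibly not effective.
        record.status = SYNC_FAILED
        return record
    record.status = APPLIED  # Step 3: effective only after the sync succeeded.
    return record
```

Keeping the failed record with an explicit status, rather than rolling it back, is what lets an operator or retry job find and resolve the divergence later.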
+ +Recommended version lifecycle: + +- `recorded` +- `applied` +- `sync_failed` + +This lifecycle prevents invisible divergence between history and current value. + +```mermaid +sequenceDiagram + participant Admin as Admin Interface + participant Geo as Geo Profile Service + participant User as User Service + + Admin->>Geo: Apply approved declared_country change + Geo->>Geo: Create new version record + Geo->>User: Sync current declared_country + alt Sync succeeds + User-->>Geo: OK + Geo->>Geo: Mark version as applied + Geo-->>Admin: Success + else Sync fails + User-->>Geo: Error + Geo->>Geo: Mark version as sync_failed + Geo-->>Admin: Failure + end +``` + +## Integration with User Service + +`User Service` keeps the latest effective `declared_country` because other services and the gateway may need it frequently for response shaping without querying Geo Profile Service. + +Integration rules: + +- Geo Profile Service owns the mutation workflow. +- `User Service` stores only the latest effective value. +- Reads of the current country for normal business responses should go to `User Service`. +- Reads of country history and geo-derived data should go to Geo Profile Service. + +## Integration with Auth / Session Service + +Geo Profile Service must be able to request blocking of suspicious sessions. + +Contract assumptions: + +- Blocking is idempotent. +- The block applies to `device_session_id`, not to the entire user account. +- The effect is expected on subsequent requests, not the current triggering request. + +This keeps the hot path simple and avoids synchronous enforcement coupling. + +## Integration with Mail Service + +Mail notifications are optional and configuration-driven. + +Mail is sent only when: + +- `country_review_recommended` transitions to `true` +- Email notifications are enabled + +Mail is auxiliary and must not be required for business correctness. 
+ +## Event Bus Integration + +The service emits an event when `country_review_recommended` transitions to `true`. + +Event usage: + +- Auxiliary notification for downstream systems +- Reduced delay for admin workflows +- Optional future fan-out for additional internal consumers + +Important constraints: + +- The event bus is not the source of truth. +- Loss of an event must not lose the business state. +- Periodic pull through the service API must remain sufficient to recover missed notifications. + +## Failure and Degradation Model + +The service is intentionally designed for fail-open behavior relative to the edge. + +### Edge-to-Service Failure + +If Geo Profile Service is unavailable: + +- `Edge Service` must continue request processing unchanged. +- The publication failure becomes a metric/logging concern. +- No user-visible request rejection is introduced by this dependency. + +### In-Service Processing Failure + +If the worker pipeline temporarily fails: + +- Already accepted observations stay queued if a durable queue is used. +- Processing lag grows and must be monitored. +- Administrative state may become stale, but the rest of the platform keeps functioning. + +### User Service Sync Failure + +If `declared_country` sync to `User Service` fails: + +- The version record remains in Geo Profile Service. +- The version must be marked as not yet effective. +- Retry or operator action can be used later. +- No silent divergence is allowed. + +### Mail or Event Delivery Failure + +If mail or event publication fails: + +- The failure is logged and metered. +- `country_review_recommended` remains persisted. +- Administrative polling can still find the affected user. + +## Privacy and Retention Posture + +The privacy posture is intentionally minimal. + +- Do not store raw IP long-term unless a later justification appears. +- Prefer storing country-level derived facts and aggregates.
+- If hashed or obfuscated IP is introduced later, treat it as an implementation detail, not as a core domain dependency. +- Retention is expected to be bounded and configurable. + +## Operational Observability + +The service should expose metrics and logs for at least: + +- Ingest acceptance rate +- Ingest publish failures observed by edge +- Queue depth +- Queue lag +- Geo-IP lookup latency +- Observation processing latency +- Review flag transitions +- Suspicious session block commands +- User Service sync failures +- Mail send failures +- Event publication failures + +The service must be easy to operate even though it does not sit on the synchronous business-critical path. + +## Minimal Initial API Surface + +The initial required API surface is intentionally small. + +Binary ingest path: + +- Asynchronous FlatBuffers message publication from edge +- No business response body +- No synchronous decision returned to edge + +Internal JSON REST paths: + +- List review candidates by filter +- Read user geo profile grouped by `device_session_id` +- Apply approved `declared_country` version change +- Optional internal health and metrics endpoints + +Any additional endpoints should be added only if a concrete consumer appears. + +## Design Trade-Offs Accepted by This Architecture + +This architecture intentionally accepts the following trade-offs: + +- Some observation messages may be lost if the service is down and the edge cannot deliver them. +- The request that triggers suspicious-session detection is allowed to continue. +- Geo-IP history is not strictly reproducible against past database versions. +- Current `declared_country` is denormalized into `User Service`. +- Administrative approval policy stays flexible and human-driven. + +These trade-offs are acceptable because they keep the hottest path simple while preserving enough internal state for review and risk handling. 
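The first trade-off (observations may be lost rather than slow down the edge) could look roughly like this on the publishing side. This is only a sketch: `FailOpenPublisher`, its `send` callback, and the `max_pending` bound are assumptions made for illustration, and the plain integer counters stand in for whatever metrics client the platform uses.

```python
import queue
from typing import Callable


class FailOpenPublisher:
    """Bounded, lossy buffer between the request path and the sender."""

    def __init__(self, send: Callable[[bytes], None], max_pending: int = 1000):
        self._send = send
        self._pending = queue.Queue(maxsize=max_pending)
        self.dropped = 0  # Buffer was full, observation discarded.
        self.failed = 0   # Send attempt raised, observation discarded.

    def publish(self, payload: bytes) -> None:
        """Called on the request path: never blocks and never raises."""
        try:
            self._pending.put_nowait(payload)
        except queue.Full:
            self.dropped += 1

    def drain(self) -> int:
        """Called off the request path, for example from a background thread.

        Returns the number of payloads that were delivered.
        """
        sent = 0
        while True:
            try:
                payload = self._pending.get_nowait()
            except queue.Empty:
                return sent
            try:
                self._send(payload)
                sent += 1
            except Exception:
                # Fail open: the loss is counted, not retried or surfaced.
                self.failed += 1
```

The request handler only ever calls `publish`, so an unavailable Geo Profile Service shows up as growing `dropped` and `failed` counters instead of as request latency or errors.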
+ +## Implementation Readiness Statement + +The architecture is considered ready for implementation planning. + +The main remaining work is not conceptual but executional: + +- Precise API shape +- Queue implementation details +- Scoring formula tuning +- Suspicious-session thresholds +- Concrete storage schema +- Operational hardening
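As a starting point for the scoring-formula item, here is one possible shape that satisfies the baseline principles (recent observations weigh more, older ones decay, cheap to compute, tunable). It is purely illustrative: this document deliberately leaves the real formula to configuration, and the exponential half-life model, `HALF_LIFE_SECONDS`, and all function names below are assumptions. The stored fields mirror the score and last contribution timestamp of the Device Session Country Score entity.

```python
from dataclasses import dataclass
from typing import Dict, Optional

HALF_LIFE_SECONDS = 30 * 24 * 3600.0  # Assumed tuning knob, not a decided value.


@dataclass
class CountryScore:
    score: float
    last_contribution_ts: float


def decayed(entry: CountryScore, now: float, half_life: float = HALF_LIFE_SECONDS) -> float:
    """Score as of `now`: halves for every `half_life` seconds of inactivity."""
    elapsed = max(0.0, now - entry.last_contribution_ts)
    return entry.score * 0.5 ** (elapsed / half_life)


def record_observation(
    scores: Dict[str, CountryScore],
    country: str,
    now: float,
    half_life: float = HALF_LIFE_SECONDS,
) -> None:
    """Decay the touched country lazily, then add one unit of weight."""
    current = scores.get(country)
    if current is None:
        scores[country] = CountryScore(1.0, now)
        return
    newest_ts = max(now, current.last_contribution_ts)
    scores[country] = CountryScore(decayed(current, now, half_life) + 1.0, newest_ts)


def usual_connection_country(
    scores: Dict[str, CountryScore],
    now: float,
    half_life: float = HALF_LIFE_SECONDS,
) -> Optional[str]:
    """Highest decayed score wins; ties resolve alphabetically."""
    if not scores:
        return None
    return max(sorted(scores), key=lambda c: decayed(scores[c], now, half_life))
```

Each observation updates a single row and needs no scan of raw history, which keeps the worker cheap; changing the half-life changes how quickly a session's usual country follows a relocation.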