# Geo Profile Service

## Context and Purpose

Geo Profile Service is an internal trusted microservice responsible for collecting and processing country-level connection signals for authenticated users.

The service exists to solve four related problems:

- Record the observed country of authenticated requests based on local Geo-IP lookup.
- Maintain per-`device_session_id` country statistics and derive a `usual_connection_country`.
- Support administrative review workflows around user country changes.
- Detect suspicious multi-country session behavior and request blocking of suspicious sessions through `Auth / Session Service`.

The service is intentionally narrow in scope. It does not own authentication, user identity validation, or user-facing profile reads for the latest country value.

## Explicit Non-Goals

The following are intentionally out of scope for this service:

- Region-level or city-level geolocation.
- VPN, proxy, anonymizer, or hosting-provider detection.
- Automatic change of `declared_country` based on observed metrics.
- Immediate blocking of the same request that triggered suspicion.
- Global source-of-truth ownership for the current user country.
- Direct exposure of storage to other services.
- Strong audit reproducibility of historical Geo-IP lookup results by storing Geo-IP database versions.

## Place in the Existing Microservice System

The service is embedded into an already existing trusted microservice environment and integrates with:

- `Edge Service`
- `Auth / Session Service`
- `User Service`
- `Mail Service`
- Internal event bus

`Edge Service` is the producer of authenticated connection observations.
`User Service` remains the centralized owner of the latest effective `declared_country` value for normal user profile reads.
`Auth / Session Service` remains the owner of session lifecycle and session blocking.
`Mail Service` is used only for optional administrative notifications.
The event bus is used only as an auxiliary notification channel and not as the authoritative source of business state.

## Responsibility Boundaries

Geo Profile Service owns:

- Geo-IP lookup at country level using a local database.
- Storage of `observed_country` as a fact of observation.
- Per-`device_session_id` country aggregation.
- Computation of `usual_connection_country`.
- Computation and storage of `country_review_recommended`.
- Version history of `declared_country`.
- Internal administrative read APIs for geo-related user state.
- Internal command API to apply approved `declared_country` changes.
- Detection of suspicious cross-country session patterns.
- Session block requests toward `Auth / Session Service`.

Geo Profile Service does not own:

- Validation of `user_id` and `device_session_id` against external services.
- Public user profile reads for the latest country value.
- Authentication or authorization of end users.
- Final enforcement of session blocking.
- Delivery guarantees of auxiliary event notifications.
- Formal administrative SLA or rigid approval policies.

## Semantic Model

The service works with four core country-related concepts.

### declared_country

`declared_country` is the user-declared country.

Properties:

- It is a user-facing business attribute.
- The latest effective value is stored in `User Service`.
- The full version history is stored in Geo Profile Service.
- It is never changed automatically by metrics.
- It changes only through a controlled command path and administrative approval.

### observed_country

`observed_country` is the country derived from Geo-IP for a specific authenticated request.

Properties:

- It is an observation fact, not a truth claim about residence.
- It is tied to `user_id`, `device_session_id`, and observation time.
- It is derived on the server side from the source IP seen at the trusted edge.
- It is used as input into country aggregation and anomaly detection.
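
As a minimal sketch of the `observed_country` properties above, the following shows one way an observation fact could be derived from an ingested IP without keeping the IP itself. The names `CountryObservation`, `observe`, and `lookup_country` are hypothetical, as are the choice of ISO 3166-1 alpha-2 codes and the decision to drop unresolved lookups; none of them is fixed by this document.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass(frozen=True)
class CountryObservation:
    """One observation fact; deliberately carries no raw IP address."""
    user_id: str
    device_session_id: str
    observed_country: str  # assumed here to be an ISO 3166-1 alpha-2 code
    observed_at: datetime


def observe(
    user_id: str,
    device_session_id: str,
    ip_address: str,
    lookup_country: Callable[[str], Optional[str]],
    now: Optional[datetime] = None,
) -> Optional[CountryObservation]:
    """Resolve the IP to a country and return an observation, or None if unresolved."""
    country = lookup_country(ip_address)
    if country is None:
        return None  # assumption: unresolved lookups produce no observation
    observed_at = now or datetime.now(timezone.utc)
    return CountryObservation(user_id, device_session_id, country.upper(), observed_at)


# Usage with a stand-in resolver (hypothetical; not the real Geo-IP package).
fake_db = {"203.0.113.7": "de"}
obs = observe("u-1", "s-1", "203.0.113.7", fake_db.get)
unresolved = observe("u-1", "s-1", "198.51.100.1", fake_db.get)
```

The IP address is only an input to the lookup, so the stored fact stays at country level.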
### usual_connection_country

`usual_connection_country` is the computed most typical country of network egress for a given `device_session_id`.

Properties:

- It is not interpreted as country of residence.
- It is calculated per `device_session_id`, not globally per account.
- It is derived from recent weighted observations with decay over time.
- It is expected to drift naturally as usage patterns change.

### country_review_recommended

`country_review_recommended` is an internal service flag that indicates that the accumulated observations justify administrative review.

Properties:

- It does not trigger automatic country change.
- It is stored durably in the service state.
- It is readable through the service API.
- Transition to `true` may also emit an event and optionally send email.

## Data Ownership Rules

The split ownership model is intentional.

- `User Service` owns the latest effective `declared_country`.
- Geo Profile Service owns the history of `declared_country` changes.
- Geo Profile Service owns `observed_country`, `usual_connection_country`, and `country_review_recommended`.

This means Geo Profile Service is the owner of the country-change process, but `User Service` is the owner of the currently effective denormalized value used by the rest of the system.

To avoid divergence:

- No service other than Geo Profile Service should directly mutate the current `declared_country` in `User Service`.
- Geo Profile Service must write the new version in its own storage first.
- Geo Profile Service must then synchronously update the current value in `User Service`.
- A version should become effective only after the `User Service` update succeeds.

## High-Level Architecture

```mermaid
flowchart LR
    Client[Client] --> Edge[Edge Service]
    Edge --> Auth[Auth / Session Service]
    Auth --> Edge
    Edge -. async flatbuffers ingest .-> Geo[Geo Profile Service]
    Geo --> User[User Service]
    Geo --> Mail[Mail Service]
    Geo --> Bus[Event Bus]
    Geo --> Auth
    AdminUI[Admin Interface] --> Edge
    Edge --> Geo
    Edge --> User
```

## Ingress Processing Model

The hot path from `Edge Service` to Geo Profile Service is intentionally asynchronous and non-blocking for the edge.

Design rules:

- `Edge Service` publishes a minimal FlatBuffers message after user authentication.
- The message contains only:
  - `user_id`
  - `device_session_id`
  - `ip_address`
- No protobuf wrapper is used.
- No business response is required from Geo Profile Service.
- The edge does not depend on this service for normal request continuation.
- Failures are treated as observability signals, not as reasons to change gateway behavior.

This design explicitly prioritizes low infrastructure complexity and low overhead on the hottest path over strict RPC semantics.

## Ingress Transport Contract

The ingress path is not modeled as conventional request-response RPC.

Recommended transport shape:

- Internal binary HTTP endpoint or similarly simple internal binary transport.
- `application/octet-stream` body encoded as FlatBuffers.
- Minimal acknowledgement such as `202 Accepted` with empty body.
- The acknowledgement is not part of business logic.
- The edge client should publish asynchronously and ignore service availability for request progression.

The service must only validate:

- FlatBuffers message integrity.
- Presence of required scalar fields.
- Basic field shape constraints.

The service must not validate:

- Whether `user_id` exists.
- Whether `device_session_id` belongs to the user.
- Whether the session is still valid.

Those concerns belong to the already trusted authentication/session layer.

## Internal Queue and Worker Pipeline

Geo Profile Service must process ingress data in its own queue and worker flow.

```mermaid
flowchart LR
    E[Edge Service] -. async flatbuffers publish .-> I[Ingest Receiver]
    I --> Q[Internal Ingest Queue]
    Q --> W[Processing Worker]
    W --> G[Geo-IP Resolver]
    G --> A[Observation Aggregator]
    A --> U[usual_connection_country Calculator]
    A --> R[country_review_recommended Evaluator]
    A --> S[Session Suspicion Detector]
    S --> B[Block Session Command]
```

The internal queue exists to decouple network acceptance from CPU and storage work.

Required properties:

- The network-facing ingest step is append/update-only.
- The worker can process observations independently from the ingest receiver.
- Expensive logic must not run inline on the network acceptance step.
- Queue backlog and processing latency must be observable.

A simple durable internal queue is preferred over a complex broker dependency for this part of the system.

## Service Interface Model

The service interface is intentionally divided into commands, queries, and events.

This split exists to preserve the architectural rules already fixed above:

- Hot-path ingest is asynchronous and write-oriented.
- Administrative reads use trusted internal JSON REST APIs.
- State-changing administrative operations follow one controlled command path.
- Events are auxiliary notifications and never the only representation of business state.

## Commands

Commands change service state or trigger downstream effects.

### Ingest Connection Observation

Purpose:

- Accept an authenticated country observation from `Edge Service`.


Caller:

- `Edge Service`

Transport:

- Internal binary transport
- FlatBuffers payload
- Async publication
- No business response

Payload:

- `user_id`
- `device_session_id`
- `ip_address`

Effects:

- Enqueue observation for processing
- Eventually resolve `observed_country`
- Update per-session country statistics
- Potentially update `usual_connection_country`
- Potentially set `country_review_recommended`
- Potentially request session blocking through `Auth / Session Service`

Important behavior:

- This command must not block edge request processing.
- Failure to send or process is an observability concern, not a gateway correctness concern.

### Apply Approved Declared Country Change

Purpose:

- Record a new approved version of `declared_country` and synchronize the current value into `User Service`.

Caller:

- Trusted internal administrative workflow
- Administrative interface backend
- Internal orchestration component

Transport:

- Trusted internal JSON REST API

Input shape:

- `user_id`
- `new_declared_country`
- actor identity or actor type
- optional reason or comment
- optional correlation metadata

Effects:

- Create immutable declared-country version record in Geo Profile Service
- Synchronize latest effective value to `User Service`
- Mark version as effective only after sync succeeds

Important behavior:

- Geo Profile Service is the owner of this mutation workflow.
- No bypass write path to `User Service` should exist for this field.

### Request Suspicious Session Block

Purpose:

- Ask `Auth / Session Service` to block suspicious `device_session_id` values.


Caller:

- Internal processing worker inside Geo Profile Service

Transport:

- Trusted internal API call from Geo Profile Service to `Auth / Session Service`

Input shape:

- `user_id`
- one or more suspicious `device_session_id`
- reason or code for block trigger
- optional evidence reference

Effects:

- Session block request is sent to `Auth / Session Service`
- Local action log is written in Geo Profile Service

Important behavior:

- Current triggering request is not interrupted.
- The effect is expected on subsequent requests.

## Queries

Queries return internal state and never mutate business state.

### List Review Candidates

Purpose:

- Return `user_id` values matching review-related filters.

Caller:

- Administrative interface
- Internal operational tooling

Transport:

- Trusted internal JSON REST API

Initial supported filter:

- `country_review_recommended=true`

Expected response characteristics:

- Pagination
- Stable ordering
- Ability to extend filter set later without changing the conceptual API class

### Read User Geo Profile

Purpose:

- Return the geo-related internal state of a single user for manual review or investigation.

Caller:

- Administrative interface
- Internal operational tooling

Transport:

- Trusted internal JSON REST API

Response should include, at minimum:

- `user_id`
- current `country_review_recommended`
- per-`device_session_id` country ranking
- per-`device_session_id` `usual_connection_country`
- observation summaries grouped by `device_session_id`
- declared-country version history
- suspicious-session indicators if present
- session-block action history if useful for operations

### Read Service Health and Operational State

Purpose:

- Expose service-operability information for internal monitoring.


Caller:

- Monitoring systems
- Internal operators

Transport:

- Internal HTTP endpoints

Response may include:

- readiness state
- liveness state
- queue lag indicators
- Geo-IP database status
- downstream integration health summaries

This query group is operational, not business-facing.

## Events

Events are emitted as auxiliary notifications. They are not sources of truth.

### Country Review Recommended

Meaning:

- `country_review_recommended` transitioned from `false` to `true` for a user.

Producer:

- Geo Profile Service

Consumers:

- Administrative workflow automation
- Internal notification consumers
- Optional future downstream internal systems

Delivery channel:

- Internal event bus

Guarantees:

- Best effort only, unless the underlying bus is later upgraded
- Loss of event must not lose the actual business state

Durable state counterpart:

- The current review flag must remain available through Geo Profile Service query APIs

### Optional Admin Email Notification

Meaning:

- Administrative email generated because a user entered review-recommended state

Producer:

- Geo Profile Service via `Mail Service`

Consumers:

- Administrators

This is operationally useful but never required for correctness.

## Data Entities

This section defines the core logical entities of the service. These are domain entities, not mandatory final physical table names.

### Country Observation

Represents a stored observation fact derived from one authenticated request.

Required logical fields:

- `user_id`
- `device_session_id`
- `observed_country`
- observation timestamp

Optional implementation fields:

- obfuscated or hashed IP representation
- internal ingestion metadata
- processing metadata

Role in the system:

- Source data for rankings
- Source data for suspicious-session detection
- Source data for review recommendations

### Device Session Country Score

Represents the weighted ranking of countries for one `device_session_id`.


Required logical fields:

- `device_session_id`
- `country_code`
- current score
- last contribution timestamp
- optional rank or ordering marker

Role in the system:

- Maintains the rolling per-session country distribution
- Supports direct derivation of `usual_connection_country`

### Device Session Geo State

Represents the current derived geographic state of one `device_session_id`.

Required logical fields:

- `device_session_id`
- current `usual_connection_country`
- last observation timestamp
- summary metadata needed by admin APIs

Role in the system:

- Read-optimized representation of session-level geo state
- Allows admin APIs to avoid recomputing from raw observations on each read

### User Review State

Represents the current review-related state for one user.

Required logical fields:

- `user_id`
- `country_review_recommended`
- last evaluation timestamp
- optional reason code or explanation marker

Role in the system:

- Durable source for review filtering
- Source of truth for admin API candidate listing
- State backing for auxiliary event emission

### Declared Country Version

Represents one immutable version of the declared country.

Required logical fields:

- `user_id`
- version identifier
- `declared_country`
- version creation timestamp
- actor identity or actor type
- optional reason or comment
- version status

Suggested version statuses:

- `recorded`
- `applied`
- `sync_failed`

Role in the system:

- Immutable history of approved country changes
- Separation between local history and currently effective external value

### Session Block Action

Represents a record of a suspicious-session block request.

Required logical fields:

- `user_id`
- `device_session_id`
- action timestamp
- reason code
- result status

Role in the system:

- Operational trace of protection actions
- Support for troubleshooting and admin inspection

## Geo-IP Source

The service uses a locally stored free Geo-IP country database. The database is accessible via the [geoip](../pkg/geoip/) package.


Requirements:

- No per-request calls to external Geo-IP services.
- The database must be actively maintained and not abandoned.
- The service only needs country-level lookup.
- Database refresh is handled internally on a schedule.

By explicit design choice, the version of the Geo-IP database is not stored with each observation.

## Observation Storage Strategy

The service does not keep a full raw IP log for every API request. The primary stored signal is the derived country observation and its aggregates.

Recommended storage model:

- Store observation facts at country level.
- Aggregate per `device_session_id`.
- Keep enough history to compute ranking and review decisions.
- Retain no raw IP by default.
- Allow optional obfuscated or hashed IP retention only if later justified by operational needs.

A one-year observation horizon is acceptable as a starting point, subject to real data volume.

## Derived Statistics Model

The service computes a weighted ranking of countries per `device_session_id`.

Baseline principles:

- More recent observations carry more weight.
- Older observations decay over time.
- The calculation is based on recent activity, not on active calendar days.
- The scoring model must remain computationally cheap and tunable.

The service must maintain, at minimum:

- Ranked observed countries for each `device_session_id`
- Current `usual_connection_country` for each `device_session_id`
- Sufficient ranking data for administrative inspection

The precise scoring formula is configurable and intentionally left outside this document.

## Suspicious Session Logic

The service must detect suspicious multi-country behavior across multiple sessions of the same user.

The intended interpretation is:

- Slow geographic drift over larger time spans is normal.
- Simultaneous or near-simultaneous active usage from conflicting countries is suspicious.
- Suspicion targets sessions, not the entire account.

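
The interpretation above can be sketched as a pairwise check over the latest geo state of one user's sessions. This is only an illustration under stated assumptions: `SessionGeoState`, `conflicting_sessions`, and the single `window` threshold are hypothetical, and flagging every session involved in a conflict is one possible selection policy, not a rule defined by this document.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import combinations
from typing import Iterable, Set


@dataclass(frozen=True)
class SessionGeoState:
    device_session_id: str
    country: str         # latest observed country for the session
    last_seen: datetime  # timestamp of that observation


def conflicting_sessions(states: Iterable[SessionGeoState], window: timedelta) -> Set[str]:
    """Return ids of sessions seen in different countries within `window` of each other."""
    suspicious: Set[str] = set()
    for a, b in combinations(list(states), 2):
        near_simultaneous = abs(a.last_seen - b.last_seen) <= window
        if near_simultaneous and a.country != b.country:
            # Assumption: both sides of a conflicting pair are flagged.
            suspicious.update((a.device_session_id, b.device_session_id))
    return suspicious


# Usage: all states belong to the same user.
t0 = datetime(2024, 1, 1, 12, 0)
states = [
    SessionGeoState("s-1", "DE", t0),
    SessionGeoState("s-2", "BR", t0 + timedelta(minutes=5)),  # conflicts with s-1
    SessionGeoState("s-3", "DE", t0 - timedelta(days=30)),    # far in the past, treated as drift
]
flagged = conflicting_sessions(states, window=timedelta(minutes=30))
```

The result is a set of `device_session_id` values, which matches the rule that suspicion targets sessions rather than the account.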

Important trade-off:

- The request that caused the suspicion is allowed to proceed.
- Session blocking is requested asynchronously afterward.
- The next request from the blocked session should be rejected by `Auth / Session Service`.

```mermaid
flowchart TD
    O[New processed observation] --> D{Conflicting country pattern across user sessions?}
    D -- No --> N[No block action]
    D -- Yes --> C[Select suspicious device sessions]
    C --> A[Call Auth / Session Service block API]
    A --> X[Subsequent requests get rejected]
```

Exact threshold tuning is configuration-driven and may evolve without changing the service boundary.

## Country Review Recommendation Logic

The recommendation workflow is durable and queryable.

Key rules:

- `country_review_recommended` is stored as service state.
- Transition to `true` must not be represented only by an event.
- Administrative systems must be able to retrieve candidates via service API.
- Auxiliary notifications exist only to reduce polling latency.

```mermaid
flowchart LR
    P[Processed observations] --> F{Review criteria met?}
    F -- No --> K[Keep existing state]
    F -- Yes --> T[Set country_review_recommended=true]
    T --> API[Expose via internal REST API]
    T --> BUS[Publish event bus notification]
    T --> MAIL[Optionally send admin email]
```

If event delivery fails, the recommendation state still exists and remains observable through the API.

## Administrative Read API

The service exposes a trusted internal JSON REST API for administrative and operational reads.

### Review Candidate Query Endpoint

Purpose:

- Return `user_id` values for users requiring review.

Initial required filter set:

- `country_review_recommended=true`

Expected characteristics:

- Pagination support
- Stable ordering
- Simple extension path for additional filters later

### Geo Profile Query Endpoint

Purpose:

- Return the internal geo profile of a specific user for administrative inspection.


The response should include, at minimum:

- `user_id`
- current review flag
- per-`device_session_id` country ranking
- per-`device_session_id` `usual_connection_country`
- observation summaries
- declared country version history
- suspicious session markers if present
- enough information for manual administrative decision-making

The profile is grouped by `device_session_id`, because that is the primary aggregation boundary.

## Declared Country Change Command API

Geo Profile Service exposes an internal trusted command API to apply approved `declared_country` changes.

The command path must behave as follows:

- Record a new declared country version in Geo Profile Service storage.
- Synchronously update the current `declared_country` value in `User Service`.
- Mark the new version effective only if the `User Service` update succeeds.

Recommended version lifecycle:

- `recorded`
- `applied`
- `sync_failed`

This lifecycle prevents invisible divergence between history and current value.

```mermaid
sequenceDiagram
    participant Admin as Admin Interface
    participant Geo as Geo Profile Service
    participant User as User Service
    Admin->>Geo: Apply approved declared_country change
    Geo->>Geo: Create new version record
    Geo->>User: Sync current declared_country
    alt Sync succeeds
        User-->>Geo: OK
        Geo->>Geo: Mark version as applied
        Geo-->>Admin: Success
    else Sync fails
        User-->>Geo: Error
        Geo->>Geo: Mark version as sync_failed
        Geo-->>Admin: Failure
    end
```

## Integration with User Service

`User Service` keeps the latest effective `declared_country` because other services and the gateway may need it frequently for response shaping without querying Geo Profile Service.

Integration rules:

- Geo Profile Service owns the mutation workflow.
- `User Service` stores only the latest effective value.
- Reads of the current country for normal business responses should go to `User Service`.
- Reads of country history and geo-derived data should go to Geo Profile Service.

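
The command path and integration rules above can be sketched as a record-then-sync routine. This is a minimal in-memory illustration, not the service's implementation: `VersionStore`, `apply_declared_country_change`, and the `sync_to_user_service` callable are hypothetical stand-ins for real storage and a real `User Service` client, and treating any raised exception as a sync failure is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DeclaredCountryVersion:
    user_id: str
    declared_country: str
    actor: str
    status: str = "recorded"  # lifecycle: recorded -> applied | sync_failed


@dataclass
class VersionStore:
    """In-memory stand-in for Geo Profile Service storage (hypothetical)."""
    versions: List[DeclaredCountryVersion] = field(default_factory=list)

    def record(self, version: DeclaredCountryVersion) -> DeclaredCountryVersion:
        self.versions.append(version)
        return version


def apply_declared_country_change(
    store: VersionStore,
    sync_to_user_service: Callable[[str, str], None],
    user_id: str,
    new_country: str,
    actor: str,
) -> DeclaredCountryVersion:
    # Step 1: write the new version locally first, in the "recorded" state.
    version = store.record(DeclaredCountryVersion(user_id, new_country, actor))
    # Step 2: synchronously push the current value to User Service.
    try:
        sync_to_user_service(user_id, new_country)
    except Exception:
        # Step 3a: keep the record, but never report it as effective.
        version.status = "sync_failed"
        return version
    # Step 3b: the version becomes effective only after the sync succeeded.
    version.status = "applied"
    return version


# Usage with stand-in sync functions.
user_service = {}  # plays the role of the current value held by User Service


def sync_ok(user_id: str, country: str) -> None:
    user_service[user_id] = country


def sync_down(user_id: str, country: str) -> None:
    raise ConnectionError("User Service unavailable")


store = VersionStore()
ok = apply_declared_country_change(store, sync_ok, "u-1", "FR", "admin:42")
failed = apply_declared_country_change(store, sync_down, "u-1", "ES", "admin:42")
```

In this sketch only the status field changes after a version is recorded, and a failed sync leaves the previously applied value in place.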
## Integration with Auth / Session Service

Geo Profile Service must be able to request blocking of suspicious sessions.

Contract assumptions:

- Blocking is idempotent.
- The block applies to `device_session_id`, not to the entire user account.
- The effect is expected on subsequent requests, not the current triggering request.

This keeps the hot path simple and avoids synchronous enforcement coupling.

## Integration with Mail Service

Mail notifications are optional and configuration-driven.

Mail is sent only when:

- `country_review_recommended` transitions to `true`
- Email notifications are enabled

Mail is auxiliary and must not be required for business correctness.

## Event Bus Integration

The service emits an event when `country_review_recommended` transitions to `true`.

Event usage:

- Auxiliary notification for downstream systems
- Reduced delay for admin workflows
- Optional future fan-out for additional internal consumers

Important constraint:

- The event bus is not the source of truth.
- Loss of an event must not lose the business state.
- Periodic pull through the service API must remain sufficient to recover missed notifications.

## Failure and Degradation Model

The service is intentionally designed for fail-open behavior relative to the edge.

### Edge-to-Service Failure

If Geo Profile Service is unavailable:

- `Edge Service` must continue request processing unchanged.
- The publication failure becomes a metric/logging concern.
- No user-visible request rejection is introduced by this dependency.

### In-Service Processing Failure

If the worker pipeline temporarily fails:

- Already accepted observations stay queued if a durable queue is used.
- Processing lag grows and must be monitored.
- Administrative state may become stale, but the rest of the platform keeps functioning.

### User Service Sync Failure

If `declared_country` sync to `User Service` fails:

- The version record remains in Geo Profile Service.
- The version must be marked as not yet effective.
- Retry or operator action can be used later.
- No silent divergence is allowed.

### Mail or Event Delivery Failure

If mail or event publication fails:

- The failure is logged and metered.
- `country_review_recommended` remains persisted.
- Administrative polling can still find the affected user.

## Privacy and Retention Posture

The privacy posture is intentionally minimal.

- Do not store raw IP long-term unless a later justification appears.
- Prefer storing country-level derived facts and aggregates.
- If hashed or obfuscated IP is introduced later, treat it as an implementation detail, not as a core domain dependency.
- Retention is expected to be bounded and configurable.

## Operational Observability

The service should expose metrics and logs for at least:

- Ingest acceptance rate
- Ingest publish failures observed by edge
- Queue depth
- Queue lag
- Geo-IP lookup latency
- Observation processing latency
- Review flag transitions
- Suspicious session block commands
- User Service sync failures
- Mail send failures
- Event publication failures

The service must be easy to operate even though it does not sit on the synchronous business-critical path.

## Minimal Initial API Surface

The initial required API surface is intentionally small.

Binary ingest path:

- Asynchronous FlatBuffers message publication from edge
- No business response body
- No synchronous decision returned to edge

Internal JSON REST paths:

- List review candidates by filter
- Read user geo profile grouped by `device_session_id`
- Apply approved `declared_country` version change
- Optional internal health and metrics endpoints

Any additional endpoints should be added only if a concrete consumer appears.

## Design Trade-Offs Accepted by This Architecture

This architecture intentionally accepts the following trade-offs:

- Some observation messages may be lost if the service is down and the edge cannot deliver them.
- The request that triggers suspicious-session detection is allowed to continue.
- Geo-IP history is not strictly reproducible against past database versions.
- Current `declared_country` is denormalized into `User Service`.
- Administrative approval policy stays flexible and human-driven.

These trade-offs are acceptable because they keep the hottest path simple while preserving enough internal state for review and risk handling.

## Implementation Readiness Statement

The architecture is considered ready for implementation planning.

The main remaining work is not conceptual but executional:

- Precise API shape
- Queue implementation details
- Scoring formula tuning
- Suspicious-session thresholds
- Concrete storage schema
- Operational hardening
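
As a possible starting point for the scoring formula tuning listed above, the following sketch shows one cheap way to implement the Derived Statistics Model: an exponentially decayed score per country, stored as a score plus last contribution timestamp for one `device_session_id`. The exponential form, the `half_life` parameter, the unit weight per observation, and all function names are assumptions; the document deliberately leaves the actual formula open.

```python
import math
from datetime import datetime, timedelta
from typing import Dict, Optional, Tuple

# For one device_session_id: country_code -> (score, last contribution timestamp)
Scores = Dict[str, Tuple[float, datetime]]


def decayed(score: float, last: datetime, now: datetime, half_life: timedelta) -> float:
    """Score after exponential decay: it halves every `half_life`."""
    elapsed = max((now - last).total_seconds(), 0.0)
    return score * math.pow(0.5, elapsed / half_life.total_seconds())


def add_observation(scores: Scores, country: str, now: datetime, half_life: timedelta) -> None:
    """Decay the country's existing score to `now`, then add one unit of weight."""
    old_score, last = scores.get(country, (0.0, now))
    scores[country] = (decayed(old_score, last, now, half_life) + 1.0, now)


def usual_connection_country(scores: Scores, now: datetime, half_life: timedelta) -> Optional[str]:
    """Country with the highest score after decaying every entry to `now`."""
    if not scores:
        return None
    return max(scores, key=lambda c: decayed(scores[c][0], scores[c][1], now, half_life))


# Usage: a few old observations from DE, then more recent ones from PL.
half_life = timedelta(days=30)
t0 = datetime(2024, 1, 1)
scores: Scores = {}
for day in range(3):
    add_observation(scores, "DE", t0 + timedelta(days=day), half_life)
for day in range(120, 125):
    add_observation(scores, "PL", t0 + timedelta(days=day), half_life)
usual = usual_connection_country(scores, t0 + timedelta(days=125), half_life)
```

Each update touches a single stored row and needs no scan of raw observations, which keeps the model cheap, and `half_life` is the single knob that controls how quickly `usual_connection_country` drifts.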