docs: geoprofile service
@@ -0,0 +1,679 @@
# Implementation Plan for Geo Profile Service

## Planning Principles

This plan is aligned with the agreed architecture and is written for an experienced developer implementing an internal microservice in a trusted environment.

Execution priorities:

- Keep the edge path non-blocking.
- Keep the service boundary narrow.
- Build append/update-only ingest first.
- Preserve clear ownership split with `User Service`.
- Defer threshold tuning until after the basic data model is working.
- Avoid unnecessary infrastructure on the first iteration.

## Stage 01 — Freeze Service Vocabulary and Contracts

Goal:

- Remove naming ambiguity before any implementation begins.

Tasks:

- Choose the final service name used in repository, configuration, and docs.
- Freeze the country-related domain terms:
  - `declared_country`
  - `observed_country`
  - `usual_connection_country`
  - `country_review_recommended`
- Freeze cross-service ownership rules.
- Write a short internal ADR describing why the latest `declared_country` lives in `User Service` while version history lives in Geo Profile Service.
- Write a short internal ADR describing why the edge path is async FlatBuffers instead of request-response RPC.

Exit criteria:

- No domain term remains overloaded or unclear.
- No service boundary question remains unresolved.

## Stage 02 — Define the Minimal Domain Model

Goal:

- Describe the persistent state before choosing transport or storage details.

Tasks:

- Define conceptual entities and their relationships.
- Freeze mandatory fields for:
  - country observation
  - per-session country ranking
  - review flag state
  - declared country version history
  - session block request log
- Decide which timestamps are mandatory on each entity.
- Decide whether optional hashed IP storage exists at all in v1.
- Decide whether declared-country version records need explicit lifecycle state:
  - `recorded`
  - `applied`
  - `sync_failed`

Recommended minimal entities:

- `country_observation`
- `device_session_country_score`
- `user_review_state`
- `declared_country_version`
- `session_block_action`

Exit criteria:

- The storage layer can be designed directly from the domain model.
- The model reflects all agreed semantics and no extra features.

## Stage 03 — Design the Ingest Message Schema

Goal:

- Freeze the binary contract from `Edge Service` to Geo Profile Service.

Tasks:

- Create the FlatBuffers schema for the async ingest message.
- Limit message fields to:
  - `user_id`
  - `device_session_id`
  - `ip_address`
- Define allowed field types and byte layout.
- Define message versioning strategy for future backward-compatible additions.
- Decide how schema version is represented.
- Define receiver behavior for malformed messages.

Important constraints:

- No protobuf wrapper.
- No business reply payload.
- No external validation of identifiers.
- Only schema-level validation on receipt.

Exit criteria:

- `Edge Service` and Geo Profile Service can generate compatible FlatBuffers code.
- Message evolution path exists without breaking v1.

## Stage 04 — Choose and Implement the Async Ingest Transport

Goal:

- Implement the simplest possible binary ingress path that does not behave like normal RPC.

Tasks:

- Choose the concrete transport for internal binary publication.
- Recommended default:
  - internal HTTP endpoint
  - `application/octet-stream`
  - FlatBuffers body
  - empty response body
  - status-only acknowledgement
- Implement the receiver endpoint in Geo Profile Service.
- Implement an async publisher client in `Edge Service`.
- Ensure the edge client publishes out-of-band from the main request execution path.
- Ensure the edge ignores publication failures for request progression.
- Add metrics for publish attempts, successes, and failures.

Important note:

- The edge path must remain operational even if Geo Profile Service is completely unavailable.

Exit criteria:

- The edge can publish authenticated observations without blocking the main API flow.
- Transport failures do not change edge business behavior.
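
One way to keep publication off the request path is a bounded in-process buffer drained by a background thread. The following is a minimal Python sketch under stated assumptions, not the chosen design: `FireAndForgetPublisher`, `send`, `max_pending`, and the `dropped` counter are hypothetical names, and `send` stands in for whatever transport call Stage 04 settles on.

```python
import queue
import threading
from typing import Callable


class FireAndForgetPublisher:
    """Hands payloads to a background thread; never blocks or raises to the caller."""

    def __init__(self, send: Callable[[bytes], None], max_pending: int = 1000) -> None:
        self._send = send  # hypothetical transport call, e.g. an internal HTTP POST
        self._pending: queue.Queue = queue.Queue(maxsize=max_pending)
        self.metrics = {"attempts": 0, "successes": 0, "failures": 0, "dropped": 0}
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def publish(self, payload: bytes) -> None:
        """Called on the request path: enqueue or drop, never wait."""
        try:
            self._pending.put_nowait(payload)
        except queue.Full:
            self.metrics["dropped"] += 1

    def _run(self) -> None:
        while True:
            payload = self._pending.get()
            if payload is None:  # shutdown marker
                break
            self.metrics["attempts"] += 1
            try:
                self._send(payload)
                self.metrics["successes"] += 1
            except Exception:
                # Failures are counted and swallowed; the request path never sees them.
                self.metrics["failures"] += 1

    def close(self) -> None:
        """Drain what is already queued, then stop the worker."""
        self._pending.put(None)
        self._worker.join()
```

Dropping on a full buffer is a deliberate trade-off in this sketch: it bounds memory on the edge at the cost of losing observations while Geo Profile Service is down.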
## Stage 05 — Build the Internal Durable Queue

Goal:

- Decouple acceptance of ingress messages from their processing.

Tasks:

- Select the simplest queue implementation inside the service.
- Prefer a durable queue over an in-memory-only queue.
- Implement enqueue-on-receive behavior.
- Implement worker dequeue behavior.
- Define queue item lifecycle:
  - accepted
  - processing
  - processed
  - failed
- Define retry strategy for worker failures.
- Define dead-letter or failure-handling strategy if retries are exhausted.
- Add queue metrics:
  - depth
  - oldest item age
  - processing rate
  - failure count

Recommended starting point:

- Database-backed queue table or similarly simple durable append structure.

Exit criteria:

- Geo Profile Service can accept messages quickly and process them later.
- Worker failures do not lose already accepted work silently.
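
The database-backed queue and its item lifecycle can be sketched with Python's standard `sqlite3` module. This is an illustration only: the table name `ingest_queue`, its columns, and `MAX_ATTEMPTS` are assumptions, and the claim step below is safe for a single worker only (multiple workers would need an atomic claim).

```python
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS ingest_queue (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    payload BLOB NOT NULL,
    state TEXT NOT NULL DEFAULT 'accepted',
    attempts INTEGER NOT NULL DEFAULT 0,
    accepted_at REAL NOT NULL
)
"""

MAX_ATTEMPTS = 3  # hypothetical retry budget


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def enqueue(conn: sqlite3.Connection, payload: bytes) -> int:
    """Enqueue-on-receive: store the raw message in state 'accepted'."""
    with conn:
        cur = conn.execute(
            "INSERT INTO ingest_queue (payload, accepted_at) VALUES (?, ?)",
            (payload, time.time()),
        )
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection):
    """Move the oldest accepted item to 'processing'; return (id, payload) or None."""
    with conn:
        row = conn.execute(
            "SELECT id, payload FROM ingest_queue "
            "WHERE state = 'accepted' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE ingest_queue SET state = 'processing', attempts = attempts + 1 "
            "WHERE id = ?",
            (row[0],),
        )
    return row


def finish(conn: sqlite3.Connection, item_id: int, ok: bool) -> None:
    """Mark processed, requeue for retry, or park as 'failed' once retries run out."""
    with conn:
        if ok:
            conn.execute(
                "UPDATE ingest_queue SET state = 'processed' WHERE id = ?", (item_id,)
            )
        else:
            conn.execute(
                "UPDATE ingest_queue SET state = "
                "CASE WHEN attempts >= ? THEN 'failed' ELSE 'accepted' END "
                "WHERE id = ?",
                (MAX_ATTEMPTS, item_id),
            )
```

Items in state `failed` stay in the table, which gives a simple dead-letter view, and queue depth and oldest item age can be read with plain queries over `state` and `accepted_at`.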
## Stage 06 — Add Local Geo-IP Resolution

Goal:

- Resolve country from IP locally and cheaply.

Tasks:

- Choose the Geo-IP database for v1.
- Add a loader for the local country database.
- Implement lookup adapter for IP to country.
- Define how unknown, invalid, or non-resolvable IPs are handled.
- Add a periodic database refresh job.
- Add health signals for Geo-IP database presence and age.

Design constraints:

- Country only.
- No external network lookup during request processing.
- No Geo-IP version persistence with each observation.

Exit criteria:

- Workers can resolve country from IP locally.
- Geo-IP database refresh is operationally manageable.
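
The lookup adapter and its handling of unknown or invalid addresses might look like the sketch below. It assumes a hypothetical in-memory table of CIDR ranges in place of the real Geo-IP database reader (which the plan has not chosen yet), and a hypothetical `UNKNOWN_COUNTRY` sentinel; the rows passed in are illustrative, not real Geo-IP data.

```python
import ipaddress

UNKNOWN_COUNTRY = "ZZ"  # hypothetical sentinel for invalid or unresolvable addresses


class CountryLookup:
    """Resolves an IP string to a country code from a local table, never over the network."""

    def __init__(self, ranges: list[tuple[str, str]]) -> None:
        # Each entry is (CIDR range, country code), parsed once at load time.
        self._ranges = [(ipaddress.ip_network(cidr), country) for cidr, country in ranges]

    def resolve(self, ip_text: str) -> str:
        try:
            ip = ipaddress.ip_address(ip_text)
        except ValueError:
            return UNKNOWN_COUNTRY  # malformed input
        if ip.is_private or ip.is_loopback:
            return UNKNOWN_COUNTRY  # not a routable public address
        for network, country in self._ranges:
            if ip.version == network.version and ip in network:
                return country
        return UNKNOWN_COUNTRY  # lookup miss
```

Returning a sentinel instead of raising keeps workers moving on bad input; whether such observations are stored or skipped is the decision the task list above leaves open.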
## Stage 07 — Persist Observation Facts

Goal:

- Materialize `observed_country` as stored domain facts.

Tasks:

- Implement the observation persistence model.
- Store at minimum:
  - `user_id`
  - `device_session_id`
  - `observed_country`
  - observation time
- Decide whether observations are stored as full facts, time-bucketed facts, or a hybrid model.
- Keep storage bounded and suitable for later aggregation.
- Add read support needed for internal recalculation and admin inspection.

Constraints:

- Do not turn this into a raw per-request IP audit log.
- Prefer country-level facts over low-level network data.

Exit criteria:

- The service stores enough observed-country history to support ranking and review.

## Stage 08 — Implement Per-Session Country Ranking

Goal:

- Maintain ranked countries per `device_session_id`.

Tasks:

- Define the initial scoring algorithm based on recent activity with time decay.
- Implement score update on each processed observation.
- Persist ranked country scores per `device_session_id`.
- Define how ties are handled.
- Define how stale scores decay or are compacted over time.
- Expose enough state for later admin inspection.

Important constraints:

- No active-day model in v1.
- No heavy analytics pipeline.
- Keep updates cheap enough for continuous background processing.

Exit criteria:

- Each `device_session_id` has a current ranked country list.
- Ranking is stable and cheap to update.
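
As an illustration of scoring with decay, the sketch below uses exponential decay with a fixed half-life. This is one possible algorithm, not the one the plan mandates: the half-life value, the `+1.0` increment per observation, and the alphabetical tie-break are all assumptions.

```python
import math

HALF_LIFE_SECONDS = 7 * 24 * 3600  # hypothetical: a score halves after one week


def decayed(score: float, elapsed_seconds: float) -> float:
    """Exponentially decay a score over the elapsed time."""
    return score * math.pow(0.5, elapsed_seconds / HALF_LIFE_SECONDS)


def apply_observation(
    scores: dict[str, tuple[float, float]], country: str, now: float
) -> None:
    """scores maps country -> (score, last_update_time); updated in place."""
    score, updated_at = scores.get(country, (0.0, now))
    scores[country] = (decayed(score, now - updated_at) + 1.0, now)


def ranked(
    scores: dict[str, tuple[float, float]], now: float
) -> list[tuple[str, float]]:
    """Countries by decayed score, highest first; ties broken alphabetically."""
    current = [(c, decayed(s, now - t)) for c, (s, t) in scores.items()]
    return sorted(current, key=lambda item: (-item[1], item[0]))
```

Storing the last update time next to each score keeps the update cheap: only the observed country's row is touched, and other scores are decayed lazily when read.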
## Stage 09 — Compute usual_connection_country

Goal:

- Derive a current per-session representative country from the ranking.

Tasks:

- Define the selection rule for the top country.
- Decide whether a minimum score or minimum margin is needed before setting a value.
- Persist the current `usual_connection_country` per `device_session_id`.
- Add recalculation hooks when session country scores change.
- Add tests for common drift scenarios:
  - one stable country
  - gradual shift over time
  - alternating countries
  - sparse activity

Exit criteria:

- `usual_connection_country` can be read directly without recomputing the full score set every time.
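
If a minimum score and a minimum margin are adopted, the selection rule could be as small as the sketch below. `MIN_SCORE` and `MIN_MARGIN` are hypothetical placeholders for thresholds that the plan defers to shadow-mode tuning, and the input is assumed to be a list of `(country, score)` pairs already sorted highest first.

```python
from typing import Optional

MIN_SCORE = 3.0   # hypothetical: ignore sessions with too little activity
MIN_MARGIN = 1.0  # hypothetical: require a clear lead over the runner-up


def select_usual_country(ranked: list[tuple[str, float]]) -> Optional[str]:
    """Return the top country, or None when the evidence is too weak or too close."""
    if not ranked:
        return None
    top_country, top_score = ranked[0]
    if top_score < MIN_SCORE:
        return None  # sparse activity
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    if top_score - runner_up < MIN_MARGIN:
        return None  # alternating countries, no clear winner
    return top_country
```

Whether a `None` result clears the persisted value or leaves the previous one in place is a separate decision that this sketch does not take.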
## Stage 10 — Implement Review Recommendation State

Goal:

- Persist and expose `country_review_recommended`.

Tasks:

- Define the initial rule that sets the review flag.
- Persist review state at user level.
- Detect transitions from `false` to `true`.
- Ensure repeated writes do not keep re-emitting the same transition indefinitely.
- Add API access for reading the flag.
- Add background recalculation entry points if the rule changes later.

Design requirement:

- Review state must live in storage and be queryable even if event delivery fails.

Exit criteria:

- The flag is durable, queryable, and transition-aware.
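
Transition awareness comes down to comparing the stored value with the new one before writing. The sketch below shows that comparison with an in-memory stand-in for durable storage; `ReviewStateStore` and `set_flag` are hypothetical names, and the rule that computes `recommended` is out of scope here.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewStateStore:
    """In-memory stand-in for durable user-level review state."""

    flags: dict[str, bool] = field(default_factory=dict)
    transitions: list[str] = field(default_factory=list)

    def set_flag(self, user_id: str, recommended: bool) -> bool:
        """Store the flag; return True only on a false -> true transition."""
        previous = self.flags.get(user_id, False)
        self.flags[user_id] = recommended
        became_true = recommended and not previous
        if became_true:
            # Only this branch would trigger an event or email in later stages.
            self.transitions.append(user_id)
        return became_true
```

Because the flag is written before anything is emitted, the stored state stays correct even when a downstream notification is lost.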
## Stage 11 — Publish Review Events and Optional Email

Goal:

- Add auxiliary notifications for review-worthy users.

Tasks:

- Define the event payload for `country_review_recommended=true`.
- Implement event publication on transition to `true`.
- Implement configuration-driven email notification through `Mail Service`.
- Add notification deduplication or transition-only logic to prevent spam.
- Add failure metrics for both event publication and mail send.

Important constraints:

- The event bus is not the authoritative source of truth.
- Email is optional and non-blocking for business correctness.

Exit criteria:

- Review transitions can notify administrators without becoming a dependency for state correctness.

## Stage 12 — Implement Suspicious Multi-Country Session Detection

Goal:

- Detect suspicious short-window cross-country behavior across sessions of the same user.

Tasks:

- Define the initial heuristic for suspicious mixed-country windows.
- Decide which session becomes the target of blocking when a conflict appears.
- Implement detection logic using stored observations and/or per-session summaries.
- Add persistence for suspicion evidence or at least action logs.
- Keep the heuristic configurable, not hard-coded deep in the codebase.

Important constraints:

- The current triggering request is allowed to continue.
- Only suspicious `device_session_id` values are blocked.
- The entire user account is never blocked by this service.

Exit criteria:

- The service can identify suspicious session patterns and produce a block action request.
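
One candidate heuristic, shown purely as a sketch, is to flag two different sessions of the same user that are observed in different countries within a short time window. The window length, the `Observation` shape, and the function name are assumptions; the input is assumed to contain one user's observations only, and choosing which of the conflicting sessions to block is left to the caller.

```python
from dataclasses import dataclass

WINDOW_SECONDS = 600  # hypothetical and meant to be configurable: 10 minutes


@dataclass(frozen=True)
class Observation:
    device_session_id: str
    country: str
    observed_at: float  # seconds since epoch


def find_conflicting_sessions(
    observations: list[Observation], window: float = WINDOW_SECONDS
) -> set[str]:
    """Return ids of sessions seen in different countries within the window."""
    ordered = sorted(observations, key=lambda o: o.observed_at)
    conflicting: set[str] = set()
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.observed_at - first.observed_at > window:
                break  # later observations are even further away
            different_session = second.device_session_id != first.device_session_id
            if different_session and second.country != first.country:
                conflicting.update({first.device_session_id, second.device_session_id})
    return conflicting
```

A single session that changes country is not flagged here, since the goal above concerns behavior across sessions of the same user.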
## Stage 13 — Integrate Session Blocking with Auth / Session Service

Goal:

- Make suspicious session handling operational.

Tasks:

- Define the internal API contract for session blocking.
- Implement the client toward `Auth / Session Service`.
- Ensure block requests are idempotent.
- Record block requests and outcomes locally for inspection.
- Add retry or failure-handling policy for temporary downstream failures.
- Add metrics for block attempts, successes, and failures.

Exit criteria:

- Geo Profile Service can request blocking of suspicious sessions and track the result.
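
Idempotency and local outcome tracking can be combined by keying every block request on a stable identifier. In the sketch below, `SessionBlockClient`, the `user_id:device_session_id` key format, and the `send_block` callable are hypothetical; the real contract with `Auth / Session Service` is still to be defined in this stage.

```python
from typing import Callable


class SessionBlockClient:
    """Sends block requests at most once per session after success and logs outcomes."""

    def __init__(self, send_block: Callable[[str, str], None]) -> None:
        self._send_block = send_block  # hypothetical call into Auth / Session Service
        self.log: dict[str, str] = {}  # idempotency key -> last outcome

    def request_block(self, user_id: str, device_session_id: str) -> str:
        key = f"{user_id}:{device_session_id}"
        if self.log.get(key) == "blocked":
            return "blocked"  # already done; do not call downstream again
        try:
            # The key travels with the request so the receiver can deduplicate too.
            self._send_block(device_session_id, key)
            self.log[key] = "blocked"
        except Exception:
            self.log[key] = "failed"  # left for a later retry
        return self.log[key]
```

A retry job would simply call `request_block` again for entries whose outcome is `failed`.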
## Stage 14 — Implement Declared Country Version History

Goal:

- Add versioned history of `declared_country` inside Geo Profile Service.

Tasks:

- Define the version record schema.
- Persist all approved changes as immutable version records.
- Add actor metadata needed for internal audit:
  - who triggered the change
  - when it happened
  - optional reason or comment
- Implement version lifecycle state if adopted:
  - `recorded`
  - `applied`
  - `sync_failed`
- Add read support for history in admin APIs.

Important constraint:

- Version history is owned only by Geo Profile Service.

Exit criteria:

- The service can preserve the full change history independently from `User Service`.

## Stage 15 — Implement Current Country Sync to User Service

Goal:

- Keep the latest effective `declared_country` centralized in `User Service`.

Tasks:

- Define the internal REST contract to update current `declared_country` in `User Service`.
- Implement synchronous update from Geo Profile Service.
- Ensure that a history version does not become effective until the sync succeeds.
- Implement failure handling and status persistence when sync fails.
- Add retry tooling or operator visibility for failed syncs.

Design requirement:

- No other service should bypass this write path.

Exit criteria:

- Approved changes update both version history and current user state without silent divergence.
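
The write-then-sync order and the `recorded` / `applied` / `sync_failed` lifecycle can be sketched as follows. This assumes the lifecycle state is adopted; `CountryVersion`, `apply_country_change`, and the `sync_to_user_service` callable are hypothetical, and an in-memory list stands in for the version table. Only the `state` field changes after a record is written.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CountryVersion:
    user_id: str
    country: str
    actor: str
    state: str = "recorded"


def apply_country_change(
    history: list[CountryVersion],
    version: CountryVersion,
    sync_to_user_service: Callable[[str, str], None],
) -> CountryVersion:
    """Record the version first, then sync; mark it applied only if the sync succeeds."""
    history.append(version)  # step 1: write to the service's own storage
    try:
        sync_to_user_service(version.user_id, version.country)  # step 2: hypothetical REST call
    except Exception:
        version.state = "sync_failed"  # visible to operators, retryable later
    else:
        version.state = "applied"
    return version


def effective_country(history: list[CountryVersion], user_id: str) -> Optional[str]:
    """The latest version that actually reached User Service, if any."""
    applied = [v.country for v in history if v.user_id == user_id and v.state == "applied"]
    return applied[-1] if applied else None
```

A version stuck in `sync_failed` stays in the history but never counts as effective, which is what prevents silent divergence between the two services.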
## Stage 16 — Build the Internal Read APIs

Goal:

- Expose the minimum trusted JSON REST API required for operations and admin tooling.

Tasks:

- Implement review-candidate listing endpoint.
- Support at least:
  - `country_review_recommended=true`
  - pagination
  - stable ordering
- Implement user geo-profile endpoint.
- Group returned data by `device_session_id`.
- Include:
  - review flag
  - per-session ranked countries
  - `usual_connection_country`
  - observation summaries
  - declared country history
  - block-action history if useful
- Add authentication and authorization appropriate for trusted internal callers.

Exit criteria:

- Admin tools can list users for review and inspect full geo-related user state.
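
Pagination with stable ordering is often done with a cursor on a unique sort key, so that pages do not shift when rows are added between requests. The sketch below assumes ordering by `user_id` and a hypothetical `after` cursor parameter; a real implementation would push the filter and limit into the storage query instead of sorting in memory.

```python
from typing import Optional


def list_review_candidates(
    flagged_user_ids: list[str], page_size: int, after: Optional[str] = None
) -> tuple[list[str], Optional[str]]:
    """Return one page ordered by user_id plus a cursor for the next page (None when done)."""
    ordered = sorted(flagged_user_ids)
    if after is not None:
        ordered = [u for u in ordered if u > after]  # resume strictly after the cursor
    page = ordered[:page_size]
    next_cursor = page[-1] if len(ordered) > page_size else None
    return page, next_cursor
```

The caller passes the returned cursor back as `after` to fetch the following page and stops when the cursor is `None`.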
## Stage 17 — Build the Internal Command API for Country Change Application

Goal:

- Expose the internal command path for approved `declared_country` changes.

Tasks:

- Implement the trusted internal command endpoint.
- Accept the approved new country and actor metadata.
- Write the new version record.
- Synchronize current value into `User Service`.
- Return success only if the change is fully applied.
- Return a recoverable failure state if sync fails.

Clarification:

- Public user-facing request creation is outside this service boundary unless explicitly added later.
- This command API is for internal orchestration of approved changes.

Exit criteria:

- Admin or internal orchestration can apply a country change through one controlled path.

## Stage 18 — Add Admin-Oriented Data Shaping

Goal:

- Make the returned data useful for manual decisions without overloading the API consumer.

Tasks:

- Shape user geo-profile responses around manual review needs.
- Include compact ranked-country views per session.
- Include enough timestamps to understand temporal drift.
- Include current review recommendation state.
- Include declared-country version chain in a readable order.
- Avoid leaking unnecessary low-level network data.

Exit criteria:

- The admin interface can render useful country history and session separation without extra joins.

## Stage 19 — Add Observability and Operational Controls

Goal:

- Make the service operable in production before traffic ramps up.

Tasks:

- Add metrics for every critical path:
  - ingest publish receipt
  - queue depth and lag
  - worker throughput
  - Geo-IP lookup failures
  - ranking updates
  - review-flag transitions
  - block requests
  - user-service sync failures
  - mail and event failures
- Add structured logs with correlation identifiers where possible.
- Add readiness and liveness endpoints.
- Add dashboards and alerts for:
  - queue lag
  - persistent sync failures
  - spike in suspicious session blocks
  - Geo-IP database stale age

Exit criteria:

- Production operation does not depend on manual log-grepping.

## Stage 20 — Add Test Coverage in Increasing Layers

Goal:

- Validate the service incrementally, from pure logic up to full integration.

Tasks:

- Add unit tests for:
  - Geo-IP lookup adapter
  - ranking logic
  - `usual_connection_country` selection
  - review recommendation logic
  - suspicious session detection
- Add storage tests for:
  - observation persistence
  - version history
  - queue behavior
- Add integration tests for:
  - edge-style ingest acceptance
  - worker processing
  - `User Service` sync behavior
  - `Auth / Session Service` block calls
  - event and mail side effects
- Add failure-path tests:
  - malformed FlatBuffers payload
  - queue retry
  - Geo-IP lookup miss
  - `User Service` sync failure
  - block-request downstream failure

Exit criteria:

- The highest-risk logic and all external integrations are covered.

## Stage 21 — Add Data Migration and Backfill Strategy

Goal:

- Prepare for safe rollout in an existing microservice environment.

Tasks:

- Create initial database migrations.
- Define zero-data bootstrap behavior for new users and sessions.
- Define how existing users with already populated `declared_country` in `User Service` appear in Geo Profile Service before any version history exists.
- Decide whether an initial synthetic version record is needed for current production users.
- Add operational scripts for repair and backfill if required.

Exit criteria:

- The service can be introduced without corrupting current user country state.

## Stage 22 — Roll Out in Shadow Mode

Goal:

- Validate the service behavior before relying on its outputs operationally.

Tasks:

- Deploy Geo Profile Service without enabling admin actions or session blocking.
- Publish ingest data from edge asynchronously.
- Process observations and compute derived state silently.
- Observe queue behavior, lookup correctness, score stability, and storage growth.
- Compare resulting data shape against expected real traffic behavior.
- Tune thresholds for:
  - review recommendation
  - suspicious mixed-country detection
  - score decay

Exit criteria:

- The service behaves sanely on production-shaped traffic without affecting users.

## Stage 23 — Enable Review Workflow

Goal:

- Turn on the first functionality that real internal consumers rely on.

Tasks:

- Enable review-candidate listing in the admin interface.
- Enable user geo-profile rendering.
- Enable approved country-change application path.
- Keep session blocking disabled if needed for a staged rollout.
- Verify that `User Service` stays consistent with declared-country version history.

Exit criteria:

- Administrators can inspect users and apply country changes safely.

## Stage 24 — Enable Suspicious Session Blocking

Goal:

- Turn on the account-protection part of the service.

Tasks:

- Enable session-block command emission to `Auth / Session Service`.
- Start with conservative thresholds.
- Monitor false positives closely.
- Add temporary operational kill-switches for the detection path.
- Verify that only suspicious sessions are blocked and not entire accounts.

Exit criteria:

- The service can protect accounts without destabilizing the rest of the platform.

## Stage 25 — Stabilize and Simplify

Goal:

- Remove accidental complexity after the first complete iteration.

Tasks:

- Review actual queue backlog behavior.
- Review observation retention cost.
- Review whether optional hashed IP storage is still unnecessary.
- Review scoring tunability versus implementation complexity.
- Remove dead code and speculative abstractions.
- Freeze the v1 API once real consumers are stable.

Exit criteria:

- The service remains small, understandable, and aligned with its original narrow purpose.

## Delivery Sequence Summary

Recommended delivery order:

- Domain vocabulary and ownership
- Domain model
- FlatBuffers schema
- Async ingest transport
- Internal durable queue
- Geo-IP lookup
- Observation persistence
- Session ranking
- `usual_connection_country`
- Review state
- Event and mail notifications
- Suspicious-session detection
- Session blocking integration
- Declared-country versioning
- Sync to `User Service`
- Admin read API
- Country-change command API
- Admin-oriented data shaping
- Observability
- Tests
- Data migration and backfill
- Shadow rollout
- Review enablement
- Blocking enablement
- Cleanup

## Final Acceptance Criteria

The implementation may be considered complete for v1 when all of the following are true:

- `Edge Service` publishes observations for authenticated requests asynchronously without affecting request processing.
- Geo Profile Service resolves and stores `observed_country`.
- The service maintains per-`device_session_id` country ranking and `usual_connection_country`.
- `country_review_recommended` is durable, queryable, and not event-dependent.
- Admin tooling can fetch review candidates and per-user geo profiles.
- Approved `declared_country` changes are versioned in Geo Profile Service and synchronized into `User Service`.
- Suspicious sessions can be blocked through `Auth / Session Service`.
- Optional email and event notifications work without becoming correctness dependencies.
- The service is observable and operable under real traffic.

@@ -0,0 +1,929 @@
# Geo Profile Service

## Context and Purpose

Geo Profile Service is an internal trusted microservice responsible for collecting and processing country-level connection signals for authenticated users.

The service exists to solve four related problems:

- Record the observed country of authenticated requests based on local Geo-IP lookup.
- Maintain per-`device_session_id` country statistics and derive a `usual_connection_country`.
- Support administrative review workflows around user country changes.
- Detect suspicious multi-country session behavior and request blocking of suspicious sessions through `Auth / Session Service`.

The service is intentionally narrow in scope. It does not own authentication, user identity validation, or user-facing profile reads for the latest country value.

## Explicit Non-Goals

The following are intentionally out of scope for this service:

- Region-level or city-level geolocation.
- VPN, proxy, anonymizer, or hosting-provider detection.
- Automatic change of `declared_country` based on observed metrics.
- Immediate blocking of the same request that triggered suspicion.
- Global source-of-truth ownership for the current user country.
- Direct exposure of storage to other services.
- Strong audit reproducibility of historical Geo-IP lookup results by storing Geo-IP database versions.

## Place in the Existing Microservice System

The service is embedded into an already existing trusted microservice environment and integrates with:

- `Edge Service`
- `Auth / Session Service`
- `User Service`
- `Mail Service`
- Internal event bus

`Edge Service` is the producer of authenticated connection observations.

`User Service` remains the centralized owner of the latest effective `declared_country` value for normal user profile reads.

`Auth / Session Service` remains the owner of session lifecycle and session blocking.

`Mail Service` is used only for optional administrative notifications.

The event bus is used only as an auxiliary notification channel and not as the authoritative source of business state.

## Responsibility Boundaries

Geo Profile Service owns:

- Geo-IP lookup at country level using a local database.
- Storage of `observed_country` as a fact of observation.
- Per-`device_session_id` country aggregation.
- Computation of `usual_connection_country`.
- Computation and storage of `country_review_recommended`.
- Version history of `declared_country`.
- Internal administrative read APIs for geo-related user state.
- Internal command API to apply approved `declared_country` changes.
- Detection of suspicious cross-country session patterns.
- Session block requests toward `Auth / Session Service`.

Geo Profile Service does not own:

- Validation of `user_id` and `device_session_id` against external services.
- Public user profile reads for the latest country value.
- Authentication or authorization of end users.
- Final enforcement of session blocking.
- Delivery guarantees of auxiliary event notifications.
- Formal administrative SLA or rigid approval policies.

## Semantic Model

The service works with four core country-related concepts.

### declared_country

`declared_country` is the user-declared country.

Properties:

- It is a user-facing business attribute.
- The latest effective value is stored in `User Service`.
- The full version history is stored in Geo Profile Service.
- It is never changed automatically by metrics.
- It changes only through a controlled command path and administrative approval.

### observed_country

`observed_country` is the country derived from Geo-IP for a specific authenticated request.

Properties:

- It is an observation fact, not a truth claim about residence.
- It is tied to `user_id`, `device_session_id`, and observation time.
- It is derived on the server side from the source IP seen at the trusted edge.
- It is used as input into country aggregation and anomaly detection.

### usual_connection_country

`usual_connection_country` is the computed most typical country of network egress for a given `device_session_id`.

Properties:

- It is not interpreted as country of residence.
- It is calculated per `device_session_id`, not globally per account.
- It is derived from recent weighted observations with decay over time.
- It is expected to drift naturally as usage patterns change.

### country_review_recommended

`country_review_recommended` is an internal service flag that indicates that the accumulated observations justify administrative review.

Properties:

- It does not trigger automatic country change.
- It is stored durably in the service state.
- It is readable through the service API.
- Transition to `true` may also emit an event and optionally send email.

## Data Ownership Rules
|
||||||
|
|
||||||
|
The split ownership model is intentional.
|
||||||
|
|
||||||
|
- `User Service` owns the latest effective `declared_country`.
|
||||||
|
- Geo Profile Service owns the history of `declared_country` changes.
|
||||||
|
- Geo Profile Service owns `observed_country`, `usual_connection_country`, and `country_review_recommended`.
|
||||||
|
|
||||||
|
This means Geo Profile Service is the owner of the country-change process, but `User Service` is the owner of the currently effective denormalized value used by the rest of the system.
|
||||||
|
|
||||||
|
To avoid divergence:
|
||||||
|
|
||||||
|
- No service other than Geo Profile Service should directly mutate the current `declared_country` in `User Service`.
|
||||||
|
- Geo Profile Service must write the new version in its own storage first.
|
||||||
|
- Geo Profile Service must then synchronously update the current value in `User Service`.
|
||||||
|
- A version should become effective only after the `User Service` update succeeds.
|
||||||
|
|
||||||
|
## High-Level Architecture
|
||||||
|
|
||||||
|
```mermaid
flowchart LR
    Client[Client] --> Edge[Edge Service]
    Edge --> Auth[Auth / Session Service]
    Auth --> Edge

    Edge -. async flatbuffers ingest .-> Geo[Geo Profile Service]

    Geo --> User[User Service]
    Geo --> Mail[Mail Service]
    Geo --> Bus[Event Bus]
    Geo --> Auth

    AdminUI[Admin Interface] --> Edge
    Edge --> Geo
    Edge --> User
```
|
||||||
|
|
||||||
|
## Ingress Processing Model
|
||||||
|
|
||||||
|
The hot path from `Edge Service` to Geo Profile Service is intentionally asynchronous and non-blocking for the edge.
|
||||||
|
|
||||||
|
Design rules:
|
||||||
|
|
||||||
|
- `Edge Service` publishes a minimal FlatBuffers message after user authentication.
|
||||||
|
- The message contains only:
|
||||||
|
|
||||||
|
- `user_id`
|
||||||
|
- `device_session_id`
|
||||||
|
- `ip_address`
|
||||||
|
- No protobuf wrapper is used.
|
||||||
|
- No business response is required from Geo Profile Service.
|
||||||
|
- The edge does not depend on this service for normal request continuation.
|
||||||
|
- Failures are treated as observability signals, not as reasons to change gateway behavior.
|
||||||
|
|
||||||
|
This design explicitly prioritizes low infrastructure complexity and low overhead on the hottest path over strict RPC semantics.
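
The fire-and-forget rule above can be sketched on the edge side as follows. This is a minimal illustration, not the agreed implementation: the class name `ObservationPublisher`, the `send` callable (standing in for whatever internal transport is chosen), the `max_pending` bound, and the two counters used in place of real metrics are all assumptions.

```python
import queue
import threading
from typing import Callable


class ObservationPublisher:
    """Fire-and-forget publisher: the request path only enqueues and never waits."""

    def __init__(self, send: Callable[[bytes], None], max_pending: int = 1024) -> None:
        self._send = send  # hypothetical transport call, e.g. an internal HTTP POST
        self._pending: queue.Queue = queue.Queue(maxsize=max_pending)
        self.dropped = 0   # stand-in for a "publish dropped" metric
        self.failed = 0    # stand-in for a "publish failed" metric
        threading.Thread(target=self._drain, daemon=True).start()

    def publish(self, payload: bytes) -> None:
        """Called on the hot path; returns immediately and never raises."""
        try:
            self._pending.put_nowait(payload)
        except queue.Full:
            self.dropped += 1  # shed load instead of blocking the edge

    def _drain(self) -> None:
        """Background loop that performs the actual sends."""
        while True:
            payload = self._pending.get()
            try:
                self._send(payload)
            except Exception:
                self.failed += 1  # observability signal only, no retry here
            finally:
                self._pending.task_done()

    def flush(self) -> None:
        """Helper for tests: block until every queued payload was attempted."""
        self._pending.join()
```

A transport failure only increments a counter; `publish` itself has no way to fail the caller's request, which is the property the design rules ask for.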
|
||||||
|
|
||||||
|
## Ingress Transport Contract
|
||||||
|
|
||||||
|
The ingress path is not modeled as conventional request-response RPC.
|
||||||
|
|
||||||
|
Recommended transport shape:
|
||||||
|
|
||||||
|
- Internal binary HTTP endpoint or similarly simple internal binary transport.
|
||||||
|
- `application/octet-stream` body encoded as FlatBuffers.
|
||||||
|
- Minimal acknowledgement such as `202 Accepted` with empty body.
|
||||||
|
- The acknowledgement is not part of business logic.
|
||||||
|
- The edge client should publish asynchronously; the availability of Geo Profile Service must not affect request progression.
|
||||||
|
|
||||||
|
The service must validate only:
|
||||||
|
|
||||||
|
- FlatBuffers message integrity.
|
||||||
|
- Presence of required scalar fields.
|
||||||
|
- Basic field shape constraints.
|
||||||
|
|
||||||
|
The service must not validate:
|
||||||
|
|
||||||
|
- Whether `user_id` exists.
|
||||||
|
- Whether `device_session_id` belongs to the user.
|
||||||
|
- Whether the session is still valid.
|
||||||
|
|
||||||
|
Those concerns belong to the already trusted authentication/session layer.
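
As a rough sketch of the shape-only validation described above, assuming the FlatBuffers payload has already been decoded into a plain object: the `IngestMessage` dataclass, the choice of string identifiers, and the `MAX_ID_LENGTH` limit are illustrative assumptions, not part of the agreed contract.

```python
import ipaddress
from dataclasses import dataclass

MAX_ID_LENGTH = 128  # assumed upper bound, not taken from the design


@dataclass(frozen=True)
class IngestMessage:
    """Decoded ingest payload; identifiers are assumed to be strings here."""
    user_id: str
    device_session_id: str
    ip_address: str


def validate_shape(msg: IngestMessage) -> list[str]:
    """Return a list of shape errors; an empty list means the message is accepted.

    Deliberately no lookups: whether the user or session exists is not checked.
    """
    errors = []
    for name in ("user_id", "device_session_id"):
        value = getattr(msg, name)
        if not value:
            errors.append(f"{name} is missing")
        elif len(value) > MAX_ID_LENGTH:
            errors.append(f"{name} is too long")
    try:
        ipaddress.ip_address(msg.ip_address)  # accepts IPv4 and IPv6 literals
    except ValueError:
        errors.append("ip_address is not a valid IPv4 or IPv6 address")
    return errors
```

The function never calls another service, which keeps the ingest receiver cheap and consistent with the trust placed in the authentication layer.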
|
||||||
|
|
||||||
|
## Internal Queue and Worker Pipeline
|
||||||
|
|
||||||
|
Geo Profile Service must process ingress data in its own queue and worker flow.
|
||||||
|
|
||||||
|
```mermaid
flowchart LR
    E[Edge Service] -. async flatbuffers publish .-> I[Ingest Receiver]
    I --> Q[Internal Ingest Queue]
    Q --> W[Processing Worker]
    W --> G[Geo-IP Resolver]
    G --> A[Observation Aggregator]
    A --> U[usual_connection_country Calculator]
    A --> R[country_review_recommended Evaluator]
    A --> S[Session Suspicion Detector]
    S --> B[Block Session Command]
```
|
||||||
|
|
||||||
|
The internal queue exists to decouple network acceptance from CPU and storage work.
|
||||||
|
|
||||||
|
Required properties:
|
||||||
|
|
||||||
|
- The network-facing ingest step is append/update-only.
|
||||||
|
- The worker can process observations independently from the ingest receiver.
|
||||||
|
- Expensive logic must not run inline on the network acceptance step.
|
||||||
|
- Queue backlog and processing latency must be observable.
|
||||||
|
|
||||||
|
A simple durable internal queue is preferred over a complex broker dependency for this part of the system.
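
The decoupling between acceptance and processing can be sketched as below. This is only an in-memory illustration: a real deployment would use the durable queue preferred above, and the names `accept`, `run_worker`, `resolve_country`, and `handle`, as well as the list used as a lag metric, are assumptions made for the example.

```python
import queue
import threading
import time
from typing import Callable

# In-memory stand-in for the internal ingest queue (not durable).
ingest_queue: queue.Queue = queue.Queue()
# Stand-in for a processing-lag metric.
processing_lag_seconds: list = []


def accept(message: dict) -> int:
    """Network-facing step: enqueue only, then acknowledge with 202."""
    ingest_queue.put((time.monotonic(), message))
    return 202


def run_worker(
    resolve_country: Callable[[str], str],
    handle: Callable[[str, str, str], None],
) -> None:
    """Worker loop: Geo-IP resolution and aggregation run here, not in accept()."""
    while True:
        item = ingest_queue.get()
        try:
            if item is None:  # sentinel used to stop the worker
                return
            enqueued_at, message = item
            processing_lag_seconds.append(time.monotonic() - enqueued_at)
            country = resolve_country(message["ip_address"])
            handle(message["user_id"], message["device_session_id"], country)
        finally:
            ingest_queue.task_done()
```

Queue depth is available through `ingest_queue.qsize()`, and the recorded lag values cover the second observability requirement.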
|
||||||
|
|
||||||
|
## Service Interface Model
|
||||||
|
|
||||||
|
The service interface is intentionally divided into commands, queries, and events.
|
||||||
|
|
||||||
|
This split exists to preserve the architectural rules already fixed above:
|
||||||
|
|
||||||
|
- Hot-path ingest is asynchronous and write-oriented.
|
||||||
|
- Administrative reads use trusted internal JSON REST APIs.
|
||||||
|
- State-changing administrative operations follow one controlled command path.
|
||||||
|
- Events are auxiliary notifications and never the only representation of business state.
|
||||||
|
|
||||||
|
## Commands
|
||||||
|
|
||||||
|
Commands change service state or trigger downstream effects.
|
||||||
|
|
||||||
|
### Ingest Connection Observation
|
||||||
|
|
||||||
|
Purpose:
|
||||||
|
|
||||||
|
- Accept an authenticated country observation from `Edge Service`.
|
||||||
|
|
||||||
|
Caller:
|
||||||
|
|
||||||
|
- `Edge Service`
|
||||||
|
|
||||||
|
Transport:
|
||||||
|
|
||||||
|
- Internal binary transport
|
||||||
|
- FlatBuffers payload
|
||||||
|
- Async publication
|
||||||
|
- No business response
|
||||||
|
|
||||||
|
Payload:
|
||||||
|
|
||||||
|
- `user_id`
|
||||||
|
- `device_session_id`
|
||||||
|
- `ip_address`
|
||||||
|
|
||||||
|
Effects:
|
||||||
|
|
||||||
|
- Enqueue observation for processing
|
||||||
|
- Eventually resolve `observed_country`
|
||||||
|
- Update per-session country statistics
|
||||||
|
- Potentially update `usual_connection_country`
|
||||||
|
- Potentially set `country_review_recommended`
|
||||||
|
- Potentially request session blocking through `Auth / Session Service`
|
||||||
|
|
||||||
|
Important behavior:
|
||||||
|
|
||||||
|
- This command must not block edge request processing.
|
||||||
|
- Failure to send or process is an observability concern, not a gateway correctness concern.
|
||||||
|
|
||||||
|
### Apply Approved Declared Country Change
|
||||||
|
|
||||||
|
Purpose:
|
||||||
|
|
||||||
|
- Record a new approved version of `declared_country` and synchronize the current value into `User Service`.
|
||||||
|
|
||||||
|
Caller:
|
||||||
|
|
||||||
|
- Trusted internal administrative workflow
|
||||||
|
- Administrative interface backend
|
||||||
|
- Internal orchestration component
|
||||||
|
|
||||||
|
Transport:
|
||||||
|
|
||||||
|
- Trusted internal JSON REST API
|
||||||
|
|
||||||
|
Input shape:
|
||||||
|
|
||||||
|
- `user_id`
|
||||||
|
- `new_declared_country`
|
||||||
|
- actor identity or actor type
|
||||||
|
- optional reason or comment
|
||||||
|
- optional correlation metadata
|
||||||
|
|
||||||
|
Effects:
|
||||||
|
|
||||||
|
- Create immutable declared-country version record in Geo Profile Service
|
||||||
|
- Synchronize latest effective value to `User Service`
|
||||||
|
- Mark version as effective only after sync succeeds
|
||||||
|
|
||||||
|
Important behavior:
|
||||||
|
|
||||||
|
- Geo Profile Service is the owner of this mutation workflow.
|
||||||
|
- No bypass write path to `User Service` should exist for this field.
|
||||||
|
|
||||||
|
### Request Suspicious Session Block
|
||||||
|
|
||||||
|
Purpose:
|
||||||
|
|
||||||
|
- Ask `Auth / Session Service` to block suspicious `device_session_id` values.
|
||||||
|
|
||||||
|
Caller:
|
||||||
|
|
||||||
|
- Internal processing worker inside Geo Profile Service
|
||||||
|
|
||||||
|
Transport:
|
||||||
|
|
||||||
|
- Trusted internal API call from Geo Profile Service to `Auth / Session Service`
|
||||||
|
|
||||||
|
Input shape:
|
||||||
|
|
||||||
|
- `user_id`
|
||||||
|
- one or more suspicious `device_session_id` values
|
||||||
|
- reason or code for block trigger
|
||||||
|
- optional evidence reference
|
||||||
|
|
||||||
|
Effects:
|
||||||
|
|
||||||
|
- Session block request is sent to `Auth / Session Service`
|
||||||
|
- Local action log is written in Geo Profile Service
|
||||||
|
|
||||||
|
Important behavior:
|
||||||
|
|
||||||
|
- The current triggering request is not interrupted.
|
||||||
|
- The effect is expected on subsequent requests.
|
||||||
|
|
||||||
|
## Queries
|
||||||
|
|
||||||
|
Queries return internal state and never mutate business state.
|
||||||
|
|
||||||
|
### List Review Candidates
|
||||||
|
|
||||||
|
Purpose:
|
||||||
|
|
||||||
|
- Return `user_id` values matching review-related filters.
|
||||||
|
|
||||||
|
Caller:
|
||||||
|
|
||||||
|
- Administrative interface
|
||||||
|
- Internal operational tooling
|
||||||
|
|
||||||
|
Transport:
|
||||||
|
|
||||||
|
- Trusted internal JSON REST API
|
||||||
|
|
||||||
|
Initial supported filter:
|
||||||
|
|
||||||
|
- `country_review_recommended=true`
|
||||||
|
|
||||||
|
Expected response characteristics:
|
||||||
|
|
||||||
|
- Pagination
|
||||||
|
- Stable ordering
|
||||||
|
- Ability to extend filter set later without changing the conceptual API class
|
||||||
|
|
||||||
|
### Read User Geo Profile
|
||||||
|
|
||||||
|
Purpose:
|
||||||
|
|
||||||
|
- Return the geo-related internal state of a single user for manual review or investigation.
|
||||||
|
|
||||||
|
Caller:
|
||||||
|
|
||||||
|
- Administrative interface
|
||||||
|
- Internal operational tooling
|
||||||
|
|
||||||
|
Transport:
|
||||||
|
|
||||||
|
- Trusted internal JSON REST API
|
||||||
|
|
||||||
|
Response should include, at minimum:
|
||||||
|
|
||||||
|
- `user_id`
|
||||||
|
- current `country_review_recommended`
|
||||||
|
- per-`device_session_id` country ranking
|
||||||
|
- per-`device_session_id` `usual_connection_country`
|
||||||
|
- observation summaries grouped by `device_session_id`
|
||||||
|
- declared-country version history
|
||||||
|
- suspicious-session indicators if present
|
||||||
|
- session-block action history if useful for operations
|
||||||
|
|
||||||
|
### Read Service Health and Operational State
|
||||||
|
|
||||||
|
Purpose:
|
||||||
|
|
||||||
|
- Expose service-operability information for internal monitoring.
|
||||||
|
|
||||||
|
Caller:
|
||||||
|
|
||||||
|
- Monitoring systems
|
||||||
|
- Internal operators
|
||||||
|
|
||||||
|
Transport:
|
||||||
|
|
||||||
|
- Internal HTTP endpoints
|
||||||
|
|
||||||
|
Response may include:
|
||||||
|
|
||||||
|
- readiness state
|
||||||
|
- liveness state
|
||||||
|
- queue lag indicators
|
||||||
|
- Geo-IP database status
|
||||||
|
- downstream integration health summaries
|
||||||
|
|
||||||
|
This query group is operational, not business-facing.
|
||||||
|
|
||||||
|
## Events
|
||||||
|
|
||||||
|
Events are emitted as auxiliary notifications. They are not sources of truth.
|
||||||
|
|
||||||
|
### Country Review Recommended
|
||||||
|
|
||||||
|
Meaning:
|
||||||
|
|
||||||
|
- `country_review_recommended` transitioned from `false` to `true` for a user.
|
||||||
|
|
||||||
|
Producer:
|
||||||
|
|
||||||
|
- Geo Profile Service
|
||||||
|
|
||||||
|
Consumers:
|
||||||
|
|
||||||
|
- Administrative workflow automation
|
||||||
|
- Internal notification consumers
|
||||||
|
- Optional future downstream internal systems
|
||||||
|
|
||||||
|
Delivery channel:
|
||||||
|
|
||||||
|
- Internal event bus
|
||||||
|
|
||||||
|
Guarantees:
|
||||||
|
|
||||||
|
- Best effort only, unless the underlying bus is later upgraded
|
||||||
|
- Loss of an event must not lose the actual business state
|
||||||
|
|
||||||
|
Durable state counterpart:
|
||||||
|
|
||||||
|
- The current review flag must remain available through Geo Profile Service query APIs
|
||||||
|
|
||||||
|
### Optional Admin Email Notification
|
||||||
|
|
||||||
|
Meaning:
|
||||||
|
|
||||||
|
- Administrative email generated because a user entered review-recommended state
|
||||||
|
|
||||||
|
Producer:
|
||||||
|
|
||||||
|
- Geo Profile Service via `Mail Service`
|
||||||
|
|
||||||
|
Consumers:
|
||||||
|
|
||||||
|
- Administrators
|
||||||
|
|
||||||
|
This is operationally useful but never required for correctness.
|
||||||
|
|
||||||
|
## Data Entities
|
||||||
|
|
||||||
|
This section defines the core logical entities of the service. These are domain entities, not mandatory final physical table names.
|
||||||
|
|
||||||
|
### Country Observation
|
||||||
|
|
||||||
|
Represents a stored observation fact derived from one authenticated request.
|
||||||
|
|
||||||
|
Required logical fields:
|
||||||
|
|
||||||
|
- `user_id`
|
||||||
|
- `device_session_id`
|
||||||
|
- `observed_country`
|
||||||
|
- observation timestamp
|
||||||
|
|
||||||
|
Optional implementation fields:
|
||||||
|
|
||||||
|
- obfuscated or hashed IP representation
|
||||||
|
- internal ingestion metadata
|
||||||
|
- processing metadata
|
||||||
|
|
||||||
|
Role in the system:
|
||||||
|
|
||||||
|
- Source data for rankings
|
||||||
|
- Source data for suspicious-session detection
|
||||||
|
- Source data for review recommendations
|
||||||
|
|
||||||
|
### Device Session Country Score
|
||||||
|
|
||||||
|
Represents the weighted ranking of countries for one `device_session_id`.
|
||||||
|
|
||||||
|
Required logical fields:
|
||||||
|
|
||||||
|
- `device_session_id`
|
||||||
|
- `country_code`
|
||||||
|
- current score
|
||||||
|
- last contribution timestamp
|
||||||
|
- optional rank or ordering marker
|
||||||
|
|
||||||
|
Role in the system:
|
||||||
|
|
||||||
|
- Maintains the rolling per-session country distribution
|
||||||
|
- Supports direct derivation of `usual_connection_country`
|
||||||
|
|
||||||
|
### Device Session Geo State
|
||||||
|
|
||||||
|
Represents the current derived geographic state of one `device_session_id`.
|
||||||
|
|
||||||
|
Required logical fields:
|
||||||
|
|
||||||
|
- `device_session_id`
|
||||||
|
- current `usual_connection_country`
|
||||||
|
- last observation timestamp
|
||||||
|
- summary metadata needed by admin APIs
|
||||||
|
|
||||||
|
Role in the system:
|
||||||
|
|
||||||
|
- Read-optimized representation of session-level geo state
|
||||||
|
- Allows admin APIs to avoid recomputing from raw observations on each read
|
||||||
|
|
||||||
|
### User Review State
|
||||||
|
|
||||||
|
Represents the current review-related state for one user.
|
||||||
|
|
||||||
|
Required logical fields:
|
||||||
|
|
||||||
|
- `user_id`
|
||||||
|
- `country_review_recommended`
|
||||||
|
- last evaluation timestamp
|
||||||
|
- optional reason code or explanation marker
|
||||||
|
|
||||||
|
Role in the system:
|
||||||
|
|
||||||
|
- Durable source for review filtering
|
||||||
|
- Source of truth for admin API candidate listing
|
||||||
|
- State backing for auxiliary event emission
|
||||||
|
|
||||||
|
### Declared Country Version
|
||||||
|
|
||||||
|
Represents one immutable version of the declared country.
|
||||||
|
|
||||||
|
Required logical fields:
|
||||||
|
|
||||||
|
- `user_id`
|
||||||
|
- version identifier
|
||||||
|
- `declared_country`
|
||||||
|
- version creation timestamp
|
||||||
|
- actor identity or actor type
|
||||||
|
- optional reason or comment
|
||||||
|
- version status
|
||||||
|
|
||||||
|
Suggested version statuses:
|
||||||
|
|
||||||
|
- `recorded`
|
||||||
|
- `applied`
|
||||||
|
- `sync_failed`
|
||||||
|
|
||||||
|
Role in the system:
|
||||||
|
|
||||||
|
- Immutable history of approved country changes
|
||||||
|
- Separation between local history and currently effective external value
|
||||||
|
|
||||||
|
### Session Block Action
|
||||||
|
|
||||||
|
Represents a record of a suspicious-session block request.
|
||||||
|
|
||||||
|
Required logical fields:
|
||||||
|
|
||||||
|
- `user_id`
|
||||||
|
- `device_session_id`
|
||||||
|
- action timestamp
|
||||||
|
- reason code
|
||||||
|
- result status
|
||||||
|
|
||||||
|
Role in the system:
|
||||||
|
|
||||||
|
- Operational trace of protection actions
|
||||||
|
- Support for troubleshooting and admin inspection
|
||||||
|
|
||||||
|
## Geo-IP Source
|
||||||
|
|
||||||
|
The service uses a locally stored free Geo-IP country database.
|
||||||
|
|
||||||
|
Requirements:
|
||||||
|
|
||||||
|
- No per-request calls to external Geo-IP services.
|
||||||
|
- The database must be actively maintained and not abandoned.
|
||||||
|
- The service only needs country-level lookup.
|
||||||
|
- Database refresh is handled internally on a schedule.
|
||||||
|
|
||||||
|
The Geo-IP database version is not stored with each observation, by explicit design choice.
|
||||||
|
|
||||||
|
## Observation Storage Strategy
|
||||||
|
|
||||||
|
The service does not keep a full raw IP log for every API request.
|
||||||
|
|
||||||
|
The primary stored signal is the derived country observation and its aggregates.
|
||||||
|
|
||||||
|
Recommended storage model:
|
||||||
|
|
||||||
|
- Store observation facts at country level.
|
||||||
|
- Aggregate per `device_session_id`.
|
||||||
|
- Keep enough history to compute ranking and review decisions.
|
||||||
|
- Retain no raw IP by default.
|
||||||
|
- Allow optional obfuscated or hashed IP retention only if later justified by operational needs.
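
If obfuscated IP retention is ever enabled, one possible approach is a keyed hash, sketched below. The function name `obfuscate_ip` and the use of HMAC-SHA256 are assumptions for illustration; the document does not prescribe a scheme. A keyed hash is chosen because the IPv4 address space is small enough that an unkeyed hash could be reversed by enumeration.

```python
import hashlib
import hmac
import ipaddress


def obfuscate_ip(ip: str, secret_key: bytes) -> str:
    """Return a keyed, non-reversible token for an IP address.

    The address is normalized to its packed binary form first, so
    different textual spellings of the same address map to the same token.
    """
    packed = ipaddress.ip_address(ip).packed
    return hmac.new(secret_key, packed, hashlib.sha256).hexdigest()
```

The token still allows equality comparisons between observations, which is usually all that operational debugging needs.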
|
||||||
|
|
||||||
|
A one-year observation horizon is acceptable as a starting point, subject to real data volume.
|
||||||
|
|
||||||
|
## Derived Statistics Model
|
||||||
|
|
||||||
|
The service computes a weighted ranking of countries per `device_session_id`.
|
||||||
|
|
||||||
|
Baseline principles:
|
||||||
|
|
||||||
|
- More recent observations carry more weight.
|
||||||
|
- Older observations decay over time.
|
||||||
|
- The calculation is based on recent activity, not on active calendar days.
|
||||||
|
- The scoring model must remain computationally cheap and tunable.
|
||||||
|
|
||||||
|
The service must maintain, at minimum:
|
||||||
|
|
||||||
|
- Ranked observed countries for each `device_session_id`
|
||||||
|
- Current `usual_connection_country` for each `device_session_id`
|
||||||
|
- Sufficient ranking data for administrative inspection
|
||||||
|
|
||||||
|
The precise scoring formula is configurable and intentionally left outside this document.
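
Purely as an illustration of a cheap, tunable model that satisfies the principles above, the sketch below uses exponential decay with a configurable half-life. The formula, the `HALF_LIFE_SECONDS` default, and all names are assumptions; the actual formula remains a configuration decision.

```python
from dataclasses import dataclass
from typing import Optional

HALF_LIFE_SECONDS = 30 * 24 * 3600.0  # assumed default: 30 days


@dataclass
class CountryScore:
    """Score of one country for one device_session_id."""
    score: float = 0.0
    last_contribution: float = 0.0  # timestamp of the last observation


def decayed(entry: CountryScore, now: float, half_life: float = HALF_LIFE_SECONDS) -> float:
    """Score as of `now`: it halves for every `half_life` seconds without activity."""
    elapsed = max(0.0, now - entry.last_contribution)
    return entry.score * 0.5 ** (elapsed / half_life)


def record_observation(
    scores: dict, country: str, now: float, half_life: float = HALF_LIFE_SECONDS
) -> None:
    """Decay the existing score to `now`, then add one unit for the new observation."""
    entry = scores.setdefault(country, CountryScore(last_contribution=now))
    entry.score = decayed(entry, now, half_life) + 1.0
    entry.last_contribution = max(entry.last_contribution, now)


def usual_connection_country(
    scores: dict, now: float, half_life: float = HALF_LIFE_SECONDS
) -> Optional[str]:
    """Country with the highest decayed score, or None without observations."""
    if not scores:
        return None
    return max(scores, key=lambda c: decayed(scores[c], now, half_life))
```

Only the current score and the last contribution timestamp are stored per country, so each update is constant-time and no raw observation scan is needed to read the ranking.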
|
||||||
|
|
||||||
|
## Suspicious Session Logic
|
||||||
|
|
||||||
|
The service must detect suspicious multi-country behavior across multiple sessions of the same user.
|
||||||
|
|
||||||
|
The intended interpretation is:
|
||||||
|
|
||||||
|
- Slow geographic drift over larger time spans is normal.
|
||||||
|
- Simultaneous or near-simultaneous active usage from conflicting countries is suspicious.
|
||||||
|
- Suspicion targets sessions, not the entire account.
|
||||||
|
|
||||||
|
Important trade-off:
|
||||||
|
|
||||||
|
- The request that caused the suspicion is allowed to proceed.
|
||||||
|
- Session blocking is requested asynchronously afterward.
|
||||||
|
- The next request from the blocked session should be rejected by `Auth / Session Service`.
|
||||||
|
|
||||||
|
```mermaid
flowchart TD
    O[New processed observation] --> D{Conflicting country pattern across user sessions?}
    D -- No --> N[No block action]
    D -- Yes --> C[Select suspicious device sessions]
    C --> A[Call Auth / Session Service block API]
    A --> X[Subsequent requests get rejected]
```
|
||||||
|
|
||||||
|
Exact threshold tuning is configuration-driven and may evolve without changing the service boundary.
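
One way to express the "near-simultaneous conflicting countries" check is sketched below. The time window `CONFLICT_WINDOW_SECONDS`, the `SessionActivity` record, and the choice to return every involved session as a block candidate are assumptions; which sessions are actually blocked is a policy decision this sketch does not settle.

```python
from dataclasses import dataclass

CONFLICT_WINDOW_SECONDS = 600.0  # assumed threshold, expected to be configurable


@dataclass(frozen=True)
class SessionActivity:
    """Latest known activity of one device session of a user."""
    device_session_id: str
    country: str
    last_seen: float  # observation timestamp in seconds


def find_conflicting_sessions(
    activities: list,
    new: SessionActivity,
    window: float = CONFLICT_WINDOW_SECONDS,
) -> set:
    """Return session ids active in a different country within `window` of `new`.

    The new session is included when a conflict exists, because either side
    could be the illegitimate one.
    """
    conflicting = {
        a.device_session_id
        for a in activities
        if a.device_session_id != new.device_session_id
        and a.country != new.country
        and abs(new.last_seen - a.last_seen) <= window
    }
    if conflicting:
        conflicting.add(new.device_session_id)
    return conflicting
```

Old activity in another country falls outside the window and is ignored, which matches the rule that slow geographic drift is normal.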
|
||||||
|
|
||||||
|
## Country Review Recommendation Logic
|
||||||
|
|
||||||
|
The recommendation workflow is durable and queryable.
|
||||||
|
|
||||||
|
Key rules:
|
||||||
|
|
||||||
|
- `country_review_recommended` is stored as service state.
|
||||||
|
- Transition to `true` must not be represented only by an event.
|
||||||
|
- Administrative systems must be able to retrieve candidates via service API.
|
||||||
|
- Auxiliary notifications exist only to reduce polling latency.
|
||||||
|
|
||||||
|
```mermaid
flowchart LR
    P[Processed observations] --> F{Review criteria met?}
    F -- No --> K[Keep existing state]
    F -- Yes --> T[Set country_review_recommended=true]
    T --> API[Expose via internal REST API]
    T --> BUS[Publish event bus notification]
    T --> MAIL[Optionally send admin email]
```
|
||||||
|
|
||||||
|
If event delivery fails, the recommendation state still exists and remains observable through the API.
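
The ordering "persist first, notify best-effort" can be sketched as follows. The function `evaluate_review`, the dictionary standing in for durable storage, and the `notifiers` list (event bus, mail) are hypothetical names used only for this example.

```python
from typing import Callable


def evaluate_review(
    state: dict,
    user_id: str,
    criteria_met: bool,
    now: float,
    notifiers: list,
) -> bool:
    """Update the review state for one user; return True on a false-to-true transition."""
    entry = state.setdefault(
        user_id, {"country_review_recommended": False, "last_evaluated": None}
    )
    already_set = entry["country_review_recommended"]
    entry["last_evaluated"] = now
    if not criteria_met or already_set:
        return False  # keep existing state, emit nothing

    # Durable write comes first (the dict stands in for service storage).
    entry["country_review_recommended"] = True

    # Notifications are auxiliary: a failing notifier never undoes the flag.
    for notify in notifiers:
        try:
            notify(user_id)
        except Exception:
            pass  # a real service would log and count the failure here
    return True
```

Because the flag is written before any notifier runs, a consumer that polls the query API will find the user even if every notification fails.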
|
||||||
|
|
||||||
|
## Administrative Read API
|
||||||
|
|
||||||
|
The service exposes a trusted internal JSON REST API for administrative and operational reads.
|
||||||
|
|
||||||
|
### Review Candidate Query Endpoint
|
||||||
|
|
||||||
|
Purpose:
|
||||||
|
|
||||||
|
- Return `user_id` values for users requiring review.
|
||||||
|
|
||||||
|
Initial required filter set:
|
||||||
|
|
||||||
|
- `country_review_recommended=true`
|
||||||
|
|
||||||
|
Expected characteristics:
|
||||||
|
|
||||||
|
- Pagination support
|
||||||
|
- Stable ordering
|
||||||
|
- Simple extension path for additional filters later
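
Pagination with stable ordering could be implemented with a cursor over a sorted key, as in this sketch. Ordering by `user_id`, the `after` cursor parameter, and the function name are assumptions; the real endpoint shape is still to be defined.

```python
from typing import Optional


def list_review_candidates(
    flagged_user_ids: set,
    after: Optional[str] = None,
    limit: int = 100,
) -> tuple:
    """Return one page of flagged user ids plus the cursor for the next page.

    Ordering by user_id keeps pages stable even if users are flagged
    or unflagged between two calls.
    """
    ordered = sorted(flagged_user_ids)
    if after is not None:
        ordered = [u for u in ordered if u > after]
    page = ordered[:limit]
    next_cursor = page[-1] if len(ordered) > limit else None
    return page, next_cursor
```

A caller keeps passing the returned cursor as `after` until it comes back as `None`.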
|
||||||
|
|
||||||
|
### Geo Profile Query Endpoint
|
||||||
|
|
||||||
|
Purpose:
|
||||||
|
|
||||||
|
- Return the internal geo profile of a specific user for administrative inspection.
|
||||||
|
|
||||||
|
The response should include, at minimum:
|
||||||
|
|
||||||
|
- `user_id`
|
||||||
|
- current review flag
|
||||||
|
- per-`device_session_id` country ranking
|
||||||
|
- per-`device_session_id` `usual_connection_country`
|
||||||
|
- observation summaries
|
||||||
|
- declared country version history
|
||||||
|
- suspicious session markers if present
|
||||||
|
- enough information for manual administrative decision-making
|
||||||
|
|
||||||
|
The profile is grouped by `device_session_id`, because that is the primary aggregation boundary.
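
A possible response shape, grouped by `device_session_id`, is sketched below. All JSON field names beyond the domain terms already fixed in this document (for example `sessions` and `country_ranking`) are hypothetical, and the input structures are simplified stand-ins for the stored entities.

```python
import json


def build_geo_profile(
    user_id: str,
    review_recommended: bool,
    session_scores: dict,
    declared_history: list,
) -> dict:
    """Assemble a JSON-serializable geo profile for one user.

    `session_scores` maps device_session_id -> {country_code: score}.
    """
    sessions = []
    for session_id in sorted(session_scores):
        # Highest score first; country code breaks ties deterministically.
        ranking = sorted(
            session_scores[session_id].items(), key=lambda kv: (-kv[1], kv[0])
        )
        sessions.append(
            {
                "device_session_id": session_id,
                "usual_connection_country": ranking[0][0] if ranking else None,
                "country_ranking": [
                    {"country_code": code, "score": score} for code, score in ranking
                ],
            }
        )
    return {
        "user_id": user_id,
        "country_review_recommended": review_recommended,
        "sessions": sessions,
        "declared_country_history": declared_history,
    }
```

The result can be passed directly to `json.dumps`, since it contains only dictionaries, lists, strings, numbers, booleans, and `None`.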
|
||||||
|
|
||||||
|
## Declared Country Change Command API
|
||||||
|
|
||||||
|
Geo Profile Service exposes an internal trusted command API to apply approved `declared_country` changes.
|
||||||
|
|
||||||
|
The command path must behave as follows:
|
||||||
|
|
||||||
|
- Record a new declared country version in Geo Profile Service storage.
|
||||||
|
- Synchronously update the current `declared_country` value in `User Service`.
|
||||||
|
- Mark the new version effective only if the `User Service` update succeeds.
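
The three steps above can be sketched as a single function. This is an illustration under assumptions: the list standing in for version storage, the `sync_to_user_service` callable, and the class and function names are hypothetical. The status values are the ones suggested in this document.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class VersionStatus(Enum):
    RECORDED = "recorded"
    APPLIED = "applied"
    SYNC_FAILED = "sync_failed"


@dataclass
class DeclaredCountryVersion:
    """One version record; only the lifecycle status changes after creation."""
    user_id: str
    declared_country: str
    actor: str
    status: VersionStatus = VersionStatus.RECORDED


def apply_declared_country_change(
    history: list,
    user_id: str,
    new_country: str,
    actor: str,
    sync_to_user_service: Callable[[str, str], None],
) -> DeclaredCountryVersion:
    """Record the version locally, sync to User Service, then settle the status."""
    version = DeclaredCountryVersion(user_id, new_country, actor)
    history.append(version)  # step 1: local record, status "recorded"
    try:
        sync_to_user_service(user_id, new_country)  # step 2: synchronous update
    except Exception:
        version.status = VersionStatus.SYNC_FAILED  # visible, retryable failure
        return version
    version.status = VersionStatus.APPLIED  # step 3: effective only after sync
    return version
```

A failed sync leaves a `sync_failed` record in the history rather than no record at all, so the divergence is visible to retries and operators.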
|
||||||
|
|
||||||
|
Recommended version lifecycle:
|
||||||
|
|
||||||
|
- `recorded`
|
||||||
|
- `applied`
|
||||||
|
- `sync_failed`
|
||||||
|
|
||||||
|
This lifecycle prevents invisible divergence between history and current value.
|
||||||
|
|
||||||
|
```mermaid
sequenceDiagram
    participant Admin as Admin Interface
    participant Geo as Geo Profile Service
    participant User as User Service

    Admin->>Geo: Apply approved declared_country change
    Geo->>Geo: Create new version record
    Geo->>User: Sync current declared_country
    alt Sync succeeds
        User-->>Geo: OK
        Geo->>Geo: Mark version as applied
        Geo-->>Admin: Success
    else Sync fails
        User-->>Geo: Error
        Geo->>Geo: Mark version as sync_failed
        Geo-->>Admin: Failure
    end
```
|
||||||
|
|
||||||
|
## Integration with User Service
|
||||||
|
|
||||||
|
`User Service` keeps the latest effective `declared_country` because other services and the gateway may need it frequently for response shaping without querying Geo Profile Service.
|
||||||
|
|
||||||
|
Integration rules:
|
||||||
|
|
||||||
|
- Geo Profile Service owns the mutation workflow.
|
||||||
|
- `User Service` stores only the latest effective value.
|
||||||
|
- Reads of the current country for normal business responses should go to `User Service`.
|
||||||
|
- Reads of country history and geo-derived data should go to Geo Profile Service.
|
||||||
|
|
||||||
|
## Integration with Auth / Session Service
|
||||||
|
|
||||||
|
Geo Profile Service must be able to request blocking of suspicious sessions.
|
||||||
|
|
||||||
|
Contract assumptions:
|
||||||
|
|
||||||
|
- Blocking is idempotent.
|
||||||
|
- The block applies to `device_session_id`, not to the entire user account.
|
||||||
|
- The effect is expected on subsequent requests, not the current triggering request.
|
||||||
|
|
||||||
|
This keeps the hot path simple and avoids synchronous enforcement coupling.
|
||||||
|
|
||||||
|
## Integration with Mail Service
|
||||||
|
|
||||||
|
Mail notifications are optional and configuration-driven.
|
||||||
|
|
||||||
|
Mail is sent only when:
|
||||||
|
|
||||||
|
- `country_review_recommended` transitions to `true`
|
||||||
|
- Email notifications are enabled
|
||||||
|
|
||||||
|
Mail is auxiliary and must not be required for business correctness.
|
||||||
|
|
||||||
|
## Event Bus Integration
|
||||||
|
|
||||||
|
The service emits an event when `country_review_recommended` transitions to `true`.
|
||||||
|
|
||||||
|
Event usage:
|
||||||
|
|
||||||
|
- Auxiliary notification for downstream systems
|
||||||
|
- Reduced delay for admin workflows
|
||||||
|
- Optional future fan-out for additional internal consumers
|
||||||
|
|
||||||
|
Important constraints:
|
||||||
|
|
||||||
|
- The event bus is not the source of truth.
|
||||||
|
- Loss of an event must not lose the business state.
|
||||||
|
- Periodic pull through the service API must remain sufficient to recover missed notifications.
|
||||||
|
|
||||||
|
## Failure and Degradation Model
|
||||||
|
|
||||||
|
The service is intentionally designed for fail-open behavior relative to the edge.
|
||||||
|
|
||||||
|
### Edge-to-Service Failure
|
||||||
|
|
||||||
|
If Geo Profile Service is unavailable:
|
||||||
|
|
||||||
|
- `Edge Service` must continue request processing unchanged.
|
||||||
|
- The publication failure becomes a metric/logging concern.
|
||||||
|
- No user-visible request rejection is introduced by this dependency.
|
||||||
|
|
||||||
|
### In-Service Processing Failure
|
||||||
|
|
||||||
|
If the worker pipeline temporarily fails:
|
||||||
|
|
||||||
|
- Already accepted observations stay queued if a durable queue is used.
|
||||||
|
- Processing lag grows and must be monitored.
|
||||||
|
- Administrative state may become stale, but the rest of the platform keeps functioning.
|
||||||
|
|
||||||
|
### User Service Sync Failure
|
||||||
|
|
||||||
|
If `declared_country` sync to `User Service` fails:
|
||||||
|
|
||||||
|
- The version record remains in Geo Profile Service.
|
||||||
|
- The version must be marked as not yet effective.
|
||||||
|
- Retry or operator action can be used later.
|
||||||
|
- No silent divergence is allowed.
|
||||||
|
|
||||||
|
### Mail or Event Delivery Failure
|
||||||
|
|
||||||
|
If mail or event publication fails:
|
||||||
|
|
||||||
|
- The failure is logged and metered.
|
||||||
|
- `country_review_recommended` remains persisted.
|
||||||
|
- Administrative polling can still find the affected user.
|
||||||
|
|
||||||
|
## Privacy and Retention Posture
|
||||||
|
|
||||||
|
The privacy posture is intentionally minimal.
|
||||||
|
|
||||||
|
- Do not store raw IP long-term unless a later justification appears.
|
||||||
|
- Prefer storing country-level derived facts and aggregates.
|
||||||
|
- If hashed or obfuscated IP is introduced later, treat it as an implementation detail, not as a core domain dependency.
|
||||||
|
- Retention is expected to be bounded and configurable.
|
||||||
|
|
||||||
|
## Operational Observability
|
||||||
|
|
||||||
|
The service should expose metrics and logs for at least:
|
||||||
|
|
||||||
|
- Ingest acceptance rate
|
||||||
|
- Ingest publish failures observed by edge
|
||||||
|
- Queue depth
|
||||||
|
- Queue lag
|
||||||
|
- Geo-IP lookup latency
|
||||||
|
- Observation processing latency
|
||||||
|
- Review flag transitions
|
||||||
|
- Suspicious session block commands
|
||||||
|
- User Service sync failures
|
||||||
|
- Mail send failures
|
||||||
|
- Event publication failures
|
||||||
|
|
||||||
|
The service must be easy to operate even though it does not sit on the synchronous business-critical path.
|
||||||
|
|
||||||
|
## Minimal Initial API Surface
|
||||||
|
|
||||||
|
The initial required API surface is intentionally small.
|
||||||
|
|
||||||
|
Binary ingest path:
|
||||||
|
|
||||||
|
- Asynchronous FlatBuffers message publication from edge
|
||||||
|
- No business response body
|
||||||
|
- No synchronous decision returned to edge
|
||||||
|
|
||||||
|
Internal JSON REST paths:
|
||||||
|
|
||||||
|
- List review candidates by filter
|
||||||
|
- Read user geo profile grouped by `device_session_id`
|
||||||
|
- Apply approved `declared_country` version change
|
||||||
|
- Optional internal health and metrics endpoints
|
||||||
|
|
||||||
|
Any additional endpoints should be added only if a concrete consumer appears.
|
||||||
|
|
||||||
|
## Design Trade-Offs Accepted by This Architecture
|
||||||
|
|
||||||
|
This architecture intentionally accepts the following trade-offs:
|
||||||
|
|
||||||
|
- Some observation messages may be lost if the service is down and the edge cannot deliver them.
|
||||||
|
- The request that triggers suspicious-session detection is allowed to continue.
|
||||||
|
- Geo-IP history is not strictly reproducible against past database versions.
|
||||||
|
- Current `declared_country` is denormalized into `User Service`.
|
||||||
|
- Administrative approval policy stays flexible and human-driven.
|
||||||
|
|
||||||
|
These trade-offs are acceptable because they keep the hottest path simple while preserving enough internal state for review and risk handling.
|
||||||
|
|
||||||
|
## Implementation Readiness Statement
|
||||||
|
|
||||||
|
The architecture is considered ready for implementation planning.
|
||||||
|
|
||||||
|
The main remaining work is not conceptual but executional:
|
||||||
|
|
||||||
|
- Precise API shape
|
||||||
|
- Queue implementation details
|
||||||
|
- Scoring formula tuning
|
||||||
|
- Suspicious-session thresholds
|
||||||
|
- Concrete storage schema
|
||||||
|
- Operational hardening
|
||||||