Technical Architecture

System Design

How the five-layer compatibility engine actually works under the hood. Microservices, data flow, scale considerations, and the technical decisions that make this buildable.

01 — Architecture Overview

High-Level System Architecture

Traditional layered architecture. Mobile clients hit an API gateway. Services are split by concern. Each layer of the compatibility framework maps to its own microservice with dedicated databases. Real-time services for conversations, batch processing for overnight compatibility scoring.

Five Layers → Microservices
Gottman Layer
Conversation Analytics Service
Tracks repair latency after friction. Measures time between negative sentiment spike and recovery to positive baseline. Stores conversation metadata in MongoDB, processes async via RabbitMQ.
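Repair latency can be computed directly from the per-message sentiment scores: the gap between the first negative spike and the first recovery above a positive baseline. A minimal Python sketch, with illustrative thresholds (the service's real cutoffs aren't specified here):

```python
from typing import List, Optional, Tuple

def repair_latency(
    messages: List[Tuple[float, float]],   # (unix_ts, sentiment in [-1, 1])
    negative_threshold: float = -0.4,      # illustrative spike cutoff
    recovery_baseline: float = 0.1,        # illustrative recovery cutoff
) -> Optional[float]:
    """Seconds from the first negative-sentiment spike to the first
    message that recovers above the positive baseline; None if no
    friction occurred or the conversation never recovered."""
    spike_ts = None
    for ts, score in messages:
        if spike_ts is None and score <= negative_threshold:
            spike_ts = ts                  # friction detected
        elif spike_ts is not None and score >= recovery_baseline:
            return ts - spike_ts           # repaired
    return None

convo = [(0, 0.5), (60, -0.6), (120, -0.2), (300, 0.3)]
print(repair_latency(convo))  # 240 (seconds from spike to recovery)
```

The async worker consuming the RabbitMQ queue would run this per conversation window and write the result to MongoDB.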
Aron Layer
Sentiment Analysis Service
Measures vulnerability escalation via reciprocal sentiment depth. Keyword analysis + ML sentiment scoring (Hugging Face transformers). Tracks reciprocity lag — how long before the other person matches vulnerability level.
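Reciprocity lag falls out of the same message stream once each message carries a vulnerability-depth score. A minimal sketch, assuming a 0–5 depth scale and an illustrative `depth_floor`:

```python
from typing import List, Optional, Tuple

def reciprocity_lag(
    messages: List[Tuple[float, str, int]],  # (unix_ts, sender_id, depth 0-5)
    depth_floor: int = 3,                    # illustrative "deep disclosure" bar
) -> Optional[float]:
    """Seconds between one user's first deep disclosure and the other
    user's first disclosure at or above that same depth; None if nobody
    went deep, or the partner never reciprocated."""
    first_ts, first_sender, first_depth = None, None, None
    for ts, sender, depth in messages:
        if first_ts is None:
            if depth >= depth_floor:
                first_ts, first_sender, first_depth = ts, sender, depth
        elif sender != first_sender and depth >= first_depth:
            return ts - first_ts
    return None

convo = [(0, "a", 1), (30, "a", 4), (90, "b", 2), (200, "b", 4)]
print(reciprocity_lag(convo))  # 170 (seconds until depth was matched)
```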
Sternberg Layer
Profile Matching Service
Triangle overlap scoring. Compares passion/intimacy/commitment weights between two users. Stored in PostgreSQL, cached in Redis for fast lookups. Re-weighted periodically based on user behavior.
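Triangle overlap can be as simple as one minus the mean absolute difference across the three axes. An illustrative sketch (the production scorer may weight axes differently or re-weight over time, as noted above):

```python
def triangle_alignment(a: dict, b: dict) -> float:
    """Overlap score in [0, 1] for two users' Sternberg triangles.
    Each dict holds passion/intimacy/commitment weights in [0, 1];
    the score is 1 minus the mean absolute difference per axis."""
    axes = ("passion", "intimacy", "commitment")
    return 1.0 - sum(abs(a[k] - b[k]) for k in axes) / len(axes)

priya = {"passion": 0.6, "intimacy": 0.8, "commitment": 0.7}
rohan = {"passion": 0.8, "intimacy": 0.9, "commitment": 0.3}
print(round(triangle_alignment(priya, rohan), 2))  # 0.77
```

Cheap enough to run against the whole Redis-cached candidate pool on every request.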
Fisher Layer
Archetype Classification Service
Neurochemical archetype matching (Explorer, Builder, Director, Negotiator). Custom trained model based on quiz responses. Complementary pairing logic — Explorers attract Explorers, Directors attract Negotiators.
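The complementary pairing logic reduces to a lookup over unordered archetype pairs. A sketch using the pairings named above (Builder–Builder is implied by the sequence diagram below); the scores and the 0.3 floor are placeholders, not documented values:

```python
# Unordered pairs via frozenset: {"Explorer"} matches Explorer-Explorer.
ARCHETYPE_AFFINITY = {
    frozenset(["Explorer"]): 1.0,               # Explorers attract Explorers
    frozenset(["Builder"]): 1.0,                # Builders attract Builders
    frozenset(["Director", "Negotiator"]): 1.0, # Directors attract Negotiators
}

def archetype_score(a: str, b: str) -> float:
    """1.0 for a complementary pairing, an assumed 0.3 floor otherwise."""
    return ARCHETYPE_AFFINITY.get(frozenset([a, b]), 0.3)

print(archetype_score("Director", "Negotiator"))  # 1.0
print(archetype_score("Explorer", "Director"))    # 0.3
```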
Michelangelo Layer
Aspiration Matching Service
Growth alignment via aspiration tag overlap. Sculpting index tracks whether both users are moving toward their stated goals over time. Behavioral validation through profile updates and conversation topics.
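The tag-overlap query from the matching flow (3+ shared tags) plus a similarity rank can be sketched with plain set operations. Jaccard similarity here is an assumption for the ranking step, not the documented metric:

```python
def tag_overlap(tags_a: set, tags_b: set) -> tuple:
    """Returns (shared tag count, Jaccard similarity) for two users'
    aspiration tag sets. The matching flow filters for 3+ shared
    tags, then a similarity score could break ties within the pool."""
    shared = tags_a & tags_b
    union = tags_a | tags_b
    return len(shared), (len(shared) / len(union) if union else 0.0)

priya = {"start-a-company", "learn-guitar", "move-abroad", "run-marathon"}
rohan = {"start-a-company", "move-abroad", "run-marathon", "write-a-book"}
count, jaccard = tag_overlap(priya, rohan)
print(count, round(jaccard, 2))  # 3 0.6
```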
02 — Data Flow Diagram

Journey of a Match Request

What happens when Priya opens the app and asks for her next calibration match? This diagram shows the complete request flow, from the client through the gateway and matching engine to the layer services, down to the response she sees on screen.

```mermaid
sequenceDiagram
    participant User as User (Priya)
    participant App as Mobile App
    participant Gateway as API Gateway
    participant Auth as Auth Service
    participant Profile as Profile Service
    participant Match as Matching Engine
    participant Analytics as Analytics Service
    participant Aspiration as Aspiration Service
    participant Archetype as Archetype Service
    participant DB as Database
    participant Cache as Redis Cache

    User->>App: Opens app, taps "Show me a match"
    App->>Gateway: POST /api/matches/next
    Gateway->>Auth: Validate JWT token
    Auth-->>Gateway: ✓ Token valid
    Gateway->>Profile: GET /profile/{userId}
    Profile->>Cache: Check cache
    Cache-->>Profile: Cache miss
    Profile->>DB: Query user profile
    DB-->>Profile: User data (triangle, archetype, tags)
    Profile->>Cache: Store in cache (TTL 1hr)
    Profile-->>Gateway: User profile data
    Gateway->>Match: POST /match/candidates
    Note over Match: Match engine coordinates 5 layers
    Match->>Profile: Get triangle weights
    Profile-->>Match: Passion: 60%, Intimacy: 80%, Commitment: 70%
    Match->>Archetype: Get archetype compatibility
    Archetype->>DB: Query Builder → Builder matches
    DB-->>Archetype: Candidate pool (500 users)
    Archetype-->>Match: Compatible archetypes
    Match->>Aspiration: Get tag overlap candidates
    Aspiration->>DB: Query users with 3+ matching tags
    DB-->>Aspiration: Candidates with tag overlap
    Aspiration-->>Match: 150 candidates with high alignment
    Match->>Analytics: Filter by conversation quality
    Analytics->>DB: Check past conversation metrics
    DB-->>Analytics: Users with positive sentiment history
    Analytics-->>Match: 50 high-quality candidates
    Note over Match: Apply calibration logic:<br/>First 5 matches = learning mode<br/>Confidence score < 80%
    Match->>DB: Record match shown
    DB-->>Match: ✓ Logged
    Match-->>Gateway: Match candidate (Rohan)<br/>- 74% triangle alignment<br/>- Builder archetype<br/>- 3 tag overlaps<br/>- Calibration match 2/5
    Gateway-->>App: Match response (JSON)
    App-->>User: Shows Rohan's profile<br/>with calibration banner
    User->>App: Taps "Start conversation"
    App->>Gateway: POST /conversations/create
    Gateway->>Profile: Create conversation thread
    Profile->>DB: Insert conversation record
    DB-->>Profile: ✓ Created
    Profile-->>Gateway: Conversation ID
    Gateway-->>App: ✓ Conversation started
    App-->>User: Opens chat with Rohan
    Note over User,DB: All 5 layers now collecting data<br/>from this conversation
```
03 — Technical Decisions

Why the architecture looks this way

Real-time vs Batch Processing
Conversation sentiment analysis happens in near real-time (within five seconds of a message being sent) via async queue processing. Compatibility scoring runs as an overnight batch job that recalculates all active users' match pools from the previous day's conversation data. This split keeps the app feeling responsive without overloading the ML services.
Privacy-First Data Model
Conversation content never leaves user devices unencrypted. Only metadata flows to servers: message timestamps, character counts, and sentiment scores computed client-side via on-device ML (Core ML on iOS, ML Kit on Android). Users explicitly consent to sharing this metadata during onboarding. Fully compliant with GDPR and India's DPDP Act, 2023.
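The payload that actually leaves the device might look like this; field names are illustrative, and the sentiment score is assumed to arrive pre-computed from the on-device model:

```python
import time

def message_metadata(message_text: str, sentiment: float) -> dict:
    """Builds the metadata-only payload that leaves the device.
    The raw text is used only to measure length and is never
    included in the payload. Field names are illustrative."""
    return {
        "ts": int(time.time()),
        "char_count": len(message_text),   # length only, not content
        "sentiment": round(sentiment, 2),  # on-device score in [-1, 1]
    }

payload = message_metadata("Had a rough day but your message helped", 0.4)
assert "text" not in payload and payload["char_count"] == 39
```

Keeping the server blind to content also shrinks the breach surface: a leaked metadata store exposes timings and scores, not conversations.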
Cold Start Strategy
First week of usage = calibration mode. Matching relies heavily on the Sternberg triangle + Fisher archetype layers (data we collect upfront). The Gottman and Aron layers' weights increase gradually as conversation data accumulates. The matching confidence score is shown to the user (e.g., "74% match — we're still learning"). Honesty builds trust.
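One way to ramp the conversation-driven layers is a linear blend keyed to conversation count. The `ramp` horizon, the 0.2 ceiling per behavioral layer, and the fixed Michelangelo weight are all assumptions for illustration:

```python
def layer_weights(conversation_count: int, ramp: int = 50) -> dict:
    """Illustrative cold-start blend: the upfront layers (Sternberg,
    Fisher) dominate early; the conversation-driven layers (Gottman,
    Aron) ramp up linearly. `ramp` = conversations at full weight.
    Weights always sum to 1.0."""
    t = min(conversation_count / ramp, 1.0)
    behavioral = 0.2 * t                         # each of Gottman, Aron
    upfront = (1.0 - 2 * behavioral - 0.2) / 2   # Sternberg, Fisher split
    return {
        "sternberg": upfront, "fisher": upfront,
        "gottman": behavioral, "aron": behavioral,
        "michelangelo": 0.2,                     # assumed fixed weight
    }

w = layer_weights(0)
print(w["sternberg"], w["gottman"])  # 0.4 0.0
```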
Microservices over Monolith
Each layer is a separate service so they can scale independently. Sentiment analysis is compute-heavy and needs autoscaling. Profile lookups are read-heavy and benefit from aggressive caching. Matching engine is batch-oriented and can run on scheduled workers. One service failing doesn't bring down the entire system.
PostgreSQL + MongoDB Hybrid
PostgreSQL for structured relational data (user profiles, match history, triangle weights). MongoDB for semi-structured conversation metadata and time-series sentiment data. Each database optimized for its workload. Redis sits in front for hot data caching (active sessions, frequently accessed profiles).
ML Model Deployment
Sentiment analysis uses pre-trained Hugging Face transformers fine-tuned on relationship conversation data. Archetype classifier is a custom XGBoost model trained on quiz responses. Compatibility scorer is an ensemble model combining outputs from all five layers. Models deployed as separate containerized services, updated via A/B testing without app releases.
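Before the trained ensemble exists, the simplest stand-in for the compatibility scorer is a weighted average of the five layer outputs, all normalized to [0, 1]. The equal weights below are a placeholder, not the ensemble's learned weighting:

```python
def compatibility_score(layers: dict, weights: dict) -> float:
    """Weighted blend of the five layer scores, each in [0, 1].
    The production scorer is described as a trained ensemble; a
    fixed weighted average is the simplest baseline to ship first."""
    return sum(layers[k] * weights[k] for k in weights)

layers = {"gottman": 0.8, "aron": 0.7, "sternberg": 0.74,
          "fisher": 1.0, "michelangelo": 0.6}
weights = {k: 0.2 for k in layers}  # placeholder equal weights
print(round(compatibility_score(layers, weights), 2))  # 0.77
```

Shipping the baseline first also gives the A/B framework a control arm to measure the trained ensemble against.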
04 — Technology Stack

What we'd actually build this with

Frontend
React Native
Single codebase for iOS and Android. Fast development, native performance. TypeScript for type safety.
API Gateway
Kong / AWS API Gateway
Request routing, rate limiting, JWT validation. SSL termination. Request logging.
Backend Services
Node.js + Express
Fast, async I/O. Large ecosystem. Easy to hire for. TypeScript across the entire stack.
Relational Database
PostgreSQL
ACID compliance. Complex queries. Strong consistency. Mature ecosystem.
Document Database
MongoDB
Flexible schema. Time-series data. Fast writes. Easy horizontal scaling.
Cache
Redis
In-memory key-value store. Sub-millisecond latency. Pub/sub for real-time features.
Message Queue
RabbitMQ
Async processing. Event streaming. Retry logic. Dead letter queues.
ML Framework
Python + PyTorch
Hugging Face transformers. Scikit-learn. Custom model training. FastAPI for serving.
Infrastructure
AWS / GCP
ECS for containers. RDS for databases. S3 for storage. CloudWatch for monitoring.
Auth
Auth0
OAuth 2.0. Social login. MFA. Secure by default.
Monitoring
Sentry + DataDog
Error tracking. Performance monitoring. Real-time alerts. Log aggregation.
CI/CD
GitHub Actions
Automated testing. Container builds. Blue-green deployments. Rollback on failure.
05 — Scale & Performance

How much can this actually handle?

Realistic performance targets for MVP through 1M users. What breaks first, where to optimize, when to worry.

10K
Concurrent Users
Single region deployment. Single database instances, no sharding or replicas. No autoscaling needed yet. Everything runs on standard AWS instances.
100K
DAU at breakpoint
Database load starts to show. Add read replicas. Enable Redis caching. Scale API servers horizontally. ML inference becomes the bottleneck; move to dedicated GPU instances.
1M
Users at maturity
Multi-region deployment. Database sharding by user ID. Separate ML cluster. CDN for static assets. Batch jobs split across worker pools. This is where serious infra investment starts.
What breaks first?
In order of likely failure points —
1. ML inference latency — Sentiment analysis on every message. Solved with model quantization + GPU autoscaling.

2. Database write throughput — Conversation metadata writes. Solved with write-heavy MongoDB sharding + batch inserts.

3. Matching engine compute — Nightly batch job recalculating compatibility for all users. Solved with distributed worker pools + incremental updates (only recalc users with new data).

4. Real-time messaging — WebSocket connections for live chat. Solved with dedicated message routing service (Socket.io cluster) + Redis pub/sub.
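The incremental-update fix from point 3 reduces the nightly job from the full user base to just users with fresh activity. A minimal sketch, assuming an activity map of user IDs to last-message timestamps:

```python
from datetime import datetime

def users_to_rescore(activity: dict, last_run: datetime) -> list:
    """Incremental nightly batch: rescore only users whose latest
    conversation activity postdates the previous run, instead of
    recalculating compatibility for the entire user base."""
    return [uid for uid, ts in activity.items() if ts > last_run]

activity = {"priya": datetime(2025, 1, 11), "rohan": datetime(2025, 1, 9)}
print(users_to_rescore(activity, datetime(2025, 1, 10)))  # ['priya']
```

The resulting ID list shards naturally across the distributed worker pool.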

This system is buildable today with existing technology. No research breakthroughs required. No invented components. Just intentional architecture designed to actually work at scale.