Case Study: Uber / Pathao
The final chapter! Real-time geospatial matching at scale.
You open the Pathao or Uber app. Within 5 seconds the app shows you 10 nearby drivers. You tap "Find Driver", and within 15 seconds a driver is assigned. What is behind this magic? A real-time geospatial system, ML-based matching, and surge pricing: together they make an engineering masterpiece.
Requirements
Functional
- A rider requests a ride → the system matches a nearby driver.
- Real-time location tracking.
- Fare calculation and payment.
- Trip lifecycle (request, accept, en route, complete).
- Surge pricing (during high demand).
- Rating system.
Non-Functional
- Low-latency matching (<5 sec).
- Real-time tracking (updates in <5 sec).
- High availability.
- Geographic scale (650+ cities).
- Reliable payment.
Capacity Estimation
DAU: 100M (riders + drivers)
Active drivers concurrent: 5M (peak)
Trips/day: 25M
Trips/sec (avg): 25M / 86400 ≈ 290/sec (peak much higher)
Driver location update: every 4 seconds
Updates/sec: 5M / 4 = 1.25M location updates/sec
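The estimates above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch; the 50-byte record size is an assumption added here to illustrate ingest bandwidth and does not come from the figures above.

```python
SECONDS_PER_DAY = 86_400

trips_per_day = 25_000_000
concurrent_drivers = 5_000_000   # peak
update_interval_s = 4

avg_trips_per_sec = trips_per_day / SECONDS_PER_DAY
location_updates_per_sec = concurrent_drivers / update_interval_s

# Assumption (not from the text): roughly 50 bytes per location record.
BYTES_PER_UPDATE = 50
ingest_mb_per_sec = location_updates_per_sec * BYTES_PER_UPDATE / 1_000_000

print(f"avg trips/sec: {avg_trips_per_sec:.0f}")
print(f"location updates/sec: {location_updates_per_sec:,.0f}")
print(f"location ingest: {ingest_mb_per_sec:.1f} MB/s")
```

The point to notice is that location writes outnumber trip requests by more than three orders of magnitude, so the location pipeline, not the trip flow, drives the write capacity.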
API Design
// Driver
POST /location { lat, lng, heading } -- every 4s
POST /accept-ride { rideId }
// Rider
POST /request-ride { pickup, destination, type }
GET /nearby-drivers?lat=&lng=&radius=
GET /ride/:id/status
// Real-time updates
WebSocket: ride state changes
Data Model
User: { id, type (rider/driver), name, ... }
Driver: {
id, status (online/offline/in-trip),
current_location (lat, lng), vehicle, rating
}
Trip: {
id, rider_id, driver_id,
pickup, destination, status,
fare, payment_status
}
LocationUpdate: { driver_id, lat, lng, ts } (time-series)
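One way to express these entities in code is with dataclasses. This is a minimal sketch: the `LatLng` helper, the default values `"requested"` and `"pending"`, and the optional fields are assumptions made for illustration, not part of the model above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DriverStatus(Enum):
    ONLINE = "online"
    OFFLINE = "offline"
    IN_TRIP = "in-trip"


@dataclass(frozen=True)
class LatLng:
    lat: float
    lng: float


@dataclass
class Driver:
    id: str
    status: DriverStatus
    current_location: LatLng
    vehicle: str
    rating: float


@dataclass
class Trip:
    id: str
    rider_id: str
    pickup: LatLng
    destination: LatLng
    status: str = "requested"         # assumed initial state
    driver_id: Optional[str] = None   # unknown until a driver accepts
    fare: Optional[float] = None      # unknown until the trip completes
    payment_status: str = "pending"   # assumed initial state


@dataclass(frozen=True)
class LocationUpdate:
    driver_id: str
    lat: float
    lng: float
    ts: float  # Unix timestamp in seconds
```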
Architecture
[Rider App] [Driver App]
↓ ↓ (WebSocket)
[API Gateway / Load Balancer]
↓
[Microservices]
- Trip Service
- Location Service
- Matching Service
- Pricing Service
- Payment Service
- Notification Service
↓
[Geospatial Index] (Google S2 cells)
[Event Stream] (Kafka)
[Storage]:
- Schemaless (Uber custom — sharded MySQL)
- Cassandra (location time-series)
- Redis (active driver pool)
Geospatial Indexing
"Available drivers within 5 km" must be a fast query.
Google S2 Library
- Divides the Earth into hierarchical cells.
- Cell ID = 64-bit integer (efficient to index and range-scan).
- Variable cell size by level: from city-sized down to neighborhood-sized cells.
- Cells are ordered along a Hilbert curve, which preserves spatial locality.
Driver Indexing
- Driver location update → compute the S2 cell ID.
- Redis index (one set per cell): cell_id → set of driver_ids.
- Rider's location → cell ID plus neighboring cells.
- Fetch all drivers in those cells.
- Filter by exact distance and availability.
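The lookup steps above can be sketched with plain Python data structures. To stay self-contained, this example does not use the S2 library: a fixed latitude/longitude grid stands in for S2 cells, and a dictionary of sets stands in for Redis. `CELL_DEG`, `KM_PER_DEG`, and the class and function names are hypothetical, and the neighbor-ring calculation is an approximation that assumes locations away from the poles.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01     # hypothetical cell size: about 1.1 km of latitude
KM_PER_DEG = 111.0  # approximate km per degree of latitude


def cell_id(lat, lng):
    """Map a point to a (row, col) grid cell; a stand-in for an S2 cell ID."""
    return (math.floor(lat / CELL_DEG), math.floor(lng / CELL_DEG))


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


class DriverIndex:
    def __init__(self):
        self.cells = defaultdict(set)  # cell -> driver ids (the per-cell sets)
        self.positions = {}            # driver id -> (lat, lng, available)

    def update(self, driver_id, lat, lng, available=True):
        old = self.positions.get(driver_id)
        if old is not None:
            self.cells[cell_id(old[0], old[1])].discard(driver_id)
        self.cells[cell_id(lat, lng)].add(driver_id)
        self.positions[driver_id] = (lat, lng, available)

    def nearby(self, lat, lng, radius_km):
        """Available drivers within radius_km, nearest first."""
        km_per_cell_lat = CELL_DEG * KM_PER_DEG
        km_per_cell_lng = km_per_cell_lat * math.cos(math.radians(lat))
        rows = math.ceil(radius_km / km_per_cell_lat)
        cols = math.ceil(radius_km / km_per_cell_lng)
        row0, col0 = cell_id(lat, lng)
        found = []
        # Scan the rider's cell plus enough neighbouring cells to cover the radius.
        for r in range(row0 - rows, row0 + rows + 1):
            for c in range(col0 - cols, col0 + cols + 1):
                for driver_id in self.cells.get((r, c), ()):
                    d_lat, d_lng, available = self.positions[driver_id]
                    dist = haversine_km(lat, lng, d_lat, d_lng)
                    if available and dist <= radius_km:
                        found.append((driver_id, dist))
        return sorted(found, key=lambda pair: pair[1])
```

The cell scan is a cheap coarse filter; the exact distance check runs only on the handful of drivers that survive it, which is what keeps the query fast.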
Driver Matching Algorithm
Not just the nearest driver; many factors matter:
- Distance: start with straight-line distance, then refine to road distance.
- ETA: real-time, traffic-aware.
- Driver rating and acceptance rate.
- Vehicle type match.
- Anti-bias: driver-rider history.
An ML model ranks the candidates.
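To make the ranking idea concrete, here is a hand-weighted scoring function standing in for the ML model. It is a sketch only: the weights, the `MAX_ETA_MIN` cutoff, and the `Candidate` fields are hypothetical, and a production ranker would learn its parameters from data rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    driver_id: str
    eta_min: float          # traffic-aware ETA to the pickup point
    rating: float           # 1.0 to 5.0
    acceptance_rate: float  # 0.0 to 1.0
    vehicle_type: str


# Hypothetical weights; a real system would learn these.
W_ETA, W_RATING, W_ACCEPT = 0.6, 0.25, 0.15
MAX_ETA_MIN = 15.0


def score(c):
    """Higher is better; each term is normalised to the range 0..1."""
    eta_term = max(0.0, 1.0 - c.eta_min / MAX_ETA_MIN)
    rating_term = (c.rating - 1.0) / 4.0
    return W_ETA * eta_term + W_RATING * rating_term + W_ACCEPT * c.acceptance_rate


def rank(candidates, vehicle_type):
    """Drop drivers with the wrong vehicle type, then sort best first."""
    eligible = [c for c in candidates if c.vehicle_type == vehicle_type]
    return sorted(eligible, key=score, reverse=True)
```

With these weights, a driver 3 minutes away with a 4.8 rating can outrank one 2 minutes away with a 3.5 rating, which illustrates why "nearest" alone is not the matching criterion.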
Location Update Flow
- The driver app sends a location update over WebSocket every 4 seconds.
- The Location Service receives it and computes the S2 cell.
- Update the Redis index: remove the driver from the old cell, add to the new cell.
- Write the update to the Cassandra time series (history).
- If the driver is in a trip → notify the rider (real-time tracking).
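The flow above might look like the following handler. This is a sketch under stated assumptions: in-memory structures stand in for Redis and Cassandra, `cell_of` is an injected function standing in for the S2 cell computation, and `notify_rider` is a hypothetical callback standing in for the WebSocket push.

```python
from collections import defaultdict


class LocationService:
    def __init__(self, cell_of, notify_rider):
        self.cell_of = cell_of                # (lat, lng) -> cell id
        self.notify_rider = notify_rider      # callback(rider_id, update)
        self.cell_members = defaultdict(set)  # stand-in for the Redis index
        self.driver_cell = {}                 # driver id -> current cell
        self.history = []                     # stand-in for the Cassandra time series
        self.active_trips = {}                # driver id -> rider id

    def handle_update(self, driver_id, lat, lng, ts):
        # 1. Compute the cell and move the driver only if the cell changed.
        new_cell = self.cell_of(lat, lng)
        old_cell = self.driver_cell.get(driver_id)
        if old_cell != new_cell:
            if old_cell is not None:
                self.cell_members[old_cell].discard(driver_id)
            self.cell_members[new_cell].add(driver_id)
            self.driver_cell[driver_id] = new_cell

        # 2. Append to the location history.
        update = {"driver_id": driver_id, "lat": lat, "lng": lng, "ts": ts}
        self.history.append(update)

        # 3. If the driver is in a trip, push the update to the rider.
        rider_id = self.active_trips.get(driver_id)
        if rider_id is not None:
            self.notify_rider(rider_id, update)
```

Most consecutive updates stay inside the same cell, so skipping the index write when the cell is unchanged removes a large share of the index traffic.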
Trip Lifecycle (Saga Pattern)
- Request: the rider requests a ride; the system finds candidates.
- Match: the best driver is notified and accepts or declines.
- En route to pickup: real-time tracking.
- Pickup: the trip starts.
- In trip: live location, fare meter.
- Complete: calculate the fare, trigger payment.
- Rating: both sides rate each other.
Each step is published as an event in Kafka; when a step fails, the Saga runs compensating actions for the steps already completed.
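The lifecycle and its compensation logic can be sketched as a small state machine plus a saga runner. Everything here is illustrative: the state names, the `"cancelled"` state, and the step names in the usage note are assumptions, and a Python list stands in for the Kafka topic.

```python
# Allowed transitions between trip states (state names are assumptions).
TRANSITIONS = {
    "requested": {"matched", "cancelled"},
    "matched": {"en_route", "cancelled"},
    "en_route": {"picked_up", "cancelled"},
    "picked_up": {"in_trip"},
    "in_trip": {"completed"},
    "completed": {"rated"},
}


class Trip:
    def __init__(self, trip_id, event_log):
        self.id = trip_id
        self.status = "requested"
        self.event_log = event_log  # stand-in for a Kafka topic
        self._emit()

    def _emit(self):
        self.event_log.append({"trip_id": self.id, "status": self.status})

    def advance(self, new_status):
        if new_status not in TRANSITIONS.get(self.status, set()):
            raise ValueError(f"illegal transition {self.status} -> {new_status}")
        self.status = new_status
        self._emit()


def run_saga(steps):
    """Run (name, action, compensate) steps in order.

    If a step raises, undo the completed steps in reverse order.
    Returns (succeeded, log).
    """
    done, log = [], []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"{name}:failed")
            for prev_name, prev_compensate in reversed(done):
                prev_compensate()
                log.append(f"{prev_name}:compensated")
            return False, log
        log.append(f"{name}:ok")
        done.append((name, compensate))
    return True, log
```

For example, if a saga of "reserve driver, hold fare, charge" fails at the charge step, the runner releases the fare hold and then the driver reservation, in that order, instead of relying on a distributed transaction.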
Surge Pricing
When demand is high but supply is low, the fare goes up.
How
- Per area: track the pending request count and the available driver count.
- When the ratio crosses a threshold, apply a surge multiplier.
- Show a visual heatmap on the map.
- Compute all of this with real-time stream processing.
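The ratio-threshold rule can be sketched as a pure function over per-area counts. The threshold of 1.5, the cap of 3.0, and the linear scaling are assumptions chosen for illustration; real pricing models are more elaborate and are not described here.

```python
def surge_multiplier(pending_requests, available_drivers, threshold=1.5, cap=3.0):
    """Return the fare multiplier for one area.

    threshold and cap are hypothetical tuning values.
    """
    if pending_requests == 0:
        return 1.0
    if available_drivers == 0:
        return cap  # demand with no supply at all: charge the maximum
    ratio = pending_requests / available_drivers
    if ratio <= threshold:
        return 1.0
    # Scale linearly with how far the ratio exceeds the threshold, up to the cap.
    return min(cap, round(ratio / threshold, 2))


def surge_by_area(counts):
    """counts maps area -> (pending_requests, available_drivers)."""
    return {area: surge_multiplier(p, a) for area, (p, a) in counts.items()}
```

In a streaming setup, a function like this would run over windowed counts per cell or zone, and its output would feed both the fare calculation and the heatmap.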
Goals
- Attract drivers (more income).
- Reduce rider demand (some riders are willing to pay; some wait).
- Reach equilibrium.
Payment
- Calculate the fare at the end of the trip.
- Saga: the Payment Service charges the rider (Stripe or a local provider).
- Failed payment → retry; persistent failure → restrict the next ride.
- Driver settlement is weekly.
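The retry-then-restrict rule can be sketched as follows. `PaymentError`, `charge_with_retry`, and the injected `charge` callable are hypothetical names, not any payment provider's API. A real implementation would also add backoff between attempts and an idempotency key so that a retried charge cannot bill the rider twice.

```python
class PaymentError(Exception):
    """Raised by the (hypothetical) payment gateway when a charge fails."""


def charge_with_retry(charge, rider_id, fare, max_attempts=3):
    """Try to charge the fare up to max_attempts times.

    charge is any callable(rider_id, fare) that raises PaymentError on failure.
    On persistent failure the rider is flagged so the next ride can be restricted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            charge(rider_id, fare)
        except PaymentError:
            continue
        return {"paid": True, "attempts": attempt, "restrict_next_ride": False}
    return {"paid": False, "attempts": max_attempts, "restrict_next_ride": True}
```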
Real-time Tracking
- The rider app receives the driver's location continuously.
- Updates arrive over a WebSocket connection.
- Driver location updates are broadcast to the active rider.
- The map renders the movement smoothly.
Microservices Architecture
Uber had 2,200+ microservices (and later consolidated some).
- Trip, Driver, Rider, Location, Pricing, Payment, Notification, Maps, Search, Analytics.
- gRPC for inter-service communication.
- Kafka for event streaming.
- Schemaless: Uber's custom MySQL-based datastore.
Ringpop — Distributed Coordination
Uber's open-source library for distributed applications:
- Consistent hashing.
- Membership protocol (SWIM).
- Request forwarding.
- Use case: the driver location service ring.
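The consistent hashing idea behind such a ring can be sketched in a few lines. This is a generic hash ring with virtual nodes, not Ringpop's actual code or API: the `HashRing` class, the use of MD5, and the virtual-node count are all assumptions for illustration, and membership (SWIM) and request forwarding are left out.

```python
import bisect
import hashlib


def _hash(key):
    """64-bit hash of a string key (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes  # virtual nodes per physical node, for balance
        self._points = []     # sorted (hash, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._points = [p for p in self._points if p[1] != node]

    def owner(self, key):
        """Return the node responsible for key: first point clockwise."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect(self._points, (_hash(key), ""))
        return self._points[i % len(self._points)][1]
```

When a node leaves, only the keys it owned move to other nodes; every other driver ID keeps its owner. That property is what lets a location service ring grow or shrink without reshuffling all drivers.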
Geographic Scale
- City-level deployment.
- Each city has its own driver pool and pricing zones.
- Multi-region for disaster recovery.
- Localized map data.
Real World
- 100M+ users globally.
- 650+ cities.
- 25M+ trips/day at peak.
- Pathao, Foodpanda, and Shohoz in Bangladesh follow similar architectural patterns.
Trade-offs
- Real-time accuracy vs. network and battery cost.
- Eventually consistent location vs. immediate matching.
- ML matching latency vs. fairness.
- Surge pricing: user satisfaction vs. equilibrium.
Engineering Lessons
- Geospatial indexing is core to location services.
- Use the Saga pattern for multi-step business flows.
- Real-time plus reliability is a challenging combination.
- Event-driven architecture scales.
- Don't over-split into microservices (Uber went too far and later consolidated some).
📌 Chapter Summary
- Uber = real-time geospatial matching at scale.
- Google S2 plus a Redis-based driver index.
- WebSocket for location updates and tracking.
- Saga pattern for the trip lifecycle.
- ML matching plus surge pricing balance supply and demand.
🎉 Congratulations! You have finished all 54 chapters of System Design Bangla! You are now ready for the system design interview at any top tech company in the world. After preparing well, put your trust in Allah.