Rate Limiting
"How many requests per unit time": protection against abuse and overload.
The Twitter API gives you 300 requests per 15 minutes. The GitHub API allows 5,000/hour. Why? Because otherwise a single user or bot could bring the whole API down. That is rate limiting.
Why Rate Limiting?
- Abuse prevention: brute force, scraping, spam.
- DDoS protection: floods of malicious traffic.
- Fair usage: no single user can hog all the resources.
- Cost control: keeps the cloud bill predictable.
- Service stability: downstream services do not get overwhelmed.
- SLA enforcement: tier-based pricing.
Rate Limiting Algorithms
1. Token Bucket
Tokens accumulate in a bucket, refilled at a fixed rate. Each request consumes 1 token. When the bucket is empty, requests are blocked.
Capacity: 10 tokens
Refill: 1 token/second
Request → consume token → if 0 = REJECT
Bucket refills naturally over time
- Allows bursts (starts from a full bucket).
- Smooths the rate out on average.
- The most common choice.
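The refill-and-consume logic above can be sketched in a few lines. This is a minimal single-process sketch (class and parameter names are illustrative), using the example parameters of 10 tokens capacity and 1 token/second refill:

```python
import time

class TokenBucket:
    """Token bucket: refills at a fixed rate, each request consumes one token."""
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity        # max tokens (burst size)
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # start full, allowing an initial burst
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill based on elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False   # bucket empty: reject

bucket = TokenBucket(capacity=10, refill_rate=1)
# A burst of 10 requests drains the full bucket; further requests
# are rejected until the bucket refills over time.
```

Note that the bucket is never refilled by a timer; the elapsed time since the last request is enough to compute the current token count, which keeps the state O(1) per user.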
2. Leaky Bucket
Requests accumulate in a bucket and leak out (get processed) at a fixed rate. If the bucket overflows, requests are rejected.
- Constant outflow rate.
- Smooths out bursts.
- Works as a queue.
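The queue-and-leak behavior can be sketched as follows; a minimal single-process sketch (names are illustrative), tracking only the queue depth rather than an actual queue of requests:

```python
import time

class LeakyBucket:
    """Leaky bucket: incoming requests queue up and leak out at a fixed rate."""
    def __init__(self, capacity, leak_rate):
        self.capacity = capacity    # max queued requests before overflow
        self.leak_rate = leak_rate  # requests processed (leaked) per second
        self.queued = 0
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Whole requests leaked out since the last call
        leaked = int((now - self.last) * self.leak_rate)
        if leaked:
            self.queued = max(0, self.queued - leaked)
            self.last = now
        if self.queued < self.capacity:
            self.queued += 1   # request enters the queue
            return True
        return False           # bucket overflow: reject
```

Unlike the token bucket, a burst larger than the bucket's capacity is rejected outright; accepted requests drain out at the constant leak rate, which is what makes the output rate smooth.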
3. Fixed Window Counter
A counter per time window (e.g., 1 minute). Once the threshold is crossed, requests are rejected. The counter resets at the end of the window.
- Simple.
- Problem: a double burst at the window boundary (12:59-1:00 plus 1:00-1:01 can together pass 2× the limit).
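The counter-per-window idea is only a few lines; a minimal sketch (names are illustrative):

```python
import time

class FixedWindowCounter:
    """Fixed window: one counter per time window, reset at the boundary."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.count = 0
        self.window_start = time.monotonic()

    def allow(self):
        now = time.monotonic()
        if now - self.window_start >= self.window:
            self.window_start = now   # new window: reset the counter
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```

The boundary problem is visible in the reset: a client can spend the full limit just before the reset and the full limit again just after it, so within one window-length of wall-clock time it gets 2× the intended rate.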
4. Sliding Window Log
Log the timestamp of every request and count those within the last N seconds.
- Most accurate.
- Memory-expensive.
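The timestamp log can be sketched with a deque; a minimal sketch (names are illustrative). The memory cost is visible directly: up to one stored timestamp per allowed request in the window:

```python
import time
from collections import deque

class SlidingWindowLog:
    """Sliding window log: one timestamp per request, count those in the window."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.log = deque()   # timestamps of accepted requests

    def allow(self):
        now = time.monotonic()
        # Drop timestamps that have slid out of the window
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False
```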
5. Sliding Window Counter
The current fixed window's count plus a weighted contribution from the previous window. Approximate but efficient.
- Used at Cloudflare.
- Memory-efficient.
- Reasonable accuracy.
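The weighted estimate is: previous window's count × the fraction of the previous window still inside the sliding window, plus the current window's count. A minimal sketch of this (names are illustrative):

```python
import time

class SlidingWindowCounter:
    """Sliding window counter: current count plus a weighted share of the
    previous window's count, approximating a true sliding window in O(1) memory."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.curr_start = time.monotonic()
        self.curr = 0   # count in the current fixed window
        self.prev = 0   # count in the previous fixed window

    def allow(self):
        now = time.monotonic()
        elapsed = now - self.curr_start
        if elapsed >= self.window:
            # Roll windows forward; if more than one window passed, prev is empty
            self.prev = self.curr if elapsed < 2 * self.window else 0
            self.curr = 0
            self.curr_start += (elapsed // self.window) * self.window
            elapsed = now - self.curr_start
        # Fraction of the previous window still inside the sliding window
        weight = (self.window - elapsed) / self.window
        estimated = self.prev * weight + self.curr
        if estimated < self.limit:
            self.curr += 1
            return True
        return False
```

Two counters per user instead of a full timestamp log, which is the memory-versus-accuracy tradeoff the comparison below refers to.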
Algorithm Comparison
Token Bucket
- Allows bursts
- Smooth average rate
- Most flexible
- Memory: O(1) per user
Leaky Bucket
- Constant rate
- Smooth output
- Queue-based
- Less burst-friendly
Fixed Window
- Simplest
- Boundary issue
- Memory: O(1)
- Approximate
Sliding Window
- Accurate
- Memory tradeoff
- Complex
- Counter variant is efficient
Rate Limit by What?
- IP address: DDoS protection.
- User ID: authenticated users.
- API key: SaaS, tier-based limits.
- Endpoint: stricter limits on expensive endpoints.
- Combined: multi-layer limits.
Distributed Rate Limiting
An in-memory counter on a single server is easy. But across 10 servers?
Problem
- Each server keeps its own count, so the effective total becomes 10× the allowed limit.
- Coordination is needed.
Solutions
Centralized counter (Redis)
- Every server INCRs a shared key in Redis.
- INCR is an atomic operation.
- Costs the latency of a network hop.
Sticky session
- The same user always lands on the same server.
- Allows a plain in-memory counter.
- State is lost if the server fails.
Consistent hashing
- Each user maps to a specific limiter node.
- Efficient in a distributed setup.
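The centralized Redis counter above can be sketched as a fixed-window counter keyed per user per window. This is a sketch, not a production implementation: `FakeRedis` is an in-memory stand-in so the code runs without a server; a real deployment would use a Redis client (e.g., redis-py), whose `incr` and `expire` calls this mirrors:

```python
import time

class FakeRedis:
    """In-memory stand-in for Redis supporting only INCR and EXPIRE."""
    def __init__(self):
        self.store = {}   # key -> [count, expiry_timestamp or None]

    def incr(self, key):
        entry = self.store.get(key)
        if entry is None or (entry[1] is not None and time.time() >= entry[1]):
            entry = [0, None]   # missing or expired: start fresh
        entry[0] += 1
        self.store[key] = entry
        return entry[0]

    def expire(self, key, seconds):
        if key in self.store:
            self.store[key][1] = time.time() + seconds

def is_allowed(r, user_id, limit, window):
    """Every app server INCRs the same per-user, per-window key, so the
    count is global. INCR is atomic in Redis, so concurrent servers cannot race."""
    key = f"rate:{user_id}:{int(time.time() // window)}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window)   # let Redis clean the key up after the window
    return count <= limit
```

Because the key embeds the window number, old counters expire on their own; the price, as noted above, is one network round trip to Redis per request.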
Response Strategy
HTTP 429 Too Many Requests
HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1700000000
{"error": "Rate limit exceeded"}
Behavior options
- Block: reject the extra requests.
- Throttle: delay the response.
- Queue: process later.
- Shape: serve at lower priority.
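On the client side, the Retry-After header shown above is what a well-behaved caller should honor. A minimal sketch, where `send` is a hypothetical callable returning a (status, headers) pair:

```python
import time

def call_with_retry(send, max_retries=3):
    """Retry on HTTP 429, sleeping for the server-advertised Retry-After."""
    for _ in range(max_retries):
        status, headers = send()
        if status != 429:
            return status
        # Retry-After is in seconds; fall back to 1s if the header is missing
        time.sleep(float(headers.get("Retry-After", 1)))
    return status   # still rate-limited after all retries
```

Honoring Retry-After (instead of hammering the endpoint in a tight loop) is exactly why returning it alongside the 429 matters.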
Implementation Layers
- API Gateway: Kong, AWS API Gateway, Cloudflare।
- Reverse proxy: NGINX limit_req।
- Service Mesh: Istio।
- Application code: libraries (e.g., express-rate-limit).
- Cloudflare/CDN: at the edge.
Real-World Examples
- GitHub API: 5,000 req/hour authenticated; 60 unauthenticated.
- Twitter API: 300 req/15 min.
- Stripe: roughly 100 req/sec, adjusted dynamically.
- Reddit: 60 req/minute with OAuth.
- Banking: aggressive limits, coupled with fraud detection.
Common Misconceptions
- "An in-memory counter is sufficient": only on a single server; not in a distributed setup.
- "A fixed window is enough": the boundary issue makes 2× bursts possible.
- "Block silently": inform clients with a 429 plus rate-limit headers.
- "The same limit for every endpoint": expensive endpoints need stricter limits.
Best Practices
- Use multiple layers (IP + user + API key).
- Token bucket is the default choice.
- Distributed: a centralized counter in Redis.
- Return HTTP 429 with Retry-After and RateLimit headers.
- Different limits for different endpoints.
- Whitelist trusted IPs.
- Monitoring and alerting.
📌 Chapter Summary
- Rate limiting protects against abuse and overload and enforces fair usage.
- Algorithms: token bucket, leaky bucket, fixed/sliding window.
- In distributed setups, use a Redis-based centralized counter.
- Respond with HTTP 429 plus Retry-After.
- API gateways and Cloudflare are common implementation points.