Rate Limiting
"How many requests per unit time": protection against abuse and overload.
The Twitter API gives you 300 requests per 15 minutes. The GitHub API allows 5,000/hour. Why? Because otherwise a single user or bot could bring the whole API down. That is rate limiting.
Why Rate Limiting?
- Abuse prevention: brute force, scraping, spam.
- DDoS protection: floods of malicious traffic.
- Fair usage: no single user can hog all the resources.
- Cost control: keeps the cloud bill predictable.
- Service stability: downstream services do not get overwhelmed.
- SLA enforcement: tier-based pricing.
Rate Limiting Algorithms
1. Token Bucket
Tokens accumulate in a bucket, refilled at a fixed rate. Each request consumes 1 token. When the bucket is empty, requests are blocked.
Capacity: 10 tokens
Refill: 1 token/second
Request → consume token → if 0 = REJECT
Bucket refills naturally over time
- Allows bursts (starts from a full bucket).
- Smooths the rate out on average.
- The most common choice.
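The refill-and-consume logic above can be sketched in a few lines. This is a minimal single-process sketch (class and parameter names are illustrative), using the example parameters of 10 tokens capacity and 1 token/second refill:

```python
import time

class TokenBucket:
    """Token bucket: refills at a fixed rate, each request consumes one token."""
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity        # max tokens (burst size)
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # start full, allowing an initial burst
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill based on elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False   # bucket empty: reject

bucket = TokenBucket(capacity=10, refill_rate=1)
# A burst of 10 requests drains the full bucket; further requests
# are rejected until the bucket refills over time.
```

Note that the bucket is never refilled by a timer; the elapsed time since the last request is enough to compute the current token count, which keeps the state O(1) per user.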
2. Leaky Bucket
Requests accumulate in a bucket and leak out (get processed) at a fixed rate. If the bucket overflows, requests are rejected.
- Constant outflow rate.
- Smooths out bursts.
- Works as a queue.
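The queue-and-leak behavior can be sketched as follows; a minimal single-process sketch (names are illustrative), tracking only the queue depth rather than an actual queue of requests:

```python
import time

class LeakyBucket:
    """Leaky bucket: incoming requests queue up and leak out at a fixed rate."""
    def __init__(self, capacity, leak_rate):
        self.capacity = capacity    # max queued requests before overflow
        self.leak_rate = leak_rate  # requests processed (leaked) per second
        self.queued = 0
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Whole requests leaked out since the last call
        leaked = int((now - self.last) * self.leak_rate)
        if leaked:
            self.queued = max(0, self.queued - leaked)
            self.last = now
        if self.queued < self.capacity:
            self.queued += 1   # request enters the queue
            return True
        return False           # bucket overflow: reject
```

Unlike the token bucket, a burst larger than the bucket's capacity is rejected outright; accepted requests drain out at the constant leak rate, which is what makes the output rate smooth.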
3. Fixed Window Counter
A counter per time window (e.g., 1 minute). Once the threshold is crossed, requests are rejected. The counter resets at the end of the window.
- Simple.
- Problem: a double burst at the window boundary (12:59-1:00 plus 1:00-1:01 can together pass 2× the limit).
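The counter-per-window idea is only a few lines; a minimal sketch (names are illustrative):

```python
import time

class FixedWindowCounter:
    """Fixed window: one counter per time window, reset at the boundary."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.count = 0
        self.window_start = time.monotonic()

    def allow(self):
        now = time.monotonic()
        if now - self.window_start >= self.window:
            self.window_start = now   # new window: reset the counter
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```

The boundary problem is visible in the reset: a client can spend the full limit just before the reset and the full limit again just after it, so within one window-length of wall-clock time it gets 2× the intended rate.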
4. Sliding Window Log
Log the timestamp of every request and count those within the last N seconds.
- Most accurate.
- Memory-expensive.
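The timestamp log can be sketched with a deque; a minimal sketch (names are illustrative). The memory cost is visible directly: up to one stored timestamp per allowed request in the window:

```python
import time
from collections import deque

class SlidingWindowLog:
    """Sliding window log: one timestamp per request, count those in the window."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.log = deque()   # timestamps of accepted requests

    def allow(self):
        now = time.monotonic()
        # Drop timestamps that have slid out of the window
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False
```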
5. Sliding Window Counter
The current fixed window's count plus a weighted contribution from the previous window. Approximate but efficient.
- Used at Cloudflare.
- Memory-efficient.
- Reasonable accuracy.
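The weighted estimate is: previous window's count × the fraction of the previous window still inside the sliding window, plus the current window's count. A minimal sketch of this (names are illustrative):

```python
import time

class SlidingWindowCounter:
    """Sliding window counter: current count plus a weighted share of the
    previous window's count, approximating a true sliding window in O(1) memory."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.curr_start = time.monotonic()
        self.curr = 0   # count in the current fixed window
        self.prev = 0   # count in the previous fixed window

    def allow(self):
        now = time.monotonic()
        elapsed = now - self.curr_start
        if elapsed >= self.window:
            # Roll windows forward; if more than one window passed, prev is empty
            self.prev = self.curr if elapsed < 2 * self.window else 0
            self.curr = 0
            self.curr_start += (elapsed // self.window) * self.window
            elapsed = now - self.curr_start
        # Fraction of the previous window still inside the sliding window
        weight = (self.window - elapsed) / self.window
        estimated = self.prev * weight + self.curr
        if estimated < self.limit:
            self.curr += 1
            return True
        return False
```

Two counters per user instead of a full timestamp log, which is the memory-versus-accuracy tradeoff the comparison below refers to.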
Algorithm Comparison
Token Bucket
- Allows bursts
- Smooth average rate
- Most flexible
- Memory: O(1) per user
Leaky Bucket
- Constant rate
- Smooth output
- Queue-based
- Less burst-friendly
Fixed Window
- Simplest
- Boundary issue
- Memory: O(1)
- Approximate
Sliding Window
- Accurate
- Memory tradeoff
- Complex
- Counter variant is efficient
Rate Limit by What?
- IP address: DDoS protection.
- User ID: authenticated users.
- API key: SaaS, tier-based limits.
- Endpoint: stricter limits on expensive endpoints.
- Combined: multi-layer limits.
Distributed Rate Limiting
An in-memory counter on a single server is easy. But across 10 servers?
Problem
- Each server keeps its own count, so the effective total becomes 10× the allowed limit.
- Coordination is needed.
Solutions
Centralized counter (Redis)
- Every server INCRs a shared key in Redis.
- INCR is an atomic operation.
- Costs the latency of a network hop.
Sticky session
- The same user always lands on the same server.
- Allows a plain in-memory counter.
- State is lost if the server fails.
Consistent hashing
- Each user maps to a specific limiter node.
- Efficient in a distributed setup.
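The centralized Redis counter above can be sketched as a fixed-window counter keyed per user per window. This is a sketch, not a production implementation: `FakeRedis` is an in-memory stand-in so the code runs without a server; a real deployment would use a Redis client (e.g., redis-py), whose `incr` and `expire` calls this mirrors:

```python
import time

class FakeRedis:
    """In-memory stand-in for Redis supporting only INCR and EXPIRE."""
    def __init__(self):
        self.store = {}   # key -> [count, expiry_timestamp or None]

    def incr(self, key):
        entry = self.store.get(key)
        if entry is None or (entry[1] is not None and time.time() >= entry[1]):
            entry = [0, None]   # missing or expired: start fresh
        entry[0] += 1
        self.store[key] = entry
        return entry[0]

    def expire(self, key, seconds):
        if key in self.store:
            self.store[key][1] = time.time() + seconds

def is_allowed(r, user_id, limit, window):
    """Every app server INCRs the same per-user, per-window key, so the
    count is global. INCR is atomic in Redis, so concurrent servers cannot race."""
    key = f"rate:{user_id}:{int(time.time() // window)}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window)   # let Redis clean the key up after the window
    return count <= limit
```

Because the key embeds the window number, old counters expire on their own; the price, as noted above, is one network round trip to Redis per request.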
Response Strategy
HTTP 429 Too Many Requests
HTTP/1.1 429 Too Many Requests
Retry-After: 60
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1700000000
{"error": "Rate limit exceeded"}
Behavior options
- Block: reject the extra requests.
- Throttle: delay the response.
- Queue: process later.
- Shape: serve at lower priority.
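On the client side, the Retry-After header shown above is what a well-behaved caller should honor. A minimal sketch, where `send` is a hypothetical callable returning a (status, headers) pair:

```python
import time

def call_with_retry(send, max_retries=3):
    """Retry on HTTP 429, sleeping for the server-advertised Retry-After."""
    for _ in range(max_retries):
        status, headers = send()
        if status != 429:
            return status
        # Retry-After is in seconds; fall back to 1s if the header is missing
        time.sleep(float(headers.get("Retry-After", 1)))
    return status   # still rate-limited after all retries
```

Honoring Retry-After (instead of hammering the endpoint in a tight loop) is exactly why returning it alongside the 429 matters.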
Implementation Layers
- API Gateway: Kong, AWS API Gateway, Cloudflare।
- Reverse proxy: NGINX limit_req।
- Service Mesh: Istio।
- Application code: libraries (e.g., express-rate-limit).
- Cloudflare/CDN: at the edge.
Real-World Examples
- GitHub API: 5,000 req/hour authenticated; 60 unauthenticated.
- Twitter API: 300 req/15 min.
- Stripe: roughly 100 req/sec, adjusted dynamically.
- Reddit: 60 req/minute with OAuth.
- Banking: aggressive limits, coupled with fraud detection.
Common Misconceptions
- "An in-memory counter is sufficient": only on a single server; not in a distributed setup.
- "A fixed window is enough": the boundary issue makes 2× bursts possible.
- "Block silently": inform clients with a 429 plus rate-limit headers.
- "The same limit for every endpoint": expensive endpoints need stricter limits.
Best Practices
- Use multiple layers (IP + user + API key).
- Token bucket is the default choice.
- Distributed: a centralized counter in Redis.
- Return HTTP 429 with Retry-After and RateLimit headers.
- Different limits for different endpoints.
- Whitelist trusted IPs.
- Monitoring and alerting.
📌 Chapter Summary
- Rate limiting protects against abuse and overload and enforces fair usage.
- Algorithms: token bucket, leaky bucket, fixed/sliding window.
- In distributed setups, use a Redis-based centralized counter.
- Respond with HTTP 429 plus Retry-After.
- API gateways and Cloudflare are common implementation points.