If your API endpoints don't have rate limiting, they will eventually be abused: by scrapers, malicious actors, or even frontend bugs firing duplicate requests. In a real project, we built a rate limiter that started as an in-memory Map and upgraded smoothly to Redis. Here's the design.
The Simplest Approach: In-Memory Map
The core idea is to use a Map tracking request counts and time windows per IP:
interface RateLimitEntry {
  count: number;
  resetTime: number;
}

const store = new Map<string, RateLimitEntry>();
const WINDOW_MS = 60 * 1000; // 1-minute window
const MAX_REQUESTS = 60;     // 60 requests per window

function isRateLimited(identifier: string): boolean {
  const now = Date.now();
  const entry = store.get(identifier);

  if (!entry || now > entry.resetTime) {
    store.set(identifier, { count: 1, resetTime: now + WINDOW_MS });
    return false;
  }

  entry.count++;
  return entry.count > MAX_REQUESTS;
}
Usage in an API route:
export async function POST(request: Request) {
  const ip = request.headers.get('x-forwarded-for') || 'unknown';

  if (isRateLimited(ip)) {
    return Response.json(
      { error: 'Too many requests' },
      { status: 429 }
    );
  }

  // Business logic...
}
Getting the Real Client IP
Retrieving the real client IP is trickier than it seems. Behind a reverse proxy (Nginx, a CDN), the connection's remote address belongs to the proxy, not the client, so you have to read the forwarding headers in priority order:
function getClientIP(request: Request): string {
  const forwarded = request.headers.get('x-forwarded-for');
  if (forwarded) {
    // Take the first IP (closest to the client)
    return forwarded.split(',')[0].trim();
  }

  const realIP = request.headers.get('x-real-ip');
  if (realIP) return realIP;

  return 'unknown';
}
Note: x-forwarded-for can be spoofed. In production, make sure your reverse proxy overwrites this header rather than appending to it.
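When you know exactly how many trusted reverse proxies sit in front of the app, a more spoof-resistant approach is to count from the right of the header rather than the left. The sketch below assumes a deployment-specific constant `TRUSTED_HOPS` (1 = a single Nginx/CDN layer) and a hypothetical helper name, neither of which is part of the code above:

```typescript
// Resolve the client IP from X-Forwarded-For, counting from the right.
// TRUSTED_HOPS is an assumed deployment constant: the number of proxies
// you control that each append exactly one address to the header.
const TRUSTED_HOPS = 1;

function clientIPFromForwarded(forwarded: string | null): string {
  if (!forwarded) return 'unknown';
  const ips = forwarded.split(',').map((s) => s.trim()).filter(Boolean);
  // The entry TRUSTED_HOPS from the right was appended by the last proxy
  // you trust; anything left of it arrived in the request and may be spoofed.
  return ips[ips.length - TRUSTED_HOPS] ?? 'unknown';
}
```

With one trusted hop, a request carrying a forged prefix like `1.2.3.4, 5.6.7.8` resolves to `5.6.7.8`, the address the proxy actually saw.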
Preventing Memory Leaks
The biggest risk with in-memory storage is unbounded Map growth. Every unique IP adds an entry, and memory climbs steadily over time. You must periodically purge expired entries:
// Clean up every 5 minutes
setInterval(() => {
  const now = Date.now();
  for (const [key, entry] of store) {
    if (now > entry.resetTime) {
      store.delete(key);
    }
  }
}, 5 * 60 * 1000);
This seems obvious but is easy to forget. In one of our projects, we missed the cleanup and watched memory grow from 200MB to 1.2GB over a month.
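Interval sweeps can also lag behind a sudden burst of unique IPs. As a belt-and-braces measure, you can additionally cap the store's size outright. This is a sketch, not part of the code above: `enforceCap` is a hypothetical helper and the default cap is an assumption to tune per deployment.

```typescript
interface RateLimitEntry {
  count: number;
  resetTime: number;
}

// Evict the earliest-inserted entries once the store exceeds a hard cap.
// Under a fixed window, insertion order roughly tracks expiration order,
// so this removes the entries most likely to be stale anyway.
function enforceCap(
  store: Map<string, RateLimitEntry>,
  maxEntries: number = 100_000 // assumed cap, tune per deployment
): void {
  if (store.size <= maxEntries) return;
  let excess = store.size - maxEntries;
  for (const key of store.keys()) {
    store.delete(key); // deleting during Map iteration is safe in JS
    if (--excess === 0) break;
  }
}
```

Calling this right after each `store.set(...)` bounds memory even if the sweep timer falls behind.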
Tiered Rate Limits
In practice, different endpoints need different limits:
const RATE_LIMITS = {
  login:  { window: 5 * 60 * 1000, max: 5 },   // Login: 5 per 5 min
  api:    { window: 60 * 1000,     max: 100 }, // General: 100 per min
  upload: { window: 60 * 1000,     max: 10 },  // Upload: 10 per min
};

function isRateLimited(ip: string, type: keyof typeof RATE_LIMITS): boolean {
  const { window, max } = RATE_LIMITS[type];
  const key = `${type}:${ip}`;
  // Same logic as before, with the per-tier window and max
  const now = Date.now();
  const entry = store.get(key);
  if (!entry || now > entry.resetTime) {
    store.set(key, { count: 1, resetTime: now + window });
    return false;
  }
  entry.count++;
  return entry.count > max;
}
Login endpoints need strict limits (brute-force prevention), upload endpoints need throttling (resource abuse), and general queries can be more lenient.
When to Upgrade to Redis
The in-memory approach falls short when:
- Multiple instances: each process has its own Map, so a user whose requests spread across N instances effectively gets N times the limit, and a determined client can exploit that deliberately.
- Restarts reset counters: A service restart clears all counts, temporarily disabling rate limiting.
- Memory pressure: Under high concurrency with many unique IPs, in-memory storage becomes expensive.
Upgrading to Redis requires minimal changes—same logic, different storage:
import Redis from 'ioredis';

const redis = new Redis();

async function isRateLimited(ip: string): Promise<boolean> {
  const key = `rate:${ip}`;
  const count = await redis.incr(key);

  if (count === 1) {
    await redis.expire(key, 60); // 60-second TTL
  }

  return count > MAX_REQUESTS;
}
Redis's INCR + EXPIRE combination natively handles counting and automatic expiration, with no manual cleanup needed. And since all instances share one Redis, the limit applies globally. One caveat: INCR and EXPIRE are two separate commands, so a crash between them can leave a counter without a TTL; if that matters, wrap the pair in MULTI/EXEC or a short Lua script to make them atomic.
Response Headers Done Right
A well-behaved Rate Limiter communicates its state through response headers:
const headers = {
  'X-RateLimit-Limit': String(MAX_REQUESTS),
  'X-RateLimit-Remaining': String(Math.max(0, MAX_REQUESTS - count)),
  'X-RateLimit-Reset': String(Math.ceil(resetTime / 1000)), // Unix seconds
};
With this information, frontends can adapt their request patterns (e.g., throttle when approaching the limit) instead of waiting for a 429 to find out.
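As a sketch of the client side, a pure helper can turn those headers into a wait time before the next request. The remaining-quota threshold of 5 is an assumed policy and `backoffMs` is a hypothetical name, not something from the server code above:

```typescript
// Decide how long (ms) to wait before the next request, based on the
// X-RateLimit-* headers from the server example. Returns 0 when quota is
// plentiful or the headers are absent.
function backoffMs(headers: Headers, now: number = Date.now()): number {
  const remainingStr = headers.get('X-RateLimit-Remaining');
  const resetStr = headers.get('X-RateLimit-Reset');
  if (remainingStr === null || resetStr === null) return 0;

  const remaining = Number(remainingStr);
  const resetSec = Number(resetStr); // Unix seconds, per the server example
  if (!Number.isFinite(remaining) || !Number.isFinite(resetSec)) return 0;

  if (remaining > 5) return 0; // assumed threshold: back off below 5 left
  return Math.max(0, resetSec * 1000 - now); // wait until the window resets
}
```

A fetch wrapper can then `await new Promise((r) => setTimeout(r, backoffMs(res.headers)))` before issuing the next call, keeping well-behaved clients out of 429 territory entirely.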
Takeaways
Rate limiting isn't complex to implement, but choosing the right approach matters. Single-instance apps can use an in-memory Map with expiration cleanup; multi-instance deployments need Redis. Regardless of the approach, the core flow is: identify the client → count requests → check the threshold → return appropriate status codes and headers.