If your API endpoints don't have rate limiting, they will eventually be abused: by scrapers, malicious actors, or even frontend bugs firing duplicate requests. In a real project, we built a rate limiter that started as an in-memory Map and upgraded smoothly to Redis. Here's the design.
The Simplest Approach: In-Memory Map
The core idea is to use a Map tracking request counts and time windows per IP:
interface RateLimitEntry {
  count: number;
  resetTime: number;
}

const store = new Map<string, RateLimitEntry>();
const WINDOW_MS = 60 * 1000; // 1-minute window
const MAX_REQUESTS = 60;     // 60 requests per window

function isRateLimited(identifier: string): boolean {
  const now = Date.now();
  const entry = store.get(identifier);

  if (!entry || now > entry.resetTime) {
    store.set(identifier, { count: 1, resetTime: now + WINDOW_MS });
    return false;
  }

  entry.count++;
  return entry.count > MAX_REQUESTS;
}
Usage in an API route:
export async function POST(request: Request) {
  const ip = request.headers.get('x-forwarded-for') || 'unknown';

  if (isRateLimited(ip)) {
    return Response.json(
      { error: 'Too many requests' },
      { status: 429 }
    );
  }

  // Business logic...
}
Getting the Real Client IP
Retrieving the real client IP is trickier than it seems. Behind a reverse proxy (Nginx, a CDN), the connection's remote address belongs to the proxy, not the client, so you have to read the forwarding headers in priority order:
function getClientIP(request: Request): string {
  const forwarded = request.headers.get('x-forwarded-for');
  if (forwarded) {
    // Take the first IP (closest to the client)
    return forwarded.split(',')[0].trim();
  }

  const realIP = request.headers.get('x-real-ip');
  if (realIP) return realIP;

  return 'unknown';
}
Note: x-forwarded-for can be spoofed. In production, make sure your reverse proxy overwrites this header rather than appending to it.
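When you know exactly how many trusted reverse proxies sit in front of the app, a more spoof-resistant approach is to count from the right of the header rather than the left. The sketch below assumes a deployment-specific constant `TRUSTED_HOPS` (1 = a single Nginx/CDN layer) and a hypothetical helper name, neither of which is part of the code above:

```typescript
// Resolve the client IP from X-Forwarded-For, counting from the right.
// TRUSTED_HOPS is an assumed deployment constant: the number of proxies
// you control that each append exactly one address to the header.
const TRUSTED_HOPS = 1;

function clientIPFromForwarded(forwarded: string | null): string {
  if (!forwarded) return 'unknown';
  const ips = forwarded.split(',').map((s) => s.trim()).filter(Boolean);
  // The entry TRUSTED_HOPS from the right was appended by the last proxy
  // you trust; anything left of it arrived in the request and may be spoofed.
  return ips[ips.length - TRUSTED_HOPS] ?? 'unknown';
}
```

With one trusted hop, a request carrying a forged prefix like `1.2.3.4, 5.6.7.8` resolves to `5.6.7.8`, the address the proxy actually saw.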
Preventing Memory Leaks
The biggest risk with in-memory storage is unbounded Map growth. Every unique IP adds an entry, and memory climbs steadily over time. You must periodically purge expired entries:
// Clean up every 5 minutes
setInterval(() => {
  const now = Date.now();
  for (const [key, entry] of store) {
    if (now > entry.resetTime) {
      store.delete(key);
    }
  }
}, 5 * 60 * 1000);
This seems obvious but is easy to forget. In one of our projects, we missed the cleanup and watched memory grow from 200MB to 1.2GB over a month.
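Interval sweeps can also lag behind a sudden burst of unique IPs. As a belt-and-braces measure, you can additionally cap the store's size outright. This is a sketch, not part of the code above: `enforceCap` is a hypothetical helper and the default cap is an assumption to tune per deployment.

```typescript
interface RateLimitEntry {
  count: number;
  resetTime: number;
}

// Evict the earliest-inserted entries once the store exceeds a hard cap.
// Under a fixed window, insertion order roughly tracks expiration order,
// so this removes the entries most likely to be stale anyway.
function enforceCap(
  store: Map<string, RateLimitEntry>,
  maxEntries: number = 100_000 // assumed cap, tune per deployment
): void {
  if (store.size <= maxEntries) return;
  let excess = store.size - maxEntries;
  for (const key of store.keys()) {
    store.delete(key); // deleting during Map iteration is safe in JS
    if (--excess === 0) break;
  }
}
```

Calling this right after each `store.set(...)` bounds memory even if the sweep timer falls behind.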
Tiered Rate Limits
In practice, different endpoints need different limits:
const RATE_LIMITS = {
  login:  { window: 5 * 60 * 1000, max: 5 },   // Login: 5 per 5 min
  api:    { window: 60 * 1000,     max: 100 }, // General: 100 per min
  upload: { window: 60 * 1000,     max: 10 },  // Upload: 10 per min
};

function isRateLimited(ip: string, type: keyof typeof RATE_LIMITS): boolean {
  const { window, max } = RATE_LIMITS[type];
  const key = `${type}:${ip}`;
  // Same logic as before, with the per-tier window and max
  const now = Date.now();
  const entry = store.get(key);
  if (!entry || now > entry.resetTime) {
    store.set(key, { count: 1, resetTime: now + window });
    return false;
  }
  entry.count++;
  return entry.count > max;
}
Login endpoints need strict limits (brute-force prevention), upload endpoints need throttling (resource abuse), and general queries can be more lenient.
When to Upgrade to Redis
The in-memory approach falls short when:
- Multiple instances: each process has its own Map, so a user whose requests spread across N instances effectively gets N times the limit, and a determined client can exploit that deliberately.
- Restarts reset counters: A service restart clears all counts, temporarily disabling rate limiting.
- Memory pressure: Under high concurrency with many unique IPs, in-memory storage becomes expensive.
Upgrading to Redis requires minimal changes—same logic, different storage:
import Redis from 'ioredis';

const redis = new Redis();

async function isRateLimited(ip: string): Promise<boolean> {
  const key = `rate:${ip}`;
  const count = await redis.incr(key);

  if (count === 1) {
    await redis.expire(key, 60); // 60-second TTL
  }

  return count > MAX_REQUESTS;
}
Redis's INCR + EXPIRE combination natively handles counting and automatic expiration, with no manual cleanup needed. And since all instances share one Redis, the limit applies globally. One caveat: INCR and EXPIRE are two separate commands, so a crash between them can leave a counter without a TTL; if that matters, wrap the pair in MULTI/EXEC or a short Lua script to make them atomic.
Response Headers Done Right
A well-behaved Rate Limiter communicates its state through response headers:
const headers = {
  'X-RateLimit-Limit': String(MAX_REQUESTS),
  'X-RateLimit-Remaining': String(Math.max(0, MAX_REQUESTS - count)),
  'X-RateLimit-Reset': String(Math.ceil(resetTime / 1000)), // Unix seconds
};
With this information, frontends can adapt their request patterns (e.g., throttle when approaching the limit) instead of waiting for a 429 to find out.
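As a sketch of the client side, a pure helper can turn those headers into a wait time before the next request. The remaining-quota threshold of 5 is an assumed policy and `backoffMs` is a hypothetical name, not something from the server code above:

```typescript
// Decide how long (ms) to wait before the next request, based on the
// X-RateLimit-* headers from the server example. Returns 0 when quota is
// plentiful or the headers are absent.
function backoffMs(headers: Headers, now: number = Date.now()): number {
  const remainingStr = headers.get('X-RateLimit-Remaining');
  const resetStr = headers.get('X-RateLimit-Reset');
  if (remainingStr === null || resetStr === null) return 0;

  const remaining = Number(remainingStr);
  const resetSec = Number(resetStr); // Unix seconds, per the server example
  if (!Number.isFinite(remaining) || !Number.isFinite(resetSec)) return 0;

  if (remaining > 5) return 0; // assumed threshold: back off below 5 left
  return Math.max(0, resetSec * 1000 - now); // wait until the window resets
}
```

A fetch wrapper can then `await new Promise((r) => setTimeout(r, backoffMs(res.headers)))` before issuing the next call, keeping well-behaved clients out of 429 territory entirely.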
Takeaways
Rate limiting isn't complex to implement, but choosing the right approach matters. Single-instance apps can use an in-memory Map with expiration cleanup; multi-instance deployments need Redis. Regardless of the approach, the core flow is: identify the client → count requests → check the threshold → return appropriate status codes and headers.