Consider a rate limit of 100 requests per minute using a fixed window approach. At 11:59:59, a client sends 100 requests—all allowed. At 12:00:01, the window resets, and the client sends another 100 requests—also allowed. 200 requests in 2 seconds, despite a limit of 100/minute.
This is the boundary problem of fixed window rate limiting—clients can exploit window boundaries to effectively double their allowed rate.
The Sliding Window Algorithm elegantly solves this problem by considering a continuously moving time window rather than fixed intervals. It's the rate limiting approach of choice when you need precise, predictable limits without exploitable edge cases.
By the end of this page, you will understand fixed window limitations, sliding window log and counter variants, implementation trade-offs, and when to choose sliding window over token bucket. You'll be equipped to implement sliding window rate limiting in production.
Before diving into sliding windows, let's understand why simpler approaches fall short. The fixed window approach divides time into discrete intervals and counts requests within each.
```typescript
class FixedWindowRateLimiter {
  private windowDurationMs: number;
  private maxRequests: number;
  private counts: Map<string, { count: number; windowStart: number }>;

  constructor(windowDurationMs: number, maxRequests: number) {
    this.windowDurationMs = windowDurationMs;
    this.maxRequests = maxRequests;
    this.counts = new Map();
  }

  tryConsume(key: string): boolean {
    const now = Date.now();
    const windowStart = Math.floor(now / this.windowDurationMs) * this.windowDurationMs;

    let entry = this.counts.get(key);

    // Reset if we're in a new window
    if (!entry || entry.windowStart !== windowStart) {
      entry = { count: 0, windowStart };
      this.counts.set(key, entry);
    }

    if (entry.count < this.maxRequests) {
      entry.count++;
      return true;
    }
    return false;
  }
}

// Problem demonstration:
// Limit: 100 requests per minute
// 11:59:59.500 - Send 100 requests ✓ (window: 11:59:00 - 11:59:59)
// 12:00:00.500 - Send 100 requests ✓ (window: 12:00:00 - 12:00:59)
// Result: 200 requests in 1 second, despite 100/minute limit!
```

At window boundaries, fixed window allows up to 2x the intended rate. For a 100 req/min limit, clients can achieve 200 req/min by timing requests around the boundary. This can overload systems designed for the stated limit.
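The comment block above describes the exploit abstractly; the same scenario can also be run deterministically with a simplified single-client limiter that takes an injectable clock (the clock parameter and the `FixedWindowDemo` name are additions for this sketch, not part of the implementation above):

```typescript
// Minimal fixed-window limiter with an injectable clock so the
// boundary burst can be demonstrated without waiting for real time.
class FixedWindowDemo {
  private count = 0;
  private windowStart = -1;
  constructor(
    private windowMs: number,
    private max: number,
    private clock: () => number
  ) {}

  tryConsume(): boolean {
    const now = this.clock();
    const windowStart = Math.floor(now / this.windowMs) * this.windowMs;
    if (windowStart !== this.windowStart) {
      this.windowStart = windowStart;
      this.count = 0; // new window: counter resets
    }
    if (this.count < this.max) {
      this.count++;
      return true;
    }
    return false;
  }
}

// Replay the 11:59:59.500 / 12:00:00.500 scenario (times in ms).
let fakeNow = 59_500; // 500 ms before the minute boundary
const fixedDemo = new FixedWindowDemo(60_000, 100, () => fakeNow);

let allowedBefore = 0;
for (let i = 0; i < 100; i++) if (fixedDemo.tryConsume()) allowedBefore++;

fakeNow = 60_500; // 500 ms after the boundary: a fresh window
let allowedAfter = 0;
for (let i = 0; i < 100; i++) if (fixedDemo.tryConsume()) allowedAfter++;

console.log(allowedBefore + allowedAfter); // 200 requests in one second
```

Both bursts of 100 pass, confirming that a client straddling the boundary gets double the stated limit.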
The most precise sliding window implementation maintains a log of all request timestamps within the window. This approach is conceptually simple and perfectly accurate but has memory trade-offs.
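Before the full implementation, a minimal clocked sketch (the injectable clock and the `SlidingLogDemo` name are assumptions for deterministic testing) shows how a timestamp log closes the boundary hole that fixed windows leave open:

```typescript
// Simplified sliding-window-log limiter for a single client, with an
// injectable clock so the boundary scenario can be replayed exactly.
class SlidingLogDemo {
  private log: number[] = [];
  constructor(
    private windowMs: number,
    private max: number,
    private clock: () => number
  ) {}

  tryConsume(): boolean {
    const now = this.clock();
    const cutoff = now - this.windowMs;
    // Evict timestamps that have slid out of the trailing window
    while (this.log.length > 0 && this.log[0] <= cutoff) this.log.shift();
    if (this.log.length < this.max) {
      this.log.push(now);
      return true;
    }
    return false;
  }
}

// Boundary scenario: 100 requests just before the minute mark...
let fakeLogNow = 59_500;
const logDemo = new SlidingLogDemo(60_000, 100, () => fakeLogNow);
let logBefore = 0;
for (let i = 0; i < 100; i++) if (logDemo.tryConsume()) logBefore++;

// ...then 100 more one second later. The earlier timestamps still fall
// inside the trailing 60 s window, so every new request is rejected.
fakeLogNow = 60_500;
let logAfter = 0;
for (let i = 0; i < 100; i++) if (logDemo.tryConsume()) logAfter++;

console.log(logBefore, logAfter); // 100 0
```

Unlike the fixed window, the trailing window still sees all 100 earlier timestamps, so the limit holds at 100 requests per any 60-second span.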
```typescript
class SlidingWindowLogLimiter {
  private windowDurationMs: number;
  private maxRequests: number;
  private logs: Map<string, number[]>; // key -> array of timestamps

  constructor(windowDurationMs: number, maxRequests: number) {
    this.windowDurationMs = windowDurationMs;
    this.maxRequests = maxRequests;
    this.logs = new Map();
  }

  tryConsume(key: string): boolean {
    const now = Date.now();
    const windowStart = now - this.windowDurationMs;

    // Get or create log for this key
    let log = this.logs.get(key);
    if (!log) {
      log = [];
      this.logs.set(key, log);
    }

    // Remove timestamps outside the window (older than windowStart)
    // Binary search would be more efficient for large logs
    while (log.length > 0 && log[0] <= windowStart) {
      log.shift();
    }

    // Check if we're under the limit
    if (log.length < this.maxRequests) {
      log.push(now);
      return true;
    }
    return false;
  }

  // Get time until next request will be allowed
  getRetryAfter(key: string): number {
    const log = this.logs.get(key);
    if (!log || log.length < this.maxRequests) {
      return 0;
    }
    const oldestRelevant = log[0];
    const now = Date.now();
    return Math.max(0, oldestRelevant + this.windowDurationMs - now);
  }
}
```

The sliding window counter is a clever approximation that achieves near-perfect accuracy with constant O(1) memory per client. It works by weighting the previous window's count based on overlap.
The Core Insight:
Instead of tracking individual timestamps, track counts for the current and previous window. The effective count is:
effective_count = previous_window_count × (1 - elapsed_percentage) + current_window_count
Example: suppose the previous window saw 80 requests, the current window has seen 20 so far, and we are 25% of the way through the current window:

effective_count = 80 × (1 - 0.25) + 20 = 60 + 20 = 80 requests
This approximation assumes requests in the previous window were evenly distributed—not always true, but close enough for rate limiting purposes.
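The weighted-count formula is small enough to check by hand; here it is as a standalone function (the `effectiveCount` name is ours), reproducing the worked example above:

```typescript
// Sliding-window-counter approximation: weight the previous window's
// count by how much of it still overlaps the trailing window.
function effectiveCount(
  previousCount: number,
  currentCount: number,
  elapsedRatio: number // fraction of the current window already elapsed, 0..1
): number {
  return previousCount * (1 - elapsedRatio) + currentCount;
}

// 80 requests last window, 20 so far, 25% into the current window:
console.log(effectiveCount(80, 20, 0.25)); // 80
```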
```typescript
interface WindowState {
  previousCount: number;
  currentCount: number;
  currentWindowStart: number;
}

class SlidingWindowCounterLimiter {
  private windowDurationMs: number;
  private maxRequests: number;
  private windows: Map<string, WindowState>;

  constructor(windowDurationMs: number, maxRequests: number) {
    this.windowDurationMs = windowDurationMs;
    this.maxRequests = maxRequests;
    this.windows = new Map();
  }

  tryConsume(key: string): boolean {
    const now = Date.now();
    const currentWindowStart = Math.floor(now / this.windowDurationMs) * this.windowDurationMs;

    let state = this.windows.get(key);
    if (!state) {
      // First request from this client
      state = { previousCount: 0, currentCount: 0, currentWindowStart };
      this.windows.set(key, state);
    } else if (state.currentWindowStart !== currentWindowStart) {
      // We've moved to a new window
      if (currentWindowStart - state.currentWindowStart === this.windowDurationMs) {
        // Adjacent window: current becomes previous
        state.previousCount = state.currentCount;
      } else {
        // Skipped windows: reset previous
        state.previousCount = 0;
      }
      state.currentCount = 0;
      state.currentWindowStart = currentWindowStart;
    }

    // Calculate weighted count (sliding window approximation)
    const elapsedMs = now - currentWindowStart;
    const elapsedRatio = elapsedMs / this.windowDurationMs;
    const weightedPreviousCount = state.previousCount * (1 - elapsedRatio);
    const effectiveCount = weightedPreviousCount + state.currentCount;

    if (effectiveCount < this.maxRequests) {
      state.currentCount++;
      return true;
    }
    return false;
  }

  getState(key: string): { remaining: number; resetMs: number } {
    const now = Date.now();
    const currentWindowStart = Math.floor(now / this.windowDurationMs) * this.windowDurationMs;
    const state = this.windows.get(key);

    if (!state) {
      return { remaining: this.maxRequests, resetMs: 0 };
    }

    const elapsedMs = now - currentWindowStart;
    const elapsedRatio = elapsedMs / this.windowDurationMs;
    const weightedPreviousCount = state.previousCount * (1 - elapsedRatio);
    const effectiveCount = weightedPreviousCount + state.currentCount;

    return {
      remaining: Math.max(0, Math.floor(this.maxRequests - effectiveCount)),
      resetMs: this.windowDurationMs - elapsedMs,
    };
  }
}
```

Sliding window counter is the most popular choice for production rate limiting. It provides O(1) memory per client, ~99% accuracy compared to sliding window log, and is simple to implement in distributed systems (only two counters to synchronize).
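It is worth confirming that the approximation actually smooths the boundary burst. A clocked single-client sketch (the injectable clock and the `SlidingCounterDemo` name are assumptions; the logic mirrors the class above) replays the scenario:

```typescript
// Single-client sliding-window-counter limiter with an injectable clock.
class SlidingCounterDemo {
  private prev = 0;
  private curr = 0;
  private currStart = -1;
  constructor(
    private windowMs: number,
    private max: number,
    private clock: () => number
  ) {}

  tryConsume(): boolean {
    const now = this.clock();
    const start = Math.floor(now / this.windowMs) * this.windowMs;
    if (this.currStart === -1) {
      this.currStart = start; // first request ever
    } else if (start !== this.currStart) {
      // Adjacent window rolls current into previous; skipped windows reset it
      this.prev = start - this.currStart === this.windowMs ? this.curr : 0;
      this.curr = 0;
      this.currStart = start;
    }
    const ratio = (now - start) / this.windowMs;
    const effective = this.prev * (1 - ratio) + this.curr;
    if (effective < this.max) {
      this.curr++;
      return true;
    }
    return false;
  }
}

// Boundary scenario: 100 requests at 59.5 s, 100 more at 60.5 s.
let fakeCounterNow = 59_500;
const counterDemo = new SlidingCounterDemo(60_000, 100, () => fakeCounterNow);
let counterBefore = 0;
for (let i = 0; i < 100; i++) if (counterDemo.tryConsume()) counterBefore++;

fakeCounterNow = 60_500; // previous window still carries ~99% weight
let counterAfter = 0;
for (let i = 0; i < 100; i++) if (counterDemo.tryConsume()) counterAfter++;

console.log(counterBefore, counterAfter); // 100 1
```

Right after the boundary the previous window's 100 requests still count at roughly 99% weight, so only about one extra request slips through instead of another full hundred.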
Let's compare all the rate limiting algorithms we've discussed to understand when to use each.
| Algorithm | Memory | Accuracy | Burst Handling | Best For |
|---|---|---|---|---|
| Fixed Window | O(1) | Poor at boundaries | Double burst at boundaries | Simple, non-critical limits |
| Sliding Window Log | O(n) | Perfect | Smooth | Small limits, precision needed |
| Sliding Window Counter | O(1) | ~99% | Smooth | General purpose, distributed |
| Token Bucket | O(1) | Exact average | Controlled burst | APIs with bursty traffic |
| Leaky Bucket | O(1) + queue | Perfect rate | Queued/delayed | Traffic shaping, streaming |
Decision Framework:

- **Fixed window** when simplicity matters more than precision and boundary bursts are tolerable.
- **Sliding window log** when limits are small and exact enforcement is required.
- **Sliding window counter** for general-purpose limits, especially in distributed systems.
- **Token bucket** when clients legitimately burst and you want to permit controlled spikes.
- **Leaky bucket** when downstream systems need a smooth, constant request rate.
For production systems, rate limit state is typically stored in Redis for performance and distribution across multiple gateway instances. Here's an efficient sliding window counter implementation using Redis.
```lua
-- Sliding Window Counter in Redis (Lua script for atomicity)
-- Keys: KEYS[1] = rate limit key (e.g., "ratelimit:user:123")
-- Args: ARGV[1] = window_size_ms, ARGV[2] = max_requests, ARGV[3] = current_time_ms

local key = KEYS[1]
local window_size = tonumber(ARGV[1])
local max_requests = tonumber(ARGV[2])
local now = tonumber(ARGV[3])

-- Calculate window boundaries
local current_window = math.floor(now / window_size) * window_size
local previous_window = current_window - window_size

-- Keys for current and previous window
local current_key = key .. ":" .. current_window
local previous_key = key .. ":" .. previous_window

-- Get counts
local current_count = tonumber(redis.call("GET", current_key) or "0")
local previous_count = tonumber(redis.call("GET", previous_key) or "0")

-- Calculate weighted count
local elapsed = now - current_window
local elapsed_ratio = elapsed / window_size
local weighted_previous = previous_count * (1 - elapsed_ratio)
local effective_count = weighted_previous + current_count

-- Check limit
if effective_count >= max_requests then
  -- Calculate retry-after; nothing remains in this window
  local retry_after = window_size - elapsed
  return {0, math.ceil(retry_after), 0}
end

-- Increment and set expiry
redis.call("INCR", current_key)
redis.call("PEXPIRE", current_key, window_size * 2) -- Keep for overlap calculation

return {1, 0, math.floor(max_requests - effective_count - 1)}
```
```typescript
import Redis from 'ioredis';

class RedisSlidingWindowLimiter {
  private redis: Redis;
  private script: string;
  private scriptSha: string | null = null;

  constructor(redisUrl: string) {
    this.redis = new Redis(redisUrl);
    this.script = `/* Lua script from above */`;
  }

  async tryConsume(
    key: string,
    windowSizeMs: number,
    maxRequests: number
  ): Promise<{ allowed: boolean; remaining: number; retryAfterMs: number }> {
    const now = Date.now();

    // Load script if not already loaded
    if (!this.scriptSha) {
      this.scriptSha = (await this.redis.script("LOAD", this.script)) as string;
    }

    try {
      const result = (await this.redis.evalsha(
        this.scriptSha,
        1,
        `ratelimit:${key}`,
        windowSizeMs,
        maxRequests,
        now
      )) as [number, number, number];

      return {
        allowed: result[0] === 1,
        retryAfterMs: result[1],
        remaining: Math.max(0, result[2]),
      };
    } catch (error) {
      // NOSCRIPT: the script cache was flushed (e.g., Redis restart).
      // Reload and retry; rethrow anything else to avoid retrying forever.
      if (error instanceof Error && error.message.includes("NOSCRIPT")) {
        this.scriptSha = null;
        return this.tryConsume(key, windowSizeMs, maxRequests);
      }
      throw error;
    }
  }
}
```

Redis Lua scripts execute atomically, preventing race conditions when multiple gateway instances access the same rate limit state. Without atomicity, concurrent requests could all pass before any increment is recorded.
The sliding window algorithm provides smooth, predictable rate limiting without the boundary exploitation problems of fixed windows. Let's recap:

- Fixed windows can admit up to 2x the stated limit around window boundaries.
- Sliding window log tracks every timestamp: perfectly accurate, but O(n) memory per client.
- Sliding window counter weights the previous window's count by overlap: O(1) memory, ~99% accuracy.
- In distributed deployments, store the two counters in Redis and update them atomically with a Lua script.
What's Next:
We've now covered the core algorithms. But rate limiting isn't one-size-fits-all—different resources and users need different limits. The next page explores Per-User vs. Per-API Limits, covering hierarchical limiting, user tiers, and multi-dimensional rate limiting strategies.
You now understand sliding window algorithms and their trade-offs. The sliding window counter in Redis is your go-to for most production rate limiting needs.