Consider a rate limit of 100 requests per minute using a fixed window approach. At 11:59:59, a client sends 100 requests—all allowed. At 12:00:01, the window resets, and the client sends another 100 requests—also allowed. 200 requests in 2 seconds, despite a limit of 100/minute.
This is the boundary problem of fixed window rate limiting—clients can exploit window boundaries to effectively double their allowed rate.
The Sliding Window Algorithm elegantly solves this problem by considering a continuously moving time window rather than fixed intervals. It's the rate limiting approach of choice when you need precise, predictable limits without exploitable edge cases.
By the end of this page, you will understand fixed window limitations, sliding window log and counter variants, implementation trade-offs, and when to choose sliding window over token bucket. You'll be equipped to implement sliding window rate limiting in production.
Before diving into sliding windows, let's understand why simpler approaches fall short. The fixed window approach divides time into discrete intervals and counts requests within each.
```typescript
class FixedWindowRateLimiter {
  private windowDurationMs: number;
  private maxRequests: number;
  private counts: Map<string, { count: number; windowStart: number }>;

  constructor(windowDurationMs: number, maxRequests: number) {
    this.windowDurationMs = windowDurationMs;
    this.maxRequests = maxRequests;
    this.counts = new Map();
  }

  tryConsume(key: string): boolean {
    const now = Date.now();
    const windowStart = Math.floor(now / this.windowDurationMs) * this.windowDurationMs;

    let entry = this.counts.get(key);

    // Reset if we're in a new window
    if (!entry || entry.windowStart !== windowStart) {
      entry = { count: 0, windowStart };
      this.counts.set(key, entry);
    }

    if (entry.count < this.maxRequests) {
      entry.count++;
      return true;
    }
    return false;
  }
}

// Problem demonstration:
// Limit: 100 requests per minute
// 11:59:59.500 - Send 100 requests ✓ (window: 11:59:00 - 11:59:59)
// 12:00:00.500 - Send 100 requests ✓ (window: 12:00:00 - 12:00:59)
// Result: 200 requests in 1 second, despite 100/minute limit!
```

At window boundaries, fixed window allows up to 2x the intended rate. For a 100 req/min limit, clients can achieve 200 req/min by timing requests around the boundary. This can overload systems designed for the stated limit.
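The comment block above describes the exploit abstractly; the same scenario can also be run deterministically with a simplified single-client limiter that takes an injectable clock (the clock parameter and the `FixedWindowDemo` name are additions for this sketch, not part of the implementation above):

```typescript
// Minimal fixed-window limiter with an injectable clock so the
// boundary burst can be demonstrated without waiting for real time.
class FixedWindowDemo {
  private count = 0;
  private windowStart = -1;
  constructor(
    private windowMs: number,
    private max: number,
    private clock: () => number
  ) {}

  tryConsume(): boolean {
    const now = this.clock();
    const windowStart = Math.floor(now / this.windowMs) * this.windowMs;
    if (windowStart !== this.windowStart) {
      this.windowStart = windowStart;
      this.count = 0; // new window: counter resets
    }
    if (this.count < this.max) {
      this.count++;
      return true;
    }
    return false;
  }
}

// Replay the 11:59:59.500 / 12:00:00.500 scenario (times in ms).
let fakeNow = 59_500; // 500 ms before the minute boundary
const fixedDemo = new FixedWindowDemo(60_000, 100, () => fakeNow);

let allowedBefore = 0;
for (let i = 0; i < 100; i++) if (fixedDemo.tryConsume()) allowedBefore++;

fakeNow = 60_500; // 500 ms after the boundary: a fresh window
let allowedAfter = 0;
for (let i = 0; i < 100; i++) if (fixedDemo.tryConsume()) allowedAfter++;

console.log(allowedBefore + allowedAfter); // 200 requests in one second
```

Both bursts of 100 pass, confirming that a client straddling the boundary gets double the stated limit.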
The most precise sliding window implementation maintains a log of all request timestamps within the window. This approach is conceptually simple and perfectly accurate but has memory trade-offs.
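Before the full implementation, a minimal clocked sketch (the injectable clock and the `SlidingLogDemo` name are assumptions for deterministic testing) shows how a timestamp log closes the boundary hole that fixed windows leave open:

```typescript
// Simplified sliding-window-log limiter for a single client, with an
// injectable clock so the boundary scenario can be replayed exactly.
class SlidingLogDemo {
  private log: number[] = [];
  constructor(
    private windowMs: number,
    private max: number,
    private clock: () => number
  ) {}

  tryConsume(): boolean {
    const now = this.clock();
    const cutoff = now - this.windowMs;
    // Evict timestamps that have slid out of the trailing window
    while (this.log.length > 0 && this.log[0] <= cutoff) this.log.shift();
    if (this.log.length < this.max) {
      this.log.push(now);
      return true;
    }
    return false;
  }
}

// Boundary scenario: 100 requests just before the minute mark...
let fakeLogNow = 59_500;
const logDemo = new SlidingLogDemo(60_000, 100, () => fakeLogNow);
let logBefore = 0;
for (let i = 0; i < 100; i++) if (logDemo.tryConsume()) logBefore++;

// ...then 100 more one second later. The earlier timestamps still fall
// inside the trailing 60 s window, so every new request is rejected.
fakeLogNow = 60_500;
let logAfter = 0;
for (let i = 0; i < 100; i++) if (logDemo.tryConsume()) logAfter++;

console.log(logBefore, logAfter); // 100 0
```

Unlike the fixed window, the trailing window still sees all 100 earlier timestamps, so the limit holds at 100 requests per any 60-second span.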
```typescript
class SlidingWindowLogLimiter {
  private windowDurationMs: number;
  private maxRequests: number;
  private logs: Map<string, number[]>; // key -> array of timestamps

  constructor(windowDurationMs: number, maxRequests: number) {
    this.windowDurationMs = windowDurationMs;
    this.maxRequests = maxRequests;
    this.logs = new Map();
  }

  tryConsume(key: string): boolean {
    const now = Date.now();
    const windowStart = now - this.windowDurationMs;

    // Get or create log for this key
    let log = this.logs.get(key);
    if (!log) {
      log = [];
      this.logs.set(key, log);
    }

    // Remove timestamps outside the window (older than windowStart)
    // Binary search would be more efficient for large logs
    while (log.length > 0 && log[0] <= windowStart) {
      log.shift();
    }

    // Check if we're under the limit
    if (log.length < this.maxRequests) {
      log.push(now);
      return true;
    }
    return false;
  }

  // Get time until next request will be allowed
  getRetryAfter(key: string): number {
    const log = this.logs.get(key);
    if (!log || log.length < this.maxRequests) {
      return 0;
    }
    const oldestRelevant = log[0];
    const now = Date.now();
    return Math.max(0, oldestRelevant + this.windowDurationMs - now);
  }
}
```

The sliding window counter is a clever approximation that achieves near-perfect accuracy with constant O(1) memory per client. It works by weighting the previous window's count based on overlap.
The Core Insight:
Instead of tracking individual timestamps, track counts for the current and previous window. The effective count is:
effective_count = previous_window_count × (1 - elapsed_percentage) + current_window_count
Example: suppose the previous window saw 80 requests, the current window has seen 20 so far, and we are 25% of the way through the current window:

effective_count = 80 × (1 - 0.25) + 20 = 60 + 20 = 80 requests
This approximation assumes requests in the previous window were evenly distributed—not always true, but close enough for rate limiting purposes.
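The weighted-count formula is small enough to check by hand; here it is as a standalone function (the `effectiveCount` name is ours), reproducing the worked example above:

```typescript
// Sliding-window-counter approximation: weight the previous window's
// count by how much of it still overlaps the trailing window.
function effectiveCount(
  previousCount: number,
  currentCount: number,
  elapsedRatio: number // fraction of the current window already elapsed, 0..1
): number {
  return previousCount * (1 - elapsedRatio) + currentCount;
}

// 80 requests last window, 20 so far, 25% into the current window:
console.log(effectiveCount(80, 20, 0.25)); // 80
```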
```typescript
interface WindowState {
  previousCount: number;
  currentCount: number;
  currentWindowStart: number;
}

class SlidingWindowCounterLimiter {
  private windowDurationMs: number;
  private maxRequests: number;
  private windows: Map<string, WindowState>;

  constructor(windowDurationMs: number, maxRequests: number) {
    this.windowDurationMs = windowDurationMs;
    this.maxRequests = maxRequests;
    this.windows = new Map();
  }

  tryConsume(key: string): boolean {
    const now = Date.now();
    const currentWindowStart = Math.floor(now / this.windowDurationMs) * this.windowDurationMs;

    let state = this.windows.get(key);
    if (!state) {
      // First request from this client
      state = { previousCount: 0, currentCount: 0, currentWindowStart };
      this.windows.set(key, state);
    } else if (state.currentWindowStart !== currentWindowStart) {
      // We've moved to a new window
      if (currentWindowStart - state.currentWindowStart === this.windowDurationMs) {
        // Adjacent window: current becomes previous
        state.previousCount = state.currentCount;
      } else {
        // Skipped windows: reset previous
        state.previousCount = 0;
      }
      state.currentCount = 0;
      state.currentWindowStart = currentWindowStart;
    }

    // Calculate weighted count (sliding window approximation)
    const elapsedMs = now - currentWindowStart;
    const elapsedRatio = elapsedMs / this.windowDurationMs;
    const weightedPreviousCount = state.previousCount * (1 - elapsedRatio);
    const effectiveCount = weightedPreviousCount + state.currentCount;

    if (effectiveCount < this.maxRequests) {
      state.currentCount++;
      return true;
    }
    return false;
  }

  getState(key: string): { remaining: number; resetMs: number } {
    const now = Date.now();
    const currentWindowStart = Math.floor(now / this.windowDurationMs) * this.windowDurationMs;
    const state = this.windows.get(key);

    if (!state) {
      return { remaining: this.maxRequests, resetMs: 0 };
    }

    const elapsedMs = now - currentWindowStart;
    const elapsedRatio = elapsedMs / this.windowDurationMs;
    const weightedPreviousCount = state.previousCount * (1 - elapsedRatio);
    const effectiveCount = weightedPreviousCount + state.currentCount;

    return {
      remaining: Math.max(0, Math.floor(this.maxRequests - effectiveCount)),
      resetMs: this.windowDurationMs - elapsedMs,
    };
  }
}
```

Sliding window counter is the most popular choice for production rate limiting. It provides O(1) memory per client, ~99% accuracy compared to sliding window log, and is simple to implement in distributed systems (only two counters to synchronize).
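It is worth confirming that the approximation actually smooths the boundary burst. A clocked single-client sketch (the injectable clock and the `SlidingCounterDemo` name are assumptions; the logic mirrors the class above) replays the scenario:

```typescript
// Single-client sliding-window-counter limiter with an injectable clock.
class SlidingCounterDemo {
  private prev = 0;
  private curr = 0;
  private currStart = -1;
  constructor(
    private windowMs: number,
    private max: number,
    private clock: () => number
  ) {}

  tryConsume(): boolean {
    const now = this.clock();
    const start = Math.floor(now / this.windowMs) * this.windowMs;
    if (this.currStart === -1) {
      this.currStart = start; // first request ever
    } else if (start !== this.currStart) {
      // Adjacent window rolls current into previous; skipped windows reset it
      this.prev = start - this.currStart === this.windowMs ? this.curr : 0;
      this.curr = 0;
      this.currStart = start;
    }
    const ratio = (now - start) / this.windowMs;
    const effective = this.prev * (1 - ratio) + this.curr;
    if (effective < this.max) {
      this.curr++;
      return true;
    }
    return false;
  }
}

// Boundary scenario: 100 requests at 59.5 s, 100 more at 60.5 s.
let fakeCounterNow = 59_500;
const counterDemo = new SlidingCounterDemo(60_000, 100, () => fakeCounterNow);
let counterBefore = 0;
for (let i = 0; i < 100; i++) if (counterDemo.tryConsume()) counterBefore++;

fakeCounterNow = 60_500; // previous window still carries ~99% weight
let counterAfter = 0;
for (let i = 0; i < 100; i++) if (counterDemo.tryConsume()) counterAfter++;

console.log(counterBefore, counterAfter); // 100 1
```

Right after the boundary the previous window's 100 requests still count at roughly 99% weight, so only about one extra request slips through instead of another full hundred.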
Let's compare all the rate limiting algorithms we've discussed to understand when to use each.
| Algorithm | Memory | Accuracy | Burst Handling | Best For |
|---|---|---|---|---|
| Fixed Window | O(1) | Poor at boundaries | Double burst at boundaries | Simple, non-critical limits |
| Sliding Window Log | O(n) | Perfect | Smooth | Small limits, precision needed |
| Sliding Window Counter | O(1) | ~99% | Smooth | General purpose, distributed |
| Token Bucket | O(1) | Exact average | Controlled burst | APIs with bursty traffic |
| Leaky Bucket | O(1) + queue | Perfect rate | Queued/delayed | Traffic shaping, streaming |
Decision Framework:

- **Fixed window** when simplicity matters more than precision and boundary bursts are tolerable.
- **Sliding window log** when limits are small and exact enforcement is required.
- **Sliding window counter** for general-purpose limits, especially in distributed systems.
- **Token bucket** when clients legitimately burst and you want to permit controlled spikes.
- **Leaky bucket** when downstream systems need a smooth, constant request rate.
For production systems, rate limit state is typically stored in Redis for performance and distribution across multiple gateway instances. Here's an efficient sliding window counter implementation using Redis.
```lua
-- Sliding Window Counter in Redis (Lua script for atomicity)
-- Keys: KEYS[1] = rate limit key (e.g., "ratelimit:user:123")
-- Args: ARGV[1] = window_size_ms, ARGV[2] = max_requests, ARGV[3] = current_time_ms

local key = KEYS[1]
local window_size = tonumber(ARGV[1])
local max_requests = tonumber(ARGV[2])
local now = tonumber(ARGV[3])

-- Calculate window boundaries
local current_window = math.floor(now / window_size) * window_size
local previous_window = current_window - window_size

-- Keys for current and previous window
local current_key = key .. ":" .. current_window
local previous_key = key .. ":" .. previous_window

-- Get counts
local current_count = tonumber(redis.call("GET", current_key) or "0")
local previous_count = tonumber(redis.call("GET", previous_key) or "0")

-- Calculate weighted count
local elapsed = now - current_window
local elapsed_ratio = elapsed / window_size
local weighted_previous = previous_count * (1 - elapsed_ratio)
local effective_count = weighted_previous + current_count

-- Check limit
if effective_count >= max_requests then
  -- Calculate retry-after; nothing remains in this window
  local retry_after = window_size - elapsed
  return {0, math.ceil(retry_after), 0}
end

-- Increment and set expiry
redis.call("INCR", current_key)
redis.call("PEXPIRE", current_key, window_size * 2) -- Keep for overlap calculation

return {1, 0, math.floor(max_requests - effective_count - 1)}
```
```typescript
import Redis from 'ioredis';

class RedisSlidingWindowLimiter {
  private redis: Redis;
  private script: string;
  private scriptSha: string | null = null;

  constructor(redisUrl: string) {
    this.redis = new Redis(redisUrl);
    this.script = `/* Lua script from above */`;
  }

  async tryConsume(
    key: string,
    windowSizeMs: number,
    maxRequests: number
  ): Promise<{ allowed: boolean; remaining: number; retryAfterMs: number }> {
    const now = Date.now();

    // Load script if not already loaded
    if (!this.scriptSha) {
      this.scriptSha = (await this.redis.script("LOAD", this.script)) as string;
    }

    try {
      const result = (await this.redis.evalsha(
        this.scriptSha,
        1,
        `ratelimit:${key}`,
        windowSizeMs,
        maxRequests,
        now
      )) as [number, number, number];

      return {
        allowed: result[0] === 1,
        retryAfterMs: result[1],
        remaining: Math.max(0, result[2]),
      };
    } catch (error) {
      // NOSCRIPT: the script cache was flushed (e.g., Redis restart).
      // Reload and retry; rethrow anything else to avoid retrying forever.
      if (error instanceof Error && error.message.includes("NOSCRIPT")) {
        this.scriptSha = null;
        return this.tryConsume(key, windowSizeMs, maxRequests);
      }
      throw error;
    }
  }
}
```

Redis Lua scripts execute atomically, preventing race conditions when multiple gateway instances access the same rate limit state. Without atomicity, concurrent requests could all pass before any increment is recorded.
The sliding window algorithm provides smooth, predictable rate limiting without the boundary exploitation problems of fixed windows. Let's recap:

- Fixed windows can admit up to 2x the stated limit around window boundaries.
- Sliding window log tracks every timestamp: perfectly accurate, but O(n) memory per client.
- Sliding window counter weights the previous window's count by overlap: O(1) memory, ~99% accuracy.
- In distributed deployments, store the two counters in Redis and update them atomically with a Lua script.
What's Next:
We've now covered the core algorithms. But rate limiting isn't one-size-fits-all—different resources and users need different limits. The next page explores Per-User vs. Per-API Limits, covering hierarchical limiting, user tiers, and multi-dimensional rate limiting strategies.
You now understand sliding window algorithms and their trade-offs. The sliding window counter in Redis is your go-to for most production rate limiting needs.