Redis (Remote Dictionary Server) stands as the most popular key-value store in the world, powering the real-time features of companies like Twitter, GitHub, Pinterest, Snapchat, and Stack Overflow. Created by Salvatore Sanfilippo in 2009, Redis has evolved from a simple cache into a versatile data structure server that can serve as a database, cache, message broker, and streaming platform.
What sets Redis apart from basic key-value stores is its rich collection of native data structures—strings, lists, sets, sorted sets, hashes, streams, and more—each with specialized operations that execute atomically on the server. This means you don't just store and retrieve bytes; you can push to lists, add to sets, increment counters, and rank leaderboard entries—all as single, atomic operations.
By the end of this page, you will understand Redis's architecture and data model, master its core data structures and operations, learn production patterns for caching, sessions, rate limiting, and real-time features, and know how persistence, replication, and clustering support production deployments.
Redis is fundamentally an in-memory database. All data lives in RAM, which is the source of its legendary performance—most operations complete in microseconds. But Redis is not just a cache; it provides configurable persistence to ensure data survives restarts.
Core architectural principles: all data held in RAM for speed, single-threaded command execution over an event loop, atomic commands, and optional persistence to disk.
Why single-threaded works:
Counter-intuitively, Redis's single-threaded design is a performance advantage, not a limitation. Since all operations are in-memory and most complete in microseconds, a single thread can handle 100,000+ operations per second. And because there are no locks, there is no contention to manage, no deadlocks to debug, and every command executes atomically without coordination overhead.
For CPU-intensive workloads or to utilize multiple cores, you scale by running multiple Redis instances (sharding).
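As a sketch of that scale-out approach, client-side sharding can be as simple as hashing each key to pick one of several independent Redis instances. The `ShardedPool` class and modulo-hash routing below are illustrative assumptions, not a Redis API:

```python
import hashlib

class ShardedPool:
    """Illustrative client-side sharding: route each key to one of
    several independent Redis instances by hashing the key."""

    def __init__(self, clients):
        self.clients = clients  # e.g., a list of redis.Redis connections

    def client_for(self, key: str):
        # Stable hash -> the same key always lands on the same instance
        digest = hashlib.md5(key.encode()).hexdigest()
        return self.clients[int(digest, 16) % len(self.clients)]

    def set(self, key, value):
        return self.client_for(key).set(key, value)

    def get(self, key):
        return self.client_for(key).get(key)
```

Real deployments usually prefer consistent hashing (or Redis Cluster, covered later) so that adding a shard does not remap most existing keys.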
Redis 6 introduced I/O threading—multiple threads handle network I/O while command execution remains single-threaded. This improves performance for workloads with many concurrent connections without sacrificing the atomicity guarantees of single-threaded execution.
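A minimal redis.conf fragment enabling I/O threads on Redis 6+ might look like this (the thread count is an illustrative value; command execution itself stays single-threaded):

```conf
# Use 4 threads for writing responses to client sockets
io-threads 4
# Also parallelize reading/parsing of requests (off by default)
io-threads-do-reads yes
```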
Redis goes far beyond simple key-value pairs. Each key can hold one of several data structure types, each with specialized commands. Understanding these structures and when to use them is the key to effective Redis usage.
| Structure | Description | Max Size | Common Commands |
|---|---|---|---|
| String | Binary-safe string or integer | 512 MB | GET, SET, INCR, APPEND |
| List | Ordered collection, doubly-linked | 4 billion elements | LPUSH, RPUSH, LPOP, LRANGE |
| Set | Unordered unique strings | 4 billion members | SADD, SMEMBERS, SINTER, SUNION |
| Sorted Set | Set with score for ordering | 4 billion members | ZADD, ZRANGE, ZRANK, ZINCRBY |
| Hash | Field-value pairs (like mini-object) | 4 billion fields | HSET, HGET, HMGET, HINCRBY |
| Stream | Append-only log with consumer groups | Unlimited | XADD, XREAD, XREADGROUP |
| HyperLogLog | Probabilistic cardinality counter | 12 KB fixed | PFADD, PFCOUNT, PFMERGE |
| Bitmap | Bit-level operations on strings | 512 MB (4B bits) | SETBIT, GETBIT, BITCOUNT |
```python
import time

import redis

# Connect to Redis
r = redis.Redis(host='localhost', port=6379, db=0, decode_responses=True)

# ========================================
# STRINGS: Basic key-value pairs
# ========================================

# Simple string
r.set('user:1001:name', 'Alice Johnson')
name = r.get('user:1001:name')  # 'Alice Johnson'

# String with expiration (TTL)
r.setex('session:abc123', 3600, 'user_data_here')  # Expires in 1 hour

# Atomic increment (counters)
r.set('page_views', 0)
r.incr('page_views')       # Returns 1
r.incrby('page_views', 5)  # Returns 6

# Set only if not exists (distributed lock primitive)
acquired = r.setnx('lock:resource', 'owner_id')  # Returns True if set

# Set with expiration only if key doesn't exist (better lock)
r.set('lock:resource', 'owner_id', nx=True, ex=30)

# ========================================
# LISTS: Ordered collections
# ========================================

# Add to lists
r.lpush('notifications:1001', 'msg3', 'msg2', 'msg1')  # Push to left
r.rpush('queue:tasks', 'task1', 'task2')               # Push to right

# Pop from lists
task = r.lpop('queue:tasks')              # Returns 'task1'
task = r.brpop('queue:tasks', timeout=5)  # Blocking pop, waits up to 5s

# Range access
notifications = r.lrange('notifications:1001', 0, 9)  # First 10

# Trim list (keep only recent)
r.ltrim('notifications:1001', 0, 99)  # Keep only 100 most recent

# ========================================
# SETS: Unique unordered collections
# ========================================

# Add members
r.sadd('user:1001:tags', 'premium', 'developer', 'sf-bay')
r.sadd('user:1002:tags', 'free', 'designer', 'sf-bay')

# Check membership
is_premium = r.sismember('user:1001:tags', 'premium')  # True

# Set operations
common_tags = r.sinter('user:1001:tags', 'user:1002:tags')  # {'sf-bay'}
all_tags = r.sunion('user:1001:tags', 'user:1002:tags')

# Random member (for sampling)
random_tag = r.srandmember('user:1001:tags')

# ========================================
# SORTED SETS: Ordered by score
# ========================================

# Add with scores (leaderboard)
r.zadd('leaderboard:game1', {'alice': 1500, 'bob': 1200, 'charlie': 1800})

# Increment score
r.zincrby('leaderboard:game1', 100, 'alice')  # Alice now 1600

# Rank queries
rank = r.zrevrank('leaderboard:game1', 'alice')  # 1 (0-indexed, descending)
top_3 = r.zrevrange('leaderboard:game1', 0, 2, withscores=True)
# [('charlie', 1800.0), ('alice', 1600.0), ('bob', 1200.0)]

# Score range (e.g., recent time-based events)
# Using timestamp as score
now = time.time()
r.zadd('events:user:1001', {f'event_{i}': now + i for i in range(10)})
recent = r.zrangebyscore('events:user:1001', now - 3600, now)  # Last hour

# ========================================
# HASHES: Field-value maps
# ========================================

# Store object as hash
r.hset('user:1001', mapping={
    'email': 'alice@example.com',
    'name': 'Alice Johnson',
    'login_count': 0,
    'last_login': '',
})

# Get single field
email = r.hget('user:1001', 'email')

# Get multiple fields
user_data = r.hmget('user:1001', ['email', 'name'])

# Get all fields
user = r.hgetall('user:1001')

# Increment field
r.hincrby('user:1001', 'login_count', 1)

# ========================================
# HYPERLOGLOG: Cardinality estimation
# ========================================

# Count unique visitors (O(1) memory regardless of count)
r.pfadd('unique_visitors:2024-01-15', 'user1', 'user2', 'user3')
r.pfadd('unique_visitors:2024-01-15', 'user2', 'user4')  # user2 not re-counted

count = r.pfcount('unique_visitors:2024-01-15')  # ~4 (approx, <1% error)

# Merge multiple HyperLogLogs
r.pfmerge('unique_visitors:week',
          'unique_visitors:2024-01-15', 'unique_visitors:2024-01-16')
```

Choose the right data structure for your access pattern: Strings for simple values and counters, Lists for queues and recent items, Sets for unique collections and intersections, Sorted Sets for leaderboards and time ranges, Hashes for object-like structures with partial access.
Redis's data structures enable powerful patterns that solve common distributed systems problems. Let's examine the most important production patterns.
Pattern 1: Cache-Aside (Lazy Loading)
The most common caching pattern: check cache first, load from database on miss, populate cache.
```python
import json

class CacheAsidePattern:
    """
    Cache-Aside pattern implementation.
    Also known as lazy-loading or look-aside cache.
    """

    def __init__(self, redis_client, database, default_ttl=3600):
        self.redis = redis_client
        self.db = database
        self.default_ttl = default_ttl

    def get_user(self, user_id: str) -> dict | None:
        """
        Get user with cache-aside pattern.

        1. Check cache first (fast path)
        2. On miss, load from database
        3. Populate cache for next request
        """
        cache_key = f"cache:user:{user_id}"

        # Step 1: Try cache
        cached = self.redis.get(cache_key)
        if cached:
            return json.loads(cached)

        # Step 2: Cache miss - load from database
        user = self.db.get_user(user_id)
        if user is None:
            return None

        # Step 3: Populate cache with TTL
        self.redis.setex(cache_key, self.default_ttl, json.dumps(user))
        return user

    def update_user(self, user_id: str, data: dict) -> None:
        """
        Update user with cache invalidation.

        Strategy: Write to database, then invalidate cache.
        Do NOT write to cache directly (data inconsistency risk).
        """
        # Update database (source of truth)
        self.db.update_user(user_id, data)

        # Invalidate cache - next read will reload
        self.redis.delete(f"cache:user:{user_id}")

    def delete_user(self, user_id: str) -> None:
        """Delete user and invalidate cache."""
        self.db.delete_user(user_id)
        self.redis.delete(f"cache:user:{user_id}")


# ========================================
# Variant: Write-Through Cache
# ========================================

class WriteThroughPattern:
    """
    Write-Through: Write to cache AND database synchronously.
    Ensures cache is always up-to-date.
    """

    def __init__(self, redis_client, database, default_ttl=3600):
        self.redis = redis_client
        self.db = database
        self.default_ttl = default_ttl

    def update_user(self, user_id: str, data: dict) -> None:
        """
        Update database and cache synchronously.
        Both writes must succeed. Consider:
        - What if the cache write fails after the DB write?
        - What if the DB write fails after the cache write?
        """
        cache_key = f"cache:user:{user_id}"

        # Write to database first
        self.db.update_user(user_id, data)

        # Then write to cache with TTL
        user = self.db.get_user(user_id)  # Get complete fresh state
        self.redis.setex(cache_key, self.default_ttl, json.dumps(user))
```

Pattern 2: Distributed Locking
Acquire exclusive access to a resource across multiple processes/servers.
```python
import time
import uuid

class DistributedLock:
    """
    Distributed lock implementation using Redis.

    Uses SET NX EX for atomic check-and-set with expiration.
    Lock automatically expires to prevent deadlocks.
    """

    def __init__(self, redis_client, lock_name: str, expire_seconds: int = 30):
        self.redis = redis_client
        self.lock_key = f"lock:{lock_name}"
        self.expire_seconds = expire_seconds
        self.lock_id = None

    def acquire(self, blocking: bool = True, timeout: float = None) -> bool:
        """
        Acquire the lock.

        Args:
            blocking: If True, wait until lock is available
            timeout: Max seconds to wait (None = wait forever)

        Returns:
            True if lock acquired, False otherwise
        """
        self.lock_id = str(uuid.uuid4())
        start_time = time.time()

        while True:
            # SET key value NX EX seconds
            # NX = only if not exists, EX = expiration
            acquired = self.redis.set(
                self.lock_key,
                self.lock_id,
                nx=True,
                ex=self.expire_seconds
            )
            if acquired:
                return True
            if not blocking:
                return False

            # Check timeout
            if timeout is not None and time.time() - start_time > timeout:
                return False

            # Wait before retry (avoid spinning)
            time.sleep(0.01)

    def release(self) -> bool:
        """
        Release the lock safely.

        Uses a Lua script for atomic check-and-delete.
        Only releases if we still own the lock.
        """
        if self.lock_id is None:
            return False

        # Lua script ensures atomic check-and-delete
        lua_script = """
        if redis.call("GET", KEYS[1]) == ARGV[1] then
            return redis.call("DEL", KEYS[1])
        else
            return 0
        end
        """
        result = self.redis.eval(lua_script, 1, self.lock_key, self.lock_id)
        self.lock_id = None
        return result == 1

    def extend(self, additional_seconds: int = None) -> bool:
        """
        Extend lock expiration (for long operations).
        Only extends if we still own the lock.
        """
        if additional_seconds is None:
            additional_seconds = self.expire_seconds

        lua_script = """
        if redis.call("GET", KEYS[1]) == ARGV[1] then
            return redis.call("EXPIRE", KEYS[1], ARGV[2])
        else
            return 0
        end
        """
        result = self.redis.eval(
            lua_script, 1, self.lock_key, self.lock_id, additional_seconds
        )
        return result == 1

    def __enter__(self):
        """Context manager support."""
        if not self.acquire():
            raise Exception("Could not acquire lock")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.release()
        return False


# Usage
lock = DistributedLock(redis_client, "process_order:123")
with lock:
    # Only one process can execute this block at a time
    process_order("123")
```

Pattern 3: Rate Limiting
Limit API requests per client using sliding window or token bucket algorithms.
```python
import time
import uuid

class SlidingWindowRateLimiter:
    """
    Sliding window rate limiter using Redis sorted sets.
    More accurate than fixed window, prevents bursts at window edges.
    """

    def __init__(self, redis_client, max_requests: int, window_seconds: int):
        self.redis = redis_client
        self.max_requests = max_requests
        self.window_seconds = window_seconds

    def is_allowed(self, client_id: str) -> tuple[bool, dict]:
        """
        Check if request is allowed and record it.

        Returns:
            (allowed: bool, info: dict with remaining, reset_at)
        """
        key = f"ratelimit:{client_id}"
        now = time.time()
        window_start = now - self.window_seconds

        # Use pipeline for atomic operation
        pipe = self.redis.pipeline()

        # Remove old entries outside window
        pipe.zremrangebyscore(key, 0, window_start)
        # Count entries in current window
        pipe.zcard(key)
        # Add current request (score = timestamp)
        request_id = f"{now}:{uuid.uuid4().hex[:8]}"
        pipe.zadd(key, {request_id: now})
        # Set expiration on key
        pipe.expire(key, self.window_seconds + 1)

        # Execute pipeline
        results = pipe.execute()
        current_count = results[1]  # zcard result

        allowed = current_count < self.max_requests
        remaining = max(0, self.max_requests - current_count - 1)

        if not allowed:
            # Remove the request we just added
            self.redis.zrem(key, request_id)
            remaining = 0

        # Calculate reset time (when oldest entry expires)
        oldest = self.redis.zrange(key, 0, 0, withscores=True)
        if oldest:
            reset_at = oldest[0][1] + self.window_seconds
        else:
            reset_at = now + self.window_seconds

        return allowed, {
            'remaining': remaining,
            'reset_at': reset_at,
            'retry_after': 0 if allowed else reset_at - now
        }


class TokenBucketRateLimiter:
    """
    Token bucket rate limiter using Redis.
    Allows bursts up to bucket capacity while maintaining average rate.
    """

    def __init__(self, redis_client, capacity: int, refill_rate: float):
        """
        Args:
            capacity: Max tokens in bucket
            refill_rate: Tokens added per second
        """
        self.redis = redis_client
        self.capacity = capacity
        self.refill_rate = refill_rate

    def is_allowed(self, client_id: str, tokens: int = 1) -> bool:
        """
        Try to consume tokens from the bucket.
        Uses a Lua script for atomic token calculation.
        """
        key = f"bucket:{client_id}"
        now = time.time()

        lua_script = """
        local key = KEYS[1]
        local capacity = tonumber(ARGV[1])
        local refill_rate = tonumber(ARGV[2])
        local now = tonumber(ARGV[3])
        local requested = tonumber(ARGV[4])

        -- Get current bucket state
        local last_update = tonumber(redis.call("HGET", key, "last_update") or now)
        local tokens = tonumber(redis.call("HGET", key, "tokens") or capacity)

        -- Calculate tokens to add based on time elapsed
        local elapsed = now - last_update
        local new_tokens = math.min(capacity, tokens + (elapsed * refill_rate))

        -- Check if we have enough tokens
        if new_tokens >= requested then
            new_tokens = new_tokens - requested
            redis.call("HSET", key, "tokens", new_tokens)
            redis.call("HSET", key, "last_update", now)
            redis.call("EXPIRE", key, 86400)  -- Clean up after 1 day idle
            return 1
        else
            -- Update timestamp but don't consume
            redis.call("HSET", key, "tokens", new_tokens)
            redis.call("HSET", key, "last_update", now)
            redis.call("EXPIRE", key, 86400)
            return 0
        end
        """
        result = self.redis.eval(
            lua_script, 1, key,
            self.capacity, self.refill_rate, now, tokens
        )
        return result == 1
```

Lua scripts execute atomically in Redis—no other commands can run while a script executes. Use Lua when you need to read, compute, and write atomically. This is essential for rate limiting, distributed locks, and other patterns that require read-modify-write atomicity.
Redis excels at real-time features due to its sub-millisecond latency. Here are patterns for common real-time use cases.
Pattern: Leaderboards
Sorted sets are perfect for leaderboards—O(log N) insertions and O(log N + M) range queries.
```python
class Leaderboard:
    """
    Real-time leaderboard using Redis sorted sets.

    Sorted sets maintain ordering automatically, making
    leaderboard operations extremely efficient.
    """

    def __init__(self, redis_client, leaderboard_name: str):
        self.redis = redis_client
        self.key = f"leaderboard:{leaderboard_name}"

    def set_score(self, user_id: str, score: float) -> None:
        """Set/update user's score."""
        self.redis.zadd(self.key, {user_id: score})

    def increment_score(self, user_id: str, delta: float) -> float:
        """Atomically increment score. Returns new score."""
        return self.redis.zincrby(self.key, delta, user_id)

    def get_rank(self, user_id: str) -> int | None:
        """
        Get user's rank (0-indexed, highest score = rank 0).
        Returns None if user not in leaderboard.
        """
        return self.redis.zrevrank(self.key, user_id)  # 0 = first place

    def get_score(self, user_id: str) -> float | None:
        """Get user's current score."""
        return self.redis.zscore(self.key, user_id)

    def get_top(self, count: int = 10) -> list[tuple[str, float]]:
        """Get top N players with scores."""
        return self.redis.zrevrange(self.key, 0, count - 1, withscores=True)

    def get_around(self, user_id: str, count: int = 5) -> list[tuple[str, float]]:
        """
        Get players around the given user.
        Returns 'count' players above and below.
        """
        rank = self.get_rank(user_id)
        if rank is None:
            return []
        start = max(0, rank - count)
        end = rank + count
        return self.redis.zrevrange(self.key, start, end, withscores=True)

    def get_page(self, page: int, page_size: int = 50) -> list[tuple[str, float]]:
        """Get paginated leaderboard results."""
        start = page * page_size
        end = start + page_size - 1
        return self.redis.zrevrange(self.key, start, end, withscores=True)

    def remove_user(self, user_id: str) -> bool:
        """Remove user from leaderboard."""
        return self.redis.zrem(self.key, user_id) == 1

    def get_total_count(self) -> int:
        """Total number of players in leaderboard."""
        return self.redis.zcard(self.key)


# Usage
lb = Leaderboard(redis_client, "game:puzzle:weekly")

# Player completes level
lb.increment_score("player123", 100)

# Get player's current standing
rank = lb.get_rank("player123")    # 42
score = lb.get_score("player123")  # 1500

# Display top 10
top_10 = lb.get_top(10)
# [('champion', 10000.0), ('pro_player', 9500.0), ...]

# Show players around current player
nearby = lb.get_around("player123", count=3)
# Shows 3 above and 3 below in ranking
```

Pattern: Pub/Sub for Real-Time Notifications
Redis Pub/Sub enables real-time message broadcasting to subscribed clients.
```python
import json
import threading

import redis

class RealtimeNotifications:
    """
    Real-time notification system using Redis Pub/Sub.

    Pub/Sub is fire-and-forget - messages are NOT persisted.
    For durable messaging, use Redis Streams instead.
    """

    def __init__(self, redis_client):
        self.redis = redis_client
        self.pubsub = self.redis.pubsub()

    def subscribe_to_user(self, user_id: str, callback: callable) -> None:
        """Subscribe to notifications for a user."""
        channel = f"notifications:{user_id}"
        self.pubsub.subscribe(**{channel: callback})
        # Listen in a background daemon thread, polling every 10ms
        self.pubsub.run_in_thread(sleep_time=0.01, daemon=True)

    def notify_user(self, user_id: str, message: dict) -> int:
        """
        Send notification to a user.

        Returns number of subscribers who received the message.
        Note: 0 means no one is currently subscribed!
        """
        channel = f"notifications:{user_id}"
        return self.redis.publish(channel, json.dumps(message))

    def broadcast_to_all(self, message: dict) -> int:
        """Broadcast to all users (use with caution)."""
        return self.redis.publish("notifications:broadcast", json.dumps(message))

    def subscribe_to_pattern(self, pattern: str, callback: callable) -> None:
        """
        Subscribe to channels matching a pattern.
        Example: "notifications:*" for all user channels
        """
        self.pubsub.psubscribe(**{pattern: callback})


# ========================================
# Redis Streams: Durable Message Streaming
# ========================================

class DurableEventStream:
    """
    Durable event streaming using Redis Streams.

    Unlike Pub/Sub, Streams PERSIST messages and support:
    - Consumer groups for distributed processing
    - Message acknowledgment
    - Reading from a specific position
    - Automatic trimming by count or time
    """

    def __init__(self, redis_client, stream_name: str):
        self.redis = redis_client
        self.stream = stream_name

    def publish(self, event: dict, max_len: int = 10000) -> str:
        """
        Publish event to stream.
        Returns stream ID (timestamp-sequence format).
        max_len trims old entries to cap memory usage.
        """
        # XADD stream MAXLEN ~ 10000 field value ...
        return self.redis.xadd(
            self.stream,
            event,
            maxlen=max_len,
            approximate=True  # ~ for better performance
        )

    def read_latest(self, count: int = 10) -> list:
        """Read latest events without a consumer group."""
        # XREVRANGE stream + - COUNT 10
        return self.redis.xrevrange(self.stream, '+', '-', count=count)

    def create_consumer_group(self, group_name: str, start_id: str = '0') -> bool:
        """Create consumer group for distributed processing."""
        try:
            self.redis.xgroup_create(
                self.stream, group_name, id=start_id, mkstream=True
            )
            return True
        except redis.ResponseError as e:
            if "BUSYGROUP" in str(e):
                return False  # Group already exists
            raise

    def consume(self, group_name: str, consumer_name: str,
                count: int = 10, block_ms: int = 5000) -> list:
        """
        Consume messages as part of a consumer group.

        Multiple consumers can share the load - each message
        is delivered to only ONE consumer in the group.
        """
        # XREADGROUP GROUP group consumer BLOCK 5000 COUNT 10 STREAMS stream >
        result = self.redis.xreadgroup(
            group_name,
            consumer_name,
            {self.stream: '>'},  # '>' = only new messages
            count=count,
            block=block_ms
        )
        if result:
            return result[0][1]  # List of (id, data) tuples
        return []

    def acknowledge(self, group_name: str, message_id: str) -> int:
        """Acknowledge message processing completed."""
        return self.redis.xack(self.stream, group_name, message_id)
```

Pub/Sub is fire-and-forget—if no one is subscribed, messages are lost. Use Redis Streams for durable messaging where message persistence, replay, and guaranteed delivery are required. Streams add some overhead but provide reliability.
Redis provides two persistence mechanisms that can be used independently or together, offering flexibility in the durability vs. performance trade-off.
```conf
# ========================================
# RDB SNAPSHOTTING CONFIGURATION
# ========================================

# Save after 900 seconds if at least 1 key changed
save 900 1
# Save after 300 seconds if at least 10 keys changed
save 300 10
# Save after 60 seconds if at least 10000 keys changed
save 60 10000

# Disable RDB entirely
# save ""

# Stop accepting writes if RDB save fails
stop-writes-on-bgsave-error yes

# Compress RDB dump file (uses LZF, slight CPU cost)
rdbcompression yes

# Include CRC64 checksum in RDB file
rdbchecksum yes

# RDB filename
dbfilename dump.rdb

# Working directory (where RDB and AOF are saved)
dir /var/lib/redis

# ========================================
# AOF CONFIGURATION
# ========================================

# Enable AOF persistence
appendonly yes

# AOF filename
appendfilename "appendonly.aof"

# Sync policy options:
# - always: fsync after every write (safest, slowest)
# - everysec: fsync every second (good balance, default)
# - no: let OS decide when to sync (fastest, riskiest)
appendfsync everysec

# Don't fsync during BGSAVE/BGREWRITEAOF (better perf, may lose data on crash)
no-appendfsync-on-rewrite no

# Auto-rewrite AOF when it grows by 100% and is > 64MB
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# Load truncated AOF on startup (useful after crash)
aof-load-truncated yes

# Use RDB-format preamble in the AOF (hybrid approach, best of both)
aof-use-rdb-preamble yes

# ========================================
# RECOMMENDATIONS BY USE CASE
# ========================================

# Pure Cache (no persistence needed):
#   save ""
#   appendonly no

# Standard Durability:
#   save 900 1
#   save 300 10
#   appendonly yes
#   appendfsync everysec

# Maximum Durability:
#   save 60 1
#   appendonly yes
#   appendfsync always
#   (Note: ~10x performance impact)

# Hybrid (belt and suspenders):
#   save 900 1
#   save 300 10
#   appendonly yes
#   appendfsync everysec
#   aof-use-rdb-preamble yes
```

Understanding fsync timing:
The appendfsync setting controls how often Redis forces data to physical disk:
| Setting | Behavior | Performance | Max Data Loss |
|---|---|---|---|
| always | fsync after every write | ~1,000 ops/sec | Minimal |
| everysec | fsync every second | ~100,000 ops/sec | ~1 second |
| no | OS decides (typically ~30s) | ~150,000 ops/sec | Many seconds |
For most production uses, everysec provides the right balance. Only use always when data loss is truly unacceptable (financial transactions, etc.).
Enable both RDB and AOF with aof-use-rdb-preamble yes. The AOF file starts with an RDB snapshot (fast loading) followed by only the commands since that snapshot. This gives fast restarts AND durability.
For production deployments, a single Redis instance is a single point of failure. Redis provides two scaling and availability mechanisms.
Redis Replication (Primary-Replica)
A primary instance replicates all writes to one or more replicas asynchronously. Replicas can serve read queries to scale read throughput.
```conf
# ========================================
# REPLICA CONFIGURATION
# ========================================

# On the replica server, configure to replicate from primary:

# Option 1: In redis.conf
# Primary host and port
replicaof 192.168.1.100 6379

# If primary has password
masterauth your_primary_password

# Serve stale data during sync (or refuse with "no")
replica-serve-stale-data yes

# Replica is read-only (highly recommended)
replica-read-only yes

# ========================================
# RUNTIME REPLICATION COMMANDS
# ========================================

# Make this instance a replica of another
# redis-cli
> REPLICAOF 192.168.1.100 6379

# Check replication status
> INFO replication
# role:slave
# master_host:192.168.1.100
# master_port:6379
# master_link_status:up
# master_last_io_seconds_ago:1
# master_sync_in_progress:0

# Promote replica to primary (stops replication)
> REPLICAOF NO ONE

# ========================================
# REDIS SENTINEL CONFIGURATION
# ========================================

# Sentinel monitors the primary and initiates failover.
# Configure in sentinel.conf:

# Monitor primary with name "mymaster",
# quorum of 2 sentinels needed to confirm failover
sentinel monitor mymaster 192.168.1.100 6379 2

# How long to wait before considering primary down
sentinel down-after-milliseconds mymaster 30000

# Max replicas to reconfigure in parallel during failover
sentinel parallel-syncs mymaster 1

# Failover timeout
sentinel failover-timeout mymaster 180000

# Auth for monitored instances
sentinel auth-pass mymaster your_password
```

Redis Cluster (Horizontal Scaling)
Redis Cluster automatically partitions data across multiple primary nodes using hash slots. Each primary can have replicas for high availability.
| Aspect | Replication + Sentinel | Redis Cluster |
|---|---|---|
| Data Distribution | All data on every node | Data sharded across nodes |
| Scaling Writes | Single primary bottleneck | Linear write scaling |
| Scaling Reads | Add more replicas | Add more shards |
| Max Dataset Size | Limited by single node RAM | Sum of all nodes' RAM |
| Multi-Key Operations | All keys available | Only if keys in same slot |
| Complexity | Lower | Higher |
| Use Case | HA without massive scale needs | Large datasets, high throughput |
In Redis Cluster, multi-key operations (MGET, MSET, transactions) only work if ALL keys hash to the same slot. Use hash tags (e.g., {user:123}:profile and {user:123}:settings) to force related keys to the same slot. Plan key design carefully before adopting cluster.
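The slot assignment is easy to compute yourself: Redis Cluster takes CRC16 of the key (or, when a non-empty hash tag is present, of the content between the first { and the next }) modulo 16384. A minimal sketch of that routine:

```python
def crc16(data: bytes) -> int:
    """CRC-16/XMODEM (poly 0x1021, init 0), the variant Redis Cluster uses."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    """Compute the Redis Cluster hash slot (0-16383) for a key."""
    start = key.find('{')
    if start != -1:
        end = key.find('}', start + 1)
        # Only a non-empty {...} section acts as a hash tag
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384
```

Because both example keys hash only their shared tag content user:123, they land in the same slot, so multi-key operations on them work in a cluster.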
Running Redis in production requires attention to configuration, monitoring, and operational practices.
Key production practices:

- Set maxmemory and choose the right eviction policy: allkeys-lru for a pure cache, noeviction when Redis is your database, volatile-lru for mixed workloads.
- Secure access: set requirepass and use ACLs (Redis 6+) for fine-grained access control.

```conf
# ========================================
# MEMORY MANAGEMENT
# ========================================

# Maximum memory Redis can use (adjust to your server)
maxmemory 4gb

# Eviction policy when maxmemory reached
# - noeviction: return errors for writes
# - allkeys-lru: evict least recently used keys
# - volatile-lru: evict LRU keys with TTL only
# - allkeys-lfu: evict least frequently used (Redis 4.0+)
maxmemory-policy allkeys-lru

# Number of keys to sample for LRU eviction
maxmemory-samples 10

# ========================================
# SECURITY
# ========================================

# Require password for all clients
requirepass your_strong_password_here

# Bind to specific interfaces (not 0.0.0.0!)
bind 127.0.0.1 10.0.0.1

# Keep protected mode enabled (blocks external access when
# no password and no explicit bind are configured)
protected-mode yes

# Rename dangerous commands (or disable with "")
rename-command FLUSHALL ""
rename-command FLUSHDB ""
rename-command DEBUG ""
rename-command KEYS "KEYS_DISABLED_IN_PROD"

# ========================================
# PERFORMANCE TUNING
# ========================================

# TCP backlog for high connection rates
tcp-backlog 511

# Client timeout (0 = disabled)
timeout 0

# TCP keepalive (seconds)
tcp-keepalive 300

# Number of databases (default 16, usually only use db 0)
databases 16

# ========================================
# LOGGING AND MONITORING
# ========================================

# Log level (debug, verbose, notice, warning)
loglevel notice

# Log file (empty = stdout)
logfile /var/log/redis/redis.log

# Slow log: log commands taking > 10ms (value in microseconds)
slowlog-log-slower-than 10000

# Keep last 128 slow commands
slowlog-max-len 128

# ========================================
# CLIENT CONNECTION LIMITS
# ========================================

# Max simultaneous clients
maxclients 10000

# ========================================
# LATENCY MONITORING
# ========================================

# Enable latency monitoring for spikes > 10ms
latency-monitor-threshold 10
```

Key metrics to monitor:
| Metric | Command | Warning Threshold |
|---|---|---|
| Memory usage | INFO memory | >80% of maxmemory |
| Connected clients | INFO clients | Near maxclients |
| Commands per second | INFO stats | Significant drops |
| Key evictions | INFO stats | Any in non-cache use |
| Slow commands | SLOWLOG | Growing count |
| Replication lag | INFO replication | master_link_status != up |
| Persistence status | INFO persistence | rdb_last_bgsave_status != ok |
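As one example of acting on these metrics, a health check can compare used_memory against maxmemory from INFO. The memory_pressure and should_alert helpers below are an illustrative sketch built on redis-py's info() call, not a standard API:

```python
def memory_pressure(client) -> float:
    """Fraction of maxmemory currently in use (0.0 when no limit is set)."""
    info = client.info('memory')  # redis-py: INFO memory section as a dict
    maxmemory = info.get('maxmemory', 0)
    if not maxmemory:
        return 0.0  # No limit configured; eviction cannot trigger
    return info['used_memory'] / maxmemory

def should_alert(client, threshold: float = 0.8) -> bool:
    """Alert when memory use crosses the >80% warning threshold from the table."""
    return memory_pressure(client) >= threshold
```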
The KEYS command scans ALL keys and blocks Redis during the scan. For a large database, this can freeze Redis for seconds. Use SCAN for iterative, non-blocking key enumeration instead. Rename KEYS to something unusable in production configs.
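For example, a bulk delete built on SCAN instead of KEYS might look like this. The delete_matching helper and its batch size are illustrative assumptions; scan_iter is redis-py's cursor-based wrapper around SCAN:

```python
def delete_matching(client, pattern: str, batch_size: int = 500) -> int:
    """Delete all keys matching pattern in batches, without blocking Redis
    the way a single KEYS call would. Returns the number of keys deleted."""
    deleted = 0
    batch = []
    # scan_iter walks the keyspace cursor-by-cursor; COUNT is only a hint
    for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            deleted += client.delete(*batch)
            batch = []
    if batch:
        deleted += client.delete(*batch)
    return deleted
```

Because each SCAN call returns only a small slice of the keyspace, other clients' commands interleave between iterations instead of stalling behind one long scan.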
We've taken a comprehensive tour of Redis as the canonical key-value store. Let's consolidate the essential knowledge:
What's next:
Now that we've mastered Redis as a canonical key-value store example, we'll explore the use cases where key-value stores excel—caching, session management, real-time analytics, queuing, and more. We'll understand when key-value stores are the right tool and when you should look elsewhere.
You now have deep, practical knowledge of Redis—the world's most popular key-value store. You understand its architecture, data structures, production patterns, persistence options, and scaling approaches. Next, we'll explore the specific use cases where key-value stores provide the most value.