Imagine receiving 50 separate notifications in 10 minutes because 50 people liked your viral post. Or getting the same "Your order has shipped" notification three times because of a retry race condition. These experiences erode user trust and lead to notification opt-outs—the death knell for engagement.
Batching and deduplication are two sides of the same coin: respecting user attention. Batching groups related notifications ("50 people liked your post") instead of bombarding users. Deduplication ensures users never receive the same notification twice, even when distributed systems make this surprisingly difficult.
This page teaches you how to design batching systems that intelligently group notifications, implement deduplication in distributed environments, handle edge cases that cause duplicates, and balance immediate delivery against user experience.
Notification batching serves multiple stakeholders. For users, it means fewer interruptions and more meaningful summaries. For the system, it means lower delivery volume, reduced infrastructure load, and lower provider costs:
| Metric | Without Batching | With Batching | Improvement |
|---|---|---|---|
| Notifications per active user/day | 50-100 | 10-20 | 5x reduction |
| Opt-out rate (monthly) | 2-5% | 0.5-1% | 4x improvement |
| Push delivery costs | $10,000/day | $8,000/day | 20% savings |
| Email server load | 10M sends/day | 3M sends/day | 70% reduction |
| User engagement rate | 15% open rate | 35% open rate | 2x improvement |
Batching delays notifications. Delay too long and notifications lose relevance. Batch too aggressively and summaries become overwhelming ('273 people liked your posts'). The art of batching is finding the sweet spot for each notification type and user.
Different scenarios call for different batching approaches. Understanding the trade-offs helps you choose the right strategy for each notification type.
Time-Based Batching
Collect notifications for a fixed time window before delivery:
```python
from collections import defaultdict
from threading import Timer

class TimeBasedBatcher:
    def __init__(self, window_seconds: int = 60):
        self.window = window_seconds
        self.pending = defaultdict(list)  # (user_id, batch_key) -> notifications
        self.timers = {}                  # (user_id, batch_key) -> Timer

    def add(self, notification: Notification):
        user_id = notification.user_id
        batch_key = self.get_batch_key(notification)
        self.pending[(user_id, batch_key)].append(notification)

        # Start the flush timer on the first notification of a batch
        if (user_id, batch_key) not in self.timers:
            timer = Timer(self.window, self.flush, args=[user_id, batch_key])
            self.timers[(user_id, batch_key)] = timer
            timer.start()

    def flush(self, user_id: str, batch_key: str):
        self.timers.pop((user_id, batch_key), None)  # clean up the fired timer
        notifications = self.pending.pop((user_id, batch_key), [])
        if notifications:
            batched = self.create_summary(notifications)
            self.deliver(batched)
```
When to use: Social interactions (likes, comments), activity updates, non-urgent aggregatable content.
Trade-offs: every notification waits up to the full window before delivery, and per-batch timer state must survive worker restarts; in exchange, users see one summary instead of a stream of individual alerts.
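A common refinement is a hybrid batcher that flushes when either the time window expires or a count threshold is reached, bounding both latency and batch size. The sketch below is a minimal single-process illustration (the `deliver` callback and `HybridBatcher` name are assumptions, not from the text above):

```python
from collections import defaultdict
from threading import Timer
from typing import Callable

class HybridBatcher:
    """Flush a batch when EITHER the time window expires OR max_count is reached."""

    def __init__(self, window_seconds: float, max_count: int,
                 deliver: Callable[[str, list], None]):
        self.window = window_seconds
        self.max_count = max_count
        self.deliver = deliver
        self.pending = defaultdict(list)  # batch_key -> items
        self.timers = {}                  # batch_key -> Timer

    def add(self, batch_key: str, item) -> None:
        self.pending[batch_key].append(item)
        if len(self.pending[batch_key]) >= self.max_count:
            self.flush(batch_key)           # size threshold hit: flush now
        elif batch_key not in self.timers:  # first item: arm the window timer
            t = Timer(self.window, self.flush, args=[batch_key])
            self.timers[batch_key] = t
            t.start()

    def flush(self, batch_key: str) -> None:
        timer = self.timers.pop(batch_key, None)
        if timer is not None:
            timer.cancel()                  # avoid a late double-flush
        items = self.pending.pop(batch_key, [])
        if items:
            self.deliver(batch_key, items)
```

In a distributed deployment the same either/or logic would live behind the Redis state and lock described below rather than in-process timers.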
Implementing batching in a distributed system requires careful architecture to handle state, timers, and coordination across multiple workers.
Key Components:
1. Partitioned Processing: Partition notifications by user_id to ensure all notifications for a user are processed by the same worker. This prevents race conditions and enables local batching state.
2. Distributed State (Redis):
```
# Redis data structures for batching

# Batch contents: list of notification IDs
RPUSH batch:{user_id}:{type} notification_id

# Batch metadata: first-notification timestamp and count
HSET batch_meta:{user_id}:{type} first_ts 1703001234 count 5

# TTL ensures orphaned batches eventually expire
EXPIRE batch:{user_id}:{type} 3600
```
3. Timer Service: Manages delayed flush events. Options include delayed messages on the queue itself (e.g., an SQS delivery delay or a RabbitMQ delayed exchange), a Redis sorted set keyed by flush time that a scheduler polls, or a dedicated timer-wheel service.
4. Exactly-Once Flush: Prevent double-flush when timer fires and count threshold hits simultaneously:
```python
def flush_with_lock(batch_key: str):
    lock_key = f"flush_lock:{batch_key}"
    # Try to acquire the lock with a short TTL so a crashed
    # worker cannot hold it forever
    if redis.set(lock_key, "1", nx=True, ex=5):
        try:
            notifications = redis.lrange(f"batch:{batch_key}", 0, -1)
            redis.delete(f"batch:{batch_key}")
            if notifications:
                deliver_batch(notifications)
        finally:
            redis.delete(lock_key)
```
Choose batch keys carefully. Too specific (user + notification_type + source_id) creates too many small batches. Too general (user + notification_type) may batch unrelated items. Example: for 'like' notifications, batch by (user_id, 'like', content_id) to group likes on the same post.
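The tip above can be made concrete with a small key builder. This is a sketch; the set of content-aggregatable types is an assumption for illustration:

```python
# Types whose notifications should be grouped per piece of content
AGGREGATE_BY_CONTENT = {"like", "comment", "reaction"}

def get_batch_key(user_id: str, notif_type: str, source_id: str) -> str:
    """Build a batch key: group content interactions per post,
    everything else per (user, type)."""
    if notif_type in AGGREGATE_BY_CONTENT:
        return f"{user_id}:{notif_type}:{source_id}"
    return f"{user_id}:{notif_type}"
```

With this scheme, likes on two different posts land in two batches, while new followers all collapse into one.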
Batched notifications need intelligent summarization. "10 people liked your post" is more useful than just a count—it should highlight who liked it and provide quick actions.
Summary Generation Strategies:
```python
from typing import List

class NotificationSummarizer:
    def summarize(self, notifications: List[Notification]) -> BatchedNotification:
        notification_type = notifications[0].type
        if notification_type == 'like':
            return self.summarize_likes(notifications)
        elif notification_type == 'comment':
            return self.summarize_comments(notifications)
        elif notification_type == 'follow':
            return self.summarize_follows(notifications)
        else:
            return self.generic_summary(notifications)

    def summarize_likes(self, notifications: List[Notification]) -> BatchedNotification:
        actors = [n.actor for n in notifications]
        content = notifications[0].target_content
        # Prioritize actors: friends first, then verified, then by recency
        prioritized = self.prioritize_actors(actors)
        count = len(notifications)

        if count == 1:
            title = f"{prioritized[0].name} liked your post"
        elif count == 2:
            title = f"{prioritized[0].name} and {prioritized[1].name} liked your post"
        elif count <= 5:
            others = "other" if count - 2 == 1 else "others"
            title = f"{prioritized[0].name}, {prioritized[1].name}, and {count - 2} {others} liked your post"
        else:
            title = f"{prioritized[0].name} and {count - 1} others liked your post"

        return BatchedNotification(
            title=title,
            body=f'"{content.preview}"',
            data={
                'type': 'batched_likes',
                'content_id': content.id,
                'actor_ids': [a.id for a in prioritized[:5]],
                'total_count': count,
            },
            actions=[
                Action('view', 'View Post', f'/post/{content.id}'),
            ]
        )
```
| Count | Template Pattern | Example |
|---|---|---|
| 1 | {actor} {action} your {target} | John liked your photo |
| 2 | {actor1} and {actor2} {action}... | John and Jane liked your photo |
| 3-5 | {actor1}, {actor2}, and N others... | John, Jane, and 3 others liked... |
| 6+ | {actor1} and N others {action}... | John and 47 others liked your photo |
| 100+ | Many people {action} your {target} | Many people liked your photo |
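The count tiers in the table reduce to one small, testable function. This is a sketch: the tier boundaries follow the table, while the function name and `target` default are assumptions:

```python
def like_title(actor_names: list, count: int, target: str = "photo") -> str:
    """Pick a summary template based on the table's count tiers."""
    if count >= 100:
        return f"Many people liked your {target}"
    if count >= 6:
        return f"{actor_names[0]} and {count - 1} others liked your {target}"
    if count >= 3:
        return f"{actor_names[0]}, {actor_names[1]}, and {count - 2} others liked your {target}"
    if count == 2:
        return f"{actor_names[0]} and {actor_names[1]} liked your {target}"
    return f"{actor_names[0]} liked your {target}"
```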
Deduplication ensures users never receive the same notification twice—even when distributed systems, retries, and race conditions conspire to create duplicates. This is harder than it sounds.
Root Cause Analysis:

Duplicates arise from the inherent tension between at-least-once delivery (queues and retries guarantee an event is processed, possibly more than once) and exactly-once semantics, which distributed systems cannot guarantee end to end across network partitions, worker crashes, and provider retries.

Since true exactly-once is impractical at scale, we implement at-least-once delivery with application-level deduplication.
Idempotency Keys:
Every notification must carry a unique, deterministic identifier:
```python
import hashlib

def generate_idempotency_key(event: Event) -> str:
    """
    Generate a stable key that identifies this unique notification.
    The same input always produces the same key.
    """
    components = [
        event.type,
        event.user_id,
        event.source_id,   # e.g., the post that was liked
        event.actor_id,    # e.g., who liked it
        event.timestamp.date().isoformat(),  # daily uniqueness
    ]
    # Hash to a fixed-length key
    content = "|".join(str(c) for c in components)
    return hashlib.sha256(content.encode()).hexdigest()[:32]
```
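A quick worked example shows why the key uses `date()` rather than the full timestamp: a retry seconds later collapses to the same key, while a different actor does not. The `Event` namedtuple here is a self-contained stand-in for the real event class:

```python
import hashlib
from collections import namedtuple
from datetime import datetime

Event = namedtuple("Event", "type user_id source_id actor_id timestamp")

def generate_idempotency_key(event) -> str:
    components = [
        event.type, event.user_id, event.source_id, event.actor_id,
        event.timestamp.date().isoformat(),  # date, not the full timestamp
    ]
    content = "|".join(str(c) for c in components)
    return hashlib.sha256(content.encode()).hexdigest()[:32]

# A retry 30 seconds later produces the SAME key (same day)...
first = Event("like", "u1", "post9", "u2", datetime(2024, 1, 2, 12, 0, 0))
retry = Event("like", "u1", "post9", "u2", datetime(2024, 1, 2, 12, 0, 30))
# ...while a different actor produces a DIFFERENT key.
other = Event("like", "u1", "post9", "u3", datetime(2024, 1, 2, 12, 0, 0))
```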
Key Design Considerations: include every field that distinguishes genuinely different events, but exclude volatile fields such as exact timestamps, retry counters, or request IDs — those differ across retries and would defeat deduplication.
Implementing deduplication requires a fast lookup mechanism to check if a notification has already been processed or delivered.
Multi-Layer Deduplication:
```python
class DeduplicationService:
    def __init__(self):
        # BloomFilter stands in for any bloom filter implementation
        self.bloom = BloomFilter(expected_items=100_000_000, fp_rate=0.01)
        self.redis = RedisClient()
        self.db = DatabaseClient()

    def check_and_mark(self, key: str) -> bool:
        """
        Returns True if this is a new notification (should process).
        Returns False if it is a duplicate (should skip).
        """
        # Layer 1: Bloom filter (fast negative check)
        if self.bloom.check(key):
            # Might be a duplicate (bloom filters have false positives);
            # need a definitive check.
            # Layer 2: Redis (recent dedup window)
            if not self.redis.set(f"dedup:{key}", "1", nx=True, ex=86400):
                # Already exists in Redis - definite duplicate
                return False
            # Layer 3: Database (persist for compliance)
            try:
                self.db.insert_dedup_key(key)
            except UniqueViolation:
                # Race condition: another process inserted first
                return False
            return True  # bloom false positive - actually new
        else:
            # Not in bloom filter - definitely new
            self.bloom.add(key)
            self.redis.set(f"dedup:{key}", "1", ex=86400)
            self.db.insert_dedup_key(key)
            return True
```
TTL Strategy:
| Notification Type | Dedup Window | Rationale |
|---|---|---|
| Transaction (order shipped) | 7 days | Prevent retry duplicates |
| Social (like, comment) | 24 hours | Same action shouldn't re-notify within a day |
| Marketing campaign | 30+ days | Prevent re-sending same campaign |
| Security alerts | 1 hour | May need to re-alert if ongoing |
| OTP codes | 5 minutes | Short-lived, frequent regeneration |
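The per-type windows in the table boil down to a TTL check. Below is a minimal in-memory sketch with an injectable clock so the window logic is testable; in production, Redis TTLs or a database would back this state, and the type names are assumptions mirroring the table:

```python
import time

DEDUP_WINDOWS = {              # seconds, mirroring the table above
    "transactional": 7 * 86400,
    "social": 86400,
    "marketing": 30 * 86400,
    "security": 3600,
    "otp": 300,
}

class DedupWindow:
    def __init__(self, clock=time.time):
        self.clock = clock
        self.seen = {}  # key -> timestamp of last notification

    def check_and_mark(self, key: str, notif_type: str) -> bool:
        """True if the notification is new within its type's window."""
        window = DEDUP_WINDOWS[notif_type]
        now = self.clock()
        last = self.seen.get(key)
        if last is not None and now - last < window:
            return False  # duplicate inside the window
        self.seen[key] = now
        return True
```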
Redis can lose data during failover. For critical paths (financial notifications), write dedup keys to a database before delivery. Accept the performance cost for guaranteed correctness. Use Redis as a cache layer for performance, not as the authoritative store.
When notifications go through multiple channels (push + email for important updates), deduplication becomes more nuanced. Users shouldn't receive the same information twice, but the rules depend on intent.
| Strategy | Behavior | Use Case |
|---|---|---|
| Full Dedup | One channel success cancels others | Redundant fallback chains |
| Acknowledgment-Based | Stop fallback when user acknowledges | Critical notifications with escalation |
| Independent Channels | Each channel delivers regardless | Intentional multi-channel (push + email receipt) |
| Time-Delayed Dedup | Cancel secondary if primary acked within N minutes | Soft fallback escalation |
Implementation Pattern:
```python
class CrossChannelDeduplicator:
    def __init__(self):
        self.delivery_tracker = DeliveryTracker()

    def should_deliver(self, notification: Notification, channel: str) -> bool:
        dedup_strategy = notification.cross_channel_strategy
        tracking_key = notification.idempotency_key

        if dedup_strategy == 'independent':
            # Each channel operates independently
            return self.single_channel_dedup(tracking_key, channel)

        elif dedup_strategy == 'full_dedup':
            # Check if ANY channel already delivered
            return not self.delivery_tracker.any_channel_delivered(tracking_key)

        elif dedup_strategy == 'acknowledgment_based':
            # Check if the user acknowledged via any channel
            return not self.delivery_tracker.user_acknowledged(tracking_key)

        elif dedup_strategy == 'time_delayed':
            # The primary channel has N minutes before the secondary fires
            primary_delivered = self.delivery_tracker.get_delivery_time(
                tracking_key,
                notification.primary_channel
            )
            if primary_delivered:
                time_since = now() - primary_delivered
                if time_since < notification.fallback_delay:
                    return False  # Give the primary channel more time
            if self.delivery_tracker.user_acknowledged(tracking_key):
                return False  # User already responded
            return True

        # Unknown strategy: fail open so the notification still delivers
        return True

    def record_delivery(self, notification: Notification, channel: str):
        self.delivery_tracker.mark_delivered(
            notification.idempotency_key,
            channel,
            timestamp=now()
        )
```
Users with multiple devices (phone + tablet + watch) may receive the same push notification on all devices. Operating systems handle some of this (notification sync), but your system may need to track per-device delivery and suppress on already-seen notifications when users switch devices.
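One way to sketch the per-device tracking described above, with in-memory state; in practice the "seen" signal would come from read receipts or OS notification-center sync, and all names here are illustrative:

```python
from collections import defaultdict

class DeviceDeliveryTracker:
    """Track which devices received a notification, and suppress
    re-sends once the user has seen it anywhere."""

    def __init__(self):
        self.delivered = defaultdict(set)  # notif_key -> device_ids
        self.seen = set()                  # notif_keys the user has viewed

    def should_send(self, notif_key: str, device_id: str) -> bool:
        if notif_key in self.seen:
            return False  # already viewed on some device
        return device_id not in self.delivered[notif_key]

    def mark_delivered(self, notif_key: str, device_id: str) -> None:
        self.delivered[notif_key].add(device_id)

    def mark_seen(self, notif_key: str) -> None:
        self.seen.add(notif_key)
```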
Real-world batching and deduplication systems encounter numerous edge cases that can cause surprising behavior if not handled carefully.
The "Unlike" Problem in Detail:
```python
class LikeNotificationHandler:
    def handle_like_event(self, event: LikeEvent) -> Optional[Notification]:
        """
        Handle the complexity of like/unlike sequences.
        """
        # Get the current like state from the source of truth
        current_state = self.content_service.get_like_state(
            event.user_id,
            event.content_id
        )
        if not current_state.is_liked:
            # User has since unliked - don't notify
            return None

        # Check if we already notified for a like on this content
        dedup_key = f"like:{event.content_owner_id}:{event.content_id}:{event.user_id}"
        notification_window = 86400 * 7  # 7 days

        last_notification = self.dedup.get_last_notification_time(dedup_key)
        if last_notification and (now() - last_notification) < notification_window:
            # Already notified within the window - skip
            return None

        # All checks passed - create the notification
        return create_like_notification(event)
```
This handles the sequence: Like → Notify → Unlike → Like (don't notify again within window).
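The decision logic can be isolated into a pure function and exercised against each step of that sequence. This is a sketch with hypothetical parameter names standing in for the content-service and dedup lookups:

```python
SEVEN_DAYS = 86400 * 7

def should_notify(is_currently_liked: bool, last_notified_at, now_ts: float,
                  window: float = SEVEN_DAYS) -> bool:
    """Decide whether a like event should produce a notification."""
    if not is_currently_liked:
        return False  # user has since unliked
    if last_notified_at is not None and now_ts - last_notified_at < window:
        return False  # already notified within the window
    return True
```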
Batching and deduplication are essential for user experience and system efficiency. They transform a potential flood of notifications into a manageable, valuable stream of information.
What's Next:
With batching and deduplication covered, we'll explore User Preferences—the system that gives users control over their notification experience. You'll learn how to design preference schemas, handle inheritance and defaults, and implement preference-aware routing.
You now understand how to implement batching for user experience and system efficiency, and how to prevent duplicates in distributed environments. These techniques are crucial for any notification system handling significant volume.