Selective Repeat's efficiency comes at a price: memory. Unlike Go-Back-N where the receiver needs almost no buffer (just enough for one frame), SR requires both sender and receiver to maintain substantial buffers.
This isn't a minor implementation detail—it's a fundamental architectural decision that shapes how the protocol scales, what hardware can implement it, and where it's appropriate to deploy.
In the early days of networking, memory was expensive and precious. A router with 64KB of RAM couldn't afford to buffer dozens of 1500-byte frames per connection. This is why Go-Back-N remained popular despite its inefficiency—the hardware couldn't support SR.
Today, memory is abundant and cheap. A smartphone has gigabytes of RAM. But understanding buffer requirements remains essential for protocol design, embedded systems, and high-performance networking where every microsecond and every byte matters.
By the end of this page, you will understand the complete buffer requirements for Selective Repeat—sender buffers for retransmission, receiver buffers for reordering, how to calculate buffer sizes, and the trade-offs between buffer memory and protocol performance.
The sender in Selective Repeat must retain copies of all transmitted but unacknowledged frames. These copies are essential for retransmission when frames are lost or when ACKs don't arrive in time.
Why the Sender Needs Buffers:
- A data frame may be lost or corrupted in transit, so the sender must keep its own copy to retransmit.
- An ACK may be lost or delayed; when the per-frame timer expires, the stored copy is resent.
- Only once a frame is acknowledged can its copy be safely discarded.
Sender Buffer Formula:
Sender Buffer Size = Window Size (W) × Maximum Frame Size
= W × S bytes
Where S is the maximum transmission unit (MTU) for the link.
| Window Size (W) | Frame Size (S) | Buffer Required | Typical Use Case |
|---|---|---|---|
| 4 | 1500 bytes | 6 KB | Low-latency LAN |
| 8 | 1500 bytes | 12 KB | Standard Ethernet |
| 32 | 1500 bytes | 48 KB | WAN links |
| 64 | 1500 bytes | 96 KB | Satellite links |
| 128 | 1500 bytes | 192 KB | Long-fat networks |
| 1024 | 1500 bytes | 1.5 MB | High-BDP datacenter |
Important Observations:
- Buffer size grows linearly with window size; doubling W doubles the memory required.
- These figures are per connection, so a server must multiply them by its number of concurrent connections.
- Even the largest single-connection figure here (1.5 MB) is trivial for a modern server, but it was prohibitive for early routers with tens of kilobytes of RAM.
When Can Sender Release Buffer?
A sender can free the buffer for frame N when:
- An ACK for frame N has been received, meaning that frame will never need to be retransmitted.
In practice, the sender releases buffers incrementally as the window advances.
Production implementations often use buffer pools—pre-allocated memory chunks that are recycled. This avoids the overhead of malloc/free for each frame and prevents memory fragmentation under load.
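As a rough sketch of that pooling idea (class and method names here are illustrative, not taken from any particular stack), a fixed set of frame-sized slots can be allocated once and recycled through a free list:

```python
class FrameBufferPool:
    """Pre-allocates frame-sized slots and recycles them, avoiding per-frame malloc/free."""

    def __init__(self, num_slots: int, frame_size: int = 1500):
        self.slots = [bytearray(frame_size) for _ in range(num_slots)]  # allocated once, up front
        self.free = list(range(num_slots))                              # indices of available slots

    def acquire(self) -> int:
        """Take a free slot index; raises IndexError if the pool is exhausted."""
        return self.free.pop()

    def release(self, slot: int) -> None:
        """Return a slot for reuse (memory is never freed, only recycled)."""
        self.free.append(slot)


pool = FrameBufferPool(num_slots=8)
slot = pool.acquire()      # on send: grab a slot and copy the frame into pool.slots[slot]
pool.release(slot)         # on ACK: hand the slot back instead of freeing memory
```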
The receiver buffer requirement is where Selective Repeat fundamentally differs from Go-Back-N. The receiver must buffer out-of-order frames until gaps are filled.
Go-Back-N Receiver:
Buffer = 1 frame (just the expected next frame, briefly)
Memory = Minimal, effectively zero permanent allocation
Selective Repeat Receiver:
Buffer = W frames (full window size)
Memory = W × S bytes
Why Full Window Size?
In the worst case, frames 1 through W-1 all arrive before frame 0: none of them can be delivered to the application until frame 0 fills the gap, so all W-1 frames must be held in the buffer. When frame 0 finally arrives (or its retransmission does), the entire window is delivered in order at once.
The buffer must accommodate this worst-case scenario.
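A small sketch of that worst case, assuming a dictionary-keyed reorder buffer (illustrative only; a more memory-conscious layout appears later on this page):

```python
class ReorderBuffer:
    """Receiver-side buffer: holds out-of-order frames until the gap at the window base is filled."""

    def __init__(self, window_size: int):
        self.W = window_size
        self.base = 0        # lowest sequence number not yet delivered
        self.pending = {}    # seq_num -> frame data, buffered out of order

    def on_frame(self, seq_num: int, data: bytes) -> list:
        """Accept a frame inside the window; return any frames now deliverable in order."""
        if not (self.base <= seq_num < self.base + self.W):
            return []                         # outside window: duplicate or invalid, ignore
        self.pending[seq_num] = data
        delivered = []
        while self.base in self.pending:      # deliver the contiguous run starting at the base
            delivered.append(self.pending.pop(self.base))
            self.base += 1
        return delivered


rb = ReorderBuffer(window_size=4)
for seq in (1, 2, 3):                         # frames 1..W-1 arrive first: all must be buffered
    assert rb.on_frame(seq, f"frame {seq}".encode()) == []
print(rb.on_frame(0, b"frame 0"))             # frame 0 arrives: all four delivered in order
```

The calculation below then puts concrete numbers on these buffer requirements for various window sizes and connection counts.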
```python
def calculate_sr_buffer_requirements(
    window_size: int,
    max_frame_size: int = 1500,
    num_connections: int = 1
) -> dict:
    """
    Calculate complete buffer requirements for Selective Repeat.

    Args:
        window_size: Number of frames in sliding window (W)
        max_frame_size: Maximum frame size in bytes (default: 1500 MTU)
        num_connections: Number of concurrent connections

    Returns:
        Dictionary with buffer requirements
    """
    # Per-connection sender buffer
    # Must store W frames for potential retransmission
    sender_buffer_per_conn = window_size * max_frame_size

    # Per-connection receiver buffer
    # Must store up to W frames for out-of-order handling
    receiver_buffer_per_conn = window_size * max_frame_size

    # Per-connection total
    total_per_conn = sender_buffer_per_conn + receiver_buffer_per_conn

    # Per-connection overhead (metadata, timers, etc.)
    # Typical: ~100 bytes per frame for tracking
    metadata_per_conn = window_size * 100  # Conservative estimate

    # Full system requirements
    total_sender = sender_buffer_per_conn * num_connections
    total_receiver = receiver_buffer_per_conn * num_connections
    total_metadata = metadata_per_conn * num_connections

    return {
        'window_size': window_size,
        'frame_size': max_frame_size,
        'connections': num_connections,
        # Per-connection
        'sender_per_conn_kb': sender_buffer_per_conn / 1024,
        'receiver_per_conn_kb': receiver_buffer_per_conn / 1024,
        'total_per_conn_kb': total_per_conn / 1024,
        # System-wide
        'total_sender_mb': total_sender / (1024 * 1024),
        'total_receiver_mb': total_receiver / (1024 * 1024),
        'total_data_mb': (total_sender + total_receiver) / (1024 * 1024),
        'total_with_overhead_mb': (total_sender + total_receiver + total_metadata) / (1024 * 1024),
    }


# Example calculations
print("=== Low-latency LAN (W=8) ===")
lan = calculate_sr_buffer_requirements(window_size=8, num_connections=100)
print(f"Per connection: {lan['total_per_conn_kb']:.1f} KB")
print(f"100 connections: {lan['total_with_overhead_mb']:.1f} MB")

print("\n=== Satellite Link (W=64) ===")
sat = calculate_sr_buffer_requirements(window_size=64, num_connections=10)
print(f"Per connection: {sat['total_per_conn_kb']:.1f} KB")
print(f"10 connections: {sat['total_with_overhead_mb']:.1f} MB")

print("\n=== High-throughput Server (W=128, 10K connections) ===")
server = calculate_sr_buffer_requirements(window_size=128, num_connections=10000)
print(f"Per connection: {server['total_per_conn_kb']:.1f} KB")
print(f"10K connections: {server['total_with_overhead_mb']:.1f} MB = {server['total_with_overhead_mb']/1024:.1f} GB")
```

A server handling 10,000 connections with W=128 needs ~3.8 GB just for SR buffers. This is why large-scale servers carefully tune window sizes and use techniques like receive window auto-sizing (TCP RWIN).
Window size (W) directly controls both throughput potential and memory consumption. Choosing W involves balancing these competing concerns:
Larger Window (Higher W):
| Benefit | Cost |
|---|---|
| Higher potential throughput | More memory per connection |
| Better utilization of high-BDP links | More per-frame metadata |
| Can keep pipe full during bursts | Longer recovery from congestion |
| More resilient to ACK loss | Higher head-of-line blocking latency |
The Bandwidth-Delay Product Constraint:
For full throughput, window size must satisfy:
W ≥ (Bandwidth × RTT) / Frame_Size
= BDP / Frame_Size
But memory limits W, creating a practical ceiling.
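To make this concrete, here is the satellite row of the table below worked out in a few lines of Python (the numbers simply restate the table):

```python
bandwidth_bps = 10_000_000     # 10 Mbps satellite link
rtt_s = 0.600                  # 600 ms round-trip time
frame_size = 1500              # bytes per frame

bdp_bytes = bandwidth_bps / 8 * rtt_s            # 750,000 bytes = 750 KB in flight
min_window = bdp_bytes / frame_size              # ~500 frames to keep the pipe full
memory_per_conn = 2 * min_window * frame_size    # sender + receiver buffers: ~1.5 MB

print(f"BDP: {bdp_bytes / 1e3:.0f} KB, min window: {min_window:.0f} frames, "
      f"memory/connection: {memory_per_conn / 1e6:.1f} MB")
```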
| Link Type | RTT | Bandwidth | BDP | Min Window | Memory/Conn |
|---|---|---|---|---|---|
| LAN (Ethernet) | 0.5 ms | 1 Gbps | 62.5 KB | ~42 frames | 126 KB |
| WAN (Cross-country) | 50 ms | 100 Mbps | 625 KB | ~417 frames | 1.25 MB |
| Satellite | 600 ms | 10 Mbps | 750 KB | ~500 frames | 1.5 MB |
| Intercontinental | 200 ms | 1 Gbps | 25 MB | ~16,667 frames | 50 MB |
| Datacenter (low lat) | 0.1 ms | 100 Gbps | 1.25 MB | ~833 frames | 2.5 MB |
Practical Implications:
- LAN and datacenter links need only modest windows, so memory is rarely the limiting factor there.
- Long-fat links (satellite, intercontinental) need anywhere from hundreds to more than 16,000 frames in flight, pushing per-connection buffer memory into the megabytes.
- Windows that large also demand wide sequence-number fields, which leads directly to the constraint below.
Key Insight:
The sequence number space must be at least 2W (we'll cover this in the next page), so larger windows also require more sequence number bits. With 8-bit sequence numbers (0-255), maximum W is 128.
Modern protocols like TCP dynamically adjust window size based on network conditions and available buffer memory. This allows adapting to both fast LANs and slow WANs with a single implementation.
Efficient buffer implementation requires careful attention to memory layout. Poor choices can lead to cache misses, fragmentation, and excessive overhead.
Naive Approach (Don't Do This):
```python
# Allocate frames individually as needed
class NaiveSRBuffer:
    def __init__(self):
        self.buffer = {}

    def store_frame(self, seq, data):
        self.buffer[seq] = bytearray(data)  # Heap allocation (and copy) each time

    def free_frame(self, seq):
        del self.buffer[seq]  # Orphaned memory, eventual GC
```
Problems:
- A heap allocation and a copy on every stored frame.
- Per-entry dictionary overhead on top of the frame data itself.
- Memory fragmentation under sustained load as frames are allocated and freed.
- Poor cache locality, since frames end up scattered across the heap.
Efficient Approach (Circular Buffer + Pool):
```python
import mmap
import time  # used for frame timestamps
from dataclasses import dataclass
from typing import Optional


@dataclass
class FrameMetadata:
    """Compact per-frame tracking information."""
    sequence_number: int = -1   # -1 means slot is free
    data_offset: int = 0        # Offset into data buffer
    data_length: int = 0        # Actual frame size
    timestamp: float = 0.0      # For timeout calculation
    status: int = 0             # 0=free, 1=sent, 2=acked


class EfficientSRBuffer:
    """
    Memory-efficient buffer for Selective Repeat.

    Uses contiguous pre-allocated memory for both data and metadata,
    ensuring cache efficiency and zero per-frame allocation overhead.
    """

    # Frame status constants
    FREE = 0
    SENT = 1
    ACKED = 2
    RECEIVED = 3

    def __init__(self, window_size: int, max_frame_size: int = 1500):
        self.W = window_size
        self.frame_size = max_frame_size

        # Pre-allocate contiguous data buffer
        # Using mmap for efficient memory management
        self.data_buffer_size = window_size * max_frame_size
        self.data_buffer = mmap.mmap(-1, self.data_buffer_size)

        # Pre-allocate metadata array (fixed-size, no dict overhead)
        self.metadata = [FrameMetadata() for _ in range(window_size)]

        # Free list for O(1) slot allocation
        self.free_slots = list(range(window_size))

        # Statistics
        self.stats = {'stores': 0, 'retrievals': 0, 'max_utilization': 0}

    def _slot_for_seq(self, seq_num: int) -> int:
        """Map sequence number to buffer slot (circular)."""
        return seq_num % self.W

    def store(self, seq_num: int, data: bytes) -> bool:
        """
        Store a frame in the buffer.

        Zero-copy when possible: data goes directly into pre-allocated memory.
        """
        slot = self._slot_for_seq(seq_num)

        if self.metadata[slot].status != self.FREE:
            return False  # Slot in use

        if len(data) > self.frame_size:
            raise ValueError(f"Frame too large: {len(data)} > {self.frame_size}")

        # Calculate offset into data buffer
        offset = slot * self.frame_size

        # Copy data into pre-allocated buffer (single memcpy)
        self.data_buffer[offset:offset + len(data)] = data

        # Update metadata
        self.metadata[slot] = FrameMetadata(
            sequence_number=seq_num,
            data_offset=offset,
            data_length=len(data),
            timestamp=time.time(),
            status=self.SENT
        )

        self.stats['stores'] += 1
        current_util = sum(1 for m in self.metadata if m.status != self.FREE)
        self.stats['max_utilization'] = max(self.stats['max_utilization'], current_util)
        return True

    def retrieve(self, seq_num: int) -> Optional[bytes]:
        """
        Retrieve a frame from the buffer.

        Returns copy of data (caller owns it).
        """
        slot = self._slot_for_seq(seq_num)
        meta = self.metadata[slot]

        if meta.status == self.FREE or meta.sequence_number != seq_num:
            return None

        self.stats['retrievals'] += 1

        # Return copy of data from buffer
        return bytes(self.data_buffer[meta.data_offset:meta.data_offset + meta.data_length])

    def mark_acked(self, seq_num: int) -> bool:
        """Mark a frame as acknowledged (ready for reuse)."""
        slot = self._slot_for_seq(seq_num)
        meta = self.metadata[slot]
        if meta.sequence_number == seq_num:
            meta.status = self.ACKED
            return True
        return False

    def release(self, seq_num: int):
        """Release a slot for reuse."""
        slot = self._slot_for_seq(seq_num)
        self.metadata[slot] = FrameMetadata()  # Reset to free

    def get_memory_usage(self) -> dict:
        """Return current memory usage statistics."""
        used_slots = sum(1 for m in self.metadata if m.status != self.FREE)
        return {
            'total_allocated_kb': self.data_buffer_size / 1024,
            'used_slots': used_slots,
            'used_data_kb': used_slots * self.frame_size / 1024,
            'utilization_pct': used_slots / self.W * 100,
            'metadata_overhead_bytes': len(self.metadata) * 48,  # ~48 bytes per FrameMetadata
        }

    def __del__(self):
        """Clean up memory-mapped buffer."""
        self.data_buffer.close()
```

While basic SR requires equal sender and receiver window sizes, real-world implementations often have asymmetric requirements:
Sender Buffer Usage Patterns: occupancy tracks the number of unacknowledged frames in flight; it is low under light load, pegged at 100% when the pipe is kept full, and stays full during loss because frames must be retained until their retransmissions are acknowledged.
Receiver Buffer Usage Patterns: occupancy stays near zero while frames arrive in order and grows only while gaps are outstanding; burst loss can push it toward the full window, after which it drains rapidly once the missing frames arrive.
Asymmetric Windowing (TCP approach):
TCP allows the receiver to advertise a smaller receive window:
Sender window size = min(congestion_window, receiver_advertised_window)
Receiver can say: "I only have room for 32 frames right now"
Sender respects this even if it could send more
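A minimal sketch of that rule, with illustrative names (real congestion and flow control logic is far more involved):

```python
def advertised_window(total_slots: int, buffered_out_of_order: int) -> int:
    """Receiver advertises how many more frames it can buffer right now."""
    return total_slots - buffered_out_of_order

def effective_send_window(congestion_window: int, receiver_advertised: int) -> int:
    """Sender never keeps more frames outstanding than the receiver says it can hold."""
    return min(congestion_window, receiver_advertised)

# Receiver has a 64-slot buffer with 32 slots already holding out-of-order frames:
rwnd = advertised_window(total_slots=64, buffered_out_of_order=32)          # 32 frames of room
print(effective_send_window(congestion_window=128, receiver_advertised=rwnd))  # -> 32
```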
| Scenario | Sender Buffer | Receiver Buffer | Notes |
|---|---|---|---|
| Light load, no errors | Low (<25%) | Empty (0%) | Ideal case |
| Heavy load, no errors | Full (100%) | Low (<10%) | All in-flight |
| Heavy load, random loss | Full (100%) | Moderate (25-50%) | Buffering gaps |
| Burst loss event | Near full | High (50-90%) | Many gaps |
| Severe congestion | Full (100%) | Variable | Timeout-driven |
| Recovery after outage | Draining | Bursty high | Delivering buffered |
Flow Control Implications:
Receiver buffer exhaustion creates a critical flow control signal:
IF receiver_buffer is full:
Receiver cannot accept more out-of-order frames
Receiver window effectively shrinks
Receiver may need to drop incoming frames (defeats SR purpose!)
Should signal sender to slow down
This is why TCP's receive window advertisement exists—the receiver proactively tells the sender how much buffer space remains, preventing buffer overflow.
Malicious senders could exploit receiver buffering by intentionally creating gaps (sending frames 1, 3, 5, 7... but never 0, 2, 4, 6...). This fills receiver buffers without delivering data. Real protocols have timeouts to abandon incomplete sequences.
Beyond frame data, Selective Repeat requires significant metadata per frame. This overhead can dominate for small frames.
Per-Frame Metadata at Sender:
| Field | Size | Purpose |
|---|---|---|
| Sequence number | 2-4 bytes | Frame identification |
| Status flags | 1 byte | sent/acked/pending |
| Retransmit count | 1 byte | Track retry attempts |
| Timestamp | 8 bytes | For RTT calculation |
| Timer reference | 8 bytes | Pointer to timer structure |
| Buffer pointer | 8 bytes | Pointer to frame data |
| Total | ~30 bytes | Per outstanding frame |
Per-Frame Metadata at Receiver:
| Field | Size | Purpose |
|---|---|---|
| Received flag | 1 byte | Whether frame in buffer |
| Sequence number | 2-4 bytes | Verification |
| Buffer pointer | 8 bytes | Pointer to stored data |
| Total | ~12 bytes | Per window slot |
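To make the per-slot figure concrete, a fixed-width encoding along the lines of this table packs into 11 bytes. This is a sketch using Python's struct module; the field widths follow the table above rather than any real protocol:

```python
import struct

# received flag (1 byte), sequence number (2 bytes), buffer offset in place of a pointer (8 bytes)
RECEIVER_SLOT = struct.Struct("<BHQ")   # '<' = little-endian, B = u8, H = u16, Q = u64

record = RECEIVER_SLOT.pack(1, 42, 63000)      # frame 42 received, stored at offset 63000
print(RECEIVER_SLOT.size)                      # 11 bytes per window slot, near the ~12-byte estimate
flag, seq, offset = RECEIVER_SLOT.unpack(record)
```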
Timer Structures:
Selective Repeat needs per-frame timers. Naive implementation:
W timers × 32 bytes per timer structure = W × 32 bytes
With W=128, that's 4KB just for timer structures.
```python
import time
from heapq import heappush, heappop
from dataclasses import dataclass, field
from typing import Callable


@dataclass(order=True)
class TimerEvent:
    """Represents a scheduled timeout event."""
    expire_time: float
    seq_num: int = field(compare=False)
    callback: Callable = field(compare=False)
    cancelled: bool = field(default=False, compare=False)


class EfficientTimerWheel:
    """
    Centralized timeout manager for per-frame timers.

    Instead of W separate timer objects, maintains a single priority
    queue of timeout events. O(log n) insert/cancel, O(1) check for
    next expiry.

    Memory: Single heap + event objects (~50 bytes each)
    vs W separate threading.Timer objects (~300 bytes each)
    """

    def __init__(self, default_timeout: float = 1.0):
        self.timeout = default_timeout
        self.timer_heap = []        # Min-heap by expire_time
        self.active_timers = {}     # seq_num -> TimerEvent (for cancellation)

    def schedule(self, seq_num: int, callback: Callable):
        """Schedule a timeout for a specific frame."""
        # Cancel any existing timer for this seq_num
        self.cancel(seq_num)

        expire_time = time.time() + self.timeout
        event = TimerEvent(expire_time, seq_num, callback)
        heappush(self.timer_heap, event)
        self.active_timers[seq_num] = event

    def cancel(self, seq_num: int):
        """Cancel timeout for a frame (e.g., when ACKed)."""
        if seq_num in self.active_timers:
            # Mark as cancelled (lazy deletion)
            self.active_timers[seq_num].cancelled = True
            del self.active_timers[seq_num]

    def get_next_expiry(self) -> float:
        """Get time until next timeout (for select/poll)."""
        while self.timer_heap:
            event = self.timer_heap[0]
            if event.cancelled:
                heappop(self.timer_heap)
                continue
            return max(0, event.expire_time - time.time())
        return float('inf')  # No pending timers

    def process_expired(self):
        """Process all expired timers, calling their callbacks."""
        now = time.time()
        expired_count = 0

        while self.timer_heap:
            event = self.timer_heap[0]

            if event.cancelled:
                heappop(self.timer_heap)
                continue

            if event.expire_time > now:
                break  # No more expired

            # Remove and process
            heappop(self.timer_heap)
            if event.seq_num in self.active_timers:
                del self.active_timers[event.seq_num]
            event.callback(event.seq_num)  # Trigger retransmission
            expired_count += 1

        return expired_count

    def get_memory_usage(self) -> dict:
        """Return timer subsystem memory usage."""
        return {
            'heap_entries': len(self.timer_heap),
            'active_timers': len(self.active_timers),
            'estimated_bytes': len(self.timer_heap) * 50 + len(self.active_timers) * 8,
        }
```

Production implementations use timer wheels (circular arrays of timeout buckets) for O(1) timer operations. Linux kernel networking uses hierarchical timer wheels for thousands of connections.
While modern computers have abundant memory, embedded systems, IoT devices, and network infrastructure equipment often operate under severe constraints.
Constrained Environment Examples:
| Device Class | Typical RAM | Max Practical Window | Notes |
|---|---|---|---|
| 8-bit microcontroller | 2-8 KB | 2-4 frames | Often use Stop-and-Wait |
| 16-bit embedded | 32-128 KB | 8-16 frames | Careful buffer management |
| IoT sensor | 256 KB - 1 MB | 32-64 frames | CoAP uses similar concepts |
| Network switch ASIC | Limited on-chip | 64-256 frames | Per-port hardware buffers |
| Router linecard | 8-64 MB shared | 128-512 per flow | Shared across many flows |
Strategies for Constrained Systems: shrink the window to what RAM allows, pre-allocate one shared buffer pool rather than per-connection heaps, and keep per-frame metadata to a handful of bytes.
When to Avoid SR in Embedded: on short, low-BDP links a window of one or two frames already keeps the link busy, so Stop-and-Wait or Go-Back-N delivers comparable throughput without SR's buffering and per-frame timers.
When SR is Essential: on high-BDP or lossy links (satellite uplinks, long WAN paths) where retransmitting an entire window per error would waste scarce bandwidth and energy.
Choose Stop-and-Wait for simplicity on low-BDP links. Choose Go-Back-N when receiver memory is scarce and error rate is low. Choose Selective Repeat when memory is available and efficiency matters.
We've thoroughly examined the memory and buffer requirements that underpin Selective Repeat. Let's consolidate the key concepts:
- The sender buffers up to W × S bytes of unacknowledged frames for retransmission; the receiver buffers up to W × S bytes of out-of-order frames for reordering.
- The bandwidth-delay product pushes the window up while available memory pushes it down, making W a throughput-versus-RAM trade-off.
- Per-frame metadata and timers add overhead beyond the frame data itself; pre-allocated contiguous buffers and a single shared timer structure keep that overhead small.
What's Next:
In the final page of this module, we'll explore Window Size Constraints—the mathematical relationship between window size and sequence number space that prevents ambiguity, and why SR requires W ≤ 2^(n-1) where n is the number of sequence bits.
You now understand the buffer requirements for Selective Repeat—sender retransmission buffers, receiver reorder buffers, efficient memory layouts, and the trade-offs between window size and memory consumption. Next, we'll examine the critical window size constraints.