We've established that RTP provides the transport framing for real-time media and RTCP provides feedback for adaptation. But even the most sophisticated application-layer protocols cannot overcome fundamental network limitations. When a router's queue fills up during congestion, packets must be dropped—and which packets get dropped profoundly affects user experience.
A web page that takes an extra 500ms to load is barely noticeable. A video call with a 500ms gap is jarring. A 500ms latency in cloud gaming makes the game unplayable. Real-time traffic has fundamentally different requirements, and networks can be configured to recognize and respect these differences.
Quality of Service (QoS) encompasses network mechanisms that provide differentiated treatment to different traffic types. For real-time multimedia, QoS can mean the difference between crystal-clear communication and unusable choppiness.
By the end of this page, you will understand the four dimensions of QoS (bandwidth, latency, jitter, loss), DiffServ marking and PHBs, queuing mechanisms, traffic shaping, and how to design networks that prioritize real-time traffic appropriately.
Quality of Service for network traffic is characterized by four measurable dimensions. Different applications have different requirements for each, and understanding these tradeoffs is essential for QoS design.
1. Bandwidth (Throughput) The amount of data that can be transferred per unit time, typically measured in Mbps or Gbps. Video streaming requires sustained high bandwidth; voice calls require low but consistent bandwidth.
2. Latency (Delay) The time for a packet to travel from source to destination. Interactive applications require <150ms end-to-end; voice becomes awkward above 150ms and unusable above 400ms.
3. Jitter (Delay Variation) The variation in packet arrival times. Applications can compensate with jitter buffers, but high jitter requires larger buffers, adding latency. Typically measured as mean deviation from average delay.
4. Packet Loss The percentage of packets that don't arrive at the destination. Real-time media can tolerate 1-5% loss with concealment; above 5%, quality degrades noticeably.
| Application | Bandwidth | Latency | Jitter | Loss Tolerance |
|---|---|---|---|---|
| VoIP (audio call) | 64-128 kbps | < 150ms | < 30ms | 1-3% |
| Video conference (SD) | 0.5-1 Mbps | < 200ms | < 50ms | 1-3% |
| Video conference (HD) | 1.5-4 Mbps | < 200ms | < 50ms | 1-2% |
| Cloud gaming | 10-50 Mbps | < 50ms | < 10ms | < 0.5% |
| Live streaming | 2-20 Mbps | 2-30 sec OK | Buffered | 0% |
| Web browsing | Variable | < 1 sec OK | N/A | 0% (TCP) |
| File download | Maximum available | Seconds OK | N/A | 0% (TCP) |
Jitter buffers trade latency for perceived loss reduction—variable network delay becomes constant playback delay. Larger buffers absorb more jitter but add end-to-end latency. Real-time applications must balance this tradeoff based on their sensitivity to each dimension.
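Jitter buffer sizing is usually driven by a running jitter estimate. As a concrete example, here is a sketch of the interarrival jitter estimator defined in RFC 3550 (the class and method names are illustrative; arrival times and RTP timestamps must share the same clock units):

```typescript
// RFC 3550 interarrival jitter: a smoothed estimate of delay variation,
// updated once per received RTP packet with gain 1/16.
class JitterEstimator {
  private jitter = 0;
  private lastTransit: number | null = null;

  // arrival and rtpTimestamp in the same units (e.g. RTP clock ticks)
  update(arrival: number, rtpTimestamp: number): number {
    const transit = arrival - rtpTimestamp;
    if (this.lastTransit !== null) {
      const d = Math.abs(transit - this.lastTransit);
      // Exponential smoothing: J = J + (|D| - J) / 16
      this.jitter += (d - this.jitter) / 16;
    }
    this.lastTransit = transit;
    return this.jitter;
  }

  get value(): number {
    return this.jitter;
  }
}
```

The 1/16 gain makes the estimate respond to sustained jitter while ignoring single outliers, which is why it is a reasonable input for buffer sizing.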
Understanding where QoS problems originate helps in designing solutions. Degradation can occur at any point in the network path.
Last-mile congestion: The connection between user and ISP is often the bottleneck. Residential users sharing cable segments, DSL distance limitations, or oversubscribed PON (fiber) headends cause queuing during peak hours.
Access network queuing: Home routers and access equipment often have large buffers that can hold hundreds of milliseconds of data. Under load, this "bufferbloat" causes latency to spike while bandwidth remains available.
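The link between buffer size and latency is direct arithmetic: a full buffer delays the last byte by its size divided by the link rate. A quick sketch:

```typescript
// Worst-case queuing delay a full buffer adds on a given link:
// delay_ms = bufferBytes * 8 * 1000 / linkBitsPerSecond
function bufferDelayMs(bufferBytes: number, linkBps: number): number {
  return (bufferBytes * 8 * 1000) / linkBps;
}

// A 2 MB buffer on a 10 Mbps uplink adds up to 1.6 seconds of delay.
const bloat = bufferDelayMs(2_000_000, 10_000_000); // 1600 ms
```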
WAN congestion: Inter-ISP links and backbone connections can become congested during traffic surges. Peering disputes or capacity limitations cause packet loss and delay.
Last-meter wireless: WiFi and cellular networks introduce variable latency and loss due to interference, retransmissions, and medium access contention. A single room can see 10-100ms jitter on WiFi.
```
End-to-End Latency Breakdown (Video Call Example):

User A ──► Router ──► ISP ──► Internet ──► ISP ──► Router ──► User B

┌───────────────────────┬─────────────────┬────────────────┐
│ SEGMENT               │ TYPICAL LATENCY │ VARIABILITY    │
├───────────────────────┼─────────────────┼────────────────┤
│ WiFi (User A)         │ 2-30ms          │ HIGH (jitter)  │
│ Router queuing        │ 0-200ms         │ VERY HIGH      │
│ DSL/Cable/Fiber modem │ 5-30ms          │ MEDIUM         │
│ ISP access network    │ 2-10ms          │ LOW            │
│ ISP backbone          │ 1-5ms           │ LOW            │
│ Peering/IXP           │ 1-10ms          │ LOW            │
│ Remote ISP backbone   │ 1-5ms           │ LOW            │
│ Remote ISP access     │ 2-10ms          │ LOW            │
│ Remote modem          │ 5-30ms          │ MEDIUM         │
│ Remote router queuing │ 0-200ms         │ VERY HIGH      │
│ WiFi (User B)         │ 2-30ms          │ HIGH           │
├───────────────────────┼─────────────────┼────────────────┤
│ TOTAL (typical)       │ 50-150ms        │                │
│ TOTAL (under load)    │ 200-500ms+      │                │
└───────────────────────┴─────────────────┴────────────────┘

Problem areas highlighted:
⚠ Router queuing: BIGGEST variable contributor
⚠ WiFi: Unpredictable, interference-dependent
⚠ Modem buffers: Often sized for throughput, not latency
```

Large router buffers designed to prevent packet loss during bursts cause massive latency under sustained load. A 2MB buffer on a 10Mbps link holds 1.6 seconds of data! Modern solutions include AQM algorithms like CoDel and fq_codel.
WiFi's CSMA/CA medium access causes inherent jitter. Under contention, stations may wait multiple backoff periods. WiFi retransmissions (for reliability) add latency variation. Consider wired connections for latency-critical applications.
Differentiated Services (DiffServ) is the predominant QoS architecture for IP networks. Rather than reserving resources per-flow (as in the older IntServ model), DiffServ classifies packets into a small number of traffic classes and applies consistent treatment to each class.
DSCP (Differentiated Services Code Point): The 6-bit DSCP field occupies the high-order bits of the Differentiated Services field in the IP header (which replaced the older ToS octet) and indicates the desired per-hop behavior. Values range from 0-63, with certain values having standardized meanings.
Per-Hop Behaviors (PHBs): DSCP values map to PHBs that define how routers should treat packets. Standard PHBs include:
| PHB | DSCP Name | DSCP Value | Binary | Use Case |
|---|---|---|---|---|
| Default | BE (Best Effort) | 0 | 000000 | Normal traffic |
| Expedited Forwarding | EF | 46 | 101110 | VoIP, real-time |
| Assured Forwarding | AF41 | 34 | 100010 | Video conferencing |
| Assured Forwarding | AF31 | 26 | 011010 | Streaming video |
| Assured Forwarding | AF21 | 18 | 010010 | Transactional data |
| Assured Forwarding | AF11 | 10 | 001010 | Bulk data |
| Class Selector | CS6 | 48 | 110000 | Network control |
| Class Selector | CS5 | 40 | 101000 | Signaling (SIP) |
Expedited Forwarding (EF): The highest-priority PHB, designed for low-latency, low-jitter, low-loss traffic. Routers implementing EF typically serve it from a strict-priority queue, police it to a configured rate so it cannot starve other classes, and keep its queue short so packets experience minimal delay.
EF is appropriate for VoIP and interactive video. Using EF for bulk traffic defeats its purpose.
Assured Forwarding (AF): Four AF classes (AF1x-AF4x), each with three drop precedence levels (x = 1, 2, 3). During congestion, packets with higher drop precedence are dropped first. This allows traffic engineering with graceful degradation.
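Because the AF class and drop precedence are encoded directly in the DSCP bits (class in the top three bits, drop precedence in the next two), a codepoint can be decoded mechanically. A small sketch (the function name is illustrative):

```typescript
// Decode a 6-bit DSCP value (0-63) into its standard PHB name.
// AF codepoints: class = bits 5-3 (1-4), drop precedence = bits 2-1 (1-3).
// Class Selector codepoints have the low three bits zero.
function classifyDscp(dscp: number): string {
  if (dscp < 0 || dscp > 63) throw new RangeError("DSCP must be 0-63");
  if (dscp === 0) return "BE";                          // Default / best effort
  if (dscp === 46) return "EF";                         // Expedited Forwarding
  if ((dscp & 0b000111) === 0) return `CS${dscp >> 3}`; // Class Selector
  const afClass = dscp >> 3;
  const dropPrec = (dscp >> 1) & 0b11;
  if (afClass >= 1 && afClass <= 4 && dropPrec >= 1 && dropPrec <= 3 && (dscp & 1) === 0) {
    return `AF${afClass}${dropPrec}`;                   // Assured Forwarding
  }
  return `DSCP${dscp}`;                                 // Unassigned / local use
}
```

For example, AF41 (34 = 100010) decodes to class 4, drop precedence 1, matching the table above.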
```
Typical Enterprise QoS Policy:

Traffic Class           │ DSCP │ Queue Treatment
────────────────────────┼──────┼─────────────────────────────
Voice (RTP audio)       │ EF   │ Priority queue, police 30%
Video (RTP video)       │ AF41 │ 30% bandwidth guarantee
Signaling (SIP, SRTP)   │ CS5  │ Priority queue, police 5%
Business apps           │ AF21 │ 15% bandwidth guarantee
Default                 │ BE   │ Remaining bandwidth
Scavenger (P2P, backup) │ CS1  │ Last priority, no guarantee

Application Marking Examples:

VoIP Phone:
  RTP audio packets → DSCP 46 (EF)
  SIP signaling     → DSCP 40 (CS5)

WebRTC Browser:
  Audio RTP → DSCP 46 (EF)    [if permitted by OS/policy]
  Video RTP → DSCP 34 (AF41)
  Signaling → DSCP 40 (CS5)

Note: Most residential ISPs ignore DSCP (treat all as BE).
      Enterprise networks typically honor DSCP marking.
      Cloud egress traffic usually scrubbed to BE.
```

Untrusted sources (like the public Internet) can set DSCP to any value. Enterprise networks typically re-mark traffic at ingress based on traffic identification rather than trusting DSCP from outside. ISPs often reset DSCP to 0 at peering points.
DSCP marking only works if routers and switches implement queuing mechanisms that honor the markings. Several queuing disciplines are used in practice:
Priority Queuing (PQ): Packets in the highest-priority queue are always served first. Lower-priority queues only get bandwidth when higher queues are empty. Simple but can starve lower queues if high-priority traffic is excessive.
Weighted Fair Queuing (WFQ): Queues share bandwidth proportionally to their weights. During congestion, each flow gets its fair share. Prevents starvation but doesn't guarantee low latency for any class.
Low-Latency Queuing (LLQ): Combines priority queuing for real-time traffic with weighted fair queuing for everything else. The priority queue is policed to prevent starvation of other classes. This is the most common enterprise QoS mechanism.
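A minimal sketch of the LLQ idea: one strict-priority queue with a policed budget, plus weighted round-robin service for the remaining classes. All names are illustrative, and the per-dequeue policing budget is a simplification of real bytes-per-second policers:

```typescript
interface Packet { size: number }

// LLQ sketch: strict priority within a policed budget, then weighted
// round-robin over the data classes (a crude deficit-round-robin variant).
class LlqScheduler {
  private priority: Packet[] = [];
  private classes: { queue: Packet[]; weight: number; credit: number }[];

  constructor(
    private priorityBudgetBytes: number, // police limit per dequeue
    weights: number[],
  ) {
    this.classes = weights.map((w) => ({ queue: [], weight: w, credit: 0 }));
  }

  enqueuePriority(p: Packet) { this.priority.push(p); }
  enqueueClass(i: number, p: Packet) { this.classes[i].queue.push(p); }

  dequeue(): Packet | null {
    // 1. Serve the priority queue first, within its policed budget.
    const head = this.priority[0];
    if (head && head.size <= this.priorityBudgetBytes) {
      return this.priority.shift()!;
    }
    // 2. Otherwise pick the backlogged class with the most credit.
    for (const c of this.classes) c.credit += c.weight;
    const backlogged = this.classes.filter((c) => c.queue.length > 0);
    if (backlogged.length === 0) return null;
    backlogged.sort((a, b) => b.credit - a.credit);
    backlogged[0].credit = 0; // crude reset for the sketch
    return backlogged[0].queue.shift()!;
  }
}
```

The key property is visible in the structure: real-time packets never wait behind data, but the budget check stops an oversized priority load from monopolizing the link.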
```
FIFO (No QoS):
Input:  [V1][D1][V2][D2][V3][D3]...
        └──────── Single queue ────────┘
Output: V1, D1, V2, D2, V3, D3...  (voice waits behind data)

Priority Queuing:
Input:  Voice → [V1][V2][V3]  ← High priority (served first)
        Data  → [D1][D2][D3]  ← Low priority (waits)
Output: V1, V2, V3, D1, D2, D3  (data starved while voice exists)

Weighted Fair Queuing (2:1 ratio):
Input:  Voice → [V1][V2][V3]  ← Weight 2
        Data  → [D1][D2][D3]  ← Weight 1
Output: V1, V2, D1, V3, D2, ...  (interleaved proportionally)

Low-Latency Queuing:
Input:  Voice → [V1][V2][V3]  ← Priority (with police limit)
        Video → [VD1][VD2]    ← WFQ class, 30% weight
        Data  → [D1][D2][D3]  ← WFQ class, remaining
Output: V1, V2, V3, VD1, D1, VD2, D2...
        (voice first, then weighted fair for rest)
        If voice exceeds police rate: dropped/delayed

═══════════════════════════════════════════════════════════

LLQ Configuration Example (Cisco-like):

class-map VOICE
  match dscp ef
class-map VIDEO
  match dscp af41
class-map SIGNALING
  match dscp cs5

policy-map QOS-POLICY
  class VOICE
    priority percent 20    ! Strict priority, max 20%
    police cir 2000000     ! Police to 2Mbps
  class VIDEO
    bandwidth percent 30   ! Guaranteed 30%
    fair-queue
  class SIGNALING
    priority percent 5     ! Strict priority, max 5%
  class class-default
    bandwidth percent 45   ! Remaining for best effort
    fair-queue
```

Always police (rate-limit) priority queues. Without policing, a misbehaving or malicious source sending high-priority traffic could starve all other traffic. Police rates should match the expected legitimate traffic—e.g., 30-50kbps per active voice call.
Traditional "tail-drop" queuing—drop new packets when the queue is full—has two problems for QoS: a full queue imposes maximum queuing latency on every packet passing through it (bufferbloat), and dropping a burst of packets at once causes many TCP flows to back off and ramp up in lockstep (global synchronization), producing oscillating link utilization.
Active Queue Management (AQM) algorithms proactively manage queue occupancy to prevent these issues.
Random Early Detection (RED): Drops or marks packets with increasing probability as queue depth grows. Early drops signal congestion before the queue fills, allowing TCP to react gradually.
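The RED drop curve is simple enough to state directly: the drop probability rises linearly from zero at a minimum threshold to a configured maximum at the upper threshold, above which every packet is dropped. A sketch (parameter names are illustrative; `avgQueue` is the EWMA of queue depth that RED maintains):

```typescript
// RED early-drop probability as a function of average queue depth.
function redDropProbability(
  avgQueue: number,   // smoothed (EWMA) queue depth
  minThresh: number,  // no drops below this
  maxThresh: number,  // force drops above this
  maxP: number,       // drop probability just below maxThresh
): number {
  if (avgQueue < minThresh) return 0;  // queue healthy: no early drops
  if (avgQueue >= maxThresh) return 1; // queue too deep: drop everything
  return (maxP * (avgQueue - minThresh)) / (maxThresh - minThresh);
}
```

The need to tune `minThresh`, `maxThresh`, and `maxP` per link is exactly the weakness that CoDel later removed.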
CoDel (Controlled Delay): Monitors packet sojourn time (time spent in queue) rather than queue length. Drops packets when delay exceeds target for too long. Adapts automatically to link speed without tuning.
FQ-CoDel (Fair Queuing + CoDel): Combines CoDel with flow-based fair queuing. Each flow gets its own queue, preventing a single bulk transfer from impacting other flows. Now the recommended default for many deployments.
| Algorithm | What It Measures | Drop Strategy | Pros | Cons |
|---|---|---|---|---|
| Tail-drop | Queue length | Drop when full | Simple | Bufferbloat, global sync |
| RED | Average queue length | Probabilistic early drop | Prevents sync | Requires tuning |
| WRED | Queue length + DSCP | Class-aware RED | DiffServ-aware | Complex configuration |
| CoDel | Packet sojourn time | Drop if delay > target | Self-tuning | Per-flow unfair |
| FQ-CoDel | Per-flow delay | Per-flow CoDel | Fair + low-latency | More state |
```
CoDel (Controlled Delay) Algorithm:

Parameters:
  TARGET   = 5ms    (acceptable queuing delay)
  INTERVAL = 100ms  (observation window)

Variables:
  first_above_time = 0      (when sojourn > TARGET started)
  drop_next        = 0      (when to drop next packet)
  dropping         = false  (currently in drop state)
  count            = 0      (drops in this interval)

On dequeue(packet):
  now = current_time
  sojourn = now - packet.enqueue_time

  if sojourn < TARGET:
    first_above_time = 0
  else:
    if first_above_time == 0:
      first_above_time = now + INTERVAL
    else if now >= first_above_time:
      # Been above TARGET for full INTERVAL
      if !dropping:
        dropping = true
        count = 1
        drop_next = now
      if now >= drop_next:
        drop(packet)
        count += 1
        # Drop interval decreases: 1/√count
        drop_next = now + INTERVAL / sqrt(count)
        return dequeue()  # Get next packet

  if dropping and sojourn < TARGET:
    dropping = false  # Queue recovered

  return packet

Result: Keeps queue delay near TARGET (5ms) regardless of link speed
```

Explicit Congestion Notification (ECN) allows AQM to mark packets (CE bit) instead of dropping them, signaling congestion without losing data. TCP reacts to ECN marks like it would to drops. This improves QoS for real-time traffic that's also loss-sensitive.
Beyond classification and queuing, networks control traffic rates through shaping and policing.
Traffic Policing: Drops or re-marks packets that exceed a configured rate. Instantaneous—operates on each packet as it arrives. Policing is typically used at network edges to enforce service contracts.
Traffic Shaping: Buffers packets and releases them at a controlled rate, smoothing bursts. Adds delay but prevents drops from downstream policing. Shaping is typically used on egress to meet downstream bandwidth constraints.
```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private rate: number,       // tokens per second (bytes/sec)
    private bucketSize: number, // max burst (bytes)
  ) {
    this.tokens = bucketSize; // Start full
    this.lastRefill = Date.now();
  }

  private refill(): void {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000; // seconds
    const tokensToAdd = elapsed * this.rate;
    this.tokens = Math.min(this.bucketSize, this.tokens + tokensToAdd);
    this.lastRefill = now;
  }

  // Policing: returns true if packet can pass, false to drop
  police(packetSize: number): boolean {
    this.refill();
    if (this.tokens >= packetSize) {
      this.tokens -= packetSize;
      return true; // Allow packet
    }
    return false; // Drop packet
  }

  // Shaping: returns delay in ms before packet should be sent
  shape(packetSize: number): number {
    this.refill();
    if (this.tokens >= packetSize) {
      this.tokens -= packetSize;
      return 0; // Send immediately
    }
    // Reserve the tokens now (balance goes negative) so later packets
    // queue behind this one instead of jumping ahead of it
    const tokensNeeded = packetSize - this.tokens;
    this.tokens -= packetSize;
    const waitTime = (tokensNeeded / this.rate) * 1000; // ms
    return waitTime;
  }
}

declare function sendPacket(packet: Uint8Array): void; // transmit, provided elsewhere

// Example: 10 Mbps rate, 100KB burst allowance
const shaper = new TokenBucket(10_000_000 / 8, 100_000);

function processPacket(packet: Uint8Array) {
  const delay = shaper.shape(packet.length);
  if (delay > 0) {
    setTimeout(() => sendPacket(packet), delay);
  } else {
    sendPacket(packet);
  }
}
```

Shaping adds latency proportional to the shaped rate and burst size. For a 10Mbps shaper with 100KB buffer receiving a 100KB burst, the last byte waits 80ms. For real-time traffic, this may be unacceptable—use policing or priority queuing instead.
QoS isn't just a router configuration—it requires end-to-end thinking across application, endpoint, and network layers.
Application-layer strategies:
Adaptive bitrate: Adjust encoding quality based on network feedback (RTCP reports, TWCC). Don't just hope the network handles excess traffic.
FEC and redundancy: Add redundant information so receivers can reconstruct lost packets. Trades bandwidth for loss resilience.
Packetization choices: Smaller packets are more loss-resilient but have higher overhead. Match packet size to expected network conditions.
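The adaptive-bitrate strategy above can be sketched as a simple loss-driven controller: back off multiplicatively when RTCP reports loss, probe gently upward when the path is clean. The thresholds and factors below are illustrative, not taken from any specific implementation:

```typescript
// Loss-driven bitrate adaptation sketch.
// lossFraction comes from the latest RTCP receiver report (0.0-1.0).
function adaptBitrate(
  currentBps: number,
  lossFraction: number,
  minBps = 100_000,
  maxBps = 4_000_000,
): number {
  let next: number;
  if (lossFraction > 0.10) {
    next = currentBps * 0.5;  // heavy loss: cut hard
  } else if (lossFraction > 0.02) {
    next = currentBps * 0.85; // moderate loss: ease off
  } else {
    next = currentBps * 1.05; // clean path: probe for more
  }
  return Math.min(maxBps, Math.max(minBps, next));
}
```

Real implementations (e.g. delay-based congestion control in WebRTC) use richer signals than loss alone, but the asymmetric decrease-fast/increase-slow shape is the common core.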
Endpoint strategies:
DSCP marking: Mark outgoing packets appropriately. Note that setting DSCP may require OS permissions or policy configuration on some platforms.
Traffic isolation: Use separate network interfaces or VLANs for real-time traffic when possible.
Buffer management: Use appropriate jitter buffer sizes. Too small causes underruns; too large adds unnecessary latency.
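One common heuristic for the buffer-sizing tradeoff, borrowed from classic adaptive playout work, sets the playout delay to the smoothed network delay plus a multiple of the jitter estimate. The constants below are illustrative:

```typescript
// Jitter buffer target: mean delay plus k times the jitter estimate,
// clamped to sane bounds. k = 4 is a commonly cited safety factor.
function jitterBufferTargetMs(
  meanDelayMs: number, // smoothed one-way network delay
  jitterMs: number,    // smoothed delay variation estimate
  k = 4,
  minMs = 20,
  maxMs = 500,
): number {
  const target = meanDelayMs + k * jitterMs;
  return Math.min(maxMs, Math.max(minMs, target));
}
```

With 40ms mean delay and 10ms jitter this yields an 80ms playout target; as jitter falls, the target shrinks and latency is recovered automatically.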
```
End-to-End QoS Design for Enterprise Video Conferencing:

1. APPLICATION LAYER:
   ┌─────────────────────────────────────────────────┐
   │ • Client marks RTP audio as DSCP EF (46)        │
   │ • Client marks RTP video as DSCP AF41 (34)      │
   │ • Client marks signaling as DSCP CS5 (40)       │
   │ • Adaptive bitrate responds to RTCP feedback    │
   │ • FEC enabled for audio (15% overhead)          │
   └─────────────────────────────────────────────────┘
                        │
                        ▼
2. ACCESS SWITCH:
   ┌─────────────────────────────────────────────────┐
   │ • Trust DSCP from known devices (phones, PCs)   │
   │ • Reclassify untrusted traffic to BE            │
   │ • Apply QoS ingress policy                      │
   └─────────────────────────────────────────────────┘
                        │
                        ▼
3. DISTRIBUTION/CORE:
   ┌─────────────────────────────────────────────────┐
   │ • Honor DSCP, apply LLQ policy                  │
   │ • EF: Priority queue, 15% max                   │
   │ • AF41: 30% bandwidth guarantee                 │
   │ • Monitor queue drops and latency               │
   └─────────────────────────────────────────────────┘
                        │
                        ▼
4. WAN EDGE:
   ┌─────────────────────────────────────────────────┐
   │ • Shape aggregate to WAN link capacity          │
   │ • Apply LLQ within shaped rate                  │
   │ • EF traffic bypasses shaper (priority)         │
   │ • Use WRED on AF classes during congestion      │
   └─────────────────────────────────────────────────┘
                        │
                        ▼
5. INTERNET/ISP:
   ┌─────────────────────────────────────────────────┐
   │ • May or may not honor DSCP (often doesn't)     │
   │ • Application adaptation becomes critical       │
   │ • Consider VPN with QoS-aware provider          │
   │ • Fallback to best-effort behavior              │
   └─────────────────────────────────────────────────┘
```

On the public Internet, assume no QoS. Applications must implement their own quality management through adaptive bitrate, FEC, and smart buffering. DSCP marking still helps for enterprise segments of the path but cannot be relied upon end-to-end.
We've explored the network mechanisms and strategies that ensure real-time multimedia receives the treatment it needs for quality communication.
Module complete:
You've now completed the comprehensive coverage of RTP and RTCP. You understand how real-time media is transported, how feedback enables adaptation, how streaming architectures scale, and how networks can be configured to prioritize this critical traffic. This knowledge is foundational for implementing, deploying, and troubleshooting real-time communication systems.
Congratulations! You've mastered RTP and RTCP—the protocols that power voice calls, video conferencing, live streaming, and interactive applications across the Internet. You understand their packet formats, feedback mechanisms, streaming architectures, and the QoS considerations that ensure quality. This knowledge enables you to build, optimize, and debug real-time communication systems at professional level.