At 10:00 AM on a typical Monday, your e-commerce platform processes 50,000 requests per second. Each request generates an average of 15 spans across your microservices. That's 750,000 spans per second—or 64.8 billion spans per day.
At approximately 1KB per span with indexing overhead, you're looking at ~65 terabytes per day of trace data. With 14-day retention, that's nearly a petabyte of storage. The storage costs alone would exceed most engineering budgets. The query performance would be abysmal. The operational burden would be crushing.
This is why sampling exists.
Sampling is the practice of selectively recording traces rather than recording every single one. Done correctly, sampling dramatically reduces costs while preserving the visibility you need. Done poorly, it leads to blind spots, missed incidents, and false confidence in system health.
This page will make you an expert in sampling—the strategies, tradeoffs, and best practices.
By the end of this page, you will understand: head-based vs. tail-based sampling; probabilistic, rate-limiting, and adaptive sampling algorithms; priority sampling for important traces; how to configure sampling for different traffic patterns; and the tradeoffs between cost, coverage, and observability.
Sampling isn't just about cost reduction—it's about making tracing sustainable at scale. Without sampling, tracing becomes a victim of its own success.
| Request Rate | Avg Spans/Request | Spans/Second | Daily Storage (1KB/span) | Monthly Cost (at $0.10/GB) |
|---|---|---|---|---|
| 1,000 req/s | 10 | 10,000 | 864 GB | $2,592 |
| 10,000 req/s | 15 | 150,000 | 12.96 TB | $38,880 |
| 100,000 req/s | 15 | 1,500,000 | 129.6 TB | $388,800 |
| 1,000,000 req/s | 20 | 20,000,000 | 1.73 PB | $5.2M |
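The table's figures follow directly from the span rate. A minimal TypeScript sketch of the arithmetic (the function name is illustrative; it assumes the same 1 KB/span, decimal units, 30-day month, and $0.10/GB-month price as the table):

```typescript
// Reproduce the table's arithmetic: spans/s -> daily storage -> monthly cost.
// Uses decimal units (1 GB = 1,000,000 KB) to match the table's figures.
function traceStorageCost(
  requestsPerSecond: number,
  avgSpansPerRequest: number,
  spanSizeKb = 1,          // ~1KB per span, including indexing overhead
  pricePerGbMonth = 0.10   // $0.10/GB-month, as in the table
) {
  const spansPerSecond = requestsPerSecond * avgSpansPerRequest;
  const dailyGb = (spansPerSecond * 86_400 * spanSizeKb) / 1_000_000;
  const monthlyCostUsd = dailyGb * 30 * pricePerGbMonth;
  return { spansPerSecond, dailyGb, monthlyCostUsd };
}

// 100,000 req/s × 15 spans ≈ 1.5M spans/s, ~129,600 GB/day, ~$388,800/month
console.log(traceStorageCost(100_000, 15));
```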
The costs extend beyond storage:
1. Collection Overhead: Every span must be created, serialized, and transmitted. At 100% sampling, this overhead impacts application performance—memory allocation, network bandwidth, CPU for serialization.
2. Backend Ingestion: Collectors, processors, and storage backends must handle the volume. More data means more infrastructure.
3. Query Performance: Searching for a specific trace among billions is slow. Even indexed queries degrade at scale.
4. Retention Limits: With finite storage, higher volume means shorter retention. You may need 30 days of traces for debugging, but can only afford 3 days at 100% sampling.
The Key Insight:

You don't need every trace. In a healthy system, most requests look similar. What you need is:

1. Traces for errors and failures, so incidents can be debugged.
2. Traces for unusually slow requests (latency outliers).
3. A statistically representative sample of normal traffic for baseline analysis.

Sampling strategies aim to achieve all three while minimizing costs.
Many organizations find that 1-10% sampling provides sufficient visibility for debugging while reducing costs by 90-99%. But the right rate depends on your traffic, the cost of missed traces, and your observability requirements. Some systems sample more aggressively during known-good periods and less during incidents.
The fundamental distinction in sampling strategies is when the sampling decision is made: at the head (start) or tail (end) of a trace.
```text
HEAD-BASED SAMPLING
═══════════════════
Request arrives at edge:
│
├─ Sample decision: random(0.0-1.0) < 0.10  (10% rate)
│   │
│   ├─ TRUE (sampled = 1):
│   │    • All downstream services create and export spans
│   │    • Complete trace stored
│   │
│   └─ FALSE (sampled = 0):
│        • All downstream services skip span creation entirely
│        • No trace data generated
│        • Zero overhead beyond decision propagation
│
Risk: If this request later errors, we have no trace!

TAIL-BASED SAMPLING
═══════════════════
Request arrives and progresses:
│
├─ Service A creates span → buffered
├─ Service B creates span → buffered
├─ Service C creates span → buffered
├─ Service D creates span → buffered
│
└─ Request completes, all spans arrive at collector
    │
    └─ Collector evaluates complete trace:
        │
        ├─ Has error spans?      → KEEP (100%)
        ├─ Latency > P99?        → KEEP (100%)
        ├─ Important user tier?  → KEEP (100%)
        ├─ Otherwise             → KEEP (1%)
        │
        └─ Only kept traces written to storage

Benefit: We ALWAYS have traces for errors and slow requests!
Cost:    All spans buffered until the trace completes (memory, processing)
```

Tail-based sampling is intellectually appealing but operationally complex. It requires: (1) a collector that can buffer and reassemble traces (OpenTelemetry Collector, Jaeger streaming, etc.); (2) enough memory to buffer traces during their lifetime; (3) logic to detect "trace complete" (timeout-based or explicit signals); (4) handling for very long traces that exceed buffers. Many organizations start with head-based sampling and add tail-based sampling selectively.
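The "decision propagation" in the head-based diagram rides on the W3C traceparent header: its final trace-flags field carries a "sampled" bit, so every downstream service can honor the edge's decision without re-deciding. A minimal sketch of reading that flag (the parsing helper is illustrative, not taken from any SDK):

```typescript
// W3C trace context: traceparent = version-traceid-spanid-flags
// e.g. "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
interface TraceParent {
  traceId: string;
  parentSpanId: string;
  sampled: boolean;
}

function parseTraceParent(header: string): TraceParent | null {
  const parts = header.trim().split('-');
  if (parts.length !== 4) return null;
  const [, traceId, parentSpanId, flags] = parts;
  return {
    traceId,
    parentSpanId,
    // Bit 0 of the trace-flags field is the "sampled" flag
    sampled: (parseInt(flags, 16) & 0x01) === 0x01,
  };
}

// Downstream services simply honor the upstream decision:
const incoming = parseTraceParent(
  '00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01'
);
if (incoming?.sampled) {
  // create and export spans for this request
} else {
  // propagate context only; skip span export
}
```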
Probabilistic sampling (also called random sampling) is the simplest and most common sampling strategy. Each trace has a fixed probability of being sampled.
```typescript
// How probabilistic sampling works internally

type SamplingDecision = {
  decision: 'RECORD_AND_SAMPLE' | 'DROP';
  attributes?: Record<string, number>;
};

// Illustrative hash: map the trace ID onto 0.0 - 1.0 deterministically
function hashTraceId(traceId: string): number {
  // Use the low 8 hex digits of the trace ID as a uniform value
  return parseInt(traceId.slice(-8), 16) / 0xffffffff;
}

function shouldSample(traceId: string, samplingRate: number): boolean {
  // Use the trace ID to make a deterministic decision.
  // This ensures all services agree on the decision for the same trace.
  const hash = hashTraceId(traceId); // Returns 0.0 - 1.0
  return hash < samplingRate;
}

// Example: 10% sampling rate
const SAMPLING_RATE = 0.10;

function onTraceStart(traceId: string): SamplingDecision {
  if (shouldSample(traceId, SAMPLING_RATE)) {
    return {
      decision: 'RECORD_AND_SAMPLE', // Record and export
      attributes: {
        'sampling.probability': SAMPLING_RATE,
      },
    };
  }
  return {
    decision: 'DROP', // Don't record
  };
}

// Why hash the trace ID?
// - The trace ID is generated before the sampling decision
// - The hash function produces deterministic output for the same input
// - All services with the same trace ID get the same decision
// - No coordination needed between services
```

Strengths of Probabilistic Sampling:
- ✓ Simple to understand and configure
- ✓ Predictable storage costs
- ✓ Works across distributed services without coordination
- ✓ Statistically valid for aggregate analysis
Weaknesses:
- ✗ Errors may be dropped (if the error occurs in the 90% not sampled)
- ✗ Rare paths may never be sampled
- ✗ No intelligence about trace importance
- ✗ A uniform rate may be wrong for different endpoints
Start with a rate that fits your budget, then adjust based on coverage. Calculate: request_rate (req/s) × avg_spans_per_request × sampling_rate × span_size × 86,400 s/day = daily storage. Work backward from your storage budget. Monitor for blind spots—are you seeing enough error traces? Rare endpoint traces? Adjust rates by service or endpoint as needed.
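A small sketch of working backward from a storage budget (the budget and traffic figures in the example are illustrative):

```typescript
// Solve the daily-storage formula for the sampling rate:
//   dailyStorage = reqPerSec * avgSpans * rate * spanSize * 86_400
//   rate         = dailyBudget / (reqPerSec * avgSpans * spanSize * 86_400)
function maxSamplingRateForBudget(
  dailyBudgetGb: number,
  reqPerSec: number,
  avgSpansPerRequest: number,
  spanSizeKb = 1 // ~1KB per span, as assumed earlier
): number {
  const dailyKbAtFullSampling =
    reqPerSec * avgSpansPerRequest * spanSizeKb * 86_400;
  const budgetKb = dailyBudgetGb * 1_000_000; // decimal GB -> KB
  return Math.min(1, budgetKb / dailyKbAtFullSampling);
}

// Example: 50,000 req/s, 15 spans/request, 2 TB/day budget
// -> roughly a 3% sampling rate
console.log(maxSamplingRateForBudget(2_000, 50_000, 15));
```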
Rate-limiting sampling (also called rate-based or throughput sampling) caps the number of traces sampled per unit time, regardless of traffic volume.
```text
Traffic Level:       |  100 req/s  | 1,000 req/s | 10,000 req/s |
════════════════════════════════════════════════════════════════
Probabilistic 10% sampling:
  Sampled:           |    10/s     |    100/s    |   1,000/s    |
  Problem: Sampling rate is constant, but VOLUME scales with traffic.
           During traffic spikes, you still get 10x more data.

Rate-Limiting at 50 traces/s:
  Sampled:           |    50/s     |    50/s     |     50/s     |
  Sampled %:         |    50%      |     5%      |     0.5%     |
  Benefit: Predictable output regardless of input traffic.
           Storage costs are bounded and predictable.

                 Normal       Spike           Mega-spike
                 Traffic      Traffic         Traffic
                   │            │                │
                   ▼            ▼                ▼
Probabilistic:  ▓▓▓▓▓▓▓▓▓▓  ▓▓▓▓▓▓▓▓▓▓▓▓▓   ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
                            (scales up)     (keeps scaling!)

Rate-Limited:   ▓▓▓▓▓▓▓▓▓▓  ▓▓▓▓▓▓▓▓▓▓      ▓▓▓▓▓▓▓▓▓▓
                (capped)    (still capped)  (still capped!)
```

Rate Limiting Implementation:
```typescript
// Token bucket rate limiter for sampling
class RateLimitingSampler {
  private tokens: number;
  private readonly maxTokens: number;
  private readonly refillRate: number; // tokens per second
  private lastRefill: number;

  constructor(tracesPerSecond: number) {
    this.maxTokens = tracesPerSecond;
    this.tokens = tracesPerSecond;
    this.refillRate = tracesPerSecond;
    this.lastRefill = Date.now();
  }

  shouldSample(traceId: string): boolean {
    this.refillTokens();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true; // Sample this trace
    }
    return false; // Drop this trace
  }

  private refillTokens(): void {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000; // seconds
    this.tokens = Math.min(
      this.maxTokens,
      this.tokens + elapsed * this.refillRate
    );
    this.lastRefill = now;
  }
}

// Usage: Sample at most 100 traces per second
const sampler = new RateLimitingSampler(100);

function onTraceStart(traceId: string): boolean {
  return sampler.shouldSample(traceId);
}
```

Rate limiting is excellent when: (1) you have unpredictable traffic patterns, (2) you need strict storage budget guarantees, (3) you're protecting backend infrastructure from overload. Combine it with probabilistic sampling: use rate limiting at the collector level as a safety cap on top of probabilistic sampling at the SDK level.
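A sketch of that combination, reusing shouldSample and RateLimitingSampler from the earlier examples (the wrapper class itself is illustrative, not from any SDK): the probabilistic check runs first, and the rate limiter acts as the hard cap.

```typescript
// Probabilistic sampling with a rate-limiting safety cap on top.
// Assumes shouldSample(traceId, rate) and RateLimitingSampler from the
// earlier examples are in scope.
class CappedProbabilisticSampler {
  private readonly rate: number;
  private readonly cap: RateLimitingSampler;

  constructor(samplingRate: number, maxTracesPerSecond: number) {
    this.rate = samplingRate;
    this.cap = new RateLimitingSampler(maxTracesPerSecond);
  }

  shouldSample(traceId: string): boolean {
    // 1) The probabilistic decision keeps samples statistically representative
    //    (calls the module-level shouldSample from the probabilistic example)
    if (!shouldSample(traceId, this.rate)) return false;
    // 2) The rate limiter bounds the worst case during traffic spikes
    return this.cap.shouldSample(traceId);
  }
}

// 10% sampling, but never more than 200 traces per second
const capped = new CappedProbabilisticSampler(0.10, 200);
```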
Adaptive sampling dynamically adjusts the sampling rate based on traffic patterns, system load, or the distribution of operations being sampled.
```text
Consider a system with two endpoints:

  /api/search:        10,000 requests/second (high volume, well-tested)
  /api/admin/export:  1 request/minute (rare, complex, often breaks)

With 1% probabilistic sampling:
  /api/search:        100 samples/second ✓ (plenty of visibility)
  /api/admin/export:  ~0.6 samples/hour ✗ (might miss issues entirely!)

Problem: Low-volume operations are under-represented in samples.

With adaptive sampling targeting 10 samples/minute per operation:
  /api/search:        10/min ≈ 0.002% sampling rate
  /api/admin/export:  100% sampling rate (its volume is below the target)

Result: Every operation has representation regardless of volume.
```

Types of Adaptive Sampling:
1. Per-Operation Adaptive Sampling: Maintains a target sample rate for each unique operation (endpoint, method, etc.). High-volume operations are sampled at lower rates; low-volume operations at higher rates (see the sketch after this list).
2. Load-Based Adaptive Sampling: Adjusts the sampling rate based on system load. When the collector is overloaded, reduce sampling. When idle, increase it.
3. Latency-Based Adaptive Sampling: Samples more of the traces that exhibit high latency. If an operation's P99 latency spikes, increase sampling for that operation.
4. Error-Rate Adaptive Sampling: Increases the sampling rate when error rates rise. Normal errors at 0.1% → sample 1%. Errors spike to 5% → sample 50%.
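To make the per-operation variant concrete, here is a minimal sketch (the class, window length, and clamping constants are all illustrative; it reuses hashTraceId from the probabilistic example): track each operation's recent throughput and derive a rate that lands near a target samples-per-second.

```typescript
// Per-operation adaptive sampling (illustrative sketch).
// Each operation's sampling rate is recomputed so that, at its recent
// request rate, roughly `targetSamplesPerSec` traces are kept.
class PerOperationAdaptiveSampler {
  private counts = new Map<string, number>(); // requests seen this window
  private rates = new Map<string, number>();  // current rate per operation
  private readonly windowMs = 60_000;         // recompute every minute

  constructor(
    private readonly targetSamplesPerSec: number,
    private readonly minRate = 0.001,
    private readonly maxRate = 1.0
  ) {
    setInterval(() => this.recomputeRates(), this.windowMs);
  }

  shouldSample(operation: string, traceId: string): boolean {
    this.counts.set(operation, (this.counts.get(operation) ?? 0) + 1);
    const rate = this.rates.get(operation) ?? this.maxRate; // sample new ops fully
    return hashTraceId(traceId) < rate; // hashTraceId from the earlier example
  }

  private recomputeRates(): void {
    for (const [operation, count] of this.counts) {
      const observedPerSec = count / (this.windowMs / 1000);
      // Rate that would yield the target volume at the observed throughput
      const ideal = this.targetSamplesPerSec / Math.max(observedPerSec, 1e-9);
      this.rates.set(
        operation,
        Math.min(this.maxRate, Math.max(this.minRate, ideal))
      );
    }
    this.counts.clear();
  }
}

// Aim for ~10 sampled traces/second per operation, regardless of its volume
const adaptive = new PerOperationAdaptiveSampler(10);
```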
```yaml
# Jaeger adaptive sampling configuration
# This requires jaeger-collector to compute and distribute rates

# Remote sampling endpoint (collector provides rates to SDKs)
sampling:
  strategies_file: /etc/jaeger/sampling_strategies.json
  sampling_refresh_interval: 60s

# Example adaptive sampling strategy
{
  "service_strategies": [
    {
      "service": "order-service",
      "type": "adaptive",
      "options": {
        "sampling_refresh_interval": "60s",
        "sampling_store_type": "cassandra",   # or in-memory
        "initial_sampling_rate": 0.1,
        "target_samples_per_second": 10,
        "min_sampling_rate": 0.001,
        "max_sampling_rate": 1.0
      },
      "operation_strategies": [
        {
          "operation": "GET /api/checkout",
          "target_samples_per_second": 50     # Important operation
        },
        {
          "operation": "GET /api/health",
          "target_samples_per_second": 1      # Noisy, less important
        }
      ]
    }
  ]
}
```

Adaptive sampling needs a central component (such as the Jaeger Collector) to track traffic patterns and distribute updated sampling rates to SDKs. This adds complexity. Ensure your tracing infrastructure supports adaptive sampling before depending on it. OpenTelemetry's native adaptive sampling is still evolving.
Sometimes you need guaranteed sampling for specific traces regardless of probabilistic rates. Priority sampling and debug flags address these scenarios.
```typescript
import { Attributes, Context, createContextKey, Link, SpanKind } from '@opentelemetry/api';
import {
  Sampler,
  SamplingDecision,
  SamplingResult,
  TraceIdRatioBasedSampler,
} from '@opentelemetry/sdk-trace-base';

// Context key under which upstream middleware stores the debug flag
const DEBUG_FLAG_KEY = createContextKey('x-trace-debug');

// Priority sampler that always samples certain traces
class PrioritySampler implements Sampler {
  private readonly baseSampler: Sampler;

  constructor(baseSamplingRate: number) {
    // TraceIdRatioBasedSampler provides the probabilistic base rate
    this.baseSampler = new TraceIdRatioBasedSampler(baseSamplingRate);
  }

  shouldSample(
    context: Context,
    traceId: string,
    spanName: string,
    spanKind: SpanKind,
    attributes: Attributes,
    links: Link[]
  ): SamplingResult {
    // Priority 1: Always sample if the debug flag is set
    const debugFlag = context.getValue(DEBUG_FLAG_KEY);
    if (debugFlag === 'true') {
      return { decision: SamplingDecision.RECORD_AND_SAMPLED };
    }

    // Priority 2: Always sample high-value users
    const userTier = attributes['user.tier'];
    if (userTier === 'enterprise' || userTier === 'premium') {
      return { decision: SamplingDecision.RECORD_AND_SAMPLED };
    }

    // Priority 3: Always sample critical operations
    const criticalOperations = [
      '/api/checkout',
      '/api/payment/process',
      '/api/refund',
    ];
    if (criticalOperations.some(op => spanName.includes(op))) {
      return { decision: SamplingDecision.RECORD_AND_SAMPLED };
    }

    // Priority 4: Always sample synthetic/canary traffic
    const syntheticHeader = attributes['http.header.x-synthetic-request'];
    if (syntheticHeader === 'true') {
      return { decision: SamplingDecision.RECORD_AND_SAMPLED };
    }

    // Default: Fall back to base probabilistic sampling
    return this.baseSampler.shouldSample(
      context, traceId, spanName, spanKind, attributes, links
    );
  }

  toString(): string {
    return 'PrioritySampler';
  }
}
```

Debug flags bypass sampling entirely. If debug is enabled on high-volume traffic (accidentally or maliciously), you can overwhelm your tracing backend. Implement rate limiting specifically for debug-flagged traces. Log and alert when debug usage is high. Consider requiring authentication for debug flag usage.
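One way to build that guardrail, reusing RateLimitingSampler from the rate-limiting example (the 10 traces/second budget and reporting interval are arbitrary illustrations):

```typescript
// Guardrail: honor the debug flag, but only up to a fixed budget.
// Assumes RateLimitingSampler from the rate-limiting example is in scope.
const debugBudget = new RateLimitingSampler(10); // at most 10 debug traces/s

let debugDropsSinceLastReport = 0;

function shouldSampleDebugTrace(traceId: string): boolean {
  if (debugBudget.shouldSample(traceId)) {
    return true; // within budget: keep the debug-flagged trace
  }
  // Over budget: fall back to normal sampling and count the overflow
  debugDropsSinceLastReport += 1;
  return false;
}

// Periodically surface the overflow so heavy debug usage is visible
setInterval(() => {
  if (debugDropsSinceLastReport > 0) {
    console.warn(
      `debug-flagged traces over budget: ${debugDropsSinceLastReport} dropped in the last minute`
    );
    debugDropsSinceLastReport = 0;
  }
}, 60_000);
```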
Tail-based sampling unlocks intelligent policies that consider the complete trace before making sampling decisions. This is the most powerful—and most complex—form of sampling.
```yaml
# OpenTelemetry Collector tail-based sampling configuration

processors:
  tail_sampling:
    # How long to wait for all spans of a trace
    decision_wait: 10s
    # Number of traces to hold in memory
    num_traces: 100000
    # Expected number of new traces per second
    expected_new_traces_per_sec: 10000

    policies:
      # Policy 1: Always keep error traces
      - name: errors-policy
        type: status_code
        status_code:
          status_codes: [ERROR]

      # Policy 2: Always keep slow traces (> 2 seconds)
      - name: latency-policy
        type: latency
        latency:
          threshold_ms: 2000

      # Policy 3: Always keep traces with specific attributes
      - name: important-users-policy
        type: string_attribute
        string_attribute:
          key: user.tier
          values: [enterprise, premium, vip]

      # Policy 4: Sample 100% of specific operations
      - name: checkout-policy
        type: string_attribute
        string_attribute:
          key: http.route
          values: [/api/checkout, /api/payment]

      # Policy 5: Rate limit everything else
      - name: default-policy
        type: rate_limiting
        rate_limiting:
          spans_per_second: 500

      # Policy 6: Composite policy example
      - name: composite-policy
        type: composite
        composite:
          max_total_spans_per_second: 1000
          policy_order: [errors-policy, latency-policy, default-policy]
          rate_allocation:
            - policy: errors-policy
              percent: 50
            - policy: latency-policy
              percent: 30
            - policy: default-policy
              percent: 20

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling]  # Tail sampling here
      exporters: [jaeger]
```

Common Tail-Based Policies:
| Policy Type | Description | Use Case |
|---|---|---|
| status_code | Sample based on span status | Keep all error traces |
| latency | Sample based on trace duration | Keep slow traces for analysis |
| string_attribute | Sample based on attribute values | Keep traces with a specific user/tenant |
| numeric_attribute | Sample based on numeric comparisons | Keep traces where order value > $1000 |
| probabilistic | Random sampling at the tail | Base sampling for non-priority traces |
| rate_limiting | Cap traces per second | Protect storage from spikes |
| composite | Combine multiple policies | Complex multi-rule scenarios |
If you implement only one tail-based policy, make it 'always keep error traces.' This single policy eliminates the biggest weakness of head-based probabilistic sampling: dropped error traces. Combined with head-based 10% sampling, you get representative normal traces plus complete error coverage.
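For the head-based half of that combination, a minimal OpenTelemetry Node SDK sketch (the 0.10 ratio mirrors the tip above; exporter wiring is omitted):

```typescript
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node';
import {
  ParentBasedSampler,
  TraceIdRatioBasedSampler,
} from '@opentelemetry/sdk-trace-base';

// Head-based 10% sampling at the SDK, respecting upstream decisions:
// - root spans: sampled with probability 0.10 based on the trace ID
// - child spans: follow whatever the parent already decided
const provider = new NodeTracerProvider({
  sampler: new ParentBasedSampler({
    root: new TraceIdRatioBasedSampler(0.10),
  }),
});

provider.register();
```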
Let's consolidate everything into actionable best practices for production sampling.
```text
┌─────────────────────────────────────────────────────────────────────────────┐
│                      RECOMMENDED SAMPLING ARCHITECTURE                       │
└─────────────────────────────────────────────────────────────────────────────┘

LAYER 1: SDK (Head-Based Sampling)
─────────────────────────────────────────────────────────────────────────
  • Probabilistic 10-20% base sampling
  • Priority: Always sample debug flags and critical operations
  • Benefit: Reduces network traffic between apps and collector

LAYER 2: OpenTelemetry Collector (Tail-Based / Additional Filtering)
─────────────────────────────────────────────────────────────────────────
  • Tail-based policies:
      - 100% errors
      - 100% latency > P99
      - 100% high-value user tiers
  • Rate limiting: Max 1000 traces/second as a safety cap
  • Benefit: Intelligent selection based on complete trace content

LAYER 3: Storage Backend (Retention Policies)
─────────────────────────────────────────────────────────────────────────
  • Error traces: 30-day retention
  • Normal traces: 7-day retention
  • Aggregate data: 90-day retention
  • Benefit: Different retention for different trace importance

      ┌─────────────┐
      │ Application │
      │    SDKs     │
      └──────┬──────┘
             │  10-20% sampled (head-based)
             │  + priority traces
             ▼
      ┌─────────────┐
      │    OTel     │
      │  Collector  │
      └──────┬──────┘
             │  Tail-based refinement
             │  Rate limiting cap
             ▼
      ┌─────────────┐
      │   Storage   │◀─── Tiered retention policies
      │   Backend   │
      └─────────────┘
```

Congratulations! You've completed the Distributed Tracing module. You now understand: why tracing matters in distributed systems; the trace and span data model; how context propagation works; Jaeger and Zipkin architectures; and sampling strategies for production workloads. You have the knowledge to implement, operate, and optimize distributed tracing for any scale of system.