Every production system tells a story through numbers. The rate of incoming requests, the amount of memory consumed, the distribution of response times—these numerical signals describe what's happening inside our systems at any given moment. But raw numbers alone don't create understanding. The type of metric you choose fundamentally shapes how you interpret that number and what questions you can answer.
Consider a simple question: "How many requests is my API processing?" This seemingly straightforward question has multiple valid interpretations: the total number of requests served since the process started, the rate of requests per second right now, the number of requests currently in flight, or how request handling times are distributed.
Each interpretation requires a different metric type. Choosing the wrong type doesn't just limit your analysis—it can make your data entirely misleading. Understanding metric types is the prerequisite to building any observability system.
By the end of this page, you will deeply understand the three fundamental metric types—counters, gauges, and histograms—their mathematical properties, appropriate use cases, common pitfalls, and how they interact with time-series databases and query languages.
At their core, metrics are time-stamped numerical measurements. But not all numbers behave the same way. Some numbers only go up (total requests processed). Some fluctuate freely (current memory usage). Some capture distributions (request latencies). Each behavior pattern demands a different data model.
The observability industry has converged on three fundamental metric types, each optimized for different measurement patterns:
| Metric Type | Behavior | Primary Use Case | Key Property |
|---|---|---|---|
| Counter | Monotonically increasing | Counting events over time | Can only go up (or reset to zero) |
| Gauge | Arbitrary values | Measuring current state | Can go up, down, or stay constant |
| Histogram | Bucketed distributions | Understanding value distributions | Captures percentiles and ranges |
These three types aren't arbitrary categories—they're mathematical models that enable specific operations. A counter's monotonicity allows calculating rates. A gauge's point-in-time nature enables sampling. A histogram's buckets enable percentile estimation. Choose the right type, and your observability system works seamlessly. Choose the wrong type, and you'll fight your tools at every step.
Why only three types?
You might wonder why we don't have more metric types. The answer lies in balancing expressiveness with simplicity. These three types can represent virtually any measurement pattern while remaining simple enough to implement efficiently in time-series databases. Additional types (like summaries) exist in some systems but are typically variations on these fundamentals.
Think of metric types as contracts. When you declare a metric as a counter, you're promising the database that this value will never decrease (except on reset). The database uses this promise to optimize storage and enable specific operations like rate calculations. Breaking this contract—like using a counter for something that decreases—breaks the math that depends on it.
A counter is a cumulative metric that represents a single monotonically increasing value. It can only go up—never down—though it may reset to zero when the process restarts. Counters are the workhorse of operational metrics, used whenever you need to count discrete events.
Mathematical Properties:
Let's denote a counter value at time t as C(t). The fundamental property of a counter is:
C(t₂) ≥ C(t₁) for all t₂ > t₁ (absent resets)
This monotonicity property enables the most important operation on counters: rate calculation. The rate of events over a time window [t₁, t₂] is simply:
rate = (C(t₂) - C(t₁)) / (t₂ - t₁)
This is why the absolute value of a counter is usually uninteresting—what matters is how fast it's changing.
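To make the arithmetic concrete, here is a minimal Go sketch applying the rate formula to two scrapes of a counter (the sample values and timestamps are made up):

```go
package main

import (
    "fmt"
    "time"
)

func main() {
    // Two hypothetical scrapes of http_requests_total, 15 seconds apart.
    c1, t1 := 1042.0, time.Unix(1_700_000_000, 0)
    c2, t2 := 1342.0, time.Unix(1_700_000_015, 0)

    // rate = (C(t2) - C(t1)) / (t2 - t1)  →  (1342 - 1042) / 15 = 20 req/s
    rate := (c2 - c1) / t2.Sub(t1).Seconds()
    fmt.Printf("%.1f requests/second\n", rate)
}
```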
In Go with the Prometheus client library, counters are defined once and then incremented (or added to) wherever the corresponding event occurs:

```go
package main

import (
    "strconv"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"
)

// Define counters for HTTP request tracking
var (
    // Total HTTP requests received (counter)
    httpRequestsTotal = promauto.NewCounterVec(
        prometheus.CounterOpts{
            Name: "http_requests_total",
            Help: "Total number of HTTP requests received",
        },
        []string{"method", "endpoint", "status_code"},
    )

    // Total bytes received (counter)
    bytesReceivedTotal = promauto.NewCounter(
        prometheus.CounterOpts{
            Name: "http_request_bytes_total",
            Help: "Total bytes received in HTTP request bodies",
        },
    )

    // Errors by type (counter)
    errorsTotal = promauto.NewCounterVec(
        prometheus.CounterOpts{
            Name: "application_errors_total",
            Help: "Total number of application errors by type",
        },
        []string{"error_type", "service"},
    )
)

// Usage in request handler
func handleRequest(method, endpoint string, bodySize int64, statusCode int) {
    // Increment request counter with labels
    httpRequestsTotal.WithLabelValues(method, endpoint, strconv.Itoa(statusCode)).Inc()

    // Add bytes received
    bytesReceivedTotal.Add(float64(bodySize))

    // Track errors
    if statusCode >= 500 {
        errorsTotal.WithLabelValues("server_error", "api").Inc()
    } else if statusCode >= 400 {
        errorsTotal.WithLabelValues("client_error", "api").Inc()
    }
}
```

Handling Counter Resets
Counters reset to zero when processes restart. This is expected behavior, but your monitoring system must handle it gracefully. Most time-series databases like Prometheus automatically detect and compensate for resets using algorithms like rate() and increase().
When a counter resets, the rate() function detects that the current value is lower than the previous sample and assumes a reset occurred. It calculates the rate using the new value, treating zero as the starting point. This typically works well, but extremely short scrape intervals during rapid restarts can cause accuracy issues.
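Here is a simplified Go sketch of that reset-compensation idea; the real rate() and increase() also extrapolate to the edges of the query window, which this sketch omits:

```go
package main

import "fmt"

// increase sums counter deltas across ordered samples, compensating for
// resets: a sample lower than its predecessor is assumed to follow a reset
// to zero, so the whole new value counts as increase.
func increase(samples []float64) float64 {
    var total float64
    for i := 1; i < len(samples); i++ {
        delta := samples[i] - samples[i-1]
        if delta < 0 { // counter reset detected
            delta = samples[i]
        }
        total += delta
    }
    return total
}

func main() {
    // The process restarted between the 3rd and 4th scrape.
    samples := []float64{100, 150, 210, 30, 90}
    fmt.Println(increase(samples)) // 200: 50 + 60 + 30 + 60
}
```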
Anti-Pattern Warning:
Never decrement a counter. If you find yourself wanting to subtract from a counter, you're using the wrong metric type. Counters are for things that only accumulate. If your value can decrease, use a gauge instead.
Don't use counters for values that can decrease, like queue depth or active connections. 'http_active_requests' that decrements when requests complete is NOT a counter—it's a gauge. The counter equivalent would be 'http_requests_started_total' and 'http_requests_completed_total', where active requests = started - completed.
A gauge is a metric that represents a single numerical value that can arbitrarily go up and down. Gauges are used for measured values like temperature, current memory usage, or the number of concurrent requests. Unlike counters, the absolute value of a gauge is meaningful at any point in time.
Mathematical Properties:
A gauge G(t) at time t has no constraints on its relationship to previous values:
G(t₂) can be >, <, or = to G(t₁) for any t₂ > t₁
This flexibility means gauges represent snapshots. When you query a gauge, you're asking: "What was the value at this moment?" This is fundamentally different from counters, where you typically ask about change over time.
The Sampling Challenge:
Gauges present a unique challenge: values between samples are unknown. If you sample memory usage once per minute and see 50% at 12:00 and 80% at 12:01, you don't know what happened in between. It could have spiked to 100% and come back down. For volatile gauges, sampling frequency matters enormously.
package main import ( "runtime" "github.com/prometheus/client_golang/prometheus" "github.com/prometheus/client_golang/prometheus/promauto") var ( // Memory usage gauge memoryBytes = promauto.NewGaugeVec( prometheus.GaugeOpts{ Name: "process_memory_bytes", Help: "Current memory usage in bytes", }, []string{"type"}, ) // Active connections gauge activeConnections = promauto.NewGaugeVec( prometheus.GaugeOpts{ Name: "active_connections", Help: "Number of currently active connections", }, []string{"protocol", "state"}, ) // Queue depth gauge queueDepth = promauto.NewGaugeVec( prometheus.GaugeOpts{ Name: "queue_depth", Help: "Number of items currently in queue", }, []string{"queue_name", "priority"}, ) // In-flight requests gauge inFlightRequests = promauto.NewGaugeVec( prometheus.GaugeOpts{ Name: "http_requests_in_flight", Help: "Number of HTTP requests currently being processed", }, []string{"handler"}, )) // Update memory metrics periodicallyfunc updateMemoryMetrics() { var m runtime.MemStats runtime.ReadMemStats(&m) memoryBytes.WithLabelValues("heap_alloc").Set(float64(m.HeapAlloc)) memoryBytes.WithLabelValues("heap_sys").Set(float64(m.HeapSys)) memoryBytes.WithLabelValues("stack").Set(float64(m.StackInuse))} // Track request lifecyclefunc handleWithMetrics(handler string, next http.Handler) http.Handler { return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { inFlightRequests.WithLabelValues(handler).Inc() defer inFlightRequests.WithLabelValues(handler).Dec() next.ServeHTTP(w, r) })} // Track connectionstype connectionTracker struct { protocol string} func (ct *connectionTracker) OnConnect() { activeConnections.WithLabelValues(ct.protocol, "established").Inc()} func (ct *connectionTracker) OnDisconnect() { activeConnections.WithLabelValues(ct.protocol, "established").Dec()}Gauge Aggregation Challenges
Aggregating gauges across instances requires careful thought. Consider "current queue depth" across 10 service replicas: summing gives the total backlog across the fleet, averaging gives the typical per-instance load, and taking the maximum highlights the worst-off replica; each answers a different question. Contrast this with counters, where summing rates is almost always correct. Gauge aggregation depends heavily on what you're measuring.
Ephemeral Gauges and Staleness
When an instance dies, its gauge values stop updating. Prometheus marks these series as "stale" after a configurable period. For alerting on gauges, consider using `absent()` to detect metrics that have disappeared entirely.
For volatile values like in-flight requests, consider also exposing related counters (requests started, requests completed). This gives you both the instantaneous snapshot (gauge) and the ability to calculate rates over time (counters), providing a more complete picture.
A histogram samples observations (typically request latencies or response sizes) and counts them in configurable buckets. It also provides a sum of all observed values and a total count of observations. Histograms are essential for understanding the distribution of values, not just their average.
Why Averages Lie:
Consider an API where, out of 100 requests, 98 complete in about 10ms and 2 take roughly 5 seconds.
The average is 109ms, which tells you almost nothing useful. 98% of users experience 10ms, while 2% suffer 5-second delays. A histogram reveals this bimodal distribution; an average hides it.
Histogram Structure:
A histogram actually exposes multiple time series:
- `<metric>_bucket{le="<upper_bound>"}`: cumulative count of observations ≤ upper_bound
- `<metric>_sum`: total sum of all observed values
- `<metric>_count`: total count of observations

The "le" (less than or equal) buckets are cumulative. If you have buckets at 10ms, 50ms, and 100ms, the 50ms bucket includes all observations in the 10ms bucket.
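To make the cumulative behavior concrete, here is a small Go sketch (with made-up observations and bucket bounds) that counts observations into le-style buckets by hand:

```go
package main

import "fmt"

func main() {
    // Five made-up latency observations (seconds) and three bucket bounds.
    observations := []float64{0.004, 0.008, 0.030, 0.070, 0.120}
    bounds := []float64{0.010, 0.050, 0.100} // le="0.01", le="0.05", le="0.1"

    counts := make([]int, len(bounds)+1) // last slot plays the role of le="+Inf"
    for _, v := range observations {
        for i, b := range bounds {
            if v <= b {
                counts[i]++ // cumulative: every bound at or above v counts it
            }
        }
        counts[len(bounds)]++ // the +Inf bucket counts every observation
    }

    fmt.Println(counts) // [2 3 4 5]: each bucket includes everything below it
}
```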
package main import ( "time" "github.com/prometheus/client_golang/prometheus" "github.com/prometheus/client_golang/prometheus/promauto") var ( // HTTP request duration histogram with custom buckets httpRequestDuration = promauto.NewHistogramVec( prometheus.HistogramOpts{ Name: "http_request_duration_seconds", Help: "HTTP request latency distribution in seconds", // Buckets designed for typical API response times // .005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10 Buckets: prometheus.DefBuckets, }, []string{"method", "endpoint", "status_code"}, ) // Custom buckets for database query times dbQueryDuration = promauto.NewHistogramVec( prometheus.HistogramOpts{ Name: "db_query_duration_seconds", Help: "Database query latency distribution", // Custom buckets: 1ms, 5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 5s Buckets: []float64{.001, .005, .01, .025, .05, .1, .25, .5, 1, 5}, }, []string{"query_type", "table"}, ) // Response size histogram (in bytes) responseSize = promauto.NewHistogramVec( prometheus.HistogramOpts{ Name: "http_response_size_bytes", Help: "HTTP response size distribution in bytes", // Exponential buckets: 100B, 1KB, 10KB, 100KB, 1MB, 10MB Buckets: prometheus.ExponentialBuckets(100, 10, 6), }, []string{"endpoint"}, )) // Time HTTP request handlingfunc handleHTTPRequest(method, endpoint string) http.Handler { return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { start := time.Now() // Wrap ResponseWriter to capture status and size wrapped := &responseRecorder{ResponseWriter: w, status: 200} // Handle request... handler.ServeHTTP(wrapped, r) // Record metrics duration := time.Since(start).Seconds() httpRequestDuration.WithLabelValues( method, endpoint, strconv.Itoa(wrapped.status), ).Observe(duration) responseSize.WithLabelValues(endpoint).Observe(float64(wrapped.written)) })} // Time database queriesfunc executeQuery(queryType, table, query string) (Result, error) { start := time.Now() result, err := db.Query(query) duration := time.Since(start).Seconds() dbQueryDuration.WithLabelValues(queryType, table).Observe(duration) return result, err}Choosing Bucket Boundaries
Bucket selection is both art and science. Poor bucket boundaries waste storage or lose precision:
| Problem | Cause | Solution |
|---|---|---|
| All observations in first bucket | Lower buckets too high | Add smaller buckets (e.g., 1ms, 5ms) |
| All observations in +Inf bucket | Upper buckets too low | Add larger buckets beyond expected max |
| Can't distinguish p50 from p90 | Too few buckets | Add buckets in the relevant range |
| Cardinality explosion | Too many buckets | Use fewer, strategically placed buckets |
Guidelines for bucket selection: cover the full range you expect to observe (so nothing piles up in the first or +Inf bucket), place buckets close to your SLO thresholds, use roughly exponential spacing when values span a wide range, and keep the total number of buckets small to control cardinality, as sketched below.
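As one way to apply these guidelines, the following sketch (a hypothetical checkout-latency metric) packs buckets densely around a 100ms SLO and keeps only a few coarse buckets for the tail, using client_golang's bucket helpers:

```go
package main

import (
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"
)

// Buckets tuned around a hypothetical 100ms SLO: dense near the threshold,
// coarse for the outlier tail. LinearBuckets(start, width, count) and
// ExponentialBuckets(start, factor, count) are client_golang helpers.
var checkoutLatency = promauto.NewHistogram(prometheus.HistogramOpts{
    Name: "checkout_request_duration_seconds",
    Help: "Checkout latency, with dense buckets around the 100ms SLO",
    Buckets: append(
        prometheus.LinearBuckets(0.050, 0.025, 5), // 50ms, 75ms, 100ms, 125ms, 150ms
        0.5, 1, 2.5, 5, // coarse tail buckets for outliers
    ),
})

func main() {
    checkoutLatency.Observe(0.083) // record an 83ms checkout
}
```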
Some systems (like Prometheus) also offer 'summaries' that calculate precise quantiles client-side. Histograms calculate approximate quantiles server-side. Histograms are generally preferred because they can be aggregated across instances, while summaries cannot. The tradeoff: histograms require good bucket planning, while summaries provide exact quantiles but limited aggregation.
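For reference, here is a minimal client_golang summary sketch; the metric name and objectives are illustrative assumptions:

```go
package main

import (
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"
)

// A client-side summary: each key in Objectives is a quantile to track,
// and the value is the allowed error in that quantile's rank.
var requestLatencySummary = promauto.NewSummary(prometheus.SummaryOpts{
    Name: "http_request_duration_summary_seconds",
    Help: "Request latency with client-side quantile estimation",
    Objectives: map[float64]float64{
        0.5:  0.05,  // p50, tracked to within ±0.05 of the target rank
        0.9:  0.01,  // p90
        0.99: 0.001, // p99
    },
})

func main() {
    requestLatencySummary.Observe(0.042) // record a 42ms request
}
```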
One of the most powerful features of histograms is the ability to estimate percentiles (quantiles). The p99 latency—the value below which 99% of observations fall—is a critical SLO metric that histograms enable.
The histogram_quantile Function:
In Prometheus, you calculate percentiles using histogram_quantile(). This function uses linear interpolation between bucket boundaries:
```promql
# Calculate p99 latency over the last 5 minutes
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))

# Calculate p50 (median) latency
histogram_quantile(0.50, rate(http_request_duration_seconds_bucket[5m]))

# Calculate p99 grouped by endpoint
histogram_quantile(0.99,
  sum(rate(http_request_duration_seconds_bucket[5m])) by (le, endpoint)
)
```
How Linear Interpolation Works:
Suppose you have buckets at 100ms and 250ms, 90% of observations fall below 100ms, and 98% fall below 250ms. The p95 must fall in the 100-250ms bucket, so linear interpolation estimates it at 100ms + (0.95 − 0.90) / (0.98 − 0.90) × (250ms − 100ms) ≈ 194ms.
This is an estimate—the true p95 could be anywhere in that range.
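The following Go sketch implements that interpolation directly; it is a simplification for illustration, not the exact algorithm Prometheus uses:

```go
package main

import "fmt"

// estimateQuantile sketches the interpolation behind histogram_quantile():
// find the first bucket whose cumulative fraction reaches q, then interpolate
// linearly between that bucket's lower and upper bounds.
func estimateQuantile(q float64, bounds, cumFrac []float64) float64 {
    for i, upper := range bounds {
        if cumFrac[i] >= q {
            lower, prevFrac := 0.0, 0.0
            if i > 0 {
                lower, prevFrac = bounds[i-1], cumFrac[i-1]
            }
            // How far into this bucket's width we must go to reach q.
            return lower + (upper-lower)*(q-prevFrac)/(cumFrac[i]-prevFrac)
        }
    }
    // q falls in the +Inf bucket: there is no upper bound to interpolate toward.
    return bounds[len(bounds)-1]
}

func main() {
    // Buckets at 100ms and 250ms; 90% of observations ≤ 100ms, 98% ≤ 250ms.
    p95 := estimateQuantile(0.95, []float64{0.100, 0.250}, []float64{0.90, 0.98})
    fmt.Printf("estimated p95 ≈ %.0f ms\n", p95*1000) // ≈ 194 ms
}
```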
| Percentile | Also Known As | Meaning | When to Use |
|---|---|---|---|
| p50 | Median | 50% of observations are faster | Understand typical user experience |
| p90 | — | 90% of observations are faster | Identify where the long tail begins |
| p95 | — | 95% of observations are faster | Balance between typical and worst-case |
| p99 | Two nines | 99% of observations are faster | SLO target for latency-sensitive services |
| p99.9 | Three nines | 99.9% of observations are faster | Ultra-premium tier or financial services |
Aggregating Histograms Correctly
One of histogram's superpowers is aggregation across instances. Unlike summaries (which calculate client-side quantiles), you can combine histogram buckets and then calculate quantiles:
```promql
# CORRECT: Sum buckets first, then calculate percentile
histogram_quantile(0.99,
  sum(rate(http_request_duration_seconds_bucket[5m])) by (le)
)

# WRONG: Averaging percentiles is statistically invalid
avg(histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m])))
```
Why is averaging percentiles wrong? Consider two instances: Instance A serves 10 requests per second with a p99 of 100ms, while Instance B serves 1,000 requests per second with a p99 of 500ms.
Averaging gives 300ms, but Instance B dominates the traffic. The true aggregate p99 is much closer to 500ms.
Important: Always aggregate the underlying buckets BEFORE calculating the percentile.
Histogram percentile calculations are estimates based on bucket boundaries. If your buckets are too coarse, your percentile estimates will be inaccurate. For critical SLOs, ensure you have buckets close to your target thresholds. A p99 SLO of 100ms should have buckets at 50ms, 75ms, 100ms, 125ms, and 150ms to get accurate measurement.
Selecting the appropriate metric type is one of the first and most important decisions in instrumentation design. Here's a systematic decision framework:
Use a counter when you are counting discrete events and care about rates, a gauge when the current value itself is meaningful and can move in either direction, and a histogram when you need the distribution of values, not just an average:
| Question | Counter | Gauge | Histogram |
|---|---|---|---|
| Does the value only increase? | ✅ Yes | ❌ | N/A |
| Can the value decrease? | ❌ | ✅ Yes | N/A |
| Do you need rates/throughput? | ✅ Ideal | ⚠️ Possible | N/A |
| Is the absolute value important? | ❌ Usually not | ✅ Yes | N/A |
| Do you need percentiles? | ❌ | ❌ | ✅ Yes |
| Do you need distribution info? | ❌ | ❌ | ✅ Yes |
| Is the value discrete events? | ✅ Ideal | ❌ | N/A |
| Is it a point-in-time snapshot? | ❌ | ✅ Yes | N/A |
It's often valuable to expose the same measurement as multiple metric types. For request handling, you might expose: 'http_requests_total' (counter), 'http_requests_in_flight' (gauge), and 'http_request_duration_seconds' (histogram). Each answers different questions about the same phenomenon.
Even experienced engineers make metric type mistakes: using a counter for a value that can decrease, assuming a gauge captures what happened between samples, averaging percentiles instead of aggregating buckets first, and picking bucket boundaries that hide the distribution you care about. The sections above show how to avoid each of these.
The Mixed Metric Pattern:
A sophisticated pattern uses multiple metric types together. For request tracking:
- `http_requests_started_total` (counter) → calculate the start rate
- `http_requests_completed_total` (counter) → calculate the completion rate
- `http_requests_in_flight` (gauge) → current active requests
- `http_request_duration_seconds` (histogram) → latency distribution

As a cross-check (absent counter resets), `http_requests_in_flight` should stay approximately equal to `http_requests_started_total - http_requests_completed_total`.
This multi-metric approach provides comprehensive visibility and cross-validation opportunities.
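A minimal sketch of this pattern as Go middleware follows; the metric names match the list above, while the handler wiring, port, and bucket choice are illustrative assumptions:

```go
package main

import (
    "net/http"
    "time"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

var (
    started = promauto.NewCounter(prometheus.CounterOpts{
        Name: "http_requests_started_total",
        Help: "Requests that have begun processing",
    })
    completed = promauto.NewCounter(prometheus.CounterOpts{
        Name: "http_requests_completed_total",
        Help: "Requests that have finished processing",
    })
    inFlight = promauto.NewGauge(prometheus.GaugeOpts{
        Name: "http_requests_in_flight",
        Help: "Requests currently being processed",
    })
    duration = promauto.NewHistogram(prometheus.HistogramOpts{
        Name:    "http_request_duration_seconds",
        Help:    "Request latency distribution",
        Buckets: prometheus.DefBuckets,
    })
)

// instrument wires all four metrics into a single middleware.
func instrument(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        started.Inc()
        inFlight.Inc()
        start := time.Now()
        defer func() {
            duration.Observe(time.Since(start).Seconds())
            inFlight.Dec()
            completed.Inc()
        }()
        next.ServeHTTP(w, r)
    })
}

func main() {
    http.Handle("/metrics", promhttp.Handler())
    http.Handle("/", instrument(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("ok"))
    })))
    http.ListenAndServe(":8080", nil)
}
```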
Histograms multiply cardinality. A histogram with 10 buckets and labels {method, endpoint, status} where you have 5 methods × 100 endpoints × 5 status codes = 2500 label combinations × 12 series per histogram (10 buckets + sum + count) = 30,000 time series from a single metric. Design labels carefully.
Metric types are the foundation of observability. Choosing the right type enables powerful analysis; choosing the wrong type creates constant friction with your tools.
What's Next:
Now that you understand the fundamental metric types, the next page explores Prometheus architecture—the most widely adopted metrics collection system. You'll learn how Prometheus's pull-based model, time-series database, and powerful query language work together to make metrics collection practical at scale.
You now understand the three fundamental metric types and when to use each. This knowledge forms the vocabulary of observability—every metric you create or query will be classified by these types. Next, we'll explore how Prometheus operationalizes these concepts at scale.