When a SQL query runs slower than expected, the natural instinct is to stare at the query text, hoping to spot the problem through sheer intuition. This approach occasionally works for simple cases, but it fundamentally cannot scale to complex queries, production workloads, or subtle performance regressions.
Query profiler tools represent the systematic alternative—they instrument the database engine itself, capturing detailed telemetry about what actually happens during query execution. Rather than guessing where time is spent, you measure with precision. Rather than hypothesizing about resource consumption, you observe directly.
The difference between a developer who thinks a query is slow because of a missing index and one who can definitively demonstrate that 72% of execution time is spent on a sequential scan of a specific table is the difference between software development and software engineering. Profiler tools bridge that gap.
By the end of this page, you will understand the architecture and capabilities of query profiler tools across major database systems. You'll learn to activate, configure, and interpret profiling output—transforming raw performance data into actionable optimization insights. This foundation enables every subsequent technique in the query profiling discipline.
Query profiling is the practice of instrumenting SQL statement execution to collect detailed metrics about resource consumption, timing, and internal operations. Unlike EXPLAIN plans which predict behavior, profilers measure actual execution—capturing what truly happened when your query ran.
This distinction is crucial. EXPLAIN provides the optimizer's plan—its best prediction of how a query will execute. Profiling provides reality—what actually occurred, including time spent, rows processed, and resources consumed. Plans can be wrong; profiling data reflects truth.
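To make the prediction-versus-reality distinction concrete, here is a hedged, PostgreSQL-flavored sketch; the table name and every number are invented purely for contrast:

```sql
-- Hypothetical illustration: prediction vs. measurement
-- (table name and all figures are made up)

-- EXPLAIN alone returns only the optimizer's predictions; nothing executes:
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
--   Seq Scan on orders  (cost=0.00..1500.00 rows=200 width=64)
--                                  ^^^^^^^^ estimate only

-- EXPLAIN ANALYZE executes the query and appends measured reality:
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';
--   Seq Scan on orders  (cost=0.00..1500.00 rows=200 width=64)
--                       (actual time=0.021..48.337 rows=18750 loops=1)
--                                        measured: the row estimate
--                                        was off by nearly 100x
```

When the `actual` figures diverge this far from the estimates, the optimizer was working from bad assumptions, and only profiling reveals it.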
Profiling operates at multiple granularities: statement-level (entire query), operator-level (individual plan nodes), and system-level (database-wide aggregates). Effective performance analysis requires navigating between these levels—zooming out to identify problematic queries, then drilling down to isolate specific operators causing issues.
Why profiling matters more than intuition:
Consider a query joining five tables that takes 3 seconds to execute. Without profiling, you might spend hours optimizing the 'obvious' problem—perhaps adding indexes to the largest table. With profiling, you might discover in seconds that 2.8 of those 3 seconds are spent waiting for a lock held by a different transaction, and the query itself is highly efficient.
Profilers prevent the most common performance optimization failure: solving the wrong problem. They ensure your optimization effort targets genuine bottlenecks, not perceived ones.
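The lock-wait scenario above can be confirmed in seconds rather than guessed at. As one hedged example, MySQL's sys schema (5.7+) exposes a ready-made view of blocked sessions:

```sql
-- Who is blocked, who is blocking, and for how long
-- (sys schema view over InnoDB lock instrumentation)
SELECT
    waiting_pid,     -- processlist id of the session everyone assumed was "slow"
    waiting_query,
    blocking_pid,    -- the session actually holding the lock
    blocking_query,
    wait_age         -- how long the waiter has been blocked
FROM sys.innodb_lock_waits;
```

If this view returns a row for your "slow" query, the bottleneck is contention, not the query plan, and no amount of index tuning will fix it.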
Understanding how profilers work internally helps you interpret their output correctly and configure them appropriately. Most database profilers share a common architectural pattern with three key components:
1. Instrumentation Points
The database engine contains numerous instrumentation points—code locations where profiling data is captured. These include parser and compiler events, optimizer decisions, per-operator execution counters (rows examined, rows returned), buffer and disk I/O, lock acquisition, and wait events.
Each instrumentation point records timestamps, counters, or state snapshots. The density of instrumentation varies by database system and configuration—more instrumentation provides more detail but incurs overhead.
```sql
-- Conceptual view of instrumentation flow during query execution
-- (This illustrates the internal process, not actual executable SQL)

/*
Query Lifecycle with Profiling Instrumentation:

  CLIENT SENDS QUERY
        │
        ▼
  PHASE 1: PARSING
    📊 Instrumentation:
       • parse_start_time = NOW()
       • token_count++
       • syntax_tree_nodes++
       • parse_end_time = NOW()
        │
        ▼
  PHASE 2: OPTIMIZATION
    📊 Instrumentation:
       • plans_evaluated++
       • cardinality_estimates[]
       • cost_calculations[]
       • optimization_elapsed_ms
        │
        ▼
  PHASE 3: EXECUTION
    📊 Instrumentation (per operator):
       • rows_examined++
       • rows_returned++
       • buffer_reads++
       • buffer_hits++
       • temp_space_used++
       • operator_elapsed_ns
        │
        ▼
  PHASE 4: RESULT TRANSMISSION
    📊 Instrumentation:
       • bytes_sent++
       • network_round_trips++
       • client_wait_time_ms
*/
```

2. Collection Infrastructure
Instrumentation points feed data into collection infrastructure—typically memory buffers, ring buffers, or shared memory segments designed for minimal-overhead data capture. Key design considerations include keeping the capture path cheap enough to run on every event, bounding memory consumption (older events are overwritten rather than accumulated), and allowing safe concurrent writes from many sessions.
3. Access Interfaces
Collected profiling data is exposed through various interfaces: SQL-queryable system views (MySQL's Performance Schema tables, PostgreSQL's pg_stat_* views, SQL Server's DMVs, Oracle's V$ views), session-level commands such as EXPLAIN ANALYZE, and report-generation APIs for offline analysis.
Profiling is not free. Instrumentation adds overhead—typically 2-15% depending on granularity. High-resolution profiling (timing every row processed) can add 30%+ overhead. For production analysis, use session-based profiling on specific connections rather than database-wide instrumentation. Never leave detailed profiling enabled continuously in production.
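One way to follow that advice in MySQL is to scope Performance Schema instrumentation to specific accounts rather than every connection. A sketch using the setup_actors table—the user name here is a placeholder for the account you want to investigate:

```sql
-- Disable instrumentation for new foreground threads by default
UPDATE performance_schema.setup_actors
SET ENABLED = 'NO', HISTORY = 'NO'
WHERE HOST = '%' AND USER = '%';

-- Re-enable it only for the account under investigation
-- ('app_readonly' is a hypothetical user name)
INSERT INTO performance_schema.setup_actors
    (HOST, USER, ROLE, ENABLED, HISTORY)
VALUES ('%', 'app_readonly', '%', 'YES', 'YES');
```

Connections from other accounts then pay essentially no statement-instrumentation cost, while the targeted account is profiled in full.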
MySQL provides a rich ecosystem of profiling capabilities that evolved significantly across versions. Understanding the available tools and their appropriate use cases is essential for MySQL performance analysis.
Performance Schema is MySQL's primary performance monitoring infrastructure—a comprehensive instrumentation framework built into the server. Unlike deprecated alternatives, Performance Schema is designed for production use with configurable overhead.
```sql
-- ================================================================
-- MySQL Performance Schema: Core Profiling Infrastructure
-- ================================================================

-- Check if Performance Schema is enabled
SHOW VARIABLES LIKE 'performance_schema';
-- Result: Should show 'ON' for profiling capabilities

-- ----------------------------------------------------------------
-- Configure statement instrumentation
-- ----------------------------------------------------------------

-- Enable statement event consumption
UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE 'events_statements%';

-- Enable statement instrumentation
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'statement/%';

-- ----------------------------------------------------------------
-- Query recently executed statements with timing
-- ----------------------------------------------------------------

-- View current session's last statement details
-- (TIMER_* and LOCK_TIME are reported in picoseconds)
SELECT
    THREAD_ID,
    EVENT_ID,
    SQL_TEXT,
    DIGEST_TEXT,
    TIMER_WAIT / 1000000000000 AS elapsed_seconds,
    TIMER_START / 1000000000000 AS start_seconds,
    LOCK_TIME / 1000000000000 AS lock_seconds,
    ROWS_EXAMINED,
    ROWS_SENT,
    ROWS_AFFECTED,
    CREATED_TMP_TABLES,
    CREATED_TMP_DISK_TABLES,
    SELECT_FULL_JOIN,
    SELECT_SCAN,
    SORT_ROWS,
    NO_INDEX_USED,
    NO_GOOD_INDEX_USED
FROM performance_schema.events_statements_current
WHERE THREAD_ID = (
    SELECT THREAD_ID
    FROM performance_schema.threads
    WHERE PROCESSLIST_ID = CONNECTION_ID()
);

-- View statement history for current thread
SELECT
    SQL_TEXT,
    TIMER_WAIT / 1000000000 AS elapsed_ms,   -- picoseconds to milliseconds
    ROWS_EXAMINED,
    ROWS_SENT,
    CREATED_TMP_TABLES,
    CREATED_TMP_DISK_TABLES,
    NO_INDEX_USED
FROM performance_schema.events_statements_history
WHERE THREAD_ID = (
    SELECT THREAD_ID
    FROM performance_schema.threads
    WHERE PROCESSLIST_ID = CONNECTION_ID()
)
ORDER BY EVENT_ID DESC
LIMIT 10;
```

Key Performance Schema Tables for Query Analysis:
| Table | Purpose | Key Metrics |
|---|---|---|
| events_statements_current | Currently executing statements | Real-time execution state |
| events_statements_history | Recent statements per thread | Per-session query history |
| events_statements_history_long | Recent statements globally | Cross-session query history |
| events_statements_summary_by_digest | Aggregated by normalized query | Pattern-based analysis |
| events_stages_current | Current execution stages | Phase-level timing |
| events_waits_current | Current wait events | Resource contention details |
```sql
-- ================================================================
-- Stage-Level Profiling: Understanding Execution Phases
-- ================================================================

-- Enable stage instrumentation for detailed phase timing
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'stage/%';

UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE 'events_stages%';

-- Execute a query to profile

-- Then view stage breakdown for the statement
SELECT
    EVENT_NAME AS stage_name,
    TIMER_WAIT / 1000000000 AS duration_ms,   -- picoseconds to milliseconds
    NESTING_EVENT_ID
FROM performance_schema.events_stages_history
WHERE NESTING_EVENT_ID = (
    SELECT EVENT_ID
    FROM performance_schema.events_statements_history
    WHERE THREAD_ID = (
        SELECT THREAD_ID
        FROM performance_schema.threads
        WHERE PROCESSLIST_ID = CONNECTION_ID()
    )
    ORDER BY EVENT_ID DESC
    LIMIT 1
)
ORDER BY EVENT_ID;

-- Sample output understanding:
-- ┌──────────────────────────────────────┬─────────────┐
-- │ stage_name                           │ duration_ms │
-- ├──────────────────────────────────────┼─────────────┤
-- │ stage/sql/starting                   │       0.025 │
-- │ stage/sql/checking permissions       │       0.008 │
-- │ stage/sql/Opening tables             │       0.145 │
-- │ stage/sql/init                       │       0.012 │
-- │ stage/sql/System lock                │       0.005 │
-- │ stage/sql/optimizing                 │       0.089 │
-- │ stage/sql/statistics                 │      12.450 │ ← Cardinality estimation
-- │ stage/sql/preparing                  │       0.034 │
-- │ stage/sql/executing                  │     847.230 │ ← Actual row processing
-- │ stage/sql/Sending data               │       2.340 │
-- │ stage/sql/end                        │       0.003 │
-- │ stage/sql/query end                  │       0.004 │
-- │ stage/sql/closing tables             │       0.018 │
-- │ stage/sql/freeing items              │       0.012 │
-- │ stage/sql/cleaning up                │       0.002 │
-- └──────────────────────────────────────┴─────────────┘

-- ================================================================
-- Sys Schema: Simplified Performance Analysis Views
-- ================================================================

-- The sys schema provides human-readable views over Performance Schema

-- Top queries by total execution time
SELECT * FROM sys.statements_with_runtimes_in_95th_percentile;

-- Queries with full table scans
SELECT * FROM sys.statements_with_full_table_scans
ORDER BY exec_count DESC
LIMIT 10;

-- Queries sorted by total latency
SELECT
    query,
    db,
    exec_count,
    total_latency,
    avg_latency,
    rows_sent_avg,
    rows_examined_avg
FROM sys.statement_analysis
ORDER BY total_latency DESC
LIMIT 10;

-- View I/O statistics per table
SELECT * FROM sys.schema_table_statistics
WHERE table_schema = 'your_database'
ORDER BY total_latency DESC;
```

MySQL's SHOW PROFILE command has been deprecated since MySQL 5.6.7 and should not be used for new development. It remains available, but Performance Schema provides superior capabilities with production-appropriate overhead. Migrate existing SHOW PROFILE workflows to Performance Schema queries for future compatibility.
PostgreSQL takes a different architectural approach to query profiling—integrating timing information directly into EXPLAIN output rather than maintaining separate profiling infrastructure. This design provides operator-level detail without requiring separate configuration or memory allocation.
EXPLAIN (ANALYZE) is PostgreSQL's primary query profiling mechanism, combining execution plan display with actual runtime measurements.
```sql
-- ================================================================
-- PostgreSQL EXPLAIN ANALYZE: Integrated Query Profiling
-- ================================================================

-- Basic EXPLAIN ANALYZE: Plan + Actual Execution Metrics
EXPLAIN ANALYZE
SELECT
    c.customer_name,
    COUNT(o.order_id) AS order_count,
    SUM(o.total_amount) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE c.created_at >= CURRENT_DATE - INTERVAL '1 year'
GROUP BY c.customer_id, c.customer_name
HAVING COUNT(o.order_id) > 5
ORDER BY total_spent DESC
LIMIT 100;

/*
Output with ANALYZE includes actual timing:

Hash Join (cost=312.50..1245.80 rows=500 width=68)
          (actual time=15.234..89.456 rows=487 loops=1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ACTUAL execution metrics
  Hash Cond: (o.customer_id = c.customer_id)
  -> Seq Scan on orders o (cost=0.00..850.00 rows=25000 width=24)
                          (actual time=0.015..35.234 rows=24856 loops=1)
                                            ^^^^^^^^^^ Actual rows vs estimated
  -> Hash (cost=250.00..250.00 rows=5000 width=44)
          (actual time=12.890..12.890 rows=4892 loops=1)
     -> Seq Scan on customers c (cost=0.00..250.00 rows=5000 width=44)
                                (actual time=0.008..8.234 rows=4892 loops=1)
        Filter: (created_at >= (CURRENT_DATE - '1 year'::interval))
        Rows Removed by Filter: 1108
                                ^^^^ Rows filtered (not returned)
*/

-- ================================================================
-- EXPLAIN Options for Comprehensive Profiling
-- ================================================================

-- Full profiling with all available metrics
EXPLAIN (
    ANALYZE,     -- Actually execute the query (required for actual metrics)
    BUFFERS,     -- Show buffer/page access statistics
    TIMING,      -- Show timing information (can be disabled for less overhead)
    VERBOSE,     -- Show additional detail (output columns, schema names)
    FORMAT JSON  -- Machine-parseable output format
)
SELECT product_id, SUM(quantity)
FROM order_items
GROUP BY product_id
HAVING SUM(quantity) > 100;

/*
Key metrics from BUFFERS option:

Seq Scan on order_items (cost=0.00..4000.00 rows=200000 width=12)
                        (actual time=0.015..125.890 rows=198500 loops=1)
  Buffers: shared hit=1234 read=890
                  ^^^^^^^^     ^^^^
                  Cache hits   Physical disk reads
                  (fast)       (slow - disk I/O)

When shared read >> shared hit: Table not cached, I/O bottleneck
When shared hit >> shared read: Well-cached, CPU-bound execution
*/

-- ================================================================
-- Understanding Key Metrics in EXPLAIN ANALYZE Output
-- ================================================================

/*
CRITICAL METRICS TO ANALYZE:

1. ACTUAL TIME (startup..total)
   - startup: Time to return first row
   - total: Time to return all rows
   - "actual time=0.015..125.890" means 0.015ms to first row, 125ms total

2. ROWS (actual vs estimated)
   - Significant mismatch indicates stale statistics or complex predicates
   - "rows=5000" (estimated) vs "rows=4892" (actual) = good estimate
   - "rows=100" (estimated) vs "rows=45000" (actual) = BAD, reanalyze table

3. LOOPS
   - Number of times operator executed (nested loop iterations)
   - Total time = (actual time) × loops
   - "loops=1000" with "actual time=0.5ms" = 500ms total contribution

4. BUFFERS
   - shared hit: Pages read from buffer cache (fast)
   - shared read: Pages read from disk (slow)
   - temp read/written: Pages in temporary files (very slow)

5. ROWS REMOVED BY FILTER
   - Rows examined but not returned
   - High value suggests potential index opportunity
*/

-- Practical example with buffer analysis
EXPLAIN (ANALYZE, BUFFERS, TIMING)
SELECT *
FROM large_table
WHERE indexed_column = 42
  AND unindexed_column = 'value';

/*
Output analysis:

Index Scan using idx_indexed_column on large_table
    (actual time=0.045..125.890 rows=50 loops=1)
  Index Cond: (indexed_column = 42)
  Filter: (unindexed_column = 'value')
  Rows Removed by Filter: 9950
                          ^^^^ High filter removal = index found 10000 rows,
                               but only 50 matched
  SOLUTION: Consider composite index (indexed_column, unindexed_column)

  Buffers: shared hit=150 read=25
           ^^^^^^^^^^^^^^^^ 175 total pages accessed; 25 required disk I/O
*/
```

pg_stat_statements: Aggregate Query Statistics
While EXPLAIN ANALYZE profiles individual query executions, pg_stat_statements provides aggregate statistics across all executions of a query pattern—essential for identifying which queries consume the most resources over time.
```sql
-- ================================================================
-- pg_stat_statements: Workload-Level Query Profiling
-- ================================================================

-- Enable the extension (requires superuser, typically in postgresql.conf)
-- shared_preload_libraries = 'pg_stat_statements'
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- View parameters controlling statistics collection
SHOW pg_stat_statements.max;    -- Maximum tracked statements
SHOW pg_stat_statements.track;  -- What to track: top, all, none

-- ================================================================
-- Finding Resource-Intensive Queries
-- ================================================================

-- Top queries by total execution time
SELECT
    calls,
    total_exec_time::NUMERIC(12,2) AS total_time_ms,
    mean_exec_time::NUMERIC(12,2) AS avg_time_ms,
    stddev_exec_time::NUMERIC(12,2) AS stddev_ms,
    rows,
    shared_blks_hit,
    shared_blks_read,
    ROUND(100.0 * shared_blks_hit /
          NULLIF(shared_blks_hit + shared_blks_read, 0), 2) AS cache_hit_pct,
    LEFT(query, 100) AS query_preview
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

-- Top queries by I/O (disk reads)
SELECT
    calls,
    shared_blks_read AS disk_reads,
    shared_blks_hit AS cache_hits,
    rows,
    total_exec_time::NUMERIC(10,2) AS total_ms,
    LEFT(query, 100) AS query_preview
FROM pg_stat_statements
WHERE shared_blks_read > 0
ORDER BY shared_blks_read DESC
LIMIT 20;

-- Queries with high variability (inconsistent performance)
SELECT
    calls,
    mean_exec_time::NUMERIC(10,2) AS avg_ms,
    stddev_exec_time::NUMERIC(10,2) AS stddev_ms,
    min_exec_time::NUMERIC(10,2) AS min_ms,
    max_exec_time::NUMERIC(10,2) AS max_ms,
    (max_exec_time / NULLIF(min_exec_time, 0))::NUMERIC(10,1) AS variance_ratio,
    LEFT(query, 80) AS query_preview
FROM pg_stat_statements
WHERE calls > 100           -- Sufficient sample size
  AND mean_exec_time > 10   -- Non-trivial queries
ORDER BY stddev_exec_time DESC
LIMIT 20;

-- Reset statistics (typically after optimization round)
SELECT pg_stat_statements_reset();

-- ================================================================
-- Normalized Query Analysis
-- ================================================================

/*
pg_stat_statements normalizes queries by replacing literal values:

Original queries:
    SELECT * FROM users WHERE id = 123;
    SELECT * FROM users WHERE id = 456;
    SELECT * FROM users WHERE id = 789;

Normalized as:
    SELECT * FROM users WHERE id = $1;

This groups equivalent queries for aggregate analysis,
revealing which query PATTERNS consume resources—not just
which specific parameter values are slow.
*/

-- View query patterns with parameter placeholders
SELECT queryid, query, calls, mean_exec_time
FROM pg_stat_statements
WHERE query LIKE '%users%'
ORDER BY total_exec_time DESC;
```

Use pg_stat_statements to identify which queries consume the most resources across your workload. Then use EXPLAIN ANALYZE on those specific queries to understand why they're expensive and how to optimize them. This two-phase approach ensures you optimize the right queries.
Microsoft SQL Server provides an extensive profiling ecosystem that has evolved from trace-based approaches to lightweight DMV-based analysis. Understanding these tools and their appropriate use cases is essential for SQL Server performance work.
Extended Events represent SQL Server's modern profiling infrastructure—a flexible, low-overhead event capture system that replaced SQL Trace and Profiler for production use.
```sql
-- ================================================================
-- SQL Server Extended Events: Modern Query Profiling
-- ================================================================

-- Create an Extended Events session for query profiling
CREATE EVENT SESSION QueryProfiling ON SERVER
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (
        sqlserver.sql_text,
        sqlserver.query_hash,
        sqlserver.query_plan_hash,
        sqlserver.database_name,
        sqlserver.username,
        sqlserver.client_app_name
    )
    WHERE (
        duration > 1000000  -- Queries taking > 1 second (microseconds)
        AND sqlserver.database_name = N'YourDatabase'
    )
),
ADD EVENT sqlserver.rpc_completed (
    ACTION (
        sqlserver.sql_text,
        sqlserver.query_hash,
        sqlserver.database_name
    )
    WHERE (
        duration > 1000000
    )
)
ADD TARGET package0.ring_buffer (
    SET max_memory = 4096  -- KB
)
WITH (
    MAX_MEMORY = 4096 KB,
    EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY = 5 SECONDS,
    STARTUP_STATE = OFF
);

-- Start the session
ALTER EVENT SESSION QueryProfiling ON SERVER STATE = START;

-- View captured events
SELECT
    event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    event_data.value('(event/@timestamp)[1]', 'datetime2') AS event_time,
    event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint')
        / 1000000.0 AS duration_seconds,
    event_data.value('(event/data[@name="cpu_time"]/value)[1]', 'bigint')
        / 1000000.0 AS cpu_seconds,
    event_data.value('(event/data[@name="logical_reads"]/value)[1]', 'bigint') AS logical_reads,
    event_data.value('(event/data[@name="physical_reads"]/value)[1]', 'bigint') AS physical_reads,
    event_data.value('(event/data[@name="writes"]/value)[1]', 'bigint') AS writes,
    event_data.value('(event/data[@name="row_count"]/value)[1]', 'bigint') AS row_count,
    event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS sql_text
FROM (
    SELECT CAST(target_data AS XML) AS target_xml
    FROM sys.dm_xe_session_targets xst
    JOIN sys.dm_xe_sessions xs ON xs.address = xst.event_session_address
    WHERE xs.name = 'QueryProfiling'
) AS raw_data
CROSS APPLY target_xml.nodes('RingBufferTarget/event') AS events(event_data)
ORDER BY event_time DESC;

-- Stop and drop session when done
ALTER EVENT SESSION QueryProfiling ON SERVER STATE = STOP;
DROP EVENT SESSION QueryProfiling ON SERVER;
```

Dynamic Management Views (DMVs) provide real-time and historical query statistics without explicit session configuration—ideal for ad-hoc investigation and continuous monitoring.
```sql
-- ================================================================
-- SQL Server DMV-Based Query Profiling
-- ================================================================

-- ----------------------------------------------------------------
-- sys.dm_exec_query_stats: Aggregated Query Statistics
-- ----------------------------------------------------------------

-- Top queries by total CPU time
SELECT TOP 20
    qs.total_worker_time / 1000000.0 AS total_cpu_seconds,
    qs.execution_count,
    qs.total_worker_time / qs.execution_count / 1000.0 AS avg_cpu_ms,
    qs.total_elapsed_time / 1000000.0 AS total_elapsed_seconds,
    qs.total_logical_reads,
    qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
    qs.total_physical_reads,
    qs.last_execution_time,
    SUBSTRING(st.text,
        (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(st.text)
            ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1
    ) AS query_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY qs.total_worker_time DESC;

-- Top queries by logical reads (memory pressure indicator)
SELECT TOP 20
    qs.total_logical_reads,
    qs.execution_count,
    qs.total_logical_reads / qs.execution_count AS avg_reads_per_exec,
    qs.total_worker_time / 1000000.0 AS total_cpu_sec,
    qs.total_elapsed_time / 1000000.0 AS total_elapsed_sec,
    SUBSTRING(st.text,
        (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(st.text)
            ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1
    ) AS query_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY qs.total_logical_reads DESC;

-- ----------------------------------------------------------------
-- sys.dm_exec_requests: Currently Executing Queries
-- ----------------------------------------------------------------

-- View all currently executing queries with performance metrics
SELECT
    r.session_id,
    r.status,
    r.command,
    r.wait_type,
    r.wait_time,
    r.blocking_session_id,
    r.cpu_time,
    r.total_elapsed_time / 1000.0 AS elapsed_seconds,
    r.logical_reads,
    r.reads AS physical_reads,
    r.writes,
    r.row_count,
    r.percent_complete,
    s.login_name,
    s.host_name,
    s.program_name,
    db.name AS database_name,
    SUBSTRING(st.text,
        (r.statement_start_offset / 2) + 1,
        ((CASE r.statement_end_offset
            WHEN -1 THEN DATALENGTH(st.text)
            ELSE r.statement_end_offset
          END - r.statement_start_offset) / 2) + 1
    ) AS current_statement,
    qp.query_plan
FROM sys.dm_exec_requests r
JOIN sys.dm_exec_sessions s ON r.session_id = s.session_id
LEFT JOIN sys.databases db ON r.database_id = db.database_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) st
CROSS APPLY sys.dm_exec_query_plan(r.plan_handle) qp
WHERE r.session_id > 50       -- Exclude system sessions
  AND r.session_id != @@SPID  -- Exclude current session
ORDER BY r.total_elapsed_time DESC;

-- ----------------------------------------------------------------
-- Query Store: Historical Query Performance (SQL Server 2016+)
-- ----------------------------------------------------------------

-- Enable Query Store on a database
ALTER DATABASE YourDatabase
SET QUERY_STORE = ON (
    OPERATION_MODE = READ_WRITE,
    CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30),
    DATA_FLUSH_INTERVAL_SECONDS = 900,
    INTERVAL_LENGTH_MINUTES = 60,
    MAX_STORAGE_SIZE_MB = 1000,
    QUERY_CAPTURE_MODE = AUTO,
    SIZE_BASED_CLEANUP_MODE = AUTO
);

-- Top resource-consuming queries from Query Store
SELECT TOP 20
    q.query_id,
    qt.query_sql_text,
    SUM(rs.count_executions) AS total_executions,
    SUM(rs.avg_duration * rs.count_executions) / 1000000.0 AS total_duration_sec,
    AVG(rs.avg_duration) / 1000.0 AS avg_duration_ms,
    SUM(rs.avg_cpu_time * rs.count_executions) / 1000000.0 AS total_cpu_sec,
    SUM(rs.avg_logical_io_reads * rs.count_executions) AS total_logical_reads,
    AVG(rs.avg_logical_io_reads) AS avg_logical_reads
FROM sys.query_store_query q
JOIN sys.query_store_query_text qt ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan p ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id
GROUP BY q.query_id, qt.query_sql_text
ORDER BY total_cpu_sec DESC;
```

SQL Server Profiler (the GUI tool) uses SQL Trace infrastructure with significant overhead—typically 15-30% performance impact. Never use it on production servers. Extended Events provides equivalent functionality with 1-3% overhead. SQL Profiler should only be used in development or for brief diagnostic captures.
Oracle Database provides the most comprehensive profiling infrastructure among major databases—a reflection of its enterprise heritage and performance engineering focus. Understanding this ecosystem enables deep performance analysis for Oracle deployments.
Automatic Workload Repository (AWR) and Active Session History (ASH) form the foundation of Oracle's performance monitoring architecture.
```sql
-- ================================================================
-- Oracle Database: Comprehensive Profiling Infrastructure
-- ================================================================

-- ----------------------------------------------------------------
-- V$SQL: Real-Time SQL Statistics
-- ----------------------------------------------------------------

-- Top SQL by CPU consumption
SELECT
    sql_id,
    child_number,
    plan_hash_value,
    executions,
    ROUND(cpu_time / 1000000, 2) AS cpu_seconds,
    ROUND(elapsed_time / 1000000, 2) AS elapsed_seconds,
    buffer_gets,
    disk_reads,
    rows_processed,
    ROUND(buffer_gets / NULLIF(executions, 0)) AS buffer_gets_per_exec,
    ROUND(cpu_time / NULLIF(executions, 0) / 1000) AS cpu_ms_per_exec,
    SUBSTR(sql_text, 1, 200) AS sql_preview
FROM v$sql
WHERE executions > 0
ORDER BY cpu_time DESC
FETCH FIRST 20 ROWS ONLY;

-- ----------------------------------------------------------------
-- V$SQL_PLAN_STATISTICS_ALL: Operator-Level Profiling
-- ----------------------------------------------------------------

-- View actual row counts vs estimates per plan operator
-- (Requires STATISTICS_LEVEL=ALL or gather_plan_statistics hint)
SELECT
    id,
    parent_id,
    operation || ' ' || options AS operation,
    object_name,
    cardinality AS estimated_rows,
    output_rows AS actual_rows,
    CASE
        WHEN output_rows > cardinality * 10 THEN '⚠️ UNDERESTIMATE'
        WHEN output_rows < cardinality / 10 THEN '⚠️ OVERESTIMATE'
        ELSE '✓'
    END AS estimate_accuracy,
    ROUND(elapsed_time / 1000) AS elapsed_ms,
    starts,
    buffer_gets,
    disk_reads
FROM v$sql_plan_statistics_all
WHERE sql_id = '&sql_id'
  AND child_number = 0
ORDER BY id;

-- ----------------------------------------------------------------
-- DBMS_XPLAN: Display Execution Plan with Statistics
-- ----------------------------------------------------------------

-- Explain plan with actual statistics
SELECT *
FROM TABLE(
    DBMS_XPLAN.DISPLAY_CURSOR(
        sql_id          => '&sql_id',
        cursor_child_no => 0,
        format          => 'ALLSTATS LAST OUTLINE +PREDICATE +COST'
    )
);

/*
Sample DBMS_XPLAN output with statistics:

-------------------------------------------------------------------------
| Id | Operation          | Name   | Starts | E-Rows | A-Rows | Buffers |
-------------------------------------------------------------------------
|  0 | SELECT STATEMENT   |        |      1 |        |    100 |    2500 |
|  1 |  SORT ORDER BY     |        |      1 |    100 |    100 |    2500 |
|* 2 |   HASH JOIN        |        |      1 |    100 |    100 |    2500 |
|  3 |    TABLE ACCESS... | CUST   |      1 |   1000 |   1000 |     250 |
|* 4 |    TABLE ACCESS... | ORDERS |      1 |   100K | 100000 |    2250 |
-------------------------------------------------------------------------

Key columns:
- Starts:  Times operator invoked
- E-Rows:  Estimated rows (optimizer prediction)
- A-Rows:  Actual rows (measured)
- Buffers: Buffer gets (logical I/O)
*/

-- ----------------------------------------------------------------
-- Active Session History (ASH): Session-Level Wait Analysis
-- ----------------------------------------------------------------

-- Find what a slow session was doing
SELECT
    sample_time,
    session_id,
    session_serial#,
    event,
    wait_class,
    sql_id,
    sql_opname,
    ROUND(time_waited / 1000) AS wait_ms,
    blocking_session,
    module,
    action
FROM v$active_session_history
WHERE sample_time > SYSDATE - INTERVAL '30' MINUTE
  AND session_id = &session_id
ORDER BY sample_time DESC;

-- Top wait events across all sessions
SELECT
    event,
    wait_class,
    COUNT(*) AS sample_count,
    ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER(), 2) AS pct_of_db_time,
    MIN(sample_time) AS first_seen,
    MAX(sample_time) AS last_seen
FROM v$active_session_history
WHERE sample_time > SYSDATE - INTERVAL '1' HOUR
  AND wait_class != 'Idle'
GROUP BY event, wait_class
ORDER BY sample_count DESC
FETCH FIRST 20 ROWS ONLY;

-- ----------------------------------------------------------------
-- SQL Monitoring: Real-Time Operations Profiling
-- ----------------------------------------------------------------

-- View actively monitored long-running operations
SELECT
    sql_id,
    status,
    sid,
    sql_text,
    ROUND(elapsed_time / 1000000, 2) AS elapsed_sec,
    ROUND(cpu_time / 1000000, 2) AS cpu_sec,
    buffer_gets,
    disk_reads,
    -- logical reads per physical read (not a true cache-hit ratio)
    ROUND(buffer_gets / NULLIF(disk_reads, 0), 0) AS gets_per_disk_read
FROM v$sql_monitor
WHERE status IN ('EXECUTING', 'DONE')
ORDER BY elapsed_time DESC
FETCH FIRST 20 ROWS ONLY;

-- Generate detailed SQL Monitor report
SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR(
    sql_id       => '&sql_id',
    type         => 'HTML',
    report_level => 'ALL'  -- BASIC, TYPICAL, ALL
) AS report
FROM dual;
```

SQL Monitoring automatically activates for queries expected to run >5 seconds or those using parallelism. It provides real-time visibility into execution progress, including completion percentage, time distribution, and operator-level statistics—invaluable for diagnosing long-running queries without waiting for completion.
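Tying the Oracle pieces together for a single ad-hoc query, a common pattern is the gather_plan_statistics hint followed immediately by DBMS_XPLAN.DISPLAY_CURSOR; the table and predicate below are placeholders for your own query:

```sql
-- Step 1: run the query with row-source statistics collection enabled
SELECT /*+ gather_plan_statistics */ *
FROM orders
WHERE status = 'PENDING';

-- Step 2: display the plan of the last statement executed in this session,
-- with actual (A-Rows) vs estimated (E-Rows) counts per operator
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Passing NULL for sql_id and child number tells DISPLAY_CURSOR to report on the session's most recent cursor, which avoids hunting for the sql_id by hand.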
While each database system implements profiling differently, common analytical patterns apply universally. Mastering these patterns enables effective performance analysis regardless of platform.
| Analysis Need | MySQL | PostgreSQL | SQL Server | Oracle |
|---|---|---|---|---|
| Single query deep dive | EXPLAIN ANALYZE + Performance Schema stages | EXPLAIN (ANALYZE, BUFFERS) | Actual Execution Plan + SET STATISTICS IO ON | DBMS_XPLAN.DISPLAY_CURSOR + SQL Monitor |
| Aggregate workload analysis | sys.statement_analysis, Performance Schema digests | pg_stat_statements | Query Store, sys.dm_exec_query_stats | AWR, V$SQL |
| Real-time monitoring | Performance Schema events_current | pg_stat_activity | sys.dm_exec_requests | V$SESSION, V$SQL_MONITOR |
| Wait/blocking analysis | Performance Schema events_waits_* | pg_stat_activity.wait_event | Extended Events, sys.dm_os_wait_stats | ASH, V$SESSION_WAIT |
| Historical trending | Events tables with timestamp filtering | pg_stat_statements (with manual snapshots) | Query Store | AWR snapshots |
Query profiler tools transform performance analysis from guesswork into engineering. By instrumenting the database engine and exposing detailed execution metrics, they enable precise diagnosis of performance issues that would otherwise require hours of speculation.
What's next:
Now that you understand the profiling tools available, the next page examines execution statistics in depth—the specific metrics profilers collect, what each metric reveals about query behavior, and how to interpret them to identify performance bottlenecks.
You now understand the architecture and capabilities of query profiler tools across major database systems. You can activate profiling, navigate the relevant system views and commands, and recognize which tools apply to different analytical scenarios. This foundation enables the detailed execution statistics analysis covered next.