When a SQL query runs slower than expected, the natural instinct is to stare at the query text, hoping to spot the problem through sheer intuition. This approach occasionally works for simple cases, but it fundamentally cannot scale to complex queries, production workloads, or subtle performance regressions.
Query profiler tools represent the systematic alternative—they instrument the database engine itself, capturing detailed telemetry about what actually happens during query execution. Rather than guessing where time is spent, you measure with precision. Rather than hypothesizing about resource consumption, you observe directly.
The difference between a developer who thinks a query is slow because of a missing index and one who can definitively demonstrate that 72% of execution time is spent on a sequential scan of a specific table is the difference between software development and software engineering. Profiler tools bridge that gap.
By the end of this page, you will understand the architecture and capabilities of query profiler tools across major database systems. You'll learn to activate, configure, and interpret profiling output—transforming raw performance data into actionable optimization insights. This foundation enables every subsequent technique in the query profiling discipline.
Query profiling is the practice of instrumenting SQL statement execution to collect detailed metrics about resource consumption, timing, and internal operations. Unlike EXPLAIN plans which predict behavior, profilers measure actual execution—capturing what truly happened when your query ran.
This distinction is crucial. EXPLAIN provides the optimizer's plan—its best prediction of how a query will execute. Profiling provides reality—what actually occurred, including time spent, rows processed, and resources consumed. Plans can be wrong; profiling data reflects truth.
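To make the prediction-versus-reality distinction concrete, here is a hedged, PostgreSQL-flavored sketch; the table name and every number are invented purely for contrast:

```sql
-- Hypothetical illustration: prediction vs. measurement
-- (table name and all figures are made up)

-- EXPLAIN alone returns only the optimizer's predictions; nothing executes:
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
--   Seq Scan on orders  (cost=0.00..1500.00 rows=200 width=64)
--                                  ^^^^^^^^ estimate only

-- EXPLAIN ANALYZE executes the query and appends measured reality:
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';
--   Seq Scan on orders  (cost=0.00..1500.00 rows=200 width=64)
--                       (actual time=0.021..48.337 rows=18750 loops=1)
--                                        measured: the row estimate
--                                        was off by nearly 100x
```

When the `actual` figures diverge this far from the estimates, the optimizer was working from bad assumptions, and only profiling reveals it.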
Profiling operates at multiple granularities: statement-level (entire query), operator-level (individual plan nodes), and system-level (database-wide aggregates). Effective performance analysis requires navigating between these levels—zooming out to identify problematic queries, then drilling down to isolate specific operators causing issues.
Why profiling matters more than intuition:
Consider a query joining five tables that takes 3 seconds to execute. Without profiling, you might spend hours optimizing the 'obvious' problem—perhaps adding indexes to the largest table. With profiling, you might discover in seconds that 2.8 of those 3 seconds are spent waiting for a lock held by a different transaction, and the query itself is highly efficient.
Profilers prevent the most common performance optimization failure: solving the wrong problem. They ensure your optimization effort targets genuine bottlenecks, not perceived ones.
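The lock-wait scenario above can be confirmed in seconds rather than guessed at. As one hedged example, MySQL's sys schema (5.7+) exposes a ready-made view of blocked sessions:

```sql
-- Who is blocked, who is blocking, and for how long
-- (sys schema view over InnoDB lock instrumentation)
SELECT
    waiting_pid,     -- processlist id of the session everyone assumed was "slow"
    waiting_query,
    blocking_pid,    -- the session actually holding the lock
    blocking_query,
    wait_age         -- how long the waiter has been blocked
FROM sys.innodb_lock_waits;
```

If this view returns a row for your "slow" query, the bottleneck is contention, not the query plan, and no amount of index tuning will fix it.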
Understanding how profilers work internally helps you interpret their output correctly and configure them appropriately. Most database profilers share a common architectural pattern with three key components:
1. Instrumentation Points
The database engine contains numerous instrumentation points—code locations where profiling data is captured. These include parser and compiler events, optimizer decisions, per-operator execution counters (rows examined, rows returned), buffer and disk I/O, lock acquisition, and wait events.
Each instrumentation point records timestamps, counters, or state snapshots. The density of instrumentation varies by database system and configuration—more instrumentation provides more detail but incurs overhead.
```sql
-- Conceptual view of instrumentation flow during query execution
-- (This illustrates the internal process, not actual executable SQL)

/*
Query Lifecycle with Profiling Instrumentation:

  CLIENT SENDS QUERY
        │
        ▼
  PHASE 1: PARSING
    📊 Instrumentation:
       • parse_start_time = NOW()
       • token_count++
       • syntax_tree_nodes++
       • parse_end_time = NOW()
        │
        ▼
  PHASE 2: OPTIMIZATION
    📊 Instrumentation:
       • plans_evaluated++
       • cardinality_estimates[]
       • cost_calculations[]
       • optimization_elapsed_ms
        │
        ▼
  PHASE 3: EXECUTION
    📊 Instrumentation (per operator):
       • rows_examined++
       • rows_returned++
       • buffer_reads++
       • buffer_hits++
       • temp_space_used++
       • operator_elapsed_ns
        │
        ▼
  PHASE 4: RESULT TRANSMISSION
    📊 Instrumentation:
       • bytes_sent++
       • network_round_trips++
       • client_wait_time_ms
*/
```

2. Collection Infrastructure
Instrumentation points feed data into collection infrastructure—typically memory buffers, ring buffers, or shared memory segments designed for minimal-overhead data capture. Key design considerations include keeping the capture path cheap enough to run on every event, bounding memory consumption (older events are overwritten rather than accumulated), and allowing safe concurrent writes from many sessions.
3. Access Interfaces
Collected profiling data is exposed through various interfaces: SQL-queryable system views (MySQL's Performance Schema tables, PostgreSQL's pg_stat_* views, SQL Server's DMVs, Oracle's V$ views), session-level commands such as EXPLAIN ANALYZE, and report-generation APIs for offline analysis.
Profiling is not free. Instrumentation adds overhead—typically 2-15% depending on granularity. High-resolution profiling (timing every row processed) can add 30%+ overhead. For production analysis, use session-based profiling on specific connections rather than database-wide instrumentation. Never leave detailed profiling enabled continuously in production.
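One way to follow that advice in MySQL is to scope Performance Schema instrumentation to specific accounts rather than every connection. A sketch using the setup_actors table—the user name here is a placeholder for the account you want to investigate:

```sql
-- Disable instrumentation for new foreground threads by default
UPDATE performance_schema.setup_actors
SET ENABLED = 'NO', HISTORY = 'NO'
WHERE HOST = '%' AND USER = '%';

-- Re-enable it only for the account under investigation
-- ('app_readonly' is a hypothetical user name)
INSERT INTO performance_schema.setup_actors
    (HOST, USER, ROLE, ENABLED, HISTORY)
VALUES ('%', 'app_readonly', '%', 'YES', 'YES');
```

Connections from other accounts then pay essentially no statement-instrumentation cost, while the targeted account is profiled in full.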
MySQL provides a rich ecosystem of profiling capabilities that evolved significantly across versions. Understanding the available tools and their appropriate use cases is essential for MySQL performance analysis.
Performance Schema is MySQL's primary performance monitoring infrastructure—a comprehensive instrumentation framework built into the server. Unlike deprecated alternatives, Performance Schema is designed for production use with configurable overhead.
```sql
-- ================================================================
-- MySQL Performance Schema: Core Profiling Infrastructure
-- ================================================================

-- Check if Performance Schema is enabled
SHOW VARIABLES LIKE 'performance_schema';
-- Result: Should show 'ON' for profiling capabilities

-- ----------------------------------------------------------------
-- Configure statement instrumentation
-- ----------------------------------------------------------------

-- Enable statement event consumption
UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE 'events_statements%';

-- Enable statement instrumentation
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'statement/%';

-- ----------------------------------------------------------------
-- Query recently executed statements with timing
-- ----------------------------------------------------------------

-- View current session's last statement details
-- (TIMER_* and LOCK_TIME are reported in picoseconds)
SELECT
    THREAD_ID,
    EVENT_ID,
    SQL_TEXT,
    DIGEST_TEXT,
    TIMER_WAIT / 1000000000000 AS elapsed_seconds,
    TIMER_START / 1000000000000 AS start_seconds,
    LOCK_TIME / 1000000000000 AS lock_seconds,
    ROWS_EXAMINED,
    ROWS_SENT,
    ROWS_AFFECTED,
    CREATED_TMP_TABLES,
    CREATED_TMP_DISK_TABLES,
    SELECT_FULL_JOIN,
    SELECT_SCAN,
    SORT_ROWS,
    NO_INDEX_USED,
    NO_GOOD_INDEX_USED
FROM performance_schema.events_statements_current
WHERE THREAD_ID = (
    SELECT THREAD_ID
    FROM performance_schema.threads
    WHERE PROCESSLIST_ID = CONNECTION_ID()
);

-- View statement history for current thread
SELECT
    SQL_TEXT,
    TIMER_WAIT / 1000000000 AS elapsed_ms,   -- picoseconds to milliseconds
    ROWS_EXAMINED,
    ROWS_SENT,
    CREATED_TMP_TABLES,
    CREATED_TMP_DISK_TABLES,
    NO_INDEX_USED
FROM performance_schema.events_statements_history
WHERE THREAD_ID = (
    SELECT THREAD_ID
    FROM performance_schema.threads
    WHERE PROCESSLIST_ID = CONNECTION_ID()
)
ORDER BY EVENT_ID DESC
LIMIT 10;
```

Key Performance Schema Tables for Query Analysis:
| Table | Purpose | Key Metrics |
|---|---|---|
| events_statements_current | Currently executing statements | Real-time execution state |
| events_statements_history | Recent statements per thread | Per-session query history |
| events_statements_history_long | Recent statements globally | Cross-session query history |
| events_statements_summary_by_digest | Aggregated by normalized query | Pattern-based analysis |
| events_stages_current | Current execution stages | Phase-level timing |
| events_waits_current | Current wait events | Resource contention details |
```sql
-- ================================================================
-- Stage-Level Profiling: Understanding Execution Phases
-- ================================================================

-- Enable stage instrumentation for detailed phase timing
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME LIKE 'stage/%';

UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME LIKE 'events_stages%';

-- Execute a query to profile

-- Then view stage breakdown for the statement
SELECT
    EVENT_NAME AS stage_name,
    TIMER_WAIT / 1000000000 AS duration_ms,   -- picoseconds to milliseconds
    NESTING_EVENT_ID
FROM performance_schema.events_stages_history
WHERE NESTING_EVENT_ID = (
    SELECT EVENT_ID
    FROM performance_schema.events_statements_history
    WHERE THREAD_ID = (
        SELECT THREAD_ID
        FROM performance_schema.threads
        WHERE PROCESSLIST_ID = CONNECTION_ID()
    )
    ORDER BY EVENT_ID DESC
    LIMIT 1
)
ORDER BY EVENT_ID;

-- Sample output understanding:
-- ┌──────────────────────────────────────┬─────────────┐
-- │ stage_name                           │ duration_ms │
-- ├──────────────────────────────────────┼─────────────┤
-- │ stage/sql/starting                   │       0.025 │
-- │ stage/sql/checking permissions       │       0.008 │
-- │ stage/sql/Opening tables             │       0.145 │
-- │ stage/sql/init                       │       0.012 │
-- │ stage/sql/System lock                │       0.005 │
-- │ stage/sql/optimizing                 │       0.089 │
-- │ stage/sql/statistics                 │      12.450 │ ← Cardinality estimation
-- │ stage/sql/preparing                  │       0.034 │
-- │ stage/sql/executing                  │     847.230 │ ← Actual row processing
-- │ stage/sql/Sending data               │       2.340 │
-- │ stage/sql/end                        │       0.003 │
-- │ stage/sql/query end                  │       0.004 │
-- │ stage/sql/closing tables             │       0.018 │
-- │ stage/sql/freeing items              │       0.012 │
-- │ stage/sql/cleaning up                │       0.002 │
-- └──────────────────────────────────────┴─────────────┘

-- ================================================================
-- Sys Schema: Simplified Performance Analysis Views
-- ================================================================

-- The sys schema provides human-readable views over Performance Schema

-- Top queries by total execution time
SELECT * FROM sys.statements_with_runtimes_in_95th_percentile;

-- Queries with full table scans
SELECT * FROM sys.statements_with_full_table_scans
ORDER BY exec_count DESC
LIMIT 10;

-- Queries sorted by total latency
SELECT
    query,
    db,
    exec_count,
    total_latency,
    avg_latency,
    rows_sent_avg,
    rows_examined_avg
FROM sys.statement_analysis
ORDER BY total_latency DESC
LIMIT 10;

-- View I/O statistics per table
SELECT * FROM sys.schema_table_statistics
WHERE table_schema = 'your_database'
ORDER BY total_latency DESC;
```

MySQL's SHOW PROFILE command has been deprecated since MySQL 5.6.7 and should not be used for new development. It remains available, but Performance Schema provides superior capabilities with production-appropriate overhead. Migrate existing SHOW PROFILE workflows to Performance Schema queries for future compatibility.
PostgreSQL takes a different architectural approach to query profiling—integrating timing information directly into EXPLAIN output rather than maintaining separate profiling infrastructure. This design provides operator-level detail without requiring separate configuration or memory allocation.
EXPLAIN (ANALYZE) is PostgreSQL's primary query profiling mechanism, combining execution plan display with actual runtime measurements.
```sql
-- ================================================================
-- PostgreSQL EXPLAIN ANALYZE: Integrated Query Profiling
-- ================================================================

-- Basic EXPLAIN ANALYZE: Plan + Actual Execution Metrics
EXPLAIN ANALYZE
SELECT
    c.customer_name,
    COUNT(o.order_id) AS order_count,
    SUM(o.total_amount) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE c.created_at >= CURRENT_DATE - INTERVAL '1 year'
GROUP BY c.customer_id, c.customer_name
HAVING COUNT(o.order_id) > 5
ORDER BY total_spent DESC
LIMIT 100;

/*
Output with ANALYZE includes actual timing:

Hash Join (cost=312.50..1245.80 rows=500 width=68)
          (actual time=15.234..89.456 rows=487 loops=1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ACTUAL execution metrics
  Hash Cond: (o.customer_id = c.customer_id)
  -> Seq Scan on orders o (cost=0.00..850.00 rows=25000 width=24)
                          (actual time=0.015..35.234 rows=24856 loops=1)
                                            ^^^^^^^^^^ Actual rows vs estimated
  -> Hash (cost=250.00..250.00 rows=5000 width=44)
          (actual time=12.890..12.890 rows=4892 loops=1)
     -> Seq Scan on customers c (cost=0.00..250.00 rows=5000 width=44)
                                (actual time=0.008..8.234 rows=4892 loops=1)
        Filter: (created_at >= (CURRENT_DATE - '1 year'::interval))
        Rows Removed by Filter: 1108
                                ^^^^ Rows filtered (not returned)
*/

-- ================================================================
-- EXPLAIN Options for Comprehensive Profiling
-- ================================================================

-- Full profiling with all available metrics
EXPLAIN (
    ANALYZE,     -- Actually execute the query (required for actual metrics)
    BUFFERS,     -- Show buffer/page access statistics
    TIMING,      -- Show timing information (can be disabled for less overhead)
    VERBOSE,     -- Show additional detail (output columns, schema names)
    FORMAT JSON  -- Machine-parseable output format
)
SELECT product_id, SUM(quantity)
FROM order_items
GROUP BY product_id
HAVING SUM(quantity) > 100;

/*
Key metrics from BUFFERS option:

Seq Scan on order_items (cost=0.00..4000.00 rows=200000 width=12)
                        (actual time=0.015..125.890 rows=198500 loops=1)
  Buffers: shared hit=1234 read=890
                  ^^^^^^^^     ^^^^
                  Cache hits   Physical disk reads
                  (fast)       (slow - disk I/O)

When shared read >> shared hit: Table not cached, I/O bottleneck
When shared hit >> shared read: Well-cached, CPU-bound execution
*/

-- ================================================================
-- Understanding Key Metrics in EXPLAIN ANALYZE Output
-- ================================================================

/*
CRITICAL METRICS TO ANALYZE:

1. ACTUAL TIME (startup..total)
   - startup: Time to return first row
   - total: Time to return all rows
   - "actual time=0.015..125.890" means 0.015ms to first row, 125ms total

2. ROWS (actual vs estimated)
   - Significant mismatch indicates stale statistics or complex predicates
   - "rows=5000" (estimated) vs "rows=4892" (actual) = good estimate
   - "rows=100" (estimated) vs "rows=45000" (actual) = BAD, reanalyze table

3. LOOPS
   - Number of times operator executed (nested loop iterations)
   - Total time = (actual time) × loops
   - "loops=1000" with "actual time=0.5ms" = 500ms total contribution

4. BUFFERS
   - shared hit: Pages read from buffer cache (fast)
   - shared read: Pages read from disk (slow)
   - temp read/written: Pages in temporary files (very slow)

5. ROWS REMOVED BY FILTER
   - Rows examined but not returned
   - High value suggests potential index opportunity
*/

-- Practical example with buffer analysis
EXPLAIN (ANALYZE, BUFFERS, TIMING)
SELECT *
FROM large_table
WHERE indexed_column = 42
  AND unindexed_column = 'value';

/*
Output analysis:

Index Scan using idx_indexed_column on large_table
    (actual time=0.045..125.890 rows=50 loops=1)
  Index Cond: (indexed_column = 42)
  Filter: (unindexed_column = 'value')
  Rows Removed by Filter: 9950
                          ^^^^ High filter removal = index found 10000 rows,
                               but only 50 matched
  SOLUTION: Consider composite index (indexed_column, unindexed_column)

  Buffers: shared hit=150 read=25
           ^^^^^^^^^^^^^^^^ 175 total pages accessed; 25 required disk I/O
*/
```

pg_stat_statements: Aggregate Query Statistics
While EXPLAIN ANALYZE profiles individual query executions, pg_stat_statements provides aggregate statistics across all executions of a query pattern—essential for identifying which queries consume the most resources over time.
```sql
-- ================================================================
-- pg_stat_statements: Workload-Level Query Profiling
-- ================================================================

-- Enable the extension (requires superuser, typically in postgresql.conf)
-- shared_preload_libraries = 'pg_stat_statements'
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- View parameters controlling statistics collection
SHOW pg_stat_statements.max;    -- Maximum tracked statements
SHOW pg_stat_statements.track;  -- What to track: top, all, none

-- ================================================================
-- Finding Resource-Intensive Queries
-- ================================================================

-- Top queries by total execution time
SELECT
    calls,
    total_exec_time::NUMERIC(12,2) AS total_time_ms,
    mean_exec_time::NUMERIC(12,2) AS avg_time_ms,
    stddev_exec_time::NUMERIC(12,2) AS stddev_ms,
    rows,
    shared_blks_hit,
    shared_blks_read,
    ROUND(100.0 * shared_blks_hit /
          NULLIF(shared_blks_hit + shared_blks_read, 0), 2) AS cache_hit_pct,
    LEFT(query, 100) AS query_preview
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

-- Top queries by I/O (disk reads)
SELECT
    calls,
    shared_blks_read AS disk_reads,
    shared_blks_hit AS cache_hits,
    rows,
    total_exec_time::NUMERIC(10,2) AS total_ms,
    LEFT(query, 100) AS query_preview
FROM pg_stat_statements
WHERE shared_blks_read > 0
ORDER BY shared_blks_read DESC
LIMIT 20;

-- Queries with high variability (inconsistent performance)
SELECT
    calls,
    mean_exec_time::NUMERIC(10,2) AS avg_ms,
    stddev_exec_time::NUMERIC(10,2) AS stddev_ms,
    min_exec_time::NUMERIC(10,2) AS min_ms,
    max_exec_time::NUMERIC(10,2) AS max_ms,
    (max_exec_time / NULLIF(min_exec_time, 0))::NUMERIC(10,1) AS variance_ratio,
    LEFT(query, 80) AS query_preview
FROM pg_stat_statements
WHERE calls > 100           -- Sufficient sample size
  AND mean_exec_time > 10   -- Non-trivial queries
ORDER BY stddev_exec_time DESC
LIMIT 20;

-- Reset statistics (typically after optimization round)
SELECT pg_stat_statements_reset();

-- ================================================================
-- Normalized Query Analysis
-- ================================================================

/*
pg_stat_statements normalizes queries by replacing literal values:

Original queries:
    SELECT * FROM users WHERE id = 123;
    SELECT * FROM users WHERE id = 456;
    SELECT * FROM users WHERE id = 789;

Normalized as:
    SELECT * FROM users WHERE id = $1;

This groups equivalent queries for aggregate analysis,
revealing which query PATTERNS consume resources—not just
which specific parameter values are slow.
*/

-- View query patterns with parameter placeholders
SELECT queryid, query, calls, mean_exec_time
FROM pg_stat_statements
WHERE query LIKE '%users%'
ORDER BY total_exec_time DESC;
```

Use pg_stat_statements to identify which queries consume the most resources across your workload. Then use EXPLAIN ANALYZE on those specific queries to understand why they're expensive and how to optimize them. This two-phase approach ensures you optimize the right queries.
Microsoft SQL Server provides an extensive profiling ecosystem that has evolved from trace-based approaches to lightweight DMV-based analysis. Understanding these tools and their appropriate use cases is essential for SQL Server performance work.
Extended Events represent SQL Server's modern profiling infrastructure—a flexible, low-overhead event capture system that replaced SQL Trace and Profiler for production use.
```sql
-- ================================================================
-- SQL Server Extended Events: Modern Query Profiling
-- ================================================================

-- Create an Extended Events session for query profiling
CREATE EVENT SESSION QueryProfiling ON SERVER
ADD EVENT sqlserver.sql_statement_completed (
    ACTION (
        sqlserver.sql_text,
        sqlserver.query_hash,
        sqlserver.query_plan_hash,
        sqlserver.database_name,
        sqlserver.username,
        sqlserver.client_app_name
    )
    WHERE (
        duration > 1000000  -- Queries taking > 1 second (microseconds)
        AND sqlserver.database_name = N'YourDatabase'
    )
),
ADD EVENT sqlserver.rpc_completed (
    ACTION (
        sqlserver.sql_text,
        sqlserver.query_hash,
        sqlserver.database_name
    )
    WHERE (
        duration > 1000000
    )
)
ADD TARGET package0.ring_buffer (
    SET max_memory = 4096  -- KB
)
WITH (
    MAX_MEMORY = 4096 KB,
    EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY = 5 SECONDS,
    STARTUP_STATE = OFF
);

-- Start the session
ALTER EVENT SESSION QueryProfiling ON SERVER STATE = START;

-- View captured events
SELECT
    event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    event_data.value('(event/@timestamp)[1]', 'datetime2') AS event_time,
    event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint')
        / 1000000.0 AS duration_seconds,
    event_data.value('(event/data[@name="cpu_time"]/value)[1]', 'bigint')
        / 1000000.0 AS cpu_seconds,
    event_data.value('(event/data[@name="logical_reads"]/value)[1]', 'bigint') AS logical_reads,
    event_data.value('(event/data[@name="physical_reads"]/value)[1]', 'bigint') AS physical_reads,
    event_data.value('(event/data[@name="writes"]/value)[1]', 'bigint') AS writes,
    event_data.value('(event/data[@name="row_count"]/value)[1]', 'bigint') AS row_count,
    event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS sql_text
FROM (
    SELECT CAST(target_data AS XML) AS target_xml
    FROM sys.dm_xe_session_targets xst
    JOIN sys.dm_xe_sessions xs ON xs.address = xst.event_session_address
    WHERE xs.name = 'QueryProfiling'
) AS raw_data
CROSS APPLY target_xml.nodes('RingBufferTarget/event') AS events(event_data)
ORDER BY event_time DESC;

-- Stop and drop session when done
ALTER EVENT SESSION QueryProfiling ON SERVER STATE = STOP;
DROP EVENT SESSION QueryProfiling ON SERVER;
```

Dynamic Management Views (DMVs) provide real-time and historical query statistics without explicit session configuration—ideal for ad-hoc investigation and continuous monitoring.
```sql
-- ================================================================
-- SQL Server DMV-Based Query Profiling
-- ================================================================

-- ----------------------------------------------------------------
-- sys.dm_exec_query_stats: Aggregated Query Statistics
-- ----------------------------------------------------------------

-- Top queries by total CPU time
SELECT TOP 20
    qs.total_worker_time / 1000000.0 AS total_cpu_seconds,
    qs.execution_count,
    qs.total_worker_time / qs.execution_count / 1000.0 AS avg_cpu_ms,
    qs.total_elapsed_time / 1000000.0 AS total_elapsed_seconds,
    qs.total_logical_reads,
    qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
    qs.total_physical_reads,
    qs.last_execution_time,
    SUBSTRING(st.text,
        (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(st.text)
            ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1
    ) AS query_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY qs.total_worker_time DESC;

-- Top queries by logical reads (memory pressure indicator)
SELECT TOP 20
    qs.total_logical_reads,
    qs.execution_count,
    qs.total_logical_reads / qs.execution_count AS avg_reads_per_exec,
    qs.total_worker_time / 1000000.0 AS total_cpu_sec,
    qs.total_elapsed_time / 1000000.0 AS total_elapsed_sec,
    SUBSTRING(st.text,
        (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
            WHEN -1 THEN DATALENGTH(st.text)
            ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1
    ) AS query_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY qs.total_logical_reads DESC;

-- ----------------------------------------------------------------
-- sys.dm_exec_requests: Currently Executing Queries
-- ----------------------------------------------------------------

-- View all currently executing queries with performance metrics
SELECT
    r.session_id,
    r.status,
    r.command,
    r.wait_type,
    r.wait_time,
    r.blocking_session_id,
    r.cpu_time,
    r.total_elapsed_time / 1000.0 AS elapsed_seconds,
    r.logical_reads,
    r.reads AS physical_reads,
    r.writes,
    r.row_count,
    r.percent_complete,
    s.login_name,
    s.host_name,
    s.program_name,
    db.name AS database_name,
    SUBSTRING(st.text,
        (r.statement_start_offset / 2) + 1,
        ((CASE r.statement_end_offset
            WHEN -1 THEN DATALENGTH(st.text)
            ELSE r.statement_end_offset
          END - r.statement_start_offset) / 2) + 1
    ) AS current_statement,
    qp.query_plan
FROM sys.dm_exec_requests r
JOIN sys.dm_exec_sessions s ON r.session_id = s.session_id
LEFT JOIN sys.databases db ON r.database_id = db.database_id
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) st
CROSS APPLY sys.dm_exec_query_plan(r.plan_handle) qp
WHERE r.session_id > 50       -- Exclude system sessions
  AND r.session_id != @@SPID  -- Exclude current session
ORDER BY r.total_elapsed_time DESC;

-- ----------------------------------------------------------------
-- Query Store: Historical Query Performance (SQL Server 2016+)
-- ----------------------------------------------------------------

-- Enable Query Store on a database
ALTER DATABASE YourDatabase
SET QUERY_STORE = ON (
    OPERATION_MODE = READ_WRITE,
    CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30),
    DATA_FLUSH_INTERVAL_SECONDS = 900,
    INTERVAL_LENGTH_MINUTES = 60,
    MAX_STORAGE_SIZE_MB = 1000,
    QUERY_CAPTURE_MODE = AUTO,
    SIZE_BASED_CLEANUP_MODE = AUTO
);

-- Top resource-consuming queries from Query Store
SELECT TOP 20
    q.query_id,
    qt.query_sql_text,
    SUM(rs.count_executions) AS total_executions,
    SUM(rs.avg_duration * rs.count_executions) / 1000000.0 AS total_duration_sec,
    AVG(rs.avg_duration) / 1000.0 AS avg_duration_ms,
    SUM(rs.avg_cpu_time * rs.count_executions) / 1000000.0 AS total_cpu_sec,
    SUM(rs.avg_logical_io_reads * rs.count_executions) AS total_logical_reads,
    AVG(rs.avg_logical_io_reads) AS avg_logical_reads
FROM sys.query_store_query q
JOIN sys.query_store_query_text qt ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan p ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id
GROUP BY q.query_id, qt.query_sql_text
ORDER BY total_cpu_sec DESC;
```

SQL Server Profiler (the GUI tool) uses SQL Trace infrastructure with significant overhead—typically 15-30% performance impact. Never use it on production servers. Extended Events provides equivalent functionality with 1-3% overhead. SQL Profiler should only be used in development or for brief diagnostic captures.
Oracle Database provides the most comprehensive profiling infrastructure among major databases—a reflection of its enterprise heritage and performance engineering focus. Understanding this ecosystem enables deep performance analysis for Oracle deployments.
Automatic Workload Repository (AWR) and Active Session History (ASH) form the foundation of Oracle's performance monitoring architecture.
```sql
-- ================================================================
-- Oracle Database: Comprehensive Profiling Infrastructure
-- ================================================================

-- ----------------------------------------------------------------
-- V$SQL: Real-Time SQL Statistics
-- ----------------------------------------------------------------

-- Top SQL by CPU consumption
SELECT
    sql_id,
    child_number,
    plan_hash_value,
    executions,
    ROUND(cpu_time / 1000000, 2) AS cpu_seconds,
    ROUND(elapsed_time / 1000000, 2) AS elapsed_seconds,
    buffer_gets,
    disk_reads,
    rows_processed,
    ROUND(buffer_gets / NULLIF(executions, 0)) AS buffer_gets_per_exec,
    ROUND(cpu_time / NULLIF(executions, 0) / 1000) AS cpu_ms_per_exec,
    SUBSTR(sql_text, 1, 200) AS sql_preview
FROM v$sql
WHERE executions > 0
ORDER BY cpu_time DESC
FETCH FIRST 20 ROWS ONLY;

-- ----------------------------------------------------------------
-- V$SQL_PLAN_STATISTICS_ALL: Operator-Level Profiling
-- ----------------------------------------------------------------

-- View actual row counts vs estimates per plan operator
-- (Requires STATISTICS_LEVEL=ALL or gather_plan_statistics hint)
SELECT
    id,
    parent_id,
    operation || ' ' || options AS operation,
    object_name,
    cardinality AS estimated_rows,
    output_rows AS actual_rows,
    CASE
        WHEN output_rows > cardinality * 10 THEN '⚠️ UNDERESTIMATE'
        WHEN output_rows < cardinality / 10 THEN '⚠️ OVERESTIMATE'
        ELSE '✓'
    END AS estimate_accuracy,
    ROUND(elapsed_time / 1000) AS elapsed_ms,
    starts,
    buffer_gets,
    disk_reads
FROM v$sql_plan_statistics_all
WHERE sql_id = '&sql_id'
  AND child_number = 0
ORDER BY id;

-- ----------------------------------------------------------------
-- DBMS_XPLAN: Display Execution Plan with Statistics
-- ----------------------------------------------------------------

-- Explain plan with actual statistics
SELECT *
FROM TABLE(
    DBMS_XPLAN.DISPLAY_CURSOR(
        sql_id          => '&sql_id',
        cursor_child_no => 0,
        format          => 'ALLSTATS LAST OUTLINE +PREDICATE +COST'
    )
);

/*
Sample DBMS_XPLAN output with statistics:

-------------------------------------------------------------------------
| Id | Operation          | Name   | Starts | E-Rows | A-Rows | Buffers |
-------------------------------------------------------------------------
|  0 | SELECT STATEMENT   |        |      1 |        |    100 |    2500 |
|  1 |  SORT ORDER BY     |        |      1 |    100 |    100 |    2500 |
|* 2 |   HASH JOIN        |        |      1 |    100 |    100 |    2500 |
|  3 |    TABLE ACCESS... | CUST   |      1 |   1000 |   1000 |     250 |
|* 4 |    TABLE ACCESS... | ORDERS |      1 |   100K | 100000 |    2250 |
-------------------------------------------------------------------------

Key columns:
- Starts:  Times operator invoked
- E-Rows:  Estimated rows (optimizer prediction)
- A-Rows:  Actual rows (measured)
- Buffers: Buffer gets (logical I/O)
*/

-- ----------------------------------------------------------------
-- Active Session History (ASH): Session-Level Wait Analysis
-- ----------------------------------------------------------------

-- Find what a slow session was doing
SELECT
    sample_time,
    session_id,
    session_serial#,
    event,
    wait_class,
    sql_id,
    sql_opname,
    ROUND(time_waited / 1000) AS wait_ms,
    blocking_session,
    module,
    action
FROM v$active_session_history
WHERE sample_time > SYSDATE - INTERVAL '30' MINUTE
  AND session_id = &session_id
ORDER BY sample_time DESC;

-- Top wait events across all sessions
SELECT
    event,
    wait_class,
    COUNT(*) AS sample_count,
    ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER(), 2) AS pct_of_db_time,
    MIN(sample_time) AS first_seen,
    MAX(sample_time) AS last_seen
FROM v$active_session_history
WHERE sample_time > SYSDATE - INTERVAL '1' HOUR
  AND wait_class != 'Idle'
GROUP BY event, wait_class
ORDER BY sample_count DESC
FETCH FIRST 20 ROWS ONLY;

-- ----------------------------------------------------------------
-- SQL Monitoring: Real-Time Operations Profiling
-- ----------------------------------------------------------------

-- View actively monitored long-running operations
SELECT
    sql_id,
    status,
    sid,
    sql_text,
    ROUND(elapsed_time / 1000000, 2) AS elapsed_sec,
    ROUND(cpu_time / 1000000, 2) AS cpu_sec,
    buffer_gets,
    disk_reads,
    -- logical reads per physical read (not a true cache-hit ratio)
    ROUND(buffer_gets / NULLIF(disk_reads, 0), 0) AS gets_per_disk_read
FROM v$sql_monitor
WHERE status IN ('EXECUTING', 'DONE')
ORDER BY elapsed_time DESC
FETCH FIRST 20 ROWS ONLY;

-- Generate detailed SQL Monitor report
SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR(
    sql_id       => '&sql_id',
    type         => 'HTML',
    report_level => 'ALL'  -- BASIC, TYPICAL, ALL
) AS report
FROM dual;
```

SQL Monitoring automatically activates for queries expected to run >5 seconds or those using parallelism. It provides real-time visibility into execution progress, including completion percentage, time distribution, and operator-level statistics—invaluable for diagnosing long-running queries without waiting for completion.
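Tying the Oracle pieces together for a single ad-hoc query, a common pattern is the gather_plan_statistics hint followed immediately by DBMS_XPLAN.DISPLAY_CURSOR; the table and predicate below are placeholders for your own query:

```sql
-- Step 1: run the query with row-source statistics collection enabled
SELECT /*+ gather_plan_statistics */ *
FROM orders
WHERE status = 'PENDING';

-- Step 2: display the plan of the last statement executed in this session,
-- with actual (A-Rows) vs estimated (E-Rows) counts per operator
SELECT *
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Passing NULL for sql_id and child number tells DISPLAY_CURSOR to report on the session's most recent cursor, which avoids hunting for the sql_id by hand.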
While each database system implements profiling differently, common analytical patterns apply universally. Mastering these patterns enables effective performance analysis regardless of platform.
| Analysis Need | MySQL | PostgreSQL | SQL Server | Oracle |
|---|---|---|---|---|
| Single query deep dive | EXPLAIN ANALYZE + Performance Schema stages | EXPLAIN (ANALYZE, BUFFERS) | Actual Execution Plan + SET STATISTICS IO ON | DBMS_XPLAN.DISPLAY_CURSOR + SQL Monitor |
| Aggregate workload analysis | sys.statement_analysis, Performance Schema digests | pg_stat_statements | Query Store, sys.dm_exec_query_stats | AWR, V$SQL |
| Real-time monitoring | Performance Schema events_current | pg_stat_activity | sys.dm_exec_requests | V$SESSION, V$SQL_MONITOR |
| Wait/blocking analysis | Performance Schema events_waits_* | pg_stat_activity.wait_event | Extended Events, sys.dm_os_wait_stats | ASH, V$SESSION_WAIT |
| Historical trending | Events tables with timestamp filtering | pg_stat_statements (with manual snapshots) | Query Store | AWR snapshots |
Query profiler tools transform performance analysis from guesswork into engineering. By instrumenting the database engine and exposing detailed execution metrics, they enable precise diagnosis of performance issues that would otherwise require hours of speculation.
What's next:
Now that you understand the profiling tools available, the next page examines execution statistics in depth—the specific metrics profilers collect, what each metric reveals about query behavior, and how to interpret them to identify performance bottlenecks.
You now understand the architecture and capabilities of query profiler tools across major database systems. You can activate profiling, navigate the relevant system views and commands, and recognize which tools apply to different analytical scenarios. This foundation enables the detailed execution statistics analysis covered next.