We've established that query compilation and prepared statements eliminate redundant parsing and optimization within a session. But what happens when multiple users execute identical queries? When the database server restarts? When thousands of concurrent connections all need the same execution plan?
Query caching addresses these scenarios by storing and reusing query-related computations across a broader scope than individual sessions. Modern databases implement multiple layers of caching—from parsed ASTs to optimized plans to compiled code to actual result sets—each with distinct tradeoffs and use cases.
Query caching is not a single feature but an ecosystem of caching mechanisms that, when properly understood and configured, can transform database performance from acceptable to exceptional.
By the end of this page, you will understand the different types of query caches (plan cache, result cache, compiled code cache), how databases make cache lookup decisions, cache invalidation strategies, and how to monitor and tune cache behavior for optimal performance.
Database systems implement multiple distinct caches, each storing different artifacts of query processing. Understanding these cache types is essential for effective performance tuning.
1. Parse Cache / AST Cache
Stores the Abstract Syntax Tree produced by parsing. Eliminates lexical analysis and syntax parsing for repeated query text. Least significant benefit since parsing is typically fast.
2. Plan Cache / Query Plan Cache
Stores optimized execution plans keyed by query text (or a normalized form). This is the most impactful cache—optimization is expensive, and plans are reusable across sessions.
3. Compiled Code Cache
For databases using JIT compilation, stores the compiled machine code or bytecode. Avoids repeated compilation overhead.
4. Result Cache / Query Result Cache
Stores actual query results. Most dramatic performance benefit when data is unchanged, but invalidation is complex.
5. Buffer Pool / Page Cache
Not strictly a query cache, but caches data pages in memory. Query execution benefits from hot data being in memory.
| Cache Type | What's Stored | Key | Scope | Invalidation Complexity |
|---|---|---|---|---|
| Parse Cache | AST | Query text hash | Session/Global | Low (text changes) |
| Plan Cache | Execution plan | Normalized query + params | Global | Medium (schema/stats changes) |
| Compiled Code Cache | Machine code/bytecode | Plan hash | Global | Medium (plan changes) |
| Result Cache | Query output rows | Query + params | Global | High (data changes) |
| Buffer Pool | Data pages | Page ID | Global | Automatic (LRU) |
Think of these caches as a pyramid. At the bottom is the buffer pool (caching raw data), then result cache (caching query outputs), then plan cache (caching how to execute), then code cache (caching compiled execution), and finally parse cache (caching parsing). Each layer above depends on the layers below, and invalidation propagates upward.
Plan caching is the most critical query cache for OLTP workloads. Query optimization is computationally expensive—exploring join orders, evaluating access paths, estimating costs—and the same query shape often recurs with different parameter values.
How plan caching works:
1. Normalize the incoming query text (replace literals with placeholders, standardize whitespace and case).
2. Hash the normalized text to produce a cache key.
3. Look up the key in the shared plan cache; on a hit, reuse the stored plan directly.
4. On a miss, run the optimizer and insert the resulting plan for future executions.
5. Invalidate affected entries when schemas or statistics change.
Query normalization example:
-- These queries normalize to the same cache key:
SELECT * FROM users WHERE id = 42;
SELECT * FROM users WHERE id = 123;
SELECT * FROM users WHERE id = 999999;
-- Normalized form:
SELECT * FROM users WHERE id = $1;
```cpp
// Conceptual Plan Cache Implementation
#include <unordered_map>
#include <shared_mutex>
#include <memory>
#include <string>
#include <vector>
#include <atomic>
#include <algorithm>

struct ExecutionPlan {
    std::string normalized_query;
    std::vector<OperatorNode*> operators;   // OperatorNode defined elsewhere
    double estimated_cost;
    int64_t creation_time;
    std::atomic<int64_t> execution_count{0};
    std::atomic<int64_t> total_execution_time{0};
};

class PlanCache {
private:
    std::unordered_map<size_t, std::shared_ptr<ExecutionPlan>> cache_;
    mutable std::shared_mutex mutex_;   // Read-write lock
    size_t max_entries_;
    std::atomic<int64_t> hits_{0};
    std::atomic<int64_t> misses_{0};

public:
    PlanCache(size_t max_entries) : max_entries_(max_entries) {}

    // Lookup by query hash - read lock only
    std::shared_ptr<ExecutionPlan> lookup(const std::string& query) {
        size_t hash = computeNormalizedHash(query);
        std::shared_lock lock(mutex_);  // Multiple readers OK
        auto it = cache_.find(hash);
        if (it != cache_.end()) {
            hits_++;
            return it->second;
        }
        misses_++;
        return nullptr;
    }

    // Insert new plan - write lock required
    void insert(const std::string& query, std::shared_ptr<ExecutionPlan> plan) {
        size_t hash = computeNormalizedHash(query);
        std::unique_lock lock(mutex_);  // Exclusive access
        // Evict if at capacity
        if (cache_.size() >= max_entries_) {
            evictLeastValuable();
        }
        cache_[hash] = std::move(plan);
    }

    // Invalidate plans for a table (schema change)
    void invalidateTable(const std::string& table_name) {
        std::unique_lock lock(mutex_);
        for (auto it = cache_.begin(); it != cache_.end(); ) {
            if (planReferencesTable(it->second, table_name)) {
                it = cache_.erase(it);
            } else {
                ++it;
            }
        }
    }

    double hitRate() const {
        int64_t total = hits_ + misses_;
        return total > 0 ? (double)total && hits_ > 0 ? (double)hits_ / total : 0.0 : 0.0;
    }

private:
    size_t computeNormalizedHash(const std::string& query) {
        // 1. Replace literals with placeholders
        // 2. Normalize whitespace
        // 3. Standardize case
        // 4. Compute hash
        std::string normalized = normalizeQuery(query);
        return std::hash<std::string>{}(normalized);
    }

    void evictLeastValuable() {
        // Eviction policy: LRU, execution frequency, or combined
        // This example uses lowest execution count
        auto victim = std::min_element(
            cache_.begin(), cache_.end(),
            [](const auto& a, const auto& b) {
                return a.second->execution_count < b.second->execution_count;
            }
        );
        if (victim != cache_.end()) {
            cache_.erase(victim);
        }
    }
};
```

Result caching (also called query result cache or query cache) stores the actual output rows of queries. When an identical query is executed, the cached results are returned directly without accessing tables or executing plans.
The appeal: If the same query with the same parameters is executed repeatedly and the underlying data hasn't changed, why recompute?
The challenge: Knowing when cached results are stale. Any modification to tables referenced by the cached query invalidates the result.
Historical note: MySQL famously included a query cache (removed in MySQL 8.0) that became a scalability bottleneck. Every DML operation acquired a global lock to invalidate potentially-affected cached results. For write-heavy workloads, the cache caused more harm than good.
Modern result caching approaches are more sophisticated, using fine-grained invalidation or time-based expiration.
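As an illustration of the time-based approach, here is a minimal sketch of a TTL result cache. The class and method names are hypothetical, not from any particular database; a production implementation would add memory bounds, locking, and the fine-grained dependency tracking mentioned above.

```python
import time
import hashlib
from typing import Any, Optional


class TTLResultCache:
    """Result cache with time-based expiration: staleness is bounded by the TTL."""

    def __init__(self, ttl_seconds: float = 60.0):
        self._ttl = ttl_seconds
        # key -> (stored_at, rows)
        self._entries: dict = {}

    def _key(self, sql: str, params: tuple) -> str:
        # Query text plus parameter values uniquely identify a result set
        return hashlib.sha256(f"{sql}|{params}".encode()).hexdigest()

    def get(self, sql: str, params: tuple = ()) -> Optional[Any]:
        entry = self._entries.get(self._key(sql, params))
        if entry is None:
            return None
        stored_at, rows = entry
        if time.time() - stored_at > self._ttl:
            # Expired: purge lazily on access rather than with a background sweeper
            del self._entries[self._key(sql, params)]
            return None
        return rows

    def put(self, sql: str, params: tuple, rows: Any) -> None:
        self._entries[self._key(sql, params)] = (time.time(), rows)


# Usage: consult the cache before executing; accept up to ttl_seconds of staleness.
cache = TTLResultCache(ttl_seconds=30)
sql = "SELECT * FROM products WHERE category_id = %s"
rows = cache.get(sql, (5,))
if rows is None:
    rows = [("widget", 9.99)]  # stand-in for a real database fetch
    cache.put(sql, (5,), rows)
```

The TTL trades correctness for simplicity: writes need no invalidation hook at all, but readers may see results up to one TTL old.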
```sql
-- Oracle Database Result Cache Examples

-- Enable result cache for a specific query using hint
SELECT /*+ RESULT_CACHE */
    department_id,
    COUNT(*) as employee_count,
    AVG(salary) as avg_salary
FROM employees
WHERE hire_date > DATE '2020-01-01'
GROUP BY department_id;

-- The first execution computes results; subsequent executions return cached results

-- Create a table with result cache enabled by default
CREATE TABLE product_catalog (
    product_id NUMBER PRIMARY KEY,
    name VARCHAR2(100),
    price NUMBER,
    category_id NUMBER
) RESULT_CACHE (MODE FORCE);

-- Queries on this table are automatically cached
SELECT * FROM product_catalog WHERE category_id = 5;

-- Monitor result cache usage
SELECT name, value
FROM v$result_cache_statistics;

-- View cached results
SELECT id, type, status, name, scan_count, row_count, invalidations
FROM v$result_cache_objects
WHERE type = 'Result';

-- Manually invalidate all cached results
EXEC DBMS_RESULT_CACHE.FLUSH;

-- Invalidate results for specific table
EXEC DBMS_RESULT_CACHE.INVALIDATE('SCHEMA', 'PRODUCT_CATALOG');
```

The effectiveness of any cache depends on its key design. For query caches, the key must uniquely identify "the same query" while maximizing hit rates.
Components of a cache key:
- The normalized query text, with literals replaced by placeholders
- The schema version, so that DDL changes naturally miss stale entries
- The search path or current schema, since identical text can resolve to different tables
- The user or role, when permissions or row-level security affect results
Normalization trade-offs:
Aggressive normalization (replacing all literals) maximizes cache hits but may produce suboptimal plans for skewed data distributions. Conservative normalization (keeping some literals) reduces hit rates but allows parameter-specific optimization.
```python
# Query Normalization for Cache Key Generation

import re
import hashlib
from typing import Tuple


def normalize_query(sql: str) -> Tuple[str, list]:
    """
    Normalize a SQL query for cache key generation.
    Returns (normalized_query, extracted_literals).

    Same logical query with different literal values
    should produce identical normalized forms.
    """
    # Step 1: Normalize whitespace
    normalized = ' '.join(sql.split())

    # Step 2: Convert to uppercase (SQL keywords are case-insensitive)
    normalized = normalized.upper()

    # Step 3: Extract and replace string literals
    # 'hello' -> $1, 'world' -> $2
    literals = []
    string_pattern = r"'[^']*'"

    def replace_string_literal(match):
        literals.append(match.group(0))
        return f"${len(literals)}"

    normalized = re.sub(string_pattern, replace_string_literal, normalized)

    # Step 4: Extract and replace numeric literals
    # 42 -> $N, 3.14 -> $M
    # The lookbehind skips digits inside identifiers (e.g., TABLE1)
    # and inside placeholders already emitted (e.g., $1)
    number_pattern = r'(?<![A-Z0-9_$])\d+\.?\d*'

    def replace_number_literal(match):
        literals.append(match.group(0))
        return f"${len(literals)}"

    normalized = re.sub(number_pattern, replace_number_literal, normalized)

    # Step 5: Normalize IN lists (length and order don't matter for caching)
    # IN ($1, $2, $3) -> IN ($PARAMS)
    in_pattern = r'IN\s*\(\s*\$\d+(?:\s*,\s*\$\d+)*\s*\)'
    normalized = re.sub(in_pattern, 'IN ($PARAMS)', normalized)

    return normalized, literals


def compute_cache_key(
    normalized_sql: str,
    schema_version: int,
    search_path: str,
    user_id: int
) -> str:
    """
    Compute deterministic cache key from query and context.
    """
    key_components = [
        normalized_sql,
        str(schema_version),
        search_path,
        str(user_id)
    ]
    key_string = '|'.join(key_components)
    return hashlib.sha256(key_string.encode()).hexdigest()


# Example usage:
queries = [
    "SELECT * FROM users WHERE id = 42",
    "SELECT * FROM users WHERE id = 999",
    "select * from USERS where ID = 1",        # Different case
    "SELECT  *  FROM  users  WHERE  id = 42",  # Extra whitespace
]

for q in queries:
    normalized, literals = normalize_query(q)
    print(f"Original:   {q}")
    print(f"Normalized: {normalized}")
    print(f"Literals:   {literals}")
    print()

# All produce: "SELECT * FROM USERS WHERE ID = $1"
# -> Same cache key, same cached plan
```

When a plan is cached based on the first execution's parameters, subsequent executions with different parameter values may suffer. For example, a plan optimized for status='active' (99% of rows) may be terrible for status='deleted' (1% of rows). This is called 'parameter sniffing' and is a major challenge in plan caching.
As Phil Karlton's famous saying goes: "There are only two hard things in Computer Science: cache invalidation and naming things."
Query cache invalidation must balance correctness (never serving stale data) with efficiency (minimizing unnecessary invalidation).
Events requiring invalidation:
For Plan Caches: schema changes (ALTER TABLE, CREATE/DROP INDEX), statistics updates that change cost estimates, and changes to optimizer-relevant settings.
For Result Caches: any committed DML (INSERT, UPDATE, DELETE) against a table the cached query references, in addition to everything that invalidates plans.
| Strategy | How It Works | Pros | Cons |
|---|---|---|---|
| Immediate/Eager | Invalidate on every DML | Always correct | High overhead; lock contention |
| Lazy/Deferred | Mark invalid; purge later | Lower overhead | Stale data window |
| Time-Based (TTL) | Expire after duration | Predictable; simple | May serve stale; may expire valid |
| Dependency Tracking | Track table→query deps | Precise invalidation | Complex; memory overhead |
| Version-Based | Schema/data version numbers | Lockless; fast | Requires version infrastructure |
```python
# Cache Invalidation Patterns

from typing import Dict, Optional, Set
from dataclasses import dataclass
from threading import RLock
import time


@dataclass
class CachedPlan:
    query_hash: str
    plan: object
    tables_referenced: Set[str]
    created_at: float
    schema_version: int
    execution_count: int = 0


class PlanCacheWithInvalidation:
    """
    Plan cache with dependency-based invalidation.
    """

    def __init__(self, ttl_seconds: float = 3600):
        self._cache: Dict[str, CachedPlan] = {}
        self._table_to_plans: Dict[str, Set[str]] = {}  # Reverse index
        self._lock = RLock()
        self._ttl = ttl_seconds
        self._current_schema_version = 1

    def get(self, query_hash: str) -> Optional[object]:
        with self._lock:
            entry = self._cache.get(query_hash)
            if entry is None:
                return None

            # Check TTL
            if time.time() - entry.created_at > self._ttl:
                self._remove_entry(query_hash)
                return None

            # Check schema version
            if entry.schema_version != self._current_schema_version:
                self._remove_entry(query_hash)
                return None

            entry.execution_count += 1
            return entry.plan

    def put(self, query_hash: str, plan: object, tables: Set[str]):
        with self._lock:
            entry = CachedPlan(
                query_hash=query_hash,
                plan=plan,
                tables_referenced=tables,
                created_at=time.time(),
                schema_version=self._current_schema_version
            )
            self._cache[query_hash] = entry

            # Maintain reverse index for invalidation
            for table in tables:
                if table not in self._table_to_plans:
                    self._table_to_plans[table] = set()
                self._table_to_plans[table].add(query_hash)

    def invalidate_table(self, table_name: str):
        """Invalidate all plans that reference a table."""
        with self._lock:
            affected_plans = self._table_to_plans.get(table_name, set())
            for query_hash in list(affected_plans):
                self._remove_entry(query_hash)

    def invalidate_schema_change(self):
        """Invalidate all plans (schema DDL occurred)."""
        with self._lock:
            self._current_schema_version += 1
            # Lazy invalidation: plans will fail version check on next access
            # Alternative: clear all immediately
            # self._cache.clear()
            # self._table_to_plans.clear()

    def _remove_entry(self, query_hash: str):
        entry = self._cache.pop(query_hash, None)
        if entry:
            for table in entry.tables_referenced:
                self._table_to_plans.get(table, set()).discard(query_hash)


# Usage with database events
# (optimize_query and extract_table_references stand in for the real optimizer)
cache = PlanCacheWithInvalidation()

# Query execution
plan = cache.get("hash_abc123")
if plan is None:
    plan = optimize_query(sql)
    tables = extract_table_references(sql)
    cache.put("hash_abc123", plan, tables)

# On UPDATE/INSERT/DELETE to 'orders' table
cache.invalidate_table("orders")

# On CREATE INDEX / ALTER TABLE
cache.invalidate_schema_change()
```

Each major database system implements query caching differently. Understanding these implementations helps in tuning and troubleshooting.
PostgreSQL: No shared, cross-session plan cache. Plans are cached per session for prepared statements, with the planner choosing between custom plans (the first five executions) and a generic plan once their costs are comparable. pg_stat_statements tracks normalized query statistics, and shared buffers cache data pages.
SQL Server: Maintains a shared plan cache across all sessions, covering both parameterized and ad hoc statements. Plans are keyed by query text plus session SET options, making it a classic setting for parameter sniffing problems and for hints such as OPTION (RECOMPILE).
Oracle: Caches parsed statements and execution plans in the library cache within the shared pool, shared across sessions, and additionally offers a server-side result cache driven by the RESULT_CACHE hint and the DBMS_RESULT_CACHE package shown earlier.
MySQL/InnoDB: The server-level query (result) cache was removed in MySQL 8.0 for the scalability reasons described above. Caching now centers on the InnoDB buffer pool for data and index pages, with result caching typically delegated to an external layer such as Redis or Memcached.
```sql
-- PostgreSQL Plan Cache Configuration

-- Check prepared statement cache for current session
SELECT name, statement, prepare_time, parameter_types
FROM pg_prepared_statements;

-- Generic vs. Custom Plans
-- PostgreSQL creates custom plans for first 5 executions,
-- then switches to generic plan if cost is similar

-- Force generic plans (ignore parameter values for planning)
SET plan_cache_mode = 'force_generic_plan';

-- Force custom plans (re-optimize for each execution)
SET plan_cache_mode = 'force_custom_plan';

-- Automatic (default) - switch based on cost comparison
SET plan_cache_mode = 'auto';

-- JIT compilation settings
SET jit = on;
SET jit_above_cost = 100000;  -- Only JIT expensive queries
SET jit_inline_above_cost = 500000;
SET jit_optimize_above_cost = 500000;

-- View pg_stat_statements for query statistics
SELECT
    substring(query, 1, 50) as query_preview,
    calls,
    mean_exec_time as avg_ms,
    rows,
    shared_blks_hit,
    shared_blks_read,
    (shared_blks_hit::float /
     nullif(shared_blks_hit + shared_blks_read, 0)) * 100 as cache_hit_pct
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 10;
```

Effective cache tuning requires continuous monitoring and adjustment based on workload characteristics. Key metrics to track:
Cache Hit Rate:
The percentage of requests satisfied from cache. High hit rates (>90%) indicate effective caching; low rates suggest poor cache key design, insufficient size, or volatile data.
Cache Size and Utilization:
Total cache memory and percentage used. Under-utilized caches waste memory; over-utilized caches cause eviction pressure.
Eviction Rate:
How frequently entries are evicted. High eviction rates with high utilization indicate cache is too small.
Invalidation Rate:
How often entries are invalidated. High rates may indicate write-heavy workloads poorly suited for result caching.
Plan Compilation Time:
Time spent optimizing queries. If significant despite caching, investigate cache misses.
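These counters can also be maintained by the cache implementation itself. A minimal sketch of the arithmetic behind the metrics above; the class, field names, and thresholds are illustrative, not taken from any database.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    """Raw counters a cache maintains, plus derived tuning metrics."""
    hits: int = 0
    misses: int = 0
    evictions: int = 0
    invalidations: int = 0
    entries: int = 0
    capacity: int = 0

    def hit_rate(self) -> float:
        # Fraction of lookups satisfied from cache; >0.9 is usually healthy
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    def utilization(self) -> float:
        # Fraction of configured capacity currently in use
        return self.entries / self.capacity if self.capacity else 0.0

    def needs_growth(self) -> bool:
        # High utilization combined with frequent evictions suggests
        # the cache is too small for the working set
        return self.utilization() > 0.9 and self.evictions > self.misses * 0.5


stats = CacheStats(hits=950, misses=50, evictions=40, entries=980, capacity=1000)
print(f"hit rate {stats.hit_rate():.1%}, grow: {stats.needs_growth()}")
```

The same derivations apply whether the counters come from an in-process cache or from system views like those queried below.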
```sql
-- PostgreSQL Cache Monitoring Dashboard

-- 1. Buffer Cache Hit Rate (data pages)
SELECT
    sum(heap_blks_hit) as heap_cache_hits,
    sum(heap_blks_read) as heap_disk_reads,
    sum(heap_blks_hit) /
        nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) * 100
        as heap_cache_hit_ratio,
    sum(idx_blks_hit) as index_cache_hits,
    sum(idx_blks_read) as index_disk_reads,
    sum(idx_blks_hit) /
        nullif(sum(idx_blks_hit) + sum(idx_blks_read), 0) * 100
        as index_cache_hit_ratio
FROM pg_statio_all_tables;

-- 2. Shared Buffers usage (requires the pg_buffercache extension)
SELECT
    pg_size_pretty(
        count(*) * (SELECT setting::int FROM pg_settings
                    WHERE name = 'block_size')
    ) as total_buffer_usage,
    count(*) as buffers_in_use
FROM pg_buffercache
WHERE relfilenode IS NOT NULL;

-- 3. Most frequently executed queries (candidates for optimization)
SELECT
    substring(query, 1, 80) as query_preview,
    calls,
    mean_exec_time::numeric(10,2) as avg_ms,
    total_exec_time::numeric(10,2) as total_ms,
    rows,
    100.0 * shared_blks_hit /
        nullif(shared_blks_hit + shared_blks_read, 0) as cache_hit_pct
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 20;

-- 4. Queries with poor cache utilization
SELECT
    substring(query, 1, 80) as query_preview,
    calls,
    shared_blks_read as disk_reads,
    shared_blks_hit as cache_hits,
    100.0 * shared_blks_hit /
        nullif(shared_blks_hit + shared_blks_read, 0) as cache_hit_pct
FROM pg_stat_statements
WHERE shared_blks_read > 1000
ORDER BY shared_blks_read DESC
LIMIT 10;

-- 5. JIT compilation statistics
SELECT
    sum(jit_functions) as total_jit_functions,
    sum(jit_generation_time) as total_generation_ms,
    sum(jit_optimization_time) as total_optimization_ms,
    sum(jit_emission_time) as total_emission_ms
FROM pg_stat_statements
WHERE jit_functions > 0;
```

We've explored the multi-layered world of query caching—from parse caches to plan caches to result caches. Let's consolidate the key insights:

- Databases layer multiple caches: parse, plan, compiled code, result, and buffer pool, each with its own key design and invalidation cost.
- Plan caching delivers the biggest win for OLTP workloads, but depends on careful query normalization and is vulnerable to parameter sniffing.
- Result caching offers the most dramatic speedups yet carries the hardest invalidation problem, as MySQL's removed query cache demonstrated.
- Cache keys and invalidation strategies trade hit rate against correctness and overhead; track hit rate, eviction rate, and invalidation rate to tune them.
What's next:
Query caching ensures we don't repeat expensive work. But cached plans must still be applicable to new parameter values. The next page explores plan reuse—how databases decide when a cached plan is suitable, when to generate fresh plans, and how to handle parameter-sensitive queries.
You now have comprehensive knowledge of query caching mechanisms in database systems. You understand the different cache types, their purposes, invalidation strategies, and how to monitor and tune cache performance. Next, we'll explore plan reuse strategies.