Loading content...
You've created a perfect index. The column it covers is exactly what your WHERE clause needs. You run the query—and the database ignores your index completely, scanning the entire table instead.
The culprit? A function applied to the indexed column.
This is one of the most common and frustrating performance anti-patterns in SQL. It's frustrating because:
The issue is that functions transform column values, and the database can't navigate an index using transformed values. It's like trying to use a phone book organized by last name to find someone by their initials—the organizational structure exists, but it can't help with your specific search pattern.
This pattern is so common that it has earned its own name: the 'non-SARGable predicate' problem. SARG stands for Search ARGument, and a SARGable predicate is one that can be evaluated using an index seek. Applying a function to a column makes the predicate non-SARGable.
By the end of this page, you will understand exactly why functions prevent index usage, recognize the most common function-on-column patterns, master rewrite techniques to restore index eligibility, and learn when function-based indexes can solve problems that rewrites cannot.
To understand why functions break index lookups, we need to examine how indexes enable efficient searching.
How B-Tree Index Seeks Work:
A B-tree index stores column values in sorted order. When you search for WHERE email = 'john@example.com', the database:
This navigation is possible because the search value and the indexed values are directly comparable. The database knows that 'john@example.com' sorts between 'james@example.com' and 'kate@example.com' without examining every value.
The Function Problem:
When you write WHERE UPPER(email) = 'JOHN@EXAMPLE.COM', you're not searching for an email value—you're searching for the result of UPPER() applied to email values.
The index stores:
You're searching for 'JOHN@EXAMPLE.COM'. The database cannot navigate the index because:
12345678910111213141516171819202122232425262728293031
-- Table with indexed columnCREATE TABLE employees ( employee_id INT PRIMARY KEY, email VARCHAR(255) NOT NULL, hire_date DATE NOT NULL); CREATE INDEX idx_employees_email ON employees(email);CREATE INDEX idx_employees_hire_date ON employees(hire_date); -- SEARCHABLE (SARGable) - Uses index seekSELECT * FROM employees WHERE email = 'john.doe@company.com';-- Database navigates index directly to 'john.doe@company.com' -- NON-SEARCHABLE (Non-SARGable) - Full scan despite indexSELECT * FROM employees WHERE UPPER(email) = 'JOHN.DOE@COMPANY.COM';-- Database must:-- 1. Read every row in the index/table-- 2. Apply UPPER() to each email value-- 3. Compare result to 'JOHN.DOE@COMPANY.COM'-- The index exists but provides NO navigational benefit -- Another non-SARGable exampleSELECT * FROM employees WHERE YEAR(hire_date) = 2023;-- Must evaluate YEAR() on every hire_date value-- Cannot seek to "2023" in the hire_date index -- The fundamental rule:-- function(column) = value ← NON-SARGable, full scan-- column = value ← SARGable, index seek-- column = function(value) ← SARGable, index seek (function on the right is OK)A predicate is SARGable (index-eligible) when the indexed column appears 'naked' on one side of the comparison. Any expression, function, or operation that wraps the column makes the predicate non-SARGable. Functions on the other side of the comparison (the value side) are fine—those are evaluated once, not per row.
Function-on-column anti-patterns appear in many forms. Some are obvious; others are subtle. Let's examine the most common offenders.
Pattern 1: String Case Functions
The most common pattern—using UPPER(), LOWER(), or similar functions for case-insensitive searches:
12345678910111213141516171819202122232425262728293031323334
-- PROBLEM: Case-insensitive search using functionSELECT * FROM customers WHERE UPPER(email) = UPPER('John@Example.com');SELECT * FROM products WHERE LOWER(name) LIKE LOWER('wireless%'); -- Why this happens:-- Developers want case-insensitive matching-- Default collations are often case-sensitive-- UPPER/LOWER seems like an obvious solution -- The performance cost:-- 1 million customers → 1 million UPPER() calls-- Index completely ignored -- SOLUTIONS: -- Solution 1: Use case-insensitive collation at column levelALTER TABLE customers ALTER COLUMN email VARCHAR(255) COLLATE Latin1_General_CI_AS;-- CI = Case Insensitive-- Now: WHERE email = 'John@Example.com' works case-insensitively -- Solution 2: Use case-insensitive collation in query (SQL Server)SELECT * FROM customers WHERE email COLLATE Latin1_General_CI_AS = 'John@Example.com'; -- Solution 3: PostgreSQL ILIKE for case-insensitive LIKESELECT * FROM products WHERE name ILIKE 'wireless%';-- Still may not use index efficiently; consider citext extension -- Solution 4: Store normalized version in separate indexed columnALTER TABLE customers ADD email_lower VARCHAR(255);UPDATE customers SET email_lower = LOWER(email);CREATE INDEX idx_customers_email_lower ON customers(email_lower);-- Query: WHERE email_lower = LOWER('John@Example.com')Pattern 2: Date/Time Extraction Functions
Extracting components from dates (year, month, day) is extremely common and extremely problematic:
123456789101112131415161718192021222324252627282930313233343536
-- PROBLEMS: Extracting date components -- Finding all orders from 2023SELECT * FROM orders WHERE YEAR(order_date) = 2023; -- Finding all events in MarchSELECT * FROM events WHERE MONTH(event_date) = 3; -- Finding Monday eventsSELECT * FROM events WHERE DATEPART(dw, event_date) = 2; -- All of these apply functions to the indexed column-- All trigger full table scans -- SOLUTIONS: Rewrite as range predicates -- Solution for YEAR() - convert to rangeSELECT * FROM orders WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01';-- Index on order_date can seek to start of 2023, scan until 2024 -- Solution for MONTH() - also range (but may span years)SELECT * FROM eventsWHERE (order_date >= '2023-03-01' AND order_date < '2023-04-01') OR (order_date >= '2024-03-01' AND order_date < '2024-04-01');-- Multiple ranges if spanning multiple years -- Alternative: Computed column for month (if frequent query pattern)ALTER TABLE events ADD event_month AS MONTH(event_date) PERSISTED;CREATE INDEX idx_events_month ON events(event_month);-- Now: WHERE event_month = 3 uses the index -- Solution for day of week - harder, may need computed columnALTER TABLE events ADD event_dow AS DATEPART(dw, event_date) PERSISTED;CREATE INDEX idx_events_dow ON events(event_dow);Pattern 3: String Manipulation Functions
Substring extraction, trimming, and concatenation on indexed columns:
1234567891011121314151617181920212223242526272829303132
-- PROBLEMS: String manipulation -- Finding by area code (first 3 chars of phone)SELECT * FROM customers WHERE LEFT(phone, 3) = '415';SELECT * FROM customers WHERE SUBSTRING(phone, 1, 3) = '415'; -- Finding SKUs in a category (prefix match)SELECT * FROM products WHERE LEFT(sku, 2) = 'EL'; -- Trimming whitespaceSELECT * FROM items WHERE TRIM(code) = 'ABC123'; -- SOLUTIONS: -- For prefix matching: Use LIKE with trailing wildcardSELECT * FROM customers WHERE phone LIKE '415%';-- B-tree indexes support prefix matching (leading wildcard is the problem) SELECT * FROM products WHERE sku LIKE 'EL%';-- Index seek to 'EL', scan until values no longer start with 'EL' -- For trim issues: Clean data on insert, not on query-- Add CHECK constraint and trigger to ensure no leading/trailing whitespaceALTER TABLE items ADD CONSTRAINT chk_code_trimmed CHECK (code = TRIM(code)); -- If legacy data has whitespace, clean it once:UPDATE items SET code = TRIM(code) WHERE code != TRIM(code); -- Alternatively, create computed columnALTER TABLE items ADD code_trimmed AS TRIM(code) PERSISTED;CREATE INDEX idx_items_code_trimmed ON items(code_trimmed);Pattern 4: Mathematical Operations
Arithmetic on indexed numeric columns:
1234567891011121314151617181920212223242526272829303132333435363738
-- PROBLEMS: Arithmetic on columns -- Finding products with 20% marginSELECT * FROM products WHERE price - cost > price * 0.20;-- Operations on both price and cost columns -- Finding transactions over a threshold after feeSELECT * FROM transactions WHERE amount - 2.50 > 100; -- Dividing to find rateSELECT * FROM metrics WHERE total / count > 50; -- SOLUTIONS: Algebraic manipulation to isolate column -- Margin calculation: Rearrange inequality-- price - cost > price * 0.20-- price - cost > 0.2 * price-- price - 0.2 * price > cost-- 0.8 * price > cost-- price > cost / 0.8-- price > cost * 1.25 SELECT * FROM products WHERE price > cost * 1.25;-- Still has operation on cost column...-- If you can't avoid operations, consider computed column -- Transaction threshold: Move constant to right side-- amount - 2.50 > 100-- amount > 102.50SELECT * FROM transactions WHERE amount > 102.50;-- Now index on amount can be used! -- Rate calculation: Multiply both sides-- total / count > 50-- total > 50 * countSELECT * FROM metrics WHERE total > 50 * count;-- Operation still on count... -- For complex cases, computed columns may be necessaryObject-Relational Mappers (ORMs) sometimes generate non-SARGable queries without you realizing it. Methods like .ToLower(), .Trim(), .Year, or date manipulations in LINQ/Entity Framework/Hibernate may translate to function calls on columns. Always inspect generated SQL for production-critical queries.
The primary solution for function-on-column problems is query rewriting—restructuring the predicate so the indexed column appears 'naked' on one side of the comparison.
The General Principle:
NON-SARGable: function(column) = value
SARGable: column = inverse_function(value) -- if inverse exists
SARGable: column IN (range derived from value)
Let's work through systematic rewrite techniques:
| Non-SARGable Pattern | SARGable Rewrite | Notes |
|---|---|---|
YEAR(date_col) = 2023 | date_col >= '2023-01-01' AND date_col < '2024-01-01' | Range predicate on indexed column |
MONTH(date_col) = 6 | date_col >= '2023-06-01' AND date_col < '2023-07-01' | Need to specify year(s) |
DATE(datetime_col) = '2023-05-15' | datetime_col >= '2023-05-15' AND datetime_col < '2023-05-16' | Range for exact date match |
LEFT(col, 3) = 'ABC' | col LIKE 'ABC%' | Prefix match uses index |
SUBSTRING(col, 1, 2) = 'XY' | col LIKE 'XY%' | Same as LEFT() |
UPPER(col) = 'VALUE' | col = 'VALUE' with CI collation | Or use citext/function index |
col + 10 = 50 | col = 40 | Move constant to right side |
col * 2 > 100 | col > 50 | Divide both sides |
COALESCE(col, 0) = 5 | col = 5 | If seeking non-NULL anyway |
DATEDIFF(day, col, GETDATE()) < 7 | col > DATEADD(day, -7, GETDATE()) | Compute boundary date instead |
Deep Dive: Date Range Rewrites
Date-related functions are the most frequently encountered non-SARGable patterns. Here's a comprehensive rewrite reference:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192
-- ========================================-- YEAR() extraction-- ======================================== -- Non-SARGableSELECT * FROM orders WHERE YEAR(order_date) = 2023; -- SARGableSELECT * FROM orders WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01'; -- ========================================-- MONTH() extraction (requires year context)-- ======================================== -- Non-SARGableSELECT * FROM orders WHERE MONTH(order_date) = 12; -- SARGable (assuming specific year or range of years)SELECT * FROM orders WHERE order_date >= '2023-12-01' AND order_date < '2024-01-01'; -- For all Decembers across multiple years:SELECT * FROM orders WHERE (order_date >= '2022-12-01' AND order_date < '2023-01-01') OR (order_date >= '2023-12-01' AND order_date < '2024-01-01');-- Consider partitioning or computed column if this is common -- ========================================-- DAY() extraction-- ======================================== -- Non-SARGable (find all orders on 15th of any month)SELECT * FROM orders WHERE DAY(order_date) = 15; -- This is hard to rewrite efficiently-- Options:-- 1. Computed persisted column: ALTER TABLE ADD order_day AS DAY(order_date)-- 2. Accept full scan for rare queries-- 3. Partition by day if pattern is critical -- ========================================-- DATE() or CAST to DATE from DATETIME-- ======================================== -- Non-SARGableSELECT * FROM events WHERE DATE(event_time) = '2023-06-15';SELECT * FROM events WHERE CAST(event_time AS DATE) = '2023-06-15'; -- SARGableSELECT * FROM events WHERE event_time >= '2023-06-15 00:00:00' AND event_time < '2023-06-16 00:00:00'; -- Or more precisely:SELECT * FROM events WHERE event_time >= '2023-06-15' AND event_time < DATEADD(day, 1, '2023-06-15'); -- ========================================-- DATEDIFF() for "recent" records-- ======================================== -- Non-SARGable (find orders in last 30 days)SELECT * FROM orders WHERE DATEDIFF(day, order_date, GETDATE()) <= 30; -- SARGable (compute the boundary date once)SELECT * FROM orders WHERE order_date >= DATEADD(day, -30, GETDATE());-- DATEADD is computed once; order_date is naked -- ========================================-- Time component extraction-- ======================================== -- Non-SARGable (find events at noon)SELECT * FROM events WHERE DATEPART(hour, event_time) = 12; -- SARGable requires range within each day-- Complex if spanning multiple days; consider computed column -- ========================================-- Date arithmetic-- ======================================== -- Non-SARGableSELECT * FROM subscriptions WHERE end_date + 30 >= GETDATE(); -- SARGable (move constant to right side)SELECT * FROM subscriptions WHERE end_date >= DATEADD(day, -30, GETDATE());Functions on the VALUE side of a comparison are fine—they're evaluated once, not per row. WHERE order_date >= DATEADD(day, -30, GETDATE()) is SARGable because DATEADD is applied to GETDATE(), not to order_date. The result is computed once and used for index navigation.
When query rewrites are impossible or impractical, computed columns and function-based indexes can make non-SARGable patterns searchable.
Computed/Generated Columns:
A computed column stores the result of an expression, allowing you to index it like any other column:
1234567891011121314151617181920212223242526272829303132333435363738394041424344
-- SQL Server: Persisted Computed Column-- The PERSISTED keyword stores the computed value physicallyALTER TABLE employeesADD hire_year AS YEAR(hire_date) PERSISTED; CREATE INDEX idx_employees_hire_year ON employees(hire_year); -- Now this query uses the index:SELECT * FROM employees WHERE hire_year = 2023;-- Translates to seeking on idx_employees_hire_year -- PostgreSQL: Generated Column (stored)ALTER TABLE employeesADD COLUMN hire_year INT GENERATED ALWAYS AS (EXTRACT(YEAR FROM hire_date)) STORED; CREATE INDEX idx_employees_hire_year ON employees(hire_year); -- MySQL 5.7+: Generated ColumnALTER TABLE employeesADD COLUMN hire_year INT AS (YEAR(hire_date)) STORED; CREATE INDEX idx_employees_hire_year ON employees(hire_year); -- ========================================-- Common computed column patterns-- ======================================== -- Case-insensitive search columnALTER TABLE customersADD email_lower AS LOWER(email) PERSISTED;CREATE INDEX idx_customers_email_lower ON customers(email_lower);-- Query: WHERE email_lower = LOWER(@input) -- Area code extractionALTER TABLE contactsADD phone_area_code AS LEFT(phone, 3) PERSISTED;CREATE INDEX idx_contacts_area_code ON contacts(phone_area_code);-- Query: WHERE phone_area_code = '415' -- Full name for searchingALTER TABLE usersADD full_name AS CONCAT(first_name, ' ', last_name) PERSISTED;CREATE INDEX idx_users_full_name ON users(full_name);-- Query: WHERE full_name LIKE 'John Doe%'Function-Based Indexes (Expression Indexes):
Some databases support indexes directly on expressions, without requiring a computed column:
12345678910111213141516171819202122232425262728293031323334353637383940414243
-- PostgreSQL: Expression IndexCREATE INDEX idx_customers_email_lower ON customers(LOWER(email)); -- The optimizer automatically uses this index when query matches:SELECT * FROM customers WHERE LOWER(email) = 'john@example.com';-- No query rewrite needed; index directly supports the expression -- More PostgreSQL expression index examplesCREATE INDEX idx_orders_year ON orders(EXTRACT(YEAR FROM order_date));CREATE INDEX idx_products_name_upper ON products(UPPER(name));CREATE INDEX idx_events_date ON events(DATE(event_time)); -- Oracle: Function-Based IndexCREATE INDEX idx_customers_email_lower ON customers(LOWER(email)); CREATE INDEX idx_employees_hire_year ON employees(EXTRACT(YEAR FROM hire_date)); -- The function in the index MUST match the function in the query exactly-- LOWER(email) index won't help with UPPER(email) query -- MySQL 8.0+: Functional Index (as of 8.0.13)CREATE INDEX idx_customers_email_lower ON customers((LOWER(email)));-- Note the extra parentheses around the expression CREATE INDEX idx_orders_year ON orders((YEAR(order_date))); -- ========================================-- Virtual vs Persisted/Stored-- ======================================== -- Virtual computed columns (SQL Server, non-persisted):ALTER TABLE employees ADD hire_year AS YEAR(hire_date);-- Not stored; computed on each access-- CAN be indexed, but index stores the value anyway -- Expression indexes don't store a column, just the index:-- Storage: Only index structure, not an extra column-- Downside: Can't SELECT the computed value directlyFor a function-based index to be used, the expression in your query must match the indexed expression exactly. LOWER(email) index won't be used for UPPER(email) query. Data type, collation, and function must all match precisely.
When to Use Each Approach:
| Criteria | Computed Column | Function-Based Index |
|---|---|---|
| Need to SELECT the computed value | ✅ Yes, directly queryable | ❌ Must apply function in SELECT |
| Storage overhead | Higher (column + index) | Lower (index only) |
| Multiple expressions on same column | Multiple columns needed | Multiple indexes possible |
| Database support | Broad (most databases) | Varies (PostgreSQL, Oracle: excellent; MySQL: 8.0+) |
| Query rewrite required | Still need to query the computed column | Query can stay unchanged |
| Maintenance complexity | Lower (simpler concept) | Higher (must match expression exactly) |
One of the most common reasons for using UPPER()/LOWER() functions is to perform case-insensitive searches. Understanding collation allows you to achieve case-insensitivity without functions.
What is Collation?
Collation defines the rules for string comparison: how characters are sorted and whether comparisons are case-sensitive, accent-sensitive, etc.
Collation Suffixes (SQL Server example):
_CS = Case Sensitive_CI = Case Insensitive_AS = Accent Sensitive_AI = Accent InsensitiveExample: Latin1_General_CI_AS = Latin1 General, Case Insensitive, Accent Sensitive
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768
-- ========================================-- SQL Server Collation Solutions-- ======================================== -- Check current column collationSELECT COLUMN_NAME, COLLATION_NAMEFROM INFORMATION_SCHEMA.COLUMNSWHERE TABLE_NAME = 'customers' AND COLUMN_NAME = 'email'; -- Change column to case-insensitive collationALTER TABLE customersALTER COLUMN email VARCHAR(255) COLLATE Latin1_General_CI_AS NOT NULL; -- Now these queries are equivalent (and both use index):SELECT * FROM customers WHERE email = 'John@Example.com';SELECT * FROM customers WHERE email = 'john@example.com';SELECT * FROM customers WHERE email = 'JOHN@EXAMPLE.COM';-- All find the same rows; index is used -- Query-level collation override (if you can't change column)SELECT * FROM customers WHERE email COLLATE Latin1_General_CI_AS = 'John@Example.com';-- Warning: This may still not use index efficiently-- Column-level collation is better -- ========================================-- PostgreSQL Case-Insensitivity-- ======================================== -- Option 1: Use ILIKE for case-insensitive pattern matchingSELECT * FROM products WHERE name ILIKE 'widget%';-- ILIKE may not use index efficiently for B-tree indexes -- Option 2: Use citext extension (case-insensitive text type)CREATE EXTENSION IF NOT EXISTS citext; ALTER TABLE customers ALTER COLUMN email TYPE citext;-- Now equality comparisons are case-insensitive SELECT * FROM customers WHERE email = 'John@Example.com';-- Matches 'john@example.com' and uses index -- Option 3: Expression index on LOWER()CREATE INDEX idx_customers_email_lower ON customers(LOWER(email));SELECT * FROM customers WHERE LOWER(email) = LOWER('John@Example.com');-- Expression index handles this efficiently -- ========================================-- MySQL Case-Insensitivity-- ======================================== -- MySQL default collations are often case-insensitive-- Check current collation:SHOW FULL COLUMNS FROM customers LIKE 'email'; -- Common case-insensitive collations:-- utf8mb4_general_ci (case insensitive)-- utf8mb4_unicode_ci (case insensitive, more accurate sorting) -- Change column collation:ALTER TABLE customers MODIFY email VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; -- Binary collation for case-sensitive when needed:ALTER TABLE codesMODIFY code VARCHAR(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;The best time to set collation is during schema design. Changing collation on existing columns with data requires careful planning—it can lock tables, invalidate indexes, and cause character set conversion issues. Plan your case-sensitivity requirements before the first INSERT.
Identifying non-SARGable predicates requires examining execution plans and query patterns. Here are detection strategies for major databases.
12345678910111213141516171819202122232425262728293031323334353637383940
-- Check execution plan for table/index scansSET SHOWPLAN_TEXT ON;GOSELECT * FROM employees WHERE YEAR(hire_date) = 2023;GOSET SHOWPLAN_TEXT OFF; -- Look for:-- |--Index Scan or Table Scan (instead of Seek) -- In XML plan, look for SCAN operations with Predicate containing functions:-- <Predicate>-- <ScalarOperator ScalarString="datepart(year,[hire_date])=(2023)"> -- Query plan cache for scans with predicatesSELECT qs.total_elapsed_time / qs.execution_count AS avg_duration_us, qs.execution_count, SUBSTRING(qt.text, qs.statement_start_offset/2 + 1, (CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(qt.text) ELSE qs.statement_end_offset END - qs.statement_start_offset)/2 + 1) AS query_text, qp.query_planFROM sys.dm_exec_query_stats qsCROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) qtCROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qpWHERE CAST(qp.query_plan AS NVARCHAR(MAX)) LIKE '%Index Scan%' OR CAST(qp.query_plan AS NVARCHAR(MAX)) LIKE '%Table Scan%'ORDER BY avg_duration_us DESC; -- Look for common function patterns in query textSELECT textFROM sys.dm_exec_cached_plans cpCROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)WHERE text LIKE '%YEAR(%' OR text LIKE '%MONTH(%' OR text LIKE '%UPPER(%' OR text LIKE '%LOWER(%' OR text LIKE '%DATEPART(%';In execution plans, look for where the predicate is applied. 'Index Cond' or 'Seek Predicate' means the condition is used for index navigation (good). 'Filter' means the condition is applied after reading data (potentially bad if it's filtering most rows from a scan).
Object-Relational Mappers (ORMs) like Entity Framework, Hibernate, and ActiveRecord can generate non-SARGable queries without the developer realizing it. Understanding common patterns helps you avoid these traps.
Common ORM Pitfalls:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657
// C# / Entity Framework Examples // PROBLEM 1: String manipulation methods// Developer writes:var customers = context.Customers .Where(c => c.Email.ToLower() == email.ToLower()) .ToList(); // EF generates:// SELECT * FROM Customers WHERE LOWER(Email) = LOWER(@email)// Function on column → full scan // SOLUTION: Use case-insensitive collation or EF.Functionsvar customers = context.Customers .Where(c => EF.Functions.Like(c.Email, email)) // Uses LIKE, respects collation .ToList(); // Or configure collation in model:modelBuilder.Entity<Customer>() .Property(c => c.Email) .UseCollation("SQL_Latin1_General_CP1_CI_AS"); // ---------------------------------------- // PROBLEM 2: Date component properties// Developer writes:var orders = context.Orders .Where(o => o.OrderDate.Year == 2023) .ToList(); // EF generates:// SELECT * FROM Orders WHERE DATEPART(year, OrderDate) = 2023// Function on column → full scan // SOLUTION: Use date rangeDateTime startDate = new DateTime(2023, 1, 1);DateTime endDate = new DateTime(2024, 1, 1);var orders = context.Orders .Where(o => o.OrderDate >= startDate && o.OrderDate < endDate) .ToList(); // ---------------------------------------- // PROBLEM 3: String length comparisons// Developer writes:var longNames = context.Products .Where(p => p.Name.Length > 50) .ToList(); // EF generates:// SELECT * FROM Products WHERE LEN(Name) > 50// Function on column → full scan // SOLUTION: Computed column in database// ALTER TABLE Products ADD NameLength AS LEN(Name) PERSISTED;// CREATE INDEX idx_name_length ON Products(NameLength);// Then map to entity property12345678910111213141516171819202122232425262728293031323334353637
// Java / Hibernate / JPA Examples // PROBLEM 1: JPQL with functions// Developer writes:String jpql = "SELECT c FROM Customer c WHERE LOWER(c.email) = LOWER(:email)";Query query = em.createQuery(jpql);// Generates: SELECT * FROM customers WHERE LOWER(email) = LOWER(?)// Full scan // SOLUTION: Use database-level case-insensitivity// Or use native query with collation-aware comparison // PROBLEM 2: Criteria API with functionsCriteriaBuilder cb = em.getCriteriaBuilder();CriteriaQuery<Customer> cq = cb.createQuery(Customer.class);Root<Customer> root = cq.from(Customer.class); // This applies function to column:cq.where(cb.equal(cb.lower(root.get("email")), email.toLowerCase()));// Generates: WHERE LOWER(email) = ? // SOLUTION: Ensure data is normalized on insert// Store email as lowercase, search with lowercasecq.where(cb.equal(root.get("email"), email.toLowerCase()));// Requires: email always stored as lowercase // PROBLEM 3: Date criteria// This is problematic:cq.where(cb.equal(cb.function("YEAR", Integer.class, root.get("hireDate")), 2023)); // SOLUTION: Date rangeCalendar start = new GregorianCalendar(2023, 0, 1);Calendar end = new GregorianCalendar(2024, 0, 1);cq.where(cb.and( cb.greaterThanOrEqualTo(root.get("hireDate"), start.getTime()), cb.lessThan(root.get("hireDate"), end.getTime())));ORMs abstract away SQL, but you remain responsible for performance. For any query that runs frequently or against large tables, log and review the generated SQL. Configure your ORM to log queries in development, and check for function-on-column patterns.
Applying functions to indexed columns is one of the most common and easily preventable SQL performance anti-patterns. Mastering this concept is essential for writing performant database queries.
column = value uses index; function(column) = value cannot.YEAR(col) = 2023 becomes col >= '2023-01-01' AND col < '2024-01-01'.You now understand why functions on indexed columns prevent efficient searches and how to rewrite queries for optimal index usage. Next, we'll examine correlated subquery performance issues—another pattern that can devastate query performance.