Loading content...
The asterisk (*) in SQL is simultaneously the most convenient and the most potentially dangerous feature of the SELECT statement. With a single character, you can retrieve every column from every table in your query—no typing, no remembering column names, no updates needed when the schema changes.
This convenience comes at a cost. **SELECT *** is the source of countless production bugs, performance disasters, and maintenance nightmares. Yet when used appropriately, it remains a valuable tool for exploration, debugging, and specific technical patterns.
This page provides a comprehensive examination of the asterisk operator—its exact behavior, appropriate use cases, serious pitfalls, and the nuanced judgment required to use it wisely.
By the end of this page, you will understand every aspect of SELECT *—how it expands, how it behaves in joins and subqueries, when it's appropriate to use, and why experienced practitioners treat it with caution in production code.
The asterisk (*) is a special symbol in the SELECT clause that means "all columns from the table(s) in the FROM clause." At query execution time, the asterisk is expanded into a complete list of column names before the query runs.
Expansion behavior:
When you write SELECT * FROM employees, the database internally expands this to something like:
SELECT employee_id, first_name, last_name, email, phone, department_id, ...
FROM employees
This expansion happens based on the table's current schema definition. The order of columns in the expansion follows the table's column definition order.
123456789101112131415161718192021222324252627282930
-- Basic asterisk usageSELECT * FROM employees;-- Expands to all columns in employees table-- Column order matches table definition order -- Equivalent to (in this example):SELECT employee_id, first_name, last_name, email, phone, hire_date, salary, commission_pct, manager_id, department_idFROM employees; -- Asterisk with other columns (mixing is allowed)SELECT *, 'Active' AS statusFROM employees;-- Returns: all columns from employees PLUS a literal column SELECT employee_id, *, salary * 12 AS annual_salaryFROM employees;-- Returns: employee_id, then all columns, then annual_salary-- Note: employee_id appears twice (once explicit, once from *) -- Qualified asterisk: table.* syntaxSELECT employees.*FROM employees;-- Explicitly: all columns from employees (same as SELECT * here) -- Qualified asterisk becomes essential in joinsSELECT employees.*, departments.*FROM employeesJOIN departments ON employees.department_id = departments.department_id;-- Returns: all employee columns, then all department columnsThe order of columns returned by SELECT * is not random—it follows the column definition order in the table schema (typically the order columns were created via CREATE TABLE or ALTER TABLE ADD). However, relying on this order in application code is fragile, as it can change with schema modifications.
| Query Pattern | Expansion | Column Order |
|---|---|---|
SELECT * FROM t | All columns from t | Table definition order |
SELECT t.* FROM t | All columns from t (qualified) | Table definition order |
SELECT *, col FROM t | All cols + col (duplicate if col in t) | All cols, then col |
SELECT col, * FROM t | col + all cols (duplicate if col in t) | col, then all cols |
SELECT * FROM t1 JOIN t2 | All from t1, then all from t2 | t1 cols, then t2 cols |
When a query involves multiple tables, the asterisk's behavior becomes more complex—and more potentially problematic. Understanding how * expands across joins is critical for avoiding bugs and confusion.
Multi-table expansion:
In a join query, SELECT * expands to include columns from all tables in the FROM clause, in the order they appear. This can result in:
12345678910111213141516171819202122232425262728293031323334353637383940
-- Two-table join with asteriskSELECT *FROM employeesJOIN departments ON employees.department_id = departments.department_id; -- Expands to (approximately):-- employees.employee_id, employees.first_name, employees.department_id, ...-- departments.department_id, departments.department_name, departments.location_id, ...-- Note: department_id appears TWICE (from both tables) -- Qualified asterisk for specific table's columnsSELECT employees.*, departments.department_nameFROM employeesJOIN departments ON employees.department_id = departments.department_id;-- Returns: all employee columns + department_name only -- Mixing qualified asterisksSELECT employees.*, departments.*, locations.cityFROM employeesJOIN departments ON employees.department_id = departments.department_idJOIN locations ON departments.location_id = locations.location_id; -- NATURAL JOIN asterisk behavior: Removes duplicate join columnsSELECT *FROM employeesNATURAL JOIN departments;-- Returns: shared columns once, then remaining from employees, then remaining from departments-- department_id appears only ONCE (at the beginning) -- USING clause asterisk behavior: Similar to NATURAL JOINSELECT *FROM employeesJOIN departments USING (department_id);-- department_id appears only once -- Outer join with asterisk shows NULLsSELECT *FROM employeesLEFT JOIN departments ON employees.department_id = departments.department_id;-- Employees without departments: department columns are all NULLMany applications cannot handle duplicate column names in a result set. If employees.id and departments.id both appear in SELECT * with a join, the application may only see one 'id' column (usually the last one). This is a common source of subtle bugs that work in development but fail in production.
The asterisk appears in several subquery patterns with different meanings and implications. Understanding these contexts helps you use * appropriately and recognize its effects.
*Key subquery contexts for :
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061
-- EXISTS: * is conventional and correctSELECT d.department_nameFROM departments dWHERE EXISTS ( SELECT * -- Convention: just checking existence, not column values FROM employees e WHERE e.department_id = d.department_id);-- Semantically equivalent to:SELECT d.department_nameFROM departments dWHERE EXISTS (SELECT 1 FROM employees e WHERE e.department_id = d.department_id); -- Derived table: * passes all columns throughSELECT dt.employee_id, dt.full_nameFROM ( SELECT employee_id, CONCAT(first_name, ' ', last_name) AS full_name, * -- Includes all original columns too FROM employees) AS dtWHERE dt.salary > 50000; -- Better: Only include needed columns in derived tableSELECT dt.employee_id, dt.full_nameFROM ( SELECT employee_id, CONCAT(first_name, ' ', last_name) AS full_name, salary FROM employees) AS dtWHERE dt.salary > 50000; -- IN subquery: * is WRONG (returns multiple columns)-- INCORRECT:SELECT department_nameFROM departmentsWHERE department_id IN (SELECT * FROM employees); -- ERROR! -- CORRECT: Return single columnSELECT department_nameFROM departmentsWHERE department_id IN (SELECT department_id FROM employees); -- Scalar subquery: * is WRONG (must return single value)-- INCORRECT:SELECT employee_id, (SELECT * FROM departments WHERE department_id = e.department_id)FROM employees e; -- ERROR! -- CORRECT: Return single columnSELECT employee_id, (SELECT department_name FROM departments d WHERE d.department_id = e.department_id) AS dept_nameFROM employees e; -- CTE with asterisk: Passes through all columnsWITH emp_data AS ( SELECT * FROM employees WHERE department_id = 5)SELECT * FROM emp_data; -- All employee columns available| Subquery Type | Asterisk Usage | Correct? | Notes |
|---|---|---|---|
| EXISTS | WHERE EXISTS (SELECT * ...) | Yes (conventional) | Value never examined; 1 is equivalent |
| NOT EXISTS | WHERE NOT EXISTS (SELECT * ...) | Yes (conventional) | Same as EXISTS |
| Derived table | FROM (SELECT * FROM t) AS dt | Sometimes | Passes all columns; often too many |
| CTE body | WITH cte AS (SELECT * FROM t) | Sometimes | Same considerations as derived tables |
| IN (single col) | WHERE x IN (SELECT * FROM t) | No | ERROR: IN requires single column |
| Scalar subquery | SELECT (SELECT * FROM t WHERE ...) | No | ERROR: Must return single value |
While SELECT * and SELECT 1 are semantically identical in EXISTS (the database only checks if rows exist, not their values), many style guides prefer SELECT 1 to signal intent clearly. Some databases optimize identically; others may marginally prefer SELECT 1. The choice is largely stylistic.
In production application code, SELECT * is widely considered an anti-pattern—a common but harmful practice that should generally be avoided. Understanding why helps you make informed decisions about when to break this rule.
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758
-- SCENARIO: Application breaks when schema changes -- Original tableCREATE TABLE users ( id INT PRIMARY KEY, username VARCHAR(50), email VARCHAR(100), created_at TIMESTAMP); -- Application code expecting: id, username, email, created_at-- Using SELECT * works initially -- DBA adds confidential columnALTER TABLE users ADD ssn VARCHAR(11); -- Social Security Number -- Now SELECT * returns SSN to the application-- Security breach: SSN exposed to frontend/logs -- Another scenario: Column removedALTER TABLE users DROP COLUMN email; -- Application expecting email column crashes:-- "Column 'email' not found in result set" -- PROPER APPROACH: Explicit columns -- Application query (immune to column changes)SELECT id, username, email, created_atFROM usersWHERE id = ?; -- If email is removed:-- - Query fails at database level with clear error-- - No silent bugs or security issues-- - Can be fixed by updating the query -- SCENARIO: Performance disaster -- Table with large bio TEXT columnCREATE TABLE profiles ( user_id INT PRIMARY KEY, display_name VARCHAR(100), bio TEXT, -- Average 10KB per row avatar BLOB -- Average 500KB per row); -- Query needing only name and ID-- POOR: SELECT * FROM profiles WHERE active = 1;-- Retrieves: ~510KB per row (bio + avatar) -- BETTER: SELECT user_id, display_name FROM profiles WHERE active = 1;-- Retrieves: ~104 bytes per row -- For 10,000 rows:-- SELECT *: 5.1 GB of data transfer-- Explicit: 1 MB of data transfer-- 5000x difference!A major e-commerce site experienced a 60-minute outage when a DBA added a 'metadata' JSON column to their products table. SELECT * queries suddenly retrieved megabytes of JSON per product, overwhelming application servers. The fix was changing SELECT * to explicit columns—a 5-minute code change that required an hour of downtime.
Despite its pitfalls, SELECT * has legitimate uses. The key is understanding the difference between exploratory work and production code.
*Appropriate contexts for SELECT :
123456789101112131415161718192021222324252627282930313233343536373839
-- 1. Ad-hoc exploration in SQL clientSELECT * FROM unknown_table LIMIT 10;-- Perfectly fine for discovery -- 2. Debugging a specific rowSELECT * FROM orders WHERE order_id = 'ORD-12345';-- Examining all columns to find the problem -- 3. EXISTS pattern (conventional)SELECT customer_id, customer_nameFROM customers cWHERE EXISTS ( SELECT * FROM orders o WHERE o.customer_id = c.customer_id); -- 4. Table cloningCREATE TABLE employees_backup AS SELECT * FROM employees; -- 5. Data migrationINSERT INTO new_employees SELECT * FROM old_employees;-- Note: Requires identical schemas -- 6. View for access controlCREATE VIEW public_products ASSELECT * FROM products WHERE is_public = TRUE;-- All columns, but filtered rows -- 7. CTE for readability when all columns neededWITH relevant_orders AS ( SELECT * FROM orders WHERE order_date >= CURRENT_DATE - INTERVAL '30 days')SELECT customer_id, COUNT(*), SUM(total)FROM relevant_ordersGROUP BY customer_id;-- Still recommend explicit columns, but * is understandable here -- 8. Quick prototype (mark for refactoring)-- TODO: Replace with explicit columns before productionSELECT * FROM products WHERE category = 'electronics';| Context | SELECT * OK? | Recommendation |
|---|---|---|
| Interactive SQL session | ✓ Yes | Fine for exploration |
| Application production code | ✗ No | Always use explicit columns |
| EXISTS subquery | ✓ Yes | Conventional usage |
| Stored procedure body | ✗ No | Explicit for maintainability |
| Backup operations | ✓ Yes | Appropriate for schema cloning |
| ORM-generated queries | Depends | Prefer explicit; ORM may handle |
| Unit test setup | ∼ Sometimes | OK for fixtures, not assertions |
A practical workflow: Use SELECT * during initial development and exploration. Before code review or production deployment, replace all SELECT * with explicit column lists. Some teams enforce this with linting rules that flag SELECT * in application code.
The qualified asterisk (table.* or alias.*) offers more control than naked *, allowing you to select all columns from a specific table while being explicit about other tables.
Syntax:
SELECT table_name.* FROM table_name
SELECT alias.* FROM table_name AS alias
This is particularly useful in joins where you want all columns from one table but only specific columns from others.
123456789101112131415161718192021222324252627282930313233343536373839
-- All columns from one table, specific from anotherSELECT e.*, d.department_nameFROM employees AS eJOIN departments AS d ON e.department_id = d.department_id;-- Returns: all employee columns + department_name only -- Multiple qualified asterisksSELECT e.*, d.*, l.city, l.countryFROM employees AS eJOIN departments AS d ON e.department_id = d.department_idJOIN locations AS l ON d.location_id = l.location_id;-- Returns: all from employees, all from departments, specific from locations -- Qualified asterisk in subqueriesSELECT outer_query.*FROM ( SELECT e.*, d.department_name FROM employees AS e JOIN departments AS d ON e.department_id = d.department_id) AS outer_queryWHERE outer_query.salary > 50000; -- EXCEPT pattern (PostgreSQL, SQL Server 2022+)-- Select all columns except specific onesSELECT * EXCEPT (ssn, internal_id) FROM employees; -- Hypothetical syntax-- Note: Not standard SQL, but supported by some databases -- Workaround for EXCEPT pattern: Use a viewCREATE VIEW employees_public ASSELECT employee_id, first_name, last_name, email, department_id, hire_dateFROM employees;-- Excludes: ssn, salary, internal_notes, etc. SELECT * FROM employees_public; -- Safe to use * on filtered view -- Table-qualified * with schema qualificationSELECT schema1.table1.*, schema2.table2.column1FROM schema1.table1JOIN schema2.table2 ON schema1.table1.id = schema2.table2.table1_id;SELECT orders.*, customers.customer_nameSELECT t.*, CURRENT_TIMESTAMP AS logged_atSELECT o.*, oi.* FROM orders o JOIN order_items oiUsing table.* in joins is marginally better than naked *—it makes the source of columns explicit and prevents some accidental inclusion of unneeded columns. However, the best practice remains explicit column listing for production code, even with qualified asterisks.
The asterisk has a special meaning in the COUNT(*) aggregate function that differs fundamentally from its use in SELECT. Understanding this distinction is important for query correctness and performance.
COUNT(*) vs COUNT(column):
COUNT(*) counts all rows, including those with NULL values in any columnCOUNT(column) counts rows where that specific column is not NULLThis distinction is subtle but critical when data contains NULLs.
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253
-- Sample table with NULLs-- | id | name | email |-- |----|-------|---------------|-- | 1 | Alice | alice@ex.com |-- | 2 | Bob | NULL |-- | 3 | Carol | carol@ex.com |-- | 4 | Dave | NULL | -- COUNT(*): Count all rowsSELECT COUNT(*) FROM users;-- Returns: 4 (all rows, regardless of NULLs) -- COUNT(column): Count non-NULL valuesSELECT COUNT(email) FROM users;-- Returns: 2 (only Alice and Carol have non-NULL emails) -- COUNT(1) vs COUNT(*): Semantically identicalSELECT COUNT(1) FROM users;SELECT COUNT(*) FROM users;-- Both return 4; COUNT(1) is just counting a constant -- Performance: Modern databases optimize COUNT(*) and COUNT(1) identically-- No need to prefer one over the other -- COUNT with GROUP BYSELECT department_id, COUNT(*) AS total, COUNT(manager_id) AS with_managerFROM employeesGROUP BY department_id;-- total: all employees in department-- with_manager: employees who have a manager assigned -- COUNT(*) in subqueriesSELECT d.department_name, (SELECT COUNT(*) FROM employees e WHERE e.department_id = d.department_id) AS emp_countFROM departments d; -- COUNT(*) vs COUNT(DISTINCT column)SELECT COUNT(*) AS total_orders, -- All order rows COUNT(customer_id) AS orders_with_customer, -- Orders where customer_id not NULL COUNT(DISTINCT customer_id) AS unique_customers -- Distinct customer countFROM orders; -- EXISTS with COUNT(*) (inefficient)-- POOR:SELECT department_nameFROM departments dWHERE (SELECT COUNT(*) FROM employees e WHERE e.department_id = d.department_id) > 0; -- BETTER: Use EXISTS (stops at first match)SELECT department_nameFROM departments dWHERE EXISTS (SELECT 1 FROM employees e WHERE e.department_id = d.department_id);| Expression | What It Counts | NULL Handling |
|---|---|---|
COUNT(*) | All rows in the result | Counts all rows including those with only NULLs |
COUNT(column) | Rows where column is not NULL | Excludes NULL values |
COUNT(DISTINCT column) | Distinct non-NULL values | Excludes NULL values |
COUNT(1) or COUNT('x') | All rows (same as COUNT(*)) | Counts all rows |
COUNT(expression) | Rows where expression is not NULL | Excludes NULL results |
Modern databases recognize COUNT() as 'count all rows' and apply special optimizations. In many cases, COUNT() can be served from an index or table statistics without scanning actual data. Don't try to 'optimize' by changing to COUNT(1) or COUNT(primary_key)—the optimizer handles COUNT(*) efficiently.
The asterisk (*) in SELECT is a powerful shortcut with significant implications. Used wisely, it accelerates exploration and fits certain patterns perfectly. Used carelessly in production, it creates fragile, inefficient, and potentially insecure applications.
Let's consolidate the key points:
What's next:
We'll explore expressions in SELECT—how to compute values, transform data, apply functions, and create dynamic columns that go beyond simple column retrieval to transform your queries into powerful data processing tools.
You now understand SELECT * comprehensively—its behavior, appropriate uses, serious pitfalls, and the judgment required to use it wisely. This knowledge helps you avoid common mistakes while still leveraging the asterisk's convenience when appropriate.