Loading learning content...
The power of SQL lies in its ability to retrieve exactly the data you need—no more, no less. Column selection is where this precision is realized. While beginners often reach for SELECT * as a convenient shortcut, expert practitioners understand that thoughtful column selection is a cornerstone of efficient, maintainable, and performant database applications.
Column selection determines:
This page explores column selection comprehensively, from basic syntax through qualified references, computed columns, and best practices that distinguish professional SQL from amateur attempts.
By the end of this page, you will master column selection techniques—from simple column listing through qualified names, ordinal positions, computed columns, and the critical distinctions between development and production query practices.
At its simplest, column selection involves listing the column names you want to retrieve, separated by commas. This projection operation selects specific attributes from the source relation.
Fundamental rules:
12345678910111213141516171819202122232425262728293031
-- Selecting specific columnsSELECT first_name, last_name, emailFROM employees; -- Column order affects result orderSELECT last_name, first_name, email -- Different from aboveFROM employees; -- Case insensitivity (these are equivalent in most DBs)SELECT FirstName, LastName, Email -- PascalCaseFROM Employees; SELECT FIRSTNAME, LASTNAME, EMAIL -- UPPERCASEFROM EMPLOYEES; SELECT firstname, lastname, email -- lowercaseFROM employees; -- Duplicate columns are allowedSELECT first_name, last_name, first_name AS name_againFROM employees;-- Returns: first_name, last_name, name_again (same data, different names) -- Column reference errorSELECT first_name, last_name, middle_name -- Error if middle_name doesn't existFROM employees; -- "Unknown column 'middle_name'" -- Selecting from joined tables requires awareness of available columnsSELECT employee_id, first_name, department_name -- Columns from both tablesFROM employeesJOIN departments ON employees.department_id = departments.department_id;While column names are typically case-insensitive, quoted identifiers preserve exact case. PostgreSQL folds unquoted names to lowercase and preserves quoted names. Oracle folds to uppercase. SQL Server is configuration-dependent. For portability, use consistent lowercase with underscores.
| Database | Unquoted Handling | Quoted Handling | Recommended Style |
|---|---|---|---|
| PostgreSQL | Folded to lowercase | Exact case preserved | snake_case |
| MySQL | Case-insensitive (usually) | Case-sensitive on Linux | snake_case |
| SQL Server | Configuration-dependent | Exact case preserved | PascalCase or snake_case |
| Oracle | Folded to UPPERCASE | Exact case preserved | UPPERCASE or snake_case |
| SQLite | Case-insensitive | Exact case preserved | snake_case |
When a query involves multiple tables (through joins or subqueries), column names may be ambiguous if the same name exists in multiple tables. Qualified column names resolve this ambiguity by prefixing the column with its table name or alias.
Syntax:
table_name.column_name
--or--
alias.column_name
When qualification is required:
When qualification is optional:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253
-- Ambiguous column reference (ERROR)SELECT id, name -- Both tables likely have 'id' and 'name'FROM employeesJOIN departments ON employees.department_id = departments.id;-- ERROR: Column 'id' is ambiguous -- Qualified column names resolve ambiguitySELECT employees.id AS employee_id, employees.name AS employee_name, departments.id AS department_id, departments.name AS department_nameFROM employeesJOIN departments ON employees.department_id = departments.id; -- Using table aliases for brevitySELECT e.id AS employee_id, e.name AS employee_name, d.id AS department_id, d.name AS department_nameFROM employees AS eJOIN departments AS d ON e.department_id = d.id; -- Self-join requires qualificationSELECT emp.id AS employee_id, emp.name AS employee_name, mgr.id AS manager_id, mgr.name AS manager_nameFROM employees AS empLEFT JOIN employees AS mgr ON emp.manager_id = mgr.id; -- Best practice: Always qualify in multi-table queriesSELECT e.id, e.first_name, e.last_name, d.department_name, l.city, l.countryFROM employees AS eJOIN departments AS d ON e.department_id = d.idJOIN locations AS l ON d.location_id = l.id; -- Schema qualification for cross-schema queriesSELECT hr.employees.employee_id, hr.employees.first_name, sales.orders.order_date, sales.orders.total_amountFROM hr.employeesJOIN sales.orders ON hr.employees.employee_id = sales.orders.salesperson_id;e for employees, d for departments makes code concise yet readable.e.name AS employee_name prevents confusion in result sets.emp and mgr beat single letters.When you alias a table (FROM employees AS e), you MUST use the alias in column qualification. Writing employees.first_name after aliasing to 'e' will cause an error in most databases. The alias replaces the original name within that query's scope.
SQL supports positional references—referring to columns by their position number (ordinal) in the SELECT list rather than by name. This feature is primarily used in ORDER BY and GROUP BY clauses.
Syntax:
ORDER BY 1, 2 DESC -- Sort by first column, then second descending
GROUP BY 1, 2 -- Group by first and second columns
Important caveats:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455
-- Using ordinals in ORDER BYSELECT first_name, last_name, salaryFROM employeesORDER BY 3 DESC; -- Equivalent to: ORDER BY salary DESC -- Multiple ordinalsSELECT department_id, job_title, COUNT(*) AS emp_countFROM employeesGROUP BY 1, 2 -- GROUP BY department_id, job_titleORDER BY 1, 3 DESC; -- ORDER BY department_id, emp_count DESC -- Mixing ordinals and names (valid but inconsistent)SELECT first_name, last_name, hire_date, salaryFROM employeesORDER BY 4 DESC, first_name ASC; -- Works but confusing -- Why ordinals are fragile-- Original query:SELECT first_name, last_name, salaryFROM employeesORDER BY 3 DESC; -- Orders by salary -- After adding a column:SELECT first_name, last_name, email, salaryFROM employeesORDER BY 3 DESC; -- Now orders by email! (silent bug) -- Safe approach: Always use column namesSELECT first_name, last_name, email, salaryFROM employeesORDER BY salary DESC; -- Clear and refactor-safe -- Ordinals can be useful for complex expressionsSELECT department_id, CASE WHEN AVG(salary) > 100000 THEN 'High' WHEN AVG(salary) > 50000 THEN 'Medium' ELSE 'Low' END AS salary_tierFROM employeesGROUP BY 1ORDER BY 2; -- Easier than repeating the CASE expression -- Better: Use alias in ORDER BYSELECT department_id, CASE WHEN AVG(salary) > 100000 THEN 'High' WHEN AVG(salary) > 50000 THEN 'Medium' ELSE 'Low' END AS salary_tierFROM employeesGROUP BY department_idORDER BY salary_tier; -- Uses alias (works in most DBs)| Clause | Ordinals Allowed? | Notes |
|---|---|---|
| SELECT | No | Cannot reference other columns by position |
| WHERE | No | Must use column names or expressions |
| GROUP BY | Yes | Supported in most databases |
| HAVING | No | Must use column names or aggregates |
| ORDER BY | Yes | Widely supported, but discouraged |
Using column ordinals in production code is widely considered an anti-pattern. It creates fragile queries that break silently when columns are added, removed, or reordered. Always prefer explicit column names or aliases for maintainability and clarity.
Beyond simple column references, SELECT can include expressions—computed values derived from columns, literals, operators, and functions. This capability transforms SELECT from a simple projection into a powerful transformation tool.
Types of expressions in SELECT:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980
-- Arithmetic expressionsSELECT product_name, price, quantity, price * quantity AS line_total, price * quantity * 0.08 AS tax_amount, price * quantity * 1.08 AS total_with_taxFROM order_items; -- String expressionsSELECT first_name, last_name, CONCAT(first_name, ' ', last_name) AS full_name, CONCAT(UPPER(LEFT(first_name, 1)), '.', UPPER(LEFT(last_name, 1)), '.') AS initials, LENGTH(first_name) + LENGTH(last_name) AS name_lengthFROM employees; -- Date/time expressionsSELECT order_id, order_date, ship_date, DATEDIFF(ship_date, order_date) AS days_to_ship, -- MySQL syntax DATE_ADD(order_date, INTERVAL 30 DAY) AS payment_due_date, YEAR(order_date) AS order_year, MONTH(order_date) AS order_monthFROM orders; -- PostgreSQL date arithmeticSELECT order_id, order_date, ship_date, ship_date - order_date AS days_to_ship, order_date + INTERVAL '30 days' AS payment_due_date, EXTRACT(YEAR FROM order_date) AS order_yearFROM orders; -- Conditional expressions with CASESELECT employee_id, first_name, salary, CASE WHEN salary >= 100000 THEN 'Executive' WHEN salary >= 70000 THEN 'Senior' WHEN salary >= 40000 THEN 'Mid-Level' ELSE 'Junior' END AS salary_band, CASE department_id WHEN 1 THEN 'Engineering' WHEN 2 THEN 'Sales' WHEN 3 THEN 'Marketing' ELSE 'Other' END AS department_nameFROM employees; -- NULL handling expressionsSELECT employee_id, first_name, commission_pct, COALESCE(commission_pct, 0) AS commission_or_zero, NULLIF(commission_pct, 0) AS null_if_zero, IFNULL(manager_id, 'No Manager') AS manager_display -- MySQLFROM employees; -- Scalar subquery (returns single value per row)SELECT e.employee_id, e.first_name, e.salary, e.department_id, (SELECT AVG(salary) FROM employees e2 WHERE e2.department_id = e.department_id) AS dept_avg_salary, e.salary - (SELECT AVG(salary) FROM employees e2 WHERE e2.department_id = e.department_id) AS salary_vs_avgFROM employees e;Without an alias, computed columns get auto-generated names that vary by database (e.g., 'expr1', 'column1', or the entire expression). Always provide meaningful aliases like salary * 12 AS annual_salary for readable results and reliable application code.
Column aliases rename columns in the result set. While syntactically simple, aliases have important scope rules and database-specific behaviors that affect where and how they can be used.
Alias syntax variations:
column_expression AS alias -- Standard, explicit
column_expression alias -- Shorthand (no AS)
column_expression AS "Alias" -- Quoted for special chars/spaces
column_expression AS [Alias] -- SQL Server bracket notation
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172
-- Standard alias syntaxSELECT first_name AS given_name, last_name AS family_name, salary AS annual_compensationFROM employees; -- Without AS keyword (works but less explicit)SELECT first_name given_name, last_name family_name, salary annual_compensationFROM employees; -- Quoted aliases for spaces and special charactersSELECT first_name AS "First Name", last_name AS "Last Name", salary AS "Annual Salary ($)", hire_date AS "Start Date (mm/dd/yyyy)"FROM employees; -- SQL Server bracket syntaxSELECT first_name AS [First Name], salary AS [Annual Salary]FROM employees; -- Alias scope: Available in ORDER BYSELECT first_name, last_name, salary * 12 AS annual_salaryFROM employeesORDER BY annual_salary DESC; -- Works: ORDER BY is processed after SELECT -- Alias scope: NOT available in WHERE (processed before SELECT)SELECT first_name, salary * 12 AS annual_salaryFROM employeesWHERE annual_salary > 60000; -- ERROR! Alias doesn't exist yet -- Workaround 1: Repeat the expressionSELECT first_name, salary * 12 AS annual_salaryFROM employeesWHERE salary * 12 > 60000; -- Workaround 2: Use a subquery/CTESELECT * FROM ( SELECT first_name, salary * 12 AS annual_salary FROM employees) AS subqWHERE annual_salary > 60000; -- MySQL special case: Allows alias in GROUP BY (non-standard)SELECT YEAR(hire_date) AS hire_year, COUNT(*) AS emp_countFROM employeesGROUP BY hire_year; -- Works in MySQL, fails in many other DBs -- Standard-compliant versionSELECT YEAR(hire_date) AS hire_year, COUNT(*) AS emp_countFROM employeesGROUP BY YEAR(hire_date);| Clause | Can Use Alias? | Processing Order | Notes |
|---|---|---|---|
| FROM | No | 1st | Aliases not defined yet |
| WHERE | No | 2nd | Processed before SELECT |
| GROUP BY | Usually No* | 3rd | *MySQL allows, others don't |
| HAVING | No | 4th | Use aggregate expressions |
| SELECT | Defined here | 5th | Cannot reference other SELECT aliases |
| ORDER BY | Yes | 6th | Processed after SELECT |
| LIMIT/OFFSET | N/A | 7th | Works on final result |
Use snake_case for unquoted aliases (annual_salary), PascalCase or spaces in quoted aliases for user-facing output. Avoid SQL reserved words as aliases. Keep aliases short but meaningful—full_name is better than fn, but employee_full_name_formatted is too verbose.
When your FROM clause includes multiple tables (through joins), you can select columns from any of them. This is where qualified column names become essential, and where understanding join semantics affects what data is available.
Key considerations:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061
-- Inner join: Columns from both tablesSELECT e.employee_id, e.first_name, e.last_name, d.department_name, d.location_idFROM employees AS eINNER JOIN departments AS d ON e.department_id = d.department_id; -- Left outer join: All employees, NULLs for unmatched departmentsSELECT e.employee_id, e.first_name, e.last_name, d.department_name, -- NULL if employee has no department d.location_id -- NULL if employee has no departmentFROM employees AS eLEFT JOIN departments AS d ON e.department_id = d.department_id; -- Multiple joins: Columns from all tablesSELECT e.employee_id, e.first_name, e.last_name, d.department_name, l.city, l.country, m.first_name AS manager_first_name, m.last_name AS manager_last_nameFROM employees AS eJOIN departments AS d ON e.department_id = d.department_idJOIN locations AS l ON d.location_id = l.location_idLEFT JOIN employees AS m ON e.manager_id = m.employee_id; -- Handling NULL from outer joins with COALESCESELECT e.employee_id, e.first_name, COALESCE(d.department_name, 'Unassigned') AS department_name, COALESCE(m.first_name || ' ' || m.last_name, 'No Manager') AS manager_nameFROM employees AS eLEFT JOIN departments AS d ON e.department_id = d.department_idLEFT JOIN employees AS m ON e.manager_id = m.employee_id; -- Cross join: Every combination (N * M rows)SELECT e.first_name, p.project_name, 'Possible Assignment' AS assignment_typeFROM employees AS eCROSS JOIN projects AS p; -- Natural join: Automatically matches same-named columnsSELECT employee_id, -- No qualification needed (only one source) first_name, department_nameFROM employeesNATURAL JOIN departments;-- Warning: NATURAL JOIN is fragile if schemas changeNATURAL JOIN automatically joins on all columns with matching names. This is convenient but dangerous: adding a column with a common name (like 'id' or 'created_at') to any table can silently change join behavior. Prefer explicit JOIN ON conditions in production code.
Column selection has real performance implications that compound at scale. Understanding these effects helps you write queries that not only return correct data but do so efficiently.
Performance factors:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546
-- Table: employees (many columns, including a large 'bio' TEXT column)-- Index: idx_emp_dept (department_id, employee_id, first_name, last_name) -- POOR: Selecting all columns forces table accessSELECT *FROM employeesWHERE department_id = 5;-- Must read: all columns including large 'bio' TEXT-- Cannot use covering index -- BETTER: Select only needed columnsSELECT employee_id, first_name, last_nameFROM employeesWHERE department_id = 5;-- Can potentially use covering index (index-only scan)-- Avoids reading 'bio' and other unnecessary columns -- MEASUREABLE DIFFERENCE at scale:-- employees table: 1 million rows, 50 columns, average row size 2KB-- SELECT * retrieves: ~2GB of data-- SELECT employee_id, first_name: ~50MB of data -- Example: Covering index optimization-- Index exists: CREATE INDEX idx_covering ON orders(customer_id, order_date, total); -- Uses covering index (very fast)SELECT order_date, totalFROM ordersWHERE customer_id = 12345; -- Cannot use covering index (must access table)SELECT order_date, total, shipping_addressFROM ordersWHERE customer_id = 12345; -- Large object column impact-- Table: documents (id, title, file_content BLOB) -- SLOW: Forces BLOB loadingSELECT * FROM documents WHERE id = 100; -- FAST: Avoids BLOBSELECT id, title FROM documents WHERE id = 100; -- When you need the BLOB, get just that rowSELECT file_content FROM documents WHERE id = 100;Periodically audit production queries for SELECT * usage. Replace with explicit column lists. This simple change can significantly reduce database load, network traffic, and application memory usage—especially for frequently-executed queries.
| Factor | SELECT * | Explicit Columns | Improvement |
|---|---|---|---|
| Data transfer | All columns (often KB/row) | Needed columns only | Often 50-90% reduction |
| Memory usage | High (entire rows cached) | Low (subset cached) | Proportional to column count |
| Index usage | Rarely covering | Often covering | Orders of magnitude faster |
| Schema coupling | Breaks on column add | Stable | Fewer production incidents |
| Readability | Unclear intent | Self-documenting | Easier maintenance |
Column selection is a fundamental skill that directly impacts query correctness, performance, and maintainability. The precision with which you specify columns distinguishes professional SQL from amateur attempts.
Let's consolidate the key points:
What's next:
We'll examine the asterisk (*) operator for selecting all columns—when it's appropriate, when to avoid it, and the nuances of its behavior in different contexts including joins and subqueries.
You now understand column selection comprehensively—from basic references through qualification, ordinals, expressions, aliases, and performance implications. This knowledge enables you to write precise, efficient, and maintainable SELECT statements.