Loading content...
Every set operation in relational algebra—union, intersection, and difference—requires a critical precondition: union compatibility. Without it, these operations are undefined, because the resulting relation would have no coherent schema.
Union compatibility ensures that when we combine tuples from two relations, the result forms a valid relation with a well-defined structure. It is the gatekeeper that preserves the integrity of the relational model.
This page provides exhaustive coverage of union compatibility: the formal requirements, how database systems evaluate compatibility, common mistakes, and practical techniques for making incompatible relations compatible through projection and renaming.
By the end of this page, you will master the formal definition of union compatibility, understand the difference between structural and semantic compatibility, know how database systems resolve schema differences, learn techniques to achieve compatibility between mismatched relations, and recognize common compatibility errors and how to avoid them.
Union compatibility (also called type compatibility or schema compatibility) is a relationship between two relations that permits set operations. It is a necessary condition—without it, union, intersection, and difference are undefined.
Formal Definition:
Two relations R(A₁, A₂, ..., Aₙ) and S(B₁, B₂, ..., Bₘ) are union-compatible if and only if:
Same Degree (Arity): n = m
Domain Compatibility: For each i from 1 to n:
Notation:
We write R ≈ S to denote that R and S are union-compatible.
If R ≈ S, then:
In pure relational algebra, union compatibility is based on position, not names. The first attribute of R is compared with the first attribute of S, regardless of what they're called. However, in practice (especially SQL), having meaningful, matching names improves query clarity and some systems offer name-based compatibility modes.
| Requirement | Definition | Example |
|---|---|---|
| Same Degree | Both relations have equal number of attributes | R(A,B,C) and S(X,Y,Z) both have degree 3 ✓ |
| Domain Compatibility | For each position i, dom(Rᵢ) ≈ dom(Sᵢ) | INT compatible with INT, VARCHAR with VARCHAR |
| Position-Based | Comparison by attribute position, not name | R.name (pos 1) vs S.product_name (pos 1) |
What Union Compatibility Does NOT Require:
Result Schema:
When R ∪ S (or R ∩ S, R − S) is computed, the result schema is typically determined by:
Domain compatibility is more nuanced than simple type equality. Database systems typically allow compatible types that can be meaningfully compared, even if not identical.
Compatible Type Categories:
Domains are generally compatible if they belong to the same type family and values can be compared without loss of information:
| Type Family | Compatible Types | Notes |
|---|---|---|
| Integer Numeric | TINYINT, SMALLINT, INT, BIGINT | Smaller types widen to larger |
| Decimal Numeric | FLOAT, DOUBLE, DECIMAL, NUMERIC | Precision may vary |
| Fixed String | CHAR(n) with various n | Padding differences handled |
| Variable String | VARCHAR(n) with various n | Length limits may widen |
| Date/Time | DATE, TIME, DATETIME, TIMESTAMP | Usually compatible within family |
| Boolean | BOOLEAN, BIT, TINYINT(1) | Vendor-specific variations |
Domain Resolution (Widening):
When compatible domains differ, the result domain is typically the "wider" type that can accommodate values from both:
INT + BIGINT → BIGINT (larger integer type)
VARCHAR(50) + VARCHAR(100) → VARCHAR(100) (larger capacity)
FLOAT + DOUBLE → DOUBLE (higher precision)
DECIMAL(10,2) + DECIMAL(8,4) → DECIMAL(12,4) (covers both)
Implicit Conversion:
Some databases perform implicit conversion for set operations:
-- This might work (integer to string conversion)
SELECT id, name FROM Table1 -- id is INT
UNION
SELECT product_code, name FROM Table2 -- product_code is VARCHAR
However, relying on implicit conversion is risky and not recommended. Explicit casting ensures clarity and portability.
While some databases allow implicit conversion between types (e.g., INT to VARCHAR), this can lead to subtle bugs. The same integer might become different strings depending on leading zeros, localization settings, or padding. Always use explicit CAST when type conversion is needed.
A critical distinction that many database learners overlook is the difference between structural compatibility (what the database checks) and semantic compatibility (what makes business sense).
Structural Compatibility:
Semantic Compatibility:
A database will happily union 'customer_age' with 'product_price' if both are INTEGER. The query runs without error, but the result is garbage. The database checks syntax; YOU must check semantics.
Examples of Semantic Mismatches:
-- Structurally compatible, semantically WRONG:
-- Example 1: Different entities
SELECT id, name FROM Customers
UNION
SELECT id, title FROM Products; -- Meaningless mix!
-- Example 2: Different units
SELECT product_id, price_usd FROM US_Store
UNION
SELECT product_id, price_eur FROM EU_Store; -- Currency mismatch!
-- Example 3: Different time zones
SELECT event_id, timestamp_utc FROM US_Events
UNION
SELECT event_id, timestamp_local FROM EU_Events; -- Time zone chaos!
-- Example 4: Different scales
SELECT account_id, balance_cents FROM SystemA
UNION
SELECT account_id, balance_dollars FROM SystemB; -- 100x error!
In each case, the query executes successfully but produces nonsensical or dangerous results.
Before executing any set operation, verbalize what you're combining: 'I'm unioning customer records from last year with customer records from this year.' If the statement sounds coherent, proceed. If it sounds like 'I'm unioning customers with products,' stop and reconsider.
When relations are not naturally union-compatible, we can often achieve compatibility through the strategic use of projection (π) and renaming (ρ). These operations allow us to reshape relations for set operations.
Strategy 1: Projection to Shared Schema
When relations have different degrees, project both to a common subset of compatible attributes:
-- Original incompatible relations:
Customers(cust_id, name, email, phone, address)
Suppliers(supp_id, company_name, contact_email)
-- Project to common compatible schema:
π_{cust_id, name}(Customers) ∪ π_{supp_id, company_name}(Suppliers)
-- Now both have degree 2 with compatible domains
Strategy 2: Renaming for Clarity
While attribute names don't affect compatibility, renaming improves clarity when combining relations:
-- Rename for unified result schema:
ρ_{id, name, email}(π_{cust_id, name, email}(Customers))
∪
ρ_{id, name, email}(π_{supp_id, company_name, contact_email}(Suppliers))
-- Result has clean, unified attribute names: (id, name, email)
Strategy 3: Adding Constant Columns
When one relation is missing an attribute present in another, use constants:
-- SQL: Add source indicator
SELECT cust_id, name, email, 'Customer' AS source_type
FROM Customers
UNION
SELECT supp_id, company_name, contact_email, 'Supplier' AS source_type
FROM Suppliers;
123456789101112131415161718192021222324252627282930313233
-- Problem: Relations with different structures-- Customers(id INT, name VARCHAR, email VARCHAR, loyalty_points INT)-- Prospects(lead_id INT, full_name VARCHAR, contact_email VARCHAR) -- Solution 1: Project to common columnsSELECT id, name, email FROM CustomersUNIONSELECT lead_id, full_name, contact_email FROM Prospects; -- Solution 2: Add NULL padding for missing columnsSELECT id, name, email, loyalty_points, 'customer' AS typeFROM CustomersUNIONSELECT lead_id, full_name, contact_email, NULL, 'prospect' AS typeFROM Prospects; -- Solution 3: Use CAST for type compatibility issuesSELECT id, CAST(product_code AS VARCHAR(50)), name FROM Table1UNIONSELECT id, sku, name FROM Table2; -- Solution 4: Reorder columns for positional matching-- If Table1 has (A, B, C) and Table2 has (C, A, B)SELECT A, B, C FROM Table1UNIONSELECT A, B, C FROM Table2; -- Explicitly order to match -- Solution 5: Handle NULL-heavy columns with COALESCESELECT id, name, COALESCE(email, 'no-email') AS emailFROM CustomersUNIONSELECT id, name, COALESCE(contact, 'no-contact') AS emailFROM Vendors;When projecting for union compatibility, ensure columns are listed in the corresponding order. π_{A,B,C}(R) ∪ π_{X,Y,Z}(S) matches A with X, B with Y, C with Z. Mismatched order with mismatched semantics leads to nonsensical results.
Understanding common union compatibility errors helps you diagnose issues quickly and write correct queries from the start. Here are the most frequent problems and their solutions:
| Error | Cause | Solution |
|---|---|---|
| Different column count | SELECT * from tables with different schemas | Use explicit column lists with same count |
| Type mismatch | Corresponding columns have incompatible types | Use CAST/CONVERT to unify types |
| Column order mismatch | Attributes listed in different order | Reorder SELECT columns consistently |
| Implicit vs explicit NULL | One query has NULLs where other has values | Use COALESCE or add NULL constants |
| Missing constants | Need discriminator column | Add literal values in SELECT |
| Subquery degree error | Subquery returns wrong column count | Verify subquery schema matches |
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263
-- ERROR 1: Different column count-- Bad:SELECT * FROM Customers -- 5 columnsUNIONSELECT * FROM SimpleContacts; -- 3 columns-- Error: Number of columns must match -- Fix:SELECT id, name, email FROM CustomersUNIONSELECT contact_id, contact_name, email FROM SimpleContacts; -- ERROR 2: Type mismatch-- Bad:SELECT order_id, order_date FROM Orders -- order_id is INTUNIONSELECT invoice_no, invoice_date FROM Invoices; -- invoice_no is VARCHAR-- Error (in strict mode): Type mismatch for column 1 -- Fix:SELECT CAST(order_id AS VARCHAR(20)), order_date FROM OrdersUNIONSELECT invoice_no, invoice_date FROM Invoices; -- ERROR 3: Column order mismatch (semantic error, not syntax)-- Bad (runs but wrong):SELECT name, email, id FROM TableAUNIONSELECT id, name, email FROM TableB;-- Result mixes name with id, email with name! -- Fix: Ensure consistent column orderSELECT id, name, email FROM TableAUNIONSELECT id, name, email FROM TableB; -- ERROR 4: Aggregate mismatch-- Bad:SELECT customer_id, SUM(amount) FROM Orders GROUP BY customer_idUNIONSELECT customer_id, amount FROM SinglePurchases; -- amount vs SUM-- Might run but semantics are wrong (aggregate vs detail) -- Fix: Either aggregate both or detail bothSELECT customer_id, SUM(amount) FROM Orders GROUP BY customer_idUNIONSELECT customer_id, SUM(amount) FROM SinglePurchases GROUP BY customer_id; -- ERROR 5: UNION in wrong position-- Bad:SELECT a FROM t1 ORDER BY aUNIONSELECT a FROM t2; -- ORDER BY before UNION is often an error -- Fix: ORDER BY after all UNION operationsSELECT a FROM t1UNIONSELECT a FROM t2ORDER BY a;When union compatibility errors occur: 1) Run each SELECT independently to verify it works, 2) Compare column counts (SELECT NULL, NULL, ... to match), 3) Check data types of each position, 4) Use explicit CASTs if types differ. Most IDEs will show column types on hover.
Different database systems have varying levels of strictness and behavior regarding union compatibility. Understanding these differences is essential for writing portable SQL and diagnosing cross-platform issues.
| Database | Type Strictness | Implicit Conversion | Name Resolution |
|---|---|---|---|
| PostgreSQL | Strict | Limited | First query names used |
| MySQL | Lenient | Extensive | First query names used |
| SQL Server | Moderate | Common types | First query names used |
| Oracle | Moderate | Common types | First query names used |
| SQLite | Very lenient | Dynamic typing | First query names used |
PostgreSQL (Strictest):
-- PostgreSQL: This fails
SELECT 1 UNION SELECT 'a'; -- ERROR: types integer and text cannot be matched
-- Fix requires explicit cast
SELECT CAST(1 AS TEXT) UNION SELECT 'a';
MySQL (Most Lenient):
-- MySQL: This works (but might not be what you want)
SELECT 1 UNION SELECT 'abc'; -- Returns 1 and 'abc' as strings
SQLite (Dynamic Typing):
-- SQLite: Very permissive
SELECT 123 UNION SELECT 'hello' UNION SELECT 45.67;
For maximum portability, write for the strictest system (PostgreSQL). Use explicit CASTs, match types exactly, and specify column lists explicitly. Code that works on PostgreSQL will work everywhere; code that only works on MySQL may fail on stricter systems.
Beyond basic projection and casting, several advanced patterns help achieve compatibility in complex scenarios.
Pattern 1: Synthetic Key Generation
When combining relations with incompatible keys, generate a synthetic unified key:
-- Create unified key with source prefix
SELECT CONCAT('CUS_', CAST(customer_id AS VARCHAR(20))) AS unified_id,
name, email, 'Customer' AS source
FROM Customers
UNION
SELECT CONCAT('SUP_', CAST(supplier_id AS VARCHAR(20))) AS unified_id,
company_name, contact_email, 'Supplier' AS source
FROM Suppliers;
Pattern 2: Polymorphic Schema with NULLs
Create a "supertype" schema that accommodates attributes from all subtypes:
-- Union of different entity types with type-specific columns
SELECT 'Employee' AS entity_type,
emp_id AS id,
name,
salary,
NULL AS hourly_rate,
NULL AS project_code
FROM Employees
UNION ALL
SELECT 'Contractor' AS entity_type,
contractor_id,
name,
NULL,
hourly_rate,
project_code
FROM Contractors;
Pattern 3: Temporal Schema Alignment
When unioning data from different time periods with schema evolution:
-- Old schema: (id, name, address)
-- New schema: (id, first_name, last_name, street, city, zip)
SELECT id,
name AS full_name,
address AS full_address,
NULL AS street,
NULL AS city,
NULL AS zip,
'legacy' AS schema_version
FROM Customers_Archive
UNION ALL
SELECT id,
CONCAT(first_name, ' ', last_name),
CONCAT(street, ', ', city, ' ', zip),
street,
city,
zip,
'current' AS schema_version
FROM Customers_Current;
Pattern 4: JSON/XML Expansion
For semi-structured data, expand to compatible relations:
-- Expand JSON customer field to match relational schema
SELECT customer_data->>'id' AS id,
customer_data->>'name' AS name,
customer_data->>'email' AS email
FROM JSON_Customers
UNION
SELECT CAST(id AS TEXT), name, email
FROM Relational_Customers;
In long-lived systems, schemas evolve over time. Archived data may have different structures than current data. Union compatibility patterns for temporal alignment are essential for historical analysis and data migration projects.
We've completed a thorough exploration of union compatibility—the essential prerequisite for all relational set operations. Let's consolidate the key knowledge:
You now have comprehensive understanding of union compatibility in relational algebra and SQL. You can diagnose compatibility errors, reshape relations for set operations, and understand the nuances across database systems. Next, we'll explore the rules governing all set operations, including operator precedence, algebraic properties, and query optimization opportunities.