Loading content...
The employee-manager relationship is the quintessential example of self-referential modeling. Nearly every organization has reporting structures, and understanding how to model, implement, and query these structures is foundational for any database professional.
This page brings together everything we've learned about self-referential relationships through a complete, realistic example. We'll start with requirements analysis, proceed through conceptual modeling, implement in SQL, and demonstrate essential query patterns. By the end, you'll have a template you can adapt to any hierarchical data structure.
We'll use a fictional technology company, TechCorp, with a typical organizational structure: a CEO at the top, VPs reporting to the CEO, directors reporting to VPs, managers reporting to directors, and individual contributors reporting to managers.
By the end of this page, you will have worked through a complete employee-manager implementation including requirements analysis, ER modeling with proper role names, multiple SQL implementations (adjacency list and closure table), comprehensive query patterns for common HR scenarios, and practical insights from production-grade systems.
Before modeling, we must understand the business requirements. TechCorp's HR department needs to track the reporting structure with the following requirements:
Functional Requirements:
Business Rules:
Data Volume Estimates:
Given the moderate size, stable structure, and read-heavy workload, both adjacency list and closure table are viable. We'll implement both to compare.
Notice how we documented query patterns before choosing an implementation. The requirement for 'full subtree' and 'path to CEO' queries suggests we need efficient recursive traversal. The read-heavy workload makes the additional storage of a closure table acceptable.
The conceptual model captures the recursive relationship with proper role names:
Entity: EMPLOYEE
Attributes include employee_id (PK), name, email, department, hire_date, and salary.
Entity-Relationship Model for TechCorp ENTITY: EMPLOYEE├── employee_id (PK)├── name├── email├── department├── hire_date└── salary RELATIONSHIP: REPORTS_TO├── Type: Recursive (unary) on EMPLOYEE├── Cardinality: 1:N│ └── One manager has many direct reports│ └── One employee has at most one manager├── Participation:│ └── Manager role: Partial (not all employees manage)│ └── Report role: Partial (CEO has no manager)└── Roles: └── MANAGER (parent role) └── DIRECT_REPORT (child role) ER Diagram (Chen Notation): ┌────────────────────────────┐ │ EMPLOYEE │ │ ────────────────────── │ │ PK: employee_id │ │ name │ │ email │ │ department │ │ hire_date │ │ salary │ └────────────┬───────────────┘ ╱ │ ╲ ╱ │ ╲ (manager) │ (direct_report) ╲ │ ╱ ╲ │ ╱ ┌────────────────┐ │◇ REPORTS_TO ◇ │ │ 1:N │ └────────────────┘Key Modeling Decisions:
Role names: 'manager' and 'direct_report' clearly indicate the hierarchical relationship
Cardinality 1:N: Each direct report has exactly one manager, but each manager can have many direct reports
Partial participation on both roles: Not every employee is a manager (leaves), and one employee has no manager (root)
Relationship attributes: None needed for this basic hierarchy (in complex scenarios, you might track 'start_date' of the reporting relationship)
Common alternatives: 'SUPERVISES', 'MANAGES', 'OVERSEES'. We chose 'REPORTS_TO' because it reads naturally from the child perspective ('Alice REPORTS_TO Bob'). Either perspective works—just be consistent in documentation.
The adjacency list implementation is straightforward—add a self-referencing foreign key:
Schema:
12345678910111213141516171819202122232425262728293031323334353637
-- TechCorp Employee Hierarchy - Adjacency List ModelCREATE TABLE Employee ( employee_id INT PRIMARY KEY AUTO_INCREMENT, name VARCHAR(100) NOT NULL, email VARCHAR(150) NOT NULL UNIQUE, department VARCHAR(50) NOT NULL, hire_date DATE NOT NULL, salary DECIMAL(12, 2) NOT NULL, -- Self-referencing FK: points to manager -- NULL means this is the CEO (root node) manager_id INT, -- FK constraint: manager must be a valid employee CONSTRAINT fk_manager FOREIGN KEY (manager_id) REFERENCES Employee(employee_id) ON DELETE SET NULL, -- If manager leaves, reports become unassigned -- Prevent self-management CONSTRAINT chk_no_self_manage CHECK (employee_id != manager_id)); -- Index for efficient manager lookupsCREATE INDEX idx_employee_manager ON Employee(manager_id); -- View for easy hierarchy queryingCREATE VIEW Employee_Hierarchy ASSELECT e.employee_id, e.name AS employee_name, e.department, m.employee_id AS manager_id, m.name AS manager_nameFROM Employee eLEFT JOIN Employee m ON e.manager_id = m.employee_id;Sample Data:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263
-- TechCorp Organizational Structure -- Level 0: CEO (root)INSERT INTO Employee VALUES (1, 'Sarah Chen', 'sarah.chen@techcorp.com', 'Executive', '2010-01-15', 450000, NULL); -- Level 1: C-Suite (report to CEO)INSERT INTO Employee VALUES (2, 'Michael Rodriguez', 'm.rodriguez@techcorp.com', 'Engineering', '2012-03-20', 350000, 1),(3, 'Jennifer Park', 'j.park@techcorp.com', 'Sales', '2013-06-10', 320000, 1),(4, 'David Kim', 'd.kim@techcorp.com', 'Operations', '2014-02-28', 300000, 1); -- Level 2: VPs (report to C-Suite)INSERT INTO Employee VALUES (5, 'Emily Watson', 'e.watson@techcorp.com', 'Engineering', '2015-04-15', 220000, 2),(6, 'James Liu', 'j.liu@techcorp.com', 'Engineering', '2015-07-22', 215000, 2),(7, 'Amanda Foster', 'a.foster@techcorp.com', 'Sales', '2016-01-10', 200000, 3); -- Level 3: Directors (report to VPs)INSERT INTO Employee VALUES (8, 'Robert Martinez', 'r.martinez@techcorp.com', 'Engineering', '2017-03-01', 160000, 5),(9, 'Lisa Thompson', 'l.thompson@techcorp.com', 'Engineering', '2017-05-15', 155000, 5),(10, 'Christopher Brown', 'c.brown@techcorp.com', 'Engineering', '2017-08-20', 150000, 6); -- Level 4: Managers (report to Directors)INSERT INTO Employee VALUES (11, 'Michelle Davis', 'm.davis@techcorp.com', 'Engineering', '2018-02-10', 120000, 8),(12, 'Daniel Wilson', 'd.wilson@techcorp.com', 'Engineering', '2018-04-25', 118000, 8),(13, 'Jessica Garcia', 'j.garcia@techcorp.com', 'Engineering', '2018-06-30', 115000, 9); -- Level 5: Senior Engineers (report to Managers)INSERT INTO Employee VALUES (14, 'Andrew Taylor', 'a.taylor@techcorp.com', 'Engineering', '2019-01-15', 95000, 11),(15, 'Nicole Anderson', 'n.anderson@techcorp.com', 'Engineering', '2019-03-20', 92000, 11),(16, 'Kevin White', 'k.white@techcorp.com', 'Engineering', '2019-05-10', 90000, 12); -- Level 6: Junior Engineers (report to Senior Engineers)INSERT INTO Employee VALUES (17, 'Rachel Green', 'r.green@techcorp.com', 'Engineering', '2020-07-01', 75000, 14),(18, 'Steven Black', 's.black@techcorp.com', 'Engineering', '2020-09-15', 72000, 14); /*Organizational Tree: Sarah Chen (CEO)├── Michael Rodriguez (CTO)│ ├── Emily Watson (VP Engineering - Platform)│ │ ├── Robert Martinez (Director)│ │ │ ├── Michelle Davis (Manager)│ │ │ │ ├── Andrew Taylor (Sr Eng)│ │ │ │ │ ├── Rachel Green (Jr Eng)│ │ │ │ │ └── Steven Black (Jr Eng)│ │ │ │ └── Nicole Anderson (Sr Eng)│ │ │ └── Daniel Wilson (Manager)│ │ │ └── Kevin White (Sr Eng)│ │ └── Lisa Thompson (Director)│ │ └── Jessica Garcia (Manager)│ └── James Liu (VP Engineering - Applications)│ └── Christopher Brown (Director)├── Jennifer Park (CRO)│ └── Amanda Foster (VP Sales)└── David Kim (COO)*/Notice we inserted from top to bottom (executives first). This ensures manager_id foreign keys are valid when lower-level employees are inserted. For bulk loading, you might disable FK checks temporarily, then validate after loading.
These queries address the requirements we identified. Each demonstrates a fundamental hierarchy operation:
Basic Navigation Queries:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546
-- Query 1: Get direct reports of a specific manager-- "Who reports directly to Emily Watson (id=5)?"SELECT employee_id, name, department, salaryFROM EmployeeWHERE manager_id = 5ORDER BY name; -- Result: Robert Martinez, Lisa Thompson -- Query 2: Get the manager of a specific employee-- "Who is Andrew Taylor's (id=14) manager?"SELECT m.employee_id, m.name, m.departmentFROM Employee eJOIN Employee m ON e.manager_id = m.employee_idWHERE e.employee_id = 14; -- Result: Michelle Davis -- Query 3: Find employees with no direct reports (leaf nodes)-- "List all individual contributors (non-managers)"SELECT e.employee_id, e.name, e.departmentFROM Employee eLEFT JOIN Employee reports ON reports.manager_id = e.employee_idWHERE reports.employee_id IS NULL; -- Result: All level 5-6 employees (no one reports to them) -- Query 4: Find the CEO (root node)-- "Who is the top of the organization?"SELECT employee_id, name, departmentFROM EmployeeWHERE manager_id IS NULL; -- Result: Sarah Chen -- Query 5: Count direct reports per manager-- "How many people report directly to each manager?"SELECT m.employee_id, m.name AS manager_name, COUNT(e.employee_id) AS direct_report_countFROM Employee mLEFT JOIN Employee e ON e.manager_id = m.employee_idGROUP BY m.employee_id, m.nameHAVING COUNT(e.employee_id) > 0ORDER BY direct_report_count DESC;Recursive Traversal Queries:
These require recursive CTEs to navigate the full hierarchy:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990
-- Query 6: Get entire organization under a manager (all descendants)-- "List everyone in Emily Watson's organization"WITH RECURSIVE Org AS ( -- Base case: start with the manager SELECT employee_id, name, manager_id, 0 AS level FROM Employee WHERE employee_id = 5 -- Emily Watson UNION ALL -- Recursive case: get their reports, and reports' reports, etc. SELECT e.employee_id, e.name, e.manager_id, o.level + 1 FROM Employee e JOIN Org o ON e.manager_id = o.employee_id)SELECT * FROM Org ORDER BY level, name; /*Result:employee_id | name | level------------|-------------------|------5 | Emily Watson | 0 (the starting manager)8 | Robert Martinez | 19 | Lisa Thompson | 111 | Michelle Davis | 212 | Daniel Wilson | 213 | Jessica Garcia | 214 | Andrew Taylor | 315 | Nicole Anderson | 316 | Kevin White | 317 | Rachel Green | 418 | Steven Black | 4*/ -- Query 7: Get management chain to CEO (all ancestors)-- "Show the full reporting chain for Rachel Green"WITH RECURSIVE Chain AS ( -- Base case: start with the employee SELECT employee_id, name, manager_id, 0 AS distance FROM Employee WHERE employee_id = 17 -- Rachel Green UNION ALL -- Recursive case: follow manager_id up SELECT e.employee_id, e.name, e.manager_id, c.distance + 1 FROM Employee e JOIN Chain c ON c.manager_id = e.employee_id)SELECT * FROM Chain ORDER BY distance; /*Result (bottom to top):employee_id | name | distance------------|-------------------|--------17 | Rachel Green | 0 (starting point)14 | Andrew Taylor | 1 (manager)11 | Michelle Davis | 28 | Robert Martinez | 35 | Emily Watson | 42 | Michael Rodriguez | 51 | Sarah Chen | 6 (CEO)*/ -- Query 8: Calculate organizational depth for all employeesWITH RECURSIVE Depths AS ( SELECT employee_id, name, manager_id, 0 AS depth FROM Employee WHERE manager_id IS NULL -- Start from CEO UNION ALL SELECT e.employee_id, e.name, e.manager_id, d.depth + 1 FROM Employee e JOIN Depths d ON e.manager_id = d.employee_id)SELECT * FROM Depths ORDER BY depth, name; -- Query 9: Find all employees at a specific organizational levelWITH RECURSIVE Levels AS ( SELECT employee_id, name, manager_id, 0 AS level FROM Employee WHERE manager_id IS NULL UNION ALL SELECT e.employee_id, e.name, e.manager_id, l.level + 1 FROM Employee e JOIN Levels l ON e.manager_id = l.employee_id)SELECT * FROM Levels WHERE level = 3; -- Directors levelAll recursive CTEs follow the same pattern: (1) Base case SELECT that starts the recursion, (2) UNION ALL, (3) Recursive case that joins to the CTE itself. The recursion terminates when the recursive SELECT returns no rows.
Beyond navigation, organizational hierarchies enable powerful business analytics:
Compensation and Span Analysis:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485868788
-- Query 10: Total compensation under each manager-- "What's the total salary expense in each org?"WITH RECURSIVE OrgCost AS ( SELECT employee_id, name, salary, employee_id AS org_root FROM Employee WHERE employee_id IN (5, 6, 7) -- VPs UNION ALL SELECT e.employee_id, e.name, e.salary, oc.org_root FROM Employee e JOIN OrgCost oc ON e.manager_id = oc.employee_id)SELECT org_root, (SELECT name FROM Employee WHERE employee_id = org_root) AS vp_name, COUNT(*) AS org_size, SUM(salary) AS total_salary, AVG(salary) AS avg_salaryFROM OrgCostGROUP BY org_rootORDER BY total_salary DESC; -- Query 11: Identify managers with many direct reports (span of control)SELECT m.employee_id, m.name, COUNT(e.employee_id) AS direct_reportsFROM Employee mJOIN Employee e ON e.manager_id = m.employee_idGROUP BY m.employee_id, m.nameHAVING COUNT(e.employee_id) > 3 -- ThresholdORDER BY direct_reports DESC; -- Query 12: Find longest reporting chainWITH RECURSIVE Chains AS ( SELECT employee_id, name, manager_id, 1 AS chain_length, CAST(name AS VARCHAR(1000)) AS path FROM Employee WHERE manager_id IS NULL UNION ALL SELECT e.employee_id, e.name, e.manager_id, c.chain_length + 1, c.path || ' > ' || e.name FROM Employee e JOIN Chains c ON e.manager_id = c.employee_id)SELECT * FROM Chains ORDER BY chain_length DESCLIMIT 1; -- Query 13: Compare employee salary to manager's salarySELECT e.name AS employee, e.salary AS employee_salary, m.name AS manager, m.salary AS manager_salary, m.salary - e.salary AS salary_gap, ROUND((e.salary / m.salary) * 100, 1) AS pct_of_managerFROM Employee eJOIN Employee m ON e.manager_id = m.employee_idWHERE e.salary > m.salary * 0.9 -- Employees earning >90% of managerORDER BY pct_of_manager DESC; -- Query 14: Find common manager (lowest common ancestor)-- "Who is the lowest-level manager overseeing both Rachel and Kevin?"WITH RECURSIVE RachelChain AS ( SELECT employee_id, manager_id, 0 AS dist FROM Employee WHERE employee_id = 17 UNION ALL SELECT e.employee_id, e.manager_id, r.dist + 1 FROM Employee e JOIN RachelChain r ON r.manager_id = e.employee_id),KevinChain AS ( SELECT employee_id, manager_id, 0 AS dist FROM Employee WHERE employee_id = 16 UNION ALL SELECT e.employee_id, e.manager_id, k.dist + 1 FROM Employee e JOIN KevinChain k ON k.manager_id = e.employee_id)SELECT r.employee_id, (SELECT name FROM Employee WHERE employee_id = r.employee_id)FROM RachelChain rJOIN KevinChain k ON r.employee_id = k.employee_idORDER BY r.dist + k.dist -- Closest common ancestorLIMIT 1; -- Result: Robert Martinez (id=8) - lowest manager over both engineersRecursive CTEs recompute on every execution. For frequently-run analytics, consider materializing the hierarchy periodically into a flattened table or using a closure table. The choice depends on update frequency vs query frequency.
For read-heavy hierarchies, a closure table pre-computes all ancestor-descendant relationships:
Schema:
12345678910111213141516171819202122232425262728293031
-- Closure Table: Pre-computed hierarchy pathsCREATE TABLE Employee_Closure ( ancestor_id INT NOT NULL, descendant_id INT NOT NULL, depth INT NOT NULL, -- Distance between nodes PRIMARY KEY (ancestor_id, descendant_id), FOREIGN KEY (ancestor_id) REFERENCES Employee(employee_id) ON DELETE CASCADE, FOREIGN KEY (descendant_id) REFERENCES Employee(employee_id) ON DELETE CASCADE); -- Indexes for fast queries in both directionsCREATE INDEX idx_closure_ancestor ON Employee_Closure(ancestor_id, depth);CREATE INDEX idx_closure_descendant ON Employee_Closure(descendant_id, depth); -- Populate closure table from existing adjacency listWITH RECURSIVE Paths AS ( -- Self-references (depth = 0) SELECT employee_id AS ancestor_id, employee_id AS descendant_id, 0 AS depth FROM Employee UNION ALL -- Extend paths SELECT p.ancestor_id, e.employee_id, p.depth + 1 FROM Paths p JOIN Employee e ON e.manager_id = p.descendant_id)INSERT INTO Employee_ClosureSELECT * FROM Paths;Simplified Queries with Closure Table:
With the closure table, recursive CTEs become unnecessary:
12345678910111213141516171819202122232425262728293031323334
-- Get entire organization under Emily Watson (id=5) - NO RECURSION!SELECT e.*, ec.depthFROM Employee eJOIN Employee_Closure ec ON e.employee_id = ec.descendant_idWHERE ec.ancestor_id = 5 AND ec.depth > 0 -- Exclude selfORDER BY ec.depth, e.name; -- Get management chain for Rachel Green (id=17) - NO RECURSION!SELECT e.*, ec.depthFROM Employee eJOIN Employee_Closure ec ON e.employee_id = ec.ancestor_idWHERE ec.descendant_id = 17 AND ec.depth > 0 -- Exclude selfORDER BY ec.depth DESC; -- Check if someone is in someone else's org - SIMPLE!SELECT COUNT(*) > 0 AS is_subordinateFROM Employee_ClosureWHERE ancestor_id = 5 AND descendant_id = 17; -- Is Rachel under Emily? -- Get direct reports only (depth = 1)SELECT e.*FROM Employee eJOIN Employee_Closure ec ON e.employee_id = ec.descendant_idWHERE ec.ancestor_id = 5 AND ec.depth = 1; -- Count total org size per manager - FAST!SELECT e.name, (SELECT COUNT(*) - 1 FROM Employee_Closure WHERE ancestor_id = e.employee_id) AS org_sizeFROM Employee eWHERE EXISTS ( SELECT 1 FROM Employee_Closure WHERE ancestor_id = e.employee_id AND depth = 1);Maintaining the Closure Table:
When the hierarchy changes, the closure table must be updated:
1234567891011121314151617181920212223
-- Adding a new employee (new hire under Michelle Davis, id=11)INSERT INTO Employee VALUES (19, 'New Hire', 'new@techcorp.com', 'Engineering', '2024-01-15', 70000, 11); -- Add closure entries: copy all ancestors' closure + selfINSERT INTO Employee_Closure (ancestor_id, descendant_id, depth)SELECT ec.ancestor_id, 19, ec.depth + 1FROM Employee_Closure ecWHERE ec.descendant_id = 11 -- Parent's closure entriesUNION ALLSELECT 19, 19, 0; -- Self-reference -- Removing an employee and their subtree-- First, delete all closure entries involving the subtreeDELETE FROM Employee_ClosureWHERE descendant_id IN ( SELECT descendant_id FROM Employee_Closure WHERE ancestor_id = 19); -- Moving an employee to a new manager (complex operation)-- 1. Remove old subtree closure entries-- 2. Recompute new paths from new parent-- Often done via DELETE + CASCADE + re-insertThe closure table trades storage (O(n × avg_depth) rows) for query simplicity. For TechCorp's 500 employees with avg depth 5, that's ~2,500 closure rows vs 18 employees. Worth it for read-heavy systems, but MUST be kept in sync with the main table.
Moving from an academic example to production requires additional considerations:
Data Integrity:
1234567891011121314151617181920212223242526272829303132333435363738
-- Trigger to prevent cycles on manager updatesCREATE OR REPLACE FUNCTION prevent_management_cycle()RETURNS TRIGGER AS $$DECLARE current_id INT;BEGIN -- Walk up from new manager to see if we reach the employee current_id := NEW.manager_id; WHILE current_id IS NOT NULL LOOP IF current_id = NEW.employee_id THEN RAISE EXCEPTION 'Cycle detected: Employee cannot be their own ancestor'; END IF; SELECT manager_id INTO current_id FROM Employee WHERE employee_id = current_id; END LOOP; RETURN NEW;END;$$ LANGUAGE plpgsql; CREATE TRIGGER trg_prevent_cycleBEFORE INSERT OR UPDATE OF manager_id ON EmployeeFOR EACH ROWWHEN (NEW.manager_id IS NOT NULL)EXECUTE FUNCTION prevent_management_cycle(); -- Ensure exactly one CEOCREATE OR REPLACE FUNCTION enforce_single_ceo()RETURNS TRIGGER AS $$BEGIN IF NEW.manager_id IS NULL THEN IF EXISTS (SELECT 1 FROM Employee WHERE manager_id IS NULL AND employee_id != NEW.employee_id) THEN RAISE EXCEPTION 'Only one employee can have no manager (CEO)'; END IF; END IF; RETURN NEW;END;$$ LANGUAGE plpgsql;Historical Tracking:
Production systems often need to track how the hierarchy changes over time:
123456789101112131415161718192021222324
-- Track historical reporting relationshipsCREATE TABLE Reporting_History ( history_id INT PRIMARY KEY AUTO_INCREMENT, employee_id INT NOT NULL REFERENCES Employee(employee_id), manager_id INT REFERENCES Employee(employee_id), start_date DATE NOT NULL, end_date DATE, -- NULL = current relationship change_reason VARCHAR(100) -- 'Hire', 'Reorg', 'Promotion', etc.); -- Query: Who was Rachel reporting to on 2023-06-15?SELECT m.name AS managerFROM Reporting_History rhJOIN Employee m ON rh.manager_id = m.employee_idWHERE rh.employee_id = 17 AND rh.start_date <= '2023-06-15' AND (rh.end_date IS NULL OR rh.end_date > '2023-06-15'); -- Query: All managers Rachel has ever reported toSELECT DISTINCT m.name, rh.start_date, rh.end_dateFROM Reporting_History rhJOIN Employee m ON rh.manager_id = m.employee_idWHERE rh.employee_id = 17ORDER BY rh.start_date;Many modern databases support temporal tables (system-versioned tables) that automatically track history. This eliminates the need for manual history tables and triggers. Check if your database (PostgreSQL, SQL Server, MariaDB) supports this feature.
We've completed a comprehensive, end-to-end treatment of the employee-manager hierarchy. Let's consolidate the key learnings:
Module Completion:
With this example, we've covered the complete arc of self-referential relationships:
This knowledge equips you to model, implement, and query any self-referential data structure—from organizational charts to category trees, from prerequisite graphs to social networks.
Congratulations! You've mastered self-referential relationships—one of the most powerful and frequently used advanced ER constructs. You can now model hierarchies and networks within a single entity type, choose appropriate implementation strategies, and write efficient queries for these structures. This knowledge applies to countless real-world database design scenarios.