Identifying 1NF violations is essential, but incomplete. The real value comes from systematically transforming non-compliant tables into proper First Normal Form. This transformation requires careful planning—you must preserve all data, maintain relationships, update dependent applications, and minimize downtime.
This page provides battle-tested algorithms and procedures for 1NF conversion. Whether you're dealing with comma-separated values, repeating column groups, embedded JSON, or any other violation type, you'll learn systematic approaches that produce correct results reliably.
By the end of this page, you will be able to:

- Apply the complete 1NF conversion algorithm with step-by-step procedures
- Use data migration techniques that preserve integrity
- Handle each violation type with a specific transformation pattern
- Test and validate conversions to ensure correctness
- Minimize application disruption during conversion
Converting a table to 1NF follows a systematic process. While specific techniques vary by violation type, the overall algorithm remains consistent.
Always create new tables and migrate data before touching the original structure. Keep the original tables intact until migration is verified and applications are updated. This provides rollback capability if issues emerge.
The transformation principle:
Every 1NF conversion follows a core principle: What was implicit becomes explicit, and what was embedded becomes extracted.
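The create-migrate-verify-retire sequence can be sketched end to end. The following is a minimal illustration using Python's sqlite3 with hypothetical table and column names, not a production script: the original table is only dropped after the migrated data is verified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Original non-1NF table: one row per employee, comma-separated skills
cur.execute("CREATE TABLE Employees_Old (EmpID INTEGER PRIMARY KEY, Name TEXT, Skills TEXT)")
cur.executemany("INSERT INTO Employees_Old VALUES (?, ?, ?)",
                [(1, "Alice", "Java, Python"), (2, "Bob", "Python, Docker")])

# Step 1: create the new structures alongside the old one
cur.execute("CREATE TABLE Employees (EmpID INTEGER PRIMARY KEY, Name TEXT)")
cur.execute("CREATE TABLE EmployeeSkills (EmpID INTEGER, Skill TEXT, PRIMARY KEY (EmpID, Skill))")

# Step 2: migrate, splitting the multi-valued column into child rows
for emp_id, name, skills in cur.execute(
        "SELECT EmpID, Name, Skills FROM Employees_Old").fetchall():
    cur.execute("INSERT INTO Employees VALUES (?, ?)", (emp_id, name))
    for skill in skills.split(","):
        cur.execute("INSERT INTO EmployeeSkills VALUES (?, ?)", (emp_id, skill.strip()))

# Step 3: verify BEFORE touching the original (rollback stays possible)
old_count = cur.execute("SELECT COUNT(*) FROM Employees_Old").fetchone()[0]
new_count = cur.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]
assert old_count == new_count, "row counts diverged -- do not drop the old table"

# Step 4: only now is it safe to retire the original structure
cur.execute("DROP TABLE Employees_Old")
conn.commit()
```

The point of the ordering is rollback capability: until step 4, the old table is untouched and the new tables can simply be dropped if verification fails.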
Multi-valued columns (comma-separated lists, etc.) require splitting values into separate rows in a new child table.
The conversion pattern:
```sql
-- ORIGINAL: Employees with comma-separated skills
CREATE TABLE Employees_Old (
    EmpID INT PRIMARY KEY,
    Name VARCHAR(100),
    Department VARCHAR(50),
    Skills VARCHAR(500)  -- 'Java, Python, SQL, Docker'
);

-- Sample data
INSERT INTO Employees_Old VALUES
    (1, 'Alice', 'Engineering', 'Java, Python, SQL'),
    (2, 'Bob', 'Engineering', 'Python, Docker, Kubernetes'),
    (3, 'Charlie', 'Data Science', 'Python, R, SQL, TensorFlow');

-- STEP 1: Create normalized tables
CREATE TABLE Employees (
    EmpID INT PRIMARY KEY,
    Name VARCHAR(100) NOT NULL,
    Department VARCHAR(50)
);

CREATE TABLE Skills (
    SkillID INT PRIMARY KEY AUTO_INCREMENT,
    SkillName VARCHAR(50) UNIQUE NOT NULL
);

CREATE TABLE EmployeeSkills (
    EmpID INT NOT NULL,
    SkillID INT NOT NULL,
    PRIMARY KEY (EmpID, SkillID),
    FOREIGN KEY (EmpID) REFERENCES Employees(EmpID) ON DELETE CASCADE,
    FOREIGN KEY (SkillID) REFERENCES Skills(SkillID)
);

-- STEP 2: Migrate employee base data
INSERT INTO Employees (EmpID, Name, Department)
SELECT EmpID, Name, Department FROM Employees_Old;

-- STEP 3: Extract and populate distinct skills
-- PostgreSQL approach using UNNEST and STRING_TO_ARRAY
INSERT INTO Skills (SkillName)
SELECT DISTINCT TRIM(skill) AS SkillName
FROM Employees_Old,
     UNNEST(STRING_TO_ARRAY(Skills, ',')) AS skill
WHERE Skills IS NOT NULL;

-- STEP 4: Link employees to skills
INSERT INTO EmployeeSkills (EmpID, SkillID)
SELECT e.EmpID, s.SkillID
FROM Employees_Old e,
     UNNEST(STRING_TO_ARRAY(e.Skills, ',')) AS skill_name
JOIN Skills s ON TRIM(skill_name) = s.SkillName
WHERE e.Skills IS NOT NULL;
```

MySQL approach for string splitting:
MySQL lacks built-in array functions, requiring a different technique:
```sql
-- MySQL: Using a numbers table for string splitting
-- First, create a numbers table (one-time setup)
CREATE TABLE Numbers (n INT PRIMARY KEY);
-- Ten rows handle lists of up to ten items; extend as needed
INSERT INTO Numbers VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10);

-- Extract skills using SUBSTRING_INDEX
INSERT INTO Skills (SkillName)
SELECT DISTINCT TRIM(SUBSTRING_INDEX(
    SUBSTRING_INDEX(e.Skills, ',', n.n), ',', -1)) AS SkillName
FROM Employees_Old e
JOIN Numbers n
    ON n.n <= 1 + LENGTH(e.Skills) - LENGTH(REPLACE(e.Skills, ',', ''))
WHERE e.Skills IS NOT NULL AND e.Skills != '';

-- Link employees to skills
INSERT INTO EmployeeSkills (EmpID, SkillID)
SELECT DISTINCT e.EmpID, s.SkillID  -- DISTINCT guards against a skill repeated in one list
FROM Employees_Old e
JOIN Numbers n
    ON n.n <= 1 + LENGTH(e.Skills) - LENGTH(REPLACE(e.Skills, ',', ''))
JOIN Skills s ON s.SkillName = TRIM(SUBSTRING_INDEX(
    SUBSTRING_INDEX(e.Skills, ',', n.n), ',', -1))
WHERE e.Skills IS NOT NULL;
```

Real data often has inconsistent formatting: 'Java,Python' vs 'Java, Python' vs 'Java , Python'. Always TRIM() extracted values and consider normalizing case (LOWER/UPPER) before inserting into the skills table to prevent near-duplicates.
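If the migration is driven from application code rather than pure SQL, the trimming and case normalization can be done before the values ever reach the database. A small sketch of such a helper (the choice of Title Case as the canonical form is an assumption for illustration):

```python
def split_multivalued(raw):
    """Split a comma-separated field into clean, atomic values.

    Trims whitespace, drops empty entries, and normalizes case so that
    'Java,Python', 'Java, Python' and 'java , PYTHON' all yield the
    same values, preventing near-duplicate rows in the lookup table.
    """
    if raw is None:
        return []
    seen, result = set(), []
    for part in raw.split(","):
        value = part.strip().title()   # assumption: Title Case is canonical
        if value and value.lower() not in seen:
            seen.add(value.lower())    # dedupe case-insensitively
            result.append(value)
    return result

print(split_multivalued("java , PYTHON,sql,,Python"))  # ['Java', 'Python', 'Sql']
```

Note that naive case normalization can mangle acronyms ('SQL' becomes 'Sql'); a real migration would map known values through a lookup dictionary instead.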
Repeating column groups require transposing horizontal data into vertical rows. The UNION approach is most reliable across database systems.
The conversion pattern:
```sql
-- ORIGINAL: Orders with repeating item columns
CREATE TABLE Orders_Old (
    OrderID INT PRIMARY KEY,
    CustomerName VARCHAR(100),
    OrderDate DATE,
    Item1_Name VARCHAR(100), Item1_Qty INT, Item1_Price DECIMAL(10,2),
    Item2_Name VARCHAR(100), Item2_Qty INT, Item2_Price DECIMAL(10,2),
    Item3_Name VARCHAR(100), Item3_Qty INT, Item3_Price DECIMAL(10,2),
    Item4_Name VARCHAR(100), Item4_Qty INT, Item4_Price DECIMAL(10,2),
    Item5_Name VARCHAR(100), Item5_Qty INT, Item5_Price DECIMAL(10,2)
);

-- Sample data
INSERT INTO Orders_Old VALUES (
    1001, 'Alice Johnson', '2024-03-15',
    'Laptop', 1, 999.99,
    'Mouse', 2, 29.99,
    'Keyboard', 1, 79.99,
    NULL, NULL, NULL,
    NULL, NULL, NULL
);

-- STEP 1: Create normalized tables
CREATE TABLE Orders (
    OrderID INT PRIMARY KEY,
    CustomerName VARCHAR(100) NOT NULL,
    OrderDate DATE NOT NULL
);

CREATE TABLE OrderItems (
    OrderItemID INT PRIMARY KEY AUTO_INCREMENT,
    OrderID INT NOT NULL,
    LineNumber INT NOT NULL,  -- Preserves original position
    ItemName VARCHAR(100) NOT NULL,
    Quantity INT NOT NULL,
    UnitPrice DECIMAL(10,2) NOT NULL,
    FOREIGN KEY (OrderID) REFERENCES Orders(OrderID) ON DELETE CASCADE,
    UNIQUE (OrderID, LineNumber)
);

-- STEP 2: Migrate order headers
INSERT INTO Orders (OrderID, CustomerName, OrderDate)
SELECT OrderID, CustomerName, OrderDate FROM Orders_Old;

-- STEP 3: Migrate order items using UNION ALL
INSERT INTO OrderItems (OrderID, LineNumber, ItemName, Quantity, UnitPrice)
SELECT OrderID, 1, Item1_Name, Item1_Qty, Item1_Price
FROM Orders_Old
WHERE Item1_Name IS NOT NULL

UNION ALL

SELECT OrderID, 2, Item2_Name, Item2_Qty, Item2_Price
FROM Orders_Old
WHERE Item2_Name IS NOT NULL

UNION ALL

SELECT OrderID, 3, Item3_Name, Item3_Qty, Item3_Price
FROM Orders_Old
WHERE Item3_Name IS NOT NULL

UNION ALL

SELECT OrderID, 4, Item4_Name, Item4_Qty, Item4_Price
FROM Orders_Old
WHERE Item4_Name IS NOT NULL

UNION ALL

SELECT OrderID, 5, Item5_Name, Item5_Qty, Item5_Price
FROM Orders_Old
WHERE Item5_Name IS NOT NULL;
```

Validation queries:
```sql
-- Verify row counts match
SELECT
    'Orders' AS Table_Name,
    (SELECT COUNT(*) FROM Orders_Old) AS OldCount,
    (SELECT COUNT(*) FROM Orders) AS NewCount;

-- Verify item counts match
WITH OldItemCounts AS (
    SELECT OrderID,
        (CASE WHEN Item1_Name IS NOT NULL THEN 1 ELSE 0 END +
         CASE WHEN Item2_Name IS NOT NULL THEN 1 ELSE 0 END +
         CASE WHEN Item3_Name IS NOT NULL THEN 1 ELSE 0 END +
         CASE WHEN Item4_Name IS NOT NULL THEN 1 ELSE 0 END +
         CASE WHEN Item5_Name IS NOT NULL THEN 1 ELSE 0 END) AS ItemCount
    FROM Orders_Old
),
NewItemCounts AS (
    SELECT OrderID, COUNT(*) AS ItemCount
    FROM OrderItems
    GROUP BY OrderID
)
SELECT
    COALESCE(o.OrderID, n.OrderID) AS OrderID,
    o.ItemCount AS OldItemCount,
    n.ItemCount AS NewItemCount,
    CASE WHEN o.ItemCount = n.ItemCount THEN 'MATCH' ELSE 'MISMATCH' END AS Status
FROM OldItemCounts o
FULL OUTER JOIN NewItemCounts n ON o.OrderID = n.OrderID
WHERE o.ItemCount != n.ItemCount
   OR o.OrderID IS NULL
   OR n.OrderID IS NULL;

-- Verify revenue totals match
SELECT
    'Revenue Check' AS Validation,
    (SELECT SUM(
        COALESCE(Item1_Qty * Item1_Price, 0) +
        COALESCE(Item2_Qty * Item2_Price, 0) +
        COALESCE(Item3_Qty * Item3_Price, 0) +
        COALESCE(Item4_Qty * Item4_Price, 0) +
        COALESCE(Item5_Qty * Item5_Price, 0)
    ) FROM Orders_Old) AS OldTotal,
    (SELECT SUM(Quantity * UnitPrice) FROM OrderItems) AS NewTotal;
```

JSON columns containing query-critical data must be decomposed into atomic columns or related tables. Modern databases provide JSON extraction functions for this purpose.
The conversion pattern:
```sql
-- ORIGINAL: Orders with JSON item data
CREATE TABLE Orders_Old (
    OrderID INT PRIMARY KEY,
    CustomerID INT,
    OrderDate DATE,
    OrderData JSON
    /* Sample OrderData:
    {
        "shippingAddress": {
            "street": "123 Main St",
            "city": "Boston",
            "state": "MA",
            "zip": "02101"
        },
        "items": [
            {"sku": "LAPTOP", "qty": 1, "price": 999.99},
            {"sku": "MOUSE", "qty": 2, "price": 29.99}
        ],
        "notes": "Gift wrap requested"
    }
    */
);

-- STEP 1: Create normalized tables
CREATE TABLE Orders (
    OrderID INT PRIMARY KEY,
    CustomerID INT NOT NULL,
    OrderDate DATE NOT NULL,
    -- Scalar JSON properties become columns
    ShippingStreet VARCHAR(200),
    ShippingCity VARCHAR(100),
    ShippingState VARCHAR(50),
    ShippingZip VARCHAR(20),
    Notes TEXT
);

CREATE TABLE OrderItems (
    OrderItemID INT PRIMARY KEY AUTO_INCREMENT,
    OrderID INT NOT NULL,
    SKU VARCHAR(50) NOT NULL,
    Quantity INT NOT NULL,
    Price DECIMAL(10,2) NOT NULL,
    FOREIGN KEY (OrderID) REFERENCES Orders(OrderID) ON DELETE CASCADE
);

-- STEP 2: Migrate order data (PostgreSQL JSONB syntax)
INSERT INTO Orders (
    OrderID, CustomerID, OrderDate,
    ShippingStreet, ShippingCity, ShippingState, ShippingZip, Notes
)
SELECT
    OrderID,
    CustomerID,
    OrderDate,
    OrderData->'shippingAddress'->>'street',
    OrderData->'shippingAddress'->>'city',
    OrderData->'shippingAddress'->>'state',
    OrderData->'shippingAddress'->>'zip',
    OrderData->>'notes'
FROM Orders_Old;

-- STEP 3: Migrate items from JSON array (PostgreSQL)
INSERT INTO OrderItems (OrderID, SKU, Quantity, Price)
SELECT
    o.OrderID,
    item->>'sku',
    (item->>'qty')::INT,
    (item->>'price')::DECIMAL(10,2)
FROM Orders_Old o,
     JSONB_ARRAY_ELEMENTS(o.OrderData->'items') AS item;
```

MySQL JSON extraction syntax:
```sql
-- MySQL 8.0+ JSON extraction
INSERT INTO Orders (
    OrderID, CustomerID, OrderDate,
    ShippingStreet, ShippingCity, ShippingState, ShippingZip, Notes
)
SELECT
    OrderID,
    CustomerID,
    OrderDate,
    JSON_UNQUOTE(JSON_EXTRACT(OrderData, '$.shippingAddress.street')),
    JSON_UNQUOTE(JSON_EXTRACT(OrderData, '$.shippingAddress.city')),
    JSON_UNQUOTE(JSON_EXTRACT(OrderData, '$.shippingAddress.state')),
    JSON_UNQUOTE(JSON_EXTRACT(OrderData, '$.shippingAddress.zip')),
    JSON_UNQUOTE(JSON_EXTRACT(OrderData, '$.notes'))
FROM Orders_Old;

-- MySQL: Extract JSON array elements using JSON_TABLE (MySQL 8.0.4+)
INSERT INTO OrderItems (OrderID, SKU, Quantity, Price)
SELECT o.OrderID, jt.sku, jt.qty, jt.price
FROM Orders_Old o,
     JSON_TABLE(
         o.OrderData,
         '$.items[*]' COLUMNS (
             sku VARCHAR(50) PATH '$.sku',
             qty INT PATH '$.qty',
             price DECIMAL(10,2) PATH '$.price'
         )
     ) AS jt;
```

JSON data often has missing properties. Use COALESCE or NULLIF to handle cases where properties don't exist. Test extraction on sample data before running the full migration to catch path errors and type mismatches.
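The same defaulting logic is easy to prototype in application code before committing to SQL paths. A sketch using Python's standard json module, with field names taken from the sample payload above (the defaults chosen are assumptions for illustration): missing properties fall back to NULL or a default instead of raising, mirroring what COALESCE does on the SQL side.

```python
import json

def extract_order(order_json):
    """Flatten one order's JSON payload into atomic column values.

    dict.get() plays the role of COALESCE: absent properties yield
    None (or an explicit default) rather than an error.
    """
    data = json.loads(order_json)
    addr = data.get("shippingAddress", {})
    return {
        "ShippingStreet": addr.get("street"),     # None if absent
        "ShippingCity": addr.get("city"),
        "Notes": data.get("notes", ""),           # explicit default
        "Items": [
            (i.get("sku"), int(i.get("qty", 0)), float(i.get("price", 0.0)))
            for i in data.get("items", [])
        ],
    }

# Payload deliberately missing 'street', 'state', 'zip' and 'notes'
row = extract_order(
    '{"shippingAddress": {"city": "Boston"},'
    ' "items": [{"sku": "MOUSE", "qty": 2, "price": 29.99}]}'
)
print(row["ShippingStreet"], row["ShippingCity"], row["Items"])
# None Boston [('MOUSE', 2, 29.99)]
```

Running an extractor like this over a sample of rows surfaces path errors and type mismatches cheaply before the full migration.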
Encoded information must be decomposed into separate atomic columns. This typically involves string parsing and may require lookup tables to translate codes.
The conversion pattern:
```sql
-- ORIGINAL: Products with encoded ProductCode
-- Format: 'CATEGORY-YYYY-WAREHOUSE-SEQUENCE'
-- Example: 'ELEC-2024-NYC-00142'
CREATE TABLE Products_Old (
    ProductCode VARCHAR(25) PRIMARY KEY,
    Description VARCHAR(200),
    Price DECIMAL(10,2)
);

INSERT INTO Products_Old VALUES
    ('ELEC-2024-NYC-00142', 'Gaming Laptop', 1299.99),
    ('ELEC-2024-LAX-00089', 'Wireless Mouse', 49.99),
    ('FURN-2023-CHI-00331', 'Office Chair', 299.99);

-- STEP 1: Create normalized table with atomic columns
CREATE TABLE Products (
    ProductID INT PRIMARY KEY AUTO_INCREMENT,
    Category VARCHAR(20) NOT NULL,
    ProductionYear INT NOT NULL,
    WarehouseCode VARCHAR(10) NOT NULL,
    SequenceNumber INT NOT NULL,
    Description VARCHAR(200),
    Price DECIMAL(10,2),
    LegacyCode VARCHAR(25) UNIQUE  -- Keep for backward compatibility
);

-- STEP 2: Optional - Create lookup tables
CREATE TABLE Categories (
    CategoryCode VARCHAR(20) PRIMARY KEY,
    CategoryName VARCHAR(100) NOT NULL
);

INSERT INTO Categories VALUES
    ('ELEC', 'Electronics'),
    ('FURN', 'Furniture'),
    ('CLTH', 'Clothing');

CREATE TABLE Warehouses (
    WarehouseCode VARCHAR(10) PRIMARY KEY,
    City VARCHAR(100) NOT NULL,
    Region VARCHAR(50)
);

INSERT INTO Warehouses VALUES
    ('NYC', 'New York', 'East'),
    ('LAX', 'Los Angeles', 'West'),
    ('CHI', 'Chicago', 'Midwest');

-- STEP 3: Migrate with parsing (PostgreSQL)
INSERT INTO Products (
    Category, ProductionYear, WarehouseCode, SequenceNumber,
    Description, Price, LegacyCode
)
SELECT
    SPLIT_PART(ProductCode, '-', 1) AS Category,
    CAST(SPLIT_PART(ProductCode, '-', 2) AS INT) AS ProductionYear,
    SPLIT_PART(ProductCode, '-', 3) AS WarehouseCode,
    CAST(SPLIT_PART(ProductCode, '-', 4) AS INT) AS SequenceNumber,
    Description,
    Price,
    ProductCode  -- Preserve original
FROM Products_Old;

-- Alternative for MySQL (using SUBSTRING_INDEX)
INSERT INTO Products (
    Category, ProductionYear, WarehouseCode, SequenceNumber,
    Description, Price, LegacyCode
)
SELECT
    SUBSTRING_INDEX(ProductCode, '-', 1),
    CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(ProductCode, '-', 2), '-', -1) AS UNSIGNED),
    SUBSTRING_INDEX(SUBSTRING_INDEX(ProductCode, '-', 3), '-', -1),
    CAST(SUBSTRING_INDEX(ProductCode, '-', -1) AS UNSIGNED),
    Description,
    Price,
    ProductCode
FROM Products_Old;
```

Converting status flags:
```sql
-- ORIGINAL: Users with encoded status flags
-- StatusFlags: 'AVN' = Active, Verified, Notifications-on
-- Position 1: A=Active, I=Inactive
-- Position 2: V=Verified, U=Unverified
-- Position 3: N=Notifications-on, X=Notifications-off
CREATE TABLE Users_Old (
    UserID INT PRIMARY KEY,
    Username VARCHAR(50),
    StatusFlags VARCHAR(10)
);

INSERT INTO Users_Old VALUES
    (1, 'alice', 'AVN'),
    (2, 'bob', 'AUX'),
    (3, 'charlie', 'IVN');

-- STEP 1: Create table with boolean columns
CREATE TABLE Users (
    UserID INT PRIMARY KEY,
    Username VARCHAR(50) NOT NULL,
    IsActive BOOLEAN NOT NULL DEFAULT TRUE,
    IsVerified BOOLEAN NOT NULL DEFAULT FALSE,
    NotificationsEnabled BOOLEAN NOT NULL DEFAULT TRUE
);

-- STEP 2: Migrate with flag parsing
INSERT INTO Users (UserID, Username, IsActive, IsVerified, NotificationsEnabled)
SELECT
    UserID,
    Username,
    CASE SUBSTRING(StatusFlags, 1, 1)
        WHEN 'A' THEN TRUE
        WHEN 'I' THEN FALSE
        ELSE TRUE  -- Default
    END AS IsActive,
    CASE SUBSTRING(StatusFlags, 2, 1)
        WHEN 'V' THEN TRUE
        WHEN 'U' THEN FALSE
        ELSE FALSE  -- Default
    END AS IsVerified,
    CASE SUBSTRING(StatusFlags, 3, 1)
        WHEN 'N' THEN TRUE
        WHEN 'X' THEN FALSE
        ELSE TRUE  -- Default
    END AS NotificationsEnabled
FROM Users_Old;
```

Tables without primary keys must have one added. The approach depends on whether a natural key can be identified or a surrogate key must be introduced.
The decision process:
```sql
-- ORIGINAL: Event logs without primary key
CREATE TABLE EventLogs_Old (
    EventTime TIMESTAMP,
    EventType VARCHAR(50),
    UserID INT,
    EventData TEXT
    -- No PRIMARY KEY!
);

-- STEP 1: Check for potential natural key
SELECT EventTime, EventType, UserID, COUNT(*) AS Occurrences
FROM EventLogs_Old
GROUP BY EventTime, EventType, UserID
HAVING COUNT(*) > 1;
-- If duplicates exist, we need a surrogate key

-- STEP 2: Handle duplicates (if needed)
-- Option A: Keep only the first occurrence (PostgreSQL DISTINCT ON)
CREATE TABLE EventLogs_Deduped AS
SELECT DISTINCT ON (EventTime, EventType, UserID) *
FROM EventLogs_Old
ORDER BY EventTime, EventType, UserID;

-- Option B: Keep all rows with a surrogate key (MySQL syntax)
CREATE TABLE EventLogs (
    EventLogID BIGINT PRIMARY KEY AUTO_INCREMENT,
    EventTime TIMESTAMP NOT NULL,
    EventType VARCHAR(50) NOT NULL,
    UserID INT,
    EventData TEXT,
    -- Add indexes on common query patterns
    INDEX idx_event_time (EventTime),
    INDEX idx_user_events (UserID, EventTime)
);

INSERT INTO EventLogs (EventTime, EventType, UserID, EventData)
SELECT EventTime, EventType, UserID, EventData
FROM EventLogs_Old;

-- STEP 3: If a natural key exists and is unique,
-- add the constraint directly
ALTER TABLE EventLogs_Old
ADD CONSTRAINT pk_events PRIMARY KEY (EventTime, EventType, UserID);

-- Or create a new table with the proper key
CREATE TABLE EventLogs (
    EventTime TIMESTAMP NOT NULL,
    EventType VARCHAR(50) NOT NULL,
    UserID INT NOT NULL,
    EventData TEXT,
    PRIMARY KEY (EventTime, EventType, UserID)
);
```

Use natural keys when the combination is stable, meaningful, and commonly used in queries. Use surrogate keys when no natural key exists, the natural key is large or composite, its values might change, or you need join performance. For event logs, surrogates are usually better since sequential IDs are efficient for indexing.
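The duplicate probe in STEP 1 drives the whole decision, so it is worth scripting. A minimal sketch using Python's sqlite3 with illustrative data: if the candidate natural key has duplicates, fall back to a surrogate key table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE EventLogs_Old (EventTime TEXT, EventType TEXT, UserID INTEGER)")
cur.executemany("INSERT INTO EventLogs_Old VALUES (?, ?, ?)", [
    ("2024-03-15 10:00:00", "login", 1),
    ("2024-03-15 10:00:00", "login", 1),   # exact duplicate: natural key fails
    ("2024-03-15 10:05:00", "logout", 1),
])

# Same duplicate check as STEP 1 above
dupes = cur.execute("""
    SELECT EventTime, EventType, UserID, COUNT(*)
    FROM EventLogs_Old
    GROUP BY EventTime, EventType, UserID
    HAVING COUNT(*) > 1
""").fetchall()

if dupes:
    # Duplicates exist, so the natural key is unusable: use a surrogate
    cur.execute("""CREATE TABLE EventLogs (
        EventLogID INTEGER PRIMARY KEY AUTOINCREMENT,
        EventTime TEXT NOT NULL, EventType TEXT NOT NULL, UserID INTEGER)""")
    cur.execute("INSERT INTO EventLogs (EventTime, EventType, UserID) "
                "SELECT * FROM EventLogs_Old")

print(len(dupes))  # 1 duplicated combination
```

All three original rows survive under the surrogate key, whereas a natural-key approach would have forced deduplication.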
Schema changes require corresponding application updates. A phased transition minimizes risk and allows rollback if issues emerge.
```sql
-- Create a compatibility view that mimics the old structure
-- New tables: Employees (EmpID, Name, Department),
--             EmployeeSkills (EmpID, SkillID), Skills (SkillID, SkillName)

-- View that looks like the old comma-separated structure (PostgreSQL)
CREATE VIEW Employees_Legacy AS
SELECT
    e.EmpID,
    e.Name,
    e.Department,
    STRING_AGG(s.SkillName, ', ' ORDER BY s.SkillName) AS Skills
FROM Employees e
LEFT JOIN EmployeeSkills es ON e.EmpID = es.EmpID
LEFT JOIN Skills s ON es.SkillID = s.SkillID
GROUP BY e.EmpID, e.Name, e.Department;

-- MySQL equivalent using GROUP_CONCAT
CREATE VIEW Employees_Legacy AS
SELECT
    e.EmpID,
    e.Name,
    e.Department,
    GROUP_CONCAT(s.SkillName ORDER BY s.SkillName SEPARATOR ', ') AS Skills
FROM Employees e
LEFT JOIN EmployeeSkills es ON e.EmpID = es.EmpID
LEFT JOIN Skills s ON es.SkillID = s.SkillID
GROUP BY e.EmpID, e.Name, e.Department;

-- Legacy code can SELECT from Employees_Legacy and see the familiar structure
-- New code writes to the normalized tables
-- The view automatically reflects current data

-- For INSERTs, create a trigger or stored procedure
-- that accepts the old format and inserts into the new tables
CREATE PROCEDURE AddEmployeeWithSkills(
    IN p_EmpID INT,
    IN p_Name VARCHAR(100),
    IN p_Department VARCHAR(50),
    IN p_SkillsCSV VARCHAR(500)
)
BEGIN
    -- Insert employee
    INSERT INTO Employees (EmpID, Name, Department)
    VALUES (p_EmpID, p_Name, p_Department);

    -- Parse and insert skills
    -- (Implementation depends on DBMS - see earlier string splitting examples)
END;
```

During parallel operation, monitor both old and new query patterns. Track query performance, error rates, and data consistency. Set alerts for discrepancies between the old and new structures. This monitoring provides the confidence to proceed with deprecation.
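One cheap way to monitor data consistency during parallel operation is to compare simple invariants between the legacy-shaped view and the normalized tables. A minimal sketch using Python's sqlite3, where GROUP_CONCAT stands in for STRING_AGG (caveat: the comma-counting check assumes skill names themselves contain no commas):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE Employees (EmpID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Skills (SkillID INTEGER PRIMARY KEY, SkillName TEXT UNIQUE);
    CREATE TABLE EmployeeSkills (EmpID INTEGER, SkillID INTEGER,
                                 PRIMARY KEY (EmpID, SkillID));
    INSERT INTO Employees VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO Skills VALUES (1, 'Java'), (2, 'Python');
    INSERT INTO EmployeeSkills VALUES (1, 1), (1, 2), (2, 2);
    -- SQLite's GROUP_CONCAT plays the role of STRING_AGG here
    CREATE VIEW Employees_Legacy AS
    SELECT e.EmpID, e.Name, GROUP_CONCAT(s.SkillName, ', ') AS Skills
    FROM Employees e
    LEFT JOIN EmployeeSkills es ON e.EmpID = es.EmpID
    LEFT JOIN Skills s ON es.SkillID = s.SkillID
    GROUP BY e.EmpID, e.Name;
""")

def consistency_report(cur):
    """One cheap invariant per structure; alert if any pair diverges."""
    legacy_rows = cur.execute("SELECT COUNT(*) FROM Employees_Legacy").fetchone()[0]
    new_rows = cur.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]
    # Commas in the aggregated string imply how many child rows fed it
    legacy_links = cur.execute(
        "SELECT SUM(LENGTH(Skills) - LENGTH(REPLACE(Skills, ',', '')) + 1) "
        "FROM Employees_Legacy"
    ).fetchone()[0]
    new_links = cur.execute("SELECT COUNT(*) FROM EmployeeSkills").fetchone()[0]
    return {"rows_match": legacy_rows == new_rows,
            "links_match": legacy_links == new_links}

print(consistency_report(cur))  # {'rows_match': True, 'links_match': True}
```

Scheduling a report like this and alerting on any False gives an objective signal for when the legacy structure can be deprecated.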
Let's walk through a complete 1NF conversion for a table with multiple violation types.
```sql
-- ORIGINAL: Multiple 1NF violations in one table
CREATE TABLE StudentRecords_Old (
    -- Violation 1: Encoded student ID (DEPT-YEAR-SEQ)
    StudentCode VARCHAR(20) PRIMARY KEY,
    -- Violation 2: Combined name
    FullName VARCHAR(100),
    -- Violation 3: Multi-valued email addresses
    Emails VARCHAR(500),
    -- Violation 4: Repeating course groups
    Course1_Name VARCHAR(100), Course1_Grade VARCHAR(2),
    Course2_Name VARCHAR(100), Course2_Grade VARCHAR(2),
    Course3_Name VARCHAR(100), Course3_Grade VARCHAR(2)
);

INSERT INTO StudentRecords_Old VALUES (
    'CS-2024-0042',
    'John A. Smith',
    'john@university.edu, john.smith@gmail.com',
    'Database Systems', 'A',
    'Algorithms', 'B+',
    'Web Development', 'A-'
);

-----------------------------------------------------------------
-- STEP 1: DESIGN TARGET SCHEMA
-----------------------------------------------------------------

-- Departments reference table
CREATE TABLE Departments (
    DeptCode VARCHAR(10) PRIMARY KEY,
    DeptName VARCHAR(100) NOT NULL
);

-- Students table with atomic columns
CREATE TABLE Students (
    StudentID INT PRIMARY KEY AUTO_INCREMENT,
    DeptCode VARCHAR(10) NOT NULL,
    EnrollmentYear INT NOT NULL,
    SequenceNumber INT NOT NULL,
    FirstName VARCHAR(50) NOT NULL,
    MiddleName VARCHAR(50),
    LastName VARCHAR(50) NOT NULL,
    LegacyCode VARCHAR(20) UNIQUE,
    FOREIGN KEY (DeptCode) REFERENCES Departments(DeptCode),
    UNIQUE (DeptCode, EnrollmentYear, SequenceNumber)
);

-- Email addresses as separate table
CREATE TABLE StudentEmails (
    StudentID INT NOT NULL,
    Email VARCHAR(200) NOT NULL,
    IsPrimary BOOLEAN DEFAULT FALSE,
    PRIMARY KEY (StudentID, Email),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID) ON DELETE CASCADE
);

-- Courses reference table
CREATE TABLE Courses (
    CourseID INT PRIMARY KEY AUTO_INCREMENT,
    CourseName VARCHAR(100) NOT NULL UNIQUE
);

-- Enrollments junction table
CREATE TABLE Enrollments (
    StudentID INT NOT NULL,
    CourseID INT NOT NULL,
    Grade VARCHAR(2),
    PRIMARY KEY (StudentID, CourseID),
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
    FOREIGN KEY (CourseID) REFERENCES Courses(CourseID)
);

-----------------------------------------------------------------
-- STEP 2: POPULATE LOOKUP TABLES
-----------------------------------------------------------------

INSERT INTO Departments VALUES
    ('CS', 'Computer Science'),
    ('EE', 'Electrical Engineering');

-----------------------------------------------------------------
-- STEP 3: MIGRATE STUDENTS (parse encoded code and name)
-----------------------------------------------------------------

INSERT INTO Students (DeptCode, EnrollmentYear, SequenceNumber,
                      FirstName, MiddleName, LastName, LegacyCode)
SELECT
    SPLIT_PART(StudentCode, '-', 1),
    CAST(SPLIT_PART(StudentCode, '-', 2) AS INT),
    CAST(SPLIT_PART(StudentCode, '-', 3) AS INT),
    -- Parse name: assume "First Middle Last" format
    SPLIT_PART(FullName, ' ', 1),
    CASE WHEN array_length(string_to_array(FullName, ' '), 1) = 3
         THEN SPLIT_PART(FullName, ' ', 2)
         ELSE NULL END,
    SPLIT_PART(FullName, ' ', -1),  -- Last word (negative index: PostgreSQL 14+)
    StudentCode
FROM StudentRecords_Old;

-----------------------------------------------------------------
-- STEP 4: MIGRATE EMAILS (parse multi-valued column)
-----------------------------------------------------------------

INSERT INTO StudentEmails (StudentID, Email, IsPrimary)
SELECT
    s.StudentID,
    TRIM(email),
    ROW_NUMBER() OVER (PARTITION BY s.StudentID ORDER BY email) = 1
FROM StudentRecords_Old o
JOIN Students s ON s.LegacyCode = o.StudentCode,
     UNNEST(STRING_TO_ARRAY(o.Emails, ',')) AS email
WHERE o.Emails IS NOT NULL AND TRIM(email) != '';

-----------------------------------------------------------------
-- STEP 5: MIGRATE COURSES (extract distinct courses)
-----------------------------------------------------------------

INSERT INTO Courses (CourseName)
SELECT DISTINCT CourseName FROM (
    SELECT Course1_Name AS CourseName FROM StudentRecords_Old
    WHERE Course1_Name IS NOT NULL
    UNION
    SELECT Course2_Name FROM StudentRecords_Old
    WHERE Course2_Name IS NOT NULL
    UNION
    SELECT Course3_Name FROM StudentRecords_Old
    WHERE Course3_Name IS NOT NULL
) AllCourses;

-----------------------------------------------------------------
-- STEP 6: MIGRATE ENROLLMENTS (flatten repeating groups)
-----------------------------------------------------------------

INSERT INTO Enrollments (StudentID, CourseID, Grade)
SELECT s.StudentID, c.CourseID, o.Course1_Grade
FROM StudentRecords_Old o
JOIN Students s ON s.LegacyCode = o.StudentCode
JOIN Courses c ON c.CourseName = o.Course1_Name
WHERE o.Course1_Name IS NOT NULL

UNION ALL

SELECT s.StudentID, c.CourseID, o.Course2_Grade
FROM StudentRecords_Old o
JOIN Students s ON s.LegacyCode = o.StudentCode
JOIN Courses c ON c.CourseName = o.Course2_Name
WHERE o.Course2_Name IS NOT NULL

UNION ALL

SELECT s.StudentID, c.CourseID, o.Course3_Grade
FROM StudentRecords_Old o
JOIN Students s ON s.LegacyCode = o.StudentCode
JOIN Courses c ON c.CourseName = o.Course3_Name
WHERE o.Course3_Name IS NOT NULL;

-----------------------------------------------------------------
-- STEP 7: VALIDATE
-----------------------------------------------------------------

-- Check student count
SELECT
    'Students' AS Entity,
    (SELECT COUNT(*) FROM StudentRecords_Old) AS OldCount,
    (SELECT COUNT(*) FROM Students) AS NewCount;

-- Check email migration
SELECT s.FirstName, s.LastName, array_agg(e.Email) AS Emails
FROM Students s
LEFT JOIN StudentEmails e ON s.StudentID = e.StudentID
GROUP BY s.StudentID, s.FirstName, s.LastName;

-- Check enrollment counts match
SELECT
    o.StudentCode,
    (CASE WHEN o.Course1_Name IS NOT NULL THEN 1 ELSE 0 END +
     CASE WHEN o.Course2_Name IS NOT NULL THEN 1 ELSE 0 END +
     CASE WHEN o.Course3_Name IS NOT NULL THEN 1 ELSE 0 END) AS OldCourseCount,
    (SELECT COUNT(*)
     FROM Enrollments e
     JOIN Students s ON e.StudentID = s.StudentID
     WHERE s.LegacyCode = o.StudentCode) AS NewCourseCount
FROM StudentRecords_Old o;
```

You now have the complete toolkit for converting any table to First Normal Form.
Module Complete:
You have completed the First Normal Form module. You now understand what 1NF requires (atomicity, no repeating groups, unique rows), can recognize all types of violations, know how to flatten hierarchical data, and can systematically convert any table to 1NF compliance.
With 1NF as the foundation, subsequent modules will build on this base to address Second Normal Form (eliminating partial dependencies) and Third Normal Form (eliminating transitive dependencies).
Congratulations! You have mastered First Normal Form—the foundational level of database normalization. You can now identify violations, design compliant schemas, and execute systematic conversions. Continue to Second Normal Form to learn how to eliminate partial dependencies within your 1NF-compliant tables.