In the late 1960s, as organizations increasingly relied on computerized data management, a critical problem emerged: vendor lock-in. Each database vendor implemented their own proprietary syntax, data structures, and programming interfaces. Applications written for one system couldn't run on another. Skills didn't transfer. Data couldn't migrate. The industry desperately needed standardization.
Enter CODASYL—the Conference on Data Systems Languages. The same organization that had successfully standardized COBOL now turned its attention to the emerging field of database management. What they produced became the most comprehensive database standard of its era, formalizing the network data model and shaping how an entire generation of database systems would be designed, implemented, and programmed.
The CODASYL network model wasn't just an academic exercise—it became the blueprint for dozens of commercial database products that powered enterprise computing through the 1970s and 1980s.
By the end of this page, you will understand: (1) The historical context and formation of CODASYL's database efforts, (2) The complete DBTG architecture and its component specifications, (3) How the schema DDL and subschema define database structure, (4) The DML operations that manipulate network databases, (5) Commercial implementations and the CODASYL legacy.
CODASYL: A Proven Standardization Body
CODASYL (Conference on Data Systems Languages) was established in 1959 as a consortium of industry and government representatives. Its most famous achievement was the creation of COBOL—the Common Business-Oriented Language—which became the dominant programming language for business computing for decades.
The organization proved that competitors could collaborate on standards that benefited everyone. IBM, Honeywell, RCA, Burroughs, General Electric, UNIVAC, and government agencies all participated. This track record made CODASYL the natural choice to tackle database standardization.
The Formation of the DBTG:
In 1965, CODASYL established the List Processing Task Force to explore standardized approaches to data management. This evolved into the Data Base Task Group (DBTG) in 1967, chaired initially by William McGee and later by T. William Olle.
The DBTG's mission was ambitious:
| Year | Event | Significance |
|---|---|---|
| 1959 | CODASYL founded | Created to develop common business data processing standards |
| 1960 | COBOL published | Proved industry consensus on standards was possible |
| 1965 | List Processing Task Force | Initial exploration of data management standards |
| 1967 | DBTG established | Formal effort to standardize database management |
| 1969 | DBTG Initial Report | First draft specifications circulated for comment |
| 1971 | DBTG Final Report | Landmark specification defining the network model |
| 1973 | DDL Committee Report | Refined schema and subschema definitions |
| 1978 | CODASYL Journal of Development | Ongoing refinements and extensions |
| 1981 | Final major revision | Last significant update to CODASYL specifications |
Competing Influences:
The DBTG didn't work in a vacuum. Several existing systems influenced their thinking:
IDS (Integrated Data Store): Developed by Charles Bachman at General Electric in 1963-64. Bachman's work on IDS was foundational—he won the Turing Award in 1973 for his database contributions. IDS was effectively the prototype for the CODASYL model.
IMS (Information Management System): IBM's hierarchical database, developed beginning in 1966 for the Apollo program and first put into service in 1968. IMS represented the competing hierarchical approach and was a major commercial success.
TOTAL: Cincom's database system, which offered a simpler network implementation and gained significant market share.
The DBTG synthesized ideas from these systems while creating a comprehensive, vendor-neutral specification.
Charles Bachman, often called the 'father of databases,' was instrumental in both creating IDS and guiding CODASYL specifications. His 1973 Turing Award lecture, 'The Programmer as Navigator,' eloquently described the network model's navigational paradigm—programmers explicitly navigating through data relationships rather than declaratively specifying needed data.
The DBTG specification defined a clear three-level architecture that separated concerns and enabled flexible deployment. This architecture influenced the later ANSI-SPARC three-schema architecture that we still reference today.
Level 1: The Schema (Conceptual Definition)
The schema represents the complete logical structure of the database as seen by the database administrator. It defines:
There is exactly one schema for a database, and it is the authoritative definition of database structure.
Level 2: The Subschema (External Views)
The subschema provides application-specific views of the database. Each program may use a different subschema that:
Multiple subschemas can exist for a single schema, each tailored to different applications.
Level 3: The Physical Storage (Internal Level)
While the DBTG focused less on physical storage than on logical structure, it acknowledged that:
The three-level architecture enables data independence. Applications using subschemas are insulated from schema changes (if their subset remains valid). The schema is insulated from physical storage reorganization. This separation of concerns was revolutionary for maintainability.
The CODASYL Schema DDL provides a comprehensive syntax for defining the complete database structure. This DDL became the template for subsequent database definition languages, and understanding its constructs illuminates network model semantics.
Record Type Definition:
Record types are defined with their data items, location modes, and optional constraints:
```
SCHEMA NAME IS MANUFACTURING_DB

RECORD TYPE SUPPLIER
    LOCATION MODE IS CALC USING SupplierID
        DUPLICATES ARE NOT ALLOWED
    02 SupplierID       TYPE IS DECIMAL 6.
    02 SupplierName     TYPE IS CHARACTER 40.
    02 Address.
        03 Street       TYPE IS CHARACTER 30.
        03 City         TYPE IS CHARACTER 20.
        03 State        TYPE IS CHARACTER 2.
        03 ZipCode      TYPE IS CHARACTER 10.
    02 Rating           TYPE IS DECIMAL 2.
    02 AvgLeadTime      TYPE IS DECIMAL 4.
    02 TotalOrders      TYPE IS DECIMAL 10.

RECORD TYPE PART
    LOCATION MODE IS CALC USING PartNumber
        DUPLICATES ARE NOT ALLOWED
    02 PartNumber       TYPE IS CHARACTER 15.
    02 Description      TYPE IS CHARACTER 50.
    02 UnitCost         TYPE IS DECIMAL 10 DOLLARS.
    02 QuantityOnHand   TYPE IS DECIMAL 8.
    02 ReorderLevel     TYPE IS DECIMAL 8.
    02 Category         TYPE IS CHARACTER 20.

RECORD TYPE SUPPLY
    LOCATION MODE IS VIA SUPPLIER_SUPPLY SET
    02 UnitPrice        TYPE IS DECIMAL 10 DOLLARS.
    02 LeadTimeDays     TYPE IS DECIMAL 4.
    02 MinOrderQty      TYPE IS DECIMAL 6.
    02 LastOrderDate    TYPE IS DATE.
    02 PreferredFlag    TYPE IS CHARACTER 1.
```

Key DDL Constructs for Records:
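One construct worth illustrating is the location mode, since it decides where a record is placed and therefore how it can be reached. The following is a toy Python model, not any vendor's storage algorithm: `NUM_PAGES`, `calc_page`, `store_calc`, and `store_via` are hypothetical names, and CRC32 merely stands in for whatever hash function a real DBMS would use. It sketches CALC placing a record on a page derived from its key, and VIA placing a member next to its owner.

```python
import zlib
from collections import defaultdict

NUM_PAGES = 8  # hypothetical page count for the toy storage area

pages = defaultdict(list)  # page number -> records stored on that page


def calc_page(key: str) -> int:
    """CALC: derive the target page from the key (CRC32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PAGES


def store_calc(record_type: str, key: str) -> int:
    """Store by hashing the CALC key; enforce DUPLICATES ARE NOT ALLOWED."""
    page = calc_page(key)
    if (record_type, key) in pages[page]:
        raise ValueError(f"duplicate CALC key {key!r} for {record_type}")
    pages[page].append((record_type, key))
    return page


def store_via(record_type: str, label: str, owner_page: int) -> int:
    """VIA: place the member on its owner's page so set traversal stays local."""
    pages[owner_page].append((record_type, label))
    return owner_page


supplier_page = store_calc("SUPPLIER", "S1001")
supply_page = store_via("SUPPLY", "S1001/P001", supplier_page)

# A later lookup recomputes the same page from the key, without scanning.
assert calc_page("S1001") == supplier_page
```

In this model a SUPPLY record has no key of its own to hash, which mirrors why the schema above reaches it through the SUPPLIER_SUPPLY set rather than directly.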
Set Type Definition:
Set types define relationships with comprehensive behavioral specifications:
```
SET TYPE IS SUPPLIER_SUPPLY
    OWNER IS SUPPLIER
    ORDER IS SORTED BY DEFINED KEYS
    MEMBER IS SUPPLY
        KEY IS UnitPrice ASCENDING
        INSERTION IS AUTOMATIC
        RETENTION IS MANDATORY
        SET SELECTION IS BY VALUE OF SupplierID IN SUPPLIER
    KEY IS ASCENDING UnitPrice
        DUPLICATES ARE LAST
    CHECK IS NOT NULL SupplierID IN OWNER

SET TYPE IS PART_SUPPLY
    OWNER IS PART
    ORDER IS FIRST
    MEMBER IS SUPPLY
        INSERTION IS MANUAL
        RETENTION IS OPTIONAL
        SET SELECTION IS CURRENT OF PART_SUPPLY SET

SET TYPE IS DEPT_PROJECT
    OWNER IS DEPARTMENT
    ORDER IS SORTED BY DEFINED KEYS
    MEMBER IS PROJECT
        KEY IS ProjectName ASCENDING
        INSERTION IS AUTOMATIC
        RETENTION IS FIXED
        SET SELECTION IS STRUCTURAL Owner-ID = PROJECT.DeptCode
```

The subschema DDL defines application-specific views of the database. Each application program compiles against its subschema, seeing only authorized portions of the complete schema.
Subschema Capabilities:
```
SUBSCHEMA NAME IS PAYROLL_VIEW OF SCHEMA COMPANY_DB

RECORD SECTION.
    01 EMPLOYEE RENAMING RECORD TO EMP-RECORD.
        02 EmployeeID.
        02 Name.
        02 Department.
        02 Salary.
        02 HireDate.
        -- NOTE: DoB, SSN, and other items are NOT included
        -- The application cannot see sensitive personal data

    01 DEPARTMENT RENAMING RECORD TO DEPT-RECORD.
        02 DeptCode.
        02 DeptName.
        02 Manager.
        -- Budget is excluded - payroll app doesn't need it

SET SECTION.
    SET NAME IS DEPT_EMPLOYEE
        RENAMING SET TO DEPT-HAS-EMPLOYEES.
        -- Application sees relationship by this name

    -- EMPLOYEE_DEPENDENT set is NOT included
    -- Payroll app cannot navigate to dependent records

PRIVACY.
    LOCK FOR ALTER IS "PAYROLL_ADMIN".
    LOCK FOR DELETE IS "PAYROLL_ADMIN".
    LOCK FOR STORE IS "PAYROLL_USER".
    LOCK FOR MODIFY IS "PAYROLL_USER".
```

Security Through Subschemas:
Subschemas provide a primary security mechanism in CODASYL systems:
Visibility Control: Only included elements are accessible. A subschema that omits the SALARY field cannot access salaries—the field is invisible.
Operation Restriction: Privacy locks can prevent certain operations even on visible elements. An application might read salaries but not modify them.
Navigation Boundaries: By omitting sets, applications are prevented from traversing certain relationships. A subschema without EMPLOYEE_DEPENDENT cannot reach dependent records.
This is security through view definition—the subschema acts as a security filter between the application and the full database.
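The filtering idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries and the helpers `read_record` and `follow_set` are hypothetical, the item names are taken from the PAYROLL_VIEW listing and its comments, and the sample row is invented. A real CODASYL system enforces these rules when the program is compiled against its subschema, not through runtime checks like these.

```python
# Toy model: a subschema is a whitelist of record items and set names.
SCHEMA = {
    "records": {
        "EMPLOYEE": ["EmployeeID", "Name", "Department", "Salary",
                     "HireDate", "DoB", "SSN"],
        "DEPARTMENT": ["DeptCode", "DeptName", "Manager", "Budget"],
    },
    "sets": ["DEPT_EMPLOYEE", "EMPLOYEE_DEPENDENT"],
}

PAYROLL_VIEW = {
    "records": {
        "EMPLOYEE": ["EmployeeID", "Name", "Department", "Salary", "HireDate"],
        "DEPARTMENT": ["DeptCode", "DeptName", "Manager"],
    },
    "sets": ["DEPT_EMPLOYEE"],
}


def read_record(subschema: dict, record_type: str, stored: dict) -> dict:
    """Visibility control: return only the items the subschema includes."""
    if record_type not in subschema["records"]:
        raise PermissionError(f"{record_type} is not in this subschema")
    return {item: stored[item] for item in subschema["records"][record_type]}


def follow_set(subschema: dict, set_name: str) -> None:
    """Navigation boundary: only sets named in the subschema can be traversed."""
    if set_name not in subschema["sets"]:
        raise PermissionError(f"{set_name} is not in this subschema")


# Invented sample row holding every item the full schema defines.
stored_employee = {
    "EmployeeID": "E042", "Name": "A. Example", "Department": "D10",
    "Salary": 5200, "HireDate": "1979-03-01", "DoB": "1950-01-01",
    "SSN": "000-00-0000",
}

visible = read_record(PAYROLL_VIEW, "EMPLOYEE", stored_employee)
follow_set(PAYROLL_VIEW, "DEPT_EMPLOYEE")  # allowed: set is in the view
```

Here `visible` contains Salary but neither DoB nor SSN, and a call to `follow_set(PAYROLL_VIEW, "EMPLOYEE_DEPENDENT")` raises `PermissionError`.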
Each application program is compiled against its specific subschema. This is compile-time binding: the application's view of the database is fixed when it is built, so changing a subschema requires recompiling the affected applications. Modern relational views differ in that the DBMS resolves them when each query runs.
The CODASYL DML defines navigational operations for accessing and modifying database records. Unlike declarative query languages (SQL), the DML requires programmers to explicitly navigate through the database structure, following set relationships step by step.
The Currency Concept:
A fundamental concept in CODASYL DML is currency—the notion of current position in the database. The DBMS maintains several currency indicators:
DML operations implicitly read from and update these currency indicators, creating an imperative programming model where program state interacts with database state.
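A small Python model may help make this concrete. It assumes the usual CODASYL indicators (current of run unit, current of each record type, current of each set type) and leaves out realms. The `Currency` class, the `PARTICIPATES` table, and `note_found` are hypothetical names, and the update rule is simplified compared with a real DBMS.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Currency:
    """Simplified currency table kept on behalf of one running program."""
    run_unit: Optional[str] = None                            # most recently found record
    of_record: Dict[str, str] = field(default_factory=dict)   # one per record type
    of_set: Dict[str, str] = field(default_factory=dict)      # one per set type


# Set types each record type takes part in, as owner or member.
PARTICIPATES = {
    "SUPPLIER": ["SUPPLIER_SUPPLY"],
    "PART": ["PART_SUPPLY"],
    "SUPPLY": ["SUPPLIER_SUPPLY", "PART_SUPPLY"],
}


def note_found(cur: Currency, record_type: str, record_id: str) -> None:
    """Apply the side effects of a successful FIND to the currency table."""
    cur.run_unit = record_id
    cur.of_record[record_type] = record_id
    for set_name in PARTICIPATES[record_type]:
        cur.of_set[set_name] = record_id


cur = Currency()
note_found(cur, "SUPPLIER", "S1001")
note_found(cur, "SUPPLY", "S1001/P001")
note_found(cur, "PART", "P001")
```

After these three calls the run unit is the PART record, yet the current of SUPPLIER_SUPPLY is still the SUPPLY record, because PART takes no part in that set. This is the property that lets a program leave a set, visit another record, and then resume where it left off.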
```
/* ================================================================
   CODASYL DML - Core Navigation and Retrieval Operations
   ================================================================ */

/* FIND Operations - Locate records without retrieving data */

FIND FIRST record-name WITHIN set-name
    -- Locate first member of current set occurrence
    -- Updates current of record type and current of set

FIND NEXT record-name WITHIN set-name
    -- Locate next member in set after current position
    -- Loop through all members of a set

FIND OWNER WITHIN set-name
    -- Locate owner of current set occurrence
    -- Navigate "upward" from member to owner

FIND record-name VIA CURRENT OF set-name USING key-field
    -- Locate specific member by key value

FIND ANY record-name USING key-field
    -- Direct access via CALC key (like hash lookup)

FIND DUPLICATE record-name USING key-field
    -- Find next record with same key value

/* GET Operations - Retrieve data into program variables */

GET record-name
    -- Retrieve all data items of current record
    -- Current record must have been established by FIND

GET record-name; data-item-1; data-item-2
    -- Retrieve only specified data items

/* Combined FIND and GET */

OBTAIN NEXT record-name WITHIN set-name
    -- FIND and GET in one operation
    -- Convenience for common pattern
```

Modification Operations:
```
/* ================================================================
   CODASYL DML - Modification Operations
   ================================================================ */

/* STORE - Insert new records */
MOVE "S1001" TO SupplierID IN SUPPLIER.
MOVE "Acme Corp" TO SupplierName IN SUPPLIER.
MOVE "123 Main St" TO Street IN SUPPLIER.
STORE SUPPLIER.
    -- Creates new SUPPLIER record
    -- AUTOMATIC sets insert record into appropriate set occurrences
    -- CALC location mode hashes on SupplierID

/* MODIFY - Update current record */
FIND ANY SUPPLIER USING SupplierID = "S1001".
GET SUPPLIER.
MOVE "Acme Corporation" TO SupplierName IN SUPPLIER.
MODIFY SUPPLIER.
    -- Updates the current SUPPLIER record

/* ERASE - Delete records */
FIND ANY SUPPLIER USING SupplierID = "S1001".
ERASE SUPPLIER.
    -- Deletes current SUPPLIER record

ERASE SUPPLIER ALL.
    -- Deletes SUPPLIER and all of its members,
    -- whatever their retention class (cascade)

ERASE SUPPLIER SELECTIVE.
    -- Deletes SUPPLIER, its MANDATORY members, and
    -- OPTIONAL members with no other owner
    -- Keeps OPTIONAL members belonging to other sets

/* CONNECT/DISCONNECT - Set membership operations */
FIND ANY PART USING PartNumber = "P001".
FIND ANY SUPPLIER USING SupplierID = "S1002".
CONNECT SUPPLY TO PART_SUPPLY.
    -- Adds current SUPPLY record to the set occurrence
    -- Owned by current PART (PART_SUPPLY is INSERTION MANUAL)

DISCONNECT SUPPLY FROM PART_SUPPLY.
    -- Removes current SUPPLY record from set
    -- Only valid if RETENTION IS OPTIONAL

RECONNECT SUPPLY WITHIN SUPPLIER_SUPPLY.
    -- Moves member from one set occurrence to another
    -- Current of set changes to new owner
```

Notice how every operation depends on establishing 'currency' first. You FIND to position, then GET to retrieve, MODIFY to update, or ERASE to delete. This is fundamentally different from SQL's declarative approach, where you specify what you want and the system determines how to get it. In CODASYL, the programmer explicitly specifies how to navigate to the data.
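Besides currency, the membership operations (CONNECT, DISCONNECT, RECONNECT) are constrained by the retention class declared for each set. The sketch below models those rules in Python, using the retention classes from the set definitions earlier in this section. `SetType` and its methods are hypothetical, and the record labels are invented. It is a simplified model of the rules, not an implementation of any product.

```python
from enum import Enum


class Retention(Enum):
    OPTIONAL = "OPTIONAL"    # member may be disconnected entirely
    MANDATORY = "MANDATORY"  # member must stay in the set, but may change owner
    FIXED = "FIXED"          # member stays with its original owner until erased


# Retention classes as declared in the set definitions of this section.
RETENTION = {
    "SUPPLIER_SUPPLY": Retention.MANDATORY,
    "PART_SUPPLY": Retention.OPTIONAL,
    "DEPT_PROJECT": Retention.FIXED,
}


class SetType:
    """Toy set type: maps each member to the owner of its set occurrence."""

    def __init__(self, name: str):
        self.name = name
        self.retention = RETENTION[name]
        self.owner_of = {}

    def connect(self, member: str, owner: str) -> None:
        if member in self.owner_of:
            raise ValueError(f"{member} is already a member of {self.name}")
        self.owner_of[member] = owner

    def disconnect(self, member: str) -> None:
        if self.retention is not Retention.OPTIONAL:
            raise PermissionError(f"{self.name} is {self.retention.value}")
        del self.owner_of[member]

    def reconnect(self, member: str, new_owner: str) -> None:
        if member not in self.owner_of:
            raise KeyError(member)
        if self.retention is Retention.FIXED:
            raise PermissionError(f"{self.name} is FIXED")
        self.owner_of[member] = new_owner


part_supply = SetType("PART_SUPPLY")
supplier_supply = SetType("SUPPLIER_SUPPLY")

part_supply.connect("supply-1", "P001")
part_supply.disconnect("supply-1")              # allowed: OPTIONAL

supplier_supply.connect("supply-1", "S1001")
supplier_supply.reconnect("supply-1", "S1002")  # allowed: MANDATORY may move
```

Calling `supplier_supply.disconnect("supply-1")` raises `PermissionError` in this model, as would a `reconnect` on a FIXED set such as DEPT_PROJECT.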
To fully appreciate CODASYL programming, let's examine a complete example that navigates the network structure to answer a business question:
Problem: Find all parts supplied by supplier 'S1001' and list each part's description along with the supply price from that supplier.
```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. SUPPLIER-PARTS-REPORT.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-PART-COUNT    PIC 9(6) VALUE 0.
01 WS-EOF-FLAG      PIC 9 VALUE 0.
    88 END-OF-SET   VALUE 1.

* Database records defined in subschema are available here
* SUPPLIER, PART, SUPPLY records with their data items

PROCEDURE DIVISION.

MAIN-PARAGRAPH.
    PERFORM OPEN-DATABASE.
    PERFORM FIND-SUPPLIER.
    IF END-OF-SET
        DISPLAY "Supplier S1001 not found"
        PERFORM CLOSE-DATABASE
        STOP RUN.
    PERFORM PROCESS-SUPPLIER-PARTS.
    DISPLAY "Total parts from supplier S1001: " WS-PART-COUNT.
    PERFORM CLOSE-DATABASE.
    STOP RUN.

FIND-SUPPLIER.
* Direct access via CALC key - O(1) lookup using hash
    MOVE "S1001" TO SupplierID IN SUPPLIER.
    FIND ANY SUPPLIER USING SupplierID.
    IF DB-STATUS NOT = "00"
        SET END-OF-SET TO TRUE.

PROCESS-SUPPLIER-PARTS.
* Navigate through SUPPLIER_SUPPLY set (all supplies from this supplier)
    MOVE 0 TO WS-EOF-FLAG.
    FIND FIRST SUPPLY WITHIN SUPPLIER_SUPPLY.
    IF DB-STATUS = "00"
        PERFORM PROCESS-ONE-SUPPLY UNTIL END-OF-SET.

PROCESS-ONE-SUPPLY.
* We have a SUPPLY record - now navigate to its PART owner
* Retrieve supply data
    GET SUPPLY.

* Navigate via PART_SUPPLY set to get the associated part
* Move to PART that owns this supply
    FIND OWNER WITHIN PART_SUPPLY.
* Retrieve part data
    GET PART.

* Display the results
    DISPLAY "Part: " PartNumber IN PART
            " - " Description IN PART
            " Price: $" UnitPrice IN SUPPLY.
    ADD 1 TO WS-PART-COUNT.

* Navigate back and continue with next supply
* Current of SUPPLIER_SUPPLY was NOT changed by PART navigation
    FIND NEXT SUPPLY WITHIN SUPPLIER_SUPPLY.
    IF DB-STATUS NOT = "00"
        SET END-OF-SET TO TRUE.

OPEN-DATABASE.
    OPEN ALL.
    IF DB-STATUS NOT = "00"
        DISPLAY "Database open failed: " DB-STATUS
        STOP RUN.

CLOSE-DATABASE.
    CLOSE ALL.
```

Execution Flow Analysis:
FIND ANY SUPPLIER: Uses the CALC location mode to hash directly to supplier S1001. This is an O(1) operation on average, since hash collisions can add extra reads.
FIND FIRST SUPPLY WITHIN SUPPLIER_SUPPLY: Locates first member in the set occurrence owned by current supplier. Follows owner's first-member pointer.
GET SUPPLY: Retrieves supply record data into program variables.
FIND OWNER WITHIN PART_SUPPLY: From current SUPPLY record, follows its owner pointer in the PART_SUPPLY set to reach the associated PART. This is cross-set navigation.
GET PART: Retrieves part data.
FIND NEXT SUPPLY WITHIN SUPPLIER_SUPPLY: Returns to the SUPPLIER_SUPPLY set (currency preserved) and advances to next member.
Key Observation: The program navigates bidirectionally through the network—down from SUPPLIER to SUPPLY, then up to PART, then continues down from SUPPLIER to the next SUPPLY. The dual set membership of SUPPLY enables this.
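The same traversal can be mimicked in Python to make the pointer-following explicit. This is a loose model rather than a translation of the COBOL program: the `Part` and `Supply` classes, the `parts_from_supplier` function, and all sample data are invented for illustration, and prices are held in cents to avoid floating-point formatting. Members of each SUPPLIER_SUPPLY occurrence are listed in ascending price order, as the set definition specifies.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Part:
    part_number: str
    description: str


@dataclass
class Supply:
    unit_price_cents: int
    part: Part  # owner pointer in the PART_SUPPLY set


# Invented sample data.
bolt = Part("P001", "Hex bolt")
nut = Part("P002", "Lock nut")
gear = Part("P003", "Spur gear")

# SUPPLIER_SUPPLY occurrences: owner key -> members sorted by price.
supplier_supply: Dict[str, List[Supply]] = {
    "S1001": [Supply(10, nut), Supply(25, bolt)],
    "S1002": [Supply(475, gear)],
}


def parts_from_supplier(supplier_id: str) -> Optional[List[Tuple[str, str, int]]]:
    """List (part number, description, price) for one supplier, or None."""
    members = supplier_supply.get(supplier_id)   # FIND ANY SUPPLIER
    if members is None:
        return None
    report = []
    position = 0                                 # FIND FIRST SUPPLY WITHIN SUPPLIER_SUPPLY
    while position < len(members):               # stops when FIND NEXT runs off the set
        supply = members[position]               # GET SUPPLY
        part = supply.part                       # FIND OWNER WITHIN PART_SUPPLY; GET PART
        report.append((part.part_number, part.description, supply.unit_price_cents))
        position += 1                            # FIND NEXT SUPPLY WITHIN SUPPLIER_SUPPLY
    return report


report = parts_from_supplier("S1001")
```

The variable `position` plays the role of the current of SUPPLIER_SUPPLY. Following `supply.part` does not touch it, which is why the loop can resume with the next member after visiting each PART.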
Notice that the programmer must understand the entire database structure, know which sets connect which records, manage currency carefully, and code the complete navigation path. This is the core criticism of navigational databases—the burden on programmers is substantial, and program logic is tightly coupled to database structure.
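For contrast, here is how the same question might be posed declaratively. The sketch uses Python's standard `sqlite3` module with an in-memory database. The table layout, column names, and sample rows are assumptions made for this illustration and are not part of any CODASYL system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE supplier (supplier_id TEXT PRIMARY KEY, supplier_name TEXT);
    CREATE TABLE part (part_number TEXT PRIMARY KEY, description TEXT);
    CREATE TABLE supply (
        supplier_id      TEXT REFERENCES supplier(supplier_id),
        part_number      TEXT REFERENCES part(part_number),
        unit_price_cents INTEGER
    );
    INSERT INTO supplier VALUES ('S1001', 'Acme Corp');
    INSERT INTO supplier VALUES ('S1002', 'Other Co');
    INSERT INTO part VALUES ('P001', 'Hex bolt');
    INSERT INTO part VALUES ('P002', 'Lock nut');
    INSERT INTO part VALUES ('P003', 'Spur gear');
    INSERT INTO supply VALUES ('S1001', 'P001', 25);
    INSERT INTO supply VALUES ('S1001', 'P002', 10);
    INSERT INTO supply VALUES ('S1002', 'P003', 475);
    """
)

# The query states what is wanted; the engine chooses the access path.
rows = conn.execute(
    """
    SELECT p.part_number, p.description, s.unit_price_cents
    FROM supply AS s
    JOIN part AS p ON p.part_number = s.part_number
    WHERE s.supplier_id = ?
    ORDER BY s.unit_price_cents
    """,
    ("S1001",),
).fetchall()
```

The query names no sets, no owner pointers, and no currency. If the storage layout changes, the query text can stay the same, whereas the navigational program has to be rewritten to follow the new paths.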
The CODASYL specifications spawned numerous commercial implementations. These products dominated enterprise database markets throughout the 1970s and into the mid-1980s, before relational databases achieved sufficient performance for transaction processing.
Major CODASYL Database Products:
| Product | Vendor | Platform | Notable Features |
|---|---|---|---|
| IDMS | Cullinet (later CA) | IBM Mainframes | Most successful CODASYL implementation; added SQL interface in 1980s |
| IDS/II | Honeywell | Honeywell Mainframes | Descendant of Bachman's original IDS; strong CODASYL compliance |
| DMS-1100 | Univac/Unisys | Univac 1100 series | Known for performance and reliability |
| DBMS-10/20 | Digital Equipment | PDP-10, DECSYSTEM-20 | Popular in academic and research settings |
| VAX DBMS | Digital Equipment | VAX/VMS | Later added relational capabilities |
| IMAGE/3000 | Hewlett-Packard | HP 3000 | Simplified network model; very successful in midrange market |
| TOTAL | Cincom | Multiple platforms | Simpler implementation; large market share |
| DMS-II | Burroughs | Burroughs mainframes | Integrated with Burroughs COBOL environment |
IDMS: The Success Story
Integrated Database Management System (IDMS) by Cullinet became the most commercially successful CODASYL implementation. Key factors in its success:
IBM Platform Compatibility: Running on IBM mainframes gave access to the largest market segment.
Performance: Pointer-based navigation delivered excellent transaction processing performance—often faster than early relational systems.
Comprehensive Tools: IDMS included data dictionary, query facilities, and application development tools.
Evolution: Cullinet added SQL support in the 1980s (IDMS/SQL), allowing customers to transition gradually to relational access while preserving network investments.
Installed Base: By the mid-1980s, IDMS ran in over 4,000 installations worldwide.
After Cullinet was acquired by Computer Associates (CA), IDMS continued to be maintained and is still operational in some legacy environments today—a testament to the durability of these systems.
Although new network database deployments are rare, many CODASYL databases remain in production. Organizations with decades of investment in IDMS, IDS/II, or similar systems continue to run them for mission-critical applications. Migration costs and risks often outweigh the benefits of rewriting on modern platforms.
Although the relational model ultimately prevailed, CODASYL's influence on database technology remains significant:
Conceptual Contributions:
Data Independence: The three-level architecture (schema, subschema, storage) pioneered the separation of logical and physical concerns that remains fundamental to database design.
Data Definition Language: The concept of a formal DDL separate from procedural code became universal.
Integrity Constraints: CODASYL's membership constraints (MANDATORY, OPTIONAL, FIXED) and ordering specifications anticipated modern referential integrity concepts.
Database Administration: The DBA role, distinct from application programming, was formalized.
Standard Interfaces: The attempt to create vendor-neutral specifications, while imperfect, established the expectation that databases should have standard interfaces.
Modern Echoes: Graph Databases
Interestingly, modern graph databases like Neo4j, Amazon Neptune, and TigerGraph share conceptual similarities with CODASYL:
The key difference is that modern graph databases offer declarative query languages (Cypher, Gremlin, SPARQL) that hide navigational details, addressing CODASYL's primary weakness while retaining its structural flexibility.
You now understand CODASYL—the committee that standardized the network model, the comprehensive specifications they produced, and the commercial implementations that dominated enterprise computing for two decades. The navigational paradigm they formalized may have given way to relational declarativism, but their architectural concepts and the problems they solved remain relevant to modern database design. Next, we'll explore set relationships in depth.