Directory Implementation - Learning Module

Loading content...

0/227

Directory Entries

The Atomic Unit of Directories

Every directory—whether implemented as a linear list, hash table, or B-tree—is ultimately a collection of directory entries. Each entry answers a simple question: "What inode (or file metadata) corresponds to this filename?"

But the simplicity of this question belies the complexity of the answer. How long can filenames be? What character encodings are supported? What metadata should be duplicated in the entry vs. stored only in the inode? How do we mark entries as deleted without expensive compaction?

This page provides comprehensive coverage of directory entry design: the exact byte-level structure of entries in major file systems, strategies for variable-length filenames, metadata choices and their implications, and how entries interact with the directory structures we've studied.

What You Will Learn

By the end of this page, you will understand: (1) The essential components of every directory entry, (2) Fixed-length vs variable-length entry formats, (3) How different file systems structure their entries, (4) Metadata inclusion trade-offs, (5) Entry deletion and space reclamation, and (6) Special entries like '.' and '..'.

Essential Components of Directory Entries

At minimum, a directory entry must contain two pieces of information: the name of the file and a way to locate the file's metadata. Everything else is optional—but optional fields can dramatically improve performance.

Mandatory Components:

Required Entry Fields

•Filename — The human-readable name of the file. May be fixed-length (early systems), variable-length with length prefix, or null-terminated. Encoding varies (ASCII, UTF-8, UTF-16).
•File Locator — How to find the file's metadata and data. In Unix systems, this is an inode number. In FAT, it's the first cluster number. In NTFS, it's an MFT record reference.

Optional Components (Common Additions):

Optional Directory Entry Fields
Field	Purpose	Trade-off
File type	Is it a file, directory, symlink, etc.?	Avoids reading inode just to determine type
Entry length	For variable-length entries, how long is this entry?	Enables skipping entries without parsing names
Name length	How many bytes in the filename?	Enables reading variable-length names correctly
Hash/checksum	For integrity checking or hash-based lookup	Speeds up hash table searches
File size	Cached copy of file size from inode	Enables 'ls -l' without reading all inodes
Timestamps	Cached creation/modification times	Same benefit as file size
Deletion marker	Is this entry deleted/free?	Enables lazy deletion without compaction

Minimal vs Rich Directory Entries
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
Minimal Entry (Original Unix V7):
┌─────────────────────────────┐
│ inode (2 bytes)             │  ← Just the inode number
│ name (14 bytes, padded)     │  ← Just the filename
└─────────────────────────────┘
Total: 16 bytes
 
Rich Entry (NTFS):
┌─────────────────────────────┐
│ MFT Reference (8 bytes)     │  ← File locator
│ Entry size (2 bytes)        │  ← Variable length support
│ Name length (2 bytes)       │  ← For reading name
│ Flags (4 bytes)             │  ← Entry metadata
│ Parent directory (8 bytes)  │  ← For consistency checks
│ Creation time (8 bytes)     │  ← Cached timestamp
│ Modified time (8 bytes)     │  ← Cached timestamp
│ Access time (8 bytes)       │  ← Cached timestamp
│ File size (8 bytes)         │  ← Cached from MFT
│ Allocated size (8 bytes)    │  ← Cached from MFT
│ Filename (variable, UTF-16) │  ← Up to 255 chars
└─────────────────────────────┘
Total: 66+ bytes (but 'dir' doesn't need MFT reads!)
 
The trade-off:
- Minimal entries save space, but every 'ls -l' reads every inode
- Rich entries cost space, but directory listing is fast

The Metadata Duplication Dilemma

Storing file size in directory entries means it exists in two places (entry and inode). This speeds up 'ls -l' but creates a consistency risk: if the file is modified, both copies must be updated. File systems handle this with journaling or atomic updates.

Fixed-Length vs Variable-Length Entries

The choice between fixed-length and variable-length entries affects every aspect of directory implementation.

Fixed-Length Entry Systems:

Fixed-Length Entry Examples
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
// Original Unix V7: 16 bytes fixed
struct v7_direct {
    uint16_t d_ino;      // Inode number (0 = unused)
    char d_name[14];     // Filename, null-padded
};
 
// DOS FAT 8.3: 32 bytes fixed
struct fat_dirent {
    char name[8];        // Filename, space-padded
    char ext[3];         // Extension, space-padded
    uint8_t attr;        // Attributes
    uint8_t reserved[10];
    uint16_t time;       // Modification time
    uint16_t date;       // Modification date
    uint16_t cluster;    // First cluster
    uint32_t size;       // File size
};
 
// Advantages of fixed-length:
// 1. Direct indexing: entry[n] is at offset n * entry_size
// 2. Simple allocation: any empty slot works
// 3. Easy recovery: known boundaries between entries
// 4. Binary search possible (if sorted)
 
// Disadvantages:
// 1. Wasted space for short names ("a.c" uses 14 bytes)
// 2. Arbitrary name length limits (14 chars, 8.3)
// 3. Can't adapt to different needs
 
// Example: Finding entry #47 in fixed-length directory
struct v7_direct* get_entry(void* dir_data, int index) {
    return (struct v7_direct*)dir_data + index;  // O(1)!
}

Variable-Length Entry Systems:

Variable-Length Entry Examples
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
// Linux ext4 directory entry
struct ext4_dir_entry_2 {
    __le32 inode;        // Inode number
    __le16 rec_len;      // Total entry length (including padding)
    __u8   name_len;     // Actual filename length
    __u8   file_type;    // File type (for quick filtering)
    char   name[];       // Filename (NOT null-terminated)
};
 
// Entry layout in memory:
// ┌────────┬────────┬────────┬────────┬─────────────────────┐
// │ inode  │rec_len │name_len│  type  │   name + padding    │
// │4 bytes │2 bytes │ 1 byte │ 1 byte │   variable bytes    │
// └────────┴────────┴────────┴────────┴─────────────────────┘
 
// rec_len is aligned to 4 bytes and includes padding
// This allows quick traversal and absorbs deleted space
 
// Example: Parsing variable-length directory block
void parse_directory_block(char* block, size_t block_size) {
    size_t offset = 0;
    
    while (offset < block_size) {
        struct ext4_dir_entry_2* entry = 
            (struct ext4_dir_entry_2*)(block + offset);
        
        if (entry->rec_len == 0) break;  // Corruption check
        
        if (entry->inode != 0) {  // Not deleted
            // entry->name is NOT null-terminated!
            printf("%.*s -> inode %u\n", 
                   entry->name_len, entry->name, entry->inode);
        }
        
        offset += entry->rec_len;  // Skip to next entry
    }
}
 
// Advantages of variable-length:
// 1. Space efficient (short names use less space)
// 2. Long names supported (255 chars typical)
// 3. Flexible for future extensions
 
// Disadvantages:
// 1. Must scan sequentially (no direct indexing)
// 2. Complex allocation (find gap large enough)
// 3. Fragmentation as entries deleted/added

Fixed vs Variable Entry Comparison
Aspect	Fixed-Length	Variable-Length
Space for 'a.txt'	14-32 bytes	12 bytes (8 header + 4 name)
Space for 'very_long_filename.txt'	Truncated or error!	28 bytes
Find entry by index	O(1) - direct calculation	O(n) - must scan
Deletion	Set inode to 0	Merge rec_len with neighbor
Insertion	Find any slot with inode 0	Find gap with enough space
Max name length	Fixed (8, 14 chars)	Flexible (255+ chars)

The rec_len Magic

ext4's rec_len field serves multiple purposes: (1) it tells you where the next entry starts, (2) it includes padding for alignment, (3) when an entry is deleted, the previous entry's rec_len grows to absorb the space. This avoids compaction while allowing space reuse.

Directory Entry Formats Across File Systems

Let's examine the exact directory entry formats used by major file systems.

FAT (File Allocation Table):

FAT Directory Entries
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
// FAT short (8.3) directory entry - 32 bytes exactly
struct fat_short_entry {
    char     name[8];        // Filename, space-padded, uppercase
    char     ext[3];         // Extension, space-padded, uppercase
    uint8_t  attr;           // Attribute byte
    uint8_t  nt_reserved;    // Reserved for NT
    uint8_t  create_time_ms; // Creation time (10ms units)
    uint16_t create_time;    // Creation time (2-second resolution)
    uint16_t create_date;    // Creation date
    uint16_t access_date;    // Last access date (no time!)
    uint16_t cluster_hi;     // High 16 bits of cluster (FAT32)
    uint16_t modify_time;    // Modification time
    uint16_t modify_date;    // Modification date
    uint16_t cluster_lo;     // Low 16 bits of cluster
    uint32_t size;           // File size in bytes
};
 
// Attribute byte flags
#define ATTR_READ_ONLY  0x01
#define ATTR_HIDDEN     0x02
#define ATTR_SYSTEM     0x04
#define ATTR_VOLUME_ID  0x08  // Volume label, not a file
#define ATTR_DIRECTORY  0x10
#define ATTR_ARCHIVE    0x20
#define ATTR_LFN        0x0F  // Long filename marker
 
// FAT Long Filename (LFN) entry - also 32 bytes
// Used for names > 8.3 or with special characters
struct fat_lfn_entry {
    uint8_t  order;          // Sequence number (0x40 = last)
    uint16_t name1[5];       // Characters 1-5 (UCS-2)
    uint8_t  attr;           // Always 0x0F
    uint8_t  type;           // Always 0x00
    uint8_t  checksum;       // Checksum of short name
    uint16_t name2[6];       // Characters 6-11
    uint16_t cluster;        // Always 0x0000
    uint16_t name3[2];       // Characters 12-13
};
 
// Long filenames are stored BACKWARDS before short entry:
// Entry -3: LFN part 3 (chars 27-39)
// Entry -2: LFN part 2 (chars 14-26)  
// Entry -1: LFN part 1 (chars 1-13)
// Entry 0:  Short entry (8.3 name, actual metadata)

ext4 (Linux Extended Filesystem):

ext4 Directory Entries
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
// ext4 directory entry (variable length)
struct ext4_dir_entry_2 {
    __le32 inode;        // 4 bytes: Inode number (0 = deleted)
    __le16 rec_len;      // 2 bytes: Entry length (aligned to 4)
    __u8   name_len;     // 1 byte: Name length (1-255)
    __u8   file_type;    // 1 byte: File type
    char   name[];       // 0-255 bytes: Filename
};
 
// File type values (set at creation time)
#define EXT4_FT_UNKNOWN   0
#define EXT4_FT_REG_FILE  1
#define EXT4_FT_DIR       2
#define EXT4_FT_CHRDEV    3
#define EXT4_FT_BLKDEV    4
#define EXT4_FT_FIFO      5
#define EXT4_FT_SOCK      6
#define EXT4_FT_SYMLINK   7
 
// Example directory block:
// Offset   inode  rec_len name_len type  name
// 0        2      12      1        2     "."
// 12       2      12      2        2     ".."
// 24       100    16      8        1     "file.txt"
// 40       101    20      10       2     "subdir"
// 60       102    4036    7        1     "data.db"
//                 ^^^^
// Last entry's rec_len extends to end of block
 
// Why file_type exists:
// Without it: readdir() reads inode for EVERY file to determine type
// With it: Can filter "show only directories" without inode reads
// Trade-off: One extra byte per entry, must keep synchronized

NTFS (New Technology File System):

NTFS Index Entries
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
// NTFS stores directory entries as index entries in B+ trees
// Each entry contains a copy of the FILE_NAME attribute
 
struct ntfs_index_entry {
    // Index entry header
    uint64_t mft_reference;     // MFT record number + sequence
    uint16_t entry_length;      // Total size of this entry
    uint16_t key_length;        // Size of the key (FILE_NAME)
    uint32_t flags;             // INDEX_ENTRY_NODE, INDEX_ENTRY_END
    
    // FILE_NAME attribute (the key)
    uint64_t parent_directory;  // Parent's MFT reference
    uint64_t creation_time;     // 100-nanosecond intervals since 1601
    uint64_t modification_time;
    uint64_t mft_modification_time;
    uint64_t access_time;
    uint64_t allocated_size;    // Size on disk (clusters)
    uint64_t data_size;         // Logical file size
    uint32_t file_flags;        // READ_ONLY, HIDDEN, SYSTEM, etc.
    uint32_t reparse_point;     // For symlinks, mount points
    uint8_t  name_length;       // Filename length in characters
    uint8_t  name_type;         // POSIX, WIN32, DOS, WIN32_AND_DOS
    wchar_t  name[];            // UTF-16LE filename
    
    // If INDEX_ENTRY_NODE flag set:
    // uint64_t vcn;            // VCN of child node
};
 
// NTFS duplicates substantial metadata in directory entries
// This makes 'dir' incredibly fast:
// - No MFT reads needed for size, dates, attributes
// - Only need MFT for actual file access
 
// Name types:
// POSIX: Case-sensitive, full Unicode
// WIN32: Case-insensitive, most Unicode
// DOS: 8.3 short name for compatibility
// WIN32_AND_DOS: WIN32 name that's also valid DOS

Directory Entry Format Comparison
Feature	FAT	ext4	NTFS
Entry size	32 bytes (fixed)	8-263 bytes (variable)	66+ bytes (variable)
Max name length	255 (via LFN)	255 bytes	255 chars (510 bytes)
Name encoding	OEM/UCS-2	User-defined (usually UTF-8)	UTF-16LE
Metadata cached	Size, dates	File type only	Size, dates, flags
Deletion marker	First byte = 0xE5	inode = 0	Entry removed from B+ tree

Entry Deletion and Space Reclamation

When files are deleted, their directory entries must be removed. Different file systems handle this differently, with implications for performance and data recovery.

Lazy Deletion (Marking Only):

FAT Deletion
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
// FAT deletion: First byte of name set to 0xE5
void fat_delete_entry(struct fat_short_entry* entry) {
    entry->name[0] = 0xE5;  // Magic "deleted" marker
    // Everything else unchanged!
    // Size, cluster, dates all still present
}
 
// Why 0xE5?
// - 0x00 means "never used" (end of directory)
// - 0xE5 means "was used, now deleted"
// - Originally chosen by Tim Paterson for 86-DOS/MS-DOS
// - Japanese Kanji codepage uses E5 as first byte
//   So they use 0x05 instead (converted on read)
 
// Space reclamation:
// - Deleted entries can be reused for new files
// - Scan continues past deleted entries
// - Directory shrinks only if last entries deleted
 
// Data recovery implication:
// - Undelete tools scan for 0xE5 entries
// - If cluster chain not overwritten, file recoverable
// - This is why "secure delete" overwrites data

ext4 Deletion (rec_len Absorption):

ext4 Deletion
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
// ext4 deletion: Merge deleted entry with previous
void ext4_delete_entry(struct ext4_dir_entry_2* prev,
                       struct ext4_dir_entry_2* target) {
    // Set inode to 0 (entry is now free)
    target->inode = 0;
    
    // Absorb target's space into previous entry
    if (prev != NULL) {
        prev->rec_len += target->rec_len;
        // Now prev->rec_len spans both entries
        // target is effectively "gone"
    }
    // If target is first entry, just set inode=0
    // Space reclaimed when new entry fits in this slot
}
 
// Before deletion of "b.txt":
// ┌──────────────────┬───────────────────┬──────────────────┐
// │ a.txt (rec=20)   │ b.txt (rec=20)    │ c.txt (rec=4056) │
// └──────────────────┴───────────────────┴──────────────────┘
 
// After deletion:
// ┌──────────────────┬───────────────────┬──────────────────┐
// │ a.txt (rec=40)   │ (absorbed space)  │ c.txt (rec=4056) │
// └──────────────────┴───────────────────┴──────────────────┘
 
// Space reuse:
void ext4_insert_entry(char* block, const char* name, uint32_t inode) {
    struct ext4_dir_entry_2* entry = (struct ext4_dir_entry_2*)block;
    
    while (entry->rec_len > 0) {
        // Calculate minimum space this entry needs
        size_t min_size = 8 + entry->name_len;
        min_size = (min_size + 3) & ~3;  // Align to 4
        
        // If entry has extra space (from absorbed deletions)
        size_t new_entry_size = 8 + strlen(name);
        new_entry_size = (new_entry_size + 3) & ~3;
        
        if (entry->rec_len - min_size >= new_entry_size) {
            // Split: shrink this entry, insert new after it
            // ... (insertion code)
        }
        
        entry = next_entry(entry);
    }
}

B-tree Based Deletion (True Removal):

B-tree Deletion
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
B-tree directory deletion (XFS, NTFS, Btrfs):
 
Unlike linear lists, B-trees actually REMOVE entries:
 
1. Locate entry in leaf node (O(log n))
2. Remove entry from leaf
3. If leaf underflows (< minimum keys):
   a. Borrow from sibling if possible
   b. Otherwise, merge with sibling
4. Recursively fix parent if needed
 
Advantages:
- No fragmentation (entries always contiguous in nodes)
- No "deleted entry" scanning overhead
- Space immediately available after node cleanup
 
Disadvantages:
- More complex than marking
- Potential for cascading rebalancing
- Harder to "undelete" (entry truly gone)
 
Data recovery with B-trees:
- Much harder than FAT/ext4
- Entry removed from index immediately
- Only hope is journaling or snapshots
- Forensic tools must reconstruct trees

Deletion Strategy Comparison
Strategy	Speed	Space Reclaim	Recoverability
Mark deleted (FAT)	O(1) constant	Immediate reuse possible	High (data intact)
Absorb space (ext4)	O(1) constant	Immediate for same-size entries	Medium (inode may persist)
True removal (B-tree)	O(log n)	Eventual (after rebalance)	Low (entry gone)

Security Implication

Lazy deletion in FAT leaves filename visible even after deletion. Sensitive filenames can be recovered. This is why secure deletion tools overwrite the entire directory entry, not just the first byte. B-tree based systems are slightly better but filenames may still exist in journal or old tree versions.

Special Directory Entries

Every directory contains special entries that serve structural purposes beyond normal file references.

The '.' and '..' Entries:

Dot Entries
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
// Every Unix/Linux directory contains:
// "."  - Points to the directory itself
// ".." - Points to the parent directory
 
// When directory "/home/alice" (inode 100) is created:
struct ext4_dir_entry_2 dot = {
    .inode = 100,       // This directory's inode
    .rec_len = 12,
    .name_len = 1,
    .file_type = EXT4_FT_DIR,
    .name = "."
};
 
struct ext4_dir_entry_2 dotdot = {
    .inode = 50,        // Parent directory's inode (/home)
    .rec_len = 4084,    // Spans rest of block
    .name_len = 2,
    .file_type = EXT4_FT_DIR,
    .name = ".."
};
 
// Root directory is special:
// ".."  points to itself (inode 2 on ext4)
// There is no parent above root
 
// Why do these entries exist?
// 1. Convenience: "cd .." works without path parsing
// 2. Link counting: "." is why directory link count starts at 2
// 3. Path resolution: Relative paths need these references
// 4. Recovery: Can reconstruct tree by following ".." pointers
 
// Hard link count for directories:
// Empty directory: 2 (parent's reference + self ".")
// With subdirectory: 3 (parent + "." + subdir's "..")
// Each subdirectory adds 1 to parent's link count

Volume Labels and Special Markers:

Special Entries
C
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
// FAT volume label entry
struct fat_volume_label {
    char name[11];       // Volume label (space-padded)
    uint8_t attr;        // 0x08 = ATTR_VOLUME_ID
    // Rest of 32-byte entry is zeros
};
// Only exists in root directory
// Not a real file - just stores the volume name
 
// FAT End-of-Directory marker
// Entry with name[0] == 0x00 means "no more entries"
// Different from 0xE5 (deleted) - stops scanning entirely
 
// NTFS "." and ".." handling:
// NTFS stores "." as special case (MFT reference to self)
// ".." is implicit from parent reference in FILE_NAME attribute
// Not stored as separate index entries like Unix
 
// Symbolic link entries:
// ext4: file_type = EXT4_FT_SYMLINK, points to inode with target path
// NTFS: Reparse point with symlink data
// FAT: Not supported (VFAT added .lnk files as workaround)

Special Entry Considerations

•Hidden Files — Unix hides files starting with '.'. Windows uses HIDDEN attribute in directory entry. Not enforced by file system, just convention.
•Case Sensitivity — Entry comparison may be case-sensitive (Unix) or case-insensitive (Windows). Affects lookup, not storage.
•Reserved Names — Windows prohibits CON, PRN, AUX, NUL, COM1-9, LPT1-9. Unix only prohibits '/' in names and empty names.
•Maximum Path — Windows: 260 chars (MAX_PATH). Linux: 4096 chars (PATH_MAX). Entries themselves have lower limits (255 typical).

Why Prohibit '/' in Filenames?

'/' is the path separator in Unix. If allowed in filenames, '/home/user/file/name' would be ambiguous: is 'file/name' one file or 'file' directory containing 'name'? The prohibition is fundamental to path parsing.

Summary: Directory Entries

Directory entries are the atomic units that give meaning to directory data structures. Their design reflects decades of evolution in file system engineering.

Key Takeaways

•Essential Components — Every entry needs a filename and file locator. Additional metadata (type, size, timestamps) trade space for performance.
•Fixed vs Variable — Fixed-length entries enable direct indexing but waste space and limit names. Variable-length entries are flexible but require sequential scanning.
•Format Variations — FAT uses 8.3 with LFN extensions. ext4 uses variable-length with rec_len. NTFS caches extensive metadata in rich entries.
•Deletion Strategies — Lazy deletion (marking) is fast but leaves traces. True removal (B-trees) is cleaner but more complex.
•Special Entries — '.' and '..' enable relative paths and link counting. Volume labels, hidden files, and reserved names add file system personality.
•Design Trade-offs — Entry format choices affect space efficiency, lookup speed, listing performance, and data recoverability.

What's Next:

Now that we understand how individual entries are structured, the next page examines name lookup—the complete algorithm for resolving a pathname like /home/alice/documents/report.pdf into the actual file location.

Page Complete

You now understand directory entries comprehensively: their structure across file systems, the trade-offs between fixed and variable length formats, deletion and space reclamation strategies, and special entries. This knowledge completes your understanding of what directory data structures contain.