When you look at a file in your operating system's file browser, you see far more than just the file's name. You see its size, when it was last modified, perhaps an icon indicating its type, and maybe even a preview of its contents. All of this information—everything except the file's actual data—is collectively called file attributes or file metadata.
File attributes answer fundamental questions:
Understanding file attributes is crucial for system programming, administration, and security. This page explores every major attribute type in depth.
By the end of this page, you will understand all the major file attributes maintained by operating systems, how they're used in practice, how different operating systems handle them differently, and the implications for programming and system administration.
File attributes serve several critical purposes in operating system design:
1. User Information: Attributes like size, modification time, and file type help users understand and manage their files. When you sort files by date or size in a file manager, you're using attributes.
2. System Management: The operating system uses attributes to make decisions. Location attributes tell the OS where to find the file's blocks. Protection attributes determine access control. Type attributes may influence how the file is opened.
3. Application Behavior: Programs read attributes to make decisions. A backup program checks modification times to detect changed files. A compiler checks timestamps to avoid recompiling unchanged source files.
4. Security Enforcement: Permission and ownership attributes form the foundation of access control. The OS reads these attributes on every file access to determine if the operation is allowed.
5. Forensics and Auditing: Timestamps and ownership records support security investigations. When was a file changed? By whom? These questions are answered by attributes.
On most file systems, attributes are stored separately from file data. Unix/Linux stores them in 'inodes.' NTFS stores them in the Master File Table (MFT). This separation means you can read all a file's attributes without ever touching its data—essential for fast directory listings and searches.
Every file system, regardless of its design, must track certain fundamental attributes. These core attributes are universal across virtually all operating systems:
| Attribute | Description | Typical Storage |
|---|---|---|
| Name | Human-readable identifier for the file | Directory entry |
| Size | Length of file data in bytes | Inode / MFT (64-bit integer) |
| Location | Pointer(s) to physical data blocks | Inode / MFT (block list) |
| Type | Indicator of file kind (regular, directory, symlink, etc.) | Inode / MFT (type field) |
| Protection | Access control information | Inode / MFT (mode bits or ACL) |
| Owner | User and group that own the file | Inode (user ID, group ID) |
| Timestamps | Creation, modification, access times | Inode / MFT (multiple fields) |
Let's examine each of these core attributes in detail:
In many file systems (Unix/Linux especially), the file name is NOT stored with the file's other attributes. The name is stored in the directory entry, which contains the name and a reference to the file's inode. This allows hard links—multiple names for the same file.
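The link between names and inodes can be observed from user space. The following is a minimal sketch, assuming a POSIX system; the helper names `same_file` and `link_count` are hypothetical and chosen only for illustration. Two names refer to the same file when they resolve to the same inode on the same device.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper: true when both paths name the same underlying
 * file, i.e. the same inode number on the same device. */
bool same_file(const char *path_a, const char *path_b)
{
    struct stat a, b;
    if (stat(path_a, &a) != 0 || stat(path_b, &b) != 0)
        return false;               /* one of the names does not resolve */
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Hypothetical helper: number of directory entries (hard links) that
 * reference the file, or -1 on error. */
long link_count(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_nlink;
}
```

After `link("a.txt", "b.txt")` succeeds, `same_file("a.txt", "b.txt")` should report true and `link_count("a.txt")` should rise by one, since the inode itself carries no name.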
The size attribute seems simple—it's just a number indicating how many bytes the file contains. But there's surprising subtlety here.
Logical Size vs. Physical Size:
The logical size is the number of bytes from the file's start to its end—what an application sees when it reads the entire file. The physical size is the actual disk space consumed, which may differ due to:
Example: Sparse File
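// Create a sparse file (requires <fcntl.h> and <unistd.h>)
int fd = open("sparse.bin", O_CREAT | O_WRONLY | O_TRUNC, 0644); // O_CREAT needs a mode argument
lseek(fd, 1073741823, SEEK_SET); // Seek to offset 2^30 - 1, the last byte of 1 GB
write(fd, "X", 1); // Write 1 byte; the skipped range becomes a hole
close(fd);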
Logical size: 1,073,741,824 bytes (1 GB)
Physical size: ~4 KB (one block plus inode)
The file appears to contain 1 GB of data (mostly zeros), but only one small portion is actually stored on disk.
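Both sizes can be read from the same `stat()` result. This is a minimal sketch under the assumption that `st_blocks` counts 512-byte units, which is the Linux and BSD convention (POSIX leaves the unit implementation-defined); the function names are hypothetical.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Logical size: bytes from start to end of file, as applications see it.
 * Returns -1 if the file cannot be examined. */
long long logical_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}

/* Physical size: storage actually allocated to the file.
 * Assumption: st_blocks is expressed in 512-byte units. */
long long physical_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_blocks * 512;
}
```

For a sparse file, `physical_size()` should come out far below `logical_size()`; for an ordinary file it is usually slightly larger, because allocation is rounded up to whole blocks.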
- ls -l: Shows logical size
- ls -s: Shows physical blocks used
- du -h: Shows disk usage (physical)
- stat: Shows both logical size and block count
- wc -c: Counts actual bytes (logical)

Copying a sparse file with naive tools may 'materialize' it, filling all the holes with actual zeros. A 1 GB sparse file that uses 4 KB on disk could become a 1 GB file consuming 1 GB of disk space. Use 'cp --sparse=always' or 'rsync --sparse' to preserve sparseness.
File systems maintain multiple timestamps to track file history. The exact timestamps available vary by file system, but most support at least three:
Unix/Linux Standard Timestamps (POSIX):
| Timestamp | Name | Updated When |
|---|---|---|
| atime | Access Time | File data is read |
| mtime | Modification Time | File data is written |
| ctime | Change Time | File metadata OR data is changed |
Additional Timestamps on Modern Systems:
| Timestamp | Name | Available On |
|---|---|---|
| crtime/btime | Birth/Creation Time | ext4, NTFS, APFS, Btrfs |
| dtime | Deletion Time | ext3/ext4 (for recovery) |
Common Confusion: ctime vs Creation Time
On Unix/Linux, ctime stands for "change time," NOT "creation time." It records when the file's inode (metadata) was last modified. This includes:
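ctime cannot be set through the normal timestamp interfaces (touch, utimes()); the kernel stamps it whenever the inode changes. Altering it requires indirect means such as changing the system clock or editing the file system directly. This makes it useful for security auditing, because an ordinary user cannot easily backdate a permission change.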
| Operation | atime | mtime | ctime |
|---|---|---|---|
| Read file content | ✓ Updated | — | — |
| Write file content | — | ✓ Updated | ✓ Updated |
| chmod (change permissions) | — | — | ✓ Updated |
| chown (change owner) | — | — | ✓ Updated |
| rename / mv | — | — | ✓ Updated |
| touch (no content change) | ✓ Updated | ✓ Updated | ✓ Updated |
| Create hard link to file | — | — | ✓ Updated |
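The chmod row of the table can be reproduced with a short experiment. This is a sketch assuming a POSIX.1-2008 system that provides `utimensat()`; the helper names and the `change_times` struct are hypothetical. It also illustrates the asymmetry between the two timestamps: mtime can be set to any value, while no comparable call exists for ctime.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>      /* AT_FDCWD */
#include <sys/stat.h>   /* stat, utimensat, UTIME_OMIT */
#include <time.h>

/* Set mtime to a caller-chosen value and leave atime untouched.
 * The kernel still stamps ctime with the current time. */
int set_mtime(const char *path, time_t seconds)
{
    struct timespec times[2];
    times[0].tv_sec = 0;
    times[0].tv_nsec = UTIME_OMIT;   /* atime: keep the current value */
    times[1].tv_sec = seconds;       /* mtime: caller-chosen value */
    times[1].tv_nsec = 0;
    return utimensat(AT_FDCWD, path, times, 0);
}

/* Snapshot of the two timestamps being compared. */
struct change_times {
    time_t mtime_sec;
    time_t ctime_sec;
};

/* Returns 0 on success, -1 if the file cannot be examined. */
int read_times(const char *path, struct change_times *out)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    out->mtime_sec = st.st_mtime;
    out->ctime_sec = st.st_ctime;
    return 0;
}
```

A possible experiment: call `set_mtime(path, 1000000000)` to backdate mtime to September 2001, then `chmod()` the file and call `read_times()`. The mtime should still read 1000000000, while ctime reflects the time of the chmod.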
Updating atime on every read creates disk I/O overhead: a read operation now requires a write! Many systems mount filesystems with 'noatime' or 'relatime' to reduce this overhead. 'relatime' (the Linux default since kernel 2.6.30) updates atime only if the stored value is earlier than or equal to mtime or ctime, or is more than 24 hours old, preserving functionality while reducing writes.
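The relatime decision can be written as a small predicate. This is a sketch of the rule as documented for the Linux mount option (refresh atime when it is not newer than mtime or ctime, or when it is at least 24 hours old), not a copy of the kernel's code; the function name and the use of whole seconds are simplifying assumptions.

```c
#include <stdbool.h>
#include <time.h>

#define RELATIME_MAX_AGE (24 * 60 * 60)   /* one day, in seconds */

/* Decide whether a read at time `now` should rewrite the stored atime. */
bool relatime_should_update(time_t atime_sec, time_t mtime_sec,
                            time_t ctime_sec, time_t now)
{
    if (mtime_sec >= atime_sec)
        return true;     /* data was written since the last recorded read */
    if (ctime_sec >= atime_sec)
        return true;     /* metadata changed since the last recorded read */
    if (now - atime_sec >= RELATIME_MAX_AGE)
        return true;     /* recorded atime is at least a day old */
    return false;        /* skip the write; atime is recent enough */
}
```

The first two checks keep tools that ask "has this file been read since it last changed?" working, and the third bounds how stale atime can become.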
Every file in a multi-user operating system has an owner: the user who is primarily responsible for the file and controls access to it. On Unix-like systems, files also have a group owner.
User ID (UID) and Group ID (GID):
Internally, the file system doesn't store usernames—it stores numeric IDs:
$ ls -ln /etc/passwd
-rw-r--r-- 1 0 0 2853 Jan 15 10:30 /etc/passwd
             │ └── GID
             └──── UID
The mapping from UID/GID to human-readable names is stored in /etc/passwd and /etc/group (or provided by LDAP, Active Directory, etc.).
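Programs perform this lookup with `getpwuid()` (and `getgrgid()` for groups). The sketch below assumes a POSIX system; `owner_label` is a hypothetical helper that falls back to the raw number when no account matches, similar to what `ls -l` displays.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Write a printable owner for `uid` into `buf`: the account name when
 * one exists, otherwise the numeric UID. */
void owner_label(uid_t uid, char *buf, size_t buflen)
{
    struct passwd *pw = getpwuid(uid);   /* consults the system's user database */
    if (pw != NULL)
        snprintf(buf, buflen, "%s", pw->pw_name);
    else
        snprintf(buf, buflen, "%lu", (unsigned long)uid);
}
```

Passing `st_uid` from a `struct stat` to this helper yields the name shown in directory listings. Note that `getpwuid()` returns a pointer to static storage, so the name is copied out immediately.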
Ownership Determines:
| OS | Owner Identifier | Group Identifier | Notes |
|---|---|---|---|
| Linux/Unix | UID (numeric) | GID (numeric) | Stored in inode |
| Windows/NTFS | SID (binary) | N/A (ACL-based) | Owner is part of security descriptor |
| macOS/APFS | UID (numeric) | GID (numeric) | BSD heritage, plus extended ACLs |
When a user is deleted but their files remain, those files become 'orphaned'—owned by a UID that no longer has a name. Running 'ls -l' shows the raw UID number. This is common with files copied from other systems. Use 'find / -nouser' to locate such files.
Protection attributes define who can access a file and what operations they can perform. This is the heart of operating system security.
Unix Permission Bits:
The classic Unix model uses 9 permission bits arranged in three groups:
-rwxr-xr--
 │││││││││
 ││││││││└─ Others: Execute (x) - No
 │││││││└── Others: Write (w) - No
 ││││││└─── Others: Read (r) - Yes
 │││││└──── Group: Execute (x) - Yes
 ││││└───── Group: Write (w) - No
 │││└────── Group: Read (r) - Yes
 ││└─────── Owner: Execute (x) - Yes
 │└──────── Owner: Write (w) - Yes
 └───────── Owner: Read (r) - Yes
Numeric Representation:
Permissions are often expressed as octal numbers:
- r = 4, w = 2, x = 1
- rwxr-xr-- = 7 (4+2+1), 5 (4+0+1), 4 (4+0+0) = 754

| Permission | Octal | Use Case |
|---|---|---|
| -rwxr-xr-x | 755 | Executable programs, public scripts |
| -rw-r--r-- | 644 | Public readable files (web pages, docs) |
| -rw------- | 600 | Private files (SSH keys, passwords) |
| -rwxr-x--- | 750 | Group-accessible executables |
| drwxr-xr-x | 755 | Standard directory (public browsable) |
| drwx------ | 700 | Private directory (home directory) |
| drwxrwx--- | 770 | Shared group directory |
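The octal arithmetic can be checked mechanically. The following sketch converts between the numeric and symbolic forms of the nine basic permission bits; both function names are hypothetical, and the file-type character and special bits are deliberately ignored.

```c
/* Render the nine basic permission bits of `mode` as text such as
 * "rwxr-xr--". `out` must have room for 10 bytes. */
void mode_to_string(unsigned int mode, char out[10])
{
    static const char symbols[] = "rwxrwxrwx";
    for (int i = 0; i < 9; i++) {
        unsigned int bit = 1u << (8 - i);   /* 0400 first, 0001 last */
        out[i] = (mode & bit) ? symbols[i] : '-';
    }
    out[9] = '\0';
}

/* Parse text such as "rwxr-xr--" back into its numeric value (0754).
 * Returns -1 unless the text is exactly nine valid characters. */
int string_to_mode(const char *text)
{
    static const char symbols[] = "rwxrwxrwx";
    int mode = 0;
    for (int i = 0; i < 9; i++) {
        if (text[i] == symbols[i])
            mode |= 1 << (8 - i);
        else if (text[i] != '-')
            return -1;              /* also catches a string that ends early */
    }
    return text[9] == '\0' ? mode : -1;
}
```

With these helpers, `mode_to_string(0754, buf)` fills `buf` with `rwxr-xr--`, and `string_to_mode("rw-------")` returns `0600`. In C, the leading zero marks the literal as octal.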
Special Permission Bits:
Beyond the 9 basic bits, three special bits exist:
| Bit | Octal | Name | Effect on Files | Effect on Directories |
|---|---|---|---|---|
| SUID | 4000 | Set User ID | Execute as file owner | (No standard effect) |
| SGID | 2000 | Set Group ID | Execute as file group | New files inherit directory's group |
| Sticky | 1000 | Sticky Bit | (Obsolete on files) | Only owner can delete files |
The /tmp directory typically has permissions 1777 (drwxrwxrwt). The 't' indicates the sticky bit is set. Without it, any user could delete any other user's temporary files. With the sticky bit, only the file's owner, the directory's owner, or root can delete or rename a file, even though /tmp is world-writable.
Modern file systems support extended attributes (xattrs)—arbitrary name-value pairs attached to files. These extend beyond the traditional fixed set of attributes.
Namespaces (Linux):
Linux organizes extended attributes into namespaces:
| Namespace | Purpose | Access |
|---|---|---|
| user | Application-defined metadata | Any user (subject to permissions) |
| system | System-level metadata (ACLs, capabilities) | Root or privileged |
| security | Security labels (SELinux, AppArmor) | Root or privileged |
| trusted | Trusted system metadata | Root only |
Working with Extended Attributes:
# Set an extended attribute
$ setfattr -n user.author -v "John Smith" document.txt
# List extended attributes
$ getfattr -d document.txt
# file: document.txt
user.author="John Smith"
# Get a specific attribute
$ getfattr -n user.author document.txt
# file: document.txt
user.author="John Smith"
# Remove an extended attribute
$ setfattr -x user.author document.txt
Common extended attributes on Linux:
- POSIX ACLs: system.posix_acl_access and system.posix_acl_default
- SELinux labels: security.selinux (e.g., system_u:object_r:httpd_sys_content_t:s0)
- File capabilities: security.capability (executable privilege elevation)

Not all tools preserve extended attributes! Standard 'cp' may lose xattrs unless you use 'cp -a' or 'cp --preserve=xattr'. Uploading to many cloud services strips xattrs. Always verify critical metadata survives transfers.
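The setfattr and getfattr commands are thin wrappers over system calls. The sketch below uses the Linux-specific `setxattr()` and `getxattr()` from `<sys/xattr.h>`; the wrapper names are hypothetical, and the calls fail with ENOTSUP on file systems or mounts without user xattr support.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>   /* Linux-specific interface */

/* Attach an attribute such as "user.author" to a file.
 * Returns 0 on success, -1 on failure (errno describes the cause). */
int set_user_attr(const char *path, const char *name, const char *value)
{
    return setxattr(path, name, value, strlen(value), 0);
}

/* Read an attribute into `buf` as a NUL-terminated string.
 * Returns the value length, or -1 if it is missing or unreadable. */
ssize_t get_user_attr(const char *path, const char *name,
                      char *buf, size_t buflen)
{
    if (buflen == 0)
        return -1;
    ssize_t n = getxattr(path, name, buf, buflen - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';       /* xattr values are raw bytes, not C strings */
    return n;
}
```

Calling `set_user_attr("document.txt", "user.author", "John Smith")` is intended to mirror the setfattr example above. Other systems expose different interfaces (macOS, for instance, uses the same function names with extra parameters), so this sketch should be treated as Linux-only.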
Operating systems use various mechanisms to identify file types. No single approach dominates—different systems use different strategies, and multiple indicators may coexist.
1. File Extension (Name-Based)
The most visible type indicator is the filename extension:
- document.pdf → PDF file
- image.jpg → JPEG image
- program.exe → Windows executable

This is a convention, not enforcement. A file named malware.jpg could contain executable code. Extensions are:
2. Magic Numbers (Content-Based)
Many file formats begin with specific byte sequences called magic numbers:
| Magic Number (Hex) | Format |
|---|---|
| 89 50 4E 47 0D 0A 1A 0A | PNG image |
| FF D8 FF | JPEG image |
| 25 50 44 46 | PDF (starts with %PDF) |
| 50 4B 03 04 | ZIP archive (and derivatives) |
| 7F 45 4C 46 | ELF executable (Linux binaries) |
| 4D 5A | DOS/Windows executable |
The Unix file command uses /usr/share/misc/magic to identify file types by content, regardless of extension.
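Content-based detection amounts to comparing a file's leading bytes against known signatures. The sketch below covers only the six signatures listed in the table; the struct and function names are hypothetical, and a real tool such as libmagic applies a far larger and more nuanced rule set.

```c
#include <stddef.h>
#include <string.h>

struct magic_entry {
    unsigned char bytes[8];   /* signature, zero-padded */
    size_t length;            /* number of significant bytes */
    const char *format;
};

/* Signatures taken from the table above. */
static const struct magic_entry MAGIC_TABLE[] = {
    { {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A}, 8, "PNG image" },
    { {0xFF, 0xD8, 0xFF},                               3, "JPEG image" },
    { {0x25, 0x50, 0x44, 0x46},                         4, "PDF" },
    { {0x50, 0x4B, 0x03, 0x04},                         4, "ZIP archive" },
    { {0x7F, 0x45, 0x4C, 0x46},                         4, "ELF executable" },
    { {0x4D, 0x5A},                                     2, "DOS/Windows executable" },
};

/* Return the format whose signature prefixes `data`, or "unknown". */
const char *identify_format(const unsigned char *data, size_t size)
{
    size_t count = sizeof MAGIC_TABLE / sizeof MAGIC_TABLE[0];
    for (size_t i = 0; i < count; i++) {
        const struct magic_entry *m = &MAGIC_TABLE[i];
        if (size >= m->length && memcmp(data, m->bytes, m->length) == 0)
            return m->format;
    }
    return "unknown";
}
```

A caller would read the first eight bytes of a file into a buffer and pass them in. A buffer beginning with `%PDF` is reported as "PDF" whatever the file happens to be named.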
3. File System Type Field
Unix file systems store a type field in the inode:
| Symbol | Type |
|---|---|
| - | Regular file |
| d | Directory |
| l | Symbolic link |
| c | Character device |
| b | Block device |
| p | Named pipe (FIFO) |
| s | Socket |
This is the first character shown by ls -l:
drwxr-xr-x 2 user user 4096 Jan 15 10:30 directory/
-rw-r--r-- 1 user user 1234 Jan 15 10:30 file.txt
lrwxrwxrwx 1 user user 8 Jan 15 10:30 link -> file.txt
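Programs read the same type field through the `st_mode` member of `struct stat`. The sketch below, assuming a POSIX system, maps it to the character that ls prints; the function name is hypothetical. It calls `lstat()` rather than `stat()` so that a symbolic link is reported as a link instead of being followed.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/* Return the type character `ls -l` would show for `path`,
 * or '?' if the path cannot be examined. */
char file_type_char(const char *path)
{
    struct stat st;
    if (lstat(path, &st) != 0)
        return '?';
    if (S_ISREG(st.st_mode))  return '-';
    if (S_ISDIR(st.st_mode))  return 'd';
    if (S_ISLNK(st.st_mode))  return 'l';
    if (S_ISCHR(st.st_mode))  return 'c';
    if (S_ISBLK(st.st_mode))  return 'b';
    if (S_ISFIFO(st.st_mode)) return 'p';
    if (S_ISSOCK(st.st_mode)) return 's';
    return '?';
}
```

For example, `file_type_char("/")` returns `'d'`, and a path created with `symlink()` returns `'l'` even when its target is a regular file.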
For security-sensitive applications, always validate file type by content (magic numbers) rather than extension. An attacker can rename 'malware.exe' to 'resume.pdf'—only content analysis reveals the truth. Libraries like libmagic (used by the 'file' command) handle this reliably.
Perhaps the most critical metadata—from the file system's perspective—is where the file's data actually resides on storage. This location information is essential for reading and writing the file.
Block Pointers (Unix/ext4 style):
Traditional Unix file systems store block pointers in the inode:
Inode for 'largefile.dat'
├── Size: 15,000,000 bytes
├── Direct blocks [12 pointers]
│ └── blocks 1000, 1001, 1002, ..., 1011
├── Single indirect block
│ └── pointer to block 2000 containing 1024 more block addresses
├── Double indirect block
│ └── pointer to block 3000 containing pointers to indirect blocks
└── Triple indirect block
    └── pointer to block 4000 (for truly massive files)
This multi-level structure enables files of any size while keeping small files efficient.
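To see how the levels divide up a file, the sketch below computes how many levels of indirection are needed to reach a given logical block. It assumes the layout in the diagram: 12 direct pointers and 1024 block addresses per indirect block (4 KiB blocks with 4-byte addresses). The function name is hypothetical.

```c
#include <stdint.h>

#define DIRECT_POINTERS     12u     /* direct slots in the inode */
#define POINTERS_PER_BLOCK  1024u   /* 4 KiB block / 4-byte block address */

/* Levels of indirection needed to reach a logical block:
 * 0 = direct, 1 = single, 2 = double, 3 = triple, -1 = out of range. */
int indirection_level(uint64_t logical_block)
{
    uint64_t limit = DIRECT_POINTERS;   /* first block NOT covered so far */
    uint64_t span = 1;
    for (int level = 0; level <= 3; level++) {
        if (logical_block < limit)
            return level;
        span *= POINTERS_PER_BLOCK;     /* blocks reachable via the next level */
        limit += span;
    }
    return -1;
}
```

Under these assumptions, the last byte of the 15,000,000-byte example file lies in logical block 3662 (14,999,999 / 4096, rounded down). That is past the 12 direct and 1024 single-indirect blocks, so reaching it takes the double indirect pointer.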
| File System | Location Metadata Structure | Maximum File Size |
|---|---|---|
| ext2/ext3 | 12 direct + 3 indirect block pointers | ~2 TB (4K blocks) |
| ext4 | Extent tree (start block + length) | 16 TB per file with 4K blocks (1 EB is the volume limit) |
| NTFS | Data runs in MFT attributes | 16 EB theoretical |
| ZFS/Btrfs | Copy-on-write B-trees | 16 EB theoretical |
| FAT32 | Cluster chain in FAT | 4 GB per file |
Extents (Modern Approach):
Modern file systems use extents—contiguous ranges of blocks—rather than individual block pointers:
Extent-based layout for 'largefile.dat':
├── Extent 1: Start block 1000, length 1000 blocks (4 MB)
├── Extent 2: Start block 5000, length 2500 blocks (10 MB)
└── Extent 3: Start block 10000, length 125 blocks (500 KB)
Advantages of extents:
Normal users and applications never see block pointers or extents. The file system transparently translates between logical byte offsets and physical block addresses. Only specialized tools like 'debugfs' or 'hdparm --fibmap' expose this information.
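The translation from a logical block to a physical block over an extent list can be sketched as follows. The struct layout and names are hypothetical, the table reuses the three extents from the 'largefile.dat' example, and a linear scan stands in for the tree lookup that real file systems use.

```c
#include <stddef.h>
#include <stdint.h>

struct extent {
    uint64_t logical_start;    /* first logical block covered */
    uint64_t physical_start;   /* first physical block on disk */
    uint64_t length;           /* number of contiguous blocks */
};

/* The three extents from the example layout above. */
static const struct extent EXAMPLE_EXTENTS[] = {
    {    0,  1000, 1000 },
    { 1000,  5000, 2500 },
    { 3500, 10000,  125 },
};

/* Translate a logical block number into a physical one.
 * Returns 0 and stores the result on success, -1 for an unmapped block. */
int map_block(const struct extent *extents, size_t count,
              uint64_t logical, uint64_t *physical)
{
    for (size_t i = 0; i < count; i++) {
        const struct extent *e = &extents[i];
        if (logical >= e->logical_start &&
            logical - e->logical_start < e->length) {
            *physical = e->physical_start + (logical - e->logical_start);
            return 0;
        }
    }
    return -1;
}
```

With the example table, logical block 1000 maps to physical block 5000 (the start of the second extent), while logical block 3625 lies past the last extent and is reported as unmapped.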
Beyond the core and extended attributes, different operating systems and file systems support unique, specialized attributes.
Linux/ext4 File Attributes (chattr/lsattr):
| Flag | Meaning |
|---|---|
| i | Immutable: File cannot be modified, deleted, or renamed (even by root) |
| a | Append-only: Can only add data, not modify or delete |
| A | No atime updates: Don't update access time |
| c | Compressed: File is transparently compressed (honored by file systems such as Btrfs, not by ext4) |
| e | Extent format: File uses extents (default on ext4) |
| j | Journaled: Data written to journal before file |
| s | Secure deletion: Zero blocks on deletion (not honored by ext2/ext3/ext4) |
NTFS Attributes (Windows):
| Attribute | Meaning |
|---|---|
| Read-only | File cannot be modified |
| Hidden | File not shown in normal directory listings |
| System | Critical system file |
| Archive | File has been modified since last backup |
| Encrypted | File contents are encrypted (EFS) |
| Compressed | File is transparently compressed |
| Sparse | File has sparse regions (unallocated blocks) |
| Offline | File data is stored remotely (hierarchical storage) |
macOS-Specific Attributes:
| Attribute | Meaning |
|---|---|
| UF_HIDDEN | Hidden from Finder |
| UF_IMMUTABLE | User immutable flag |
| SF_ARCHIVED | Archived by backup software |
| com.apple.quarantine | Downloaded file quarantine flag |
| com.apple.lastuseddate | Spotlight last used date |
| Resource Fork | Classic Mac metadata stream |
On Linux, setting the immutable attribute with 'chattr +i file' prevents even root from modifying or deleting the file until the flag is removed with 'chattr -i file'. This is powerful for protecting critical configuration files or creating write-once audit logs.
Different tools are used to view and modify file attributes across operating systems:
Linux/Unix:
# Basic attributes (ls)
$ ls -l file.txt
-rw-r--r-- 1 user group 1234 Jan 15 10:30 file.txt
# Detailed attributes (stat)
$ stat file.txt
File: file.txt
Size: 1234 Blocks: 8 IO Block: 4096 regular file
Device: 801h/2049d Inode: 12345678 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ user) Gid: ( 1000/ group)
Access: 2024-01-15 10:30:00.000000000 +0000
Modify: 2024-01-15 10:25:00.000000000 +0000
Change: 2024-01-15 10:25:00.000000000 +0000
Birth: 2024-01-14 09:00:00.000000000 +0000
# Extended attributes
$ getfattr -d file.txt
$ lsattr file.txt # Linux-specific attributes
----i--------e-- file.txt
Windows (PowerShell):
# Basic attributes
> Get-ItemProperty .\file.txt
# File system info
> Get-Item .\file.txt | Format-List *
# Modify attributes
> Set-ItemProperty .\file.txt -Name IsReadOnly -Value $true
# View NTFS streams
> Get-Item .\file.txt -Stream *
macOS:
# Extended attributes (crucial for Mac files)
$ xattr -l file.txt
com.apple.quarantine: 0083;5a123456;Chrome;ABC123
# Remove quarantine flag
$ xattr -d com.apple.quarantine file.txt
# Show all metadata including resource fork
$ ls -l@ file.txt
At the programming level, the stat() system call (or fstat() for open files, lstat() for symbolic links themselves) retrieves most file attributes into a struct stat. Windows provides GetFileInformationByHandle(), and GetFileInformationByHandleEx() with a FILE_INFO_BY_HANDLE_CLASS value selecting which information to return. These are the fundamental interfaces for attribute access.
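A short sketch of the stat() interface on a POSIX system follows. The function name and the output format are hypothetical and chosen only for illustration. The file's data is never opened or read; every value comes from the inode.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Format a one-line summary of the attributes stat() returns.
 * Returns 0 on success, -1 if the file cannot be examined. */
int describe_file(const char *path, char *buf, size_t buflen)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return -1;
    snprintf(buf, buflen,
             "size=%lld mode=%04o uid=%lu gid=%lu links=%lu inode=%llu",
             (long long)st.st_size,
             (unsigned)(st.st_mode & 07777),   /* permission and special bits */
             (unsigned long)st.st_uid,
             (unsigned long)st.st_gid,
             (unsigned long)st.st_nlink,
             (unsigned long long)st.st_ino);
    return 0;
}
```

The casts are there because the exact integer types behind `off_t`, `uid_t`, `nlink_t`, and `ino_t` differ between platforms, so each value is widened to a type with a known printf format.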
We've explored the complete landscape of file attributes—from basic properties like size and timestamps to advanced features like extended attributes and system-specific flags. Let's consolidate:
What's next:
Now that we understand what properties files have, we'll explore file operations—the actions you can perform on files. Opening, reading, writing, seeking, closing, and the many variations and subtleties of each operation are essential for system programming.
You now have comprehensive knowledge of file attributes across major operating systems. This foundation enables you to understand file system behavior, debug permission issues, write robust file-handling code, and effectively administer systems.