Spool directories are where the promise of spooling becomes reality. They provide the persistent storage that enables jobs to survive system crashes, device failures, and network interruptions. Without robust spool directory design, spooling would be merely buffering—temporary and fragile.
The organization of spool directories reflects decades of hard-won experience with multi-user systems, security requirements, concurrent access, and recovery scenarios. Understanding this structure reveals fundamental principles of reliable system design.
This page covers spool directory organization, file naming conventions, permission models, quota enforcement, cleanup policies, and crash recovery. You'll understand how to design and manage reliable persistent queue storage.
Spool directories follow a hierarchical organization designed for efficiency, security, and manageability.
The /var/spool Hierarchy:
On UNIX systems, /var/spool/ is the standard location for spool data. The /var filesystem is for variable data that changes during operation, kept separate from the OS (/) and user data (/home).
Design Rationale:
/var can be a dedicated partition, preventing spool growth from filling the root filesystem.

| Path | Purpose | Typical Size | Security |
|---|---|---|---|
| /var/spool/cups | Print jobs | 100MB - 10GB | cupsd:lp, 0710 |
| /var/spool/mail | User mailboxes | 1GB - 100GB | root:mail, 1777 |
| /var/spool/mqueue | Outbound mail queue | 10MB - 1GB | root:smmsp, 0700 |
| /var/spool/cron | Cron job definitions | <10MB | root:root, 0700 |
| /var/spool/at | One-time jobs | <100MB | daemon:daemon, 1770 |
| /var/spool/anacron | Anacron timestamps | <1MB | root:root, 0755 |
```bash
#!/bin/bash
# Explore spool directory structure

echo "=== Print Spool (CUPS) ==="
ls -la /var/spool/cups/ 2>/dev/null | head -20
echo ""

echo "=== CUPS Subdirectories ==="
# d* = data files (job content)
# c* = control files (job metadata)
# tmp/ = temporary processing files
find /var/spool/cups -type f 2>/dev/null | head -10
echo ""

echo "=== Mail Spool ==="
ls -la /var/spool/mail/ 2>/dev/null | head -10
echo ""

echo "=== Disk Usage ==="
du -sh /var/spool/* 2>/dev/null | sort -h
echo ""

echo "=== Filesystem Check ==="
df -h /var/spool
```

Spool file naming follows conventions that encode essential information while ensuring uniqueness across concurrent operations.
CUPS Naming Convention:
```
d<job-id>-<doc-num>   Data file (job content)
c<job-id>             Control file (metadata)
```
Example: Job 1234 with 2 documents:
- d01234-001 - First document data
- d01234-002 - Second document data
- c01234 - Control file with job attributes

Naming Design Principles:

- Encode the job ID in the name so files map directly to queue entries
- Zero-pad numeric fields so plain lexical sort matches job order
- Use a leading letter (d for data, c for control) to distinguish file roles at a glance
- Generate names atomically (e.g., via mkstemp) so concurrent submissions never collide
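The zero-padding principle is easy to demonstrate: a minimal shell sketch using a scratch directory (the job numbers are made up for illustration) shows that plain lexical sort preserves numeric job order.

```shell
#!/bin/bash
# Demonstrate why zero-padded names keep lexical order equal to job order.
# Purely illustrative; real spoolers generate these names internally.
set -eu

dir=$(mktemp -d)

# Simulate spool data files for jobs 9, 10, and 100
for id in 9 10 100; do
    printf -v name 'd%05d-001' "$id"   # e.g. d00009-001
    touch "$dir/$name"
done

# Plain lexical sort now matches numeric job order:
# d00009-001, d00010-001, d00100-001
ls "$dir" | sort

rm -rf "$dir"
```

Without padding, `d9` would sort after `d10` and `d100`, which breaks any tool that relies on directory order to approximate queue order.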
```c
/* Spool File Creation with Atomic Semantics */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>

#define SPOOL_DIR "/var/spool/myapp"
#define SPOOL_TMP "/var/spool/myapp/tmp"

/* Generate unique spool file safely */
int create_spool_file(int job_id, const void *data, size_t len)
{
    char tmp_path[256], final_path[256];

    /* Create in temporary directory first */
    snprintf(tmp_path, sizeof(tmp_path),
             "%s/d%05d.XXXXXX", SPOOL_TMP, job_id);

    int fd = mkstemp(tmp_path);   /* Creates unique temp file */
    if (fd < 0)
        return -1;

    /* Set restrictive permissions */
    fchmod(fd, 0640);

    /* Write data */
    if (write(fd, data, len) != (ssize_t)len) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }

    /* Ensure data hits disk before rename */
    fsync(fd);
    close(fd);

    /* Atomic rename to final location */
    snprintf(final_path, sizeof(final_path),
             "%s/d%05d-001", SPOOL_DIR, job_id);

    if (rename(tmp_path, final_path) != 0) {
        unlink(tmp_path);
        return -1;
    }

    return 0;  /* Success - file now visible atomically */
}
```

The create-in-temp-then-rename pattern is critical for reliability. A crash during writing leaves only temp files (cleaned on restart). The rename() is atomic on POSIX systems: the file either appears complete or not at all. Other processes never see partial spool files.
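The same pattern works from shell scripts. A sketch using a scratch directory (the paths and job ID are illustrative stand-ins, not a real spooler's layout):

```shell
#!/bin/bash
# Atomic spool submission from shell: write to tmp/, flush, then mv.
# mv within one filesystem is a rename(2), so readers never see partial files.
set -eu

SPOOL=$(mktemp -d)          # stand-in for /var/spool/myapp
mkdir -p "$SPOOL/tmp"

submit_job() {
    local id=$1 payload=$2
    local tmp final
    tmp=$(mktemp "$SPOOL/tmp/d$id.XXXXXX")   # unique temp file
    printf '%s' "$payload" > "$tmp"
    chmod 0640 "$tmp"
    sync "$tmp" 2>/dev/null || sync          # best-effort flush to disk
    final="$SPOOL/d$id-001"
    mv "$tmp" "$final"                       # atomic on the same filesystem
    echo "$final"
}

f=$(submit_job 01234 "hello spool")
cat "$f"            # prints: hello spool
ls "$SPOOL/tmp"     # empty: the temp file was renamed away
rm -rf "$SPOOL"
```

The one caveat `mv` shares with `rename()`: atomicity only holds when tmp/ and the final directory are on the same filesystem, which is why spoolers keep their temp directory inside the spool tree.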
Spool directories present unique security challenges: they must accept data from untrusted users while protecting job contents and system integrity.
Security Requirements:
CUPS Permission Model:
| Path | Owner | Permissions | Purpose |
|---|---|---|---|
| /var/spool/cups | root:lp | 0710 | Base directory |
| /var/spool/cups/tmp | root:lp | 0710 | Filter temp files |
| /run/cups/certs | root:lpadmin | 0511 | Auth certificates |
| Spool data files | root:lp | 0640 | Job content |
| Control files | root:lp | 0640 | Job metadata |
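The 0710/0640 layout from the table can be reproduced and verified on a scratch directory; a sketch (a mock tree under mktemp is used because touching /var/spool requires root, and `stat -c` assumes GNU coreutils):

```shell
#!/bin/bash
# Recreate the CUPS-style permission layout on a scratch tree and verify it.
set -eu

base=$(mktemp -d)

mkdir -p "$base/spool/tmp"
chmod 0710 "$base/spool" "$base/spool/tmp"

touch "$base/spool/d00001-001" "$base/spool/c00001"
chmod 0640 "$base/spool/d00001-001" "$base/spool/c00001"

# stat -c %a prints the octal mode (GNU coreutils)
stat -c '%a %n' "$base/spool" "$base/spool/d00001-001"

# 710: owner full access; group may traverse (x) but not list (r); others nothing.
# 640: owner read/write, group read-only; job content hidden from other users.
rm -rf "$base"
```

The traverse-but-not-list mode on the base directory is the key trick: a process that already knows a job's filename (the scheduler, a filter) can open it, but ordinary users cannot enumerate other people's jobs.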
Key Security Mechanisms:
Authentication Flow:
CUPS authenticates users before accepting jobs.
Uncontrolled spool growth can fill filesystems and crash systems. Multiple layers of protection exist.
Space Protection Strategies:
CUPS Configuration:
```
# /etc/cups/cupsd.conf
MaxJobSize 100m        # Max 100MB per job
PreserveJobFiles No    # Don't keep completed jobs
PreserveJobHistory Yes # Keep job history only
MaxJobs 500            # Max 500 active jobs
MaxJobsPerUser 100     # Max 100 jobs per user
MaxJobsPerPrinter 100  # Max 100 jobs per printer
```
```bash
#!/bin/bash
# Monitor spool space and alert on thresholds

SPOOL_DIR="/var/spool/cups"
WARN_PERCENT=80
CRIT_PERCENT=95

# Get usage percentage
USAGE=$(df --output=pcent "$SPOOL_DIR" | tail -1 | tr -d ' %')

if [ "$USAGE" -ge "$CRIT_PERCENT" ]; then
    echo "CRITICAL: Spool at ${USAGE}%"
    # Emergency cleanup: remove oldest completed jobs
    find "$SPOOL_DIR" -name "d*" -mtime +7 -delete
    systemctl restart cups
elif [ "$USAGE" -ge "$WARN_PERCENT" ]; then
    echo "WARNING: Spool at ${USAGE}%"
    logger -p daemon.warning "Print spool usage at ${USAGE}%"
fi

# Report per-user usage (if tracking enabled)
echo "=== Jobs by User ==="
lpstat -o | awk '{print $2}' | sort | uniq -c | sort -rn | head -10
```

A malicious user can intentionally fill the spool with large jobs, causing denial of service. Defense: per-user quotas, job size limits, authentication requirements, and monitoring alerts. Some sites disable anonymous printing entirely.
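A per-user quota check can be sketched in a few lines of shell. The per-user directory layout and the 10 MB limit are assumptions made for the sketch, not how CUPS tracks quotas internally:

```shell
#!/bin/bash
# Sketch of per-user spool quota enforcement before accepting a job.
# Layout assumption: one subdirectory per user under $SPOOL; 10 MB quota.
set -eu

SPOOL=$(mktemp -d)                 # stand-in for a per-user spool area
QUOTA_KB=$((10 * 1024))            # 10 MB per user, arbitrary for the sketch

accept_job() {
    local user=$1 size_kb=$2
    mkdir -p "$SPOOL/$user"
    local used_kb
    used_kb=$(du -sk "$SPOOL/$user" | awk '{print $1}')
    if [ $((used_kb + size_kb)) -gt "$QUOTA_KB" ]; then
        echo "REJECT: $user over quota (${used_kb}KB used)" >&2
        return 1
    fi
    # Accept: write a placeholder file of the stated size
    dd if=/dev/zero of="$SPOOL/$user/job.$$.$size_kb" \
       bs=1024 count="$size_kb" 2>/dev/null
    echo "ACCEPT: $user (+${size_kb}KB)"
}

accept_job alice 4096            # fits
accept_job alice 4096            # still fits
accept_job alice 4096 || true    # third 4MB job exceeds 10MB: rejected
rm -rf "$SPOOL"
```

Checking the quota before writing any data is the important ordering: rejecting after the bytes have landed still lets an attacker consume the disk transiently.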
Spool cleanup balances storage efficiency against the ability to reprint or audit jobs.
Cleanup Triggers:
CUPS Cleanup Behavior:
- `PreserveJobFiles No`: Delete spool files on completion
- `PreserveJobFiles Yes`: Keep until MaxJobs exceeded
- `PreserveJobHistory Yes/No`: Keep metadata after files deleted
- `AutoPurgeJobs Yes`: Automatically purge old jobs
- `MaxHoldTime 86400`: Cancel held jobs after 1 day
```bash
#!/bin/bash
# Comprehensive spool cleanup script

# Clean print spool - jobs older than 7 days
find /var/spool/cups -name "d*" -type f -mtime +7 -delete 2>/dev/null
find /var/spool/cups -name "c*" -type f -mtime +7 -delete 2>/dev/null

# Clean mail queue - deferred messages older than 30 days
if [ -d /var/spool/postfix/deferred ]; then
    find /var/spool/postfix/deferred -type f -mtime +30 -delete
fi

# Clean at jobs - completed jobs
find /var/spool/at -name "*.done" -mtime +1 -delete 2>/dev/null

# Clean temp files
find /var/spool/*/tmp -type f -mtime +1 -delete 2>/dev/null

# Report results
echo "Spool cleanup complete"
du -sh /var/spool/*
```

| Environment | Job Files | Job History | Rationale |
|---|---|---|---|
| Home/Small Office | 0 (immediate) | 7 days | Space conservation |
| Enterprise | 24 hours | 90 days | Reprint capability |
| Regulated Industry | 30 days | 7 years | Audit requirements |
| Public Kiosk | 0 | 0 | Privacy protection |
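The retention table can be driven as data rather than hard-coded into separate scripts; a sketch (the environment names mirror the table, the directory is a scratch stand-in):

```shell
#!/bin/bash
# Apply a retention policy: environment name -> days to keep job files.
set -eu

retention_days() {
    case "$1" in
        home)       echo 0  ;;   # delete immediately
        enterprise) echo 1  ;;   # 24 hours
        regulated)  echo 30 ;;   # long retention for audit
        kiosk)      echo 0  ;;   # privacy: nothing kept
        *)          echo 7  ;;   # conservative default
    esac
}

purge_jobs() {
    local dir=$1 env=$2 days
    days=$(retention_days "$env")
    if [ "$days" -eq 0 ]; then
        find "$dir" -name 'd*' -type f -delete
    else
        find "$dir" -name 'd*' -type f -mtime +"$days" -delete
    fi
}

dir=$(mktemp -d)
touch "$dir/d00001-001"
purge_jobs "$dir" kiosk          # kiosk policy: purge everything now
ls "$dir"                        # job file is gone
rm -rf "$dir"
```

Keeping the policy in one function makes it auditable: a regulator or privacy review can read the mapping directly instead of hunting through cron jobs.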
A key benefit of spool persistence is crash recovery. When the system restarts, queued jobs should resume without data loss.
Recovery Process:
Ensuring Recoverability:
```c
/* Spool Recovery on Daemon Startup */
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct job {
    int id;
    char *state;
    char *data_path;
    struct job *next;
} job_t;

/* Helpers implemented elsewhere in the daemon */
char *read_job_state(const char *spool_dir, int job_id);
char *find_data_file(const char *spool_dir, int job_id);
void  cleanup_orphans(const char *spool_dir);

/* Recover jobs from spool directory */
job_t *recover_spool(const char *spool_dir)
{
    DIR *dir = opendir(spool_dir);
    if (!dir)
        return NULL;

    job_t *jobs = NULL;
    struct dirent *entry;

    /* Pass 1: Find control files (authoritative job list) */
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] != 'c')
            continue;

        int job_id = atoi(entry->d_name + 1);

        /* Parse control file for job state */
        job_t *job = malloc(sizeof(job_t));
        if (!job)
            continue;
        job->id = job_id;
        job->state = read_job_state(spool_dir, job_id);
        job->data_path = find_data_file(spool_dir, job_id);

        /* Validate data file exists */
        if (!job->data_path) {
            fprintf(stderr, "WARN: Job %d missing data file\n", job_id);
            free(job);
            continue;
        }

        job->next = jobs;
        jobs = job;
        fprintf(stderr, "Recovered job %d, state=%s\n", job_id, job->state);
    }

    closedir(dir);

    /* Pass 2: Clean orphan data files (no control file) */
    cleanup_orphans(spool_dir);

    return jobs;
}
```

Test crash recovery by submitting jobs, then killing the daemon (kill -9) and restarting. Jobs should reappear in the queue. Also test: power loss (pull the plug on a test system), filesystem full during write, network partition during remote print.
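The two-pass recovery scan can also be prototyped in shell for testing against a mock spool directory; a sketch using the c/d naming convention from above (the simulated crash state is made up for illustration):

```shell
#!/bin/bash
# Prototype of the two-pass recovery scan: control files are authoritative,
# data files without a matching control file are orphans and get removed.
set -eu

spool=$(mktemp -d)

# Simulated crash state: job 1 is complete, job 2's control file never landed
touch "$spool/c00001" "$spool/d00001-001"
touch "$spool/d00002-001"              # orphan: no c00002

# Pass 1: enumerate jobs from control files
for ctl in "$spool"/c*; do
    id=${ctl##*/c}
    if ls "$spool/d$id"-* >/dev/null 2>&1; then
        echo "Recovered job $id"
    else
        echo "WARN: job $id missing data file" >&2
    fi
done

# Pass 2: remove data files whose control file is missing
for dat in "$spool"/d*; do
    id=$(basename "$dat"); id=${id#d}; id=${id%%-*}
    [ -e "$spool/c$id" ] || { echo "Purging orphan $dat"; rm -f "$dat"; }
done

rm -rf "$spool"
```

Because the temp-then-rename submission path never creates a control file before its data file is complete, an orphan data file can only mean a crash mid-submission, so deleting it is always safe.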
Next, we'll explore daemon processes—the background services that drive spooling systems, including their lifecycle management, process models, and system integration.