Before there were computers, there were queues. People waiting in line at markets, customers at bank tellers, travelers at passport control—the First-In, First-Out (FIFO) principle governs countless aspects of human society. It embodies a fundamental notion of fairness: those who arrive first are served first.
In computer science, the FIFO queue is one of the most fundamental abstract data types. It provides the essential mechanism that powers FCFS scheduling, message passing systems, breadth-first search, and countless other algorithms. Understanding the FIFO queue deeply—its invariants, implementations, and tradeoffs—is essential for systems programming.
This page explores the FIFO queue not as a simple data structure tutorial, but as the critical infrastructure that makes FCFS scheduling possible and predictable.
By the end of this page, you will understand the FIFO queue's formal properties, multiple implementation strategies, their tradeoffs in the context of OS scheduling, and how real operating systems implement ready queues. You'll be able to implement and analyze queue structures suitable for kernel-level scheduling.
A FIFO queue is defined by its behavioral contract—the operations it supports and the invariants it maintains—independent of any particular implementation. This abstraction allows us to reason about queue behavior and substitute implementations as needed.
Formal Definition:
A FIFO Queue Q is an ordered collection of elements supporting five operations: Enqueue, Dequeue, Front, IsEmpty, and Size, specified precisely in the operation contract table below.
Invariants:
For any sequence of operations, the following invariants must hold:
FIFO Ordering: If element a is enqueued before element b, then a will be dequeued before b (assuming neither is removed by other means).
Temporal Preservation: The relative order of elements is preserved—elements cannot "pass" each other in the queue.
Dequeue Validity: Dequeue returns the element that has been in the queue longest among all current elements.
Size Consistency: Size(Q) = number of Enqueue operations - number of successful Dequeue operations.
| Operation | Precondition | Postcondition | Return Value |
|---|---|---|---|
| Enqueue(Q, e) | None | e is at rear of Q; Size increased by 1 | None (or success indicator) |
| Dequeue(Q) | Q is not empty | Front element removed; Size decreased by 1 | Former front element |
| Front(Q) | Q is not empty | Q unchanged | Current front element |
| IsEmpty(Q) | None | Q unchanged | Boolean (Size = 0) |
| Size(Q) | None | Q unchanged | Number of elements |
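As a concrete trace of this contract, here is a minimal sketch in C using the queue API implemented later on this page; p1 through p3 stand for arbitrary valid PCB pointers and are assumptions, not part of the contract itself:

```c
/* Assumes the FIFOQueue API defined later on this page and three
   valid PCB pointers p1, p2, p3. */
FIFOQueue q;
queue_init(&q);                    /* Q = [],            Size = 0 */
queue_enqueue(&q, p1);             /* Q = [p1],          Size = 1 */
queue_enqueue(&q, p2);             /* Q = [p1, p2],      Size = 2 */
queue_enqueue(&q, p3);             /* Q = [p1, p2, p3],  Size = 3 */
PCB *first = queue_dequeue(&q);    /* returns p1, the oldest element */
/* Now Front(Q) == p2 and Size(Q) == 3 enqueues - 1 dequeue == 2. */
```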
While both queues and stacks are ordered collections, they differ in their removal policy: Queues use FIFO (first in, first out), while stacks use LIFO (last in, first out). This seemingly small difference leads to fundamentally different behaviors—stacks enable recursion and backtracking, while queues enable scheduling and level-order traversal.
The most common implementation of a FIFO queue in operating system kernels is the linked list. This approach offers dynamic sizing, O(1) operations, and straightforward memory management—essential properties for handling unpredictable process populations.
Singly-Linked List with Tail Pointer:
The simplest linked list queue uses a singly-linked list with separate head and tail pointers:
Operation Analysis:
Enqueue: O(1), by linking a new node at the tail pointer.
Dequeue: O(1), by unlinking the node at the head pointer.
Front, IsEmpty, Size: O(1), direct field reads.
This achieves optimal time complexity for all operations.
```c
/* Singly-Linked List Queue for OS Scheduling */

#include <stdlib.h>
#include <stdbool.h>
#include <stddef.h>

/* Forward declaration of PCB (Process Control Block) */
typedef struct PCB PCB;

/* Queue node structure */
typedef struct QueueNode {
    PCB *process;              /* Pointer to PCB */
    struct QueueNode *next;    /* Next node in queue */
} QueueNode;

/* Queue structure with head and tail pointers */
typedef struct {
    QueueNode *head;   /* Front of queue (dequeue from here) */
    QueueNode *tail;   /* Rear of queue (enqueue here) */
    size_t count;      /* Number of elements */
} FIFOQueue;

/* Initialize an empty queue */
void queue_init(FIFOQueue *q) {
    q->head = NULL;
    q->tail = NULL;
    q->count = 0;
}

/* Check if queue is empty - O(1) */
bool queue_is_empty(const FIFOQueue *q) {
    return q->head == NULL;
}

/* Get number of elements - O(1) */
size_t queue_size(const FIFOQueue *q) {
    return q->count;
}

/*
 * Enqueue operation - O(1)
 * Adds process to rear of queue
 */
bool queue_enqueue(FIFOQueue *q, PCB *process) {
    /* Allocate new node */
    QueueNode *new_node = (QueueNode *)malloc(sizeof(QueueNode));
    if (new_node == NULL) {
        return false;  /* Memory allocation failed */
    }

    new_node->process = process;
    new_node->next = NULL;

    /* Handle empty queue case */
    if (q->tail == NULL) {
        q->head = new_node;
        q->tail = new_node;
    } else {
        /* Append to rear */
        q->tail->next = new_node;
        q->tail = new_node;
    }

    q->count++;
    return true;
}

/*
 * Dequeue operation - O(1)
 * Removes and returns process from front of queue
 */
PCB* queue_dequeue(FIFOQueue *q) {
    if (queue_is_empty(q)) {
        return NULL;  /* Queue is empty */
    }

    /* Remove from front */
    QueueNode *old_head = q->head;
    PCB *process = old_head->process;
    q->head = old_head->next;

    /* If queue is now empty, update tail too */
    if (q->head == NULL) {
        q->tail = NULL;
    }

    free(old_head);
    q->count--;
    return process;
}

/*
 * Peek at front without removing - O(1)
 */
PCB* queue_front(const FIFOQueue *q) {
    if (queue_is_empty(q)) {
        return NULL;
    }
    return q->head->process;
}

/*
 * Destroy queue and free all nodes
 * Note: Does NOT free the PCBs themselves
 */
void queue_destroy(FIFOQueue *q) {
    while (!queue_is_empty(q)) {
        queue_dequeue(q);
    }
}
```

Doubly-Linked List Variant:
Some operating systems use doubly-linked lists for ready queues, adding a prev pointer to each node. While this adds memory overhead, it enables:
O(1) Removal from Middle: If a process needs to be removed from the queue (e.g., killed or priority changed), having a prev pointer allows immediate removal without traversal.
Bidirectional Traversal: Useful for debugging, visualization, and complex queue manipulations.
Circular Variants: Easier implementation of circular doubly-linked lists for round-robin scheduling.
```c
/* Doubly-Linked List Queue - Supports O(1) arbitrary removal */

typedef struct DLNode {
    PCB *process;
    struct DLNode *next;
    struct DLNode *prev;
} DLNode;

typedef struct {
    DLNode *head;
    DLNode *tail;
    size_t count;
} DoublyLinkedQueue;

/* Remove a specific node from anywhere in the queue - O(1) */
void queue_remove_node(DoublyLinkedQueue *q, DLNode *node) {
    if (node == NULL) return;

    /* Update previous node's next pointer */
    if (node->prev != NULL) {
        node->prev->next = node->next;
    } else {
        /* Removing head */
        q->head = node->next;
    }

    /* Update next node's prev pointer */
    if (node->next != NULL) {
        node->next->prev = node->prev;
    } else {
        /* Removing tail */
        q->tail = node->prev;
    }

    q->count--;
    free(node);
}

/* Enqueue at tail - O(1) */
DLNode* dlqueue_enqueue(DoublyLinkedQueue *q, PCB *process) {
    DLNode *new_node = (DLNode *)malloc(sizeof(DLNode));
    if (!new_node) return NULL;

    new_node->process = process;
    new_node->next = NULL;
    new_node->prev = q->tail;

    if (q->tail) {
        q->tail->next = new_node;
    } else {
        q->head = new_node;
    }
    q->tail = new_node;
    q->count++;

    /* Return node pointer for potential future removal */
    return new_node;
}
```

An alternative to linked lists is the circular buffer (also called a ring buffer). This approach uses a fixed-size array with two indices that wrap around, eliminating the memory allocation overhead of linked lists.
How Circular Buffers Work:
A circular buffer maintains:
A fixed-size array of element slots.
A head index: the position of the next element to dequeue.
A tail index: the next free position to enqueue into.
Typically, a count of elements currently stored.
As elements are added and removed, the indices increment and wrap around using modular arithmetic:
```c
head = (head + 1) % capacity;   /* advance dequeue index, wrapping to 0 */
tail = (tail + 1) % capacity;   /* advance enqueue index, wrapping to 0 */
```
The Full vs. Empty Dilemma:
A challenge with circular buffers is distinguishing "full" from "empty": with only head and tail indices, both states are represented by head == tail. Solutions include:
Explicit count field: track the number of elements separately (the approach used in the implementation below).
One empty slot: never fill the last slot, so full is detected as (tail + 1) % capacity == head while empty remains head == tail.
Full flag: a boolean set when an enqueue makes head == tail, cleared on dequeue.
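The one-empty-slot variant is worth seeing concretely. A minimal sketch, assuming plain size_t indices (the full implementation below uses the explicit-count approach instead):

```c
#include <stdbool.h>
#include <stddef.h>

/* One-empty-slot circular buffer tests: no count field required.
   A buffer of capacity N holds at most N - 1 elements. */
bool rb_is_empty(size_t head, size_t tail) {
    return head == tail;                      /* indices coincide */
}

bool rb_is_full(size_t head, size_t tail, size_t capacity) {
    return (tail + 1) % capacity == head;     /* next enqueue would collide */
}
```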
```c
/* Circular Buffer Queue for Fixed-Size Process Pools */

#include <stdlib.h>
#include <stdbool.h>

#define MAX_PROCESSES 256  /* Maximum processes in queue */

typedef struct {
    PCB *processes[MAX_PROCESSES];  /* Fixed array of PCB pointers */
    size_t head;       /* Next to dequeue */
    size_t tail;       /* Next position to enqueue */
    size_t count;      /* Current number of elements */
    size_t capacity;   /* Maximum capacity */
} CircularQueue;

/* Initialize circular queue */
void cqueue_init(CircularQueue *q) {
    q->head = 0;
    q->tail = 0;
    q->count = 0;
    q->capacity = MAX_PROCESSES;
}

/* Check if empty - O(1) */
bool cqueue_is_empty(const CircularQueue *q) {
    return q->count == 0;
}

/* Check if full - O(1) */
bool cqueue_is_full(const CircularQueue *q) {
    return q->count == q->capacity;
}

/*
 * Enqueue - O(1)
 * Returns false if queue is full
 */
bool cqueue_enqueue(CircularQueue *q, PCB *process) {
    if (cqueue_is_full(q)) {
        return false;  /* Queue overflow */
    }

    q->processes[q->tail] = process;
    q->tail = (q->tail + 1) % q->capacity;  /* Wrap around */
    q->count++;
    return true;
}

/*
 * Dequeue - O(1)
 * Returns NULL if queue is empty
 */
PCB* cqueue_dequeue(CircularQueue *q) {
    if (cqueue_is_empty(q)) {
        return NULL;  /* Queue underflow */
    }

    PCB *process = q->processes[q->head];
    q->head = (q->head + 1) % q->capacity;  /* Wrap around */
    q->count--;
    return process;
}

/* Peek at front - O(1) */
PCB* cqueue_front(const CircularQueue *q) {
    if (cqueue_is_empty(q)) {
        return NULL;
    }
    return q->processes[q->head];
}
```

Understanding the formal properties of FIFO queues is essential for proving correctness of scheduling algorithms and reasoning about system behavior.
Fundamental Properties:
1. Order Preservation (FIFO Property)
For any two elements a and b, if a is enqueued before b, then either: a is dequeued before b, or neither has yet been dequeued.
Formally: ∀a,b ∈ Q: enqueue_time(a) < enqueue_time(b) ⟹ dequeue_time(a) < dequeue_time(b)
2. Causality
An element cannot be dequeued until it has been enqueued. This seems obvious but is important for concurrent queue implementations.
3. Liveness
In a fair execution, every enqueued element will eventually be dequeued (assuming dequeue operations continue). This prevents starvation.
4. Bounded Waiting (in finite queues)
For a queue of maximum size N, any element waits behind at most N-1 other elements. Waiting time is bounded if service times are bounded.
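The order-preservation property (1 above) lends itself to a runtime check: stamp each element with a sequence number at enqueue and assert that dequeues see strictly increasing stamps. A minimal sketch against the singly-linked FIFOQueue above; pcb_set_seq() and pcb_get_seq() are assumed accessors, not part of the earlier code:

```c
#include <assert.h>

void pcb_set_seq(PCB *p, unsigned long s);   /* assumed accessors, */
unsigned long pcb_get_seq(const PCB *p);     /* defined elsewhere  */

static unsigned long next_seq = 0;           /* global enqueue counter */
static unsigned long last_deq_seq = 0;       /* last stamp dequeued */

void checked_enqueue(FIFOQueue *q, PCB *p) {
    pcb_set_seq(p, ++next_seq);              /* stamp with arrival order */
    queue_enqueue(q, p);
}

PCB* checked_dequeue(FIFOQueue *q) {
    PCB *p = queue_dequeue(q);
    if (p != NULL) {
        /* FIFO order: stamps must come out strictly increasing. */
        assert(pcb_get_seq(p) > last_deq_seq);
        last_deq_seq = pcb_get_seq(p);
    }
    return p;
}
```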
| Invariant | Linked List | Circular Buffer | Verification |
|---|---|---|---|
| FIFO Order | Maintained by list order | Maintained by index order | Trace element positions |
| Size Accuracy | Count field | Count field | count == enqueues - dequeues |
| Head Validity | head == NULL ⟺ empty | head == tail ∧ count == 0 | Check on operations |
| Tail Validity | tail->next == NULL | Implicit in array | Check on enqueue |
| Memory Bounds | Unlimited (until OOM) | capacity limit | count ≤ capacity |
When multiple threads or CPUs access a queue simultaneously, maintaining FIFO order becomes significantly more complex. Lock-free queues, used in high-performance operating systems, must carefully ensure linearizability—that concurrent operations appear to occur atomically in some sequential order that respects the FIFO property.
Scheduling-Specific Invariants:
In the context of FCFS scheduling, the ready queue must maintain additional invariants:
Arrival Time Ordering: If process P1 arrived before P2, then P1 precedes P2 in the queue.
No Duplicates: Each process appears in the queue at most once (a process can't be "ready" in multiple queue positions).
State Consistency: Every process in the ready queue has state == READY.
PCB Validity: Every queue node points to a valid, non-NULL PCB.
Mutual Exclusion from Running: The currently running process is not in the ready queue.
Violating any of these invariants leads to scheduler bugs—processes executed multiple times, skipped, or deadlocked.
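These invariants can be asserted in debug builds. A minimal sketch against the singly-linked FIFOQueue above; pcb_state() and STATE_READY are assumed helpers, not part of the earlier code:

```c
#include <assert.h>
#include <stddef.h>

#define STATE_READY 1                  /* assumed state encoding */
int pcb_state(const PCB *p);           /* assumed accessor, defined elsewhere */

void ready_queue_check(const FIFOQueue *q, const PCB *running) {
    size_t seen = 0;
    for (const QueueNode *n = q->head; n != NULL; n = n->next) {
        assert(n->process != NULL);                   /* PCB validity */
        assert(pcb_state(n->process) == STATE_READY); /* state consistency */
        assert(n->process != running);                /* running task not queued */
        seen++;
    }
    assert(seen == q->count);                         /* size accuracy */
    /* The no-duplicates invariant needs an O(n^2) scan or a per-PCB
       "on queue" flag; omitted here for brevity. */
}
```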
Modern operating systems implement ready queues with careful attention to performance, concurrency, and memory efficiency. Let's examine how real systems approach this problem.
Linux Kernel: list_head
Linux uses an elegant intrusive doubly-linked list defined in <linux/list.h>. Rather than wrapping data in queue nodes, the list pointers are embedded directly in the data structure:
```c
struct list_head {
    struct list_head *next, *prev;
};

struct task_struct {
    /* ... many fields ... */
    struct list_head run_list;   /* Embedded list node */
    /* ... more fields ... */
};
```
Advantages of Intrusive Lists:
No separate allocation: queue membership costs nothing beyond the pointers already embedded in the structure.
container_of() macro retrieves the PCB from a list_head pointer with simple compile-time pointer arithmetic.
Cache friendliness: the list linkage and the data it links share one allocation.
The Linux scheduler uses this pattern extensively, with tasks potentially on run queues, wait queues, and hash tables simultaneously.
```c
/* Linux-style intrusive list for ready queue */

#include <stddef.h>

/* Linux list_head structure */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* container_of macro - get enclosing structure from list pointer */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Process structure with embedded list node */
struct task_struct {
    int pid;
    int state;
    int prio;
    /* ... other fields ... */
    struct list_head run_list;  /* For ready queue membership */
};

/* Ready queue is just a list_head acting as sentinel */
struct list_head ready_queue;

/* Initialize empty queue */
static inline void INIT_LIST_HEAD(struct list_head *list) {
    list->next = list;
    list->prev = list;
}

/* Add to end of queue (enqueue) */
static inline void list_add_tail(struct list_head *new, struct list_head *head) {
    new->prev = head->prev;
    new->next = head;
    head->prev->next = new;
    head->prev = new;
}

/* Remove from queue */
static inline void list_del(struct list_head *entry) {
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = NULL;  /* Poison pointers for safety */
    entry->prev = NULL;
}

/* Check if queue is empty */
static inline int list_empty(const struct list_head *head) {
    return head->next == head;
}

/* FCFS schedule: get first task from ready queue */
struct task_struct* fcfs_pick_next_task(void) {
    if (list_empty(&ready_queue)) {
        return NULL;
    }

    /* Get first entry in queue */
    struct list_head *first = ready_queue.next;
    list_del(first);  /* Remove from queue */

    /* Recover task_struct from embedded list_head */
    return container_of(first, struct task_struct, run_list);
}
```

Windows Kernel: KQUEUE
Windows uses a similar intrusive list approach with LIST_ENTRY structures. The KQUEUE object provides kernel-mode producer-consumer queues with built-in synchronization. Windows' dispatcher (scheduler) maintains separate ready queues for each priority level, using a bitmap to quickly find the highest-priority non-empty queue.
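A simplified sketch of the bitmap idea (illustrative only, not actual Windows code); it reuses the FIFOQueue from earlier and assumes the convention that bit 0 marks the highest priority:

```c
#include <strings.h>   /* ffs(): find first set bit (POSIX) */

#define NUM_PRIORITIES 32

static FIFOQueue ready_queues[NUM_PRIORITIES];  /* one FIFO per priority level */
static unsigned int nonempty_mask = 0;          /* bit i set => queue i has tasks */

void ready_enqueue(PCB *p, int prio) {
    queue_enqueue(&ready_queues[prio], p);
    nonempty_mask |= 1u << prio;                /* mark level non-empty */
}

PCB* pick_highest_priority(void) {
    if (nonempty_mask == 0)
        return NULL;                            /* nothing runnable */
    int level = ffs(nonempty_mask) - 1;         /* lowest set bit = best level */
    PCB *p = queue_dequeue(&ready_queues[level]);
    if (queue_is_empty(&ready_queues[level]))
        nonempty_mask &= ~(1u << level);        /* clear bit when drained */
    return p;
}
```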
FreeBSD: TAILQ
FreeBSD uses the TAILQ macro family from <sys/queue.h>, providing type-safe tail queues. Similar to Linux, list nodes are embedded in process structures for minimal overhead.
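A minimal usage sketch of these macros; the struct names and field are illustrative, not FreeBSD's actual scheduler types:

```c
#include <sys/queue.h>
#include <stddef.h>

struct proc {
    int pid;
    TAILQ_ENTRY(proc) runq_link;    /* embedded linkage, like Linux list_head */
};

TAILQ_HEAD(runq_head, proc);
static struct runq_head runq = TAILQ_HEAD_INITIALIZER(runq);

void runq_enqueue(struct proc *p) {
    TAILQ_INSERT_TAIL(&runq, p, runq_link);     /* O(1) enqueue at rear */
}

struct proc* runq_dequeue(void) {
    struct proc *p = TAILQ_FIRST(&runq);        /* front of queue, or NULL */
    if (p != NULL)
        TAILQ_REMOVE(&runq, p, runq_link);      /* O(1) unlink */
    return p;
}
```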
Common Patterns Across Systems:
Intrusive linkage: list nodes embedded in the process structure itself, avoiding separate allocations.
Sentinel heads: a self-referencing dummy head element eliminates empty-queue special cases.
Doubly-linked structure: O(1) removal of an arbitrary process when it blocks, dies, or changes priority.
Per-priority queue arrays: multiple FIFO queues indexed by priority, often with a bitmap to locate the highest non-empty level.
In modern multiprocessor systems, the ready queue is accessed by multiple CPUs simultaneously—one CPU may be dequeueing while another enqueues a process completing I/O. Correct and efficient concurrent access is critical.
Race Conditions in Queue Operations:
Consider two CPUs executing queue operations simultaneously:
```
CPU 0: enqueue(P1)                 CPU 1: dequeue()
────────────────────────────────────────────────────
new_node->next = NULL
tail->next = new_node              first = head
                                   head = head->next   // RACE!
tail = new_node                    // tail might be stale
```
Without synchronization, the queue can become corrupted—lost nodes, dangling pointers, or infinite loops.
Synchronization Strategies:
| Approach | Mechanism | Pros | Cons |
|---|---|---|---|
| Global Lock | Single mutex/spinlock protects all operations | Simple, correct | Serializes all access—bottleneck |
| Fine-grained Locking | Separate head and tail locks | Better concurrency | Complex, deadlock potential |
| Lock-free (CAS) | Compare-and-swap for atomic updates | No blocking, good for real-time | Very complex, ABA problem |
| Per-CPU Queues | Each CPU has its own queue | No contention for local ops | Load imbalance, work stealing needed |
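As a sketch of the first row of this table, here is a minimal global-lock wrapper around the earlier FIFOQueue. A kernel would use a spinlock, but a pthread mutex illustrates the pattern in user space:

```c
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    FIFOQueue q;                       /* the unsynchronized queue from earlier */
    pthread_mutex_t lock;              /* init with PTHREAD_MUTEX_INITIALIZER */
} LockedQueue;

bool locked_enqueue(LockedQueue *lq, PCB *p) {
    pthread_mutex_lock(&lq->lock);     /* serialize all access */
    bool ok = queue_enqueue(&lq->q, p);
    pthread_mutex_unlock(&lq->lock);
    return ok;
}

PCB* locked_dequeue(LockedQueue *lq) {
    pthread_mutex_lock(&lq->lock);
    PCB *p = queue_dequeue(&lq->q);
    pthread_mutex_unlock(&lq->lock);
    return p;
}
```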
The Michael-Scott queue (1996) is a foundational lock-free queue algorithm using compare-and-swap (CAS) operations. It achieves O(1) amortized time for enqueue and dequeue without blocking, making it suitable for real-time systems. Modern OS schedulers often use variants of this algorithm for high-performance ready queues.
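To make the CAS mechanics concrete, here is a hedged sketch of the Michael-Scott enqueue path in C11 atomics. It follows the published algorithm's structure but omits dequeue and, critically, safe memory reclamation (real systems pair it with hazard pointers, epochs, or RCU to avoid the ABA problem):

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct PCB PCB;

typedef struct MSNode {
    PCB *process;
    _Atomic(struct MSNode *) next;
} MSNode;

typedef struct {
    _Atomic(MSNode *) head;   /* points at a dummy node */
    _Atomic(MSNode *) tail;
} MSQueue;

bool ms_enqueue(MSQueue *q, PCB *p) {
    MSNode *node = malloc(sizeof(MSNode));
    if (node == NULL) return false;
    node->process = p;
    atomic_store(&node->next, NULL);

    for (;;) {
        MSNode *tail = atomic_load(&q->tail);
        MSNode *next = atomic_load(&tail->next);
        if (tail != atomic_load(&q->tail))
            continue;                      /* tail moved under us; retry */
        if (next == NULL) {
            /* Tail really is last: try to link the new node. */
            if (atomic_compare_exchange_weak(&tail->next, &next, node)) {
                /* Swing tail forward; failure is fine (another thread helped). */
                atomic_compare_exchange_weak(&q->tail, &tail, node);
                return true;
            }
        } else {
            /* Tail is lagging: help advance it, then retry. */
            atomic_compare_exchange_weak(&q->tail, &tail, next);
        }
    }
}
```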
Memory Ordering Concerns:
On modern CPUs with relaxed memory models (e.g., ARM, POWER), memory operations can be reordered by the processor or compiler. Queue implementations must use appropriate memory barriers:
```c
void enqueue(Node *node) {
    node->next = NULL;
    smp_wmb();   /* Write memory barrier: ensure node->next is visible before linking */
    tail->next = node;
    smp_wmb();   /* Ensure linking visible before updating tail */
    tail = node;
}
```
Missing barriers can cause other CPUs to see a partially-constructed queue state—for example, following tail->next to a node whose next pointer hasn't been written yet.
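The same publication pattern can be expressed with C11 atomics, where release/acquire ordering plays the role of the explicit barriers. A self-contained sketch; the Node type here is a stand-in for the snippet above, not a real kernel type:

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct Node {
    struct Node *next;    /* written before publication */
    int payload;
} Node;

/* Publish a fully initialized node through an atomic link. */
void publish(_Atomic(Node *) *link, Node *node) {
    node->next = NULL;    /* plain store, happens before publication */
    /* Release store: all prior writes to *node are ordered
       before the link itself becomes observable. */
    atomic_store_explicit(link, node, memory_order_release);
}

Node* consume(_Atomic(Node *) *link) {
    /* Acquire load: if we see the node, we also see its contents. */
    return atomic_load_explicit(link, memory_order_acquire);
}
```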
While all basic queue implementations provide O(1) operations in theory, real-world performance depends on memory access patterns, cache behavior, and allocation strategies.
| Aspect | Linked List | Circular Buffer | Intrusive List |
|---|---|---|---|
| Enqueue Time | O(1) + malloc | O(1) no alloc | O(1) no alloc |
| Dequeue Time | O(1) + free | O(1) no alloc | O(1) no alloc |
| Cache Behavior | Poor (scattered nodes) | Excellent (contiguous) | Good (with PCB) |
| Memory Overhead | Per-node allocation | Pre-allocated array | Embedded in PCB |
| Arbitrary Removal | O(n) or O(1) with DLL | O(n) | O(1) |
| Maximum Size | Unlimited | Fixed | Unlimited |
Cache Performance Deep Dive:
Cache behavior significantly impacts scheduler performance. Consider a ready queue with 100 processes:
Linked List (Non-Intrusive):
Each element involves two separate allocations (the queue node and the PCB it points to), so a dequeue typically incurs two cache misses. Nodes allocated at different times end up scattered across the heap, defeating the hardware prefetcher.
Circular Buffer:
The PCB pointers occupy one contiguous array, so sequential access is prefetcher-friendly; a dequeue typically costs a single miss for the PCB itself.
Intrusive List:
The list node is embedded in the PCB, so reading the linkage pulls part of the PCB into the same cache lines: roughly one allocation and one miss per element.
For high-frequency scheduling (thousands of context switches per second), these differences are significant.
In a system with 10,000 context switches per second (common on busy servers), an extra 100 nanoseconds per scheduling decision adds up to 1 millisecond of overhead per second—0.1% of all CPU time spent on bookkeeping rather than useful work. This is why kernel developers obsess over queue implementation details.
The FIFO queue is far more than a simple data structure—it's the architectural foundation that makes FCFS scheduling possible. Let's consolidate our understanding:
The queue is a contract, not a structure: FIFO ordering, temporal preservation, and size consistency must hold no matter how the queue is implemented.
Implementation is a tradeoff: linked lists offer unbounded size at the cost of per-node allocation; circular buffers offer allocation-free, cache-friendly O(1) operations within a fixed capacity.
Concurrency changes everything: multiprocessor ready queues need locks, lock-free algorithms, or per-CPU queues, each with distinct costs.
Real kernels use intrusive lists (Linux list_head, Windows LIST_ENTRY) for zero-overhead queue membership.
What's Next:
In the next page, we'll examine the Convoy Effect—FCFS's most significant performance pathology. We'll see how a single long-running process can cause disproportionate waiting times for all other processes, and understand why this phenomenon motivated the development of smarter scheduling algorithms.
You now have a deep understanding of FIFO queue implementations—from abstract definitions through kernel-level code. This knowledge extends far beyond FCFS scheduling: FIFO queues appear throughout operating systems in I/O request queues, message passing, and event handling. The patterns you've learned here apply broadly across systems programming.