Standard Linux scheduling—the Completely Fair Scheduler (CFS)—is designed around a principle of fairness: every process gets its proportional share of CPU time based on weight. This philosophy is exactly wrong for real-time systems.
In real-time computing, we don't want fairness—we want priority. A high-priority task monitoring a nuclear reactor coolant level must run immediately, regardless of how much CPU time it has already consumed or how many other tasks are waiting. The low-priority task updating a log file can wait indefinitely.
Linux provides three scheduling policies specifically designed for real-time requirements: SCHED_FIFO, SCHED_RR, and SCHED_DEADLINE. Understanding when and how to use each is essential for building deterministic systems.
By the end of this page, you will understand: (1) How SCHED_FIFO implements strict priority scheduling; (2) How SCHED_RR adds time-slicing for equal-priority tasks; (3) How SCHED_DEADLINE implements Earliest Deadline First; (4) When to choose each policy; (5) How to configure and monitor RT scheduling; and (6) Common pitfalls and best practices.
Linux implements a modular scheduler architecture with multiple scheduling classes, each implementing a different scheduling algorithm. The scheduler consults these classes in priority order, running tasks from the highest-priority class that has runnable tasks.
| Scheduling Class | Policies | Priority Range | Use Case |
|---|---|---|---|
| Stop Class | (internal) | N/A | Kernel-internal task migration, CPU hotplug |
| Deadline Class | SCHED_DEADLINE | N/A (EDF) | Tasks with explicit timing requirements |
| RT Class | SCHED_FIFO, SCHED_RR | 1-99 (99 highest) | Real-time tasks requiring priority scheduling |
| Fair Class (CFS) | SCHED_OTHER, SCHED_BATCH, SCHED_IDLE | Nice -20 to +19 | Normal tasks, weighted fair sharing |
| Idle Class | SCHED_IDLE | N/A | Background tasks, run only when nothing else |
Critical Architectural Point:
The scheduler always selects from the highest-priority class with runnable tasks. This means:
```
Scheduler Decision Flow (pick_next_task):

  Stop class has a runnable task?      → yes: run the stop task
  Deadline class has a runnable task?  → yes: run the earliest-deadline task
  RT class has a runnable task?        → yes: run the highest-priority RT task
  Otherwise                            → run the CFS-selected task (Fair class)
  Idle class                           → runs only when nothing else is runnable
```

This strict priority ordering means RT tasks can completely starve normal tasks. A continuously running SCHED_FIFO task at priority 1 will prevent ALL SCHED_OTHER tasks from running, including your shell. Linux has RT throttling to prevent complete system lockup, but careful design is essential.
SCHED_FIFO implements the simplest real-time scheduling policy: strict priority with no time-slicing. A SCHED_FIFO task runs until it voluntarily yields, blocks, or is preempted by a higher-priority task.
SCHED_FIFO Algorithm:
```
SCHED_FIFO Scheduling Rules:

1. Pick the highest-priority runnable task.
2. Run that task until one of:
   a. Task blocks (I/O, mutex, sleep)
   b. Task yields (sched_yield())
   c. Task terminates
   d. A higher-priority task becomes runnable
3. When multiple tasks have the same priority:
   - Run them in FIFO order
   - A preempted task stays at the front of its queue
   - A waking/yielding task goes to the back of its queue
4. When a task blocks and later wakes:
   - It goes to the back of its priority queue
```

```
Example timeline (priorities: A=90, B=80, C=80, D=70):

Time:   0    5    10   15   20   25   30   35   40
A(90):  ████          ████████                        (preempts any lower task)
B(80):       ████              ████████
C(80):                                  ████          (FIFO order after B)
D(70):                                       ███████

Events: A runs, then blocks (B runs); A wakes and preempts B; A blocks
again; B resumes and blocks; C runs (same priority, queued behind B);
finally D gets the CPU.
```

Using SCHED_FIFO:
```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Placeholder for the application's periodic work. */
static void do_periodic_control_work(void) { /* application-specific */ }

/**
 * Configure current thread for SCHED_FIFO real-time scheduling
 *
 * @param priority RT priority (1-99, higher = more important)
 * @return 0 on success, -1 on failure
 */
int configure_sched_fifo(int priority) {
    struct sched_param param;

    /* Validate priority range */
    int min_prio = sched_get_priority_min(SCHED_FIFO);
    int max_prio = sched_get_priority_max(SCHED_FIFO);
    if (priority < min_prio || priority > max_prio) {
        fprintf(stderr, "Priority %d out of range [%d, %d]\n",
                priority, min_prio, max_prio);
        return -1;
    }

    memset(&param, 0, sizeof(param));
    param.sched_priority = priority;

    /* Set scheduler policy and priority */
    if (sched_setscheduler(0, SCHED_FIFO, &param) != 0) {
        perror("sched_setscheduler failed");
        fprintf(stderr, "Note: Requires CAP_SYS_NICE or root\n");
        return -1;
    }

    printf("Configured SCHED_FIFO with priority %d\n", priority);
    return 0;
}

/**
 * Best practices for SCHED_FIFO tasks
 */
void rt_task_best_practices(void) {
    /* 1. Lock all memory to prevent page faults */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall failed");
    }

    /* 2. Pre-fault stack: touch stack pages before RT section */
    volatile char stack_prefault[8192];
    memset((void*)stack_prefault, 0, sizeof(stack_prefault));

    /* 3. Avoid dynamic memory allocation in RT path:
     *    pre-allocate all buffers before entering the RT loop */

    /* 4. Avoid blocking system calls without timeouts:
     *    use poll/select with timeouts, not blocking read() */
}

/**
 * Example: Periodic RT task using SCHED_FIFO
 */
void* periodic_rt_task(void* arg) {
    int period_us = *(int*)arg;
    struct timespec next_wake;

    /* Get current time */
    clock_gettime(CLOCK_MONOTONIC, &next_wake);

    while (1) {
        /* Calculate next wake time */
        next_wake.tv_nsec += period_us * 1000;
        while (next_wake.tv_nsec >= 1000000000) {
            next_wake.tv_nsec -= 1000000000;
            next_wake.tv_sec++;
        }

        /* Do RT work */
        do_periodic_control_work();

        /* Sleep until next period (precise timing) */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next_wake, NULL);
    }

    return NULL;
}

int main(int argc, char *argv[]) {
    /* Configure RT scheduling */
    if (configure_sched_fifo(80) != 0) {
        return 1;
    }

    rt_task_best_practices();

    int period_us = 1000;  /* 1ms period */
    periodic_rt_task(&period_us);
    return 0;
}
```

Command-line Configuration:
```bash
# Set SCHED_FIFO for a running process
sudo chrt -f -p 80 <pid>

# Run a new command with SCHED_FIFO
sudo chrt -f 80 ./my_rt_application

# View current scheduling policy of a process
chrt -p <pid>
# Output: pid 1234's current scheduling policy: SCHED_FIFO
#         pid 1234's current scheduling priority: 80

# Show valid priority ranges
chrt -m
# Output:
# SCHED_OTHER min/max priority : 0/0
# SCHED_FIFO min/max priority  : 1/99
# SCHED_RR min/max priority    : 1/99

# Note: nice/renice CANNOT set SCHED_FIFO. They only affect
# SCHED_OTHER and are completely separate from RT priorities!
```

SCHED_FIFO is ideal for event-driven tasks that do work and then block: interrupt handlers, I/O processing, and tasks waiting on condition variables. It's also appropriate when you have only one task per priority level, eliminating the need for time-slicing.
SCHED_RR extends SCHED_FIFO with time-slicing among equal-priority tasks. Tasks still follow strict priority rules, but when multiple SCHED_RR tasks share the same priority, they take turns via round-robin scheduling.
SCHED_RR Algorithm:
```
SCHED_RR Scheduling Rules:

1. Same as SCHED_FIFO, PLUS:
2. Each task has a time quantum (typically 100ms).
3. When the quantum expires:
   a. The task goes to the end of its priority queue
   b. The quantum is reset for its next run
   c. The next same-priority task runs
4. The quantum does NOT tick while:
   - the task is blocked
   - a higher-priority task is running
```

```
Example: three SCHED_RR tasks at priority 50 (quantum = 100ms)

Time(ms): 0    100  200  300  400  500
Task A:   ████           ████           ████
Task B:        ████           ████
Task C:             ████           ████

All tasks make progress; none is starved.
(With SCHED_FIFO, Task A would run forever: no time-slicing!)
```

Configuring SCHED_RR Time Quantum:
```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <time.h>

/**
 * Query and display the SCHED_RR time quantum
 *
 * Note: The quantum is system-wide, not per-process. On Linux 3.9+
 * it can be tuned via the kernel.sched_rr_timeslice_ms sysctl.
 */
void show_rr_quantum(void) {
    struct timespec quantum;

    /* Get the RR time slice for PID 0 (current process) */
    if (sched_rr_get_interval(0, &quantum) == 0) {
        printf("SCHED_RR time quantum: %ld.%09ld seconds\n",
               (long)quantum.tv_sec, quantum.tv_nsec);
        printf(" = %.1f milliseconds\n",
               quantum.tv_sec * 1000.0 + quantum.tv_nsec / 1000000.0);
    } else {
        perror("sched_rr_get_interval");
    }
}

/**
 * Configure SCHED_RR scheduling
 */
int configure_sched_rr(int priority) {
    struct sched_param param = { .sched_priority = priority };

    if (sched_setscheduler(0, SCHED_RR, &param) != 0) {
        perror("sched_setscheduler SCHED_RR failed");
        return -1;
    }

    printf("Configured SCHED_RR with priority %d\n", priority);
    show_rr_quantum();
    return 0;
}
```

```bash
# Run with SCHED_RR (round-robin)
sudo chrt -r 50 ./my_application

# Set a running process to SCHED_RR
sudo chrt -r -p 50 <pid>

# Query the scheduling policy of a process
sudo cat /proc/<pid>/sched | grep policy

# Configure the system-wide RR quantum (Linux 3.9+)
sudo sysctl kernel.sched_rr_timeslice_ms=50
```

In well-designed RT systems, each task should have a unique priority based on its timing requirements. If you find yourself needing SCHED_RR because multiple tasks share priorities, consider whether your priority assignment is correct. SCHED_RR is often a fallback for imprecise priority design rather than an intentional choice.
SCHED_DEADLINE implements Earliest Deadline First (EDF) scheduling, considered theoretically optimal for periodic real-time tasks. Instead of fixed priorities, tasks specify their timing requirements directly: period, deadline, and execution budget.
```
SCHED_DEADLINE Task Model:

          ◄──────────── Period ────────────►◄──────────── Period ────────────►
          │ Job N                           │ Job N+1
          │←─ Runtime ─►                    │←─ Runtime ─►
          │═════════════                    │═════════════
          │                 ↑ Deadline      │                 ↑ Deadline
          │←────────────────►               │←────────────────►

Example task:
  Period   = 10ms  (task activates every 10ms)
  Deadline = 8ms   (each job must complete within 8ms of activation)
  Runtime  = 2ms   (each job needs at most 2ms of CPU)

The scheduler ensures the task gets 2ms of CPU within each 8ms window.
```

EDF Scheduling Algorithm:
At any scheduling decision, the kernel runs the task with the earliest absolute deadline. On a single processor this is provably optimal: if any schedule can meet all deadlines, EDF will.
Admission Control:
Unlike SCHED_FIFO/RR, SCHED_DEADLINE performs admission control. When you try to add a deadline task, the kernel checks if the new task's requirements, combined with existing deadline tasks, can be satisfied. If total utilization exceeds capacity, the request is rejected.
```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <errno.h>

/*
 * Note: SCHED_DEADLINE requires direct syscall usage, as glibc does
 * not wrap sched_setattr()/sched_getattr() on all systems.
 */

#ifndef SCHED_DEADLINE
#define SCHED_DEADLINE 6
#endif

struct sched_attr {
    uint32_t size;
    uint32_t sched_policy;
    uint64_t sched_flags;

    /* SCHED_OTHER/BATCH/IDLE */
    int32_t  sched_nice;

    /* SCHED_FIFO/RR */
    uint32_t sched_priority;

    /* SCHED_DEADLINE (all in nanoseconds) */
    uint64_t sched_runtime;
    uint64_t sched_deadline;
    uint64_t sched_period;
};

#define SCHED_FLAG_RESET_ON_FORK 0x01

/* Placeholder for the application's per-frame work. */
static void process_video_frame(void) { /* application-specific */ }

/* Syscall wrappers */
static int sched_setattr(pid_t pid, const struct sched_attr *attr,
                         unsigned int flags) {
    return syscall(SYS_sched_setattr, pid, attr, flags);
}

static int sched_getattr(pid_t pid, struct sched_attr *attr,
                         unsigned int size, unsigned int flags) {
    return syscall(SYS_sched_getattr, pid, attr, size, flags);
}

/**
 * Configure SCHED_DEADLINE for a periodic task
 *
 * @param runtime_ns  Maximum CPU time per period (nanoseconds)
 * @param deadline_ns Relative deadline from period start
 * @param period_ns   Period of the task (activation interval)
 */
int configure_sched_deadline(uint64_t runtime_ns,
                             uint64_t deadline_ns,
                             uint64_t period_ns) {
    struct sched_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.sched_policy = SCHED_DEADLINE;
    attr.sched_runtime = runtime_ns;
    attr.sched_deadline = deadline_ns;
    attr.sched_period = period_ns;

    if (sched_setattr(0, &attr, 0) != 0) {
        perror("sched_setattr SCHED_DEADLINE failed");
        if (errno == EBUSY) {
            fprintf(stderr, "Admission control failed: insufficient CPU.\n"
                            "Try reducing runtime or increasing period.\n");
        }
        return -1;
    }

    printf("Configured SCHED_DEADLINE:\n");
    printf("  Runtime:  %llu ns (%.2f ms)\n",
           (unsigned long long)runtime_ns, runtime_ns / 1e6);
    printf("  Deadline: %llu ns (%.2f ms)\n",
           (unsigned long long)deadline_ns, deadline_ns / 1e6);
    printf("  Period:   %llu ns (%.2f ms)\n",
           (unsigned long long)period_ns, period_ns / 1e6);
    printf("  Utilization: %.1f%%\n", 100.0 * runtime_ns / period_ns);
    return 0;
}

/**
 * Example: Video frame processing with SCHED_DEADLINE
 *
 * Requirements: Process one frame every 16.67ms (60 FPS)
 *   - Each frame takes up to 8ms to process
 *   - Deadline: must complete before the next frame
 */
int main(void) {
    /* 60 FPS video processing */
    uint64_t period_ns   = 16666667;  /* 16.67ms */
    uint64_t deadline_ns = 16666667;  /* Same as period */
    uint64_t runtime_ns  = 8000000;   /* 8ms max (48% utilization) */

    /*
     * Admission check happens here: the kernel verifies
     *   sum(runtime_i / period_i) <= CPU capacity
     *
     * Our utilization: 8/16.67 = 48%, leaving 52% for other
     * deadline tasks.
     */
    if (configure_sched_deadline(runtime_ns, deadline_ns, period_ns) != 0) {
        return 1;
    }

    /* Main RT loop */
    while (1) {
        /* Process frame - kernel guarantees we finish in time */
        process_video_frame();

        /* Done with this job: for SCHED_DEADLINE, sched_yield()
         * blocks the task until its next period starts. The kernel
         * also enforces that we never run more than runtime_ns
         * per period. */
        sched_yield();
    }

    return 0;
}
```

Advantages of SCHED_DEADLINE:

- Provably optimal (EDF): total utilization can approach 100%, versus roughly 69% for rate-monotonic fixed priorities
- Admission control rejects infeasible task sets up front instead of missing deadlines at runtime
- No manual priority assignment: timing requirements are expressed directly
- Per-task bandwidth isolation: a job that overruns its runtime is throttled rather than stealing CPU from other deadline tasks
SCHED_DEADLINE is powerful but has constraints: tasks must be periodic or sporadic (no arbitrary arrival); parameters must be known a priori; it's less flexible for aperiodic work; and it requires more careful WCET analysis to set runtime correctly.
Choosing the right scheduling policy depends on your task characteristics, system requirements, and complexity tolerance. Here's a comprehensive comparison:
| Aspect | SCHED_FIFO | SCHED_RR | SCHED_DEADLINE |
|---|---|---|---|
| Task Model | Priority-based | Priority-based with time-slice | Periodic/sporadic with parameters |
| Priorities | 1-99 fixed | 1-99 fixed + quantum | Implicit (earliest deadline) |
| Preemption | By higher priority only | By higher priority + quantum expiry | By earlier deadline only |
| Admission Control | None | None | Yes - rejects if infeasible |
| Max Utilization | ~69% (RMS) | ~69% (RMS) | 100% theoretical |
| Priority Inversion | Possible | Possible | Not applicable |
| Complexity | Low | Low | Medium-High |
| WCET Required | No (informal) | No (informal) | Yes (for runtime parameter) |
| Aperiodic Tasks | Easy | Easy | Requires bandwidth server |
| Use Case | Simple RT, legacy | Multiple equal-priority | Optimal multimedia, control |
Decision Framework:
```
How to Choose an RT Scheduling Policy:

1. Do you have explicit timing requirements (period, deadline)?
   ├── YES → Consider SCHED_DEADLINE
   └── NO  → Use SCHED_FIFO or SCHED_RR

2. Are tasks periodic/sporadic with known parameters?
   ├── YES → SCHED_DEADLINE provides optimal scheduling
   └── NO  → Use priority-based (FIFO/RR)

3. Will you have multiple RT tasks at the same priority?
   ├── YES → SCHED_RR prevents starvation among them
   └── NO  → SCHED_FIFO is simpler

4. Is your system well-characterized with proper WCET analysis?
   ├── YES → SCHED_DEADLINE gives guaranteed admission control
   └── NO  → SCHED_FIFO/RR with conservative priorities

5. Maximum CPU utilization requirement:
   ├── Need > 69%       → SCHED_DEADLINE (can approach 100%)
   └── < 69% sufficient → Any policy works
```

Common Patterns:

| Pattern | Recommendation |
|---|---|
| Simple control loop | SCHED_FIFO, single priority |
| Multiple control loops | SCHED_FIFO, careful priorities |
| Video/audio processing | SCHED_DEADLINE (periodic) |
| Legacy RTOS port | SCHED_RR (familiar model) |
| Event-driven I/O handler | SCHED_FIFO (blocks often) |
| Soft real-time + best effort | Mix: RT for critical tasks, CFS for the rest |

For most embedded and control applications, start with SCHED_FIFO and well-designed priorities. SCHED_DEADLINE shines for multimedia (fixed frame rates) and when you need formal guarantees. SCHED_RR is mainly useful when migrating from other RTOSes that used round-robin.
Real-time tasks can completely monopolize the CPU, preventing critical system services from running. Linux implements RT throttling to prevent runaway RT tasks from hanging the system.
RT Throttling Mechanism:
By default, Linux reserves a portion of CPU time for non-RT tasks. RT tasks are throttled if they exceed their allocated bandwidth:
```bash
# View current RT throttling settings
cat /proc/sys/kernel/sched_rt_period_us   # Period (default: 1000000 = 1s)
cat /proc/sys/kernel/sched_rt_runtime_us  # RT budget (default: 950000 = 0.95s)

# Default: RT tasks get 95% of CPU per 1-second period
# Non-RT tasks are guaranteed 5% of CPU (50ms per second)

# Example 1: Disable RT throttling (DANGEROUS!)
echo -1 > /proc/sys/kernel/sched_rt_runtime_us
# Now RT tasks can use 100% CPU - a buggy RT task will hang the system!

# Example 2: More conservative throttling
echo 800000 > /proc/sys/kernel/sched_rt_runtime_us
# RT tasks get 80% of CPU, non-RT guaranteed 20%

# Example 3: Shorter period for finer granularity
echo 100000 > /proc/sys/kernel/sched_rt_period_us
echo 95000 > /proc/sys/kernel/sched_rt_runtime_us
# Still 95% RT, but enforced every 100ms instead of every 1s
# Prevents RT bursts longer than 100ms
```

When Throttling Occurs:
```
RT Throttling Timeline (default settings: 950ms of every 1000ms):

Without throttling:
  RT task:  ████████████████████████████████████████████████
  Non-RT:   (never runs - starved; system becomes unresponsive)

With throttling:
  Period:   ◄──────────── 1000ms ────────────►◄──── 1000ms ────►
  RT task:  █████████████████████████                █████████...
            └──── 950ms ────┘ throttled!
  Non-RT:                             ████ (guaranteed 50ms)

  You can still SSH in and kill the runaway process!
```

For production real-time systems, carefully consider RT throttling settings. Disabling throttling (echo -1) improves RT performance but means a buggy RT task WILL hang your system. Use hardware watchdogs and test thoroughly before disabling throttling.
Monitoring RT Scheduling:
```bash
# View all RT processes and their priorities
ps -eo pid,cls,rtprio,pri,comm --sort=-rtprio | head -20
# cls: TS=SCHED_OTHER, FF=SCHED_FIFO, RR=SCHED_RR, DLN=SCHED_DEADLINE

# Detailed scheduling info for a process
cat /proc/<pid>/sched

# For SCHED_DEADLINE tasks, view parameters:
# (requires reading sched_attr via syscall or specialized tools)

# Monitor scheduling live with trace-cmd
sudo trace-cmd record -e sched:sched_switch -e sched:sched_wakeup
sudo trace-cmd report | head -100

# Use cyclictest to measure RT latency (from the rt-tests package)
sudo cyclictest -p 90 -t -m -n
#   -p 90: priority 90
#   -t:    one measurement thread per CPU
#   -m:    mlockall
#   -n:    use clock_nanosleep

# Output shows min/avg/max latency
```

Effective use of RT scheduling requires careful system design and adherence to proven practices:
A well-designed RT application:

1. Initializes resources
2. Locks memory
3. Pre-faults the stack
4. Sets RT scheduling
5. Enters a deterministic loop
6. Uses precise timing for sleep/wake
7. Never allocates in the loop

Follow this pattern and most latency problems vanish.
Linux provides three real-time scheduling policies, each suited to different requirements: SCHED_FIFO for strict priority scheduling of event-driven tasks, SCHED_RR when equal-priority tasks must share the CPU, and SCHED_DEADLINE when timing requirements can be stated explicitly and verified by admission control.
What's Next:
With RT scheduling policies understood, we'll next explore latency reduction techniques—the system-level optimizations that minimize scheduling and execution jitter, ensuring that RT tasks actually achieve their timing requirements.
You now understand Linux's real-time scheduling policies and can select the appropriate policy for your application requirements. This knowledge enables you to design and implement deterministic real-time systems on Linux platforms.