Soft affinity, as we explored in the previous page, represents the scheduler's best-effort attempt to maintain cache locality. But 'best-effort' isn't always sufficient. Some applications demand guarantees—absolute certainty that a process will never run on certain CPUs, or will always run on a specific set.
This is where hard affinity enters the picture. Hard affinity is an explicit, enforced constraint on CPU placement. Unlike soft affinity (a scheduler preference), hard affinity is a scheduler mandate—the kernel will not schedule the process on any CPU outside its affinity mask, even if those CPUs are idle.
Why Would Anyone Need This?
The question naturally arises: if soft affinity generally works, why impose rigid constraints? The answer lies in specialized workloads where the costs of migration or interference are unacceptable: real-time control loops that cannot tolerate scheduling jitter, low-latency trading and packet-processing systems, NUMA-sensitive databases, and benchmarks that need reproducible single-CPU results.
By the end of this page, you will understand: how to set hard affinity using system calls and commands, the internal representation of affinity masks, the implications of constraining CPU placement, interactions with other kernel features, and the tradeoffs involved in using hard affinity.
At its core, hard affinity is implemented through a CPU affinity mask—a bitmask where each bit corresponds to a CPU in the system. If a bit is set, the process is allowed to run on that CPU; if cleared, that CPU is prohibited.
The Affinity Mask:
For a system with N CPUs (numbered 0 to N-1), the affinity mask is an N-bit value:
```
CPU Number:   7   6   5   4   3   2   1   0
            ┌───┬───┬───┬───┬───┬───┬───┬───┐
Mask Bits:  │ 0 │ 1 │ 0 │ 1 │ 0 │ 1 │ 0 │ 1 │
            └───┴───┴───┴───┴───┴───┴───┴───┘
            = 0x55 = CPUs 0, 2, 4, 6 (even CPUs only)
```
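To make the mapping concrete, here is a minimal sketch (plain C bit manipulation, not tied to any kernel API; the 8-CPU limit and the 0x55 value are just the example above) that decodes a mask into the CPU numbers it allows:

```c
#include <stdio.h>

int main(void) {
    unsigned long mask = 0x55;  /* example mask: CPUs 0, 2, 4, 6 */

    printf("Mask 0x%lx allows CPUs:", mask);
    for (int cpu = 0; cpu < 8; cpu++) {
        if (mask & (1UL << cpu))    /* bit set => CPU allowed */
            printf(" %d", cpu);
    }
    printf("\n");
    return 0;
}
```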
Default Affinity:
When a process is created, it inherits the affinity mask of its parent, which typically includes all CPUs. This maximal affinity is represented as all bits set (e.g., 0xFF for 8 CPUs, 0xFFFFFFFF for 32 CPUs).
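A short sketch illustrating inheritance (assuming a machine with at least two CPUs): the parent restricts itself to CPUs 0 and 1, and the forked child reports the same two-CPU mask.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    cpu_set_t mask;

    /* Parent restricts itself to CPUs 0 and 1 (assumes >= 2 CPUs) */
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);
    CPU_SET(1, &mask);
    sched_setaffinity(0, sizeof(mask), &mask);

    if (fork() == 0) {
        /* Child: inherits the parent's mask at fork() time */
        cpu_set_t child_mask;
        CPU_ZERO(&child_mask);
        sched_getaffinity(0, sizeof(child_mask), &child_mask);
        printf("Child mask contains %d CPUs\n", CPU_COUNT(&child_mask));
        _exit(0);
    }
    wait(NULL);
    return 0;
}
```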
Relationship to Soft Affinity:
Hard and soft affinity work together: hard affinity defines the set of CPUs the task may use, and within that set, soft affinity (the scheduler's preference for the task's previous CPU) still guides placement. If a task's previous CPU is outside its affinity mask (perhaps the mask was modified), the scheduler must migrate the task to an allowed CPU.
| Hex Value | Binary | Allowed CPUs | Meaning |
|---|---|---|---|
| 0xFF | 11111111 | 0-7 (all) | Default: run anywhere |
| 0x01 | 00000001 | 0 only | Pin to CPU 0 |
| 0x0F | 00001111 | 0-3 | Run on first 4 CPUs |
| 0xF0 | 11110000 | 4-7 | Run on last 4 CPUs |
| 0x55 | 01010101 | 0,2,4,6 | Even CPUs only |
| 0x03 | 00000011 | 0,1 | Run on specific core pair |
For systems with more than 64 CPUs, a single 64-bit integer cannot represent the affinity mask. Linux uses cpu_set_t, which can represent up to 1024 CPUs by default (defined by CPU_SETSIZE). For even larger systems, glibc provides the CPU_ALLOC family of macros to create dynamically sized CPU sets.
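A sketch of the dynamically sized API (the CPU_ALLOC macros are glibc-provided; the 2048-CPU count and CPU number 1500 here are just assumed values for illustration):

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    int num_cpus = 2048;                     /* assumed: beyond CPU_SETSIZE */
    cpu_set_t *set = CPU_ALLOC(num_cpus);    /* dynamically sized set */
    size_t size = CPU_ALLOC_SIZE(num_cpus);  /* bytes needed for num_cpus */

    if (!set) {
        perror("CPU_ALLOC");
        return 1;
    }

    CPU_ZERO_S(size, set);
    CPU_SET_S(1500, size, set);              /* CPU 1500 fits in this set */
    printf("CPUs in set: %d\n", CPU_COUNT_S(size, set));

    /* Pass 'size' (not sizeof(cpu_set_t)) to sched_setaffinity() */
    CPU_FREE(set);
    return 0;
}
```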
Linux provides the sched_setaffinity() and sched_getaffinity() system calls for managing hard affinity programmatically.
System Call Signatures:
```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Set the CPU affinity mask for a process
 *
 * @param pid        Process ID (0 = calling process)
 * @param cpusetsize Size of mask buffer in bytes
 * @param mask       Pointer to cpu_set_t specifying allowed CPUs
 * @return           0 on success, -1 on error (check errno)
 */
int sched_setaffinity(pid_t pid, size_t cpusetsize,
                      const cpu_set_t *mask);

/*
 * Get the current CPU affinity mask for a process
 *
 * @param pid        Process ID (0 = calling process)
 * @param cpusetsize Size of mask buffer in bytes
 * @param mask       Pointer to cpu_set_t to receive current mask
 * @return           0 on success, -1 on error (check errno)
 */
int sched_getaffinity(pid_t pid, size_t cpusetsize,
                      cpu_set_t *mask);
```

Working with cpu_set_t:
The cpu_set_t type is opaque—you manipulate it using macros provided in <sched.h>:
```c
/* Initialize and clear all CPUs from set */
void CPU_ZERO(cpu_set_t *set);

/* Add CPU 'cpu' to the set */
void CPU_SET(int cpu, cpu_set_t *set);

/* Remove CPU 'cpu' from the set */
void CPU_CLR(int cpu, cpu_set_t *set);

/* Test if CPU 'cpu' is in the set (returns non-zero if yes) */
int CPU_ISSET(int cpu, cpu_set_t *set);

/* Count number of CPUs in the set */
int CPU_COUNT(cpu_set_t *set);

/* Logical operations between sets */
void CPU_AND(cpu_set_t *dest, cpu_set_t *src1, cpu_set_t *src2);
void CPU_OR(cpu_set_t *dest, cpu_set_t *src1, cpu_set_t *src2);
void CPU_XOR(cpu_set_t *dest, cpu_set_t *src1, cpu_set_t *src2);

/* Test if two sets are equal */
int CPU_EQUAL(cpu_set_t *set1, cpu_set_t *set2);
```

Complete Example: Pinning a Process to Specific CPUs
```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sched.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>

void print_affinity(pid_t pid, const char *msg) {
    cpu_set_t mask;
    CPU_ZERO(&mask);

    if (sched_getaffinity(pid, sizeof(mask), &mask) == -1) {
        perror("sched_getaffinity");
        return;
    }

    printf("%s: CPUs allowed: ", msg);
    for (int i = 0; i < CPU_SETSIZE; i++) {
        if (CPU_ISSET(i, &mask)) {
            printf("%d ", i);
        }
    }
    printf("(count: %d)\n", CPU_COUNT(&mask));
}

int main(int argc, char *argv[]) {
    cpu_set_t mask;
    pid_t pid = getpid();

    /* Show initial affinity (inherited from parent) */
    print_affinity(0, "Initial affinity");

    /* Restrict to CPUs 0 and 2 only */
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);
    CPU_SET(2, &mask);

    printf("\nSetting affinity to CPUs 0 and 2...\n");
    if (sched_setaffinity(0, sizeof(mask), &mask) == -1) {
        perror("sched_setaffinity");
        exit(EXIT_FAILURE);
    }

    /* Verify the change */
    print_affinity(0, "After setting affinity");

    /* Show which CPU we're currently running on */
    int cpu = sched_getcpu();
    printf("Currently running on CPU: %d\n", cpu);

    /* Demonstrate that we only run on allowed CPUs */
    printf("\nRunning tight loop, observe with: watch taskset -p %d\n", pid);
    volatile long x = 0;
    for (long i = 0; i < 10000000000L; i++) {
        x += i;  /* Busy work */
    }

    return 0;
}
```

The sched_getcpu() function returns the CPU the calling thread is currently executing on. It's implemented very efficiently using the VDSO, making it suitable for performance monitoring and debugging without significant overhead.
While system calls give programmatic control, Linux also offers command-line tools for managing affinity without code changes.
The taskset Command:
taskset is the primary tool for getting and setting CPU affinity. It can be used to launch new processes with a specified affinity or modify running processes.
```bash
# ===== LAUNCHING WITH AFFINITY =====

# Run command on CPU 0 only
taskset 0x1 ./my_program

# Run command on CPUs 0, 1, 2, 3 (using hex mask)
taskset 0xf ./my_program

# Run command on CPUs 0, 1, 2, 3 (using -c with CPU list)
taskset -c 0-3 ./my_program

# Run command on CPUs 0, 2, 4 (comma-separated list)
taskset -c 0,2,4 ./my_program

# ===== QUERYING AFFINITY =====

# Get affinity for a running process
taskset -p <pid>
# Output: pid <pid>'s current affinity mask: ff

# Get affinity with CPU list format
taskset -cp <pid>
# Output: pid <pid>'s current affinity list: 0-7

# ===== MODIFYING RUNNING PROCESSES =====

# Set affinity of running process to CPU 0
taskset -p 0x1 <pid>

# Set affinity of running process to CPUs 2-5
taskset -cp 2-5 <pid>

# Set affinity for all threads of a process
taskset -ap 0x3 <pid>  # -a = all tasks (threads)

# ===== PRACTICAL EXAMPLES =====

# Run CPU-intensive benchmark on single core for consistency
taskset -c 0 ./benchmark --iterations=1000000

# Isolate database from general workload (assuming 8 CPUs)
# System processes: CPUs 0-3
# Database: CPUs 4-7
taskset -c 4-7 /usr/bin/postgres

# Run two competing processes on separate CPU sets
taskset -c 0-3 ./process_a &
taskset -c 4-7 ./process_b &
```

The numactl Command:
For NUMA systems, numactl provides more sophisticated control, combining CPU affinity with memory placement policies:
```bash
# View NUMA topology
numactl --hardware
# Shows: nodes, CPUs per node, memory per node, distances

# Run process on specific NUMA node (CPU + memory binding)
numactl --cpunodebind=0 --membind=0 ./my_program

# Run on CPUs of node 1, interleave memory across all nodes
numactl --cpunodebind=1 --interleave=all ./my_program

# Bind to specific CPUs within NUMA framework
numactl --physcpubind=0,2,4,6 ./my_program

# Preferred memory allocation on node 0, CPUs anywhere
numactl --preferred=0 ./my_program

# Check NUMA placement of running process
numastat -p <pid>
# Shows memory pages per NUMA node
```

Use taskset for pure CPU affinity when you only care about CPU placement. Use numactl when you need coordinated CPU and memory placement, or when working with NUMA-aware applications. For NUMA systems, numactl is generally the better choice as it handles both dimensions of locality.
Understanding how the kernel implements hard affinity reveals both its power and its limitations.
Task Structure Affinity Fields:
As we saw in the soft affinity discussion, the task_struct contains affinity-related fields:
```c
struct task_struct {
    /* ... */

    /*
     * cpus_ptr points to cpus_mask, which is the effective
     * affinity mask used by the scheduler. This is the
     * intersection of:
     *   - user_cpus_ptr (set via sched_setaffinity)
     *   - cpuset's cpus_allowed (cgroup CPU constraints)
     *   - offline CPUs mask (can't schedule on offline CPUs)
     */
    cpumask_t *cpus_ptr;       /* Points to effective mask */
    cpumask_t  cpus_mask;      /* Storage for effective mask */

    /*
     * Original user-requested affinity. When cpusets change,
     * the kernel can reconstruct the effective mask from this.
     */
    cpumask_t *user_cpus_ptr;  /* NULL until explicitly set */

    /* ... */
};
```

The sched_setaffinity() Implementation:
When sched_setaffinity() is called, the kernel performs several checks and operations:
1. Permission check: the caller must hold CAP_SYS_NICE or own the target process (same UID).
2. Constraint intersection: the requested mask is intersected with the cpuset's cpus_allowed if the process is in a cgroup with CPU restrictions.
3. Request preservation: the user-requested mask is saved in user_cpus_ptr for later cpuset changes.
4. Mask application: the effective mask is stored in cpus_mask, with cpus_ptr pointed to it.
```c
/*
 * Simplified sched_setaffinity() implementation
 * Actual kernel code is significantly more complex
 */
long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
{
    struct task_struct *p;
    struct cpumask new_mask;
    int retval;

    /* Find target task */
    p = find_process_by_pid(pid);
    if (!p)
        return -ESRCH;

    /* Check permissions */
    retval = check_affinity_permission(p);
    if (retval)
        return retval;

    /* Validate: at least one CPU must be online */
    cpumask_and(&new_mask, in_mask, cpu_online_mask);
    if (cpumask_empty(&new_mask))
        return -EINVAL;

    /* Intersect with cpuset constraints */
    cpumask_and(&new_mask, &new_mask, cpuset_cpus_allowed(p));
    if (cpumask_empty(&new_mask))
        return -EINVAL;

    /* Apply the new affinity */
    task_lock(p);
    cpumask_copy(p->user_cpus_ptr, in_mask);  /* Store user request */
    set_cpus_allowed_ptr(p, &new_mask);       /* Set effective mask */
    task_unlock(p);

    /* If task is on a disallowed CPU, migrate it now */
    if (!cpumask_test_cpu(task_cpu(p), &new_mask)) {
        /* Trigger migration thread to move the task */
        stop_one_cpu(task_cpu(p), migration_cpu_stop, p);
    }

    return 0;
}
```

When a running task's affinity is changed to exclude its current CPU, the kernel must immediately migrate it. This involves stopping the task, moving it to another CPU's runqueue, and potentially flushing cache lines. This migration is synchronous and can cause noticeable latency for the affected task.
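A rough way to observe this cost from userspace is to time an affinity change that forces a migration. This is a sketch, assuming at least two online CPUs; the measurement is indicative, not a rigorous benchmark:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <time.h>

int main(void) {
    cpu_set_t mask;
    struct timespec t0, t1;

    int from = sched_getcpu();
    int to = (from == 0) ? 1 : 0;   /* any CPU other than the current one */

    CPU_ZERO(&mask);
    CPU_SET(to, &mask);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (sched_setaffinity(0, sizeof(mask), &mask) == -1) {
        perror("sched_setaffinity");
        return 1;
    }
    sched_getcpu();                 /* by now we are running on 'to' */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    long ns = (t1.tv_sec - t0.tv_sec) * 1000000000L
            + (t1.tv_nsec - t0.tv_nsec);
    printf("Migrated from CPU %d to CPU %d in ~%ld ns\n", from, to, ns);
    return 0;
}
```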
In modern applications, affinity often needs to be set per-thread rather than per-process. Each thread in Linux is a separate task_struct with its own affinity mask.
Pthreads Affinity Interface:
The pthread library provides its own affinity functions that wrap the system calls:
```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/*
 * Set affinity for a specific thread
 *   thread:     pthread_t handle of target thread
 *   cpusetsize: size of cpuset structure
 *   cpuset:     pointer to CPU set
 */
int pthread_setaffinity_np(pthread_t thread, size_t cpusetsize,
                           const cpu_set_t *cpuset);

/*
 * Get affinity of a specific thread
 */
int pthread_getaffinity_np(pthread_t thread, size_t cpusetsize,
                           cpu_set_t *cpuset);

/*
 * Set default affinity for threads created with these attrs
 */
int pthread_attr_setaffinity_np(pthread_attr_t *attr,
                                size_t cpusetsize,
                                const cpu_set_t *cpuset);

int pthread_attr_getaffinity_np(pthread_attr_t *attr,
                                size_t cpusetsize,
                                cpu_set_t *cpuset);
```

Example: NUMA-Aware Thread Pool
A common pattern is to create worker threads and pin each to a specific CPU or NUMA node to maximize cache locality and minimize NUMA penalties:
```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NUM_WORKERS 8

typedef struct {
    int worker_id;
    int assigned_cpu;
} worker_ctx_t;

void* worker_routine(void* arg) {
    worker_ctx_t* ctx = (worker_ctx_t*)arg;
    cpu_set_t cpuset;

    /* Pin this thread to its assigned CPU */
    CPU_ZERO(&cpuset);
    CPU_SET(ctx->assigned_cpu, &cpuset);

    int rc = pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
    if (rc != 0) {
        /* pthread functions return the error code directly */
        fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(rc));
        return NULL;
    }

    /* Verify we're running on the correct CPU */
    int current_cpu = sched_getcpu();
    printf("Worker %d: started on CPU %d (assigned: %d)\n",
           ctx->worker_id, current_cpu, ctx->assigned_cpu);

    /* Worker loop - always runs on assigned CPU */
    while (1) {
        /* Process work items from local queue */
        /* All memory allocations will prefer the local NUMA node */
        /* (with proper memory allocation policy) */
        /* ... work ... */
    }

    return NULL;
}

int main() {
    pthread_t workers[NUM_WORKERS];
    worker_ctx_t contexts[NUM_WORKERS];

    /* Get number of online CPUs */
    int num_cpus = sysconf(_SC_NPROCESSORS_ONLN);
    printf("System has %d online CPUs\n", num_cpus);

    /* Create workers, round-robin across CPUs */
    for (int i = 0; i < NUM_WORKERS; i++) {
        contexts[i].worker_id = i;
        contexts[i].assigned_cpu = i % num_cpus;

        int rc = pthread_create(&workers[i], NULL,
                                worker_routine, &contexts[i]);
        if (rc != 0) {
            fprintf(stderr, "pthread_create: %s\n", strerror(rc));
            exit(1);
        }
    }

    printf("All workers started\n");

    /* Join workers */
    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_join(workers[i], NULL);
    }

    return 0;
}
```

For best results, set affinity immediately after thread creation (or use pthread_attr_setaffinity_np before creation). This minimizes the chance of the thread building cache state on a different CPU before being migrated to its assigned CPU.
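A minimal sketch of the attribute-based approach, so the thread is born on its target CPU (CPU 2 here is an arbitrary choice):

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

static void *worker(void *arg) {
    /* Already constrained: the first instruction runs on an allowed CPU */
    printf("Worker running on CPU %d\n", sched_getcpu());
    return NULL;
}

int main(void) {
    pthread_t tid;
    pthread_attr_t attr;
    cpu_set_t cpuset;

    CPU_ZERO(&cpuset);
    CPU_SET(2, &cpuset);               /* arbitrary target: CPU 2 */

    pthread_attr_init(&attr);
    /* Affinity is applied before the thread ever runs */
    pthread_attr_setaffinity_np(&attr, sizeof(cpuset), &cpuset);

    pthread_create(&tid, &attr, worker, NULL);
    pthread_join(tid, NULL);
    pthread_attr_destroy(&attr);
    return 0;
}
```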
Hard affinity doesn't exist in isolation—it interacts with numerous other kernel features, sometimes in subtle ways.
CPU Hotplug:
When a CPU goes offline (hot-unplugged), all tasks with affinity to that CPU must be migrated. The kernel:

- removes the offline CPU from each affected task's effective mask
- migrates any task running on that CPU to another allowed CPU
- if a task's effective mask would become empty, falls back to the remaining online CPUs (while preserving the saved user mask)
When the CPU comes back online, the kernel recalculates effective masks from saved user masks.
```bash
# Pin process to CPU 4 only
taskset -cp 4 $PID
# Output: pid $PID's current affinity list: 4

# Take CPU 4 offline
echo 0 > /sys/devices/system/cpu/cpu4/online

# Check affinity now
taskset -cp $PID
# Output: pid $PID's current affinity list: 0-3,5-7
# (Kernel expanded mask since CPU 4 unavailable)

# Bring CPU 4 back online
echo 1 > /sys/devices/system/cpu/cpu4/online

# Check affinity - should be restored
taskset -cp $PID
# Output: pid $PID's current affinity list: 4
# (Kernel restored from saved user mask)
```

cpusets (cgroups v1/v2):
Cpusets allow hierarchical CPU allocation to process groups. The effective affinity is the intersection of:

- the system's online CPU mask
- the cpuset's allowed CPUs (the cgroup constraint)
- the task's own mask (set via sched_setaffinity())

Changing cpuset constraints automatically recalculates all contained tasks' effective affinities.
| Source | Priority | Modifiable By | Purpose |
|---|---|---|---|
| Online CPU mask | Highest | Admin (hotplug) | Physical CPU availability |
| cpuset constraint | High | cgroup admin | Resource partitioning |
| Per-task mask | Normal | Application | Application optimization |
| Scheduler preference | Lowest | Scheduler | Cache locality hints |
Real-Time Scheduling:
Real-time processes (SCHED_FIFO, SCHED_RR) often use hard affinity to ensure deterministic execution. Combining RT scheduling with affinity provides:

- predictable latency, with no migration delays at inopportune moments
- warm cache and TLB state on the dedicated CPU
- isolation from interference by other workloads
However, care must be taken: an RT task pinned to a single CPU can starve all other tasks on that CPU.
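A sketch of the combination (requires root or CAP_SYS_NICE; CPU 3 and priority 50 are arbitrary choices for illustration):

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    cpu_set_t mask;
    struct sched_param param = { .sched_priority = 50 };  /* arbitrary RT prio */

    /* Pin to CPU 3 first, then raise to SCHED_FIFO */
    CPU_ZERO(&mask);
    CPU_SET(3, &mask);
    if (sched_setaffinity(0, sizeof(mask), &mask) == -1) {
        perror("sched_setaffinity");
        return 1;
    }

    /* Needs root/CAP_SYS_NICE; a pinned FIFO task can starve CPU 3 */
    if (sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
        perror("sched_setscheduler");
        return 1;
    }

    printf("SCHED_FIFO prio 50, pinned to CPU %d\n", sched_getcpu());
    return 0;
}
```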
The kernel boot parameter isolcpus= removes specified CPUs from the default affinity mask of all processes. Only processes explicitly bound to those CPUs will run there. This is often used with hard affinity to create guaranteed-quiet CPUs for latency-sensitive work. However, isolcpus is somewhat deprecated in favor of cpuset-based isolation.
Windows provides comparable hard affinity mechanisms, with both API and GUI options.
Windows API Functions:
```c
#include <windows.h>

/*
 * Set process affinity mask
 *   hProcess:              handle to the process
 *   dwProcessAffinityMask: bitmask of allowed processors
 */
BOOL SetProcessAffinityMask(
    HANDLE    hProcess,
    DWORD_PTR dwProcessAffinityMask
);

/*
 * Get current process and system affinity masks
 *   lpProcessAffinityMask: receives current process mask
 *   lpSystemAffinityMask:  receives system's available CPUs
 */
BOOL GetProcessAffinityMask(
    HANDLE     hProcess,
    PDWORD_PTR lpProcessAffinityMask,
    PDWORD_PTR lpSystemAffinityMask
);

/*
 * Set thread affinity (thread-level control)
 * Returns previous affinity mask, or 0 on error
 */
DWORD_PTR SetThreadAffinityMask(
    HANDLE    hThread,
    DWORD_PTR dwThreadAffinityMask
);

/*
 * Set thread's ideal processor (soft affinity hint)
 */
DWORD SetThreadIdealProcessor(
    HANDLE hThread,
    DWORD  dwIdealProcessor
);
```

Example: Setting Affinity on Windows
```c
#include <windows.h>
#include <stdio.h>

int main() {
    HANDLE hProcess = GetCurrentProcess();
    HANDLE hThread = GetCurrentThread();
    DWORD_PTR procMask, sysMask;

    /* Get current masks */
    if (!GetProcessAffinityMask(hProcess, &procMask, &sysMask)) {
        printf("GetProcessAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }

    printf("System mask:  0x%llx\n", (unsigned long long)sysMask);
    printf("Process mask: 0x%llx\n", (unsigned long long)procMask);

    /* Set process to run on CPUs 0 and 1 only */
    DWORD_PTR newMask = 0x3;  /* Binary: 11 = CPUs 0 and 1 */
    if (!SetProcessAffinityMask(hProcess, newMask)) {
        printf("SetProcessAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }

    printf("Process affinity set to: 0x%llx\n", (unsigned long long)newMask);

    /* Pin current thread to CPU 0 specifically */
    DWORD_PTR prevMask = SetThreadAffinityMask(hThread, 0x1);
    if (prevMask == 0) {
        printf("SetThreadAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }

    printf("Thread pinned to CPU 0\n");

    /* Work loop - guaranteed to run on CPU 0 */
    while (1) {
        /* ... */
    }

    return 0;
}
```

Task Manager Affinity:
Windows also allows setting affinity through the Task Manager GUI: open the Details tab, right-click the process, choose "Set affinity", and check the desired CPUs in the dialog.
This is convenient for quick adjustments but doesn't persist across process restarts.
PowerShell Affinity:
```powershell
# Get process and view affinity
$process = Get-Process -Name "notepad"
$process.ProcessorAffinity

# Set process to use only CPUs 0-3
$process.ProcessorAffinity = 0xF

# Start new process with specific affinity
$p = Start-Process -FilePath "myapp.exe" -PassThru
$p.ProcessorAffinity = 0x3  # CPUs 0 and 1
```

Hard affinity provides explicit control over CPU placement, transforming scheduling preferences into scheduling requirements. Let's consolidate our understanding:
- Interfaces: sched_setaffinity() and taskset on Linux; SetProcessAffinityMask() and Task Manager on Windows.
We've seen soft affinity (preferences) and hard affinity (constraints). But why do we care so much about where processes run? In the next page, we'll explore cache effects—the performance phenomena that make affinity matter in the first place. Understanding cache behavior will clarify why even small affinity decisions can have large performance impacts.
You now understand hard processor affinity—how to set it, how it's implemented, and how it interacts with other system features. You can programmatically pin processes to specific CPUs for deterministic scheduling. Next, we'll examine the cache effects that drive the need for processor affinity in the first place.