In the early days of computing, the process was the sole unit of execution. Each program ran as a single, monolithic entity—one sequence of instructions executing from start to finish. This model, while conceptually simple, carried fundamental limitations that became increasingly apparent as computing demands evolved.
Consider a word processor from this era. When the user requests to save a large document, the entire application freezes. Spell-checking? The user must wait, cursor blinking impatiently. Printing? Everything halts. The problem isn't the hardware—it's the abstraction. A single thread of execution cannot simultaneously respond to user input, perform computation, and interact with I/O devices.
The solution that emerged was the thread—a finer-grained unit of execution that would revolutionize how we structure concurrent programs and fundamentally reshape operating system design.
By the end of this page, you will possess a rigorous understanding of what a thread is, how it relates to the process abstraction, the formal definition used in operating system literature, and the historical context that drove its development. You will understand threads not as a mere programming convenience, but as a fundamental evolution in how we model concurrent execution.
A thread (sometimes called a lightweight process or LWP) is the basic unit of CPU utilization. It represents the smallest sequence of programmed instructions that can be managed independently by the operating system scheduler.
More precisely, a thread comprises:

- A thread ID, unique within its process
- A program counter, tracking the next instruction to execute
- A register set, holding the thread's current working values
- A stack, holding local variables and the function-call history

Critically, threads belonging to the same process share:

- The address space: code (text), data, and heap segments
- Open files, signal handlers, and other process-owned resources
A thread is what executes. A process is what owns resources. These are orthogonal concepts that early operating systems conflated but modern systems carefully separate. Understanding this distinction is the key to mastering concurrent programming.
Formal Definition (Operating System Textbooks):
A thread is a single sequential flow of control within a process. It has its own program counter, stack, and register state, but shares the process's address space and system resources with other threads in the same process.
This definition embodies a crucial architectural decision: separate "what executes" from "what is owned." A process becomes a container—an environment of resources—while threads become the entities that actually perform computation within that container.
```c
/* Conceptual representation of thread state in an OS kernel */
struct thread_control_block {
    /* Unique thread identifier within the process */
    tid_t thread_id;

    /* Execution state */
    enum thread_state state;        /* RUNNING, READY, BLOCKED, etc. */

    /* CPU context - saved/restored on context switch */
    struct cpu_context {
        unsigned long program_counter;
        unsigned long stack_pointer;
        unsigned long registers[NUM_REGISTERS];
        unsigned long flags_register;
    } context;

    /* Thread-local stack */
    void *stack_base;
    size_t stack_size;

    /* Pointer to owning process */
    struct process *process;        /* Shared with sibling threads */

    /* Scheduling information */
    int priority;
    unsigned long time_slice;
    unsigned long cpu_time;

    /* Thread-local storage pointer */
    void *tls_area;

    /* Links for scheduler queues */
    struct list_head run_queue_link;
    struct list_head process_thread_list;
};
```

The thread abstraction did not emerge fully formed. It evolved over decades as operating system designers grappled with the limitations of the process model and the demands of increasingly concurrent workloads.
The Process-Only Era (1960s–1970s)
Early operating systems like UNIX provided only the process abstraction. To achieve concurrency, programs would fork() child processes. This worked but carried significant overhead:

- Each child received its own full address space
- Creating and destroying processes was expensive
- Cooperation required inter-process communication (pipes, signals, shared memory), which is slow and awkward
The Emergence of Lightweight Processes (1980s)
Systems like Mach and Chorus introduced the concept of tasks and threads. A task (analogous to a process) provided resource ownership, while threads provided execution. This separation enabled:

- Multiple flows of control sharing a single address space
- Far cheaper creation and context switching than full processes
- Communication through shared memory instead of expensive IPC
The Standardization Era (1990s–Present)
POSIX threads (Pthreads) standardized the thread API in 1995, enabling portable multi-threaded programming. Operating systems converged on a model where:

- Processes own resources: the address space, open files, and credentials
- Threads are the units of scheduling and execution
- The kernel schedules each thread independently
| Era | Primary Abstraction | Concurrency Mechanism | Limitations |
|---|---|---|---|
| 1960s–1970s | Process only | fork() to create child processes | High overhead, separate address spaces, expensive IPC |
| 1980s | Tasks + Threads | Lightweight processes within tasks | Non-portable, vendor-specific APIs |
| 1990s–2000s | Processes + Pthreads | POSIX-standardized thread API | Many-to-one or one-to-one limitations |
| 2000s–Present | Hybrid models | Native threads + green threads + coroutines | Complexity of choosing the right model |
The term "lightweight process" (LWP) emphasizes that threads are process-like entities (they can be scheduled, they have state, they can block) but with dramatically reduced overhead. In some systems like Solaris, LWP specifically refers to the kernel-level entity that backs user-level threads.
To truly understand threads, we must dissect their components. A thread is not a complex entity—its simplicity is precisely what makes it powerful. Let's examine each component in detail.
Thread ID: each thread carries a unique identifier within its process; on Linux, the gettid() system call returns this value.

Stack Isolation and Safety
Each thread's stack is a critical component of thread isolation. Stacks grow and shrink dynamically as execution proceeds: each function call pushes a new frame holding the return address, saved registers, and local variables, and each return pops that frame.
Because each thread has its own stack, local variables are inherently thread-safe—no two threads will ever share stack-allocated data (unless addresses are explicitly passed, which is dangerous).
While stacks provide isolation, they are finite. A thread that recurses too deeply or allocates large arrays on the stack can overflow its stack, potentially corrupting adjacent memory. Guard pages (non-accessible memory pages at stack boundaries) help detect overflows, but cannot prevent all damage. Always be mindful of stack usage in recursive algorithms.
Threads, like processes, progress through a series of states during their lifetime. Understanding these states is essential for debugging concurrent programs and reasoning about thread behavior.
The Five Primary Thread States:
| State | Description | Transition Triggers |
|---|---|---|
| New (Born) | Thread has been created but not yet started. Data structures are allocated, but no execution has begun. | Thread creation call (pthread_create, CreateThread) |
| Ready (Runnable) | Thread is prepared to execute and waiting for CPU allocation. It could run immediately if scheduled. | Start called, I/O complete, lock acquired, time slice expired |
| Running | Thread is actively executing on a CPU core. At any instant, at most N threads can be running on N cores. | Scheduler dispatches thread to CPU |
| Blocked (Waiting) | Thread cannot proceed until some event occurs. The thread is not consuming CPU cycles while blocked. | Waiting for I/O, lock, condition variable, sleep, join |
| Terminated (Dead) | Thread has completed execution or was cancelled. Resources may remain until joined. | Return from entry function, pthread_exit, cancellation |
State Transitions in Practice:
Thread Creation:
main() calls pthread_create(&thread, NULL, worker, arg);
→ Thread enters NEW state
→ Immediately transitions to READY (ready to be scheduled)
Scheduler Dispatch:
OS scheduler selects thread from ready queue
→ Thread transitions from READY to RUNNING
→ CPU's program counter loaded with thread's PC
→ Thread's registers restored
Blocking Operation:
Thread calls read(fd, buffer, size) on a slow device
→ Thread transitions from RUNNING to BLOCKED
→ Thread placed on wait queue for that I/O
→ Scheduler selects another thread to run
I/O Completion:
Device signals interrupt, data available
→ Thread transitions from BLOCKED to READY
→ Thread placed on ready queue
→ (May or may not run immediately, depends on scheduling)
Preemption:
Thread exhausts time slice (quantum)
→ Timer interrupt fires
→ Thread transitions from RUNNING to READY
→ Scheduler selects next thread (might be same thread)
Termination:
Thread's function returns or calls pthread_exit()
→ Thread transitions to TERMINATED
→ Resources held until another thread calls pthread_join()
Just like processes, threads can become 'zombies.' A terminated thread whose exit status hasn't been collected (via pthread_join) remains in a terminated state, consuming kernel resources. Detached threads (created with PTHREAD_CREATE_DETACHED or via pthread_detach) automatically release resources upon termination.
Perhaps the most illuminating way to understand threads is through the lens of execution context. At any moment, a CPU can only execute one sequence of instructions. The state required to resume that sequence later is the execution context.
The Minimal Execution Context:
Imagine pausing a CPU mid-execution. To later resume exactly where you left off, you must preserve:
This minimal state—PC, registers, and stack—is precisely what defines a thread. Everything else (code, data, heap, files) is environmental—shared context that any thread in the process can access.
The Illusion of Simultaneity:
On a single-core CPU, only one thread truly executes at any instant. The operating system creates the illusion of simultaneous execution by rapidly switching between threads—saving one thread's context, loading another's. This is preemptive multitasking.
On a multi-core CPU, threads can execute truly simultaneously—one thread per core. This is parallel execution, and it's where threads provide genuine performance gains for CPU-bound workloads.
```c
/* Simplified context switch (conceptual) */

/* This structure holds everything needed to resume a thread */
struct execution_context {
    unsigned long rax, rbx, rcx, rdx;   /* General-purpose registers */
    unsigned long rsi, rdi, rbp, rsp;   /* Stack and base pointers */
    unsigned long r8, r9, r10, r11;     /* Additional registers (x86-64) */
    unsigned long r12, r13, r14, r15;
    unsigned long rip;                  /* Instruction pointer (PC) */
    unsigned long rflags;               /* CPU flags */
    unsigned long fs_base, gs_base;     /* Segment bases (TLS) */
    /* Floating-point state would also be saved */
};

/*
 * switch_context: The heart of the thread scheduler
 *
 * Saves current thread's context, restores next thread's context.
 * After this function "returns," we're executing a different thread!
 */
void switch_context(struct thread *current, struct thread *next) {
    /* Step 1: Save current thread's registers to its context structure */
    save_registers(&current->context);

    /* Step 2: Switch to next thread's stack */
    /* WARNING: After this, 'current' and 'next' may be invalid! */
    /* We're now using next's stack, so local variables change meaning */

    /* Step 3: Restore next thread's registers from its context */
    restore_registers(&next->context);

    /* Step 4: "Return" - but we return to wherever 'next' was suspended */
    /* The restored rip register determines where execution continues */
}
```

The profound insight of threading is that switch_context 'returns' to a different place than it was called from. When we restore the program counter and stack of another thread, we resume that thread's execution mid-flight. The calling thread doesn't 'see' the return—it's suspended until someone later switches back to it.
Modern operating systems universally support threads, though implementation details vary. Understanding these implementations provides insight into thread behavior and performance characteristics.
Native POSIX Thread Library (NPTL)
Linux implements threads using the clone() system call with specific flags. In Linux's view, threads and processes are both "tasks"—the difference lies in what they share:
```c
/* Creating a thread with clone() */
clone(CLONE_VM      |   /* Share virtual memory */
      CLONE_FS      |   /* Share filesystem info */
      CLONE_FILES   |   /* Share file descriptors */
      CLONE_SIGHAND |   /* Share signal handlers */
      CLONE_THREAD  |   /* Same thread group */
      CLONE_SYSVSEM,    /* Share SysV semaphore adjust values */
      stack_top,        /* New stack for the thread */
      ...);
```
Key Characteristics:

- In the kernel, every thread is a task with its own kernel thread ID, returned by gettid()
- Threads in the same process share a thread group ID, which is what getpid() reports
- The kernel schedules each thread independently

Modern general-purpose operating systems have converged on the 1:1 threading model: one user thread corresponds to one kernel thread. This approach provides true parallelism and simplifies the implementation at the cost of somewhat higher thread creation overhead compared to pure user-space threads.
In a multi-threaded environment, threads need to identify themselves and each other. Several identification mechanisms exist, each serving different purposes.
```c
#include <pthread.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <stdio.h>

void *thread_func(void *arg) {
    /* Method 1: POSIX thread ID (opaque type) */
    pthread_t posix_tid = pthread_self();

    /* Method 2: System thread ID (Linux-specific) */
    pid_t kernel_tid = syscall(SYS_gettid);

    /* Method 3: Process ID (shared by all threads) */
    pid_t process_pid = getpid();

    printf("Thread report:\n");
    printf("  POSIX thread ID:  %lu\n", (unsigned long)posix_tid);
    printf("  Kernel thread ID: %d\n", kernel_tid);
    printf("  Process ID:       %d\n", process_pid);

    /* Comparing thread IDs */
    pthread_t main_thread = *(pthread_t *)arg;
    if (pthread_equal(pthread_self(), main_thread)) {
        printf("  This IS the main thread\n");
    } else {
        printf("  This is NOT the main thread\n");
    }
    return NULL;
}

int main() {
    pthread_t tid1, tid2;
    pthread_t main_tid = pthread_self();

    /* Create threads, passing main's ID for comparison */
    pthread_create(&tid1, NULL, thread_func, &main_tid);
    pthread_create(&tid2, NULL, thread_func, &main_tid);

    pthread_join(tid1, NULL);
    pthread_join(tid2, NULL);
    return 0;
}
```

- POSIX thread ID (pthread_t): an opaque type; use pthread_equal() for comparisons, never ==.
- Kernel thread ID: Linux-specific, obtained via gettid() or syscall(SYS_gettid).
- Process ID: shared by all threads; getpid() returns this value from any thread.
- Thread name: can be set with pthread_setname_np(). This name appears in tools like top, htop, and debuggers.

A common bug: comparing pthread_t values with == instead of pthread_equal(). On some systems this works by accident (pthread_t might be an integer), but on others it fails silently (pthread_t might be a structure). Always use pthread_equal() for portable, correct code.
We have established a rigorous foundation for understanding threads. Let's consolidate the essential concepts:

- A thread is the basic unit of CPU utilization: a single sequential flow of control within a process
- Each thread owns its program counter, register set, and stack; all threads in a process share the address space, open files, and other resources
- Threads move through five primary states: New, Ready, Running, Blocked, and Terminated
- The abstraction evolved from fork()-only concurrency through lightweight processes to today's POSIX-standardized, 1:1 kernel-thread model
What's Next:
With a solid understanding of what a thread is, we're ready to explore how threads compare to the process abstraction we've studied earlier. The next page examines the Thread vs Process distinction in depth—exploring when to use each, the performance characteristics of both, and the fundamental tradeoffs in concurrent program design.
You now possess a comprehensive understanding of the thread abstraction—its definition, components, lifecycle, and implementation across major operating systems. This foundation is essential for everything that follows in concurrent programming.