Process Definition

Level: Beginner

Duration: 60 mins

Topic: Process Definition

Program vs Process

The Most Important Distinction in Operating Systems

Before you can understand how operating systems manage running software, you must grasp a distinction that seems subtle but carries profound implications: the difference between a program and a process.

This isn't mere semantics. The program-process distinction is the conceptual foundation upon which the entire edifice of process management, scheduling, memory protection, and inter-process communication is built. Confusing these concepts leads to fundamental misunderstandings about how computers actually work.

Consider this: when you double-click an application icon on your desktop, what exactly happens? Is the application "running"? If you double-click it again, do you have two applications or one? And when the application is paused waiting for input, is it still running? These questions have precise answers—but only once you truly understand what a process is.

What You Will Learn

By the end of this page, you will understand the precise technical distinction between programs and processes, why this distinction matters for operating system design, how the same program can spawn multiple independent processes, and how this separation enables the process isolation that modern computing depends upon.

What is a Program?

A program is a passive, static entity. It is a collection of instructions stored in a file on disk—nothing more, nothing less. A program does not execute, does not consume CPU cycles, and does not hold data in memory. It simply exists as potential.

The Essential Characteristics of a Program:

Program Characteristics

•Static Entity — A program is a sequence of bytes stored on permanent storage (hard drive, SSD). It does not change during execution because it is never 'executing'—the process is.
•Passive Artifact — A program has no agency. It cannot perform actions, request resources, or respond to events. It is inert, like a blueprint sitting on a shelf.
•Compiled or Interpreted Instructions — Programs contain machine code (compiled languages) or source code/bytecode (interpreted languages) that describes what to do without actually doing it.
•Persistent Storage — Programs persist on disk even when the system is powered off. They survive reboots, crashes, and years of storage.
•Single Representation — There is typically one program file (or set of files) for a given application. This single artifact can be copied but each copy is functionally identical.

Anatomy of a Program File:

On Unix-like systems, a compiled program is typically an ELF (Executable and Linkable Format) file. On Windows, it's a PE (Portable Executable) file. These formats define structured sections:

Structure of an Executable Program File
Section	Contents	Purpose
`.text`	Machine code instructions	The actual executable logic of the program
`.data`	Initialized global variables	Variables with predefined starting values
`.bss`	Uninitialized global variables	Variables that start as zero; space reserved but not stored
`.rodata`	Read-only data (constants, strings)	Literal values embedded in the program
Header	Metadata (entry point, architecture, etc.)	Tells the OS loader how to set up the process

Examining a program file on Linux
Bash
# View the sections of a compiled program
$ readelf -S /bin/ls
 
Section Headers:
  [Nr] Name              Type             Address           Offset
  [ 1] .interp           PROGBITS         0000000000000318  00000318
  [14] .text             PROGBITS         0000000000004d90  00004d90
  [15] .rodata           PROGBITS         0000000000019000  00019000
  [25] .data             PROGBITS         0000000000024040  00023040
  [26] .bss              NOBITS           0000000000024160  00023148
 
# The Size column (omitted from this abridged listing) shows .text holding 59,536 bytes of machine code
# The program itself is just this inert data sitting on disk
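
To make the "Header" row of the table above concrete, here is a minimal Python sketch that decodes a few fields of a 64-bit ELF header. The function name `parse_elf64_header` and the dictionary keys are illustrative choices rather than part of any standard tool, and the sketch assumes the ELF64 header layout (a 16-byte identification array followed by fixed-width fields).

```python
import struct
import sys

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def parse_elf64_header(data: bytes):
    """Decode a few ELF64 header fields; return None if data is not 64-bit ELF."""
    if len(data) < 64 or data[:4] != ELF_MAGIC or data[4] != 2:  # EI_CLASS 2 = 64-bit
        return None
    order = "<" if data[5] == 1 else ">"  # EI_DATA 1 = little-endian
    # Fields that follow the 16-byte e_ident array, in on-disk order.
    (e_type, e_machine, _version, e_entry, _phoff, _shoff, _flags,
     _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = struct.unpack_from(order + "HHIQQQIHHHHHH", data, 16)
    return {
        "type": e_type,              # 2 = executable, 3 = shared object / PIE
        "machine": e_machine,        # 62 = x86-64
        "entry_point": e_entry,      # address where execution starts
        "program_headers": e_phnum,
        "section_headers": e_shnum,
    }


if __name__ == "__main__":
    # Assumption: on Linux the Python interpreter itself is an ELF file.
    with open(sys.executable, "rb") as f:
        print(parse_elf64_header(f.read(64)))
```

Nothing here executes the file. The sketch only reads bytes, which is the point: until the loader acts on this metadata, the program is data.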

Programs are Recipes, Not Meals

Think of a program as a recipe in a cookbook. The recipe describes ingredients and steps, but until someone reads the recipe and performs those steps in a kitchen, no food is produced. The cookbook can sit on a shelf for decades—it only becomes relevant when a cook (the CPU) executes it. The execution, not the recipe, is what matters.

What is a Process?

A process is an active, dynamic entity. It is a program in execution—but crucially, it is far more than just executing code. A process is the complete execution context that the operating system creates to run a program.

When the OS loads a program into memory and begins execution, it creates a process that includes:

•The code from the program's text section, now loaded into memory
•Data sections with global and static variables
•A heap for dynamic memory allocation
•A stack for function calls, local variables, and return addresses
•CPU register contents including the program counter
•Open resources like file handles, network sockets, and devices
•OS metadata tracking the process state, permissions, and relationships

Process Characteristics

•Dynamic Entity — A process changes constantly during execution. Variables are modified, functions are called and return, memory is allocated and freed.
•Active Agent — A process performs actions: reading files, sending network packets, computing results. It has agency within the bounds set by the OS.
•Execution Context — A process encapsulates everything needed to execute: memory layout, CPU state, open resources, scheduling priority, and more.
•Transient Existence — Processes are created and destroyed. They exist only while executing and disappear when they terminate (unlike programs which persist).
•System Resource Consumer — Processes consume CPU time, memory, I/O bandwidth, and other finite system resources that the OS must manage.
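
A running process can ask the OS about parts of its own execution context. The following is a minimal sketch using Python's standard `os` module; the function name `describe_current_process` and the dictionary keys are illustrative, not an established API.

```python
import os


def describe_current_process():
    """Collect a few pieces of the context the OS keeps for this process."""
    cpu = os.times()
    return {
        "pid": os.getpid(),              # identity assigned by the kernel
        "parent_pid": os.getppid(),      # relationship to the creating process
        "working_dir": os.getcwd(),      # part of the per-process environment
        "user_cpu_seconds": cpu.user,    # CPU time consumed in user mode
        "system_cpu_seconds": cpu.system,
    }


if __name__ == "__main__":
    for key, value in describe_current_process().items():
        print(f"{key}: {value}")
```

None of these values exist for the program file on disk. They come into being only when a process is created, and they differ for every instance.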

The Process Memory Layout in Detail:

When a process is created, the operating system establishes a virtual address space—a region of memory addresses that the process believes is entirely its own. This memory is organized into distinct segments:

Process Memory Segments
Segment	Location	Purpose	Growth Direction
Text (Code)	Low addresses	Executable machine instructions	Fixed size, read-only
Data	Above text	Initialized global/static variables	Fixed at load time
BSS	Above data	Uninitialized global/static variables	Fixed, zeroed at start
Heap	Above BSS	Dynamic memory allocation (malloc, new)	Grows upward ↑
Stack	High addresses	Function calls, local variables, return addresses	Grows downward ↓
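
On Linux, a process can inspect its own layout through `/proc/self/maps`. The sketch below parses that file, assuming the usual line format (address range, permissions, offset, device, inode, optional name). The helper names `parse_maps_line` and `own_mappings` are made up for this example, and on systems without `/proc` the second function simply returns an empty list.

```python
from pathlib import Path


def parse_maps_line(line: str):
    """Split one /proc/<pid>/maps line into address range, permissions, and name."""
    fields = line.split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    return {
        "start": start,
        "end": end,
        "size": end - start,
        "perms": fields[1],  # e.g. r-xp for code, rw-p for data, heap, and stack
        "name": fields[5].strip() if len(fields) > 5 else "",
    }


def own_mappings():
    """Return this process's mappings, or [] where /proc is unavailable."""
    maps = Path("/proc/self/maps")
    if not maps.exists():
        return []
    return [parse_maps_line(line)
            for line in maps.read_text().splitlines() if line.strip()]


if __name__ == "__main__":
    for m in own_mappings():
        if m["name"] in ("[heap]", "[stack]") or "x" in m["perms"]:
            print(f'{m["start"]:#018x} {m["perms"]} {m["size"]:>10} {m["name"]}')
```

On a typical Linux system the executable mappings and `[heap]` appear at lower addresses than `[stack]`, which matches the table. Exact addresses vary from run to run because of address space layout randomization.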

Stack vs Heap Collision

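In this classic layout the heap grows upward while the stack grows downward, toward each other. On modern 64-bit systems the gap between them is so large that the two rarely meet. A process usually hits a limit first: runaway recursion exceeds the stack limit (a stack overflow, which the OS detects through a guard page below the stack), or allocation requests start to fail once memory is exhausted. Either way, resource exhaustion remains a real concern in systems programming.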

The Critical Differences

Understanding the program-process distinction requires examining their differences across multiple dimensions. These differences are not academic—they have profound implications for how we reason about system behavior.

Program vs Process: Comprehensive Comparison
Dimension	Program	Process
Nature	Passive entity (stored data)	Active entity (execution context)
Location	Stored on disk	Resides in memory
Lifetime	Persists until deleted	Exists only during execution
Instances	Single file	Multiple concurrent instances possible
Resources	Consumes disk space only	Consumes CPU, memory, I/O, handles
State	Immutable (code doesn't change)	Mutable (data changes constantly)
Visibility	Visible in file system	Visible in task manager / ps command
Identity	File path / inode	Process ID (PID)
Termination	Deletion from file system	Exit or kill by OS
Creation	Compilation or copying	fork() / exec() / CreateProcess()

Program: Static Properties

•Stored as a file on secondary storage
•Contains compiled code and data definitions
•Single authoritative copy
•No execution state
•No system resource consumption (except disk)
•Survives system reboots

Process: Dynamic Properties

•Loaded into main memory (RAM)
•Actively executing instructions
•Multiple independent instances
•Current state (registers, PC, stack)
•Actively consuming system resources
•Lost on termination or crash

Observing the difference
Bash
# The PROGRAM /usr/bin/python3 is a file
$ ls -la /usr/bin/python3
-rwxr-xr-x 1 root root 5494584 Nov 14  2023 /usr/bin/python3
 
# The PROCESS is what runs when we execute it
$ python3 -c "import time; time.sleep(60)" &
[1] 12345
 
# View the process (notice the PID, memory, CPU usage)
$ ps aux | grep '[p]ython3'   # the [p] stops grep from listing itself
user    12345  0.1  0.3  15324  6848 pts/0  S  10:00  0:00 python3 -c...
 
# One program file, but we can spawn multiple processes
$ python3 -c "import time; time.sleep(60)" &
[2] 12346
$ python3 -c "import time; time.sleep(60)" &
[3] 12347
 
# Now we have 3 independent processes from the same program
$ ps aux | grep '[p]ython3'
user    12345  0.1  0.3  15324  6848 pts/0  S  10:00  0:00 python3 ...
user    12346  0.1  0.3  15324  6840 pts/0  S  10:00  0:00 python3 ...
user    12347  0.1  0.3  15324  6852 pts/0  S  10:00  0:00 python3 ...
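
The same experiment can be scripted. This is a small sketch, assuming a Python interpreter is available as `sys.executable`; the function name `spawn_instances` is illustrative. It starts several processes from one program file and collects the PID each one reports.

```python
import os
import subprocess
import sys

CHILD_CODE = "import os; print(os.getpid())"


def spawn_instances(count: int = 3):
    """Start `count` concurrent processes from the same interpreter binary."""
    children = [
        subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                         stdout=subprocess.PIPE, text=True)
        for _ in range(count)
    ]
    # Each child reports its own PID; no child is reaped until we collect it here.
    return [int(child.communicate()[0]) for child in children]


if __name__ == "__main__":
    print("parent PID:", os.getpid())
    print("child PIDs:", spawn_instances())
```

All children are started before any of them is waited for, so the PIDs are guaranteed to be distinct: one program, several independent processes.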

Why This Distinction Matters

The program-process distinction isn't merely conceptual purity—it has practical consequences that ripple through every layer of system design.

Practical Implications

•Resource Management: The OS allocates resources (memory, CPU time, handles) to processes, not programs. Understanding this is essential for debugging resource issues.
•Isolation and Security: Memory protection boundaries exist between processes, not programs. By default, one process cannot read or write another's memory; sharing requires explicit, OS-mediated mechanisms such as shared memory or debugging interfaces. This isolation is a foundation of system security.
•Concurrency: Multiple users can run the same program simultaneously as separate processes without interference. This enables multi-user systems.
•Accounting and Limits: The OS tracks and limits resource usage per process. You can limit a process to 1 GB of RAM, but you cannot limit a 'program'.
•Debugging: When a crash occurs, you debug a process: its current state, stack trace, and memory contents. The program source is just reference material.
•Persistence vs Volatility: Process state is volatile (lost on crash) while program code persists, which is why you save your work regularly.
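
Per-process limits can be observed directly on Unix-like systems. The sketch below uses Python's `resource` module and limits the number of open files rather than memory, only because that is easier to demonstrate safely. The helper names and the value 64 are arbitrary choices for this example.

```python
import resource
import subprocess
import sys

REPORT = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"


def child_limit_for(parent_hard: int, wanted: int = 64) -> int:
    """Pick a soft limit that does not exceed the parent's hard limit."""
    if parent_hard == resource.RLIM_INFINITY:
        return wanted
    return min(wanted, parent_hard)


def run_with_fd_limit(wanted: int = 64) -> int:
    """Run a child whose open-file limit is lowered; return the limit it observes."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    target = child_limit_for(hard, wanted)

    def lower_limit():
        # Runs in the child between fork() and exec(), so only the child is affected.
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))

    result = subprocess.run([sys.executable, "-c", REPORT], preexec_fn=lower_limit,
                            capture_output=True, text=True, check=True)
    return int(result.stdout)


if __name__ == "__main__":
    print("parent soft limit:", resource.getrlimit(resource.RLIMIT_NOFILE)[0])
    print("child soft limit: ", run_with_fd_limit())
```

The limit attaches to the child process, not to the Python program: the parent, running the same program file, keeps its original limit.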

A Common Misconception

Many beginners think 'running a program' means the program itself is doing something. But programs don't run—processes do. When you double-click an icon, the OS creates a NEW PROCESS based on the program's instructions. The program remains unchanged. This is why updates often require 'restarting the application'—the existing process is using old code in memory.

Real-World Scenario: Updating Running Software

Consider what happens when you update your web browser while it's running:

•The installer replaces the program files on disk with new versions
•Your running browser continues executing from the old code in memory
•The browser process doesn't automatically get the new code
•Only when you restart (kill old process, create new process) do you get the update

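This behavior is a direct consequence of the program-process distinction. A running process keeps using the code it has already loaded or mapped, regardless of what the program's path on disk now points to. On Unix-like systems, for example, an installer typically replaces a file by writing a new one under the same name, and the old file's contents remain available to the running process until it exits. Dynamic linking and demand paging add details, but the principle remains.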
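
The effect can be imitated with ordinary files on a POSIX system. In this sketch an open file handle stands in for the running process's view of its program file; the file names `app.bin` and `app.bin.new` and the function name are hypothetical. It is an analogy for what happens to a mapped executable, not a reproduction of any installer.

```python
import os
import tempfile


def read_after_replace():
    """Show that an open file keeps its old contents after the path is replaced."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "app.bin")        # hypothetical program file
        staged = os.path.join(workdir, "app.bin.new")  # hypothetical update
        with open(path, "wb") as f:
            f.write(b"version 1")
        with open(staged, "wb") as f:
            f.write(b"version 2")

        running = open(path, "rb")   # stands in for the running process's view
        os.replace(staged, path)     # the "installer" swaps in the new file
        with open(path, "rb") as fresh:
            new_view = fresh.read()  # what a newly started process would load
        old_view = running.read()    # the old file contents are still reachable
        running.close()
        return old_view, new_view


if __name__ == "__main__":
    print(read_after_replace())
```

The handle opened before the swap still reads the old bytes, while anything opened afterwards sees the new ones. Restarting an application is the process-level equivalent of opening the path again.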

From Program to Process: The Loading Sequence

The transformation from program to process is a carefully orchestrated sequence of operations. Understanding this sequence reveals the true complexity hidden behind a simple 'run' command.

Loading Steps in Detail

•Request to Execute: The user (or another process) asks the OS to run the program, for example through execve() on Unix-like systems or CreateProcess() on Windows.
•Locate and Validate: The OS locates the program file, verifies the executable format (ELF, PE), and checks execution permissions.
•Allocate Virtual Memory: The OS sets up a fresh virtual address space for the process, isolated from all other processes.
•Map Code Sections: The .text (code) and .rodata (constants) sections are mapped into memory, typically as read-only.
•Initialize Data: The .data section is loaded with initial values; .bss is allocated and zero-filled.
•Prepare Stack: An initial stack is created at high addresses with the command-line arguments and environment variables.
•Dynamic Linking: If the program uses shared libraries, the dynamic linker (ld.so on Linux) resolves and loads dependencies.
•Create PCB: The kernel creates a Process Control Block containing all process metadata. (On Unix-like systems this record already exists by the time exec() runs, because fork() created it.)
•Set Initial CPU State: The program counter is set to the entry point; registers are initialized.
•Enter Ready Queue: The process is placed in the scheduler's ready queue, awaiting CPU time.
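
The last three steps can be pictured with a toy model. This is purely illustrative: real kernels keep far more state (Linux, for instance, uses a large C structure for this), and every name below (`ProcessControlBlock`, `State`, `admit`) is invented for this sketch.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    NEW = "new"
    READY = "ready"


@dataclass
class ProcessControlBlock:
    pid: int
    program_path: str
    program_counter: int                          # set to the entry point at load time
    registers: dict = field(default_factory=dict)
    open_files: list = field(default_factory=lambda: [0, 1, 2])  # stdin, stdout, stderr
    state: State = State.NEW


def admit(pcb: ProcessControlBlock, ready_queue: deque) -> None:
    """Final loading step: mark the process ready and hand it to the scheduler."""
    pcb.state = State.READY
    ready_queue.append(pcb)


if __name__ == "__main__":
    queue = deque()
    pcb = ProcessControlBlock(pid=12345, program_path="/bin/ls", program_counter=0x4D90)
    admit(pcb, queue)
    print(queue[0])
```

The model shows the bundling idea: identity, CPU state, and resources travel together in one record, and the scheduler only ever handles that record.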

Demand Paging Optimization

Modern operating systems often don't actually load the entire program into memory immediately. Through demand paging, only the pages actually needed are loaded. The OS sets up page table entries pointing to the disk file, and pages are loaded only when first accessed. This dramatically speeds up process creation for large programs.
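
Memory-mapping a file from user code gives a feel for this behavior. The sketch below is an analogy under stated assumptions rather than a view into the loader itself: it maps a 16 MiB file (typically sparse, so it takes little real disk space) and reads a single byte near the end, so only the page containing that byte has to be brought in. The file name `image.bin` and the function names are hypothetical.

```python
import mmap
import os
import tempfile


def read_byte_via_mapping(path: str, offset: int) -> int:
    """Map a file read-only and return one byte; only touched pages need loading."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[offset]


def demo(size: int = 16 * 1024 * 1024) -> int:
    """Create a file of `size` bytes ending in 0x2A and read that last byte."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "image.bin")  # hypothetical file name
        with open(path, "wb") as f:
            f.seek(size - 1)
            f.write(b"\x2a")
        return read_byte_via_mapping(path, size - 1)


if __name__ == "__main__":
    print(demo())
```

Creating the mapping is cheap regardless of file size, because the cost is paid page by page on first access. That is the property that keeps process startup fast for large programs.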

Historical Context: Evolution of the Process Concept

The program-process distinction emerged from the evolution of computing systems. Understanding this history illuminates why modern operating systems are designed as they are.

The Early Days: No Distinction Needed

In the earliest computers (1940s-1950s), there was no operating system and no distinction between program and process. A single program was loaded, executed to completion, and then the next program was manually loaded. The computer was the process.

Evolution of the Process Concept
Era	System Type	Program-Process Relationship
1940s-50s	Single-program systems	No distinction; one program runs until complete
1950s-60s	Batch systems	Programs (jobs) queued; executed sequentially
1960s	Multiprogramming	Multiple programs in memory simultaneously
1960s-70s	Time-sharing (Multics)	Process concept formalized; multiple users share CPU
1970s	Unix	Process abstraction refined; fork/exec model
1980s-Present	Modern OS	Full process isolation with virtual memory

The Birth of the Process Concept

The term 'process' is generally credited to the designers of Multics, a time-sharing system developed in the 1960s by MIT together with General Electric and Bell Labs. They needed a term to describe a program in execution, distinguishing it from the static code on storage.

The key insight was that the same program could be executed multiple times simultaneously by different users. Each execution needed its own context: its own data, its own position in the code, its own state. The process was this execution context.

Unix's Contribution

Unix (1969-1973) refined the process concept with the revolutionary fork() and exec() model:

•fork() creates a new process by duplicating an existing one
•exec() replaces the current process's program with a new program

This separation of process creation (fork) from program loading (exec) provides remarkable flexibility and became the model for most subsequent Unix-like systems.
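
Python exposes both calls on Unix-like systems, which allows a compact sketch of the pattern. The function name `spawn_and_wait` is illustrative, and the exit code 127 for a failed exec follows shell convention rather than any requirement.

```python
import os
import sys


def spawn_and_wait(argv):
    """Run argv in a child process using fork + exec; return its exit code."""
    pid = os.fork()                 # duplicate the calling process
    if pid == 0:
        # Child: replace the duplicated program image with the requested one.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)           # reached only if exec failed
    _, status = os.waitpid(pid, 0)  # parent: wait for the child to terminate
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    code = spawn_and_wait([sys.executable, "-c", "print('hello from the child')"])
    print("child exit code:", code)
```

Between `fork()` and `exec()` the child is still running the parent's program, which is where a shell would set up redirections or limits before the new program takes over.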

The Unifying Abstraction

The process concept was so successful because it unified everything the OS needed to track about an execution. Rather than managing code, data, and resources separately, the process bundles them together. This abstraction simplified OS design and enabled the sophisticated multi-tasking systems we take for granted today.

Common Confusions and Clarifications

Even experienced developers sometimes conflate related concepts. Let's clarify one of the most common confusions, process versus thread:

Threads are lightweight execution units within a process. A process is a container that can hold one or more threads.

•Process: Heavy-weight, isolated address space, separate resources
•Thread: Light-weight, shares process address space, shares most resources

Key differences:

•Threads share memory; processes do not (by default)
•Creating a thread is faster than creating a process
•A crash in one thread can affect all threads in that process
•Processes can contain multiple threads; threads cannot contain processes
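
The first of these differences is easy to observe on a Unix-like system. In the sketch below (the function name `compare_sharing` is illustrative), a thread and a forked child process each increment the same counter, and the parent checks which write it can see.

```python
import os
import threading


def compare_sharing():
    """Return the counter as seen after a thread writes, then after a child process writes."""
    shared = {"counter": 0}

    def bump():
        shared["counter"] += 1

    worker = threading.Thread(target=bump)
    worker.start()
    worker.join()
    after_thread = shared["counter"]   # the thread modified our memory

    pid = os.fork()
    if pid == 0:
        bump()                         # modifies the child's private copy only
        os._exit(0)
    os.waitpid(pid, 0)
    after_child = shared["counter"]    # unchanged: the child's write is invisible here
    return after_thread, after_child


if __name__ == "__main__":
    print(compare_sharing())
```

The thread's increment is visible to the parent while the child's is not, because the forked process works on its own copy of the address space.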

When in Doubt, Ask About Context

These terms are context-sensitive. 'Process' in OS theory means one thing; 'process' in a business context means another. Always clarify which abstraction level is being discussed.

Summary: Program vs Process

We have established the foundational distinction upon which process management is built. Let's consolidate the key insights:

Key Takeaways

•A program is passive and static — It's a file containing instructions, stored on disk, with no execution state.
•A process is active and dynamic — It's a program in execution, living in memory, with state and resources.
•One program can spawn many processes — Each execution creates an independent instance with its own memory and state.
•The OS manages processes, not programs — Resources, scheduling, and protection are all per-process.
•Process memory is structured — Text, data, heap, and stack segments serve distinct purposes.
•Loading transforms program to process — A complex sequence sets up the execution environment.
•The distinction enables modern computing — Multi-user, multi-tasking systems depend on process isolation.

What's Next:

Now that we understand what distinguishes a program from a process, we'll examine the process as the fundamental unit of execution in an operating system. The next page explores why the OS centers its entire design around the process abstraction and what this means for system architecture.

Page Complete

You now understand the critical distinction between programs and processes. This foundational knowledge is essential for understanding everything that follows in process management—from process states and scheduling to inter-process communication and memory management.
