Every process that has ever executed on a computer—from the simplest shell script to the most complex database server—began its existence in a single, fundamental state: New. This state represents the critical transition point where a static program transforms into a dynamic, schedulable entity capable of competing for system resources.
The New state is far more than a mere bookkeeping label. It represents a complex orchestration of memory allocation, data structure initialization, security validation, and resource reservation. Understanding this state deeply reveals how operating systems balance resource efficiency with process creation speed, and how design decisions at this level ripple through system performance.
By the end of this page, you will understand the complete anatomy of process creation—from the moment a user requests program execution to the instant the process is admitted to the Ready queue. You'll learn about PCB initialization, memory mapping, resource allocation strategies, and admission control mechanisms that prevent system overload.
The New state represents a process that has been created but has not yet been admitted to the pool of executable processes. A process in this state exists in a liminal space—it has been acknowledged by the operating system, but it cannot yet compete for CPU time.
A process is in the New state from the moment the operating system accepts a creation request until the process is admitted to the pool of runnable processes.
This state exists because process creation is not instantaneous. Creating a process involves multiple operations—memory allocation, security validation, resource reservation—that take time and may fail. The New state provides a holding area while these operations complete.
You might wonder why we need a separate New state—why not create the process and immediately mark it Ready? The separation exists because: (1) Process creation may fail partway through, and cleanup is easier if the process hasn't been admitted to scheduler queues; (2) Admission control can evaluate whether to accept the process based on system load; (3) The new process may require parent process actions (like setting up pipes) before it can run.
| Characteristic | Description | Implications |
|---|---|---|
| CPU Eligibility | Process cannot receive CPU time | No context switching overhead for New processes |
| Memory Status | Virtual address space being established | Page tables may be incomplete; no physical frames allocated for code/data yet |
| PCB State | Being initialized with process metadata | Not yet linked into scheduler data structures |
| Resource Handles | Being allocated (file descriptors, etc.) | May inherit handles from parent process |
| Duration | Typically very brief (microseconds to milliseconds) | Extended duration indicates resource contention or admission control delay |
Processes don't spontaneously appear—they are explicitly created in response to specific system events. Understanding these triggers reveals the different pathways into the New state and the varying requirements each pathway imposes.
Operating systems textbooks identify four primary events that cause process creation:

1. System initialization (boot-time daemons and services)
2. A running process executing a process-creation system call
3. A user request to start a new program (from a shell or GUI)
4. Initiation of a batch job
Each trigger has distinct characteristics that affect process creation:
System initialization processes are privileged, often running as root/SYSTEM, with access to hardware and kernel services. They're created before user authentication is possible, so security context is established through boot configuration rather than user credentials.
User-requested processes inherit the user's security context, environment variables, and resource limits. The shell typically performs path resolution, argument parsing, and environment setup before invoking the creation syscall.
Process spawning is the most complex case. The child process must decide what to inherit from the parent (open files, environment, working directory) versus what to reset. UNIX's fork-exec model gives the parent explicit control over this inheritance.
Batch jobs operate in controlled environments with predefined resource allocations. The scheduling system may impose quotas, time limits, and priority adjustments not applied to interactive processes.
Every UNIX-like system has a special process with PID 1 (init, systemd on most Linux distributions, or launchd on macOS). This process is the ancestor of all other processes and bears special responsibilities: the kernel ignores signals sent to it unless it has installed handlers (so even root cannot casually kill it), it adopts orphaned processes, and its termination brings the system down (on Linux, a kernel panic). Understanding that all processes trace their lineage back to PID 1 helps conceptualize the process hierarchy.
The Process Control Block (PCB) is the kernel's representation of a process—a data structure containing all information the operating system needs to manage the process. During the New state, the PCB is allocated and populated with initial values.
Most operating systems maintain a pool of pre-allocated PCB structures to accelerate process creation. Allocating a PCB from a pool is O(1), while dynamic allocation might require memory allocator locks and potential page faults.
Key Insight: The PCB is a kernel-space structure. User processes cannot
directly access or modify their own PCB—they can only influence it
indirectly through system calls.
| Field Category | Specific Fields | Initial Value Source |
|---|---|---|
| Process Identity | PID, PPID, Process Group ID, Session ID | Assigned by kernel (PID), inherited from parent (PPID, PGID, SID) |
| Process State | State field set to 'NEW' | Constant value during creation |
| CPU Context | Program Counter, Stack Pointer, General Registers | PC set to program entry point; SP to top of user stack; registers zeroed or inherited |
| Memory Management | Page table pointer, memory limits, segments | Page table allocated; limits from parent or defaults; segments configured for program |
| Scheduling Info | Priority, scheduling class, time slice | Inherited from parent or set to defaults; may be adjusted by nice value |
| File Descriptors | Open file table, current directory, root directory | Typically inherited from parent; FDs 0,1,2 (stdin/stdout/stderr) essential |
| Signal Handling | Signal masks, pending signals, handlers | Handlers inherited; pending signals cleared; masks may be inherited |
| Resource Accounting | CPU time used, memory usage, I/O counts | All counters initialized to zero |
| Security Context | UID, GID, capabilities, SELinux context | Inherited from parent or set based on executable's setuid/setgid bits |
In UNIX systems using fork(), PCB initialization has a unique characteristic: the child's PCB is largely a copy of the parent's PCB. This is why forked children inherit open files, signal handlers, and environment—these are simply copied from the parent's PCB.
However, certain fields are reset rather than copied:

- Process identity: the child receives a fresh PID, and its PPID is set to the parent's PID
- Resource accounting: CPU time, memory usage, and I/O counters all start at zero
- Pending signals: cleared, so the child does not inherit signals queued for the parent
- File locks: record locks held by the parent are not inherited
Modern UNIX systems don't actually copy the parent's memory during fork(). Instead, both processes share the same physical pages, marked read-only. Only when either process writes to a page is a copy made. This copy-on-write (COW) optimization makes fork() nearly instantaneous, even for processes with gigabytes of memory.
The Linux task_struct (its PCB equivalent) occupies a few kilobytes—the exact size depends on kernel version and configuration—and references additional allocated structures. Windows' EPROCESS structure is larger still. PCB allocation and initialization is one of the most performance-critical paths in the kernel—it's executed for every process creation. Kernel developers carefully optimize this path, using techniques like slab allocation, lazy initialization, and pre-computed defaults.
While the PCB represents the process to the kernel, the virtual address space represents the process's computational environment. During the New state, this address space is established—though not necessarily populated with physical memory.
A typical process address space contains several segments:
```
┌─────────────────────────────┐  High address (e.g., 0x7FFFFFFFFFFF on x86-64)
│        Kernel Space         │  ← Not accessible from user mode
│   (shared by all procs)     │
├─────────────────────────────┤
│           Stack             │  ← Grows downward; local variables, return addresses
│             ↓               │
│                             │
│       (unmapped gap)        │  ← Guard region between stack and heap
│                             │
│             ↑               │
│           Heap              │  ← Grows upward; malloc/new allocations
├─────────────────────────────┤
│     BSS (Uninitialized)     │  ← Global variables initialized to zero
├─────────────────────────────┤
│     Data (Initialized)      │  ← Global variables with explicit values
├─────────────────────────────┤
│        Text (Code)          │  ← Executable instructions (read-only)
└─────────────────────────────┘  Low address (e.g., 0x400000 on Linux x86-64)
```

A crucial optimization: during the New state, the OS typically does not allocate physical memory for the entire address space. Instead, it creates page table entries that are marked "not present." When the process actually accesses a page, a page fault occurs, and only then is physical memory allocated.
This demand paging strategy means:

- Process creation is fast, because only minimal structures are allocated up front
- Physical memory is consumed only for pages the process actually touches
- The first access to each page pays a page-fault cost at runtime
The page table is the data structure that maps virtual addresses to physical addresses. During the New state, the kernel allocates the top-level page table, marks user-space entries "not present" so that first accesses trigger demand paging, and installs the shared kernel mappings. The page table is one of the few structures that must be allocated during the New state, because the CPU needs it to translate any memory access the process makes.
| Memory Type | Allocated During New? | When Actually Populated? |
|---|---|---|
| Page Table (top level) | Yes | Immediately during creation |
| Code Pages | No (virtual only) | On first instruction fetch (demand paging) |
| Stack Pages | No (virtual only) | On first stack access; may pre-allocate some |
| Heap Pages | No | On first malloc()/brk() call |
| Shared Library Code | No | Mapped from shared cache on first use |
| BSS Segment | No | Zero-page mapped; COW on first write |
For BSS (uninitialized global variables that should be zero), the kernel doesn't allocate unique pages. Instead, all BSS pages initially point to a single, read-only 'zero page.' When the process writes to any BSS location, a page fault occurs, a real page is allocated, zeroed, and mapped in place of the zero page reference. This means a process with 1GB of uninitialized arrays uses almost no physical memory until it actually writes to those arrays.
Beyond memory, a new process requires various system resources. The New state is when these resources are allocated or inherited from the parent process.
In UNIX systems, open file descriptors are inherited across fork() by default: the child's file descriptor table starts as a copy of the parent's, with both tables referring to the same underlying open file descriptions.
This inheritance is what makes shell pipelines work. When you type ls | grep foo, the shell:

1. Creates a pipe
2. Forks and execs ls, with stdout redirected to the pipe's write end
3. Forks and execs grep, with stdin redirected to the pipe's read end

Without file descriptor inheritance, inter-process communication would require complex setup.
Not all file descriptors should be inherited. Consider a server that opens sensitive configuration files—child processes handling untrusted requests shouldn't have access to these files.
The O_CLOEXEC flag (or fcntl(fd, F_SETFD, FD_CLOEXEC)) marks a file descriptor to be automatically closed when the process calls exec(). This prevents accidental leakage of file descriptors to child programs.
Resource limits constrain what a process can consume:
| Limit | Description | Default (typical) |
|---|---|---|
| RLIMIT_NOFILE | Max open file descriptors | 1024 soft, 1M hard |
| RLIMIT_NPROC | Max processes for user | 4096 |
| RLIMIT_AS | Max virtual address space | Unlimited |
| RLIMIT_STACK | Max stack size | 8MB soft, unlimited hard |
| RLIMIT_CPU | Max CPU time in seconds | Unlimited |
These limits are inherited during process creation and can be reduced (but not increased beyond the hard limit) by unprivileged processes.
One of the most common bugs in systems programming is file descriptor leakage. If a parent process opens many files without O_CLOEXEC and then forks/execs, child processes inherit (and keep open) descriptors they don't need. This wastes kernel resources and can cause security issues or resource exhaustion. Modern practice: always use O_CLOEXEC unless you specifically need descriptor inheritance.
Security context establishment during the New state determines what the process can access throughout its lifetime. This involves user identity, group membership, capabilities, and mandatory access control labels.
Every process carries a security identity consisting of:
Real UID/GID (ruid, rgid): The actual user who owns the process. Used for accounting and signals.
Effective UID/GID (euid, egid): Used for permission checks. Usually equals real UID but differs for setuid programs.
Saved UID/GID (suid, sgid): Preserved copy of euid/egid from before exec(). Allows temporary privilege drop and restoration.
Supplementary Groups: Additional group memberships beyond the primary group.
```
// Consider: /usr/bin/passwd owned by root with setuid bit set
// User "alice" (uid=1000) executes passwd

// Before exec():
//   ruid=1000, euid=1000, suid=1000   (alice's shell)

// During new process creation for passwd:
// 1. Kernel sees setuid bit on /usr/bin/passwd
// 2. Sets euid = owner of file (root = 0)
// 3. Process credentials become:
//      ruid=1000  (still alice)
//      euid=0     (root - for permission checks)
//      suid=0     (saved for potential restoration)

// Now passwd can:
// - Read/write /etc/shadow (requires root)
// - But the process remains accountable to alice (ruid=1000)
// - Can drop privileges: seteuid(getuid()) sets euid back to 1000
```

Modern Linux has moved beyond the simple root/non-root distinction to fine-grained capabilities. Instead of granting all-or-nothing root access, specific capabilities can be granted:
| Capability | Description |
|---|---|
| CAP_NET_BIND_SERVICE | Bind to ports < 1024 |
| CAP_SYS_ADMIN | Various sysadmin operations |
| CAP_NET_RAW | Use raw sockets |
| CAP_CHOWN | Change file ownership |
| CAP_SETUID | Manipulate process UIDs |
During New state, capabilities can be inherited from the parent, granted by file capability attributes, or dropped by the process itself.
Beyond traditional UNIX permissions, systems like SELinux (Linux), AppArmor (Linux), and Sandbox (macOS) impose additional constraints. SELinux, for example, assigns every process a security label (such as unconfined_t or httpd_t), and the policy determines which labels can access which resources. These contexts are established during the New state and cannot be elevated later (only further restricted).
Well-designed systems start processes with minimal privileges and grant additional access only when proven necessary. The New state is critical for this: credentials established here persist throughout process lifetime. A process that starts with excessive privileges cannot be meaningfully constrained later—the damage may already be done before constraints are applied.
Not every process that enters the New state successfully transitions to Ready. Admission control is the operating system's mechanism for deciding whether to accept a new process into the system.
Resources are finite. If the system creates processes without limits:

- The process table fills, blocking all new process creation—including the administrative shells needed to fix the problem
- Kernel memory for PCBs, page tables, and file tables is exhausted
- Scheduling overhead grows as ever more processes compete for CPU time
Admission control prevents system collapse by rejecting or delaying process creation when resources are scarce.
kernel.pid_max (Linux) or similar limits cap total processes system-wide. The default is typically 32768 and can be increased.
```
# Classic fork bomb - DO NOT RUN THIS
# Each invocation creates two children, which each create two more...
# Exponential growth exhausts the process table in seconds

:(){ :|:& };:

# Protection mechanisms:
# 1. RLIMIT_NPROC - limits processes per user
#    $ ulimit -u 100    # Limit to 100 processes
# 2. cgroups - limits processes per container/group
# 3. systemd - can limit processes per service
```

When admission control rejects a process:
Immediate Failure: The fork()/CreateProcess() call returns an error (EAGAIN, ENOMEM, etc.). The parent process must handle this gracefully.
Delayed Admission: In some real-time systems, new processes may wait in the New state until resources become available. This is less common in general-purpose systems.
Priority-based Admission: The scheduler might admit high-priority processes while rejecting low-priority ones under resource pressure.
Long-term admission (also called job scheduling): Decides which jobs from a batch queue should be initiated. Common in HPC and mainframe environments.
Short-term admission: The immediate decision of whether a fork() should succeed. This is what most developers encounter directly.
In modern systems, long-term admission is often delegated to container orchestrators (Kubernetes) or job schedulers (Slurm) rather than the kernel itself.
Linux's memory overcommit policy affects process creation. With overcommit enabled (default), fork() rarely fails due to memory—the kernel assumes not all allocated memory will be used. With strict overcommit (overcommit_memory=2), the kernel may reject fork() if committed memory would exceed physical RAM plus swap. This is a form of admission control based on memory promises.
Once initialization completes and admission control approves the process, the operating system performs the final transition from New to Ready. This transition is one of the most critical moments in a process's lifecycle—it marks the process as eligible for CPU time.
Depending on the scheduling algorithm, the "ready queue" might actually be:

- A single FIFO queue (simple round-robin schedulers)
- Multiple queues, one per priority level (multilevel feedback queues)
- A tree ordered by virtual runtime (Linux's CFS uses a red-black tree)
- Per-CPU run queues, to avoid lock contention on multiprocessors
Where the new process is inserted determines when it will run. High-priority processes go to the front; low-priority processes to the back.
If the new process has higher priority than the currently running process, a preemption may occur immediately: the kernel flags the running process for rescheduling, a context switch saves its state, and the scheduler dispatches the new process in its place.
This means a process can transition from New → Ready → Running in rapid succession—potentially running within microseconds of its creation.
The transition from New to Ready completes the process creation lifecycle. The process now has all resources it needs to run: a PCB with initialized state, a virtual address space, inherited resources, and security credentials. It awaits only CPU time—which the scheduler will provide based on its priority, the system's scheduling policy, and competition from other processes.
We've completed a comprehensive exploration of the New state—the first state in a process's lifecycle. Let's consolidate the key concepts:
What's next:
With the New state understood, we move to the Ready state—where processes wait for their turn on the CPU. We'll explore ready queue management, CPU-bound vs I/O-bound process characteristics, and how schedulers select the next process to run.
You now understand the New state comprehensively—from process creation triggers through PCB initialization, memory establishment, resource inheritance, security setup, admission control, and finally transition to the Ready state. This foundation prepares you for understanding how processes compete for and utilize CPU resources in subsequent states.