In the evolution of software deployment, few innovations have been as transformative as containers. Containers represent a fundamental shift in how we package, distribute, and run applications—solving problems that plagued software delivery for decades.
Before containers, the phrase "it works on my machine" was an all-too-common refrain. Applications would function perfectly in development environments but mysteriously fail in production. Dependencies conflicted, library versions clashed, and operations teams spent countless hours debugging environment inconsistencies.
Containers emerged as the elegant solution: bundle the application with its entire runtime environment, creating a portable, self-contained unit that runs identically everywhere. This simple idea has revolutionized software engineering, enabling microservices architectures, accelerating CI/CD pipelines, and fundamentally changing how we think about infrastructure.
By the end of this page, you will understand what containers are at a fundamental level, why they exist, and how they solve the software deployment problem. You'll learn the core concepts that underpin all container technologies—from Docker to Kubernetes—setting the foundation for deeper exploration of container internals.
To truly appreciate containers, we must first understand the problem they solve. Software deployment has always been challenging because applications don't exist in isolation—they depend on complex webs of libraries, runtimes, configurations, and system services.
The dependency matrix problem:
Consider a typical web application. It might depend on:

- a specific language runtime (say, Python 3.11)
- dozens of third-party packages, each pinned to a particular version
- a database client library compiled against specific system libraries
- configuration files, environment variables, and OS-level services
Now imagine you need to run this application alongside another application that requires Python 3.9, a different set of packages, and a MySQL client library. On a single server, version conflicts become inevitable. Even across servers, subtle differences in operating system versions, installed packages, or configuration files can cause mysterious failures.
These problems aren't new—operations engineers have battled them since the earliest days of computing. Solutions like static linking, virtualenv for Python, and configuration management tools (Puppet, Chef, Ansible) helped, but each solved only part of the problem. Containers represent a holistic solution that addresses the entire deployment stack.
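The dependency-matrix problem can be made concrete with a toy check. This is a minimal sketch, not a real package manager; the package names and versions are hypothetical examples matching the scenario above:

```python
# Toy illustration of the dependency-matrix problem: two applications
# deployed to the same host declare incompatible requirements.
# All package names and versions here are hypothetical examples.

def find_conflicts(app_a: dict, app_b: dict) -> dict:
    """Return requirements pinned to different versions by the two apps."""
    return {
        pkg: (app_a[pkg], app_b[pkg])
        for pkg in app_a.keys() & app_b.keys()
        if app_a[pkg] != app_b[pkg]
    }

app_a = {"python": "3.11", "flask": "2.3", "db-client": "postgresql-15"}
app_b = {"python": "3.9",  "flask": "2.3", "db-client": "mysql-8"}

conflicts = find_conflicts(app_a, app_b)
# On one shared host, 'python' and 'db-client' cannot satisfy both apps.
# With containers, each app ships its own copies and no conflict exists.
print(conflicts)
```

With containers, each application carries its own interpreter and libraries, so the intersection that causes conflicts simply never forms on the host.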
A container is an isolated user-space instance that packages an application along with all its dependencies—libraries, frameworks, configuration files, and binaries—into a single, portable unit. Crucially, containers share the host operating system's kernel while maintaining strict isolation from each other and from the host.
The key insight:
Unlike virtual machines that virtualize hardware and run complete operating systems, containers virtualize at the operating system level. They leverage kernel features to create isolated environments without the overhead of running multiple kernels. This makes containers:

- Lightweight: no guest kernel, boot process, or duplicate system services
- Fast: startup is measured in milliseconds rather than minutes
- Dense: a single host can run far more containers than virtual machines
```
┌───────────────────────────────────────────────────────────────┐
│                     HOST OPERATING SYSTEM                     │
│  ┌───────────────────────────────────────────────────────┐    │
│  │                     SHARED KERNEL                     │    │
│  │   (Linux kernel: namespaces, cgroups, capabilities)   │    │
│  └───────────────────────────────────────────────────────┘    │
│                              │                                │
│        ┌─────────────────────┼─────────────────────┐          │
│        │                     │                     │          │
│  ┌─────▼─────┐         ┌─────▼─────┐         ┌─────▼─────┐    │
│  │CONTAINER A│         │CONTAINER B│         │CONTAINER C│    │
│  ├───────────┤         ├───────────┤         ├───────────┤    │
│  │   App A   │         │   App B   │         │   App C   │    │
│  │  Python   │         │  Node.js  │         │   Java    │    │
│  │   3.11    │         │   20.x    │         │    21     │    │
│  │  Deps A   │         │  Deps B   │         │  Deps C   │    │
│  │  Config   │         │  Config   │         │  Config   │    │
│  │   /app    │         │   /app    │         │   /app    │    │
│  │   /lib    │         │   /lib    │         │   /lib    │    │
│  └───────────┘         └───────────┘         └───────────┘    │
│   Isolated:             Isolated:             Isolated:       │
│   filesystem            filesystem            filesystem      │
│   process tree          process tree          process tree    │
│   network stack         network stack         network stack   │
│   user IDs              user IDs              user IDs        │
└───────────────────────────────────────────────────────────────┘
```

Each container believes it's running on its own private system, but they all share the same kernel—making them extremely efficient.

The container abstraction provides:
Filesystem Isolation — Each container has its own root filesystem, independent of the host and other containers. A container can have completely different binaries, libraries, and configuration files than the host or its neighbors.
Process Isolation — Processes inside a container can only see other processes in the same container. Each container has its own PID namespace where the main process is PID 1.
Network Isolation — Containers get their own network stack, including interfaces, routing tables, and IP addresses. They can be connected through virtual networks.
Resource Limits — Each container can be constrained to use specific amounts of CPU, memory, and I/O bandwidth, preventing any single container from monopolizing host resources.
User Isolation — Containers can map user IDs to different ranges, so root inside a container may not be root on the host.
At their core, containers are still processes. When you run a container, you're starting a regular Linux process—but with special kernel restrictions that create isolation. Understanding this is crucial because it explains both the efficiency and the security model of containers.
The containerized process:
When the Docker daemon (or any container runtime) starts a container, it:

1. Creates a new set of namespaces (PID, mount, network, UTS, IPC, user) for the process
2. Sets up cgroup limits on CPU, memory, and I/O
3. Prepares the layered filesystem and mounts it as the process's root
4. Applies capability restrictions and security profiles
5. Executes the image's entry point as PID 1 inside the new namespaces
The result is a process that:

- appears in the host's process table like any other process
- sees only its own processes, filesystem, and network interfaces
- believes it is PID 1 on its own private system
- can be limited, paused, and killed through ordinary kernel mechanisms
```
# View all processes on the host (partial output)
$ ps aux
USER      PID %CPU %MEM    VSZ   RSS TTY STAT COMMAND
root        1  0.0  0.1 168000 13252 ?   Ss   /sbin/init
root      154  0.0  0.1  42700  8856 ?   Ss   /lib/systemd/systemd-journald
root      189  0.0  0.0  31232  3600 ?   Ss   /lib/systemd/systemd-udevd
...
root    12340  0.2  0.5 712000 42000 ?   Sl   /usr/bin/containerd
root    14521  0.1  0.3 312000 28000 ?   Sl   docker-containerd-shim
www-data 14537 0.0  0.2 156000 18000 ?   Ss   nginx: master process
...

# From INSIDE the container, the process sees a completely different world
$ docker exec container_nginx ps aux
USER      PID %CPU %MEM    VSZ   RSS TTY STAT COMMAND
root        1  0.5  0.2 156000 18000 ?   Ss   nginx: master process
nginx      24  0.0  0.1  86544  9600 ?   S    nginx: worker process
nginx      25  0.0  0.1  86544  9600 ?   S    nginx: worker process
nginx      26  0.0  0.1  86544  9600 ?   S    nginx: worker process
nginx      27  0.0  0.1  86544  9600 ?   S    nginx: worker process

# The container's nginx master is PID 14537 on the host, but PID 1 inside
# the container! The container cannot see any host processes—complete isolation.
```

This process-based nature is why containers are so much lighter than VMs. A VM runs an entire operating system kernel, boot process, and system services. A container is just a sandboxed process. Starting a container is essentially starting one process with special restrictions—which is why container startup is measured in milliseconds, not minutes.
The namespace illusion:
Linux namespaces are the key technology that makes containers possible. Each namespace type isolates a different system resource:
| Namespace | What It Isolates | Container Effect |
|---|---|---|
| PID | Process ID numbers | Container thinks its main process is PID 1 |
| MNT | Mount points | Container has its own filesystem hierarchy |
| NET | Network interfaces/routing | Container has its own IP, ports, routes |
| UTS | Hostname and domain name | Container has its own hostname |
| IPC | System V IPC, message queues | Container's IPC is isolated from others |
| USER | User and group IDs | Container can have its own root user |
| CGROUP | cgroup root directory | Container sees only its own cgroup hierarchy |
A container is essentially a process running in a different set of namespaces than the host. The kernel manages all processes identically—namespaces just change what each process can see.
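On a Linux host you can see namespace membership directly: every process exposes its namespaces under `/proc/<pid>/ns/`, and two processes share a namespace exactly when the links resolve to the same inode. A minimal sketch, assuming a Linux host (reading your own entries needs no privileges):

```python
# Inspect which namespaces the current process belongs to (Linux only).
# A container runtime creates a child whose links point at *different*
# namespace inodes, which is all "being in a container" means to the kernel.
import os

def namespace_ids(pid="self"):
    """Map namespace name -> identity string like 'pid:[4026531836]'."""
    ns_dir = f"/proc/{pid}/ns"
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))}

ids = namespace_ids()
# The bracketed number is the namespace inode; compare it across PIDs
# to discover which processes share a namespace.
print(ids.get("pid"), ids.get("net"), ids.get("mnt"))
```

Comparing the output for a host shell and for a process inside a container would show different inode numbers for every isolated namespace type.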
One of the most elegant innovations in container technology is the layered filesystem. Rather than copying entire filesystems for each container, container runtimes use union filesystems (also called overlay filesystems) to stack read-only layers with a thin writable layer on top.
How layers work:
A container image is composed of multiple layers, each representing a set of filesystem changes:

- a base layer (for example, a minimal Debian filesystem)
- a layer that installs a language runtime
- a layer that installs the application's dependencies
- a layer that copies in the application code itself
When running a container, the runtime:

1. Stacks the image's read-only layers on top of one another
2. Adds a thin, empty writable layer on top
3. Presents the union as a single, unified filesystem to the container
Changes made by the running container only affect the writable layer. The image layers remain immutable.
```
CONTAINER RUNTIME VIEW
======================

┌───────────────────────────────────────────────────────┐
│                    CONTAINER'S VIEW                   │
│                  (Unified filesystem)                 │
│  /                                                    │
│  ├── bin/   ◄── Merged from all layers below          │
│  ├── etc/   ◄── Upper layer wins if conflict          │
│  ├── lib/                                             │
│  ├── usr/                                             │
│  ├── var/                                             │
│  └── app/                                             │
│      └── myapp.py  ◄── From your application layer    │
└───────────────────────────────────────────────────────┘
                            │
                 Union Mount (OverlayFS)
                            │
┌───────────────────────────────────────────────────────┐
│     WRITABLE LAYER (Container-specific, ephemeral)    │
│  • Any files modified or created at runtime           │
│  • Deleted files marked with "whiteout" entries       │
│  • Lost when container is removed (unless committed)  │
└───────────────────────────────────────────────────────┘
                            │
┌───────────────────────────────────────────────────────┐
│          LAYER 4: Your Application (READ-ONLY)        │
│  COPY . /app                                          │
│  + /app/myapp.py                                      │
│  + /app/requirements.txt                              │
└───────────────────────────────────────────────────────┘
                            │
┌───────────────────────────────────────────────────────┐
│             LAYER 3: pip install (READ-ONLY)          │
│  RUN pip install flask requests                       │
│  + /usr/local/lib/python3.11/site-packages/flask/     │
│  + /usr/local/lib/python3.11/site-packages/requests/  │
└───────────────────────────────────────────────────────┘
                            │
┌───────────────────────────────────────────────────────┐
│         LAYER 2: Python Installation (READ-ONLY)      │
│  FROM python:3.11-slim                                │
│  + /usr/local/bin/python3.11                          │
│  + /usr/local/lib/python3.11/                         │
└───────────────────────────────────────────────────────┘
                            │
┌───────────────────────────────────────────────────────┐
│            LAYER 1: Debian Slim (READ-ONLY)           │
│  + /bin/, /lib/, /usr/, /etc/                         │
│    (minimal Debian filesystem)                        │
└───────────────────────────────────────────────────────┘
```

Modern container runtimes typically use OverlayFS, a union filesystem built into the Linux kernel since version 3.18 (2014). OverlayFS combines a 'lower' directory (the read-only layers) with an 'upper' directory (the writable layer) to present a unified 'merged' view. When a file from a lower layer is modified, it is first copied up to the upper directory (copy-on-write).
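The merge semantics are easy to model in a few lines. This is a pure-Python sketch of how a union mount resolves paths; it models the lookup rules only (upper layer wins, deletions become whiteouts), not OverlayFS itself, and the file contents are made up:

```python
# Model of union-filesystem path resolution: lower layers are stacked
# bottom-up, the writable upper layer wins on conflict, and deletions
# are recorded as "whiteout" markers instead of touching lower layers.

WHITEOUT = object()  # marker for a file deleted in the upper layer

def merged_view(lower_layers, upper):
    """Compute the unified filesystem the container sees."""
    view = {}
    for layer in lower_layers:           # bottom-most layer first
        view.update(layer)               # later layers shadow earlier ones
    for path, content in upper.items():  # writable layer applied last
        if content is WHITEOUT:
            view.pop(path, None)         # hidden, lower layers untouched
        else:
            view[path] = content
    return view

base   = {"/bin/sh": "shell", "/etc/os-release": "debian-12"}
python = {"/usr/local/bin/python3.11": "interpreter"}
app    = {"/app/myapp.py": "print('hi')"}

upper  = {"/etc/os-release": WHITEOUT,             # deleted at runtime
          "/app/config.json": '{"debug": true}'}   # created at runtime

fs = merged_view([base, python, app], upper)
```

Discarding `upper` and rebuilding the view restores a pristine image: that is exactly why removing a container's writable layer never corrupts the image layers beneath it.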
Understanding the container lifecycle is essential for working effectively with containers. Unlike traditional servers that run continuously for months or years, containers are designed to be ephemeral—created, run, stopped, and destroyed frequently.
The container lifecycle states:
```
CONTAINER LIFECYCLE
===================

 [Image]                                      [Registry]
    │                                             │
    │ docker create                               │ docker pull
    │ (sets up container but doesn't start)       │ (download image)
    ▼                                             │
┌─────────┐                                       │
│ Created │◄──────────────────────────────────────┘
└────┬────┘
     │ docker start
     │ (start the main process)
     ▼
┌─────────┐     docker pause       ┌──────────┐
│ Running │ ─────────────────────► │  Paused  │
│         │ ◄───────────────────── │          │
└────┬────┘     docker unpause     └──────────┘
     │
     │ Process exits OR docker stop
     │ (SIGTERM → SIGKILL if needed)
     ▼
┌─────────┐
│ Stopped │ ◄── Container still exists, filesystem preserved
└────┬────┘     Can be restarted with docker start
     │
     │ docker rm (remove container)
     ▼
┌─────────┐
│ Deleted │ ── Container no longer exists
└─────────┘    Writable layer is removed; image layers remain

ALTERNATIVE FLOW:
  docker run       =  docker create + docker start (most common)
  docker run --rm  =  auto-remove container when it stops
```

Lifecycle operations in detail:
| Operation | What Happens | State Change |
|---|---|---|
| create | Allocate resources, set up namespaces, prepare filesystem | ∅ → Created |
| start | Execute the entry point command in the prepared environment | Created → Running |
| stop | Send SIGTERM, wait grace period, then SIGKILL | Running → Stopped |
| kill | Immediately send SIGKILL (or specified signal) | Running → Stopped |
| pause | Freeze all processes using cgroups freezer | Running → Paused |
| unpause | Resume frozen processes | Paused → Running |
| restart | Stop then start (with optional delay) | Running → Stopped → Running |
| rm | Remove container and its writable layer | Stopped → Deleted |
| rm -f | Force remove (kill if running, then remove) | Any → Deleted |
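The table above amounts to a small state machine: each operation is only legal from certain states. A sketch of those semantics (not any runtime's actual implementation):

```python
# Container lifecycle encoded as a state machine. Transitions mirror the
# operations table: (current_state, operation) -> next_state.

TRANSITIONS = {
    ("created", "start"):   "running",
    ("running", "stop"):    "stopped",
    ("running", "kill"):    "stopped",
    ("running", "pause"):   "paused",
    ("paused",  "unpause"): "running",
    ("stopped", "start"):   "running",   # stopped containers can restart
    ("stopped", "rm"):      "deleted",
}

def apply(state, op):
    """Apply one lifecycle operation, rejecting illegal transitions."""
    if op == "rm -f":                    # force remove works from any state
        return "deleted"
    key = (state, op)
    if key not in TRANSITIONS:
        raise ValueError(f"cannot {op!r} a {state} container")
    return TRANSITIONS[key]

# Walk a container through a typical life: run, pause, resume, stop, remove.
state = "created"
for op in ["start", "pause", "unpause", "stop", "rm"]:
    state = apply(state, op)
```

Note what the model rules out: you cannot pause a stopped container or remove a running one without force, which is exactly the behavior the table describes.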
A key container best practice is treating containers as ephemeral—they can be destroyed and recreated at any time. Store persistent data in volumes, not in the container's writable layer. This enables rolling updates, scaling, and self-healing without data loss.
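The stop operation's SIGTERM-then-SIGKILL behavior can be sketched with an ordinary subprocess standing in for the container's PID 1. This is a POSIX-only illustration of the semantics, not how a container runtime is actually wired:

```python
# 'docker stop' in miniature: send SIGTERM, allow a grace period for a
# clean shutdown, then SIGKILL if the process is still alive.
import signal
import subprocess

def graceful_stop(proc, grace_seconds=2.0):
    proc.terminate()                       # SIGTERM: please shut down
    try:
        proc.wait(timeout=grace_seconds)   # grace period for cleanup
    except subprocess.TimeoutExpired:
        proc.kill()                        # SIGKILL: no more waiting
        proc.wait()
    return proc.returncode

# A long-running process standing in for a container's main process.
proc = subprocess.Popen(["sleep", "60"])
code = graceful_stop(proc)
# On POSIX, a process terminated by a signal reports -signum as its
# return code, so a SIGTERM death yields -signal.SIGTERM here.
```

Applications that trap SIGTERM get the grace period to flush buffers and close connections, which is why well-behaved container images install a SIGTERM handler in their main process.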
As containers gained popularity, the industry recognized the need for standardization. The Open Container Initiative (OCI) was formed in 2015 to create vendor-neutral standards, ensuring containers could be built and run across different tools and platforms.
OCI defines three specifications:
Runtime Specification (runtime-spec): Defines how to run a container—the configuration format, lifecycle operations, and execution environment. This ensures that a container created by one tool can be run by another compliant runtime.
Image Specification (image-spec): Defines the container image format—how layers are structured, how metadata is stored, and how images are addressed. This enables images to be built by one tool and run by another.
Distribution Specification (distribution-spec): Defines how to push and pull images from registries. This standardizes registry APIs so that Docker Hub, GitHub Container Registry, and private registries all speak the same protocol.
| Specification | Purpose | Key Components |
|---|---|---|
| Runtime-spec | Container execution | config.json (process, mounts, namespaces), lifecycle states, hooks |
| Image-spec | Image format | Manifest, config blob, layer digests, content addressing (SHA256) |
| Distribution-spec | Registry API | Push/pull endpoints, manifest operations, blob storage |
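Content addressing is the thread tying these specs together: every blob is named by the SHA-256 of its bytes, and manifests reference blobs by digest. A sketch of the digest scheme with made-up blob contents (the manifest fields shown are a simplified subset):

```python
# OCI content addressing: layers and config blobs are identified by the
# SHA-256 digest of their bytes; the manifest references those digests.
import hashlib

def oci_digest(blob: bytes) -> str:
    """Digest string in the 'sha256:<hex>' form used throughout OCI."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()

layer  = b"tar archive of filesystem changes (stand-in bytes)"
config = b'{"architecture": "amd64", "os": "linux"}'

manifest = {
    "schemaVersion": 2,
    "config": {"digest": oci_digest(config), "size": len(config)},
    "layers": [{"digest": oci_digest(layer), "size": len(layer)}],
}

# Identical bytes always produce identical digests, so registries and
# local stores can deduplicate shared layers and verify downloads.
# Per the distribution-spec, a registry would serve this manifest at
#   GET /v2/<name>/manifests/<reference>
# and each blob at
#   GET /v2/<name>/blobs/<digest>
```

Deduplication falls out for free: ten images built `FROM python:3.11-slim` all reference the same base-layer digests, so those layers are stored and downloaded once.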
The container ecosystem today:
Thanks to OCI standards, the container ecosystem is rich and diverse:

- Build tools: Docker and other OCI-compliant image builders
- Registries: Docker Hub, GitHub Container Registry, Harbor, and private registries
- Runtimes: containerd, CRI-O, and the low-level runc
- Orchestrators: Kubernetes and the platforms built on top of it
All these tools interoperate because they implement OCI standards. You can build an image with Docker, store it in Harbor, and run it with CRI-O on Kubernetes.
runc is the OCI reference runtime—a lightweight tool that spawns containers according to the OCI runtime-spec. It's used under the hood by Docker, containerd, and CRI-O. When you 'docker run' a container, the request eventually reaches runc, which creates the namespaces, sets up cgroups, and executes the container process.
Containers represent more than a technology improvement—they enable fundamental shifts in how software is built, deployed, and operated. Their impact extends across development practices, infrastructure management, and organizational structures.
The developer experience transformation:
Before containers:

- New developers spent days installing runtimes, databases, and tools by hand
- Setup documentation drifted out of date, and every machine ended up subtly different
- "Works on my machine" bugs consumed hours of debugging time
With containers:
```
git clone && docker compose up
```

Just as standardized shipping containers revolutionized global trade by providing a uniform way to transport goods regardless of what's inside or what vehicle carries them, software containers standardize application packaging. The 'what's inside' doesn't matter—the container provides the contract that all tooling understands.
We've established a solid foundation for understanding containers. Let's consolidate the key concepts:

- A container is an isolated user-space instance: an ordinary process wrapped in kernel-level isolation
- Containers share the host kernel, which makes them far lighter and faster to start than VMs
- Namespaces control what a containerized process can see; cgroups control what it can use
- Images are built from immutable, stacked layers, with a thin writable layer added at runtime
- Containers are ephemeral by design; persistent data belongs in volumes
- The OCI runtime, image, and distribution specifications keep the ecosystem interoperable
What's next:
Now that we understand what containers are conceptually, the next page dives deep into the comparison between containers and virtual machines. We'll explore their architectural differences, performance characteristics, security models, and when to use each technology.
You now understand containers at a conceptual level—what they are, how they work, and why they matter. This foundation is essential for everything that follows: comparing containers to VMs, understanding Docker internals, working with images, and orchestrating containers at scale.