Container images are the foundational artifacts of container technology. They capture everything needed to run an application—code, dependencies, configuration, and filesystem—in an immutable, portable format. An image built once runs identically everywhere, solving the fundamental challenge of software distribution.
But images are more than just "zip files of applications." They feature sophisticated layer mechanics that enable efficient storage and distribution, content-addressable storage that guarantees integrity, and standardized formats that ensure interoperability across the ecosystem.
Understanding how images work—from Dockerfile instructions to registry distribution—transforms you from an image consumer to an image architect. You'll build smaller, faster, more secure images and troubleshoot issues that mystify developers who treat images as black boxes.
By the end of this page, you will understand container image structure and the OCI image specification, how Dockerfiles create layers, best practices for building optimized and secure images, how content-addressable storage works, and how images are distributed through registries.
A container image consists of two main components: layers (the filesystem content) and metadata (configuration and layer references). Understanding this structure is essential for building and troubleshooting images.
Layers:
Each layer represents a set of filesystem changes—files added, modified, or deleted. Layers are stacked in a defined order, immutable once created, identified by the digest of their content, and shared between images that have them in common.
```
CONTAINER IMAGE STRUCTURE
=========================

IMAGE MANIFEST (JSON document that describes the image)

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:abc123...",                      <-- points to config blob
    "size": 7023
  },
  "layers": [                                          <-- ordered list of layers
    { "digest": "sha256:111...", "size": 32543210 },   <-- Layer 1
    { "digest": "sha256:222...", "size": 1543210 },    <-- Layer 2
    { "digest": "sha256:333...", "size": 432100 }      <-- Layer 3
  ]
}

CONFIG BLOB (JSON: runtime configuration)

{
  "architecture": "amd64",
  "os": "linux",
  "config": {
    "Env": ["PATH=/usr/local/..."],
    "Cmd": ["/bin/sh"],
    "WorkingDir": "/app",
    "ExposedPorts": {"80/tcp": {}},
    "Labels": {...}
  },
  "rootfs": {
    "type": "layers",
    "diff_ids": ["sha256:aaa...", "sha256:bbb...", "sha256:ccc..."]
  },
  "history": [...]
}

LAYER BLOBS (tar.gz: filesystem changes)

Layer 1: Base OS        /bin/sh, /lib/*, /etc/*  (Debian/Alpine/Ubuntu base)
Layer 2: Runtime        /usr/local/bin/python, /usr/local/lib/python3.11/
Layer 3: Application    /app/main.py, /app/requirements.txt

UNIFIED VIEW (what the container sees)

/
├── bin/                      (from Layer 1)
├── lib/                      (from Layer 1)
├── etc/                      (from Layer 1, maybe modified by Layer 2)
├── usr/
│   └── local/
│       ├── bin/python        (from Layer 2)
│       └── lib/python3.11/   (from Layer 2)
└── app/
    ├── main.py               (from Layer 3)
    └── requirements.txt      (from Layer 3)
```

Every blob (layer or config) is identified by its SHA256 digest. This means identical content always has the same identifier, enabling deduplication across images. If two images share a base layer, it's stored only once. This also provides integrity verification—if content is corrupted, the digest won't match.
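The deduplication and integrity guarantees fall out of the hashing alone; here is a minimal Python sketch of the idea, using only the standard library's `hashlib` (not any real registry API):

```python
import hashlib

def digest(blob: bytes) -> str:
    # Content-addressable ID: a blob's name is the hash of its bytes
    return "sha256:" + hashlib.sha256(blob).hexdigest()

base_layer = b"(tar.gz bytes of a shared base layer)"

# Two images built on the same base produce the same layer digest,
# so a registry stores the blob once and both manifests reference it.
image_a_layers = [digest(base_layer), digest(b"(python runtime layer)")]
image_b_layers = [digest(base_layer), digest(b"(node runtime layer)")]
assert image_a_layers[0] == image_b_layers[0]

# Integrity check: any corruption changes the digest
corrupted = base_layer + b"\x00"
assert digest(corrupted) != digest(base_layer)
```

This is why pulling an image by digest is immutable: the identifier and the content are mathematically bound.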
A Dockerfile is a script that defines how to build an image. Each instruction in a Dockerfile creates a new layer (with some exceptions). Understanding which instructions create layers is crucial for optimizing image size and build performance.
Layer-Creating Instructions:
| Instruction | Creates Layer? | Purpose | Example |
|---|---|---|---|
| FROM | Yes (uses existing) | Base image | FROM python:3.11-slim |
| RUN | Yes | Execute commands | RUN apt-get update && apt-get install -y curl |
| COPY | Yes | Copy local files | COPY ./app /app |
| ADD | Yes | Copy + extract archives | ADD archive.tar.gz /app |
| ENV | No (metadata) | Set environment variable | ENV NODE_ENV=production |
| EXPOSE | No (metadata) | Document port | EXPOSE 8080 |
| CMD | No (metadata) | Default command | CMD ["python", "app.py"] |
| ENTRYPOINT | No (metadata) | Main executable | ENTRYPOINT ["docker-entrypoint.sh"] |
| WORKDIR | No (metadata) | Working directory | WORKDIR /app |
| USER | No (metadata) | Runtime user | USER appuser |
| LABEL | No (metadata) | Image metadata | LABEL version="1.0" |
| ARG | No (build-time) | Build argument | ARG VERSION=latest |
```dockerfile
# Each instruction is shown with its layer impact

# Layer 1: Pull base image (reused from registry)
FROM python:3.11-slim

# No layer: just metadata
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

# No layer: just metadata
WORKDIR /app

# Layer 2: Install system dependencies
# Combine commands with && to minimize layers
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        gcc \
        libpq-dev \
    && rm -rf /var/lib/apt/lists/*  # Clean up in same layer!

# Layer 3: Install Python dependencies
# Copy requirements first for better caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Layer 4: Copy application code
# This changes frequently, so it's a separate layer
COPY . .

# No layer: just metadata
EXPOSE 8000
USER appuser

# No layer: just metadata
CMD ["gunicorn", "app:app", "-b", "0.0.0.0:8000"]
```

Docker caches layers and reuses them if nothing has changed. But cache invalidation cascades—if Layer 2 changes, Layers 3 and 4 must be rebuilt even if their instructions are identical. That's why we copy requirements.txt before copying all the code: if only code changes, the pip install layer is reused from cache.
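The cascade is easy to model: each layer's cache key chains the parent layer's key with the instruction (and, for COPY, the content being copied), so any change invalidates everything after it. Here is a toy Python sketch of that chaining—an illustration of the idea, not Docker's actual cache implementation:

```python
import hashlib

def cache_key(parent_key: str, instruction: str, content: str = "") -> str:
    # A layer is reusable only if parent, instruction, AND copied content match
    h = hashlib.sha256((parent_key + instruction + content).encode())
    return h.hexdigest()[:12]

def build_keys(steps):
    keys, parent = [], ""
    for instruction, content in steps:
        parent = cache_key(parent, instruction, content)
        keys.append(parent)
    return keys

before = build_keys([
    ("FROM python:3.11-slim", ""),
    ("COPY requirements.txt .", "flask==3.0"),   # dependency manifest
    ("RUN pip install -r requirements.txt", ""),
    ("COPY . .", "app code v1"),                 # application code
])
after = build_keys([
    ("FROM python:3.11-slim", ""),
    ("COPY requirements.txt .", "flask==3.0"),
    ("RUN pip install -r requirements.txt", ""),
    ("COPY . .", "app code v2"),                 # only the code changed
])

# The first three layers hit the cache; only the final COPY rebuilds
assert before[:3] == after[:3] and before[3] != after[3]
```

Reorder the steps so `COPY . .` comes before the pip install and every code edit would invalidate the dependency layer too—that is exactly the mistake good Dockerfile ordering avoids.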
Multi-stage builds are a powerful technique for creating small, production-optimized images. They allow you to use different base images for building and running your application, copying only the necessary artifacts to the final image.
The problem multi-stage solves:
Build-time dependencies often dwarf runtime requirements. A Go application might need the full Go toolchain, git, and downloaded module caches to compile—hundreds of megabytes—while at runtime it needs nothing but the compiled binary.
Without multi-stage builds, your image includes the entire Go toolchain. With multi-stage, you build in one image and copy just the binary to a minimal runtime image.
```dockerfile
# Multi-stage build for a Go application
# Each FROM starts a new stage; only the final stage goes into the image

#############################################
# STAGE 1: Build
#############################################
FROM golang:1.21-alpine AS builder

# Build dependencies
RUN apk add --no-cache git ca-certificates

WORKDIR /build

# Copy go module files first (cache dependency downloads)
COPY go.mod go.sum ./
RUN go mod download

# Copy source code
COPY . .

# Build binary
# CGO_ENABLED=0 for static binary
# -ldflags for smaller binary
RUN CGO_ENABLED=0 GOOS=linux go build \
    -ldflags='-w -s -extldflags "-static"' \
    -o /app/server ./cmd/server

#############################################
# STAGE 2: Runtime
#############################################
FROM scratch AS runtime
# 'scratch' is an empty image - minimal possible size

# Copy CA certificates for HTTPS
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/

# Copy only the compiled binary
COPY --from=builder /app/server /server

# Non-root user (numeric since no passwd in scratch)
USER 65534

EXPOSE 8080
ENTRYPOINT ["/server"]

# Result:
# - Builder stage: ~900MB (Go toolchain + dependencies)
# - Runtime stage: ~10MB (just the binary + certs)
# - Savings: 99% size reduction!
```

Common multi-stage patterns:
| Pattern | Build Stage | Runtime Stage | Use Case |
|---|---|---|---|
| Compiled Language | golang, rust | scratch, alpine | Go, Rust, C binaries |
| Node.js | node (npm install) | node:slim (run only) | React, Next.js apps |
| Python | python + build tools | python-slim | Apps with C extensions |
| Java | maven/gradle | openjdk:jre | Spring Boot apps |
| Testing | Full + test tools | Slim runtime | Run tests, deploy tested |
| CI/CD Assets | Full toolchain | nginx | Build static sites |
Use AS to name stages (AS builder, AS runtime). You can then use --target to build specific stages: 'docker build --target builder' builds only the builder stage, useful for debugging build issues or running tests in CI.
Optimized images are smaller, faster to build, faster to distribute, and more secure. Here are essential best practices organized by impact:
1. Choose Minimal Base Images:
| Base Image | Size | Use Case | Trade-offs |
|---|---|---|---|
| scratch | 0 MB | Static binaries (Go, Rust) | No shell, no utilities |
| alpine | ~5 MB | General purpose, small | musl libc (some compat issues) |
| distroless | ~20 MB | Security-focused, minimal | No shell, harder to debug |
| python:3.11-slim | ~150 MB | Python apps | Debian-based, glibc |
| python:3.11 | ~1 GB | Full development | Includes compilers, tools |
| ubuntu:22.04 | ~75 MB | General purpose | Familiar tools, larger |
```dockerfile
# ❌ BAD: Large, inefficient Dockerfile
FROM python:3.11

WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean  # Cleaning in separate layer doesn't help!
CMD ["python", "app.py"]

# Problems:
# - Full python image (~1GB)
# - Each RUN is a new layer
# - apt-get clean in separate layer = files still in previous layer
# - Copies everything including .git, __pycache__, etc.
# - Poor cache utilization

# ────────────────────────────────────────────────────────

# ✅ GOOD: Optimized Dockerfile
FROM python:3.11-slim AS base

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PIP_NO_CACHE_DIR=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1

FROM base AS builder

WORKDIR /app

# Install build dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc libpq-dev && \
    rm -rf /var/lib/apt/lists/*

# Install Python dependencies into a virtual env
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

COPY requirements.txt .
RUN pip install -r requirements.txt

FROM base AS runtime

# Copy virtual environment from builder
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Create non-root user
RUN useradd -m -r appuser && \
    mkdir -p /app && chown appuser:appuser /app
USER appuser
WORKDIR /app

# Copy only application code
COPY --chown=appuser:appuser ./src .

EXPOSE 8000
CMD ["gunicorn", "app:app", "-b", "0.0.0.0:8000"]

# Improvements:
# - Slim base image (~150MB vs ~1GB)
# - Multi-stage eliminates build dependencies
# - Combined RUN commands with cleanup
# - Virtual env for clean dependency copy
# - Non-root user for security
# - Proper cache ordering
```

Image security is critical—a vulnerability in your base image or dependencies becomes a vulnerability in every container running that image. Adopting security practices during image creation prevents costly remediation later.
Security Principles for Container Images:

- Run as a non-root user (a numeric UID when the base image has no passwd file)
- Use minimal bases (slim, distroless) to shrink the attack surface
- Pin base image and dependency versions for reproducibility
- Prefer exec-form ENTRYPOINT/CMD over shell form
- Scan images regularly and rebuild when base images are patched
```dockerfile
# Security-focused Dockerfile example

FROM python:3.11-slim-bookworm AS builder

# Don't run as root during build either
RUN groupadd -g 1001 appgroup && \
    useradd -m -u 1001 -g appgroup appuser

WORKDIR /app

# Pin versions for reproducibility
COPY requirements.txt .

# Switch users so `pip install --user` lands in /home/appuser/.local
USER appuser
RUN pip install --no-cache-dir --user -r requirements.txt

FROM gcr.io/distroless/python3-debian12 AS runtime
# Distroless: no shell, no package manager, minimal CVEs

# Copy from builder with correct ownership
COPY --from=builder /home/appuser/.local /home/appuser/.local
COPY --chown=1001:1001 ./src /app

# Use numeric UID (no passwd file in distroless)
USER 1001

ENV PATH="/home/appuser/.local/bin:$PATH"
WORKDIR /app

# Expose (documentation only)
EXPOSE 8000

# Fixed entrypoint (not shell form)
ENTRYPOINT ["python", "app.py"]

# No CMD default arguments (explicit > implicit)
```

Scanning Images for Vulnerabilities:
```
# Scan image with Trivy (popular open-source scanner)
$ trivy image myapp:latest

myapp:latest (debian 12.0)
===========================
Total: 23 (UNKNOWN: 0, LOW: 15, MEDIUM: 6, HIGH: 2, CRITICAL: 0)

┌───────────┬────────────────┬──────────┬───────────────────┐
│ Library   │ Vulnerability  │ Severity │ Installed Version │
├───────────┼────────────────┼──────────┼───────────────────┤
│ curl      │ CVE-2023-38545 │ HIGH     │ 7.88.1-10         │
│ openssl   │ CVE-2023-3446  │ MEDIUM   │ 3.0.9-1           │
└───────────┴────────────────┴──────────┴───────────────────┘

# Scan during CI/CD build to prevent vulnerable images from deploying
# Fail build if HIGH or CRITICAL vulnerabilities found:
$ trivy image --exit-code 1 --severity HIGH,CRITICAL myapp:latest

# Scan Dockerfile for misconfigurations
$ trivy config ./Dockerfile

Dockerfile (dockerfile)
========================
Tests: 23 (SUCCESSES: 20, FAILURES: 3)
Failures: 3

MEDIUM: Specify version for base image
─────────────────────────────────────
Use specific image version instead of 'latest'
```

Integrate vulnerability scanning into your CI/CD pipeline. Block deployments if critical vulnerabilities are found. Use registry-side scanning (Docker Hub, Harbor, ECR) to catch issues in stored images. Set up alerts for new CVEs affecting your images.
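The --exit-code gate is just a severity threshold over the scanner's findings. A toy Python sketch of the same policy (an illustration of the gating logic, not Trivy's actual report format):

```python
# Severity levels in ascending order, as used in scanner reports
SEVERITY_ORDER = ["UNKNOWN", "LOW", "MEDIUM", "HIGH", "CRITICAL"]

def should_block(findings, threshold="HIGH"):
    """Return True if any finding is at or above the threshold severity."""
    cutoff = SEVERITY_ORDER.index(threshold)
    return any(SEVERITY_ORDER.index(f["severity"]) >= cutoff for f in findings)

findings = [
    {"id": "CVE-2023-38545", "severity": "HIGH"},
    {"id": "CVE-2023-3446", "severity": "MEDIUM"},
]

assert should_block(findings) is True              # HIGH present -> block deploy
assert should_block(findings, "CRITICAL") is False  # nothing CRITICAL yet
```

In CI, this boolean becomes the process exit code, which is all the pipeline needs to stop a vulnerable image from reaching the registry.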
Container registries store and distribute images. Understanding registry operations—authentication, pushing, pulling, and tagging—is essential for working with containers in any environment.
Registry Concepts:
| Concept | Description | Example |
|---|---|---|
| Registry | Server that stores images | docker.io, gcr.io, ghcr.io |
| Repository | Collection of related images | library/nginx, mycompany/myapp |
| Tag | Version identifier within a repository | v1.0.0, latest, dev |
| Digest | Immutable content identifier | @sha256:abc123... |
| Manifest | JSON describing image layers | Retrieved on pull |
```
# Image naming: [REGISTRY/][NAMESPACE/]NAME[:TAG|@DIGEST]

# Examples:
docker.io/library/nginx:latest        # Docker Hub official
docker.io/myuser/myapp:v1.0.0         # Docker Hub user
gcr.io/my-project/backend:abcdef      # Google Container Registry
ghcr.io/myorg/frontend:main           # GitHub Container Registry
registry.example.com/api:2.1.0        # Private registry

# ───────────────────────────────────────────────────────

# Authentication
$ docker login                        # Docker Hub
$ docker login ghcr.io                # GitHub
$ docker login gcr.io                 # GCR (use gcloud helper)
$ cat ~/.docker/config.json           # Credentials stored here

# ───────────────────────────────────────────────────────

# Push workflow
# 1. Build image
$ docker build -t myapp:v1.0.0 .

# 2. Tag for target registry
$ docker tag myapp:v1.0.0 ghcr.io/myorg/myapp:v1.0.0
$ docker tag myapp:v1.0.0 ghcr.io/myorg/myapp:latest

# 3. Push to registry
$ docker push ghcr.io/myorg/myapp:v1.0.0
The push refers to repository [ghcr.io/myorg/myapp]
5f70bf18a086: Pushed                  # Only uploads layers registry doesn't have
2a15ad3e3c6b: Layer already exists
v1.0.0: digest: sha256:abc123... size: 1156

# ───────────────────────────────────────────────────────

# Pull workflow
$ docker pull nginx:1.25.3
1.25.3: Pulling from library/nginx
8a1e25ce7c4f: Already exists          # Layer already in local cache
a9c9c5e96c3c: Downloading [==>      ] 1.2MB/15MB
...
Digest: sha256:def456...
Status: Downloaded newer image for nginx:1.25.3

# Pull by digest (immutable, guaranteed exact image)
$ docker pull nginx@sha256:def456...

# ───────────────────────────────────────────────────────

# Inspect remote image without pulling
$ docker manifest inspect nginx:latest
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    {
      "mediaType": "...",
      "size": 1156,
      "digest": "sha256:abc...",
      "platform": {"architecture": "amd64", "os": "linux"}
    },
    { "platform": {"architecture": "arm64", "os": "linux"} }
  ]
}
```

:latest is just a convention, not a guarantee of the newest version. It's mutable—the image it points to changes. For production, always use specific version tags or digests. docker pull myapp:latest today might give a completely different image than tomorrow.
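The [REGISTRY/][NAMESPACE/]NAME[:TAG|@DIGEST] grammar can be parsed with plain string handling. Here is a simplified Python sketch; it deliberately ignores some of Docker's real normalization rules (such as the implicit docker.io/library/ prefix for official images):

```python
def parse_image_ref(ref: str):
    # Split off @digest first (digests contain ':', so handle before the tag)
    digest = None
    if "@" in ref:
        ref, digest = ref.split("@", 1)

    # A tag is a ':' after the last '/'; otherwise the ':' is a registry port
    tag, name = None, ref
    last_slash, colon = ref.rfind("/"), ref.rfind(":")
    if colon > last_slash:
        name, tag = ref[:colon], ref[colon + 1:]

    # A registry is a first path component containing '.' or ':' (ghcr.io, host:5000)
    registry = None
    first, _, rest = name.partition("/")
    if rest and ("." in first or ":" in first):
        registry, name = first, rest

    return {"registry": registry, "name": name, "tag": tag, "digest": digest}

r = parse_image_ref("ghcr.io/myorg/myapp:v1.0.0")
assert r == {"registry": "ghcr.io", "name": "myorg/myapp",
             "tag": "v1.0.0", "digest": None}
```

The "colon after the last slash" rule is why localhost:5000/app is a registry plus a name, not an image called localhost with tag 5000/app.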
Multi-Architecture Images:
Modern registries support multi-architecture images (manifest lists). A single tag can reference different images for amd64, arm64, etc. Docker automatically pulls the right architecture for the host it runs on.
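Resolution is a simple lookup over the platform fields shown in the manifest inspect output earlier; a minimal Python sketch of what the client does:

```python
def select_manifest(manifest_list, architecture, os="linux"):
    # The client walks the list and picks the entry matching its own platform
    for m in manifest_list["manifests"]:
        p = m["platform"]
        if p["architecture"] == architecture and p["os"] == os:
            return m["digest"]
    raise LookupError(f"no image for {os}/{architecture}")

manifest_list = {
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {"digest": "sha256:aaa...",
         "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:bbb...",
         "platform": {"architecture": "arm64", "os": "linux"}},
    ],
}

# An Apple Silicon or Graviton host resolves the same tag to the arm64 image
assert select_manifest(manifest_list, "arm64") == "sha256:bbb..."
```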
```
# Build and push multi-architecture image with buildx
$ docker buildx create --use

$ docker buildx build \
    --platform linux/amd64,linux/arm64 \
    --tag ghcr.io/myorg/myapp:v1.0.0 \
    --push .

# This creates:
# - Image for amd64 (Intel/AMD servers)
# - Image for arm64 (AWS Graviton, Apple Silicon)
# - Manifest list pointing to both

# When users pull, they automatically get the right architecture:
$ docker pull ghcr.io/myorg/myapp:v1.0.0   # Gets correct arch
```

Understanding how to inspect images helps with debugging, size optimization, and security audits. Docker provides several tools for exploring image internals.
Essential Inspection Commands:
```
# View image history (shows Dockerfile commands and layer sizes)
$ docker history nginx:latest
IMAGE          CREATED       CREATED BY                                     SIZE
a6bd71f48f68   2 weeks ago   /bin/sh -c #(nop) CMD ["nginx" "-g" "daemon…   0B
<missing>      2 weeks ago   /bin/sh -c #(nop) STOPSIGNAL SIGQUIT           0B
<missing>      2 weeks ago   /bin/sh -c #(nop) EXPOSE 80                    0B
<missing>      2 weeks ago   /bin/sh -c #(nop) ENTRYPOINT ["/docker-entr…   0B
<missing>      2 weeks ago   /bin/sh -c set -x && apt-get update && apt…    88.5MB
<missing>      2 weeks ago   /bin/sh -c #(nop) ENV NGINX_VERSION=1.25.3     0B
...

# Detailed image metadata (JSON format)
$ docker inspect nginx:latest | jq '.[0].Config'
{
  "Hostname": "",
  "Env": ["PATH=...", "NGINX_VERSION=1.25.3"],
  "Cmd": ["nginx", "-g", "daemon off;"],
  "ExposedPorts": {"80/tcp": {}},
  "Labels": {...}
}

# View image layers and sizes
$ docker inspect nginx:latest | jq '.[0].RootFS.Layers'
[
  "sha256:aad...",   # Layer 1
  "sha256:bbd...",   # Layer 2
  ...
]

# ───────────────────────────────────────────────────────

# Dive: Interactive layer exploration (third-party tool)
# Install: https://github.com/wagoodman/dive
$ dive nginx:latest

# Shows:
# - Each layer's contents
# - What was added/modified/deleted
# - Wasted space (files deleted in later layers)
# - Image efficiency score

# Example output:
# Layer 1: Base Debian (+78 MB)
# Layer 2: apt-get install nginx (+88 MB)
#   Added: /usr/sbin/nginx, /etc/nginx/*, ...
# Layer 3: Configuration (+1.2 KB)
#   Modified: /etc/nginx/nginx.conf

# ───────────────────────────────────────────────────────

# Export image filesystem for analysis
$ docker save nginx:latest | tar -xf - -C /tmp/nginx-image
$ ls /tmp/nginx-image
blobs/  index.json  manifest.json  oci-layout

# Or export a container's filesystem
$ docker export $(docker create nginx:latest) | tar -tf - | head -20
bin/
boot/
dev/
...
```

Use dive to identify layer bloat. Common issues: deleted files still present in previous layers, package caches not cleaned, unnecessary build dependencies included. Fix these by combining RUN commands and cleaning in the same layer, or by using multi-stage builds.
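The "deleted files still in previous layers" problem comes straight from the layer model: a later layer can only mask a file, never reclaim its bytes. A toy Python model of the unified view versus what actually ships (simplified; real OverlayFS marks deletions with whiteout files):

```python
# Each layer maps path -> size in bytes; None means "deleted in this layer"
layers = [
    {"/var/cache/apt/pkgs.bin": 90_000_000, "/usr/bin/app": 5_000_000},  # Layer 1
    {"/var/cache/apt/pkgs.bin": None},  # Layer 2: rm -rf the cache
]

def unified_view(layers):
    merged = {}
    for layer in layers:                   # later layers win
        for path, size in layer.items():
            if size is None:
                merged.pop(path, None)     # deletion masks the file from view
            else:
                merged[path] = size
    return merged

visible = sum(unified_view(layers).values())
on_disk = sum(s for layer in layers for s in layer.values() if s)

assert visible == 5_000_000    # what the container sees
assert on_disk == 95_000_000   # what the image actually distributes
# Deleting the cache in a later layer hides it but still ships its 90 MB
```

Running the rm in the same RUN instruction that created the cache keeps it out of the layer entirely, which is why the optimized Dockerfiles above clean up inside the install step.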
We've explored container images in depth—from their structure and creation to optimization and distribution. Key concepts to retain:

- An image is a manifest, a config blob, and an ordered set of content-addressed layers; identical layers are stored once and verified by digest.
- FROM, RUN, COPY, and ADD create layers; order instructions from least to most frequently changed so the build cache stays warm.
- Multi-stage builds separate build-time toolchains from runtime images, often cutting size by an order of magnitude.
- Minimal bases, non-root users, pinned versions, and CI vulnerability scanning keep images secure.
- Tags are mutable pointers; digests are immutable. Pin versions or digests in production.
- docker history, docker inspect, and dive reveal where layer bloat hides.
What's next:
Now that we understand container images and how to build them effectively, the final page explores container orchestration—how systems like Kubernetes manage containers at scale, handling scheduling, scaling, networking, and self-healing across clusters of machines.
You now understand the complete lifecycle of container images—from Dockerfile to registry to running container. This knowledge is foundational for building production-grade containerized applications with optimal size, performance, and security characteristics.