Cloud Computing

Level: Advanced

Duration: 75 mins

Topic: Cloud Computing

IaaS, PaaS, SaaS

The Three Pillars of Cloud Computing

Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) represent the canonical taxonomy of cloud computing service models. While we introduced these concepts previously, this page explores each model with the depth required to make informed architectural decisions and understand their operating systems foundations.

Think of these models as layers of abstraction in a distributed operating system spanning data centers worldwide. Each layer hides complexity from the layer above, enabling specialization—just as operating systems hide hardware complexity from applications, cloud service models hide infrastructure complexity from developers and users.

What You Will Master

By completing this page, you will understand the architectural underpinnings of IaaS, PaaS, and SaaS; recognize their operating systems implications; identify appropriate use cases for each model; and develop a decision framework for service model selection.

IaaS Architecture and Implementation

Infrastructure as a Service delivers fundamental computing resources—compute, storage, and networking—as programmatically controlled services. IaaS is built upon the virtualization technologies we've explored throughout this curriculum, combined with sophisticated control planes that orchestrate resources at scale.

The IaaS Control Plane:

Every IaaS platform comprises two fundamental layers:

Data Plane — The actual compute, storage, and network resources executing workloads
Control Plane — The APIs, schedulers, and orchestrators that provision and manage resources

┌─────────────────────────────────────────────────────────────────┐
│                        CONTROL PLANE                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐              │
│  │     API     │  │  Scheduler  │  │  Resource   │              │
│  │   Gateway   │──│   Engine    │──│   Manager   │              │
│  └─────────────┘  └─────────────┘  └─────────────┘              │
│         │               │                   │                    │
│         └───────────────┴───────────────────┘                    │
│                         │                                        │
├─────────────────────────┼────────────────────────────────────────┤
│                    DATA PLANE                                    │
│         ┌───────────────┴───────────────┐                        │
│  ┌──────┴──────┐  ┌──────┴──────┐  ┌─────┴──────┐               │
│  │   Compute   │  │   Storage   │  │  Network   │               │
│  │   Cluster   │  │   Cluster   │  │   Fabric   │               │
│  │             │  │             │  │            │               │
│  │ ┌──┐ ┌──┐  │  │ ┌──┐ ┌──┐  │  │ ┌──┐ ┌──┐ │               │
│  │ │VM│ │VM│  │  │ │HD│ │HD│  │  │ │SW│ │SW│ │               │
│  │ └──┘ └──┘  │  │ └──┘ └──┘  │  │ └──┘ └──┘ │               │
│  └─────────────┘  └─────────────┘  └───────────┘               │
└─────────────────────────────────────────────────────────────────┘
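To make the split concrete, here is a minimal sketch of what a control-plane scheduler does when a provisioning request arrives: it picks a data-plane host with enough free capacity and reserves it. The `Host` class, the `place_vm` function, and the first-fit policy are illustrative assumptions, not any provider's actual algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    """A data-plane server with remaining capacity (hypothetical model)."""
    name: str
    free_vcpus: int
    free_mem_gib: int


def place_vm(hosts: List[Host], vcpus: int, mem_gib: int) -> Optional[str]:
    """First-fit placement: reserve capacity on the first host that fits."""
    for host in hosts:
        if host.free_vcpus >= vcpus and host.free_mem_gib >= mem_gib:
            host.free_vcpus -= vcpus
            host.free_mem_gib -= mem_gib
            return host.name
    return None  # no capacity; a real control plane would queue or reject


hosts = [Host("host-a", 4, 16), Host("host-b", 32, 128)]
first = place_vm(hosts, vcpus=8, mem_gib=32)   # host-a is too small
second = place_vm(hosts, vcpus=2, mem_gib=8)   # fits on host-a
third = place_vm(hosts, vcpus=64, mem_gib=8)   # fits nowhere
```

Production schedulers weigh many more signals (fault domains, anti-affinity, fragmentation), but the shape is the same: the control plane decides, the data plane executes.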

Compute Virtualization Stack:

IaaS compute relies on hypervisors to multiplex physical servers across multiple virtual machines:

Type-1 Hypervisors in Production:

KVM (Kernel-based Virtual Machine) — Linux kernel module treating the kernel as a hypervisor; used by Google Cloud, AWS (Nitro), and many OpenStack deployments
VMware ESXi — Bare-metal hypervisor dominating enterprise private clouds
Xen — Paravirtualization pioneer; historically used by AWS, still common in hosting environments
Microsoft Hyper-V — Powers Azure and Windows Server virtualization

The Nitro Architecture (AWS Case Study):

AWS developed custom hardware (Nitro cards) to offload virtualization overhead from the CPU:

Nitro Hypervisor — Minimal hypervisor based on KVM with most functionality offloaded
Nitro Cards — Custom ASICs handling I/O virtualization, security monitoring, and instance management
Nitro Enclaves — Isolated compute environments for processing sensitive data

This architecture delivers near-bare-metal performance because I/O virtualization runs on the Nitro cards rather than consuming host CPU cycles.

IaaS Compute Instance Types

•General Purpose — Balanced CPU-to-memory ratio (e.g., AWS m6i, Azure D-series); OS considerations: standard kernel configurations, typical memory management patterns
•Compute Optimized — High CPU-to-memory ratio for compute-intensive workloads (e.g., c6i, Fsv2); OS tuning: CPU affinity, NUMA awareness, CPU frequency governor settings
•Memory Optimized — Large memory capacity for in-memory databases (e.g., r6i, Ev4); OS tuning: huge pages, memory overcommit policies, NUMA topology
•Storage Optimized — High I/O performance with local NVMe (e.g., i3en, Lsv2); OS considerations: I/O scheduler selection, filesystem tuning, device driver optimization
•Accelerated Computing — GPU/FPGA/ASIC instances for ML/HPC (e.g., p4d, NC-series); OS requirements: specific kernel versions, accelerator drivers, CUDA toolkit
•Bare Metal — Direct hardware access without hypervisor layer; full OS control including custom kernels and hypervisor deployment

Instance Selection Strategy

Start with general-purpose instances during development, then profile workloads to identify bottlenecks. CPU-bound → compute optimized. Memory pressure → memory optimized. I/O wait → storage optimized. The goal is matching instance characteristics to workload requirements at minimum cost.
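The rule of thumb above can be sketched as a small function. The function name, the 0.8 utilization threshold, and the 0.2 I/O-wait cutoff are illustrative assumptions; real thresholds should come from your own profiling data.

```python
def recommend_family(cpu_util: float, mem_util: float, iowait: float,
                     threshold: float = 0.8) -> str:
    """Map the dominant saturated resource to an instance family.

    Inputs are fractions in [0, 1] taken from workload profiling.
    The cutoffs are placeholders, not provider guidance.
    """
    if iowait >= 0.2:
        return "storage-optimized"
    if mem_util >= threshold and mem_util >= cpu_util:
        return "memory-optimized"
    if cpu_util >= threshold:
        return "compute-optimized"
    return "general-purpose"


print(recommend_family(cpu_util=0.95, mem_util=0.40, iowait=0.01))
```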

IaaS Storage and Networking

Storage Virtualization:

IaaS provides multiple storage abstractions, each with distinct performance characteristics and OS interaction patterns:

Block Storage: Virtual disks that attach to VMs like physical drives. Examples: AWS EBS, Azure Managed Disks, GCP Persistent Disks.

Network-attached storage presenting as block devices to the guest OS
Guest OS sees standard block device (e.g., /dev/sda, /dev/nvme0n1)
Filesystems (ext4, XFS, NTFS) created on top, just like physical disks
Replication handled by the storage backend, transparent to guest OS

Object Storage: Virtually unlimited storage for unstructured data. Examples: S3, Azure Blob, Cloud Storage.

Accessed via HTTP/REST APIs, not mounted as filesystems
No OS-level interaction; applications use SDKs or CLI tools
Historically eventual consistency; major services such as S3 now provide strong read-after-write consistency

File Storage: Managed NFS/SMB filesystems. Examples: EFS, Azure Files, Filestore.

Mount directly in guest OS using standard NFS/SMB clients
Kernel NFS client communicates with managed file server
POSIX semantics with some cloud-specific behaviors

IaaS Storage Comparison
Aspect	Block Storage	Object Storage	File Storage
Access Method	Block device	HTTP API	NFS/SMB mount
OS Visibility	Block device	Application-level	Filesystem mount
Performance	Low latency, high IOPS	High throughput, higher latency	Medium latency
Scalability	Volume size limits	Virtually unlimited	Capacity tiers
Use Cases	Databases, boot volumes	Backups, media, data lakes	Shared storage, CMS
Consistency	Strong	Strong or eventual	Strong

Network Virtualization:

IaaS networking relies on Software-Defined Networking (SDN) to create isolated virtual networks:

Virtual Private Cloud (VPC):

Isolated network partition with customer-defined IP address ranges
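Subnets subdivide the VPC address range; on AWS each subnet sits in a single availability zone, so redundancy comes from spreading subnets across zones (Azure and GCP subnets are regional)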
Route tables define traffic flow between subnets and external networks

Virtual Network Interfaces:

Elastic Network Interfaces (ENIs) attach to instances
Guest OS sees a standard Ethernet device; a paravirtual or SR-IOV driver (e.g., virtio-net, or ENA on AWS) communicates with the virtual NIC
MAC addresses and IPs assigned by cloud control plane

Security Groups and Network ACLs:

Stateful firewall rules (security groups) evaluated per network interface by the virtualization layer (hypervisor or offload card), outside the guest
Stateless ACLs evaluated at subnet boundary
No guest OS configuration required; filtering happens before packets reach the guest's network stack
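The difference between stateful and stateless filtering is easiest to see with return traffic. The following is a toy model, not a provider API: `StatefulFirewall`, `stateless_allow_inbound`, and the documentation-range IP addresses are all made up for illustration.

```python
from typing import Set, Tuple

Flow = Tuple[str, int, str, int]  # (src_ip, src_port, dst_ip, dst_port)


class StatefulFirewall:
    """Security-group-like: replies to tracked outbound flows are allowed."""

    def __init__(self, allowed_inbound_ports: Set[int]) -> None:
        self.allowed_inbound_ports = allowed_inbound_ports
        self.tracked: Set[Flow] = set()

    def outbound(self, flow: Flow) -> None:
        self.tracked.add(flow)  # remember the connection

    def allow_inbound(self, flow: Flow) -> bool:
        src_ip, src_port, dst_ip, dst_port = flow
        reverse = (dst_ip, dst_port, src_ip, src_port)
        return reverse in self.tracked or dst_port in self.allowed_inbound_ports


def stateless_allow_inbound(flow: Flow, allowed_inbound_ports: Set[int]) -> bool:
    """ACL-like: each packet is judged alone, with no connection memory."""
    return flow[3] in allowed_inbound_ports


sg = StatefulFirewall(allowed_inbound_ports={443})
sg.outbound(("10.0.0.5", 50000, "203.0.113.10", 80))   # instance calls out
reply = ("203.0.113.10", 80, "10.0.0.5", 50000)         # server answers
print(sg.allow_inbound(reply), stateless_allow_inbound(reply, {443}))
```

The stateless check rejects the reply because port 50000 is not explicitly allowed, which is why stateless ACLs usually need a rule covering the ephemeral port range.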

Load Balancing:

Application Load Balancers (Layer 7) inspect HTTP headers
Network Load Balancers (Layer 4) handle TCP/UDP with ultra-low latency
Traffic distribution transparent to guest OS

Network Performance Considerations

Network bandwidth is often tied to instance size: larger instances get more network capacity. For high-throughput workloads, consider placement groups (cluster placement for low latency, spread placement for resilience) and enhanced networking (SR-IOV-based adapters such as ENA) to bypass the hypervisor's software network stack. Userspace frameworks such as DPDK go further by bypassing the guest kernel's network stack.

PaaS Architecture Deep Dive

Platform as a Service abstracts infrastructure management, providing a complete development and deployment environment. PaaS platforms are essentially distributed operating systems for cloud applications—managing process scheduling, resource allocation, and inter-process communication across clusters of machines.

The PaaS Runtime Model:

┌─────────────────────────────────────────────────────────────────┐
│                     DEVELOPER INTERACTION                        │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  git push │ docker push │ CLI deploy │ CI/CD pipeline      │  │
│  └─────────────────────────┬─────────────────────────────────┘  │
├─────────────────────────────┼────────────────────────────────────┤
│                    PLATFORM LAYER                                │
│  ┌─────────────────────────┴─────────────────────────────────┐  │
│  │                    BUILD SYSTEM                            │  │
│  │     (Source → Container Image → Registry)                  │  │
│  └─────────────────────────┬─────────────────────────────────┘  │
│  ┌─────────────────────────┴─────────────────────────────────┐  │
│  │                   ORCHESTRATOR                             │  │
│  │     (Scheduler, Service Discovery, Load Balancing)         │  │
│  └─────────────────────────┬─────────────────────────────────┘  │
│  ┌─────────────────────────┴─────────────────────────────────┐  │
│  │                  RUNTIME CONTAINERS                        │  │
│  │  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐           │  │
│  │  │ App    │  │ App    │  │ App    │  │ App    │           │  │
│  │  │ Pod 1  │  │ Pod 2  │  │ Pod 3  │  │ Pod N  │           │  │
│  │  └────────┘  └────────┘  └────────┘  └────────┘           │  │
│  └───────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │            MANAGED SERVICES (Hidden from Developer)        │  │
│  │    Databases │ Caches │ Message Queues │ Secrets Mgmt      │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

PaaS Component Analysis:

1. Build Systems and Buildpacks: Buildpacks detect application type and construct container images:

Analyze source code to identify language/framework
Download appropriate runtime (Node.js, Python, Java, etc.)
Install dependencies (npm, pip, Maven)
Configure entry points and health checks
Produce OCI-compliant container image
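The detection step can be sketched in a few lines. This is a simplified illustration: the `MARKERS` table and the `detect` function are hypothetical, and real buildpacks use considerably richer detection logic.

```python
from typing import Dict, Iterable, Optional, Tuple

# Marker file -> (language, dependency install command). Hypothetical table.
MARKERS: Dict[str, Tuple[str, str]] = {
    "package.json": ("nodejs", "npm install"),
    "requirements.txt": ("python", "pip install -r requirements.txt"),
    "pom.xml": ("java", "mvn package"),
}


def detect(files: Iterable[str]) -> Optional[Dict[str, str]]:
    """Return a build plan for the first marker file found, else None."""
    present = set(files)
    for marker, (language, install_cmd) in MARKERS.items():
        if marker in present:
            return {"language": language, "install": install_cmd}
    return None  # no buildpack claims this source tree


plan = detect(["app.py", "requirements.txt", "README.md"])
```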

2. Container Orchestration: The platform scheduler places containers on infrastructure:

Resource requests and limits (CPU, memory) inform placement decisions
Health checks trigger automatic restarts
Horizontal Pod Autoscaler adjusts replica count based on metrics
Service mesh handles inter-service communication
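For the autoscaling step, Kubernetes documents the Horizontal Pod Autoscaler's core rule as desired = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance of 1. The sketch below applies that rule; the function name, the replica bounds, and the sample numbers are assumptions for illustration.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10, tolerance: float = 0.1) -> int:
    """Scale replica count proportionally to metric pressure."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no change
    wanted = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, wanted))


# 4 replicas averaging 90% CPU against a 60% target
print(desired_replicas(4, current_metric=90, target_metric=60))
```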

3. Managed Services: PaaS platforms bundle common supporting services:

Managed Databases: PostgreSQL, MySQL, Redis—provisioned via API, scaling handled automatically
Message Queues: RabbitMQ, Pub/Sub—for asynchronous communication patterns
Secrets Management: Vault integration, automatic rotation, secure injection into runtime

PaaS Platform Examples

•Heroku — Pioneer of 'git push' deployment; buildpack-based; PostgreSQL, Redis add-ons
•Google App Engine — Standard and Flexible environments; automatic scaling; deep GCP integration
•AWS Elastic Beanstalk — Managed EC2 underneath; supports Docker, many language runtimes
•Azure App Service — Windows and Linux apps; integrated DevOps pipelines; hybrid connectivity

Serverless Platforms

•AWS Lambda — Event-driven functions; 15-minute timeout; 10GB memory; container image support
•Google Cloud Functions — HTTP and event triggers; tight Pub/Sub integration
•Azure Functions — Durable functions for workflows; consumption and premium plans
•Cloudflare Workers — Edge execution; V8 isolates instead of containers; cold starts of a few milliseconds or less

The Cold Start Problem

Serverless platforms scale to zero when idle, which eliminates idle compute costs but introduces cold start latency. When a function is invoked after an idle period, the platform must allocate a container, download the runtime image, initialize the language runtime, and load your code. This can take from roughly 100 ms to several seconds, depending on language and dependencies.
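How often you pay the cold start depends on traffic gaps relative to how long the platform keeps an idle instance around. The sketch below is a deliberately simplified model: it assumes a single instance and a fixed, known idle timeout, which real platforms neither guarantee nor publish precisely.

```python
from typing import List, Optional


def count_cold_starts(invocation_times_s: List[float],
                      idle_timeout_s: float) -> int:
    """Count invocations that find no warm instance.

    Assumes one instance that is reclaimed after `idle_timeout_s`
    seconds without traffic (a simplification).
    """
    cold = 0
    last: Optional[float] = None
    for t in sorted(invocation_times_s):
        if last is None or t - last > idle_timeout_s:
            cold += 1
        last = t
    return cold


print(count_cold_starts([0, 10, 20, 1000, 1005], idle_timeout_s=300))
```

Under this model, steady traffic with gaps shorter than the timeout pays the cold start once, while sparse traffic pays it on every call.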

PaaS Operating Systems Implications

Although PaaS abstracts the operating system, understanding OS concepts remains crucial for performance tuning, debugging, and making informed platform choices.

Resource Limits and Cgroups:

PaaS platforms use Linux cgroups to enforce resource limits:

┌────────────────────────────────────────────────────────────────┐
│                    CONTAINER RUNTIME                            │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                   YOUR APPLICATION                        │  │
│  │                                                           │  │
│  │   Memory Limit: 512MB  │  CPU Limit: 0.5 cores           │  │
│  │   File Descriptors: 1024  │  Disk I/O: Throttled         │  │
│  │                                                           │  │
│  └──────────────────────────────────────────────────────────┘  │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                   CGROUP SUBSYSTEMS                       │  │
│  │   memory: 536870912 bytes max                            │  │
│  │   cpu: 50000/100000 quota (50%)                          │  │
│  │   pids: 100 max processes                                │  │
│  │   blkio: 10MB/s read, 10MB/s write                       │  │
│  └──────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────┘

What happens when limits are exceeded:

Memory: OOM Killer terminates the process; container restarts
CPU: Process is throttled; request latency increases, but the process doesn't crash
PIDs: fork/clone syscalls fail with EAGAIN; the process can't spawn new threads or child processes
Disk I/O: Requests queue up; latency increases dramatically
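An application can discover its own limits rather than guess them. On cgroup v2, the `cpu.max` file holds "<quota> <period>" (or "max <period>" when unlimited) and `memory.max` holds a byte count (or "max"). The helpers below parse that content; the function names are hypothetical, and the example strings mirror the values in the diagram above.

```python
from typing import Optional


def parse_cpu_max(text: str) -> Optional[float]:
    """Convert cgroup v2 `cpu.max` content to a core count (None = unlimited)."""
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def parse_memory_max(text: str) -> Optional[int]:
    """Convert cgroup v2 `memory.max` content to bytes (None = unlimited)."""
    value = text.strip()
    return None if value == "max" else int(value)


print(parse_cpu_max("50000 100000\n"))     # half a core
print(parse_memory_max("536870912\n"))     # 512 MiB in bytes
```

Inside a container these files typically sit under /sys/fs/cgroup/. Reading them at startup lets the application size heaps and thread pools to the container's limits instead of the host's hardware.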

Filesystem Semantics in PaaS:

Ephemeral Filesystems:

Container filesystems are layered (overlayfs) and typically reset between deployments
Local writes may not persist across container restarts or horizontal scaling
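/tmp is usually writable but size-limited: a tmpfs-backed /tmp counts against the memory cgroup, while a disk-backed /tmp is capped by the platform's ephemeral-storage quota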

Read-Only Layers:

Application code is baked into container images (read-only)
Configuration often injected via environment variables or mounted secrets
Runtime modification of code isn't possible without redeployment

Persistent Storage Options:

Volume mounts connect to network-attached storage
Object storage for durable artifacts
Database services for structured data

Process Model Constraints:

Many PaaS platforms expect a single main process per container
Sidecar patterns (multiple containers in a pod) handle auxiliary concerns
Background workers typically run as separate deployments
Init systems (systemd, supervisord) often unnecessary or prohibited

Critical OS Knowledge for PaaS Development

•Memory Management — Understand heap size, GC tuning, memory-mapped files. Size memory limits to accommodate your runtime plus headroom.
•Signal Handling — SIGTERM initiates graceful shutdown; your app should handle it. SIGKILL follows after grace period.
•File Descriptors — Sockets, pipes, log files consume descriptors. Exhaustion causes connection failures.
•Networking — Understand DNS resolution caching, connection pooling, timeout semantics in container networks.
•Time and Clocks — Containers share the host kernel's clock and normally cannot set it, so time synchronization (NTP/chrony) is the host's responsibility. Account for clock drift between hosts in distributed coordination.
•Environment Variables — Primary configuration mechanism; understand shell escaping and special character handling.
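Signal handling is the item on this list most often missed in practice. A minimal sketch of SIGTERM-aware shutdown follows: the handler only records the request, and the work loop checks the flag between jobs. The `run_worker` loop and its job list are hypothetical stand-ins for a real request loop.

```python
import signal
import threading

shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the shutdown request; the work loop drains and exits."""
    shutdown_requested.set()


# Must be called from the main thread of the process.
signal.signal(signal.SIGTERM, handle_sigterm)


def run_worker(jobs):
    """Process jobs until the list is done or shutdown is requested."""
    done = []
    for job in jobs:
        if shutdown_requested.is_set():
            break  # stop taking new work; finish cleanly before SIGKILL
        done.append(job)  # stand-in for handling one request
    return done
```

Keeping the handler trivial and doing the actual cleanup in the main loop avoids running complex logic in signal context.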

SaaS Architecture Patterns

Software as a Service delivers complete applications to end users, with all infrastructure and platform concerns hidden. While SaaS consumers don't interact with operating systems, SaaS builders must master systems concepts to deliver reliable, scalable, multi-tenant applications.

The Multi-Tenancy Challenge:

Multi-tenancy—serving multiple customers from shared infrastructure—is the defining characteristic of SaaS. It introduces challenges at every layer of the stack:

┌─────────────────────────────────────────────────────────────────┐
│                        MULTI-TENANT SAAS                         │
│                                                                  │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐                │
│  │  Tenant A  │  │  Tenant B  │  │  Tenant C  │                │
│  │            │  │            │  │            │                │
│  │ 10 Users   │  │ 1000 Users │  │ 100 Users  │                │
│  └─────┬──────┘  └─────┬──────┘  └─────┬──────┘                │
│        │               │               │                        │
│        └───────────────┼───────────────┘                        │
│                        ▼                                         │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │              SHARED APPLICATION LAYER                     │  │
│  │          (Authentication, Business Logic, APIs)           │  │
│  │   ┌─────────────────────────────────────────────────┐    │  │
│  │   │  Tenant Context: Every request identifies tenant │    │  │
│  │   │  Row-Level Security: Database filters by tenant  │    │  │
│  │   │  Rate Limiting: Per-tenant quotas enforced       │    │  │
│  │   └─────────────────────────────────────────────────┘    │  │
│  └──────────────────────────────────────────────────────────┘  │
│                           ▼                                      │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │              SHARED DATABASE LAYER                        │  │
│  │    (All tenant data co-located with logical isolation)    │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

SaaS Isolation Models
Model	Compute	Database	Isolation	Cost/Tenant	Complexity
Siloed	Dedicated instances	Dedicated database	Maximum	Highest	High ops overhead
Bridge	Shared compute	Dedicated database	High	Medium	Moderate
Pooled	Shared compute	Shared database, schema isolation	Medium	Low	Complex data layer
Row-Level	Shared compute	Shared tables, row filtering	Lower	Lowest	Requires careful design

Noisy Neighbor Mitigation:

In pooled architectures, one tenant's workload can impact others. Mitigation strategies include:

Rate Limiting — Enforce per-tenant request quotas using token buckets or sliding windows
Resource Quotas — Limit database connections, storage consumption, API calls per tenant
Priority Queues — Process tenant requests with fairness scheduling (weighted queues)
Throttling — Degrade service gracefully rather than failing completely
Background Job Isolation — Run heavy computations in isolated worker pools
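As one example of the first strategy, here is a minimal per-tenant token bucket. The class names are illustrative, and time is passed in explicitly to keep the example deterministic; a real service would typically read `time.monotonic()` and keep the buckets in shared storage.

```python
from typing import Dict


class TokenBucket:
    """Refills at `rate_per_s` tokens per second up to `capacity`."""

    def __init__(self, rate_per_s: float, capacity: float, now: float) -> None:
        self.rate_per_s = rate_per_s
        self.capacity = capacity
        self.tokens = capacity
        self.last = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_s)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TenantLimiter:
    """One bucket per tenant, so a busy tenant only exhausts its own quota."""

    def __init__(self, rate_per_s: float, capacity: float) -> None:
        self.rate_per_s = rate_per_s
        self.capacity = capacity
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, tenant_id: str, now: float) -> bool:
        if tenant_id not in self.buckets:
            self.buckets[tenant_id] = TokenBucket(self.rate_per_s, self.capacity, now)
        return self.buckets[tenant_id].allow(now)


limiter = TenantLimiter(rate_per_s=1.0, capacity=2.0)
burst = [limiter.allow("tenant-b", now=0.0) for _ in range(3)]
```

The third request in the burst is rejected, while other tenants are unaffected because they draw from separate buckets.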

Data Isolation Techniques:

Tenant ID in Every Query — Every database query includes WHERE tenant_id = ?
Row-Level Security (RLS) — Database enforces tenant filtering automatically (PostgreSQL RLS, SQL Server / Azure SQL Database RLS)
Separate Schemas — Each tenant has own schema; reduces query filtering but complicates migrations
Encryption Key Isolation — Per-tenant encryption keys limit the damage of a failed query filter: leaked rows stay encrypted under a key the requester does not hold
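One way to make "tenant ID in every query" hard to forget is to route all data access through an object that is bound to a single tenant at construction. The sketch below uses Python's built-in `sqlite3` with a hypothetical `invoices` table; it illustrates the application-layer filter only and is not a substitute for database-enforced RLS.

```python
import sqlite3
from typing import List


class TenantScopedInvoices:
    """Repository bound to one tenant; callers cannot omit the filter."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str) -> None:
        self._conn = conn
        self._tenant_id = tenant_id

    def list_amounts(self) -> List[int]:
        rows = self._conn.execute(
            "SELECT amount FROM invoices WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),  # parameterized, never string-formatted
        )
        return [amount for (amount,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices ("
    "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO invoices (tenant_id, amount) VALUES (?, ?)",
    [("tenant-a", 100), ("tenant-b", 250), ("tenant-a", 75)],
)

print(TenantScopedInvoices(conn, "tenant-a").list_amounts())
```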

The Data Leak Risk

Multi-tenant data isolation failures result in catastrophic security breaches. A missing tenant_id filter in a single query can expose one customer's data to another. Defense in depth: implement tenant filtering at multiple layers—API gateway, application code, database policies. Never rely on a single protection mechanism.

SaaS Scalability Patterns

SaaS applications must handle unpredictable load variations—from quiet periods to viral growth. Scalability patterns leverage operating systems and cloud infrastructure capabilities:

Horizontal Scaling:

Adding more instances to handle increased load:

Stateless Application Tier — Any instance can handle any request; session state externalized to Redis/database
Auto-Scaling Groups — Cloud infrastructure monitors metrics and adds/removes instances
Load Balancer Distribution — Traffic distributed across healthy instances
Connection Pooling — Database connections shared across requests to avoid exhaustion

Vertical Scaling:

Increasing resources for individual instances:

Right-Sizing — Matching instance type to workload (memory-optimized for caching, compute-optimized for processing)
Database Scaling — Moving to larger RDS instances, increasing provisioned IOPS
Limitation — Single-machine limits cap vertical scaling; eventually requires horizontal approach

Tenant-Aware Scaling:

Large tenants may require dedicated resources:

┌─────────────────────────────────────────────────────────────────┐
│                    TIERED TENANT ARCHITECTURE                    │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │                     PREMIUM TIER                            │ │
│  │   ┌───────────┐   ┌───────────┐   ┌───────────┐            │ │
│  │   │ Tenant X  │   │ Tenant Y  │   │ Tenant Z  │            │ │
│  │   │ Dedicated │   │ Dedicated │   │ Dedicated │            │ │
│  │   │ Resources │   │ Resources │   │ Resources │            │ │
│  │   └───────────┘   └───────────┘   └───────────┘            │ │
│  └────────────────────────────────────────────────────────────┘ │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │                    STANDARD TIER                            │ │
│  │   ┌─────────────────────────────────────────────────────┐  │ │
│  │   │              SHARED RESOURCE POOL                    │  │ │
│  │   │   Tenants A, B, C, D, E, F, G, H, I, J, K, L...      │  │ │
│  │   │         (Hundreds to thousands of tenants)            │  │ │
│  │   └─────────────────────────────────────────────────────┘  │ │
│  └────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

Shared-Nothing Architecture:

Each shard is a complete stack (compute, database, cache)
Tenants assigned to shards based on consistent hashing or manual placement
Shards can be scaled independently
Cross-shard operations are expensive and minimized
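The consistent-hashing assignment mentioned above can be sketched with a hash ring. The `HashRing` class, the shard names, and the choice of 64 virtual nodes per shard are assumptions for illustration.

```python
import bisect
import hashlib
from typing import List


def _hash(key: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() varies per process)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Maps tenant IDs to shards via points on a hash ring."""

    def __init__(self, shards: List[str], vnodes: int = 64) -> None:
        # Each shard owns several points so load spreads more evenly.
        self._ring = sorted(
            (_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, tenant_id: str) -> str:
        # First ring point clockwise from the tenant's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(tenant_id)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["shard-1", "shard-2", "shard-3"])
print(ring.shard_for("tenant-a"))
```

The benefit over `hash(tenant) % shard_count` is that adding a shard relocates only the tenants that land on the new shard; everyone else stays where they are.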

SaaS Performance Patterns

•Caching Layers — Redis/Memcached for session data, query results, computed values. Invalidation is critical.
•CDN for Static Assets — CloudFront, Cloudflare, Fastly serve static content from edge locations.
•Read Replicas — Database read replicas handle query load; writes go to primary.
•Event-Driven Processing — Background jobs via queues (SQS, RabbitMQ) decouple synchronous request handling from heavy processing.
•CQRS — Command Query Responsibility Segregation separates read and write models for scalability.
•Connection Pooling — PgBouncer, ProxySQL manage database connections efficiently across many application instances.
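To show why invalidation matters for the caching pattern, here is a minimal cache-aside sketch. An in-process dict stands in for Redis or Memcached, and the `CacheAside` class, the `db` dict, and the key names are hypothetical.

```python
from typing import Callable, Dict


class CacheAside:
    """Read-through on miss; the writer must invalidate after updates."""

    def __init__(self, load: Callable[[str], str]) -> None:
        self._load = load
        self._cache: Dict[str, str] = {}
        self.misses = 0

    def get(self, key: str) -> str:
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._load(key)  # fetch from source of truth
        return self._cache[key]

    def invalidate(self, key: str) -> None:
        self._cache.pop(key, None)


db = {"plan:tenant-a": "standard"}        # stand-in for the database
cache = CacheAside(lambda key: db[key])

first_read = cache.get("plan:tenant-a")
db["plan:tenant-a"] = "premium"           # write without invalidation
stale_read = cache.get("plan:tenant-a")   # still served from cache
cache.invalidate("plan:tenant-a")
fresh_read = cache.get("plan:tenant-a")
```

Without the `invalidate` call, readers keep seeing the old plan until the entry expires or is evicted.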

Service Model Selection Framework

Choosing between IaaS, PaaS, and SaaS involves evaluating multiple factors. This decision framework helps architects make informed choices:

Decision Criteria Matrix:

Service Model Selection Criteria
Criterion	Favors IaaS	Favors PaaS	Favors SaaS (Buy)
Team Expertise	Strong ops/infrastructure skills	Strong development skills	Limited technical team
Customization Needs	Extensive OS-level customization	Standard runtimes work	Commodity functionality
Compliance Requirements	Specific OS hardening required	Platform compliance sufficient	Vendor is compliant
Performance Requirements	Maximum control over tuning	Platform performance acceptable	Standard performance OK
Time to Market	Slower initial deployment	Rapid development	Immediate availability
Operational Budget	Can invest in ops team	Prefer managed infrastructure	Minimize ops entirely
Scale Characteristics	Predictable, optimizable	Highly variable	Unknown/variable
Vendor Lock-in Tolerance	Prefer portability	Accept moderate lock-in	Lock-in acceptable

Common Patterns by Application Type:

Stateless Web Applications: → Strong candidate for PaaS. Buildpacks handle deployment, auto-scaling handles load, managed databases handle persistence.

Legacy Applications Being Modernized: → Start with IaaS (lift-and-shift), then progressively move components to PaaS as they're refactored.

Machine Learning Workloads: → IaaS for training (GPU instances, custom environments), PaaS for inference (managed serving platforms).

Internal Tools and Productivity: → SaaS wherever possible. Build vs. buy analysis usually favors buying for non-differentiating functionality.

High-Frequency Trading/Low-Latency: → IaaS or bare metal. Kernel tuning, network bypass, custom drivers required for extreme performance.

The Hybrid Approach

Most organizations use a mix of service models. Core infrastructure on IaaS, web applications on PaaS, and SaaS for commoditized functions (email, CRM, identity). The key is matching each workload to the most appropriate abstraction level.

Summary: IaaS, PaaS, SaaS

The three primary service models represent layers of abstraction in cloud computing, each trading control for convenience and enabling teams with different skills and requirements to build effective solutions.

Key Takeaways

•IaaS exposes virtual infrastructure — You manage guest OS, networking, storage configuration. Hypervisors (KVM, ESXi) virtualize hardware. Instance type selection matches workload to resources.
•IaaS storage spans block, object, and file — Each has distinct OS interaction patterns, performance characteristics, and consistency guarantees.
•PaaS abstracts infrastructure management — Buildpacks compile code to containers, orchestrators schedule pods, managed services handle databases and caches.
•OS knowledge remains essential for PaaS — Cgroups enforce limits, signal handling enables graceful shutdown, filesystem semantics affect persistence strategies.
•SaaS delivers complete applications — Multi-tenancy is the defining challenge; data isolation failures cause catastrophic breaches.
•SaaS scalability requires sophisticated patterns — Horizontal scaling, tenant-aware resource allocation, caching, and event-driven architectures.
•Service model selection is contextual — Team skills, customization needs, compliance requirements, and time constraints all influence the optimal choice.

Looking Ahead:

We've explored the service model taxonomy in depth. Next, we'll examine how virtualization technologies power cloud computing—from hypervisors and paravirtualization to the container runtimes that enable modern PaaS platforms.


2 / 5

Loading learning content...

Operating SystemsCloud Computing

Cloud Computing

LevelAdvanced

Duration75 mins

TopicCloud Computing

2 / 5

IaaS, PaaS, SaaS

The Three Pillars of Cloud Computing

What You Will Master

IaaS Architecture and Implementation

The IaaS Control Plane:

Every IaaS platform comprises two fundamental layers:

Data Plane — The actual compute, storage, and network resources executing workloads
Control Plane — The APIs, schedulers, and orchestrators that provision and manage resources

┌─────────────────────────────────────────────────────────────────┐
│                        CONTROL PLANE                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐              │
│  │     API     │  │  Scheduler  │  │  Resource   │              │
│  │   Gateway   │──│   Engine    │──│   Manager   │              │
│  └─────────────┘  └─────────────┘  └─────────────┘              │
│         │               │                   │                    │
│         └───────────────┴───────────────────┘                    │
│                         │                                        │
├─────────────────────────┼────────────────────────────────────────┤
│                    DATA PLANE                                    │
│         ┌───────────────┴───────────────┐                        │
│  ┌──────┴──────┐  ┌──────┴──────┐  ┌─────┴──────┐               │
│  │   Compute   │  │   Storage   │  │  Network   │               │
│  │   Cluster   │  │   Cluster   │  │   Fabric   │               │
│  │             │  │             │  │            │               │
│  │ ┌──┐ ┌──┐  │  │ ┌──┐ ┌──┐  │  │ ┌──┐ ┌──┐ │               │
│  │ │VM│ │VM│  │  │ │HD│ │HD│  │  │ │SW│ │SW│ │               │
│  │ └──┘ └──┘  │  │ └──┘ └──┘  │  │ └──┘ └──┘ │               │
│  └─────────────┘  └─────────────┘  └───────────┘               │
└─────────────────────────────────────────────────────────────────┘

Compute Virtualization Stack:

IaaS compute relies on hypervisors to multiplex physical servers across multiple virtual machines:

Type-1 Hypervisors in Production:

KVM (Kernel-based Virtual Machine) — Linux kernel module treating the kernel as a hypervisor; used by Google Cloud, AWS (Nitro), and many OpenStack deployments
VMware ESXi — Bare-metal hypervisor dominating enterprise private clouds
Xen — Paravirtualization pioneer; historically used by AWS, still common in hosting environments
Microsoft Hyper-V — Powers Azure and Windows Server virtualization

The Nitro Architecture (AWS Case Study):

AWS developed custom hardware (Nitro cards) to offload virtualization overhead from the CPU:

Nitro Hypervisor — Minimal hypervisor based on KVM with most functionality offloaded
Nitro Cards — Custom ASICs handling I/O virtualization, security monitoring, and instance management
Nitro Enclaves — Isolated compute environments for processing sensitive data

This architecture delivers near-bare-metal performance by eliminating hypervisor CPU overhead for I/O operations.

IaaS Compute Instance Types

•General Purpose — Balanced CPU-to-memory ratio (e.g., AWS m6i, Azure D-series); OS considerations: standard kernel configurations, typical memory management patterns
•Compute Optimized — High CPU-to-memory ratio for compute-intensive workloads (e.g., c6i, Fsv2); OS tuning: CPU affinity, NUMA awareness, compute governor settings
•Memory Optimized — Large memory capacity for in-memory databases (e.g., r6i, Ev4); OS tuning: huge pages, memory overcommit policies, NUMA topology
•Storage Optimized — High I/O performance with local NVMe (e.g., i3en, Lsv2); OS considerations: I/O scheduler selection, filesystem tuning, device driver optimization
•Accelerated Computing — GPU/FPGA/ASIC instances for ML/HPC (e.g., p4d, NC-series); OS requirements: specific kernel versions, accelerator drivers, CUDA toolkit
•Bare Metal — Direct hardware access without hypervisor layer; full OS control including custom kernels and hypervisor deployment

Instance Selection Strategy

IaaS Storage and Networking

Storage Virtualization:

IaaS provides multiple storage abstractions, each with distinct performance characteristics and OS interaction patterns:

Block Storage: Virtual disks that attach to VMs like physical drives. Examples: AWS EBS, Azure Managed Disks, GCP Persistent Disks.

Network-attached storage presenting as block devices to the guest OS
Guest OS sees standard block device (e.g., /dev/sda, /dev/nvme0n1)
Filesystems (ext4, XFS, NTFS) created on top, just like physical disks
Replication handled by the storage backend, transparent to guest OS

Object Storage: Virtually unlimited storage for unstructured data. Examples: S3, Azure Blob, Cloud Storage.

Accessed via HTTP/REST APIs, not mounted as filesystems
No OS-level interaction; applications use SDKs or CLI tools
Historically eventual consistency for some operations; the major providers now offer strong read-after-write consistency

File Storage: Managed NFS/SMB filesystems. Examples: EFS, Azure Files, Filestore.

Mount directly in guest OS using standard NFS/SMB clients
Kernel NFS client communicates with managed file server
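POSIX-like semantics over NFS, with caveats such as close-to-open consistency, provider-specific locking behavior, and network latency; SMB shares follow Windows file semantics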

IaaS Storage Comparison
Aspect	Block Storage	Object Storage	File Storage
Access Method	Block device	HTTP API	NFS/SMB mount
OS Visibility	Block device	Application-level	Filesystem mount
Performance	Low latency, high IOPS	High throughput, higher latency	Medium latency
Scalability	Volume size limits	Virtually unlimited	Capacity tiers
Use Cases	Databases, boot volumes	Backups, media, data lakes	Shared storage, CMS
Consistency	Strong	Strong or eventual	Strong

Network Virtualization:

IaaS networking relies on Software-Defined Networking (SDN) to create isolated virtual networks:

Virtual Private Cloud (VPC):

Isolated network partition with customer-defined IP address ranges
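Subnets map onto availability zones (on AWS each subnet lives in exactly one zone; on Azure and GCP subnets are regional); deploy across multiple zones for redundancy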
Route tables define traffic flow between subnets and external networks
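
As a rough illustration of customer-defined address ranges, the sketch below carves one equally sized subnet per zone out of a VPC range. It uses only Python's standard ipaddress module and calls no cloud API. The 10.0.0.0/16 range, the zone names, and the plan_subnets helper are assumptions made for the example.

```python
import ipaddress


def plan_subnets(vpc_cidr, zones, new_prefix):
    """Assign one subnet of size /new_prefix to each zone, in order."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # subnets() yields the equally sized blocks of the VPC range lazily.
    candidates = vpc.subnets(new_prefix=new_prefix)
    return {zone: next(candidates) for zone in zones}


# Hypothetical values for illustration only.
plan = plan_subnets("10.0.0.0/16", ["zone-a", "zone-b", "zone-c"], 24)
for zone, subnet in plan.items():
    print(zone, subnet, subnet.num_addresses)
```

Planning ranges up front matters because overlapping CIDR blocks complicate later peering between networks.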

Virtual Network Interfaces:

Elastic Network Interfaces (ENIs) attach to instances
Guest OS sees standard Ethernet device; a paravirtual driver (e.g., virtio-net on KVM-based clouds, ENA on AWS Nitro instances) communicates with the virtualization layer
MAC addresses and IPs assigned by cloud control plane

Security Groups and Network ACLs:

Stateful firewall rules (security groups) evaluated at hypervisor level
Stateless ACLs evaluated at subnet boundary
No guest OS configuration required; firewall runs in hypervisor's network stack
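
The difference between stateful and stateless filtering is easiest to see with a toy model. The sketch below is a simplified simulation, not any provider's implementation: the class names are hypothetical, only ports are matched, and the stateful model tracks replies to inbound connections only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    src_port: int
    dst: str
    dst_port: int


class StatefulFirewall:
    """Security-group style: replies to allowed inbound traffic pass automatically."""

    def __init__(self, allowed_inbound_ports):
        self.allowed = set(allowed_inbound_ports)
        self.tracked = set()  # connection-tracking table

    def inbound(self, p):
        if p.dst_port in self.allowed:
            self.tracked.add((p.src, p.src_port, p.dst, p.dst_port))
            return True
        return False

    def outbound(self, p):
        # A reply is the tracked connection with source and destination swapped.
        return (p.dst, p.dst_port, p.src, p.src_port) in self.tracked


class StatelessAcl:
    """Network-ACL style: each direction is checked against its own rules."""

    def __init__(self, inbound_ports, outbound_ports):
        self.inbound_ports = inbound_ports
        self.outbound_ports = outbound_ports

    def inbound(self, p):
        return p.dst_port in self.inbound_ports

    def outbound(self, p):
        return p.dst_port in self.outbound_ports


request = Packet(src="203.0.113.5", src_port=50000, dst="10.0.1.10", dst_port=443)
reply = Packet(src="10.0.1.10", src_port=443, dst="203.0.113.5", dst_port=50000)

sg = StatefulFirewall({443})
print(sg.inbound(request), sg.outbound(reply))

acl = StatelessAcl({443}, set())
print(acl.inbound(request), acl.outbound(reply))
```

In this model the stateless ACL drops the reply until an outbound rule covers the client's ephemeral port, which is why ACL rule sets usually allow an ephemeral port range in the return direction.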

Load Balancing:

Application Load Balancers (Layer 7) inspect HTTP headers
Network Load Balancers (Layer 4) handle TCP/UDP with ultra-low latency
Traffic distribution transparent to guest OS

Network Performance Considerations

PaaS Architecture Deep Dive

The PaaS Runtime Model:

┌─────────────────────────────────────────────────────────────────┐
│                     DEVELOPER INTERACTION                        │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  git push │ docker push │ CLI deploy │ CI/CD pipeline      │  │
│  └─────────────────────────┬─────────────────────────────────┘  │
├─────────────────────────────┼────────────────────────────────────┤
│                    PLATFORM LAYER                                │
│  ┌─────────────────────────┴─────────────────────────────────┐  │
│  │                    BUILD SYSTEM                            │  │
│  │     (Source → Container Image → Registry)                  │  │
│  └─────────────────────────┬─────────────────────────────────┘  │
│  ┌─────────────────────────┴─────────────────────────────────┐  │
│  │                   ORCHESTRATOR                             │  │
│  │     (Scheduler, Service Discovery, Load Balancing)         │  │
│  └─────────────────────────┬─────────────────────────────────┘  │
│  ┌─────────────────────────┴─────────────────────────────────┐  │
│  │                  RUNTIME CONTAINERS                        │  │
│  │  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐           │  │
│  │  │ App    │  │ App    │  │ App    │  │ App    │           │  │
│  │  │ Pod 1  │  │ Pod 2  │  │ Pod 3  │  │ Pod N  │           │  │
│  │  └────────┘  └────────┘  └────────┘  └────────┘           │  │
│  └───────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │            MANAGED SERVICES (Hidden from Developer)        │  │
│  │    Databases │ Caches │ Message Queues │ Secrets Mgmt      │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

PaaS Component Analysis:

1. Build Systems and Buildpacks: Buildpacks detect application type and construct container images:

Analyze source code to identify language/framework
Download appropriate runtime (Node.js, Python, Java, etc.)
Install dependencies (npm, pip, Maven)
Configure entry points and health checks
Produce OCI-compliant container image
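
The detection step can be sketched as a lookup of marker files in the source tree. This is a simplified illustration: the MARKERS table and detect_runtime function are hypothetical, and real buildpacks apply richer detection logic than a single file check.

```python
import tempfile
from pathlib import Path

# Marker file -> runtime, checked in order. Hypothetical table for illustration.
MARKERS = [
    ("package.json", "nodejs"),
    ("requirements.txt", "python"),
    ("pom.xml", "java"),
]


def detect_runtime(source_dir):
    """Return the first runtime whose marker file exists, or None."""
    root = Path(source_dir)
    for marker, runtime in MARKERS:
        if (root / marker).is_file():
            return runtime
    return None


# Usage: a throwaway directory that looks like a Python project.
with tempfile.TemporaryDirectory() as project:
    (Path(project) / "requirements.txt").write_text("flask\n")
    detected = detect_runtime(project)
print(detected)
```

Because markers are checked in order, a repository containing several marker files resolves to the first match; a real platform would need an explicit override for such cases.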

2. Container Orchestration: The platform scheduler places containers on infrastructure:

Resource requests and limits (CPU, memory) inform placement decisions
Health checks trigger automatic restarts
Horizontal Pod Autoscaler adjusts replica count based on metrics
Service mesh handles inter-service communication
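
To make the autoscaling step concrete, the sketch below applies the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), with no change while the ratio is within a tolerance of 1.0. The function name and the min/max bounds are assumptions for the example; a real autoscaler also handles missing metrics and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Compute the replica count an HPA-style controller would request."""
    ratio = current_metric / target_metric
    # Skip scaling when the metric is close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Four replicas averaging 90% CPU against a 60% target.
print(desired_replicas(4, current_metric=90, target_metric=60))
```

Rounding up biases the controller toward spare capacity, and the tolerance band keeps it from flapping on small metric fluctuations.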

3. Managed Services: PaaS platforms bundle common supporting services:

Managed Databases: PostgreSQL, MySQL, Redis; provisioned via API, with backups, patching, and (on many platforms) scaling handled by the provider
Message Queues: RabbitMQ, Pub/Sub—for asynchronous communication patterns
Secrets Management: Vault integration, automatic rotation, secure injection into runtime

PaaS Platform Examples

•Heroku — Pioneer of 'git push' deployment; buildpack-based; PostgreSQL, Redis add-ons
•Google App Engine — Standard and Flexible environments; automatic scaling; deep GCP integration
•AWS Elastic Beanstalk — Managed EC2 underneath; supports Docker, many language runtimes
•Azure App Service — Windows and Linux apps; integrated DevOps pipelines; hybrid connectivity

Serverless Platforms

•AWS Lambda — Event-driven functions; 15-minute maximum timeout; up to 10 GB memory; container image support
•Google Cloud Functions — HTTP and event triggers; tight Pub/Sub integration
•Azure Functions — Durable Functions for workflows; consumption and premium plans
•Cloudflare Workers — Edge execution; V8 isolates; near-zero cold starts (isolates start in milliseconds)

The Cold Start Problem

PaaS Operating Systems Implications

Although PaaS abstracts the operating system, understanding OS concepts remains crucial for performance tuning, debugging, and making informed platform choices.

Resource Limits and Cgroups:

PaaS platforms use Linux cgroups to enforce resource limits:

┌────────────────────────────────────────────────────────────────┐
│                    CONTAINER RUNTIME                            │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                   YOUR APPLICATION                        │  │
│  │                                                           │  │
│  │   Memory Limit: 512MB  │  CPU Limit: 0.5 cores           │  │
│  │   File Descriptors: 1024  │  Disk I/O: Throttled         │  │
│  │                                                           │  │
│  └──────────────────────────────────────────────────────────┘  │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                   CGROUP SUBSYSTEMS                       │  │
│  │   memory: 536870912 bytes max                            │  │
│  │   cpu: 50000/100000 quota (50%)                          │  │
│  │   pids: 100 max processes                                │  │
│  │   blkio: 10MB/s read, 10MB/s write                       │  │
│  └──────────────────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────┘

What happens when limits are exceeded:

Memory: The kernel OOM killer terminates a process in the cgroup; the platform typically restarts the container
CPU: The process is throttled; request latency increases, but the process doesn't crash
PIDs: fork/clone calls fail with EAGAIN; the process can't spawn new threads or child processes
Disk I/O: Requests queue up; latency increases dramatically
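
An application can inspect its own limits rather than guess them. The sketch below parses the cgroup v2 interface files cpu.max, memory.max, and pids.max, where the literal max means unlimited. The helper names are hypothetical, and the /sys/fs/cgroup path assumes a cgroup v2 unified hierarchy mounted inside the container; on cgroup v1 or non-Linux hosts the reader simply returns None for each entry.

```python
from pathlib import Path


def parse_cpu_max(text):
    """cpu.max holds '<quota> <period>' in microseconds, or 'max <period>'."""
    quota, period = text.split()
    return None if quota == "max" else int(quota) / int(period)


def parse_limit(text):
    """memory.max and pids.max hold an integer or the literal 'max'."""
    value = text.strip()
    return None if value == "max" else int(value)


def read_limits(root="/sys/fs/cgroup"):
    """Best-effort read; None means unlimited or file not present."""
    parsers = {
        "cpu.max": parse_cpu_max,
        "memory.max": parse_limit,
        "pids.max": parse_limit,
    }
    limits = {}
    for name, parse in parsers.items():
        path = Path(root) / name
        limits[name] = parse(path.read_text()) if path.is_file() else None
    return limits


# The values from the diagram above, as they would appear in the files.
print(parse_cpu_max("50000 100000\n"))  # fraction of one core
print(parse_limit("536870912\n"))       # bytes
```

Runtimes that size thread pools or heaps from the host's CPU count and memory can use values like these to stay inside the container's actual allowance.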

Filesystem Semantics in PaaS:

Ephemeral Filesystems:

Container filesystems are layered (overlayfs) and typically reset between deployments
Local writes may not persist across container restarts or horizontal scaling
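/tmp is usually writable but size-limited: memory-backed (tmpfs) mounts count against the memory cgroup limit, while disk-backed scratch space is capped by platform storage quotas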

Read-Only Layers:

Application code is baked into container images (read-only)
Configuration often injected via environment variables or mounted secrets
Runtime modification of code isn't possible without redeployment

Persistent Storage Options:

Volume mounts connect to network-attached storage
Object storage for durable artifacts
Database services for structured data

Process Model Constraints:

Many PaaS platforms expect a single main process per container
Sidecar patterns (multiple containers in a pod) handle auxiliary concerns
Background workers typically run as separate deployments
Init systems and process supervisors (systemd, supervisord) are often unnecessary or prohibited

Critical OS Knowledge for PaaS Development

•Memory Management — Understand heap size, GC tuning, memory-mapped files. Size memory limits to accommodate your runtime plus headroom.
•Signal Handling — SIGTERM initiates graceful shutdown; your app should handle it. SIGKILL follows after grace period.
•File Descriptors — Sockets, pipes, log files consume descriptors. Exhaustion causes connection failures.
•Networking — Understand DNS resolution caching, connection pooling, timeout semantics in container networks.
•Time and Clocks — Containers share the host kernel's clock and normally cannot adjust it, so time synchronization is the platform's responsibility. Allow for clock skew between hosts in distributed coordination.
•Environment Variables — Primary configuration mechanism; understand shell escaping and special character handling.
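
Signal handling is the item most often missed in practice. A minimal sketch of graceful shutdown follows, assuming a single-threaded worker loop; the GracefulShutdown class and serve function are hypothetical names. The handler only flips a flag, and the loop finishes its current unit of work before returning. Note that Python allows signal.signal to be called only from the main thread.

```python
import signal
import time


class GracefulShutdown:
    """Records that SIGTERM arrived; the handler does nothing else."""

    def __init__(self):
        self.requested = False
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame):
        # Keep handlers tiny: set a flag and let the main loop clean up.
        self.requested = True


def serve(shutdown, process_one, poll_seconds=0.1):
    """Process work until SIGTERM is seen; return the number of items handled."""
    handled = 0
    while not shutdown.requested:
        process_one()  # never interrupted halfway by the shutdown logic
        handled += 1
        time.sleep(poll_seconds)
    return handled
```

After serve returns, the caller can flush buffers, close connections, and exit before the platform's grace period ends and SIGKILL follows.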

SaaS Architecture Patterns

The Multi-Tenancy Challenge:

Multi-tenancy—serving multiple customers from shared infrastructure—is the defining characteristic of SaaS. It introduces challenges at every layer of the stack:

┌─────────────────────────────────────────────────────────────────┐
│                        MULTI-TENANT SAAS                         │
│                                                                  │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐                │
│  │  Tenant A  │  │  Tenant B  │  │  Tenant C  │                │
│  │            │  │            │  │            │                │
│  │ 10 Users   │  │ 1000 Users │  │ 100 Users  │                │
│  └─────┬──────┘  └─────┬──────┘  └─────┬──────┘                │
│        │               │               │                        │
│        └───────────────┼───────────────┘                        │
│                        ▼                                         │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │              SHARED APPLICATION LAYER                     │  │
│  │          (Authentication, Business Logic, APIs)           │  │
│  │   ┌─────────────────────────────────────────────────┐    │  │
│  │   │  Tenant Context: Every request identifies tenant │    │  │
│  │   │  Row-Level Security: Database filters by tenant  │    │  │
│  │   │  Rate Limiting: Per-tenant quotas enforced       │    │  │
│  │   └─────────────────────────────────────────────────┘    │  │
│  └──────────────────────────────────────────────────────────┘  │
│                           ▼                                      │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │              SHARED DATABASE LAYER                        │  │
│  │    (All tenant data co-located with logical isolation)    │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

SaaS Isolation Models
Model	Compute	Database	Isolation	Cost/Tenant	Complexity
Siloed	Dedicated instances	Dedicated database	Maximum	Highest	High ops overhead
Bridge	Shared compute	Dedicated database	High	Medium	Moderate
Pooled	Shared compute	Shared database, schema isolation	Medium	Low	Complex data layer
Row-Level	Shared compute	Shared tables, row filtering	Lower	Lowest	Requires careful design

Noisy Neighbor Mitigation:

In pooled architectures, one tenant's workload can impact others. Mitigation strategies include:

Rate Limiting — Enforce per-tenant request quotas using token buckets or sliding windows
Resource Quotas — Limit database connections, storage consumption, API calls per tenant
Priority Queues — Process tenant requests with fairness scheduling (weighted queues)
Throttling — Degrade service gracefully rather than failing completely
Background Job Isolation — Run heavy computations in isolated worker pools
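
Per-tenant rate limiting with token buckets can be sketched in a few lines. This is an in-process illustration under stated assumptions: the class names are hypothetical, all tenants share one rate and burst size, and a production limiter would normally keep bucket state in a shared store so every application instance sees the same counts.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second up to `capacity` (the burst size)."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None, cost=1.0):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TenantRateLimiter:
    """One bucket per tenant, so a busy tenant cannot drain another's quota."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, tenant_id, now=None):
        bucket = self.buckets.get(tenant_id)
        if bucket is None:
            bucket = self.buckets[tenant_id] = TokenBucket(self.rate, self.capacity, now)
        return bucket.allow(now)


limiter = TenantRateLimiter(rate=1.0, capacity=2)
# Explicit timestamps keep the example deterministic.
print([limiter.allow("tenant-a", now=0.0) for _ in range(3)])
print(limiter.allow("tenant-b", now=0.0))
```

The third call for tenant-a is rejected because its burst of two is spent, while tenant-b is unaffected; rejected requests would typically receive an HTTP 429 response.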

Data Isolation Techniques:

Tenant ID in Every Query — Every database query includes WHERE tenant_id = ?
Row-Level Security (RLS) — Database enforces tenant filtering automatically (e.g., PostgreSQL RLS, SQL Server/Azure SQL RLS)
Separate Schemas — Each tenant has own schema; reduces query filtering but complicates migrations
Encryption Key Isolation — Per-tenant encryption keys keep leaked rows unreadable if a query filter fails, provided key access is itself scoped to the tenant
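
One way to make the tenant filter hard to forget is to route all data access through an object that is bound to a tenant at construction time. The sketch below uses Python's built-in sqlite3 module purely for illustration; the invoices table, tenant names, and TenantScopedDb class are assumptions, and a production system would typically pair this with database-enforced RLS.

```python
import sqlite3


class TenantScopedDb:
    """Every query issued through this object is filtered by its tenant."""

    def __init__(self, conn, tenant_id):
        self.conn = conn
        self.tenant_id = tenant_id

    def list_invoices(self):
        cursor = self.conn.execute(
            "SELECT id, amount FROM invoices WHERE tenant_id = ? ORDER BY id",
            (self.tenant_id,),
        )
        return cursor.fetchall()

    def get_invoice(self, invoice_id):
        # The tenant filter applies even when the caller knows the row id.
        cursor = self.conn.execute(
            "SELECT id, amount FROM invoices WHERE tenant_id = ? AND id = ?",
            (self.tenant_id, invoice_id),
        )
        return cursor.fetchone()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO invoices (tenant_id, amount) VALUES (?, ?)",
    [("tenant-a", 100), ("tenant-b", 250), ("tenant-a", 75)],
)

db_a = TenantScopedDb(conn, "tenant-a")
print(db_a.list_invoices())
print(db_a.get_invoice(2))  # invoice 2 belongs to tenant-b
```

Looking up another tenant's invoice by id returns nothing, which is the behavior to verify in tests for every access path.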

The Data Leak Risk

SaaS Scalability Patterns

SaaS applications must handle unpredictable load variations—from quiet periods to viral growth. Scalability patterns leverage operating systems and cloud infrastructure capabilities:

Horizontal Scaling:

Adding more instances to handle increased load:

Stateless Application Tier — Any instance can handle any request; session state externalized to Redis/database
Auto-Scaling Groups — Cloud infrastructure monitors metrics and adds/removes instances
Load Balancer Distribution — Traffic distributed across healthy instances
Connection Pooling — Database connections shared across requests to avoid exhaustion

Vertical Scaling:

Increasing resources for individual instances:

Right-Sizing — Matching instance type to workload (memory-optimized for caching, compute-optimized for processing)
Database Scaling — Moving to larger RDS instances, increasing provisioned IOPS
Limitation — Single-machine limits cap vertical scaling; eventually requires horizontal approach

Tenant-Aware Scaling:

Large tenants may require dedicated resources:

┌─────────────────────────────────────────────────────────────────┐
│                    TIERED TENANT ARCHITECTURE                    │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │                     PREMIUM TIER                            │ │
│  │   ┌───────────┐   ┌───────────┐   ┌───────────┐            │ │
│  │   │ Tenant X  │   │ Tenant Y  │   │ Tenant Z  │            │ │
│  │   │ Dedicated │   │ Dedicated │   │ Dedicated │            │ │
│  │   │ Resources │   │ Resources │   │ Resources │            │ │
│  │   └───────────┘   └───────────┘   └───────────┘            │ │
│  └────────────────────────────────────────────────────────────┘ │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │                    STANDARD TIER                            │ │
│  │   ┌─────────────────────────────────────────────────────┐  │ │
│  │   │              SHARED RESOURCE POOL                    │  │ │
│  │   │   Tenants A, B, C, D, E, F, G, H, I, J, K, L...      │  │ │
│  │   │         (Hundreds to thousands of tenants)            │  │ │
│  │   └─────────────────────────────────────────────────────┘  │ │
│  └────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

Shared-Nothing (Sharded) Architecture:

Each shard is a complete stack (compute, database, cache)
Tenants assigned to shards based on consistent hashing or manual placement
Shards can be scaled independently
Cross-shard operations are expensive and minimized
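
Tenant placement by consistent hashing can be sketched with a sorted ring of hash points. This is a minimal illustration, not a production router: the ShardRing class, shard names, and the choice of 64 virtual nodes per shard are assumptions, and a real system would also need a migration process for tenants whose shard changes.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a 64-bit point on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ShardRing:
    def __init__(self, shards, vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted hash points
        self._owners = {}  # hash point -> shard name
        for shard in shards:
            self.add_shard(shard)

    def add_shard(self, shard):
        # Several virtual nodes per shard smooth out the distribution.
        for i in range(self.vnodes):
            point = _hash(f"{shard}#{i}")
            self._owners[point] = shard
            bisect.insort(self._points, point)

    def shard_for(self, tenant_id):
        # The first point clockwise from the tenant's hash owns the tenant.
        index = bisect.bisect_right(self._points, _hash(tenant_id))
        return self._owners[self._points[index % len(self._points)]]


ring = ShardRing(["shard-1", "shard-2", "shard-3"])
tenants = [f"tenant-{i}" for i in range(1000)]
before = {t: ring.shard_for(t) for t in tenants}
ring.add_shard("shard-4")
moved = [t for t in tenants if ring.shard_for(t) != before[t]]
print(len(moved), "of", len(tenants), "tenants moved")
```

The useful property is that adding a shard relocates only the tenants that land on the new shard; everyone else stays put, which keeps rebalancing cheap compared with hashing modulo the shard count.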

SaaS Performance Patterns

•Caching Layers — Redis/Memcached for session data, query results, computed values. Invalidation is critical.
•CDN for Static Assets — CloudFront, Cloudflare, Fastly serve static content from edge locations.
•Read Replicas — Database read replicas handle query load; writes go to primary.
•Event-Driven Processing — Background jobs via queues (SQS, RabbitMQ) decouple synchronous request handling from heavy processing.
•CQRS — Command Query Responsibility Segregation separates read and write models for scalability.
•Connection Pooling — PgBouncer, ProxySQL manage database connections efficiently across many application instances.
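
The caching pattern, and why invalidation is critical, can be shown with a small cache-aside sketch. A dictionary stands in for Redis or Memcached and another for the primary database; the CacheAside class and the key names are hypothetical.

```python
class CacheAside:
    """Read through the cache; invalidate the entry whenever the source changes."""

    def __init__(self, load_from_db, write_to_db):
        self._load = load_from_db
        self._write = write_to_db
        self._cache = {}
        self.misses = 0

    def get(self, key):
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._load(key)
        return self._cache[key]

    def update(self, key, value):
        self._write(key, value)
        # Drop the stale entry so the next read reloads the fresh value.
        self._cache.pop(key, None)


# Hypothetical stand-in for the primary database.
database = {"tenant-a:plan": "standard"}
cache = CacheAside(database.__getitem__, database.__setitem__)

print(cache.get("tenant-a:plan"), cache.misses)
print(cache.get("tenant-a:plan"), cache.misses)
cache.update("tenant-a:plan", "premium")
print(cache.get("tenant-a:plan"), cache.misses)
```

Without the pop in update, readers would keep seeing the old plan until the entry expired. In a multi-tenant system, including the tenant in every cache key also prevents one tenant's cached data from being served to another.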

Service Model Selection Framework

Choosing between IaaS, PaaS, and SaaS involves evaluating multiple factors. This decision framework helps architects make informed choices:

Decision Criteria Matrix:

Service Model Selection Criteria
Criterion	Favors IaaS	Favors PaaS	Favors SaaS (Buy)
Team Expertise	Strong ops/infrastructure skills	Strong development skills	Limited technical team
Customization Needs	Extensive OS-level customization	Standard runtimes work	Commodity functionality
Compliance Requirements	Specific OS hardening required	Platform compliance sufficient	Vendor is compliant
Performance Requirements	Maximum control over tuning	Platform performance acceptable	Standard performance OK
Time to Market	Slower initial deployment	Rapid development	Immediate availability
Operational Budget	Can invest in ops team	Prefer managed infrastructure	Minimize ops entirely
Scale Characteristics	Predictable, optimizable	Highly variable	Unknown/variable
Vendor Lock-in Tolerance	Prefer portability	Accept moderate lock-in	Lock-in acceptable

Common Patterns by Application Type:

Stateless Web Applications: → Strong candidate for PaaS. Buildpacks handle deployment, auto-scaling handles load, managed databases handle persistence.

Legacy Applications Being Modernized: → Start with IaaS (lift-and-shift), then progressively move components to PaaS as they're refactored.

Machine Learning Workloads: → IaaS for training (GPU instances, custom environments), PaaS for inference (managed serving platforms).

Internal Tools and Productivity: → SaaS wherever possible. Build vs. buy analysis usually favors buying for non-differentiating functionality.

High-Frequency Trading/Low-Latency: → IaaS or bare metal. Kernel tuning, network bypass, custom drivers required for extreme performance.

The Hybrid Approach

Summary: IaaS, PaaS, SaaS

Key Takeaways

•IaaS exposes virtual infrastructure — You manage guest OS, networking, storage configuration. Hypervisors (KVM, ESXi) virtualize hardware. Instance type selection matches workload to resources.
•IaaS storage spans block, object, and file — Each has distinct OS interaction patterns, performance characteristics, and consistency guarantees.
•PaaS abstracts infrastructure management — Buildpacks turn source code into container images, orchestrators schedule pods, managed services handle databases and caches.
•OS knowledge remains essential for PaaS — Cgroups enforce limits, signal handling enables graceful shutdown, filesystem semantics affect persistence strategies.
•SaaS delivers complete applications — Multi-tenancy is the defining challenge; data isolation failures cause catastrophic breaches.
•SaaS scalability requires sophisticated patterns — Horizontal scaling, tenant-aware resource allocation, caching, and event-driven architectures.
•Service model selection is contextual — Team skills, customization needs, compliance requirements, and time constraints all influence the optimal choice.
