What if a collection of networked computers could appear as a single, coherent system? Users wouldn't know—or need to know—which machine runs their process, which disk stores their file, or which CPU performs their computation. The illusion would be seamless: one unified, powerful virtual computer built from many physical machines.
This is the vision of Distributed Operating Systems—systems designed from the ground up to manage multiple computers as a unified resource, providing transparency, fault tolerance, and scalability impossible with single-machine systems.
Distributed systems introduce challenges absent in conventional OS: network latency, partial failures, clock synchronization, and the fundamental impossibility of achieving certain consistency guarantees simultaneously. Understanding these systems requires grappling with trade-offs that have no clean solutions—only engineering choices appropriate to specific contexts.
By the end of this page, you will understand:
• The distinction between distributed OS and network OS
• Transparency goals: location, migration, replication, concurrency, failure
• The CAP theorem and its implications for system design
• Distributed mutual exclusion and coordination algorithms
• Distributed file systems: design patterns and consistency trade-offs
• How modern systems (Kubernetes, cloud platforms) embody distributed principles
Not all systems spanning multiple computers are 'distributed operating systems.' Understanding the spectrum clarifies what truly distinguishes distributed OS design.
| Characteristic | Network Operating System | Distributed Operating System |
|---|---|---|
| User awareness | Users know which machine they're on | Users see one unified system |
| Resource access | Explicit: 'access file on machine X' | Transparent: 'access file Y' |
| Process migration | Manual or none | Automatic, transparent |
| Administration | Each machine managed separately | Single system image, unified management |
| Failure handling | Each machine fails independently | System masks individual failures |
| Example | Linux + NFS + SSH | Amoeba, Plan 9, Sprite |
The Single System Image (SSI):
The defining characteristic of a true distributed OS is the single system image—the appearance that multiple machines constitute one unified computer. This manifests as:
Unified Namespace — All resources (files, devices, processes) have globally unique names, regardless of physical location.
Location Transparency — Access to a resource doesn't require knowing its location. The system routes requests automatically.
Seamless Process Execution — Users launch processes without specifying where they run. The system places and potentially migrates them.
Coherent State — Changes to shared resources are visible consistently across all nodes.
Fault Masking — Individual node failures don't manifest as system failures (to the extent possible).
"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable."
— Leslie Lamport, Turing Award winner
This tongue-in-cheek definition captures a fundamental truth: distributed systems introduce failure modes absent in single-machine systems. The complexity of distributed OS stems from hiding this fundamental fragility.
Distributed systems aim to hide complexity through various forms of transparency—making the distributed nature invisible to users and applications. The ISO Reference Model for Open Distributed Processing identifies eight types:
Access Transparency: a read() shouldn't know if the file is local or on a distant server.
Location Transparency: /docs/report.pdf might physically reside on any node.
Each transparency goal has costs:
• Location transparency requires name resolution infrastructure
• Replication transparency requires consistency protocols (potentially expensive)
• Failure transparency requires detection, failover, and state recovery
• Migration transparency requires stopping and moving running processes
Practical systems achieve some transparencies better than others, based on their design priorities. Full transparency across all dimensions is often impossible or impractical.
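To make location transparency concrete, here is a minimal Python sketch of the idea: clients name a resource, and a name service resolves which node actually holds it before the request is routed. The NameService and DistributedFS classes and the node names are hypothetical illustrations, not a real system's API.

# Minimal sketch of location transparency via a name service.
# All class and node names here are hypothetical illustrations.

class NameService:
    """Maps global resource names to the node that currently hosts them."""
    def __init__(self):
        self._locations = {}          # global path -> node id

    def register(self, path, node):
        self._locations[path] = node

    def resolve(self, path):
        return self._locations[path]  # raises KeyError if unknown


class DistributedFS:
    """Clients call read(); they never specify (or see) a machine name."""
    def __init__(self, name_service, nodes):
        self.ns = name_service
        self.nodes = nodes            # node id -> local file store (dict)

    def read(self, path):
        node = self.ns.resolve(path)      # location resolved internally
        return self.nodes[node][path]     # request routed to that node


ns = NameService()
ns.register("/docs/report.pdf", "node-17")
fs = DistributedFS(ns, {"node-17": {"/docs/report.pdf": b"%PDF-1.7 ..."}})

# The caller names the resource, not the machine that stores it.
print(fs.read("/docs/report.pdf")[:8])

Moving a file to another node would only require updating the name service entry; callers keep using the same global name.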
Case Study: Plan 9 from Bell Labs
Plan 9 (successor to UNIX, also from Bell Labs) achieved extraordinary transparency through a simple idea: everything is a file, accessed through a unified namespace, composable across machines.
# On Plan 9, import another machine's screen:
import graphicsbox /dev
# Now /dev/draw shows the remote display
# Local programs draw there transparently
# Import another machine's CPU:
import cpuserver /mnt/cpu
# Process execution can transparently use remote CPU
Each machine exports resources as files. Any machine can mount any other machine's resources, creating a unified namespace tailored to each user. The system achieved remarkable transparency with minimal complexity—though it never achieved widespread adoption outside research.
Distributed systems face challenges that are not merely difficult but provably impossible to solve completely. Understanding these impossibility results is crucial for making informed design trade-offs.
CAP Theorem (Brewer, 2000; Gilbert & Lynch proof, 2002)
A distributed system cannot simultaneously provide all three of:
C - Consistency: Every read receives the most recent write (or an error). All nodes see the same data at the same time.
A - Availability: Every request receives a response (not an error), without guarantee it contains the most recent write.
P - Partition Tolerance: The system continues operating despite network partitions (messages between nodes being lost or delayed indefinitely).
The Core Trade-off:
                      ┌─────────────┐
                      │  Partition  │
                      │  Tolerance  │
                      │     (P)     │
                      └──────┬──────┘
                             │
                        Choose ONE:
        ┌────────────────────┼────────────────────┐
        │                    │                    │
  ┌─────▼────┐         ┌─────▼────┐         ┌─────▼────┐
  │    CP    │         │    AP    │         │    CA    │
  │ Systems  │         │ Systems  │         │ Systems  │
  │          │         │          │         │          │
  │ MongoDB  │         │Cassandra │         │ Single   │
  │ZooKeeper │         │ DynamoDB │         │  node    │
  │ HBase    │         │ CouchDB  │         │ (not     │
  │          │         │          │         │  dist.)  │
  └──────────┘         └──────────┘         └──────────┘
Why P Is Non-Negotiable:
In a distributed system, network partitions will happen. Hardware fails, cables are cut, routers malfunction. A system that doesn't tolerate partitions will have periods of total unavailability.
Thus, the real choice is between CP (preserve consistency, sacrifice availability during partitions) and AP (preserve availability, accept temporary inconsistency during partitions).
CA systems don't exist in distributed contexts—they're effectively single-node systems.
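To make the partition-time trade-off concrete, the following toy sketch shows a two-replica register configured as either CP or AP: the CP variant refuses writes while partitioned, the AP variant accepts them and lets replicas diverge until they can reconcile. The classes are illustrative, not a real replication protocol.

# Toy two-replica register illustrating the CP/AP choice during a partition.
# Illustrative sketch only; not a real replication protocol.

class Replica:
    def __init__(self):
        self.value = None

class ReplicatedRegister:
    def __init__(self, mode):
        self.mode = mode                  # "CP" or "AP"
        self.replicas = [Replica(), Replica()]
        self.partitioned = False          # True = replicas cannot talk

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse the write (lose availability).
                raise RuntimeError("unavailable during partition")
            # Availability first: accept locally, replicas now diverge.
            self.replicas[0].value = value
            return "accepted (will reconcile later)"
        for r in self.replicas:
            r.value = value
        return "accepted"

    def read(self):
        # Under AP during a partition, clients may see different values
        # depending on which replica they reach.
        return [r.value for r in self.replicas]


reg = ReplicatedRegister(mode="AP")
reg.write("v1")
reg.partitioned = True
print(reg.write("v2"))    # accepted, but only replica 0 has v2
print(reg.read())         # ['v2', 'v1'] -> temporarily inconsistent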
Impossibility results don't mean distributed systems are impossible—they mean we must make conscious trade-offs:
• Accept occasional inconsistency (eventual consistency systems)
• Accept occasional unavailability (CP systems during partitions)
• Accept that consensus may sometimes take a long time (retry-based protocols)
Understanding what's theoretically impossible helps us choose practical approximations appropriate to our requirements.
Distributed systems require mechanisms to coordinate actions across nodes: mutual exclusion (ensuring only one node accesses a resource), leader election, and atomic commit (ensuring all-or-nothing transactions).
Distributed Mutual Exclusion:
When multiple nodes may access a shared resource, how do we ensure only one accesses it at a time?
Centralized Approach:
1. Designate one node as coordinator
2. Nodes request lock from coordinator
3. Coordinator grants if free, queues otherwise
4. On release, grant to next in queue
Advantages: Simple, fair
Disadvantages: Single point of failure, coordinator bottleneck
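A minimal sketch of the centralized approach, with message passing simulated by direct method calls; the Coordinator class and node names are illustrative.

# Centralized mutual exclusion: one coordinator serializes lock requests.
from collections import deque

class Coordinator:
    def __init__(self):
        self.holder = None
        self.queue = deque()

    def request(self, node):
        if self.holder is None:
            self.holder = node
            return "GRANT"            # resource was free
        self.queue.append(node)       # otherwise wait in FIFO order
        return "WAIT"

    def release(self, node):
        assert node == self.holder
        self.holder = self.queue.popleft() if self.queue else None
        return self.holder            # next node to be granted, if any


c = Coordinator()
print(c.request("A"))   # GRANT
print(c.request("B"))   # WAIT (queued)
print(c.release("A"))   # B now holds the lock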
Token Ring Approach:
1. Arrange nodes in logical ring
2. A token circulates around the ring
3. Node can enter critical section only when holding token
4. After use, pass token to next node
Advantages: No central coordinator, fair
Disadvantages: Token loss requires recovery, high latency if ring is large
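A small simulation of the token-ring approach (illustrative names; token loss and recovery are omitted).

# Token ring mutual exclusion: only the current token holder may enter
# its critical section; the token circulates around a logical ring.

class TokenRing:
    def __init__(self, nodes):
        self.nodes = nodes            # e.g. ["A", "B", "C"]
        self.token_at = 0             # index of the current token holder

    def holder(self):
        return self.nodes[self.token_at]

    def enter_critical_section(self, node):
        if node != self.holder():
            raise RuntimeError(f"{node} must wait for the token")
        print(f"{node} in critical section")

    def pass_token(self):
        # After using (or skipping) the token, hand it to the next node.
        self.token_at = (self.token_at + 1) % len(self.nodes)


ring = TokenRing(["A", "B", "C"])
ring.enter_critical_section("A")
ring.pass_token()
ring.enter_critical_section("B")   # B may enter only after A passes the token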
Ricart-Agrawala Algorithm:
1. To enter critical section, broadcast REQUEST to all nodes
2. Each receiving node sends REPLY immediately if:
- It is not requesting the critical section itself, OR
- The incoming request has an earlier (lower) timestamp than its own (the other request has priority)
Otherwise it defers its REPLY until it leaves the critical section
3. Wait for REPLY from all nodes
4. Enter critical section
5. On exit, send deferred REPLYs
Advantages: Fully distributed, no single point of failure
Disadvantages: High message overhead (2(n-1) messages per critical-section entry)
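The sketch below simulates the reply/defer decision at the heart of Ricart-Agrawala, using Lamport-style (timestamp, node id) pairs to order requests. Message delivery is collapsed into direct method calls, so it illustrates the logic rather than a networked implementation; class and method names are hypothetical.

# Simplified simulation of the Ricart-Agrawala reply/defer rule.

class RANode:
    def __init__(self, node_id):
        self.id = node_id
        self.clock = 0
        self.requesting = False
        self.my_request = None        # (timestamp, id) of our pending request
        self.deferred = []            # nodes we will reply to on exit

    def request_cs(self, peers):
        self.clock += 1
        self.requesting = True
        self.my_request = (self.clock, self.id)
        # Broadcast REQUEST; here replies are collected synchronously.
        replies = [p.on_request(self, self.my_request) for p in peers]
        return all(replies)           # may enter CS once every peer replied

    def on_request(self, sender, their_request):
        self.clock = max(self.clock, their_request[0]) + 1
        if not self.requesting or their_request < self.my_request:
            return True               # REPLY now: they have priority (or we don't care)
        self.deferred.append(sender)  # defer: our own request is earlier
        return False

    def release_cs(self):
        self.requesting = False
        for node in self.deferred:    # send the deferred REPLYs
            print(f"{self.id} -> deferred REPLY to {node.id}")
        self.deferred.clear()


a, b = RANode(1), RANode(2)
print(a.request_cs([b]))   # True: b is not requesting, so it replies at once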
Leader Election:
Many distributed algorithms require a designated leader (coordinator). Leader election algorithms ensure exactly one leader emerges, even after failures.
// Bully Algorithm (Garcia-Molina, 1982)
// Assumption: Each node has unique ID; higher ID = higher priority

when_detecting_leader_failure():
    start_election()

start_election():
    // Send ELECTION to all higher-ID nodes
    higher_nodes = nodes_with_higher_id_than(self.id)

    if higher_nodes is empty:
        // I have highest ID -> I am leader
        broadcast_to_all(COORDINATOR, self.id)
        become_leader()
    else:
        // Ask higher nodes to take over
        for node in higher_nodes:
            send(node, ELECTION)
        wait_for_response(timeout=T):
            if no_response:
                // Higher nodes dead, I win
                broadcast_to_all(COORDINATOR, self.id)
                become_leader()
            // else: wait for them to decide

on_receive_ELECTION(from_node):
    // A lower node is asking; take over
    send(from_node, OK)       // "I'm alive, stand down"
    start_election()          // I should try to become leader

on_receive_COORDINATOR(leader_id):
    accept_leader(leader_id)

For robust leader election and agreement in fault-tolerant systems, Paxos (Lamport) and Raft (Ongaro & Ousterhout) are industry standards:
• Paxos: Theoretically elegant, notoriously difficult to implement
• Raft: Designed for understandability, widely deployed (etcd, Consul)
Both ensure consensus despite minority failures, forming the foundation for distributed databases, coordination services, and configuration management.
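Both protocols ultimately rest on the same quorum observation: a value counts as committed only once a majority of nodes acknowledges it, and any two majorities overlap, so two conflicting values can never both be committed. A tiny illustrative sketch of that idea (not the protocols themselves):

# Majority-quorum commit check: the core idea shared by Paxos and Raft.

def majority(n):
    return n // 2 + 1

def try_commit(acks, cluster_size):
    """acks: set of node ids that acknowledged the proposal."""
    return len(acks) >= majority(cluster_size)

# 5-node cluster: 3 acknowledgements suffice, even if 2 nodes are down.
print(try_commit({"n1", "n2", "n3"}, cluster_size=5))   # True  -> committed
print(try_commit({"n1", "n2"}, cluster_size=5))         # False -> not committed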
File systems are the most visible manifestation of distributed OS—enabling access to files stored across many machines as if they were local. Each distributed file system makes different trade-offs in consistency, availability, and performance.
Design Dimensions:
| System | Era | Key Innovation | Consistency |
|---|---|---|---|
| NFS (Sun) | 1984 | Stateless protocol, VFS integration | Weak (cache + timeout) |
| AFS (CMU) | 1988 | Whole-file caching, callbacks | Session semantics |
| Coda (CMU) | 1990 | Disconnected operation, conflict resolution | Optimistic replication |
| GFS (Google) | 2003 | Massive files, append-only, relaxed consistency | Relaxed, defined record semantics |
| HDFS (Apache) | 2006 | GFS-inspired for Hadoop | Append-only, eventual |
| Ceph | 2006 | Object storage, CRUSH placement | Strong (RADOS) |
| GlusterFS | 2011 | No metadata server, hash-based location | Tunable |
Case Study: Google File System (GFS)
GFS revolutionized distributed storage by accepting trade-offs appropriate for Google's workloads:
Assumptions:
• Component failures are the norm, not the exception (thousands of commodity machines)
• Files are huge (multi-gigabyte and larger) and relatively few in number
• Workloads are dominated by large streaming reads and appends, not random overwrites
• High sustained bandwidth matters more than low latency
Architecture:
┌──────────────────────────────────────────────────────────────┐
│ GFS Master │
│ • File/chunk namespace │
│ • Chunk location mapping │
│ • Lease management │
│ (Single master, replicated for recovery) │
└─────────────────────────┬────────────────────────────────────┘
│ Metadata operations
┌─────────────────────────┼────────────────────────────────────┐
│ Client │ │
│ ┌──────────┐ ▼ │
│ │App │ ┌────────────┐ │
│ │Code │───▶│GFS Client │ │
│ └──────────┘ └─────┬──────┘ │
│ │ Data operations (direct) │
└────────────────────────┼─────────────────────────────────────┘
▼
┌──────────────────────────────────────────────────────────────┐
│ Chunk Servers (thousands) │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │64MB │ │64MB │ │64MB │ │64MB │ ... │
│ │Chunks │ │Chunks │ │Chunks │ │Chunks │ │
│ │(3x rep) │ │(3x rep) │ │(3x rep) │ │(3x rep) │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │
└──────────────────────────────────────────────────────────────┘
Relaxed Consistency: GFS doesn't provide POSIX semantics. Concurrent appends may result in record duplication or interleaving. Applications must handle this—an acceptable trade-off for massive scale.
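In practice, applications built on GFS commonly cope with duplicated or retried appends by embedding a unique ID in each record and filtering duplicates at read time. A simplified sketch of that pattern follows; the record format and function names are hypothetical.

# Deduplicating records after at-least-once appends.

def write_record(chunk, record_id, payload):
    # The appender may retry after a failure, producing a duplicate entry.
    chunk.append((record_id, payload))

def read_records(chunk):
    seen, result = set(), []
    for record_id, payload in chunk:
        if record_id in seen:
            continue                  # duplicate produced by a retried append
        seen.add(record_id)
        result.append(payload)
    return result

chunk = []
write_record(chunk, "r1", b"event-1")
write_record(chunk, "r1", b"event-1")     # retried append -> duplicate
write_record(chunk, "r2", b"event-2")
print(read_records(chunk))                # [b'event-1', b'event-2']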
GFS succeeded by explicitly rejecting assumptions that limited prior systems:
• Don't pretend failures are rare—embrace them
• Don't enforce POSIX—define semantics your apps actually need
• Don't worry about small files—optimize for your actual workload
• Don't fear centralized metadata—replicate for fault tolerance
This willingness to challenge established assumptions enabled unprecedented scale.
Several research systems pioneered distributed OS concepts, proving the vision achievable even if they never achieved mass adoption.
Amoeba (Vrije Universiteit, 1981-1996)
Vision: True distributed OS where users don't know which processor runs their code.
Architecture:
• Processor pool: a shared collection of CPUs allocated to jobs on demand, with no notion of a "home machine" for any user
• Workstations serve mainly as terminals, while specialized servers (file server, directory server) provide shared services
• A small microkernel runs on every machine; all other functionality lives in user-level server processes
Key Innovations:
Capabilities: every object (file, process, server port) is named and protected by a capability, giving uniform, location-independent access to resources anywhere in the system.
FLIP Protocol: Amoeba's custom network protocol (Fast Local Internet Protocol) optimized for LAN communication with low latency and multicast support.
Legacy: Amoeba demonstrated that true distributed transparency was achievable. While never production-deployed at scale, it influenced subsequent distributed systems research and trained many distributed systems researchers.
While true distributed operating systems never achieved mass adoption, their principles live on in modern cloud platforms and container orchestration systems.
Kubernetes as a Distributed OS
Kubernetes, though called a 'container orchestration system,' embodies many distributed OS concepts:
| Distributed OS Concept | Kubernetes Implementation |
|---|---|
| Process abstraction | Pods (group of containers as scheduling unit) |
| Process placement | Scheduler places pods on nodes automatically |
| Process migration | Pod eviction and rescheduling on failure |
| Resource limits | CPU/memory requests and limits per container |
| Service naming | DNS-based service discovery (kube-dns, CoreDNS) |
| Storage abstraction | PersistentVolumes, StorageClasses |
| Configuration | ConfigMaps, Secrets |
| Networking | Virtual network, Service mesh |
| Health monitoring | Liveness/readiness probes, restarts |
| Replication | ReplicaSets maintain desired instance count |
# Kubernetes Deployment - Location Transparent Service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-service
spec:
  replicas: 3          # Replication transparency: 3 instances
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.21
        resources:
          limits:
            cpu: "500m"
            memory: "128Mi"
---
# Service abstraction - Location transparency
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 80

# Clients access 'web-service:80'
# Kubernetes routes to healthy pod instances
# Client doesn't know which node serves the request

Academic distributed OS research sought a transparent single system image. Industry delivered something different but pragmatic:
• Cloud platforms (AWS, GCP, Azure) provide resource abstraction at the infrastructure level
• Kubernetes provides workload abstraction across clusters
• Service meshes (Istio, Linkerd) provide networking abstraction
• Distributed databases (Spanner, CockroachDB) provide storage abstraction
Rather than one unified distributed OS, we have layered abstractions—each providing transparency at its level. This evolved architecture arguably achieves distributed OS goals more practically than monolithic designs.
Modern Coordination Services:
Distributed systems require the coordination primitives that distributed OS research formalized. Modern systems rely on specialized coordination services such as ZooKeeper, etcd, and Consul, which provide distributed locks, leader election, and configuration storage built on a consensus protocol.
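For example, a distributed lock can be taken through ZooKeeper. The sketch below assumes the Python kazoo client library and a ZooKeeper ensemble reachable at the address shown; both the address and the lock path are illustrative assumptions.

# Sketch of distributed locking via a coordination service, assuming the
# kazoo client library and a running ZooKeeper ensemble (addresses below
# are placeholders).
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")   # hypothetical ensemble address
zk.start()

# ZooKeeper's lock recipe uses ephemeral, sequential znodes: the client that
# created the lowest-numbered node holds the lock, and a crashed holder's
# node disappears automatically, releasing the lock.
lock = zk.Lock("/locks/shared-resource", identifier="node-42")
with lock:
    # Critical section: at most one client cluster-wide executes this.
    print("holding the distributed lock")

zk.stop()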
We've explored the ambitious vision of distributed operating systems—making many computers appear as one unified system—and the fundamental challenges that ensure this goal can only ever be partially achieved.
What's Next:
We've explored systems that scale to many machines. Next, we'll examine Embedded Operating Systems—the OS domain where the challenge isn't scale or distribution, but extreme resource constraints: tiny memory, limited CPU, power budgets, and physical constraints that desktop computing never faces.
You now understand Distributed Operating Systems—the ambitious attempt to unify multiple computers into one coherent system. You can explain transparency goals, CAP trade-offs, distributed coordination, and see how modern cloud systems embody these principles. Next, we'll explore embedded systems where constraints are physical, not just theoretical.