Leader Election - Learning Module

Loading content...

0/273

The Ring Algorithm

Election by Circulation

Imagine a group of people sitting in a circle, passing a note around. Each person writes their name on the note if they want to be leader, and when the note comes back to its originator, the person with the 'highest' name on the list wins. This orderly, sequential approach is the essence of the Ring Algorithm.

Proposed by Chang and Roberts in 1979, the Ring Algorithm takes a fundamentally different approach from the Bully Algorithm. Instead of broadcasting challenges to all higher-priority nodes simultaneously, the Ring Algorithm arranges nodes in a logical ring topology and passes election messages sequentially around the ring. Each node contributes its candidacy, and after a complete circuit, the winner is determined.

This sequential approach trades latency for simplicity and reduced message burst. Where Bully floods the network with O(n²) messages in parallel, the Ring Algorithm sends O(n) messages in sequence.

What You Will Learn

By the end of this page, you will understand how the Ring Algorithm operates, the concept of logical ring topology, its message and time complexity characteristics, how it handles node failures, its strengths and limitations compared to Bully, and the practical scenarios where ring-based election might be appropriate.

Understanding Logical Ring Topology

Before diving into the algorithm, we must understand the logical ring topology that underpins it. This is a crucial concept that distinguishes the Ring Algorithm from other approaches.

Physical vs Logical Topology:

The physical network connecting nodes might be any structure—fully connected, mesh, star, or even the internet. The logical ring is an overlay abstraction where nodes are arranged in a conceptual circle:

Each node knows its successor (next node in the ring)
Messages flow in one direction around the ring
The ring is unidirectional: node A sends to B, B sends to C, C sends to A
Any node can initiate communication around the ring

Constructing the logical ring:

The logical ring is typically constructed by ordering nodes by their identifiers and connecting them cyclically:

Sort all node identifiers: [1, 3, 5, 7, 9]
Connect sequentially: 1 → 3 → 5 → 7 → 9 → 1
Each node stores only its successor's address

Example ring with 5 nodes:

        Node 1
       ↗      ↘
     Node 9    Node 3
       ↑          ↓
     Node 7 ← Node 5

Messages always flow clockwise (in this illustration). Node 3's successor is Node 5; Node 9's successor is Node 1 (wrapping around).

Logical Doesn't Mean Physical

The logical ring doesn't require a physical ring network. Nodes 1 and 3 might be in different datacenters connected over the internet. The 'ring' is purely a software abstraction—a way of organizing communication patterns. This abstraction simplifies the algorithm but introduces challenges when nodes fail (breaking the ring).

Why use a ring topology?

The ring provides several algorithmic benefits:

Bounded message paths: Messages travel at most n hops (once around the ring)
Simple routing: Each node needs to know only its successor, not all nodes
Natural termination: An election completes when the message returns to its origin
Space efficiency: O(1) routing information per node vs O(n) for full knowledge

However, rings also have drawbacks: a single node failure breaks the ring, and sequential message passing increases latency compared to parallel communication.

Algorithm Operation

The Ring Algorithm uses two types of messages:

ELECTION — Carries a list of candidates (initially just the initiator)
COORDINATOR — Announces the elected leader (sent after election completes)

Election initiation:

When a node P detects that the leader has failed (via heartbeat timeout), it initiates an election:

P marks itself as participating in an election
P creates an ELECTION message containing its own identifier: [P]
P sends this message to its successor in the ring

ELECTION message handling:

When a node Q receives an ELECTION message:

Check if Q is already in the candidate list:
- If YES: The message has completed a full circuit. Q determines the winner (highest ID in the list) and initiates the COORDINATOR phase.
- If NO: Continue to step 2.
Add Q to the candidate list:
- Q appends its own identifier to the list: [..., Q]
- Q marks itself as participating
Forward the message:
- Q sends the updated ELECTION message to its successor

COORDINATOR message handling:

Once election completes (message returns to originator):

The winner is the node with the highest identifier in the candidate list
The detecting node sends a COORDINATOR message around the ring announcing the winner
Each node receives COORDINATOR, notes the new leader, and forwards the message
When COORDINATOR returns to its originator, election is complete

Ring Algorithm Message Types
Message	Purpose	Contents	Termination
ELECTION	Collect candidates	List of node IDs	Returns to originator
COORDINATOR	Announce winner	Elected leader ID	Returns to originator

Variant: Single-Pass Election

An optimized variant combines ELECTION and COORDINATOR into a single circuit. Instead of collecting all candidates and then announcing, each node immediately recognizes if it cannot win (its ID is lower than the current max) and simply forwards. When the message returns, the winner is already known. This saves n messages but complicates concurrent election handling.

Step-by-Step Example

Let's trace through a complete election example. Consider a ring of 5 nodes with identifiers 2, 4, 6, 8, 10 arranged as: 2 → 4 → 6 → 8 → 10 → 2.

Initial state:

Node 10 is the current leader
Ring order: 2 → 4 → 6 → 8 → 10 → (back to 2)
Node 10 crashes

Ring Algorithm Execution Trace
Step	Actor	Action	Message State
1	Node 4	Detects leader failure, initiates election	ELECTION [4] → Node 6
2	Node 6	Receives ELECTION, adds self	ELECTION [4, 6] → Node 8
3	Node 8	Receives ELECTION, adds self	ELECTION [4, 6, 8] → Node 10
4	—	Node 10 is down, message times out to Node 8	Ring broken! (see failure handling)
4b	Node 8	Skips Node 10, sends to Node 2	ELECTION [4, 6, 8] → Node 2
5	Node 2	Receives ELECTION, adds self	ELECTION [4, 6, 8, 2] → Node 4
6	Node 4	Sees self in list, election complete!	Winner: max([4, 6, 8, 2]) = 8
7	Node 4	Sends COORDINATOR announcing Node 8	COORDINATOR (8) → Node 6
8-10	Nodes 6,8,2	Forward COORDINATOR around ring	All nodes know Node 8 is leader
11	Node 4	COORDINATOR returns, election done	Node 8 is now the leader

Key observations:

Sequential processing: Unlike Bully's parallel broadcast, Ring sends one message at a time around the ring. This is slower but more orderly.
Candidate accumulation: The ELECTION message grows as it travels, accumulating all participating nodes' identifiers.
Natural termination: The election ends when the message returns to its originator—simple and elegant.
Failure handling required: When Node 10 (crashed) couldn't receive the message, the ring was broken. Step 4b shows failure handling—Node 8 must detect the failure and skip to the next live node.
Two complete circuits: ELECTION goes around once, then COORDINATOR goes around once. Total: 2n messages for a successful election (in the basic algorithm).

The Broken Ring Problem

A crashed node breaks the ring. If Node 8's successor (Node 10) is down, Node 8 cannot simply forward the message. The algorithm must handle this by maintaining backup successor information or dynamically discovering the next live node. This adds significant complexity to the 'simple' ring approach.

Complexity Analysis

The Ring Algorithm's complexity characteristics differ significantly from the Bully Algorithm. Let's analyze both message and time complexity.

Message complexity:

Best case (single initiator, no failures):

ELECTION phase: n messages (one full circuit)
COORDINATOR phase: n messages (one full circuit)
Total: 2n = O(n) messages

This is a dramatic improvement over Bully's O(n²) worst case.

Worst case (all nodes initiate simultaneously):

Each of n nodes starts its own ELECTION message
Each ELECTION travels n hops: n² messages
If we suppress redundant COORDINATOR phases: add n more
Total: O(n²) messages

However, this worst case is less likely than Bully's because:

Only one node typically detects failure first (they timeout at different times)
Nodes that receive an ELECTION message can refrain from starting their own
Message suppression can eliminate redundant circuits

Complexity Comparison: Ring vs Bully
Metric	Ring Algorithm	Bully Algorithm
Best-case messages	O(n)	O(n)
Worst-case messages	O(n²)	O(n²)
Typical-case messages	O(n)	O(n²)
Message path length	Long (n hops)	Short (1 hop)
Time complexity	O(n × d)	O(n × t)
Network burst	Low (sequential)	High (parallel)

Time complexity:

The Ring Algorithm's time complexity is its significant weakness:

Each message hop incurs network delay d
Complete circuit: n × d
Two circuits (ELECTION + COORDINATOR): 2n × d
Total election time: O(n × d)

If n = 10 and d = 10ms, the election takes approximately 200ms. For n = 100 and d = 10ms, it's 2 seconds. Compare this to Bully, where (in the best case) election completes in a single timeout period regardless of cluster size.

This latency characteristic makes Ring unsuitable for systems requiring fast failover.

Sequential vs Parallel Trade-off

The Ring Algorithm trades network burst for latency. By sending messages sequentially rather than in parallel, Ring reduces instantaneous network load but increases total election time. This is beneficial in bandwidth-constrained environments but problematic when fast failover is critical.

Handling Node Failures

Node failures in the Ring Algorithm are particularly problematic because they break the ring structure. A single failed node stops message circulation unless the algorithm handles it. Let's examine failure scenarios and solutions.

Failure Scenarios

•Successor failure — A node's immediate successor crashes. The sending node cannot forward messages and the ring is broken. This is the most common failure mode.
•Multiple consecutive failures — If nodes A → B → C → D and both B and C fail, A cannot reach D through normal ring forwarding. Requires knowledge beyond immediate successor.
•Failure during message transit — A node receives but crashes before forwarding. The message is lost. Sender doesn't know if message was received.
•Initiator failure — The node that started the election crashes. Other nodes continue the election but no one recognizes the termination condition (message return to originator).

Solution 1: Successor lists

Each node maintains a list of its next k successors rather than just one. If the immediate successor fails:

Node detects failure (timeout on acknowledgment)
Node sends to second successor
If second fails, try third, and so on
Eventually find a live successor or determine ring is too broken

Example with k=3:

Node 4's successors: [6, 8, 10]
Node 6 fails → send to 8
Node 8 also fails → send to 10

This handles up to k-1 consecutive failures.

Solution 2: Ring repair

When a failure is detected, repair the ring before continuing:

Node A detects successor B has failed
A queries known nodes: 'Who is B's successor?'
A connects to B's successor C, bypassing B
Ring is repaired: A → C

This is more complex but maintains a clean ring structure.

Solution 3: Timeout and restart

The simplest (but slowest) approach:

If forwarding fails, wait for timeout
Rebuild ring from scratch (re-query group membership)
Restart election on new ring

This handles all failure scenarios but may take significant time.

Ring Fragility

The ring structure is inherently fragile. Any single node failure requires special handling. In contrast, algorithms like Raft tolerate minority failures naturally through quorum voting. This fragility is a significant drawback of ring-based election in production environments.

Handling Concurrent Elections

When multiple nodes detect the leader failure simultaneously, they may all initiate elections. Unlike Bully's cascading takeover, Ring can have multiple ELECTION messages circulating the ring at once. This creates complexity that must be handled correctly.

Scenario: Three nodes start elections simultaneously

Nodes 2, 6, and 8 all detect failure and initiate elections:

Message A from Node 2: [2]
Message B from Node 6: [6]
Message C from Node 8: [8]

All three messages travel around the ring. Without coordination:

Message A arrives at Node 2 with candidates [2, 4, 6, 8, 10]
Message B arrives at Node 6 with candidates [6, 8, 10, 2, 4]
Message C arrives at Node 8 with candidates [8, 10, 2, 4, 6]

All three have the same candidates (different order), and all three compute the same winner (10, if 10 hadn't crashed). But now three COORDINATOR messages are sent!

Concurrent Election Solutions

•Message merging — When a node receives an ELECTION message while already participating in an election, merge the candidate lists and continue with a single message. Suppresses redundant elections.
•Initiator priority — Attach the initiator's ID to the message. If a node receives an ELECTION from a lower-priority initiator while participating in a higher-priority election, discard the duplicate. Only the highest-priority initiator's election completes.
•Election IDs — Each election has a unique ID (timestamp + initiator). Nodes track which elections they've participated in and ignore duplicates. The first election to complete wins.
•Single COORDINATOR — Even if multiple ELECTION messages complete, only send COORDINATOR if no COORDINATOR has been seen yet. Subsequent messages detect existing COORDINATOR and stop.

The Chang-Roberts optimization:

The original Chang-Roberts algorithm includes an optimization for concurrent elections:

When a node receives an ELECTION message with candidates:
- If the node's ID is greater than all candidates, add self and forward
- If the node's ID is less than the max candidate, don't add self (already a better candidate)
- If the node sees its own ID and it's the max, it has won
This way, lower-priority candidates drop out as the message travels. By the time the message returns, only the winner's ID matters.
Concurrent elections naturally merge: the message carrying the highest ID 'wins' and absorbs other elections.

Election Convergence

Regardless of how many elections start concurrently, they all compute the same winner (the highest-priority live node). The key is ensuring the system converges to a single COORDINATOR announcement. With proper duplicate suppression, concurrent elections add overhead but don't cause correctness issues.

Ring vs Bully: A Detailed Comparison

Having now studied both the Bully and Ring algorithms, we can make an informed comparison. Each algorithm has distinct characteristics that make it suitable for different scenarios.

Ring vs Bully Algorithm Comparison
Characteristic	Ring Algorithm	Bully Algorithm
Topology requirement	Logical ring structure	Complete knowledge of all nodes
Typical message complexity	O(n) – single circuit	O(n²) – all-to-higher broadcast
Election time	O(n × latency) – sequential	O(timeout) – parallel
Network load pattern	Steady, sequential	Burst, parallel
Single node failure impact	Breaks ring – must handle	No structural impact
Routing state per node	O(1) – just successor	O(n) – all node addresses
Implementation complexity	Moderate	Simple
Partition handling	Poor – ring breaks	Poor – same limitations

Choose Ring When

•Cluster is large (reduces message count)
•Network bandwidth is limited
•Nodes naturally form a ring (DHT, etc.)
•Election latency is not critical
•Node failure rate is low
•Memory on each node is constrained

Choose Bully When

•Cluster is small (< 15 nodes)
•Network bandwidth is plentiful
•Fast failover is critical
•Nodes don't form natural ring
•Implementation simplicity matters
•Memory is not a constraint

Both Are Historical

In practice, neither Ring nor Bully is used in modern production systems. Both lack proper partition handling, and their complexity doesn't justify their limitations. Modern systems use Raft, Paxos, or external coordination services. Ring and Bully remain valuable for understanding the problem space and why more sophisticated solutions evolved.

Summary: The Ring Algorithm

We've thoroughly explored the Ring Algorithm—its elegant sequential approach, topology requirements, and practical limitations. Let's consolidate the key takeaways:

Key Takeaways

•Ring topology enables sequential election — Nodes arranged in a logical ring pass ELECTION messages around, accumulating candidates until returning to the originator.
•O(n) typical message complexity — Dramatically better than Bully's O(n²), but O(n) time complexity makes elections slow in large clusters.
•Node failures break the ring — The sequential structure is fragile; each failure requires special handling through successor lists or ring repair.
•Natural termination — Elections complete when messages return to their origin, providing a simple termination condition.
•Concurrent elections require handling — Multiple simultaneous elections must be merged or suppressed to avoid redundant COORDINATOR announcements.
•Low network burst — Sequential messaging reduces instantaneous network load, beneficial in bandwidth-constrained environments.

What's next:

We'll explore Lease-Based Election, a fundamentally different approach that doesn't require explicit election voting at all. Instead of nodes voting for a leader, a leader holds a time-limited lease that grants exclusive leadership authority. This approach elegantly handles network partitions and provides bounded uncertainty—properties that Bully and Ring lack.

Page Complete

You now understand the Ring Algorithm's operation, complexity characteristics, failure handling, and comparison with Bully. While not used in modern production systems, Ring provides valuable insight into structured coordination patterns. Next, we'll examine lease-based election and its elegant approach to bounded leadership.