Data Structures & AlgorithmsSkip Lists (Conceptual)

Skip Lists — Probabilistic Alternative to Balanced Trees

LevelAdvanced

Duration60 mins

TopicSkip Lists (Conceptual)

2 / 4

Multiple Levels of Linked Lists

Building the Express Lane System

In the previous page, we discovered that randomized level assignment can achieve balanced structure without explicit rebalancing. But how exactly does this translate into a working data structure? What does a skip list actually look like, and how do the pieces fit together?

The answer lies in understanding skip lists as a tower of sorted linked lists—multiple levels where higher levels provide increasingly sparse "express lanes" for rapid navigation, while the bottom level contains all elements for completeness.

This multi-layered architecture is the heart of skip lists. It's what transforms the abstract idea of probabilistic balancing into a concrete, implementable data structure with elegant operations.

What You Will Learn

By the end of this page, you will understand the complete structural anatomy of skip lists: how nodes contain multiple forward pointers, how levels connect horizontally, how the sentinel header node anchors everything, and how traversal uses these structures to achieve efficient search. You'll be able to visualize and draw skip lists, which is essential for implementing them.

Skip List Node Architecture

A skip list node is fundamentally different from a regular linked list node. Rather than having a single next pointer, a skip list node has an array of forward pointers—one for each level it participates in.

Conceptual Node Structure:

+---------------------+
|       value         |    The stored key/value
+---------------------+
| forward[maxLevel]   |    Pointer to next node at highest level
| forward[maxLevel-1] |    Pointer to next node at level maxLevel-1
|        ...          |              ...
| forward[2]          |    Pointer to next node at level 2
| forward[1]          |    Pointer to next node at level 1 (bottom)
+---------------------+

The level of a node determines how many forward pointers it has. A level-3 node has 3 forward pointers (for levels 1, 2, and 3), while a level-1 node has only 1 (for level 1).

skip_list_node.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
"""
Skip List Node Implementation
 
Each node contains a value and an array of forward pointers,
one for each level the node participates in.
"""
from typing import TypeVar, Generic, Optional, List
 
K = TypeVar('K')  # Key type (must be comparable)
V = TypeVar('V')  # Value type
 
 
class SkipListNode(Generic[K, V]):
    """
    A node in the skip list data structure.
    
    Attributes:
        key: The search key for this node
        value: The associated value (for key-value skips lists)
        forward: Array of forward pointers, one per level
        level: The height of this node (number of forward pointers)
    
    Memory layout visualization:
    
    Level 3:  forward[2] ──────────────────────────────→ (next at L3)
    Level 2:  forward[1] ──────────→ (next at L2)
    Level 1:  forward[0] ──→ (next at L1)
              ┌─────────┐
              │   key   │
              │  value  │
              └─────────┘
    """
    
    __slots__ = ('key', 'value', 'forward')  # Memory optimization
    
    def __init__(self, key: K, value: V, level: int):
        """
        Create a new skip list node with the specified level.
        
        Args:
            key: The search key
            value: The associated value
            level: How many levels this node participates in (1-indexed)
                   A level-k node appears in levels 1 through k
        
        The forward array has 'level' slots, indexed 0 to level-1,
        where forward[i] points to the next node at level i+1.
        
        Example for level=3:
            forward[0] → next node at level 1 (bottom level, all nodes)
            forward[1] → next node at level 2 (sparse, ~half of nodes)
            forward[2] → next node at level 3 (sparser, ~quarter of nodes)
        """
        self.key = key
        self.value = value
        # Initialize all forward pointers to None
        # forward[i] corresponds to level (i+1)
        self.forward: List[Optional['SkipListNode[K, V]']] = [None] * level
    
    @property
    def level(self) -> int:
        """Return the height (level) of this node."""
        return len(self.forward)
    
    def get_forward(self, level: int) -> Optional['SkipListNode[K, V]']:
        """
        Get the forward pointer at the specified level.
        
        Args:
            level: 1-indexed level number
        
        Returns:
            The next node at this level, or None if at end
        """
        if level < 1 or level > self.level:
            return None
        return self.forward[level - 1]  # Convert to 0-indexed
    
    def set_forward(self, level: int, node: Optional['SkipListNode[K, V]']):
        """
        Set the forward pointer at the specified level.
        
        Args:
            level: 1-indexed level number
            node: The node to point to (or None for end)
        """
        if 1 <= level <= self.level:
            self.forward[level - 1] = node
    
    def __repr__(self) -> str:
        fwd_info = ", ".join(
            f"L{i+1}→{n.key if n else '∅'}" 
            for i, n in enumerate(self.forward)
        )
        return f"Node({self.key}, level={self.level}, [{fwd_info}])"
 
 
# Demonstration: Creating and connecting nodes
def demonstrate_node_structure():
    """
    Show how skip list nodes connect at multiple levels.
    
    Building this structure:
    
    Level 3:  HEAD ─────────────────── 19 ─────────────────── NULL
                │                       │
    Level 2:  HEAD ────── 7 ────────── 19 ────── 31 ────────── NULL
                │         │             │         │
    Level 1:  HEAD ─ 3 ── 7 ── 12 ──── 19 ─ 24 ─ 31 ── 42 ──── NULL
    """
    # Create nodes with their randomly assigned levels
    node_3 = SkipListNode(3, "three", level=1)    # Only level 1
    node_7 = SkipListNode(7, "seven", level=2)    # Levels 1-2
    node_12 = SkipListNode(12, "twelve", level=1) # Only level 1
    node_19 = SkipListNode(19, "nineteen", level=3) # Levels 1-3
    node_24 = SkipListNode(24, "twenty-four", level=1) # Only level 1
    node_31 = SkipListNode(31, "thirty-one", level=2) # Levels 1-2
    node_42 = SkipListNode(42, "forty-two", level=1) # Only level 1
    
    # Connect at Level 1 (all nodes participate)
    node_3.set_forward(1, node_7)
    node_7.set_forward(1, node_12)
    node_12.set_forward(1, node_19)
    node_19.set_forward(1, node_24)
    node_24.set_forward(1, node_31)
    node_31.set_forward(1, node_42)
    # node_42.forward[0] remains None (end of list)
    
    # Connect at Level 2 (nodes 7, 19, 31 participate)
    node_7.set_forward(2, node_19)
    node_19.set_forward(2, node_31)
    # node_31.forward[1] remains None (end at level 2)
    
    # Connect at Level 3 (only node 19 participates)
    # node_19.forward[2] remains None (end at level 3)
    
    # Display the structure
    print("Skip List Node Structure:")
    print("=" * 60)
    for node in [node_3, node_7, node_12, node_19, node_24, node_31, node_42]:
        print(node)
    
    # Trace a search path
    print("\nSearch path for key 31:")
    print("-" * 40)
    print("Starting at Level 3...")
    print(f"  At node 19 (Level 3): 19 < 31, but forward[2] is None")
    print(f"  Drop to Level 2 at node 19")
    print(f"  At node 19 (Level 2): forward → 31")
    print(f"  31 == 31: Found!")

Memory Efficiency Note

Notice that node levels are determined at creation time and never change. This means we allocate exactly the right number of forward pointers—no more, no less. With p=0.5, the average node has 2 forward pointers (geometric series: 1 + 0.5 + 0.25 + ... = 2), so skip lists use about 2n pointers total, compared to 2n for a doubly-linked list.

Level Organization and Structure

Understanding how levels are organized is crucial for implementing skip lists correctly. Let's establish the key structural properties.

Level Indexing Conventions:

There are two common conventions for numbering levels:

1-indexed (Pugh's original): Levels numbered 1 to maxLevel, with level 1 being the bottom (densest) level.
0-indexed (common in implementations): Levels numbered 0 to maxLevel-1, with level 0 being the bottom.

We'll use 1-indexed levels conceptually (for clarity) but 0-indexed arrays in code. The key insight is:

Level 1 always contains all elements. Higher levels contain successively sparser subsets.

Visual Representation of a Skip List:

Consider a skip list with 8 elements and randomly assigned levels:

                    ┌───────────────────────────────────────────────────────┐
                    │                    SKIP LIST                          │
                    │            (max level = 4, n = 8)                     │
                    └───────────────────────────────────────────────────────┘

Level 4:  [HEAD] ─────────────────── [25] ─────────────────────────────────── NULL
             │                         │
Level 3:  [HEAD] ───── [10] ───────── [25] ─────────────────── [50] ───────── NULL
             │          │               │                        │
Level 2:  [HEAD] ───── [10] ── [17] ── [25] ── [30] ────────── [50] ── [75] ─ NULL
             │          │       │       │       │                │       │
Level 1:  [HEAD] ─ [5] [10] ── [17] ── [25] ── [30] ── [44] ── [50] ── [75] ─ NULL
                    │    │       │       │       │       │       │       │
                    ▼    ▼       ▼       ▼       ▼       ▼       ▼       ▼
                  (All elements are in level 1 - the complete sorted list)

Structural Invariants:

Sorted Order: At every level, elements are in strictly increasing order.
Subset Property: Level k+1 is a subset of level k. Every element at a higher level also appears at all lower levels.
Header Participation: The header node has pointers for all potential levels (up to max level).
Independent Termination: Each level terminates independently with NULL pointers.

Level Participation in Example Skip List
Element	Levels Present	Level Height	Forward Pointers
5	1	1	[1→10]
10	1, 2, 3	3	[1→17, 2→17, 3→25]
17	1, 2	2	[1→25, 2→25]
25	1, 2, 3, 4	4	[1→30, 2→30, 3→50, 4→NULL]
30	1, 2	2	[1→44, 2→50]
44	1	1	[1→50]
50	1, 2, 3	3	[1→75, 2→75, 3→NULL]
75	1, 2	2	[1→NULL, 2→NULL]

The Tower Analogy

Each element can be visualized as a "tower" of varying height. A level-3 element is like a 3-story tower that participates in express lanes at floors 1, 2, and 3. The skyline of these towers forms the skip list structure.

The Header (Sentinel) Node

The header node is a special sentinel that anchors the entire skip list. It has several unique properties that distinguish it from regular nodes.

Header Node Characteristics:

Maximum Height: The header has forward pointers for the maximum possible level, even if no data nodes reach that height. This ensures we can always start searches at the top.
No Key/Value: The header doesn't store actual data—it's purely structural. Its "key" is conceptually -∞ (negative infinity), meaning all real keys are greater.
Entry Point: All operations (search, insert, delete) begin at the header node.
Tracks Current Level: The skip list maintains the current highest level that actually contains nodes (not just the header). This optimization avoids traversing empty levels.

skip_list_structure.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
"""
Skip List Core Structure with Header Node
 
The header node is the anchor point for all skip list operations.
It maintains forward pointers for all possible levels.
"""
from typing import TypeVar, Generic, Optional, List, Tuple
import random
 
K = TypeVar('K')
V = TypeVar('V')
 
 
class SkipListNode(Generic[K, V]):
    """Node in the skip list with level-indexed forward pointers."""
    
    __slots__ = ('key', 'value', 'forward')
    
    def __init__(self, key: Optional[K], value: Optional[V], level: int):
        self.key = key
        self.value = value
        self.forward: List[Optional['SkipListNode[K, V]']] = [None] * level
 
 
class SkipList(Generic[K, V]):
    """
    Skip List data structure with probabilistic balancing.
    
    Structure:
    - header: Sentinel node with max_level forward pointers
    - level: Current maximum level actually in use (not including header)
    - max_level: Theoretical maximum level (typically log₂(n_expected))
    - probability: Promotion probability (typically 0.5)
    
    Visualization:
    
    ┌──────────────────────────────────────────────────────────────┐
    │  SkipList                                                    │
    │  ┌─────────┐                                                 │
    │  │ HEADER  │  ← Entry point (key = -∞ conceptually)          │
    │  │ level=4 │                                                 │
    │  └────┬────┘                                                 │
    │       │                                                      │
    │  Level 4: [H]────────────────────[25]────────────────── NULL │
    │           │                       │                          │
    │  Level 3: [H]─────[10]───────────[25]──────────[50]──── NULL │
    │           │        │              │              │           │
    │  Level 2: [H]─────[10]──[17]────[25]──[30]────[50]──[75] NULL│
    │           │        │     │       │     │       │     │       │
    │  Level 1: [H]─[5]─[10]──[17]────[25]──[30]─[44][50]──[75] NULL│
    └──────────────────────────────────────────────────────────────┘
    """
    
    DEFAULT_MAX_LEVEL = 16  # Supports ~65,536 elements efficiently
    DEFAULT_PROBABILITY = 0.5
    
    def __init__(
        self, 
        max_level: int = DEFAULT_MAX_LEVEL,
        probability: float = DEFAULT_PROBABILITY
    ):
        """
        Initialize an empty skip list.
        
        Args:
            max_level: Maximum number of levels (default 16)
            probability: Chance of promoting to next level (default 0.5)
        
        The header node is created with max_level forward pointers,
        all initially pointing to None.
        """
        self.max_level = max_level
        self.probability = probability
        
        # Current highest level with actual nodes (0 = empty)
        self.level = 0
        
        # Number of elements (excludes header)
        self.size = 0
        
        # Header node: special sentinel with no key, max height
        # Conceptually stores key = -∞
        self.header = SkipListNode[K, V](
            key=None,  # Sentinel, no actual key
            value=None,
            level=max_level
        )
    
    def _random_level(self) -> int:
        """
        Generate a random level for a new node.
        
        Uses geometric distribution: P(level = k) = p^(k-1) * (1-p)
        Expected level = 1 / (1 - p) = 2 for p = 0.5
        
        Returns:
            Level from 1 to max_level
        """
        level = 1
        while random.random() < self.probability and level < self.max_level:
            level += 1
        return level
    
    def is_empty(self) -> bool:
        """Check if the skip list contains no elements."""
        return self.size == 0
    
    def __len__(self) -> int:
        """Return the number of elements in the skip list."""
        return self.size
    
    def _display(self) -> str:
        """
        Generate a visual representation of the skip list.
        Useful for debugging and understanding structure.
        """
        if self.is_empty():
            return "Empty SkipList"
        
        lines = []
        lines.append(f"SkipList (size={self.size}, level={self.level})")
        lines.append("=" * 60)
        
        # Collect all nodes
        nodes = []
        current = self.header.forward[0]
        while current:
            nodes.append(current)
            current = current.forward[0]
        
        # Print each level from top to bottom
        for lv in range(self.level, 0, -1):
            line = f"L{lv}: [H]"
            current = self.header.forward[lv - 1]
            for node in nodes:
                if node.key is not None and len(node.forward) >= lv:
                    # Check if this node is at this level
                    if current == node:
                        line += f"──[{node.key}]"
                        current = node.forward[lv - 1]
                    else:
                        line += "─" * (len(str(node.key)) + 4)
                else:
                    line += "─" * (len(str(node.key)) + 4)
            line += "── NULL"
            lines.append(line)
        
        return "\n".join(lines)
 
 
# Demonstration
def demonstrate_header():
    """Show the header node and initial skip list structure."""
    sl = SkipList[int, str](max_level=6)
    
    print("Empty Skip List Structure:")
    print("=" * 50)
    print(f"Max Level: {sl.max_level}")
    print(f"Current Level: {sl.level}")
    print(f"Size: {sl.size}")
    print(f"Header forward pointers: {len(sl.header.forward)}")
    print(f"All header forwards are None: {all(p is None for p in sl.header.forward)}")
    
    print("\nHeader node visualization:")
    print("┌───────────┐")
    print("│  HEADER   │")
    print("├───────────┤")
    for i in range(sl.max_level - 1, -1, -1):
        print(f"│ forward[{i}]│──── NULL")
    print("└───────────┘")
    print("(All levels ready, waiting for data)")

Max Level Selection

The max_level should be chosen based on expected data size. A common formula is maxLevel = ⌈log_{1/p}(n)⌉. For p=0.5 and n=65,536, max level = 16 suffices. For n=1,000,000, use max level = 20. Under-sizing wastes performance; over-sizing wastes (minimal) memory.

Navigating the Multi-Level Structure

The true elegance of skip lists emerges when we understand navigation. The search algorithm exploits the multi-level structure to achieve O(log n) expected time through a simple pattern:

Go right as far as possible at the current level, then drop down.

This pattern is deceptively simple but incredibly effective. Let's trace it step by step.

Search Algorithm Walkthrough:

To search for key k, starting at the header:

1. Set current = header, level = current_max_level

2. While level >= 1:
   a. While current.forward[level] has key < k:
      - Move right: current = current.forward[level]
   b. Drop down: level = level - 1

3. Check current.forward[0]:
   - If key == k: Found!
   - Otherwise: Not found

Visual Trace: Searching for 44

Searching for key 44 in:

Level 4:  [H] ─────────────────── [25] ─────────────────────────────────── NULL
Level 3:  [H] ───── [10] ───────── [25] ─────────────────── [50] ───────── NULL
Level 2:  [H] ───── [10] ── [17] ── [25] ── [30] ────────── [50] ── [75] ─ NULL
Level 1:  [H] ─ [5] [10] ── [17] ── [25] ── [30] ── [44] ── [50] ── [75] ─ NULL

Step 1: Start at header, level 4
        → forward[4] = 25, and 25 < 44, so move right to 25
        
Step 2: At node 25, level 4
        → forward[4] = NULL, can't go right
        → Drop to level 3
        
Step 3: At node 25, level 3
        → forward[3] = 50, but 50 > 44, can't go right
        → Drop to level 2
        
Step 4: At node 25, level 2
        → forward[2] = 30, and 30 < 44, so move right to 30
        
Step 5: At node 30, level 2
        → forward[2] = 50, but 50 > 44, can't go right
        → Drop to level 1
        
Step 6: At node 30, level 1
        → forward[1] = 44, and 44 == 44, Found!

Total comparisons: 6 (compared 25, NULL, 50, 30, 50, 44)

skip_list_search.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
def search(self, key: K) -> Tuple[bool, Optional[V], List[str]]:
    """
    Search for a key in the skip list.
    
    The search algorithm:
    1. Start at the header node at the highest level
    2. At each level, move right while the next key is less than target
    3. When we can't move right, drop down one level
    4. Repeat until we reach level 1
    5. Check if the next node at level 1 has our key
    
    Args:
        key: The key to search for
    
    Returns:
        Tuple of (found: bool, value: Optional[V], trace: List[str])
        The trace shows the path taken for debugging/visualization
    
    Time Complexity: O(log n) expected
    Space Complexity: O(1)
    """
    trace = []
    current = self.header
    
    # Start at the highest level that contains nodes
    trace.append(f"Starting search for key {key}")
    trace.append(f"Current level: {self.level}")
    
    # Traverse from top level down to level 1
    for lv in range(self.level, 0, -1):
        trace.append(f"Level {lv}: at node {current.key or 'HEADER'}")
        
        # Move right while next node's key is less than target
        while current.forward[lv - 1] is not None and \
              current.forward[lv - 1].key < key:
            current = current.forward[lv - 1]
            trace.append(f"  → moved right to {current.key}")
        
        # Log why we stopped
        if current.forward[lv - 1] is None:
            trace.append(f"  → reached end of level, dropping down")
        else:
            trace.append(f"  → next is {current.forward[lv - 1].key} >= {key}, dropping down")
    
    # After the loop, current.forward[0] is either:
    # - The node with our key (if it exists)
    # - The node with the smallest key greater than ours
    # - None if our key would be the largest
    
    next_node = current.forward[0]
    
    if next_node is not None and next_node.key == key:
        trace.append(f"Found {key} with value {next_node.value}")
        return True, next_node.value, trace
    else:
        trace.append(f"Key {key} not found")
        return False, None, trace
 
 
# Example usage with visual trace
def demonstrate_search():
    """Demonstrate the search algorithm with detailed trace."""
    sl = SkipList[int, str]()
    
    # Manually build a specific structure for demonstration
    # (In practice, use insert() which assigns random levels)
    elements = [
        (5, "five", 1), (10, "ten", 3), (17, "seventeen", 2),
        (25, "twenty-five", 4), (30, "thirty", 2), (44, "forty-four", 1),
        (50, "fifty", 3), (75, "seventy-five", 2)
    ]
    
    # Build the structure...
    # (implementation details omitted for clarity)
    
    # Search demonstration
    print("Search for key 44:")
    print("=" * 50)
    found, value, trace = sl.search(44)
    for step in trace:
        print(step)
    print(f"\nResult: {'Found' if found else 'Not found'}, value = {value}")

Why This Pattern Works

At each level, we skip over many elements with a single pointer traversal. Level k skips ~2^(k-1) elements per step on average. By starting at the top and working down, we eliminate large portions of the search space quickly—exactly like binary search, but on a linked structure.

The Update Array: Tracking Predecessors

For insertions and deletions, we need more than just finding the position—we need to know the predecessor at each level so we can update pointers. This is the role of the update array.

The Update Array Concept:

During traversal, we maintain an array update[] where:

update[i] = the last node visited at level i before we dropped down or moved past the target position

In other words, update[i] is the rightmost node at level i whose key is less than our target. These are exactly the nodes whose forward pointers need to change during insertion or deletion.

update_array.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
def _find_update_array(self, key: K) -> List[SkipListNode[K, V]]:
    """
    Find the predecessor at each level for a given key.
    
    The update array is crucial for insert and delete operations:
    - update[i] is the last node at level i+1 before the target position
    - These are the nodes whose forward pointers will change
    
    Args:
        key: The key we're searching for
    
    Returns:
        List of predecessor nodes, one per level (0 to max_level-1)
    
    Visual example for key 44:
    
    Level 4:  [H] ────────────────── [25] ────────────────────── NULL
               ↓                                                 
              update[3] = 25      (last node at L4 before 44 position)
              
    Level 3:  [H] ───── [10] ──── [25] ─────────── [50] ─────── NULL
                                   ↓
              update[2] = 25      (last node at L3 before 44 position)
              
    Level 2:  [H] ───── [10] ─ [17] ─ [25] ── [30] ── [50] ─ [75] NULL
                                               ↓
              update[1] = 30      (last node at L2 before 44 position)
              
    Level 1:  [H] ─[5]─ [10]──[17]───[25]──[30]─────[44]──[50]─[75] NULL
                                              ↓
              update[0] = 30      (last node at L1 before 44 position)
    
    Result: update = [30, 30, 25, 25] (plus potentially more for higher max_level)
    """
    # Initialize update array with header node for all levels
    update: List[SkipListNode[K, V]] = [self.header] * self.max_level
    current = self.header
    
    # Traverse from the current highest level down to level 1
    for lv in range(self.level, 0, -1):
        # Move right while next key is less than target
        while current.forward[lv - 1] is not None and \
              current.forward[lv - 1].key < key:
            current = current.forward[lv - 1]
        
        # Record this node as the predecessor at this level
        # Array is 0-indexed, so level lv uses index lv-1
        update[lv - 1] = current
    
    return update
 
 
def insert(self, key: K, value: V) -> None:
    """
    Insert a key-value pair into the skip list.
    
    Algorithm:
    1. Find predecessors at all levels using update array
    2. Generate a random level for the new node
    3. If new level > current level, update headers point to new node
    4. Create new node and splice into each participating level
    
    Time Complexity: O(log n) expected
    """
    # Step 1: Find update array
    update = self._find_update_array(key)
    
    # Check for existing key at level 1
    candidate = update[0].forward[0]
    if candidate is not None and candidate.key == key:
        # Key exists, update value
        candidate.value = value
        return
    
    # Step 2: Generate random level for new node
    new_level = self._random_level()
    
    # Step 3: If new level exceeds current max, update header entries
    if new_level > self.level:
        # For levels above current max, predecessor is header
        for lv in range(self.level + 1, new_level + 1):
            update[lv - 1] = self.header
        self.level = new_level
    
    # Step 4: Create new node and splice into each level
    new_node = SkipListNode[K, V](key, value, new_level)
    
    for lv in range(1, new_level + 1):
        # new_node's forward at this level = predecessor's forward
        new_node.forward[lv - 1] = update[lv - 1].forward[lv - 1]
        # predecessor's forward now points to new_node
        update[lv - 1].forward[lv - 1] = new_node
    
    self.size += 1
 
 
def delete(self, key: K) -> bool:
    """
    Delete a key from the skip list.
    
    Algorithm:
    1. Find update array (predecessors at all levels)
    2. Check if key exists
    3. Remove from each level by updating predecessor pointers
    4. Decrease skip list level if top levels became empty
    
    Time Complexity: O(log n) expected
    
    Returns:
        True if key was found and deleted, False otherwise
    """
    # Step 1: Find update array
    update = self._find_update_array(key)
    
    # Step 2: Check if key exists
    target = update[0].forward[0]
    if target is None or target.key != key:
        return False  # Key not found
    
    # Step 3: Remove from each level where target appears
    for lv in range(1, target.level + 1):
        # Skip levels where this node doesn't appear
        if update[lv - 1].forward[lv - 1] != target:
            break
        # Bypass target: predecessor points to target's successor
        update[lv - 1].forward[lv - 1] = target.forward[lv - 1]
    
    # Step 4: Decrease level if top levels are now empty
    while self.level > 0 and self.header.forward[self.level - 1] is None:
        self.level -= 1
    
    self.size -= 1
    return True

The Elegance of Local Updates

Notice that insertion and deletion only modify O(log n) pointers—exactly those in the update array, plus the new/deleted node's own pointers. There's no cascading restructuring, no rotation propagation, no rebalancing. Each operation is self-contained and local.

Level Distribution in Practice

Let's examine what the level distribution actually looks like in practice. This helps build intuition for why skip lists work and how to tune them.

level_distribution_analysis.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
"""
Analyze level distribution in actual skip lists.
 
This demonstrates that the theoretical distribution closely
matches what we observe in practice.
"""
import random
from collections import Counter
from typing import List, Tuple
 
 
def analyze_level_distribution(n: int, p: float = 0.5, max_level: int = 20) -> dict:
    """
    Generate n random levels and analyze the distribution.
    
    Args:
        n: Number of elements to simulate
        p: Promotion probability
        max_level: Maximum allowed level
    
    Returns:
        Dictionary with distribution analysis
    """
    def random_level() -> int:
        level = 1
        while random.random() < p and level < max_level:
            level += 1
        return level
    
    levels = [random_level() for _ in range(n)]
    distribution = Counter(levels)
    
    # Calculate statistics
    max_observed = max(levels)
    avg_level = sum(levels) / n
    
    # Expected values
    expected_avg = 1 / (1 - p) if p < 1 else float('inf')
    expected_max = 1 + (1 / (1 - p)) * (1 if n == 0 else (n).bit_length())
    
    return {
        "n": n,
        "probability": p,
        "distribution": dict(sorted(distribution.items())),
        "max_observed": max_observed,
        "average_level": round(avg_level, 2),
        "expected_average": round(expected_avg, 2),
        "expected_max": round(expected_max, 1),
    }
 
 
def visualize_tower_skyline(n: int = 30, p: float = 0.5, max_level: int = 10) -> str:
    """
    Create a visual representation of skip list "towers".
    
    Each column represents a node; height represents its level.
    This shows how the probabilistic levels create the staggered
    structure that enables efficient navigation.
    """
    def random_level() -> int:
        level = 1
        while random.random() < p and level < max_level:
            level += 1
        return level
    
    levels = [random_level() for _ in range(n)]
    max_height = max(levels)
    
    lines = []
    lines.append(f"Skip List Tower Skyline (n={n}, p={p})")
    lines.append("=" * (n * 4 + 10))
    
    # Build visual from top to bottom
    for height in range(max_height, 0, -1):
        line = f"L{height:2d} │ "
        for level in levels:
            if level >= height:
                line += "████"
            else:
                line += "    "
        lines.append(line)
    
    # Bottom border
    lines.append("    └" + "─" * (n * 4 + 2))
    lines.append("      " + "".join(f"{i:4d}" for i in range(n)))
    
    # Statistics
    distribution = Counter(levels)
    lines.append("\nLevel Distribution:")
    for level in sorted(distribution.keys()):
        count = distribution[level]
        pct = count / n * 100
        expected = (p ** (level - 1)) * (1 - p) * 100
        lines.append(f"  Level {level}: {count:3d} ({pct:5.1f}%) - expected: {expected:5.1f}%")
    
    return "\n".join(lines)
 
 
# Demonstration
def demonstrate_distribution():
    print(visualize_tower_skyline(30, 0.5, 8))
    print("\n")
    
    # Analyze for different sizes
    print("Level Distribution Analysis")
    print("=" * 60)
    for n in [100, 1000, 10000]:
        analysis = analyze_level_distribution(n)
        print(f"\nn = {n:,}")
        print(f"  Average level: {analysis['average_level']} "
              f"(expected: {analysis['expected_average']})")
        print(f"  Max level: {analysis['max_observed']} "
              f"(expected: ~{analysis['expected_max']})")
        dist = analysis['distribution']
        print(f"  Level 1: {dist.get(1, 0):,} elements")
        print(f"  Level 2: {dist.get(2, 0):,} elements")
        print(f"  Level 3: {dist.get(3, 0):,} elements")
 
# Sample output for n=1000:
# 
# Skip List Tower Skyline (n=30, p=0.5)
# ================================================================
# L 4 │                     ████            ████
# L 3 │     ████        ████████            ████        ████
# L 2 │ ████████████████████████    ████████████████████████████
# L 1 │ ████████████████████████████████████████████████████████████████
#     └───────────────────────────────────────────────────────────────
#            0   1   2   3   4   5   6   7...
#
# Level Distribution:
#   Level 1:  14 (46.7%) - expected: 50.0%
#   Level 2:   9 (30.0%) - expected: 25.0%
#   Level 3:   5 (16.7%) - expected: 12.5%
#   Level 4:   2 ( 6.7%) - expected:  6.2%

Key Observations:

Geometric Decay: Each level has approximately half the elements of the level below. This creates the logarithmic structure.
Variance: Individual skip lists may deviate from expected distributions, especially for small n. But the expected behavior holds on average.
Maximum Level: With high probability, the maximum level is O(log n). Specifically, for p = 0.5, we expect ~log₂(n) levels with actual elements.
Tower Heights: Taller towers are exponentially rarer. This ensures the "express lanes" are sparse enough to provide speedup but common enough to exist.

The Self-Organizing Property

The beautiful thing about this distribution is that no coordination is needed. Each node independently decides its height, yet the global structure emerges with the right properties. This is randomization at its finest—local decisions yielding global structure.

Summary: The Multi-Level Architecture

We've explored the structural anatomy of skip lists in detail. Let's consolidate the key insights about their multi-level linked list architecture:

Key Takeaways

•Node Structure: Each node contains a key/value and an array of forward pointers—one for each level it participates in. A level-k node has k pointers.
•Level Organization: Level 1 contains all elements; higher levels contain geometrically decreasing subsets. Level k has ~n/2^(k-1) elements (expected).
•Header Node: A sentinel with maximum-level forward pointers anchors the structure. It conceptually holds key -∞ and is the entry point for all operations.
•Navigation Pattern: Search uses "go right, then drop down"—move forward at the current level while possible, then descend. This achieves O(log n) expected time.
•Update Array: For modifications, we track the predecessor at each level. This enables O(log n) insertions and deletions with purely local pointer updates.
•Self-Organization: The distribution emerges naturally from independent random decisions. No global coordination or restructuring is needed.

What's Next:

Now that we understand the structure, we'll analyze the operations rigorously. The next page provides a formal analysis of expected O(log n) time complexity, proving that the probabilistic structure delivers the promised performance guarantees with high reliability.

Page Complete

You now understand the complete structural architecture of skip lists—how multiple levels of linked lists interconnect to form a navigable hierarchy. This multi-level design, combined with probabilistic level assignment, creates a data structure that is both simple to implement and efficient to use. Next, we'll prove these efficiency claims mathematically.

2 / 4

Loading learning content...

Data Structures & AlgorithmsSkip Lists (Conceptual)

Skip Lists — Probabilistic Alternative to Balanced Trees

LevelAdvanced

Duration60 mins

TopicSkip Lists (Conceptual)

2 / 4

Multiple Levels of Linked Lists

Building the Express Lane System

This multi-layered architecture is the heart of skip lists. It's what transforms the abstract idea of probabilistic balancing into a concrete, implementable data structure with elegant operations.

What You Will Learn

Skip List Node Architecture

Conceptual Node Structure:

+---------------------+
|       value         |    The stored key/value
+---------------------+
| forward[maxLevel]   |    Pointer to next node at highest level
| forward[maxLevel-1] |    Pointer to next node at level maxLevel-1
|        ...          |              ...
| forward[2]          |    Pointer to next node at level 2
| forward[1]          |    Pointer to next node at level 1 (bottom)
+---------------------+

The level of a node determines how many forward pointers it has. A level-3 node has 3 forward pointers (for levels 1, 2, and 3), while a level-1 node has only 1 (for level 1).

skip_list_node.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
"""
Skip List Node Implementation
 
Each node contains a value and an array of forward pointers,
one for each level the node participates in.
"""
from typing import TypeVar, Generic, Optional, List
 
K = TypeVar('K')  # Key type (must be comparable)
V = TypeVar('V')  # Value type
 
 
class SkipListNode(Generic[K, V]):
    """
    A node in the skip list data structure.
    
    Attributes:
        key: The search key for this node
        value: The associated value (for key-value skips lists)
        forward: Array of forward pointers, one per level
        level: The height of this node (number of forward pointers)
    
    Memory layout visualization:
    
    Level 3:  forward[2] ──────────────────────────────→ (next at L3)
    Level 2:  forward[1] ──────────→ (next at L2)
    Level 1:  forward[0] ──→ (next at L1)
              ┌─────────┐
              │   key   │
              │  value  │
              └─────────┘
    """
    
    __slots__ = ('key', 'value', 'forward')  # Memory optimization
    
    def __init__(self, key: K, value: V, level: int):
        """
        Create a new skip list node with the specified level.
        
        Args:
            key: The search key
            value: The associated value
            level: How many levels this node participates in (1-indexed)
                   A level-k node appears in levels 1 through k
        
        The forward array has 'level' slots, indexed 0 to level-1,
        where forward[i] points to the next node at level i+1.
        
        Example for level=3:
            forward[0] → next node at level 1 (bottom level, all nodes)
            forward[1] → next node at level 2 (sparse, ~half of nodes)
            forward[2] → next node at level 3 (sparser, ~quarter of nodes)
        """
        self.key = key
        self.value = value
        # Initialize all forward pointers to None
        # forward[i] corresponds to level (i+1)
        self.forward: List[Optional['SkipListNode[K, V]']] = [None] * level
    
    @property
    def level(self) -> int:
        """Return the height (level) of this node."""
        return len(self.forward)
    
    def get_forward(self, level: int) -> Optional['SkipListNode[K, V]']:
        """
        Get the forward pointer at the specified level.
        
        Args:
            level: 1-indexed level number
        
        Returns:
            The next node at this level, or None if at end
        """
        if level < 1 or level > self.level:
            return None
        return self.forward[level - 1]  # Convert to 0-indexed
    
    def set_forward(self, level: int, node: Optional['SkipListNode[K, V]']):
        """
        Set the forward pointer at the specified level.
        
        Args:
            level: 1-indexed level number
            node: The node to point to (or None for end)
        """
        if 1 <= level <= self.level:
            self.forward[level - 1] = node
    
    def __repr__(self) -> str:
        fwd_info = ", ".join(
            f"L{i+1}→{n.key if n else '∅'}" 
            for i, n in enumerate(self.forward)
        )
        return f"Node({self.key}, level={self.level}, [{fwd_info}])"
 
 
# Demonstration: Creating and connecting nodes
def demonstrate_node_structure():
    """
    Show how skip list nodes connect at multiple levels.
    
    Building this structure:
    
    Level 3:  HEAD ─────────────────── 19 ─────────────────── NULL
                │                       │
    Level 2:  HEAD ────── 7 ────────── 19 ────── 31 ────────── NULL
                │         │             │         │
    Level 1:  HEAD ─ 3 ── 7 ── 12 ──── 19 ─ 24 ─ 31 ── 42 ──── NULL
    """
    # Create nodes with their randomly assigned levels
    node_3 = SkipListNode(3, "three", level=1)    # Only level 1
    node_7 = SkipListNode(7, "seven", level=2)    # Levels 1-2
    node_12 = SkipListNode(12, "twelve", level=1) # Only level 1
    node_19 = SkipListNode(19, "nineteen", level=3) # Levels 1-3
    node_24 = SkipListNode(24, "twenty-four", level=1) # Only level 1
    node_31 = SkipListNode(31, "thirty-one", level=2) # Levels 1-2
    node_42 = SkipListNode(42, "forty-two", level=1) # Only level 1
    
    # Connect at Level 1 (all nodes participate)
    node_3.set_forward(1, node_7)
    node_7.set_forward(1, node_12)
    node_12.set_forward(1, node_19)
    node_19.set_forward(1, node_24)
    node_24.set_forward(1, node_31)
    node_31.set_forward(1, node_42)
    # node_42.forward[0] remains None (end of list)
    
    # Connect at Level 2 (nodes 7, 19, 31 participate)
    node_7.set_forward(2, node_19)
    node_19.set_forward(2, node_31)
    # node_31.forward[1] remains None (end at level 2)
    
    # Connect at Level 3 (only node 19 participates)
    # node_19.forward[2] remains None (end at level 3)
    
    # Display the structure
    print("Skip List Node Structure:")
    print("=" * 60)
    for node in [node_3, node_7, node_12, node_19, node_24, node_31, node_42]:
        print(node)
    
    # Trace a search path
    print("\nSearch path for key 31:")
    print("-" * 40)
    print("Starting at Level 3...")
    print(f"  At node 19 (Level 3): 19 < 31, but forward[2] is None")
    print(f"  Drop to Level 2 at node 19")
    print(f"  At node 19 (Level 2): forward → 31")
    print(f"  31 == 31: Found!")

Memory Efficiency Note

Level Organization and Structure

Understanding how levels are organized is crucial for implementing skip lists correctly. Let's establish the key structural properties.

Level Indexing Conventions:

There are two common conventions for numbering levels:

1-indexed (Pugh's original): Levels numbered 1 to maxLevel, with level 1 being the bottom (densest) level.
0-indexed (common in implementations): Levels numbered 0 to maxLevel-1, with level 0 being the bottom.

We'll use 1-indexed levels conceptually (for clarity) but 0-indexed arrays in code. The key insight is:

Level 1 always contains all elements. Higher levels contain successively sparser subsets.

Visual Representation of a Skip List:

Consider a skip list with 8 elements and randomly assigned levels:

                    ┌───────────────────────────────────────────────────────┐
                    │                    SKIP LIST                          │
                    │            (max level = 4, n = 8)                     │
                    └───────────────────────────────────────────────────────┘

Level 4:  [HEAD] ─────────────────── [25] ─────────────────────────────────── NULL
             │                         │
Level 3:  [HEAD] ───── [10] ───────── [25] ─────────────────── [50] ───────── NULL
             │          │               │                        │
Level 2:  [HEAD] ───── [10] ── [17] ── [25] ── [30] ────────── [50] ── [75] ─ NULL
             │          │       │       │       │                │       │
Level 1:  [HEAD] ─ [5] [10] ── [17] ── [25] ── [30] ── [44] ── [50] ── [75] ─ NULL
                    │    │       │       │       │       │       │       │
                    ▼    ▼       ▼       ▼       ▼       ▼       ▼       ▼
                  (All elements are in level 1 - the complete sorted list)

Structural Invariants:

Sorted Order: At every level, elements are in strictly increasing order.
Subset Property: Level k+1 is a subset of level k. Every element at a higher level also appears at all lower levels.
Header Participation: The header node has pointers for all potential levels (up to max level).
Independent Termination: Each level terminates independently with NULL pointers.

Level Participation in Example Skip List
Element	Levels Present	Level Height	Forward Pointers
5	1	1	[1→10]
10	1, 2, 3	3	[1→17, 2→17, 3→25]
17	1, 2	2	[1→25, 2→25]
25	1, 2, 3, 4	4	[1→30, 2→30, 3→50, 4→NULL]
30	1, 2	2	[1→44, 2→50]
44	1	1	[1→50]
50	1, 2, 3	3	[1→75, 2→75, 3→NULL]
75	1, 2	2	[1→NULL, 2→NULL]

The Tower Analogy

The Header (Sentinel) Node

The header node is a special sentinel that anchors the entire skip list. It has several unique properties that distinguish it from regular nodes.

Header Node Characteristics:

Maximum Height: The header has forward pointers for the maximum possible level, even if no data nodes reach that height. This ensures we can always start searches at the top.
No Key/Value: The header doesn't store actual data—it's purely structural. Its "key" is conceptually -∞ (negative infinity), meaning all real keys are greater.
Entry Point: All operations (search, insert, delete) begin at the header node.
Tracks Current Level: The skip list maintains the current highest level that actually contains nodes (not just the header). This optimization avoids traversing empty levels.

skip_list_structure.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
"""
Skip List Core Structure with Header Node
 
The header node is the anchor point for all skip list operations.
It maintains forward pointers for all possible levels.
"""
from typing import TypeVar, Generic, Optional, List, Tuple
import random
 
K = TypeVar('K')
V = TypeVar('V')
 
 
class SkipListNode(Generic[K, V]):
    """Node in the skip list with level-indexed forward pointers."""
    
    __slots__ = ('key', 'value', 'forward')
    
    def __init__(self, key: Optional[K], value: Optional[V], level: int):
        self.key = key
        self.value = value
        self.forward: List[Optional['SkipListNode[K, V]']] = [None] * level
 
 
class SkipList(Generic[K, V]):
    """
    Skip List data structure with probabilistic balancing.
    
    Structure:
    - header: Sentinel node with max_level forward pointers
    - level: Current maximum level actually in use (not including header)
    - max_level: Theoretical maximum level (typically log₂(n_expected))
    - probability: Promotion probability (typically 0.5)
    
    Visualization:
    
    ┌──────────────────────────────────────────────────────────────┐
    │  SkipList                                                    │
    │  ┌─────────┐                                                 │
    │  │ HEADER  │  ← Entry point (key = -∞ conceptually)          │
    │  │ level=4 │                                                 │
    │  └────┬────┘                                                 │
    │       │                                                      │
    │  Level 4: [H]────────────────────[25]────────────────── NULL │
    │           │                       │                          │
    │  Level 3: [H]─────[10]───────────[25]──────────[50]──── NULL │
    │           │        │              │              │           │
    │  Level 2: [H]─────[10]──[17]────[25]──[30]────[50]──[75] NULL│
    │           │        │     │       │     │       │     │       │
    │  Level 1: [H]─[5]─[10]──[17]────[25]──[30]─[44][50]──[75] NULL│
    └──────────────────────────────────────────────────────────────┘
    """
    
    DEFAULT_MAX_LEVEL = 16  # Supports ~65,536 elements efficiently
    DEFAULT_PROBABILITY = 0.5
    
    def __init__(
        self, 
        max_level: int = DEFAULT_MAX_LEVEL,
        probability: float = DEFAULT_PROBABILITY
    ):
        """
        Initialize an empty skip list.
        
        Args:
            max_level: Maximum number of levels (default 16)
            probability: Chance of promoting to next level (default 0.5)
        
        The header node is created with max_level forward pointers,
        all initially pointing to None.
        """
        self.max_level = max_level
        self.probability = probability
        
        # Current highest level with actual nodes (0 = empty)
        self.level = 0
        
        # Number of elements (excludes header)
        self.size = 0
        
        # Header node: special sentinel with no key, max height
        # Conceptually stores key = -∞
        self.header = SkipListNode[K, V](
            key=None,  # Sentinel, no actual key
            value=None,
            level=max_level
        )
    
    def _random_level(self) -> int:
        """
        Generate a random level for a new node.
        
        Uses geometric distribution: P(level = k) = p^(k-1) * (1-p)
        Expected level = 1 / (1 - p) = 2 for p = 0.5
        
        Returns:
            Level from 1 to max_level
        """
        level = 1
        while random.random() < self.probability and level < self.max_level:
            level += 1
        return level
    
    def is_empty(self) -> bool:
        """Check if the skip list contains no elements."""
        return self.size == 0
    
    def __len__(self) -> int:
        """Return the number of elements in the skip list."""
        return self.size
    
    def _display(self) -> str:
        """
        Generate a visual representation of the skip list.
        Useful for debugging and understanding structure.
        """
        if self.is_empty():
            return "Empty SkipList"
        
        lines = []
        lines.append(f"SkipList (size={self.size}, level={self.level})")
        lines.append("=" * 60)
        
        # Collect all nodes
        nodes = []
        current = self.header.forward[0]
        while current:
            nodes.append(current)
            current = current.forward[0]
        
        # Print each level from top to bottom
        for lv in range(self.level, 0, -1):
            line = f"L{lv}: [H]"
            current = self.header.forward[lv - 1]
            for node in nodes:
                if node.key is not None and len(node.forward) >= lv:
                    # Check if this node is at this level
                    if current == node:
                        line += f"──[{node.key}]"
                        current = node.forward[lv - 1]
                    else:
                        line += "─" * (len(str(node.key)) + 4)
                else:
                    line += "─" * (len(str(node.key)) + 4)
            line += "── NULL"
            lines.append(line)
        
        return "\n".join(lines)
 
 
# Demonstration
def demonstrate_header():
    """Show the header node and initial skip list structure."""
    sl = SkipList[int, str](max_level=6)
    
    print("Empty Skip List Structure:")
    print("=" * 50)
    print(f"Max Level: {sl.max_level}")
    print(f"Current Level: {sl.level}")
    print(f"Size: {sl.size}")
    print(f"Header forward pointers: {len(sl.header.forward)}")
    print(f"All header forwards are None: {all(p is None for p in sl.header.forward)}")
    
    print("\nHeader node visualization:")
    print("┌───────────┐")
    print("│  HEADER   │")
    print("├───────────┤")
    for i in range(sl.max_level - 1, -1, -1):
        print(f"│ forward[{i}]│──── NULL")
    print("└───────────┘")
    print("(All levels ready, waiting for data)")

Max Level Selection

Navigating the Multi-Level Structure

The true elegance of skip lists emerges when we understand navigation. The search algorithm exploits the multi-level structure to achieve O(log n) expected time through a simple pattern:

Go right as far as possible at the current level, then drop down.

This pattern is deceptively simple but incredibly effective. Let's trace it step by step.

Search Algorithm Walkthrough:

To search for key k, starting at the header:

1. Set current = header, level = current_max_level

2. While level >= 1:
   a. While current.forward[level] has key < k:
      - Move right: current = current.forward[level]
   b. Drop down: level = level - 1

3. Check current.forward[0]:
   - If key == k: Found!
   - Otherwise: Not found

Visual Trace: Searching for 44

Searching for key 44 in:

Level 4:  [H] ─────────────────── [25] ─────────────────────────────────── NULL
Level 3:  [H] ───── [10] ───────── [25] ─────────────────── [50] ───────── NULL
Level 2:  [H] ───── [10] ── [17] ── [25] ── [30] ────────── [50] ── [75] ─ NULL
Level 1:  [H] ─ [5] [10] ── [17] ── [25] ── [30] ── [44] ── [50] ── [75] ─ NULL

Step 1: Start at header, level 4
        → forward[4] = 25, and 25 < 44, so move right to 25
        
Step 2: At node 25, level 4
        → forward[4] = NULL, can't go right
        → Drop to level 3
        
Step 3: At node 25, level 3
        → forward[3] = 50, but 50 > 44, can't go right
        → Drop to level 2
        
Step 4: At node 25, level 2
        → forward[2] = 30, and 30 < 44, so move right to 30
        
Step 5: At node 30, level 2
        → forward[2] = 50, but 50 > 44, can't go right
        → Drop to level 1
        
Step 6: At node 30, level 1
        → forward[1] = 44, and 44 == 44, Found!

Total comparisons: 6 (compared 25, NULL, 50, 30, 50, 44)

skip_list_search.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
def search(self, key: K) -> Tuple[bool, Optional[V], List[str]]:
    """
    Search for a key in the skip list.
    
    The search algorithm:
    1. Start at the header node at the highest level
    2. At each level, move right while the next key is less than target
    3. When we can't move right, drop down one level
    4. Repeat until we reach level 1
    5. Check if the next node at level 1 has our key
    
    Args:
        key: The key to search for
    
    Returns:
        Tuple of (found: bool, value: Optional[V], trace: List[str])
        The trace shows the path taken for debugging/visualization
    
    Time Complexity: O(log n) expected
    Space Complexity: O(1)
    """
    trace = []
    current = self.header
    
    # Start at the highest level that contains nodes
    trace.append(f"Starting search for key {key}")
    trace.append(f"Current level: {self.level}")
    
    # Traverse from top level down to level 1
    for lv in range(self.level, 0, -1):
        trace.append(f"Level {lv}: at node {current.key or 'HEADER'}")
        
        # Move right while next node's key is less than target
        while current.forward[lv - 1] is not None and \
              current.forward[lv - 1].key < key:
            current = current.forward[lv - 1]
            trace.append(f"  → moved right to {current.key}")
        
        # Log why we stopped
        if current.forward[lv - 1] is None:
            trace.append(f"  → reached end of level, dropping down")
        else:
            trace.append(f"  → next is {current.forward[lv - 1].key} >= {key}, dropping down")
    
    # After the loop, current.forward[0] is either:
    # - The node with our key (if it exists)
    # - The node with the smallest key greater than ours
    # - None if our key would be the largest
    
    next_node = current.forward[0]
    
    if next_node is not None and next_node.key == key:
        trace.append(f"Found {key} with value {next_node.value}")
        return True, next_node.value, trace
    else:
        trace.append(f"Key {key} not found")
        return False, None, trace
 
 
# Example usage with visual trace
def demonstrate_search():
    """Demonstrate the search algorithm with detailed trace."""
    sl = SkipList[int, str]()
    
    # Manually build a specific structure for demonstration
    # (In practice, use insert() which assigns random levels)
    elements = [
        (5, "five", 1), (10, "ten", 3), (17, "seventeen", 2),
        (25, "twenty-five", 4), (30, "thirty", 2), (44, "forty-four", 1),
        (50, "fifty", 3), (75, "seventy-five", 2)
    ]
    
    # Build the structure...
    # (implementation details omitted for clarity)
    
    # Search demonstration
    print("Search for key 44:")
    print("=" * 50)
    found, value, trace = sl.search(44)
    for step in trace:
        print(step)
    print(f"\nResult: {'Found' if found else 'Not found'}, value = {value}")

Why This Pattern Works

The Update Array: Tracking Predecessors

For insertions and deletions, we need more than just finding the position—we need to know the predecessor at each level so we can update pointers. This is the role of the update array.

The Update Array Concept:

During traversal, we maintain an array update[] where:

update[i] = the last node visited at level i before we dropped down or moved past the target position

In other words, update[i] is the rightmost node at level i whose key is less than our target. These are exactly the nodes whose forward pointers need to change during insertion or deletion.

update_array.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
def _find_update_array(self, key: K) -> List[SkipListNode[K, V]]:
    """
    Find the predecessor at each level for a given key.
    
    The update array is crucial for insert and delete operations:
    - update[i] is the last node at level i+1 before the target position
    - These are the nodes whose forward pointers will change
    
    Args:
        key: The key we're searching for
    
    Returns:
        List of predecessor nodes, one per level (0 to max_level-1)
    
    Visual example for key 44:
    
    Level 4:  [H] ────────────────── [25] ────────────────────── NULL
               ↓                                                 
              update[3] = 25      (last node at L4 before 44 position)
              
    Level 3:  [H] ───── [10] ──── [25] ─────────── [50] ─────── NULL
                                   ↓
              update[2] = 25      (last node at L3 before 44 position)
              
    Level 2:  [H] ───── [10] ─ [17] ─ [25] ── [30] ── [50] ─ [75] NULL
                                               ↓
              update[1] = 30      (last node at L2 before 44 position)
              
    Level 1:  [H] ─[5]─ [10]──[17]───[25]──[30]─────[44]──[50]─[75] NULL
                                              ↓
              update[0] = 30      (last node at L1 before 44 position)
    
    Result: update = [30, 30, 25, 25] (plus potentially more for higher max_level)
    """
    # Initialize update array with header node for all levels
    update: List[SkipListNode[K, V]] = [self.header] * self.max_level
    current = self.header
    
    # Traverse from the current highest level down to level 1
    for lv in range(self.level, 0, -1):
        # Move right while next key is less than target
        while current.forward[lv - 1] is not None and \
              current.forward[lv - 1].key < key:
            current = current.forward[lv - 1]
        
        # Record this node as the predecessor at this level
        # Array is 0-indexed, so level lv uses index lv-1
        update[lv - 1] = current
    
    return update
 
 
def insert(self, key: K, value: V) -> None:
    """
    Insert a key-value pair into the skip list.
    
    Algorithm:
    1. Find predecessors at all levels using update array
    2. Generate a random level for the new node
    3. If new level > current level, update headers point to new node
    4. Create new node and splice into each participating level
    
    Time Complexity: O(log n) expected
    """
    # Step 1: Find update array
    update = self._find_update_array(key)
    
    # Check for existing key at level 1
    candidate = update[0].forward[0]
    if candidate is not None and candidate.key == key:
        # Key exists, update value
        candidate.value = value
        return
    
    # Step 2: Generate random level for new node
    new_level = self._random_level()
    
    # Step 3: If new level exceeds current max, update header entries
    if new_level > self.level:
        # For levels above current max, predecessor is header
        for lv in range(self.level + 1, new_level + 1):
            update[lv - 1] = self.header
        self.level = new_level
    
    # Step 4: Create new node and splice into each level
    new_node = SkipListNode[K, V](key, value, new_level)
    
    for lv in range(1, new_level + 1):
        # new_node's forward at this level = predecessor's forward
        new_node.forward[lv - 1] = update[lv - 1].forward[lv - 1]
        # predecessor's forward now points to new_node
        update[lv - 1].forward[lv - 1] = new_node
    
    self.size += 1
 
 
def delete(self, key: K) -> bool:
    """
    Delete a key from the skip list.
    
    Algorithm:
    1. Find update array (predecessors at all levels)
    2. Check if key exists
    3. Remove from each level by updating predecessor pointers
    4. Decrease skip list level if top levels became empty
    
    Time Complexity: O(log n) expected
    
    Returns:
        True if key was found and deleted, False otherwise
    """
    # Step 1: Find update array
    update = self._find_update_array(key)
    
    # Step 2: Check if key exists
    target = update[0].forward[0]
    if target is None or target.key != key:
        return False  # Key not found
    
    # Step 3: Remove from each level where target appears
    for lv in range(1, target.level + 1):
        # Skip levels where this node doesn't appear
        if update[lv - 1].forward[lv - 1] != target:
            break
        # Bypass target: predecessor points to target's successor
        update[lv - 1].forward[lv - 1] = target.forward[lv - 1]
    
    # Step 4: Decrease level if top levels are now empty
    while self.level > 0 and self.header.forward[self.level - 1] is None:
        self.level -= 1
    
    self.size -= 1
    return True

The Elegance of Local Updates

Level Distribution in Practice

Let's examine what the level distribution actually looks like in practice. This helps build intuition for why skip lists work and how to tune them.

level_distribution_analysis.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
"""
Analyze level distribution in actual skip lists.
 
This demonstrates that the theoretical distribution closely
matches what we observe in practice.
"""
import random
from collections import Counter
from typing import List, Tuple
 
 
def analyze_level_distribution(n: int, p: float = 0.5, max_level: int = 20) -> dict:
    """
    Generate n random levels and analyze the distribution.
    
    Args:
        n: Number of elements to simulate
        p: Promotion probability
        max_level: Maximum allowed level
    
    Returns:
        Dictionary with distribution analysis
    """
    def random_level() -> int:
        level = 1
        while random.random() < p and level < max_level:
            level += 1
        return level
    
    levels = [random_level() for _ in range(n)]
    distribution = Counter(levels)
    
    # Calculate statistics
    max_observed = max(levels)
    avg_level = sum(levels) / n
    
    # Expected values
    expected_avg = 1 / (1 - p) if p < 1 else float('inf')
    expected_max = 1 + (1 / (1 - p)) * (1 if n == 0 else (n).bit_length())
    
    return {
        "n": n,
        "probability": p,
        "distribution": dict(sorted(distribution.items())),
        "max_observed": max_observed,
        "average_level": round(avg_level, 2),
        "expected_average": round(expected_avg, 2),
        "expected_max": round(expected_max, 1),
    }
 
 
def visualize_tower_skyline(n: int = 30, p: float = 0.5, max_level: int = 10) -> str:
    """
    Create a visual representation of skip list "towers".
    
    Each column represents a node; height represents its level.
    This shows how the probabilistic levels create the staggered
    structure that enables efficient navigation.
    """
    def random_level() -> int:
        level = 1
        while random.random() < p and level < max_level:
            level += 1
        return level
    
    levels = [random_level() for _ in range(n)]
    max_height = max(levels)
    
    lines = []
    lines.append(f"Skip List Tower Skyline (n={n}, p={p})")
    lines.append("=" * (n * 4 + 10))
    
    # Build visual from top to bottom
    for height in range(max_height, 0, -1):
        line = f"L{height:2d} │ "
        for level in levels:
            if level >= height:
                line += "████"
            else:
                line += "    "
        lines.append(line)
    
    # Bottom border
    lines.append("    └" + "─" * (n * 4 + 2))
    lines.append("      " + "".join(f"{i:4d}" for i in range(n)))
    
    # Statistics
    distribution = Counter(levels)
    lines.append("\nLevel Distribution:")
    for level in sorted(distribution.keys()):
        count = distribution[level]
        pct = count / n * 100
        expected = (p ** (level - 1)) * (1 - p) * 100
        lines.append(f"  Level {level}: {count:3d} ({pct:5.1f}%) - expected: {expected:5.1f}%")
    
    return "\n".join(lines)
 
 
# Demonstration
def demonstrate_distribution():
    print(visualize_tower_skyline(30, 0.5, 8))
    print("\n")
    
    # Analyze for different sizes
    print("Level Distribution Analysis")
    print("=" * 60)
    for n in [100, 1000, 10000]:
        analysis = analyze_level_distribution(n)
        print(f"\nn = {n:,}")
        print(f"  Average level: {analysis['average_level']} "
              f"(expected: {analysis['expected_average']})")
        print(f"  Max level: {analysis['max_observed']} "
              f"(expected: ~{analysis['expected_max']})")
        dist = analysis['distribution']
        print(f"  Level 1: {dist.get(1, 0):,} elements")
        print(f"  Level 2: {dist.get(2, 0):,} elements")
        print(f"  Level 3: {dist.get(3, 0):,} elements")
 
# Sample output for n=1000:
# 
# Skip List Tower Skyline (n=30, p=0.5)
# ================================================================
# L 4 │                     ████            ████
# L 3 │     ████        ████████            ████        ████
# L 2 │ ████████████████████████    ████████████████████████████
# L 1 │ ████████████████████████████████████████████████████████████████
#     └───────────────────────────────────────────────────────────────
#            0   1   2   3   4   5   6   7...
#
# Level Distribution:
#   Level 1:  14 (46.7%) - expected: 50.0%
#   Level 2:   9 (30.0%) - expected: 25.0%
#   Level 3:   5 (16.7%) - expected: 12.5%
#   Level 4:   2 ( 6.7%) - expected:  6.2%

Key Observations:

Geometric Decay: Each level has approximately half the elements of the level below. This creates the logarithmic structure.
Variance: Individual skip lists may deviate from expected distributions, especially for small n. But the expected behavior holds on average.
Maximum Level: With high probability, the maximum level is O(log n). Specifically, for p = 0.5, we expect ~log₂(n) levels with actual elements.
Tower Heights: Taller towers are exponentially rarer. This ensures the "express lanes" are sparse enough to provide speedup but common enough to exist.

The Self-Organizing Property

Summary: The Multi-Level Architecture

We've explored the structural anatomy of skip lists in detail. Let's consolidate the key insights about their multi-level linked list architecture:

Key Takeaways

•Node Structure: Each node contains a key/value and an array of forward pointers—one for each level it participates in. A level-k node has k pointers.
•Level Organization: Level 1 contains all elements; higher levels contain geometrically decreasing subsets. Level k has ~n/2^(k-1) elements (expected).
•Header Node: A sentinel with maximum-level forward pointers anchors the structure. It conceptually holds key -∞ and is the entry point for all operations.
•Navigation Pattern: Search uses "go right, then drop down"—move forward at the current level while possible, then descend. This achieves O(log n) expected time.
•Update Array: For modifications, we track the predecessor at each level. This enables O(log n) insertions and deletions with purely local pointer updates.
•Self-Organization: The distribution emerges naturally from independent random decisions. No global coordination or restructuring is needed.

What's Next:

Page Complete

2 / 4