RARP and BOOTP - Learning Module


RARP Operation

Anatomy of a RARP Transaction

When a diskless workstation powers on, a precise sequence of events unfolds in the span of milliseconds: a choreographed exchange between a client with no configuration and a server holding the answers. Understanding this operation at a packet-by-packet level is essential for troubleshooting, network design, and appreciating the elegance of early network bootstrapping.

This page dissects the complete RARP operation, from the moment the client's boot ROM begins executing to the successful configuration of its IP address. We examine timing constraints, network behavior, error scenarios, and the integration of RARP with the broader diskless boot process.

What You Will Learn

By the end of this page, you will understand the precise sequence of operations in a RARP transaction, timing considerations and retry strategies, how RARP frames traverse the network, error conditions and recovery mechanisms, server selection when multiple servers respond, and the complete integration of RARP within the diskless boot lifecycle.

The Complete Request Lifecycle

A RARP transaction follows a well-defined lifecycle, with specific actions at each stage. Let's trace through every step in precise detail.

Phase 1: Client Initialization (0-100 ms after power-on)

When the diskless workstation powers on:

1. POST (Power-On Self-Test): Hardware diagnostics run
2. Boot ROM activation: The network boot ROM takes control
3. NIC initialization: The network interface card is configured
4. MAC address retrieval: The boot ROM reads the burned-in MAC address from the NIC's EEPROM
5. RARP frame construction: The request frame is built in memory

rarp-client-initialization.pseudo
// Boot ROM RARP Client - Initialization Phase
function initializeRARPClient():
    // Step 1: Initialize the NIC hardware
    nic = initializeNetworkInterface()
    
    // Step 2: Read our hardware address from NIC EEPROM
    myMAC = nic.readHardwareAddress()
    // e.g., myMAC = 00:1A:2B:3C:4D:5E
    
    // Step 3: Prepare the RARP request frame structure
    rarpRequest = {
        // Ethernet Header
        destMAC: FF:FF:FF:FF:FF:FF,    // Broadcast address
        srcMAC: myMAC,                  // Our MAC
        etherType: 0x8035,              // RARP protocol
        
        // RARP Payload
        hardwareType: 0x0001,           // Ethernet
        protocolType: 0x0800,           // IPv4
        hardwareLength: 6,              // MAC = 6 bytes
        protocolLength: 4,              // IPv4 = 4 bytes
        operation: 3,                   // RARP Request
        senderHardwareAddr: myMAC,      // Our MAC
        senderProtocolAddr: 0.0.0.0,    // Unknown
        targetHardwareAddr: myMAC,      // Query about ourselves
        targetProtocolAddr: 0.0.0.0     // This is what we need!
    }
    
    return rarpRequest
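
The pseudocode above can be turned into actual bytes. The following is a minimal Python sketch, not a boot ROM implementation: the helper names `mac_to_bytes` and `build_rarp_request` are hypothetical, and the sketch only builds the frame in memory without sending it. Field values follow the request structure shown above.

```python
import struct

ETHERTYPE_RARP = 0x8035
OP_RARP_REQUEST = 3


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' notation to 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_rarp_request(my_mac: str) -> bytes:
    """Build an Ethernet frame carrying a RARP request for our own MAC."""
    src = mac_to_bytes(my_mac)
    broadcast = b"\xff" * 6
    # Ethernet header: destination, source, EtherType
    eth_header = broadcast + src + struct.pack("!H", ETHERTYPE_RARP)
    # RARP payload, all fields in network byte order
    payload = struct.pack(
        "!HHBBH6s4s6s4s",
        0x0001,           # hardware type: Ethernet
        0x0800,           # protocol type: IPv4
        6,                # hardware address length
        4,                # protocol address length
        OP_RARP_REQUEST,  # operation: RARP Request
        src,              # sender hardware address
        b"\x00" * 4,      # sender protocol address (unknown)
        src,              # target hardware address (ourselves)
        b"\x00" * 4,      # target protocol address (to be filled by server)
    )
    return eth_header + payload


frame = build_rarp_request("00:1A:2B:3C:4D:5E")
print(len(frame))  # 42 (14-byte header + 28-byte payload, before NIC padding)
```

The 42-byte result is shorter than the 60-byte Ethernet minimum (excluding the frame check sequence), so the NIC or driver pads the frame before transmission.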

Phase 2: Request Transmission

The constructed RARP request is transmitted onto the network:

1. Frame transmission: The NIC sends the Ethernet frame
2. Broadcast propagation: All devices on the segment receive the frame
3. Switch flooding: On switched networks, the frame floods all ports in the broadcast domain
4. Timer activation: The client starts a response timeout timer

Key timing considerations:

| Event | Typical Duration | Notes |
|-------|------------------|-------|
| Frame transmission | 50-100 μs | Depends on frame size and line speed |
| Switch propagation | 1-10 μs per hop | Store-and-forward adds latency |
| Broadcast flood | 1-100 μs | Parallel on most switches |
| Server processing | 100 μs - 10 ms | Database lookup time |
| Reply transmission | 50-100 μs | Similar to request |
| Total round-trip | 1-20 ms typical | Wide variation possible |
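
The frame transmission figure can be checked with simple arithmetic. The sketch below assumes a minimum-size 64-byte Ethernet frame plus 8 bytes of preamble and start-of-frame delimiter; the function name `transmission_time_us` is illustrative only.

```python
def transmission_time_us(frame_bytes: int, link_mbps: float,
                         preamble_bytes: int = 8) -> float:
    """Time to serialize one frame onto the wire, in microseconds.

    1 Mbps equals 1 bit per microsecond, so bits / Mbps gives microseconds.
    """
    bits = (frame_bytes + preamble_bytes) * 8
    return bits / link_mbps


# A padded RARP frame occupies the 64-byte Ethernet minimum.
print(transmission_time_us(64, 10))    # 10 Mbps Ethernet
print(transmission_time_us(64, 100))   # 100 Mbps Fast Ethernet
```

At 10 Mbps this works out to 576 bits / 10 = 57.6 μs, which sits inside the 50-100 μs range listed for frame transmission.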

The Vanishing Broadcast

On modern switched networks, broadcast frames are handled differently than in the hub-based networks of RARP's era. While the broadcast still reaches all ports (within the VLAN), switch buffers and processing introduce latencies that didn't exist with hubs. Some switches may rate-limit broadcasts, potentially affecting RARP performance in busy environments.

Phase 3: Server Reception and Processing

When the RARP server receives the broadcast request:

1. Frame reception: NIC captures the frame (in promiscuous mode or with broadcast filter)
2. EtherType check: Verify 0x8035 for RARP
3. Operation check: Verify operation code = 3 (RARP Request)
4. Database lookup: Search for Target Hardware Address in /etc/ethers
5. Hostname resolution: If found, resolve hostname to IP via /etc/hosts
6. Reply construction: Build RARP Reply frame
7. Unicast transmission: Send reply directly to requester's MAC

rarp-server-processing.pseudo
// RARP Server - Request Processing
function processRARPRequest(frame):
    // Step 1: Validate the frame
    if frame.etherType != 0x8035:
        return  // Not a RARP frame
    
    if frame.operation != 3:
        return  // Not a RARP Request
    
    // Step 2: Extract the target MAC (what the client is querying about)
    queryMAC = frame.targetHardwareAddr
    // e.g., queryMAC = 00:1A:2B:3C:4D:5E
    
    // Step 3: Look up in our database
    hostname = ethersDatabase.lookup(queryMAC)
    // e.g., ethersDatabase = { "00:1A:2B:3C:4D:5E": "workstation01" }
    
    if hostname == null:
        // Unknown MAC - let another server handle it
        log("Unknown MAC: " + queryMAC)
        return
    
    // Step 4: Resolve hostname to IP
    clientIP = hostsDatabase.resolve(hostname)
    // e.g., hostsDatabase = { "workstation01": "192.168.1.100" }
    
    if clientIP == null:
        log("Cannot resolve hostname: " + hostname)
        return
    
    // Step 5: Construct the RARP Reply
    rarpReply = {
        // Ethernet Header
        destMAC: queryMAC,              // Unicast to client
        srcMAC: myMAC,                  // Server's MAC
        etherType: 0x8035,              // RARP protocol
        
        // RARP Payload
        hardwareType: 0x0001,           // Ethernet
        protocolType: 0x0800,           // IPv4
        hardwareLength: 6,
        protocolLength: 4,
        operation: 4,                   // RARP Reply
        senderHardwareAddr: myMAC,      // Server's MAC
        senderProtocolAddr: myIP,       // Server's IP
        targetHardwareAddr: queryMAC,   // Client's MAC
        targetProtocolAddr: clientIP    // THE ANSWER!
    }
    
    // Step 6: Send the reply
    transmit(rarpReply)
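
The two-stage lookup (MAC to hostname, then hostname to IP) can be sketched in Python. This is an illustration under assumptions, not the code of any real rarpd: the function names are hypothetical, and the parsers assume the simple whitespace-separated layouts `MAC hostname` and `IP hostname [aliases]` with `#` comments.

```python
from typing import Dict, Optional


def parse_ethers(text: str) -> Dict[str, str]:
    """Parse 'MAC hostname' lines into a MAC -> hostname table."""
    table: Dict[str, str] = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank, comment-only, or incomplete line
        table[fields[0].lower()] = fields[1]
    return table


def parse_hosts(text: str) -> Dict[str, str]:
    """Parse 'IP hostname [aliases...]' lines into a name -> IP table."""
    table: Dict[str, str] = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        for name in fields[1:]:
            table.setdefault(name, fields[0])  # first entry wins
    return table


def resolve_client_ip(query_mac: str, ethers: Dict[str, str],
                      hosts: Dict[str, str]) -> Optional[str]:
    """Return the client's IP, or None so the server stays silent."""
    hostname = ethers.get(query_mac.lower())
    if hostname is None:
        return None
    return hosts.get(hostname)


ethers = parse_ethers("00:1A:2B:3C:4D:5E workstation01  # lab bench 3\n")
hosts = parse_hosts("192.168.1.100 workstation01 ws01\n")
print(resolve_client_ip("00:1a:2b:3c:4d:5e", ethers, hosts))  # 192.168.1.100
```

Returning `None` for both failure cases mirrors the pseudocode: the server sends nothing when either lookup fails.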

Phase 4: Client Response Processing

When the client receives the RARP reply:

1. Frame reception: NIC captures the unicast frame
2. EtherType verification: Confirm 0x8035
3. Operation verification: Confirm operation code = 4 (RARP Reply)
4. MAC verification: Confirm Target Hardware Address matches our MAC
5. IP extraction: Read Target Protocol Address (our IP!)
6. IP configuration: Configure the local IP stack with the received address
7. Timer cancellation: Stop the retry timer
8. Boot continuation: Proceed to next phase (typically TFTP)
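
The verification steps above map directly onto a small parsing routine. The following Python sketch assumes an Ethernet/IPv4 RARP frame without a VLAN tag; `extract_assigned_ip` and the sample addresses are hypothetical names and values chosen for illustration.

```python
import socket
import struct
from typing import Optional

RARP_FORMAT = "!HHBBH6s4s6s4s"  # 28-byte RARP payload for Ethernet/IPv4


def extract_assigned_ip(frame: bytes, my_mac: bytes) -> Optional[str]:
    """Return our assigned IP if the frame is a RARP reply for us, else None."""
    if len(frame) < 42:
        return None  # too short to hold header plus payload
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != 0x8035:
        return None  # not RARP
    _, _, _, _, op, _, _, tha, tpa = struct.unpack(RARP_FORMAT, frame[14:42])
    if op != 4:
        return None  # not a RARP Reply
    if tha != my_mac:
        return None  # reply is about some other machine
    return socket.inet_ntoa(tpa)


# Build a sample reply by hand to exercise the parser.
client = bytes.fromhex("001a2b3c4d5e")
server = bytes.fromhex("00aabbccddee")
reply = (
    client + server + struct.pack("!H", 0x8035)
    + struct.pack(RARP_FORMAT, 1, 0x0800, 6, 4, 4,
                  server, socket.inet_aton("192.168.1.1"),
                  client, socket.inet_aton("192.168.1.100"))
)
print(extract_assigned_ip(reply, client))  # 192.168.1.100
```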

Timing and Retry Strategies

Because RARP runs directly over Ethernet, which guarantees no delivery, and defines no acknowledgment mechanism of its own, robust timeout and retry handling is essential. The client must balance responsiveness against network load.

The Retry Dilemma:

- Too aggressive: Floods the network with repeated requests, potentially overwhelming servers
- Too conservative: Delays boot time unnecessarily, frustrates users
- Just right: Recovers from transient failures without excessive load

RFC 903 Considerations:

RFC 903 does not specify timeout values; retransmission policy is left to the implementation. Implementations have varied widely:

Retry Strategy Comparison
| Strategy | Wait Sequence | Max Retries | Total Max Wait | Pros | Cons |
|----------|---------------|-------------|----------------|------|------|
| Fixed interval | 4 sec each | 5 | 20 sec | Simple implementation | May overload slow servers |
| Linear backoff | 1, 2, 3, 4, 5 sec | 5 | 15 sec | Some congestion adaptation | May give up too soon |
| Exponential backoff | 1, 2, 4, 8, 16 sec | 5 | 31 sec | Excellent congestion handling | Slow final retries |
| Exponential w/ cap | 1, 2, 4, 8, 8 sec | 5 | 23 sec | Balance of speed and safety | Slightly complex |
| Infinite retry | 4 sec each | ∞ | ∞ | Never fails if server exists | Could hang forever |
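
The "Total Max Wait" column is just the sum of each wait sequence. A short Python check, using the sequences from the table (the dictionary layout is only for illustration):

```python
# Wait sequences in seconds, one entry per attempt, taken from the table.
strategies = {
    "Fixed interval": [4] * 5,
    "Linear backoff": [1, 2, 3, 4, 5],
    "Exponential backoff": [2 ** i for i in range(5)],          # 1, 2, 4, 8, 16
    "Exponential w/ cap": [min(2 ** i, 8) for i in range(5)],   # 1, 2, 4, 8, 8
}

totals = {name: sum(waits) for name, waits in strategies.items()}
for name, total in totals.items():
    print(f"{name}: {total} sec")
```

This prints 20, 15, 31, and 23 seconds respectively, matching the table.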

Exponential Backoff Algorithm:

The recommended approach for production environments is exponential backoff with jitter:

    base_timeout = 1 second
    max_timeout = 16 seconds
    max_retries = 10
    jitter_range = 0.5 seconds

    for attempt in 1..max_retries:
        timeout = min(base_timeout * 2^(attempt-1), max_timeout)
        jitter = random(-jitter_range, +jitter_range)
        actual_wait = timeout + jitter

        send_rarp_request()

        if response_received_within(actual_wait):
            return SUCCESS

        log("Retry " + attempt + " of " + max_retries)

    return FAILURE

The Role of Jitter:

In environments with many diskless workstations (e.g., a lab with 30 identical machines powered on simultaneously), adding random jitter prevents synchronized retry storms.
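
The backoff pseudocode translates almost line for line into Python. This is a sketch under assumptions: `backoff_schedule` and `request_with_retries` are hypothetical names, and `send_and_wait` stands in for whatever routine transmits the request and waits up to the given number of seconds for a reply.

```python
import random
from typing import Callable, List, Optional


def backoff_schedule(base: float = 1.0, cap: float = 16.0, retries: int = 10,
                     jitter: float = 0.5,
                     rng: Optional[random.Random] = None) -> List[float]:
    """Return the wait (in seconds) to use for each attempt."""
    rng = rng or random.Random()
    waits = []
    for attempt in range(1, retries + 1):
        timeout = min(base * 2 ** (attempt - 1), cap)   # exponential, capped
        waits.append(timeout + rng.uniform(-jitter, jitter))
    return waits


def request_with_retries(send_and_wait: Callable[[float], bool],
                         waits: List[float]) -> bool:
    """Call send_and_wait(wait) per attempt until it reports a reply."""
    for wait in waits:
        if send_and_wait(wait):
            return True
    return False


# Without jitter the schedule is deterministic and easy to inspect.
print(backoff_schedule(jitter=0.0))

# Simulated server that answers on the third attempt.
attempts = []
ok = request_with_retries(lambda w: attempts.append(w) or len(attempts) == 3,
                          backoff_schedule())
print(ok, len(attempts))
```

Passing an explicit `random.Random(seed)` as `rng` makes the jittered schedule reproducible, which is convenient when testing.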


The Boot Storm Problem

When an entire facility loses power and then regains it, every diskless workstation boots simultaneously. Without jitter, RARP servers face a 'boot storm' with potentially thousands of simultaneous requests. This was a known operational challenge in data centers with many diskless clients. Modern protocols like DHCP incorporate jitter by default.

Timeout Selection Factors:

The optimal timeout values depend on several environmental factors:

| Factor | Impact on Timeout | Recommendation |
|--------|-------------------|----------------|
| Network speed | Faster networks → shorter timeouts | 10 Mbps: 2-4s, 100 Mbps: 1-2s |
| Server load | Heavily loaded servers → longer timeouts | Monitor response times |
| Number of clients | More clients → more jitter | 0.5-1s jitter per 10 clients |
| Boot criticality | Critical systems → more retries | Infinite retry for vital systems |
| User tolerance | Low patience → faster initial timeout | 1s initial for interactive boot |

Implementation Example (Sun Boot ROM):

Sun Microsystems, a major producer of diskless workstations, used this strategy:

- Initial timeout: 4 seconds
- Backoff: Double each retry (4, 8, 16, 32...)
- Maximum timeout: 64 seconds
- Retry limit: Implementation-specific (often 5-10)
- On final failure: Display error and halt

Network-Level Behavior

Understanding how RARP frames interact with network infrastructure (switches, hubs, bridges, and routers) is essential for proper deployment and troubleshooting.

RARP Request Propagation:

The RARP request uses the Ethernet broadcast address (FF:FF:FF:FF:FF:FF), triggering specific network behaviors:

| Infrastructure | Behavior | Impact |
|----------------|----------|--------|
| Hub | Floods out all ports | All connected devices receive the request |
| Unmanaged Switch | Floods out all ports | Same as hub for broadcasts |
| Managed Switch | Floods within VLAN | Contained to configured VLAN |
| Router | Drops the frame | RARP cannot cross routers |
| Bridge | Forwards to other segment | Extends broadcast domain |

The Router Boundary

Routers operate at Layer 3 (IP). A RARP frame (Layer 2) has no IP header, so routers cannot process or forward it. This fundamental limitation means RARP servers must exist on every network segment, even if the segments are adjacent. This was a major driver for developing BOOTP, which uses UDP/IP and can be relayed across routers.

RARP Reply Delivery:

Unlike the broadcast request, the RARP reply is a unicast frame:

- Destination MAC: The specific MAC address of the requesting client
- Source MAC: The RARP server's MAC address

This has implications for switched networks:

1. MAC table learning: The switch learns the client's MAC from the request
2. Unicast forwarding: The reply goes only to the port where the client is connected
3. Reduced flooding: Reply doesn't consume bandwidth on uninvolved ports


Switch MAC Table Dynamics:

The RARP exchange affects the switch's MAC address table:

Before RARP Request:

    MAC Address Table:
      00:AA:BB:CC:DD:EE (RARP Server) -> Port 5
      ... (other entries)
      00:1A:2B:3C:4D:5E (Client) -> NOT PRESENT

After RARP Request (broadcast received):

    MAC Address Table:
      00:AA:BB:CC:DD:EE (RARP Server) -> Port 5
      00:1A:2B:3C:4D:5E (Client) -> Port 1 (learned!)
      ... (other entries)

The switch learns the client's port from the source MAC of the request. When the reply comes from the server, the switch can deliver it directly to Port 1.

Spanning Tree Considerations:

In networks with redundant paths and Spanning Tree Protocol (STP):

- Initial boot delay: STP takes 30-50 seconds to converge
- PortFast mitigation: Modern switches use PortFast for edge ports, skipping STP delay
- RARP timeout interaction: If RARP timeouts are shorter than STP convergence, client will fail to boot on first attempt

Recommendation: Configure edge switch ports with PortFast to eliminate STP delay for client devices.
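
The MAC table dynamics described in this section can be modeled with a toy learning switch. This Python sketch ignores VLANs, aging, and STP, and the `LearningSwitch` class is purely illustrative; the port numbers and addresses match the example tables.

```python
from typing import Dict, Iterable, List

BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy Layer 2 switch: learns source MACs, floods unknowns and broadcasts."""

    def __init__(self, ports: Iterable[int]) -> None:
        self.ports = list(ports)
        self.mac_table: Dict[str, int] = {}

    def receive(self, in_port: int, src_mac: str, dst_mac: str) -> List[int]:
        """Process one frame and return the list of egress ports."""
        self.mac_table[src_mac] = in_port            # learn from source MAC
        if dst_mac == BROADCAST or dst_mac not in self.mac_table:
            return [p for p in self.ports if p != in_port]  # flood
        out_port = self.mac_table[dst_mac]
        return [] if out_port == in_port else [out_port]    # unicast forward


CLIENT, SERVER = "00:1a:2b:3c:4d:5e", "00:aa:bb:cc:dd:ee"
switch = LearningSwitch(ports=[1, 2, 3, 4, 5])

request_ports = switch.receive(1, CLIENT, BROADCAST)  # RARP request floods
reply_ports = switch.receive(5, SERVER, CLIENT)       # RARP reply is unicast
print(request_ports)  # [2, 3, 4, 5]
print(reply_ports)    # [1]
```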

VLAN Configuration

In VLAN-segmented networks, ensure that diskless workstations and their RARP server are in the same VLAN. Since RARP broadcasts don't cross VLAN boundaries (they're Layer 2 constructs), mismatched VLAN assignment is a common misconfiguration that causes boot failures. Verify VLAN tagging on both client and server ports.

Error Conditions and Recovery

RARP operation can fail for numerous reasons. Understanding the error conditions and their symptoms enables effective troubleshooting.

Common Failure Scenarios:

RARP Error Conditions and Diagnostics
| Symptom | Likely Cause | Diagnostic Approach | Resolution |
|---------|--------------|---------------------|------------|
| No response at all | No server on segment | Check server availability, verify same VLAN | Deploy server to segment or fix VLAN config |
| No response at all | Firewall blocking RARP | Review firewall rules for EtherType 0x8035 | Allow RARP traffic |
| No response at all | Server not listening | Check if rarpd daemon is running | Start rarpd service |
| No response at all | Client MAC not in database | Check server logs for rejected queries | Add MAC to /etc/ethers |
| Intermittent failures | Network congestion | Monitor switch errors and broadcasts | Upgrade infrastructure, reduce broadcasts |
| Intermittent failures | Multiple replies colliding | Packet capture to verify | Reduce server count or add delays |
| Wrong IP received | Database mismatch | Verify all server databases are synchronized | Synchronize /etc/ethers across servers |
| Long boot time | Server overloaded | Check server CPU and disk I/O | Optimize server or add capacity |

Client-Side Error Handling:

The client has limited options for error handling due to boot ROM constraints:

1. Retry with backoff: The primary recovery mechanism
2. Error display: Show status on console or LED indicators
3. Halt: Stop after exhausting retries
4. Alternative boot: Some boot ROMs can fall back to local disk

Typical Error Messages:

    Sun Boot ROM Messages:
      "No carrier - transceiver cable problem?" → Physical layer issue
      "RARP timed out" → No server response
      "RARP request failed" → Multiple retries exhausted

    3Com Boot ROM Messages:
      "RPL: Adapter Error" → NIC problem
      "RPL: Time-out waiting for server" → No RARP response

Silent Failures

RARP has no mechanism for the server to indicate 'your MAC is not in my database.' The server simply ignores unknown requests. From the client's perspective, an unknown MAC and no server on the network look identical—both result in timeout. This makes debugging more challenging compared to protocols like DHCP that can send explicit rejection messages.

Server-Side Error Handling:

RARP servers should implement robust error handling:

Logging:

    Jan 15 10:30:01 server rarpd[1234]: received request from 00:1A:2B:3C:4D:5E
    Jan 15 10:30:01 server rarpd[1234]: 00:1A:2B:3C:4D:5E -> workstation01 -> 192.168.1.100
    Jan 15 10:30:01 server rarpd[1234]: sending reply to 00:1A:2B:3C:4D:5E

    Jan 15 10:30:15 server rarpd[1234]: received request from 00:DE:AD:BE:EF:00
    Jan 15 10:30:15 server rarpd[1234]: no entry for 00:DE:AD:BE:EF:00 in /etc/ethers

Database Validation:

- Check /etc/ethers syntax on load
- Verify all hostnames can be resolved
- Validate IP addresses are proper format
- Alert on duplicate MAC entries
- Monitor stale entries (old MACs that no longer exist)
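
Several of the database validation checks above can be automated with a small script. The sketch below is an assumption-laden illustration, not part of any rarpd distribution: `validate_ethers` is a hypothetical helper, it expects exactly two fields per line (`MAC hostname`), and it takes the set of resolvable hostnames as an argument rather than querying a resolver.

```python
import re
from typing import Dict, List, Set

# Colon-separated MAC with one or two hex digits per octet.
MAC_RE = re.compile(r"^[0-9a-f]{1,2}(:[0-9a-f]{1,2}){5}$")


def validate_ethers(text: str, known_hosts: Set[str]) -> List[str]:
    """Return a list of human-readable problems found in ethers-style text."""
    problems: List[str] = []
    first_seen: Dict[str, int] = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        fields = raw.split("#", 1)[0].split()
        if not fields:
            continue  # blank or comment-only line
        if len(fields) != 2:
            problems.append(f"line {lineno}: expected 'MAC hostname'")
            continue
        mac, host = fields[0].lower(), fields[1]
        if not MAC_RE.match(mac):
            problems.append(f"line {lineno}: malformed MAC {mac}")
            continue
        if mac in first_seen:
            problems.append(
                f"line {lineno}: duplicate MAC {mac} "
                f"(first seen on line {first_seen[mac]})")
        else:
            first_seen[mac] = lineno
        if host not in known_hosts:
            problems.append(f"line {lineno}: hostname {host} does not resolve")
    return problems


sample = (
    "00:1A:2B:3C:4D:5E workstation01\n"
    "00:1a:2b:3c:4d:5e workstation02\n"
    "not-a-mac workstation03\n"
)
for problem in validate_ethers(sample, known_hosts={"workstation01"}):
    print(problem)
```

Note that this sketch compares MAC strings after lowercasing only, so `0:1a:...` and `00:1a:...` would be treated as different addresses; a fuller check would normalize each octet first.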

Network-Level Debugging:

Packet capture is the definitive tool for RARP troubleshooting:

Using tcpdump:

    # Capture all RARP traffic on interface eth0
    tcpdump -i eth0 ether proto 0x8035 -v

    # Sample output (illustrative; exact wording varies by tcpdump version):
    10:30:01.123456 00:1A:2B:3C:4D:5E > ff:ff:ff:ff:ff:ff,
      RARP-req who-is 00:1A:2B:3C:4D:5E tell 00:1A:2B:3C:4D:5E
    10:30:01.125123 00:AA:BB:CC:DD:EE > 00:1A:2B:3C:4D:5E,
      RARP-reply 00:1A:2B:3C:4D:5E at 192.168.1.100

Using a Wireshark display filter:

    eth.type == 0x8035

This shows only RARP frames and provides decoded field analysis. Wireshark decodes RARP with its ARP dissector, so the individual fields appear under the arp.* filter names (for example, arp.opcode == 3 for requests and arp.opcode == 4 for replies).

Multiple Server Behavior

For reliability, production networks typically deploy multiple RARP servers on each segment. This redundancy introduces specific behaviors and potential issues that must be understood.

The Race Condition:

When multiple servers receive a RARP request simultaneously:

1. All servers receive the broadcast at nearly the same time
2. Each server looks up the MAC independently
3. All servers with matching entries respond with replies
4. Client accepts the first reply it receives
5. Subsequent replies are discarded by the client

This racing is intentional and provides natural redundancy: as long as one server with a matching entry is reachable, the client gets an answer.


Load Distribution Effects:

Surprisingly, having multiple servers doesn't provide true load balancing for RARP:

| Scenario | Result |
|----------|--------|
| Servers with identical performance | Random distribution based on timing jitter |
| One server faster than others | Faster server handles most requests |
| Servers with different load | Less loaded server wins more often |

This means:

- The fastest server typically handles most traffic
- 'Primary/backup' roles are determined by performance, not configuration
- All servers still process every request and send a reply, even though the client uses only the first one to arrive

Database Consistency Critical:

All RARP servers on a segment must have identical databases:

| Consistency Issue | Symptom |
|-------------------|---------|
| Server A has MAC, Server B doesn't | Some boots succeed, some fail (depending on which wins race) |
| Servers have different IPs for same MAC | Client gets different IP on each boot |
| Hostname typo on one server | Sporadic failures for that host |
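
The consistency issues in the table can be detected by comparing each server's effective MAC-to-IP mapping. The following Python sketch assumes those mappings have already been loaded into dictionaries; `find_mapping_conflicts` and the server names are hypothetical.

```python
from typing import Dict, Optional


def find_mapping_conflicts(
    server_maps: Dict[str, Dict[str, str]]
) -> Dict[str, Dict[str, Optional[str]]]:
    """Report every MAC for which the servers would not all give the same answer.

    server_maps: server name -> {MAC -> IP}. A missing entry is reported as None.
    """
    all_macs = set().union(*(m.keys() for m in server_maps.values()))
    conflicts: Dict[str, Dict[str, Optional[str]]] = {}
    for mac in sorted(all_macs):
        answers = {name: m.get(mac) for name, m in server_maps.items()}
        if len(set(answers.values())) > 1:   # disagreement or missing entry
            conflicts[mac] = answers
    return conflicts


servers = {
    "server-a": {"00:1a:2b:3c:4d:5e": "192.168.1.100",
                 "00:1a:2b:3c:4d:5f": "192.168.1.101",
                 "00:1a:2b:3c:4d:60": "192.168.1.102"},
    "server-b": {"00:1a:2b:3c:4d:5e": "192.168.1.200",   # different IP
                 "00:1a:2b:3c:4d:5f": "192.168.1.101"},  # :60 is missing
}
for mac, answers in find_mapping_conflicts(servers).items():
    print(mac, answers)
```

Running a check like this after each synchronization would flag both the "different IPs" and the "missing on one server" cases before a client hits them.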

The Inconsistency Nightmare

If Server A maps MAC X to IP 192.168.1.100 and Server B maps MAC X to IP 192.168.1.200, the client will receive a different address depending on which server responds first. This leads to intermittent, hard-to-diagnose issues where the workstation 'randomly' has different IPs. Always use automated synchronization for multi-server deployments.

Synchronization Strategies:

1. Manual copying: Simple but error-prone, use for small deployments
2. Shared storage (NFS): All servers mount the same /etc/ethers file
3. Rsync cron job: Periodic synchronization from a master server
4. Configuration management: Ansible/Puppet/Chef manages all files
5. NIS (Network Information Service): Centralized database for ethers, hosts

Recommended approach for multi-server deployments:

    # On primary server, rsync to all secondaries every 5 minutes
    */5 * * * * rsync -av /etc/ethers /etc/hosts secondary1:/etc/
    */5 * * * * rsync -av /etc/ethers /etc/hosts secondary2:/etc/

    # Or use NFS:
    # All servers mount: nfs-server:/shared/etc/ethers -> /etc/ethers (read-only)

Integration with the Complete Boot Process

RARP is just the first step in the diskless workstation boot process. Understanding how RARP integrates with subsequent phases provides context for its role and limitations.

The Complete Diskless Boot Sequence:


Phase 1: RARP - Address Discovery

What RARP provides:

- The client's IP address

What RARP does NOT provide:

- Subnet mask
- Default gateway
- Boot server address
- Boot file name
- Any other configuration

Workarounds vendors used:

| Missing Information | Workaround |
|---------------------|------------|
| Boot server address | Assume RARP server is also TFTP server |
| Boot filename | Derive from IP (e.g., C0A8016A for 192.168.1.106) |
| Subnet mask | Hard-code or derive from IP class |
| Default gateway | Assume not needed (local segment) |

Phase 2: TFTP - Boot Image Download

After RARP, the client uses TFTP (Trivial File Transfer Protocol) to download its boot image:

1. Client sends TFTP Read Request to RARP server IP
2. Filename: derived from IP address in hex (e.g., C0A80164) or architecture-specific (e.g., C0A80164.SUN4)
3. TFTP server sends boot image in 512-byte blocks
4. Client acknowledges each block
5. Boot image loaded into memory

The IP-to-Filename Mapping:

Sun's convention (widely adopted):

| Client IP | Hex IP | Filename |
|-----------|--------|----------|
| 192.168.1.100 | C0A80164 | C0A80164 |
| 10.0.0.1 | 0A000001 | 0A000001 |
| 172.16.5.200 | AC1005C8 | AC1005C8.SUN4 |

The server's TFTP directory contains:

    /tftpboot/
      C0A80164 -> symlink to sunos-boot-image
      C0A80165 -> symlink to sunos-boot-image
      sunos-boot-image
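
The IP-to-filename convention is easy to reproduce: each octet becomes two uppercase hex digits. A minimal Python sketch follows, where `boot_filename` is a hypothetical helper and the optional architecture suffix follows the `.SUN4` example above.

```python
import ipaddress


def boot_filename(ip: str, arch: str = "") -> str:
    """Derive the TFTP boot filename from a dotted-quad IPv4 address."""
    hex_ip = ipaddress.IPv4Address(ip).packed.hex().upper()
    return f"{hex_ip}.{arch}" if arch else hex_ip


print(boot_filename("192.168.1.100"))         # C0A80164
print(boot_filename("10.0.0.1"))              # 0A000001
print(boot_filename("172.16.5.200", "SUN4"))  # AC1005C8.SUN4
```

An administrator could use the same function to generate the symlink names in the TFTP directory when adding a new client.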

The Cleverness of IP-based Filenames

By deriving the filename from the IP address, the boot process avoided needing another protocol to discover the boot filename. The convention worked because the same administrator who adds a MAC to /etc/ethers also creates the TFTP symlink. This tight coupling was manageable in small deployments but became a scalability problem for large installations.

Phase 3: Operating System Initialization

Once the boot image is loaded and executing:

1. Kernel boots: The downloaded kernel initializes
2. Network reconfiguration: Kernel rebuilds network config (possibly via another RARP or using the boot ROM's values)
3. NFS root mount: The kernel mounts its root filesystem via NFS:

       mount -t nfs server:/export/root/client1 /

4. Init execution: /sbin/init runs, bringing up the system
5. Swap configuration: Swap may be local (if any disk) or NFS-mounted

Complete Timeline Example:

| Time | Event | Protocol |
|------|-------|----------|
| 0.0s | Power on | - |
| 0.5s | Boot ROM starts | - |
| 0.6s | RARP request sent | RARP |
| 0.7s | RARP reply received | RARP |
| 0.8s | TFTP request sent | TFTP |
| 5.0s | Boot image downloaded (4MB) | TFTP |
| 5.5s | Kernel executing | - |
| 8.0s | NFS root mounted | NFS |
| 15.0s | Login prompt | - |

Summary: Mastering RARP Operation

We have dissected the complete operation of RARP at both the packet level and the system level. Let's consolidate the key operational concepts:

Key Takeaways

- RARP follows a precise request-reply lifecycle: the client broadcasts a request and the server unicasts a reply with the assigned IP address.
- Timeout and retry strategies are critical: exponential backoff with jitter provides resilient operation without network overload.
- RARP is Layer 2 only: broadcasts don't cross routers, so servers must be on every segment needing RARP service.
- Multiple servers provide redundancy: all servers race to respond, and the client accepts the first reply.
- Database consistency is essential: all servers must have identical mappings to avoid intermittent failures.
- RARP integrates with TFTP and NFS: it's the first step in a multi-phase boot process for diskless workstations.

What's Next:

In the next page, we will explore BOOTP (Bootstrap Protocol), which extended RARP's concept to address its fundamental limitations. You will learn how BOOTP moved from Layer 2 to Layer 3, enabled complete boot configuration in a single exchange, and introduced the relay agent concept that allowed centralized servers to serve clients across routed networks.

Page Complete

You now understand RARP's operational mechanics in complete detail. You can trace through a RARP transaction packet by packet, identify and troubleshoot common failure scenarios, and explain how RARP integrates with the complete diskless boot process. This operational knowledge prepares you to appreciate how BOOTP improved upon RARP's foundation.
