IPv4 Header Format - Learning Module

Identification Field

Reassembling the Pieces

When an IP datagram is fragmented into multiple pieces, how does the receiving host know which pieces belong together? The Internet is a chaotic environment—fragments from dozens of different original datagrams might arrive interleaved, out of order, from different sources. The Identification field provides the critical link that allows fragments to be correctly matched and reassembled.

This 16-bit field, occupying bytes 4 and 5 of the IPv4 header, serves as a datagram identifier that, combined with the source address, destination address, and protocol, uniquely identifies the original datagram from which fragments were derived. But the Identification field's story doesn't end with fragmentation—its predictable generation in older systems has been exploited for network scanning, idle host detection, and other reconnaissance techniques.

What You Will Learn

By the end of this page, you will master the Identification field completely: its exact encoding and position, role in fragment reassembly, generation algorithms and their security implications, relationship with the Flags and Fragment Offset fields, and practical techniques for analyzing fragmented traffic.

Field Position and Encoding

The Identification field occupies bytes 4 and 5 of the IPv4 header (bit positions 32-47), immediately following the Total Length field and preceding the Flags field.

Identification Field Position in IPv4 Header
Bytes	Bit Positions	Field Name	Size
0	0-7	Version + IHL	8 bits
1	8-15	Type of Service	8 bits
2-3	16-31	Total Length	16 bits
4-5	32-47	Identification	16 bits
6-7	48-63	Flags + Fragment Offset	3 + 13 bits

Identification Field Specifications

•Size: 16 bits (2 bytes)
•Position: Bytes 4-5 (bit positions 32-47)
•Encoding: Unsigned 16-bit integer, big-endian
•Range: 0-65,535
•Set by: Source host when creating the datagram
•Copied to: All fragments derived from original datagram
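
As a small illustration of the encoding listed above, the following sketch reads and writes the Identification field of a raw header with Python's struct module. The helper names get_identification and set_identification are illustrative and not taken from any library, and the sample header bytes are made up for the example.

```python
import struct

ID_OFFSET = 4  # Identification starts at byte 4 of the IPv4 header


def get_identification(header: bytes) -> int:
    """Return the 16-bit Identification value (network byte order)."""
    if len(header) < 20:
        raise ValueError("IPv4 header must be at least 20 bytes")
    return struct.unpack_from("!H", header, ID_OFFSET)[0]


def set_identification(header: bytes, ident: int) -> bytes:
    """Return a copy of the header with Identification replaced.

    The header checksum is NOT recomputed in this sketch.
    """
    if not 0 <= ident <= 0xFFFF:
        raise ValueError("Identification must fit in 16 bits")
    buf = bytearray(header)
    struct.pack_into("!H", buf, ID_OFFSET, ident)
    return bytes(buf)


# Hypothetical 20-byte header: Version/IHL 0x45, ToS 0, Total Length 4000
# (0x0FA0), Identification 0x1A2B, remaining fields zeroed for brevity.
sample = bytes([0x45, 0x00, 0x0F, 0xA0, 0x1A, 0x2B]) + bytes(14)
print(hex(get_identification(sample)))
```

Because the field is big-endian, the "!H" format (network order, unsigned 16-bit) is the natural fit; a native-order format would give the wrong value on little-endian machines.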

Key Definition:

From RFC 791:

"An identifying value assigned by the sender to aid in assembling the fragments of a datagram."

The Identification field works in conjunction with three other fields to enable fragment reassembly:

Source Address: Fragments from different sources are never mixed
Destination Address: Fragments to different destinations are tracked separately
Protocol: Fragments for different upper-layer protocols are tracked separately
Identification: Distinguishes datagrams from the same source/dest/protocol

The Four-Tuple Uniqueness

A datagram is uniquely identified by the four-tuple (Source IP, Destination IP, Protocol, Identification), and every fragment of it carries that same four-tuple. The receiving host maintains a separate reassembly buffer for each unique four-tuple. This prevents fragments from being mixed incorrectly even when different senders, or different protocols, happen to use the same Identification value.
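
One way to picture the per-four-tuple buffers is a dictionary keyed by the tuple. The sketch below is a simplified model, not how any particular IP stack stores its state; the class name ReassemblyTable and the documentation-range addresses are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (source IP, destination IP, protocol, identification)
ReassemblyKey = Tuple[str, str, int, int]


class ReassemblyTable:
    """Groups fragments into per-datagram buffers keyed by the four-tuple."""

    def __init__(self) -> None:
        # Each buffer holds (byte_offset, payload) pairs in arrival order.
        self.buffers: Dict[ReassemblyKey, List[Tuple[int, bytes]]] = defaultdict(list)

    def add(self, src: str, dst: str, proto: int, ident: int,
            byte_offset: int, payload: bytes) -> ReassemblyKey:
        key = (src, dst, proto, ident)
        self.buffers[key].append((byte_offset, payload))
        return key


table = ReassemblyTable()
# Two senders that happen to pick the same Identification value.
k1 = table.add("192.0.2.1", "198.51.100.9", 17, 0x1A2B, 0, b"a" * 8)
k2 = table.add("203.0.113.5", "198.51.100.9", 17, 0x1A2B, 0, b"b" * 8)
k3 = table.add("192.0.2.1", "198.51.100.9", 17, 0x1A2B, 8, b"c" * 8)
print(len(table.buffers))      # distinct datagrams being reassembled
print(len(table.buffers[k1]))  # fragments collected for the first sender
```

The two senders share Identification 0x1A2B, yet their fragments land in different buffers because the source address is part of the key.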

Role in IP Fragmentation

The Identification field is central to the IP fragmentation and reassembly mechanism. When a datagram is fragmented, all resulting fragments carry the same Identification value, allowing the receiver to group them correctly.

Fragmentation Process:

Original Datagram: The source assigns a unique Identification value (per destination/protocol)
Intermediate Router: When datagram exceeds a link's MTU and DF=0, router fragments it
Each Fragment: Receives the same Identification as the original
Receiver: Groups fragments by four-tuple, orders by Fragment Offset, waits for MF=0 fragment
Reassembly: When all fragments present, reassembles into original datagram

Fragmentation Example: 4000-byte Datagram
Datagram/Fragment	Identification	MF	Offset	Total Length
Original	0x1A2B	—	—	4000
Fragment 1	0x1A2B	1	0	1500
Fragment 2	0x1A2B	1	185	1500
Fragment 3	0x1A2B	0	370	1040

Key Observations:

All fragments share Identification = 0x1A2B
The receiver sees three packets with the same four-tuple and same Identification
Fragment Offset orders the data: 0, 1480, 2960 (offset × 8)
MF=0 on Fragment 3 indicates final fragment
When all fragments arrive, receiver reconstructs original 4000 bytes
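
The numbers in the table can be reproduced with a short calculation. This is a minimal sketch that assumes a 20-byte header with no options and a single fragmenting hop with a 1500-byte MTU; the function name fragment and the Fragment tuple are illustrative.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    mf: int            # More Fragments flag
    offset_units: int  # Fragment Offset field (8-byte units)
    total_length: int  # header + payload for this fragment


def fragment(total_length: int, mtu: int, header_len: int = 20) -> List[Fragment]:
    """Split a datagram of total_length bytes for a link with the given MTU."""
    payload = total_length - header_len
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    max_chunk = (mtu - header_len) // 8 * 8
    if max_chunk <= 0:
        raise ValueError("MTU too small to carry any payload")
    frags = []
    sent = 0
    while sent < payload:
        chunk = min(max_chunk, payload - sent)
        last = sent + chunk == payload
        frags.append(Fragment(0 if last else 1, sent // 8, header_len + chunk))
        sent += chunk
    return frags


for f in fragment(4000, 1500):
    print(f)
```

The 4000-byte datagram carries 3980 bytes of payload, which splits into 1480 + 1480 + 1020; adding the 20-byte header to each piece gives the Total Length column of 1500, 1500, and 1040.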

What If Fragments Are Lost:

If any fragment is lost:

Receiver cannot complete reassembly
Partial reassembly buffer is discarded after timeout (typically 30-60 seconds)
For TCP, the sender retransmits the entire TCP segment (not just lost fragment)
For UDP, the entire datagram is lost; application must handle this
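
The timeout behavior can be sketched as a table of first-arrival timestamps. This model assumes the timer starts with the first fragment and is never extended, and it uses a 30-second default; real stacks differ in both respects. The class name ReassemblyTimers is hypothetical.

```python
import time
from typing import Dict, List, Optional, Tuple

ReassemblyKey = Tuple[str, str, int, int]  # (src, dst, protocol, identification)


class ReassemblyTimers:
    """Tracks when each partial datagram was first seen and expires stale ones."""

    def __init__(self, timeout_seconds: float = 30.0) -> None:
        self.timeout_seconds = timeout_seconds
        self.first_seen: Dict[ReassemblyKey, float] = {}

    def on_fragment(self, key: ReassemblyKey, now: Optional[float] = None) -> None:
        """Start the timer when the first fragment of a datagram arrives."""
        now = time.monotonic() if now is None else now
        self.first_seen.setdefault(key, now)

    def expire(self, now: Optional[float] = None) -> List[ReassemblyKey]:
        """Drop and return every datagram whose timer has run out."""
        now = time.monotonic() if now is None else now
        stale = [k for k, t in self.first_seen.items()
                 if now - t >= self.timeout_seconds]
        for k in stale:
            del self.first_seen[k]
        return stale
```

Passing now explicitly makes the behavior easy to test; in a live receiver the monotonic clock default would be used and expire would be called periodically.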

Path MTU Discovery Preference

Modern best practice is to avoid fragmentation entirely using Path MTU Discovery. Fragmentation has significant downsides: increased latency if any fragment is lost (entire datagram must be retransmitted), reassembly resource consumption at receivers, and security vulnerabilities in reassembly code. Set DF=1 and use PMTUD whenever possible.

Identification Generation Algorithms

RFC 791 doesn't specify how Identification values should be generated. It only requires that the value be unique for a given source-destination pair and protocol for as long as the datagram could remain active in the network. Different operating systems have used various algorithms, with significant security implications.

Identification Generation Strategies
Strategy	Description	Security	Used By
Global Counter	Single counter incremented for all packets	Very Weak	Early Linux, early Windows
Per-Destination Counter	Separate counter per destination IP	Weak	Mid-era Unix systems
Per-Connection Counter	Counter per source/dest/protocol tuple	Moderate	Some embedded systems
Random	Cryptographically random per packet	Strong	Modern Linux, Windows, macOS
Hash-Based	Hash of connection tuple + secret + counter	Strong	FreeBSD, OpenBSD

The Problem with Counters:

Global Increment Vulnerability:

If a system uses a simple global counter (ID = previous ID + 1), an attacker can:

Send a packet to the target, noting the returned ID (e.g., 1000)
Wait brief period
Send another packet, note new ID (e.g., 1005)
Infer 4 packets were sent to other destinations in between
Repeat to track target's network activity patterns
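
The inference in the steps above is a modular subtraction. The sketch below assumes a global counter that increases by exactly 1 per packet and that fewer than 65,536 packets were sent between the two probes; the function name packets_between is illustrative.

```python
def packets_between(first_id: int, second_id: int) -> int:
    """Estimate packets the target sent to others between two probe replies."""
    # Modular subtraction handles the counter wrapping from 65535 to 0.
    delta = (second_id - first_id) & 0xFFFF
    # One increment belongs to the reply to our own second probe.
    return max(delta - 1, 0)


print(packets_between(1000, 1005))
print(packets_between(65534, 3))
```

Both calls report 4: the first matches the example in the list, and the second shows the same gap straddling the 16-bit wraparound.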

Idle Scan (Zombie Scan):

The predictable global counter enabled the famous "idle scan" technique:

Find a "zombie" host with predictable IP ID
Probe zombie to get current ID
Spoof packet to target with zombie as source
Probe zombie again for new ID
If target port is open, the target sent a SYN/ACK to the zombie, and the zombie answered with a RST, incrementing its ID
Attacker determines port status without touching target directly

ID Generation Comparison
Python
import os
import hashlib
import struct
from typing import Dict
 
class InsecureGlobalCounter:
    """
    INSECURE: Global counter visible to network scans.
    DO NOT use in production.
    """
    def __init__(self):
        self.counter = 0
    
    def next_id(self, src, dst, proto) -> int:
        self.counter = (self.counter + 1) & 0xFFFF
        return self.counter
 
class SecurePerConnectionRandom:
    """
    SECURE: Cryptographically random per packet.
    Modern Linux, Windows 10+, macOS approach.
    """
    def next_id(self, src, dst, proto) -> int:
        return struct.unpack('!H', os.urandom(2))[0]
 
class SecureHashBased:
    """
    SECURE: Hash-based with per-boot secret.
    FreeBSD/OpenBSD approach. An outside observer cannot
    predict the next value without the secret. Note that
    truncating the hash to 16 bits means values can repeat.
    """
    def __init__(self):
        self.secret = os.urandom(16)
        self.counters: Dict[tuple, int] = {}
    
    def next_id(self, src: str, dst: str, proto: int) -> int:
        # Create connection tuple
        key = (src, dst, proto)
        
        # Get or initialize per-connection counter
        if key not in self.counters:
            self.counters[key] = 0
        self.counters[key] = (self.counters[key] + 1) & 0xFFFF
        
        # Hash: secret + tuple + counter
        h = hashlib.sha256()
        h.update(self.secret)
        h.update(f"{src}:{dst}:{proto}".encode())
        h.update(struct.pack('!H', self.counters[key]))
        
        # Take 16 bits from hash
        return struct.unpack('!H', h.digest()[:2])[0]
 
# Demonstration
 
print("Global Counter (INSECURE):")
gc = InsecureGlobalCounter()
for i in range(5):
    print(f"  Packet {i+1}: ID = {gc.next_id('10.0.0.1', '10.0.0.2', 6)}")
 
print("\nRandom (SECURE):")
rnd = SecurePerConnectionRandom()
for i in range(5):
    print(f"  Packet {i+1}: ID = {rnd.next_id('10.0.0.1', '10.0.0.2', 6)}")
 
# Output:
# Global Counter (INSECURE):
#   Packet 1: ID = 1
#   Packet 2: ID = 2  <- Predictable pattern!
#   Packet 3: ID = 3
# ...
# Random (SECURE):
#   Packet 1: ID = 48291
#   Packet 2: ID = 12847  <- No pattern
#   Packet 3: ID = 59102

RFC 7739 Recommendations

RFC 7739 (2016) analyzes the security implications of predictable fragment Identification values. It is written for IPv6, but the same reasoning carries over to the IPv4 ID: avoid global counters and prefer per-destination or hash-based generation with a secret key. Separately, RFC 6864 allows any ID value, including 0, for atomic (unfragmented, DF=1) datagrams, since the field is unused for them.

Special Considerations: ID = 0

A subtle but important optimization involves the use of Identification = 0 for packets that will never be fragmented.

When ID Is Irrelevant:

The Identification field is only used for fragment reassembly. If a packet will never be fragmented, the ID value is never used. This occurs when:

DF bit is set (Don't Fragment): The packet cannot be fragmented; ID is unused
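Packet size ≤ 68 bytes: RFC 791 requires every link to carry a 68-byte datagram without fragmenting it. (576 bytes is the size every host must be able to receive, not a minimum MTU, so packets up to 576 bytes are rarely fragmented in practice but are not guaranteed to avoid it.)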
Atomic datagrams (RFC 6864): Packets defined as never fragmentable

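RFC 6864 (2013):

RFC 6864 updated the interpretation of the Identification field. It defines an atomic datagram as one with DF=1, MF=0, and Fragment Offset=0, and it states that the ID field must not be used for any purpose other than fragmentation and reassembly. As a consequence, a source may set the ID of an atomic datagram to any value, including zero, and receivers must ignore it.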

Benefits of ID = 0:

Reduced state: No need to track counters for atomic datagrams
Improved security: Eliminates ID-based inference for DF=1 packets
Memory efficiency: No per-destination counter storage needed
CPU efficiency: No counter lookup or generation
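
RFC 6864 defines an atomic datagram as one with DF=1, MF=0, and Fragment Offset=0. A check for that condition needs only bytes 6-7 of the header, as the sketch below shows. The names is_atomic and make_header are illustrative, and make_header builds a mostly zeroed header purely for demonstration.

```python
import struct

DF_MASK = 0x4000      # Don't Fragment
MF_MASK = 0x2000      # More Fragments
OFFSET_MASK = 0x1FFF  # 13-bit Fragment Offset


def is_atomic(header: bytes) -> bool:
    """True if DF=1, MF=0 and Fragment Offset=0 (atomic per RFC 6864)."""
    if len(header) < 20:
        raise ValueError("IPv4 header must be at least 20 bytes")
    flags_offset = struct.unpack_from("!H", header, 6)[0]
    return (bool(flags_offset & DF_MASK)
            and not flags_offset & MF_MASK
            and (flags_offset & OFFSET_MASK) == 0)


def make_header(flags_offset: int) -> bytes:
    """Hypothetical helper: a zeroed 20-byte header with bytes 6-7 set."""
    return bytes([0x45, 0, 0, 20, 0, 0]) + struct.pack("!H", flags_offset) + bytes(12)


print(is_atomic(make_header(0x4000)))  # DF only
print(is_atomic(make_header(0x2000)))  # MF set, so this is a fragment
```

A sender could consult a check like this before deciding whether a fresh Identification value is needed at all.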

When to Set ID = 0
Scenario	DF Bit	Set ID = 0?	Rationale
TCP MSS < path MTU	1 (DF)	Yes	Will never fragment
UDP, size ≤ 576	0 or 1	Caution	Could fragment on rare paths
PMTUD enabled	1 (DF)	Yes	By design never fragments
Unknown path	0 (allow)	No	Might fragment, need valid ID
IPsec ESP (tunnel)	1 (DF)	Yes	Inner packet, never exposed to fragmentation

Operating System Behavior

Linux sends IP ID = 0 on some unfragmented DF=1 packets, and other operating systems have adopted comparable behavior; the details depend on the kernel version and the type of socket. Check your platform's documentation before relying on a specific value.

Security Implications

The Identification field has been central to numerous security issues, from network reconnaissance to connection hijacking. Understanding these vulnerabilities is essential for security practitioners.

ID-Based Attack Techniques

•Idle Scan (nmap -sI): Use predictable IP IDs on zombie host to scan target without revealing scanner's IP. Attacker spoofs packets from zombie, measures zombie's ID changes to determine if target ports are open.
•OS Fingerprinting: Different operating systems use different ID generation algorithms. By analyzing ID patterns, attackers identify the OS remotely (nmap's OS detection uses this).
•Connection Inference: Global counters reveal when other connections exist. If ID jumps from 1000 to 1010 in 1 second, 9 other packets were sent—useful for traffic analysis.
•Firewall Bypass: Some stateful firewalls track IP IDs. By predicting next ID, attackers can craft fragments that appear to belong to permitted flows.
•Fragment Overlap Attacks: Crafted fragments with same ID but overlapping offsets can confuse reassembly, potentially injecting malicious data (Teardrop, overlapping fragment evasion).

Idle Scan in Detail:

The idle scan is a sophisticated technique that exploits predictable IP IDs:

1. Attacker → Zombie: SYN/ACK probe
   Zombie → Attacker: RST (ID = 1000)

2. Attacker → Target (spoofed as Zombie): SYN to port 80

   IF port 80 is OPEN:
     Target → Zombie: SYN/ACK
     Zombie → Target: RST (ID = 1001)  ← Increments zombie's ID
   
   IF port 80 is CLOSED:
     Target → Zombie: RST
     Zombie: (ignores RST, no response, ID unchanged)

3. Attacker → Zombie: SYN/ACK probe
   Zombie → Attacker: RST (ID = ???)

   IF ID = 1002: Port was OPEN (zombie's counter incremented twice)
   IF ID = 1001: Port was CLOSED (only our probe incremented it)

The target cannot trace this scan back to the attacker: every packet it receives appears to come from the zombie.
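
The decision logic of the three steps can be replayed without sending any traffic. The sketch below is a pure simulation: SimulatedZombie models an otherwise idle host with a global counter, and both names are invented for this example.

```python
class SimulatedZombie:
    """Idle host whose IP ID is a global counter (simulation only)."""

    def __init__(self, start_id: int) -> None:
        self.ip_id = start_id

    def send_rst(self) -> int:
        """Emit a RST carrying the current ID, then advance the counter."""
        sent_id = self.ip_id
        self.ip_id = (self.ip_id + 1) & 0xFFFF
        return sent_id


def idle_scan_step(zombie: SimulatedZombie, port_open: bool) -> str:
    """Replay the three steps above and classify the port from the ID delta."""
    before = zombie.send_rst()   # step 1: reply to the attacker's first probe
    if port_open:
        zombie.send_rst()        # step 2: zombie answers the target's SYN/ACK
    # A closed port makes the target send a RST, which the zombie ignores.
    after = zombie.send_rst()    # step 3: reply to the attacker's second probe
    delta = (after - before) & 0xFFFF
    if delta == 2:
        return "open"
    if delta == 1:
        return "closed"
    return "inconclusive"


print(idle_scan_step(SimulatedZombie(1000), port_open=True))
print(idle_scan_step(SimulatedZombie(1000), port_open=False))
```

Any other delta means the zombie was not idle during the measurement, which is why the technique depends on finding a quiet host.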

Detecting Predictable IDs
Python
"""
Script to check if a host has predictable IP IDs.
For educational/authorized testing only.
"""
 
from scapy.all import IP, ICMP, sr1
import time
 
def check_ip_id_predictability(target: str, samples: int = 10) -> dict:
    """
    Send probes and analyze IP ID patterns.
    
    Returns analysis of ID generation behavior.
    """
    ids = []
    
    for i in range(samples):
        # Send ICMP Echo Request, collect Echo Reply
        pkt = IP(dst=target)/ICMP()
        reply = sr1(pkt, timeout=2, verbose=0)
        
        if reply and reply.haslayer(IP):
            ids.append(reply[IP].id)
        time.sleep(0.1)  # Small delay between probes
    
    if len(ids) < 2:
        return {'status': 'unreachable', 'ids': ids}
    
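    # Analyze patterns (differences are taken modulo 2^16 so that a counter
    # wrapping from 65535 to 0 still shows up as an increment of 1)
    differences = [(ids[i+1] - ids[i]) & 0xFFFF for i in range(len(ids)-1)]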
    
    # Check for global increment pattern
    if all(d == 1 for d in differences):
        return {
            'status': 'vulnerable',
            'pattern': 'global_increment',
            'ids': ids,
            'note': 'Trivially predictable, exploitable for idle scan'
        }
    
    # Check for small, consistent increment
    if all(0 < d < 100 for d in differences):
        return {
            'status': 'vulnerable', 
            'pattern': 'increment_with_noise',
            'ids': ids,
            'note': 'Likely global counter, may be exploitable'
        }
    
    # Check if all IDs are zero (RFC 6864 optimization)
    if all(value == 0 for value in ids):
        return {
            'status': 'secure',
            'pattern': 'zero',
            'ids': ids,
            'note': 'RFC 6864 compliant, ID=0 for atomic datagrams'
        }
    
    # Random pattern
    avg_diff = sum(abs(d) for d in differences) / len(differences)
    if avg_diff > 1000:
        return {
            'status': 'secure',
            'pattern': 'random',
            'ids': ids,
            'note': 'Appears random, not exploitable'
        }
    
    return {
        'status': 'unknown',
        'pattern': 'unclear',
        'ids': ids,
        'differences': differences
    }
 
# Example check (requires root/admin and authorized target)
# result = check_ip_id_predictability("192.168.1.1")
# print(result)

Defense Recommendations

To defend against IP ID exploitation: (1) Use modern OS with random ID generation, (2) Set DF=1 to enable ID=0 optimization, (3) Block outbound ICMP from servers that shouldn't respond to probes, (4) Use stateless filtering rather than ID-based tracking in firewalls.

Relationship with Flags and Fragment Offset

The Identification field works closely with the Flags and Fragment Offset fields (bytes 6-7) to implement fragmentation. Together, these fields form a complete fragmentation control system.

Fragmentation Control Fields
Field	Size	Purpose	Values
Identification	16 bits	Groups fragments together	0-65535, same for all fragments
Reserved Flag	1 bit	Reserved, must be 0	Always 0
DF (Don't Fragment)	1 bit	Prohibit fragmentation	0=Allow, 1=Don't Fragment
MF (More Fragments)	1 bit	Indicates more fragments follow	0=Last/Only, 1=More coming
Fragment Offset	13 bits	Position in original data	Units of 8 bytes, 0-8191

Field Interactions:

Scenario 1: Unfragmented Datagram

Identification: 0 (or any value, unused)
DF: 1 (Don't Fragment)
MF: 0 (No more fragments)
Fragment Offset: 0

Scenario 2: First Fragment

Identification: 0x1A2B (copied from original)
DF: 0 (Allow fragmentation—already fragmented)
MF: 1 (More fragments coming)
Fragment Offset: 0 (First chunk of data)

Scenario 3: Middle Fragment

Identification: 0x1A2B (same as first)
DF: 0
MF: 1 (More fragments coming)
Fragment Offset: 185 (= 1480 / 8, offset in 8-byte units)

Scenario 4: Last Fragment

Identification: 0x1A2B (same as all others)
DF: 0
MF: 0 (This is the last fragment)
Fragment Offset: 370 (= 2960 / 8)
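
The four scenarios can also be expressed as the 16-bit word that occupies bytes 6-7. The sketch below encodes each one; pack_flags_offset is an illustrative helper name, and the byte offsets are taken from the 4000-byte example.

```python
import struct


def pack_flags_offset(df: bool, mf: bool, byte_offset: int) -> bytes:
    """Encode bytes 6-7 of the IPv4 header from DF, MF and a byte offset."""
    if byte_offset % 8 != 0:
        raise ValueError("fragment data must start on an 8-byte boundary")
    units = byte_offset // 8
    if not 0 <= units <= 0x1FFF:
        raise ValueError("Fragment Offset must fit in 13 bits")
    word = (0x4000 if df else 0) | (0x2000 if mf else 0) | units
    return struct.pack("!H", word)


# The four scenarios above, using the offsets from the 4000-byte example.
scenarios = {
    "unfragmented": pack_flags_offset(True, False, 0),
    "first": pack_flags_offset(False, True, 0),
    "middle": pack_flags_offset(False, True, 1480),
    "last": pack_flags_offset(False, False, 2960),
}
for name, raw in scenarios.items():
    print(f"{name:>12}: 0x{raw.hex()}")
```

The middle fragment encodes as 0x20b9 (MF bit plus offset 185) and the last as 0x0172 (offset 370 with no flags set).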

Fragment Analysis
Python
import struct
from dataclasses import dataclass
from typing import List, Optional
 
@dataclass
class FragmentInfo:
    identification: int
    dont_fragment: bool
    more_fragments: bool
    fragment_offset: int
    total_length: int
    data_offset: int  # Where payload starts in original datagram
    data_length: int  # Payload size in this fragment
 
def parse_fragmentation_fields(packet: bytes) -> FragmentInfo:
    """Parse fragmentation-related fields from IPv4 header."""
    if len(packet) < 20:
        raise ValueError("Packet too short")
    
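    # IHL (header length in 32-bit words; 5 is the minimum legal value)
    ihl = packet[0] & 0x0F
    if ihl < 5:
        raise ValueError("Invalid IHL")
    header_length = ihl * 4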
    
    # Total Length
    total_length = struct.unpack('!H', packet[2:4])[0]
    
    # Identification (bytes 4-5)
    identification = struct.unpack('!H', packet[4:6])[0]
    
    # Flags + Fragment Offset (bytes 6-7)
    flags_offset = struct.unpack('!H', packet[6:8])[0]
    
    # Extract fields
    # Bit 0: Reserved (ignored)
    # Bit 1: DF (Don't Fragment)
    # Bit 2: MF (More Fragments)
    # Bits 3-15: Fragment Offset (13 bits)
    
    dont_fragment = bool(flags_offset & 0x4000)
    more_fragments = bool(flags_offset & 0x2000)
    fragment_offset = flags_offset & 0x1FFF  # Lower 13 bits
    
    return FragmentInfo(
        identification=identification,
        dont_fragment=dont_fragment,
        more_fragments=more_fragments,
        fragment_offset=fragment_offset,
        total_length=total_length,
        data_offset=fragment_offset * 8,  # Convert to bytes
        data_length=total_length - header_length
    )
 
def analyze_fragments(fragments: List[bytes]) -> dict:
    """Analyze a list of fragments for completeness."""
    parsed = [parse_fragmentation_fields(f) for f in fragments]
    
    # Verify all have same ID
    ids = set(p.identification for p in parsed)
    if len(ids) != 1:
        return {'error': f"Multiple IDs found: {ids}"}
    
    # Sort by offset
    parsed.sort(key=lambda p: p.data_offset)
    
    # Check for gaps
    expected_offset = 0
    gaps = []
    for p in parsed:
        if p.data_offset != expected_offset:
            gaps.append((expected_offset, p.data_offset))
        expected_offset = p.data_offset + p.data_length
    
    # Check for last fragment
    has_last = any(not p.more_fragments for p in parsed)
    
    # Calculate total size
    last_frag = next((p for p in parsed if not p.more_fragments), None)
    if last_frag:
        total_size = last_frag.data_offset + last_frag.data_length
    else:
        total_size = None
    
    return {
        'identification': parsed[0].identification,
        'fragment_count': len(parsed),
        'gaps': gaps,
        'complete': len(gaps) == 0 and has_last,
        'has_last_fragment': has_last,
        'total_data_size': total_size,
        'fragments': [
            {
                'offset': p.data_offset,
                'length': p.data_length,
                'is_last': not p.more_fragments
            }
            for p in parsed
        ]
    }

Fragment Offset Units

Fragment Offset is measured in 8-byte units, not bytes. This is because the 13-bit field can only express values 0-8191, but maximum original datagram size is 65,535 bytes. Using 8-byte units: 8191 × 8 = 65,528 bytes, which (with a 20-byte header) covers the maximum datagram. All fragments except possibly the last must have data lengths that are multiples of 8.
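
The arithmetic behind the 8-byte unit is easy to check. In this sketch the helper max_fragment_payload is an illustrative name, and the 20-byte default assumes a header without options.

```python
MAX_OFFSET_UNITS = (1 << 13) - 1  # 8191, the largest 13-bit value
UNIT = 8                          # bytes per Fragment Offset unit


def max_fragment_payload(mtu: int, header_len: int = 20) -> int:
    """Largest payload a non-final fragment can carry on a link with this MTU."""
    return (mtu - header_len) // UNIT * UNIT


print(MAX_OFFSET_UNITS * UNIT)     # highest byte offset a fragment can start at
print(max_fragment_payload(1500))  # Ethernet-sized link
print(max_fragment_payload(576))
```

For a 576-byte MTU the available 556 bytes round down to 552, because the next fragment must start on an 8-byte boundary.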

Practical Packet Analysis

Analyzing the Identification field in packet captures helps diagnose fragmentation issues and detect potential attacks.

Wireshark and tcpdump

Wireshark/tcpdump

# Wireshark display of fragmented traffic
 
# First fragment:
Internet Protocol Version 4, Src: 192.168.1.100, Dst: 10.0.0.1
    ...
    Identification: 0x1a2b (6699)
    Flags: 0x2000, More fragments
        0... .... .... .... = Reserved bit: Not set
        .0.. .... .... .... = Don't fragment: Not set
        ..1. .... .... .... = More fragments: Set
    Fragment offset: 0
    ...
    [Reassembly information:]
        [3 fragments, 4000 bytes total]
        [Fragment: 1/3]
        [Reassembled in packet #15]
 
# Wireshark display filters for fragmentation:
ip.id == 0x1a2b            # Specific ID value
ip.flags.mf == 1           # More Fragments flag set
ip.frag_offset > 0         # Non-first fragments
ip.flags.mf == 1 || ip.frag_offset > 0  # Any fragment
 
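# Find fragmented conversations (compare explicitly: a bare boolean field
# such as ip.flags.mf only tests that the field is present):
ip.id && (ip.flags.mf == 1 || ip.frag_offset > 0)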
 
# tcpdump for fragments:
$ tcpdump -n 'ip[6:2] & 0x3fff != 0'
# This matches: MF bit set OR fragment offset non-zero
 
# Show fragment details:
$ tcpdump -n -v 'ip[6:2] & 0x3fff != 0'
IP (tos 0x0, ttl 64, id 6699, offset 0, flags [+], proto UDP (17), length 1500)
    192.168.1.100.12345 > 10.0.0.1.53: ...
# [+] indicates MF flag set
 
# Capture all fragments with specific ID:
$ tcpdump -n 'ip[4:2] = 0x1a2b'  # bytes 4-5 = Identification

Troubleshooting Fragmentation Issues

•Lost fragments: Filter for ID with MF=1 but no MF=0 packet; indicates last fragment lost
•Reassembly timeout: Look for incomplete fragment sets; receiver discards after 30-60 seconds
•Fragment overlap attacks: Filter for same ID + overlapping offsets; indicates possible attack
•ID collision: Same four-tuple + ID with different content; causes reassembly corruption
•PMTUD failure: DF=1 packets disappearing + no ICMP Fragmentation Needed = PMTUD black hole
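
For the overlap case in the list above, a capture can be checked offline once each fragment's byte offset and payload length are known. The sketch below assumes the fragments have already been grouped by four-tuple and Identification; find_overlaps is an illustrative name, and the crafted sample is invented for the example.

```python
from typing import List, Tuple


def find_overlaps(fragments: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return (start, end) byte ranges covered by more than one fragment.

    Each input item is (byte_offset, payload_length).
    """
    overlaps = []
    covered_end = 0
    for start, length in sorted(fragments):
        end = start + length
        if start < covered_end:
            overlaps.append((start, min(end, covered_end)))
        covered_end = max(covered_end, end)
    return overlaps


clean = [(0, 1480), (1480, 1480), (2960, 1020)]
crafted = [(0, 1480), (1472, 1480), (2960, 1020)]
print(find_overlaps(clean))
print(find_overlaps(crafted))
```

An exact duplicate of a fragment is also reported as an overlap, so a hit is a reason to inspect the packets and not proof of an attack.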

Wireshark Reassembly

Wireshark automatically reassembles fragments and shows 'Reassembly information' in the packet details. Look for 'Reassembled in packet #X' to find where complete datagrams appear. To see raw fragments without reassembly, go to Edit → Preferences → Protocols → IPv4 and uncheck 'Reassemble fragmented IPv4 datagrams.'

Summary: Identification Field

We've thoroughly examined the Identification field—its role in fragmentation, generation algorithms, and security implications. Let's consolidate the essential knowledge:

Key Takeaways

•Identification occupies bytes 4-5: A 16-bit value that groups fragments from the same original datagram.
•Four-tuple uniqueness: Fragments are identified by (Source IP, Destination IP, Protocol, Identification). The same ID from different sources is tracked separately.
•All fragments share the same ID: When a datagram is fragmented, all resulting fragments carry the identical Identification value.
•Generation algorithm matters significantly: Global counters are exploitable for idle scans and OS fingerprinting. Modern systems use random or hash-based generation.
•ID is unused for atomic datagrams: RFC 6864 lets senders pick any ID, including 0, for unfragmented DF=1 packets, which removes ID-based inference and per-destination state for that traffic.
•Works with Flags and Offset: The Identification field is only meaningful in conjunction with MF flag and Fragment Offset. Together they enable reassembly.
•Security-sensitive field: Has been exploited for reconnaissance, connection inference, and firewall evasion. Modern systems randomize to prevent inference.

What's Next:

With Identification mastered, we conclude this module with a Header Overview—a comprehensive summary that brings together all the fields we've studied (Version, IHL, ToS, Total Length, Identification) along with the remaining fields (Flags, Fragment Offset, TTL, Protocol, Checksum, Addresses, and Options). This overview will provide a complete mental model of the IPv4 header structure.

Page Complete

You now possess comprehensive understanding of the Identification field—its encoding, role in fragmentation, generation security, and analysis techniques. This knowledge is essential for understanding packet fragmentation, network security assessments, and forensic analysis.