After years of network troubleshooting, experienced engineers recognize that the same problems appear repeatedly—different networks, different applications, but fundamentally the same root causes. This is both frustrating and powerful.
Frustrating because you'd think people would stop making the same mistakes. Powerful because once you recognize the pattern, diagnosis becomes fast.
This page catalogs the most common network issues you'll encounter in production environments. For each, we'll cover the symptoms, diagnosis, common causes, and resolution.
Think of this as a field guide—a reference you'll return to throughout your career.
This comprehensive guide covers connectivity failures, slow performance, DNS problems, DHCP issues, routing problems, MTU/fragmentation issues, duplex mismatches, and security-related connectivity blocks. For each, you'll learn the classic symptoms and efficient diagnostic approaches.
"I can't reach the server" — the most common complaint. Complete connectivity failures have many causes, but systematic diagnosis quickly identifies the layer at fault.
Symptom Categories:
| Symptom | Likely Layer | First Check |
|---|---|---|
| No network icon / no link light | Layer 1 (Physical) | Cables, port, NIC |
| APIPA address (169.254.x.x) | Layer 2-3 (DHCP) | DHCP service |
| Can't ping gateway | Layer 2-3 | ARP, VLAN, cables |
| Can ping gateway, can't ping internet | Layer 3 | Routing, NAT, firewall |
| Can ping IP, can't access by name | DNS | Name resolution |
| Can resolve name, can't connect | Layer 4+ | Port, firewall, service |
Issue: No Physical Connectivity
Symptoms:
ipconfig shows 'Media disconnected'
Diagnosis:
# Windows
ipconfig /all # Look for 'Media State: Media disconnected'
# Linux
ip link show # Look for 'state DOWN' or 'NO-CARRIER'
ethtool eth0 # Check 'Link detected: yes/no'
Common Causes:
Resolution:
Check the switch port status (show interface on Cisco)
Issue: APIPA/Link-Local Address
Symptoms:
Diagnosis:
# Check IP configuration
ipconfig /all # Windows
ip addr show # Linux
# Try to get new address
ipconfig /release # Windows
ipconfig /renew
dhclient -r eth0 # Linux
dhclient eth0
# Check for DHCP traffic (on capture)
sudo tcpdump -i eth0 port 67 or port 68
Common Causes:
Resolution:
For any connectivity issue, work through the layers systematically: (1) ping localhost (loopback test), (2) ping local IP (NIC binding), (3) ping gateway (local network), (4) ping external IP (routing/NAT), (5) ping by name (DNS), (6) test service port. The layer where testing first fails is where the problem lies.
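The layered walk-through above can be sketched as a small lookup. This is only an illustration: the `LAYERED_CHECKS` table and the `first_failing_layer` helper are hypothetical names, and the pass/fail values are assumed to come from running the six tests by hand or from a script of your own.

```python
from typing import Optional

# The six checks from the paragraph above, in the order they should be run.
LAYERED_CHECKS = [
    ("ping localhost", "local TCP/IP stack (loopback)"),
    ("ping local IP", "NIC binding"),
    ("ping gateway", "local network"),
    ("ping external IP", "routing/NAT"),
    ("ping by name", "DNS"),
    ("test service port", "port, firewall, or service"),
]


def first_failing_layer(results: list[bool]) -> Optional[str]:
    """Return the area implicated by the first failed check, or None if all passed.

    `results` holds one pass/fail value per entry in LAYERED_CHECKS, in order.
    """
    if len(results) != len(LAYERED_CHECKS):
        raise ValueError("expected one result per check")
    for (check, area), passed in zip(LAYERED_CHECKS, results):
        if not passed:
            return f"{check} failed: investigate {area}"
    return None


# Example: everything works up to the gateway, but nothing beyond it responds.
print(first_failing_layer([True, True, True, False, False, False]))
```

Only the first failure matters. Later checks depend on earlier ones, so their failures are usually consequences rather than separate problems.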
"The network is slow" — possibly the most frustrating complaint because 'slow' is subjective and can have dozens of causes. Systematic analysis is essential.
First: Quantify 'Slow'
Before investigating, establish baselines:
Performance Issue Categories:
| Category | Characteristics | Common Causes |
|---|---|---|
| High Latency | Ping times elevated (> expected for distance) | Congestion, suboptimal routing, satellite links |
| Low Throughput | Transfers take too long | Bandwidth saturation, duplex mismatch, TCP window issues |
| Packet Loss | Retransmissions, timeouts | Congestion, bad hardware, interference (wireless) |
| High Jitter | Variable latency | QoS issues, buffer bloat, competing traffic |
| DNS Delays | First connection slow, then fast | Slow/unreachable DNS server |
| Application Delay | Network fast, app slow | Server overload, inefficient queries |
Issue: Network Congestion
Symptoms:
Diagnosis:
# Check interface utilization on network devices
show interface GigabitEthernet0/1 # Cisco
# Look for 'input rate' and 'output rate' near link capacity
# From endpoints, use iperf to test capacity
iperf3 -s # Server (start this first)
iperf3 -c server-ip # Client
# mtr to identify where latency increases
mtr -n target.example.com
Common Causes:
Resolution:
Issue: Duplex Mismatch
Symptoms:
Diagnosis:
# Check duplex settings on switch
show interface GigabitEthernet0/1
# Look for 'Half-duplex' when 'Full-duplex' expected
# Look for 'late collision' or 'CRC error' counters
# On Linux
ethtool eth0
# Check 'Duplex: Full' or 'Duplex: Half'
# On Windows
Get-NetAdapter | Format-List -Property *
What happens in a duplex mismatch:
Resolution:
Duplex mismatches are insidious because they don't completely break connectivity—small transfers work fine. The problem only appears under load, making it seem like a capacity issue. Always check duplex when investigating load-dependent slowness.
Issue: TCP Window Size Limitation
Symptoms:
Explanation:
TCP window size limits how much data can be in-flight before acknowledgment. For high bandwidth-delay product links:
Max Throughput = Window Size / RTT
Example: 64 KB window, 100ms RTT
Max = 65536 bytes / 0.1 seconds = 655 KB/s = 5.2 Mbps
On a 100 Mbps link, you're limited to 5.2 Mbps!
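The same arithmetic can be run in both directions: from a window size to the throughput ceiling, and from a link speed to the window needed to fill it (the bandwidth-delay product). The function names below are illustrative, and the sketch assumes 1 Mbps = 1,000,000 bits per second and ignores protocol overhead.

```python
def max_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection TCP throughput, in megabits per second."""
    return window_bytes * 8 / rtt_seconds / 1_000_000


def required_window_bytes(link_mbps: float, rtt_seconds: float) -> int:
    """Bandwidth-delay product: the window needed to keep the link full."""
    return round(link_mbps * 1_000_000 / 8 * rtt_seconds)


# 64 KB window over a 100 ms round trip, as in the example above.
print(round(max_throughput_mbps(65536, 0.1), 2), "Mbps")

# Window needed to fill a 100 Mbps link at the same round-trip time.
print(required_window_bytes(100, 0.1), "bytes")
```

The second result is roughly 1.25 MB, far beyond the 64 KB that fits in the unscaled 16-bit window field. That is why window scaling and larger buffers matter on long, fast paths.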
Diagnosis:
In Wireshark, filter on tcp.analysis.zero_window or tcp.analysis.window_full
Resolution:
# Linux - increase TCP buffers
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.wmem_max=16777216
sysctl -w net.ipv4.tcp_rmem='4096 87380 16777216'
sysctl -w net.ipv4.tcp_wmem='4096 65536 16777216'
"It's always DNS" — a meme in network engineering because it's often true. DNS issues cause symptoms that look like network problems but are actually name resolution failures.
Issue: DNS Server Unreachable
Symptoms:
Diagnosis:
# Test DNS resolution explicitly
nslookup www.example.com
nslookup www.example.com 8.8.8.8 # Test alternate DNS
# Check configured DNS servers
cat /etc/resolv.conf # Linux
ipconfig /all # Windows
# Ping DNS server
ping 192.168.1.1 # Your DNS server IP
# Test DNS port
nc -zv 192.168.1.1 53 # TCP
nc -zuv 192.168.1.1 53 # UDP
Common Causes:
Issue: DNS Returning Wrong Answer
Symptoms:
Diagnosis:
# Query authoritative server directly
dig www.example.com @ns1.example.com
# Check what your resolver returns
dig www.example.com
# Compare results - they should match
# Check TTL to see if cached
dig www.example.com # Note TTL in answer section
# Flush cache
ipconfig /flushdns # Windows
sudo resolvectl flush-caches # Linux (systemd-resolved; older releases use systemd-resolve --flush-caches)
sudo rndc flush # BIND server
Common Causes:
| Symptom | Likely Cause | Quick Fix |
|---|---|---|
| All lookups fail | DNS server unreachable | Check connectivity to DNS, try 8.8.8.8 |
| Some lookups fail | Specific zone issue | Check authoritative server for that zone |
| Slow resolution | DNS server overloaded or far | Add local caching DNS or switch provider |
| Wrong IP returned | Cache poisoning or stale cache | Query authoritative directly, flush cache |
| Works externally, not internally | Split-horizon DNS issue | Check internal DNS zone configuration |
| Works by IP, not name | DNS broken, network fine | Verify DNS server and records |
Issue: DNS Propagation Delay
Symptoms:
Understanding TTL:
DNS records have a Time-To-Live (TTL) specifying how long resolvers should cache them. Until the TTL expires, resolvers keep returning the cached value.
# Check current TTL
dig example.com
;; ANSWER SECTION:
example.com. 3600 IN A 93.184.216.34
# The second field (3600) is the TTL in seconds (1 hour)
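To read the TTL out of a dig answer programmatically, the line can be split into its five whitespace-separated fields. This is a minimal sketch: `DnsAnswer` and `parse_answer_line` are hypothetical names, and it assumes the simple single-line record format shown above.

```python
from typing import NamedTuple


class DnsAnswer(NamedTuple):
    name: str
    ttl_seconds: int
    record_class: str
    record_type: str
    data: str


def parse_answer_line(line: str) -> DnsAnswer:
    """Split one line of dig's ANSWER SECTION into its five fields."""
    name, ttl, record_class, record_type, data = line.split(maxsplit=4)
    return DnsAnswer(name, int(ttl), record_class, record_type, data)


answer = parse_answer_line("example.com. 3600 IN A 93.184.216.34")
print(f"{answer.name} may stay cached for up to {answer.ttl_seconds // 60} minutes")
```

A caching resolver reports the remaining TTL, so the value counts down on repeated queries. An authoritative server always reports the full configured TTL.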
Resolution:
For immediate change needs:
When debugging DNS, query the authoritative server directly (dig @ns1.example.com domain.com) to see the ground truth. Then query your local resolver to see what's cached. The difference reveals caching or propagation issues.
MTU (Maximum Transmission Unit) issues cause frustrating partial failures: small packets work, large packets fail. This creates symptoms like:
Understanding MTU:
| Link Type | Typical MTU |
|---|---|
| Ethernet | 1500 bytes |
| PPPoE | 1492 bytes |
| VPN Tunnel | 1400-1460 bytes (varies) |
| GRE Tunnel | 1476 bytes |
| IPv6 minimum | 1280 bytes |
When a packet exceeds the path MTU, it must be fragmented or dropped. Modern hosts use Path MTU Discovery (PMTUD) to avoid fragmentation by discovering the smallest MTU along the path.
Issue: Black Hole Due to PMTUD Failure
Symptoms:
What's Happening:
Diagnosis:
# Test with large ping and DF bit
ping -M do -s 1472 target.example.com # Linux
ping -f -l 1472 target.example.com # Windows
# If this fails but smaller sizes work, MTU issue confirmed
ping -M do -s 1400 target.example.com
# Binary search to find path MTU
ping -M do -s 1450 target.example.com
ping -M do -s 1430 target.example.com
# ... continue until you find the breakpoint
# Packet capture: look for large packets with no response
sudo tcpdump -i eth0 -nn host target.example.com and greater 1400
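The manual binary search above can be automated. In this sketch, `probe` is a hypothetical callback that you would implement by wrapping a DF-bit ping (for example `ping -M do -s <size>` on Linux). Here it is replaced by a stand-in so the logic can be followed without a network. The 28-byte overhead assumes IPv4 with no IP options.

```python
from typing import Callable

IP_ICMP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP header


def find_path_mtu(probe: Callable[[int], bool], low: int = 1200, high: int = 1472) -> int:
    """Binary-search the largest ICMP payload that passes with DF set.

    `probe(size)` must return True when a DF-bit ping with that payload size
    succeeds. Returns the path MTU in bytes (payload plus headers).
    """
    if not probe(low):
        raise ValueError("even the smallest probe failed; not an MTU problem")
    while low < high:
        mid = (low + high + 1) // 2   # round up so the loop always makes progress
        if probe(mid):
            low = mid                 # mid fits: the answer is mid or larger
        else:
            high = mid - 1            # mid is too big: the answer is smaller
    return low + IP_ICMP_OVERHEAD


# Stand-in for a real ping: pretend the path MTU is 1400 bytes.
def fake_probe(payload_size: int) -> bool:
    return payload_size + IP_ICMP_OVERHEAD <= 1400


print(find_path_mtu(fake_probe))
```

With the default bounds, the search needs at most nine probes, which is considerably faster than stepping down by hand.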
ip route ... mtu for specific paths
# Linux - set interface MTU
sudo ip link set eth0 mtu 1400
# Permanent (varies by distro)
# /etc/network/interfaces:
# iface eth0 inet dhcp
#     mtu 1400
# Windows PowerShell (admin)
Set-NetIPInterface -InterfaceAlias "Ethernet" -NlMtuBytes 1400
# Verify
ip link show eth0 | grep mtu # Linux
netsh interface ipv4 show interfaces # Windows
MSS (Maximum Segment Size) clamping is often the best fix for MTU issues. Configured on routers/firewalls, it rewrites the TCP MSS value in SYN packets to fit the path MTU (the MTU minus the IP and TCP headers). This prevents oversized TCP segments from ever being generated, avoiding fragmentation and PMTUD issues for TCP traffic. Non-TCP traffic such as UDP is not affected by clamping.
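The MSS value to clamp to follows directly from the path MTU. The helper below is an illustrative sketch (the name `clamp_mss` is not from any router CLI) and assumes headers without IP options, IPv6 extension headers, or TCP options.

```python
TCP_HEADER = 20              # bytes, without options
IP_HEADER = {4: 20, 6: 40}   # bytes, without options or extension headers


def clamp_mss(path_mtu: int, ip_version: int = 4) -> int:
    """Largest TCP MSS that fits in one unfragmented packet of `path_mtu` bytes."""
    return path_mtu - IP_HEADER[ip_version] - TCP_HEADER


for link, mtu in [("Ethernet", 1500), ("PPPoE", 1492), ("GRE tunnel", 1476)]:
    print(f"{link}: MTU {mtu} -> MSS {clamp_mss(mtu)}")
```

For a VPN whose overhead is not known exactly, measuring the path MTU first and then deriving the MSS from it is safer than guessing.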
Routing issues cause traffic to go to the wrong place—or nowhere at all. They range from simple (missing default gateway) to complex (routing loops, asymmetric routing).
Issue: Missing or Wrong Default Gateway
Symptoms:
Diagnosis:
# Check routing table
route -n # Linux
ip route show # Linux
route print # Windows
Get-NetRoute # PowerShell
# Is there a default route (0.0.0.0)?
ip route show | grep default
# Can you reach the gateway?
ping <gateway-ip>
Common Causes:
Issue: Asymmetric Routing
Symptoms:
tcpdump shows traffic going out but no return
What's Happening:
Diagnosis:
# Traceroute from both endpoints
# On Host A
traceroute host-b
# On Host B
traceroute host-a
# If paths are different, you have asymmetric routing
# Check for multiple routes to same destination
ip route show
# Multiple routes with same metric cause unpredictable path selection
Resolution:
Issue: Routing Loop
Symptoms:
Example Traceroute with Loop:
1 192.168.1.1 1.2 ms
2 10.0.0.1 5.3 ms
3 10.0.1.1 10.2 ms
4 10.0.0.1 15.1 ms <- Same as hop 2!
5 10.0.1.1 20.3 ms <- Same as hop 3!
6 10.0.0.1 25.2 ms <- Loop repeats until the TTL expires or traceroute hits its hop limit
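Spotting the repetition can be automated once the hop addresses are extracted from traceroute output. This sketch assumes one address per hop, with `*` for unanswered probes. `find_routing_loop` is a hypothetical helper, not part of any traceroute tool.

```python
from typing import Optional


def find_routing_loop(hops: list[str]) -> Optional[list[str]]:
    """Return the repeating cycle of routers if a hop address reappears, else None."""
    first_seen: dict[str, int] = {}
    for index, address in enumerate(hops):
        if address == "*":            # unanswered probe: no address to compare
            continue
        if address in first_seen:
            cycle = hops[first_seen[address]:index]
            return [hop for hop in cycle if hop != "*"]
        first_seen[address] = index
    return None


trace = ["192.168.1.1", "10.0.0.1", "10.0.1.1", "10.0.0.1", "10.0.1.1", "10.0.0.1"]
print(find_routing_loop(trace))
```

The returned addresses are the routers that point at each other, so those are the devices whose routing tables need checking.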
Common Causes:
Resolution:
Use debug ip routing (carefully!) to watch route changes
Routing changes affect all traffic through that router, not just the problem you're fixing. Always have a rollback plan. Verify the change with a specific test, but also spot-check other traffic flows. And never make routing changes without a maintenance window for production networks.
Firewalls are the most common cause of 'It works from here but not from there' scenarios. Security controls exist to block traffic, so they're doing their job—but sometimes too well.
Issue: Traffic Blocked by Firewall
Symptoms:
Diagnosis:
# Test port connectivity
nc -zv target 443
telnet target 443
Test-NetConnection target -Port 443 # PowerShell
# Differentiate drop vs reject:
# - Timeout = DROP (packet silently discarded)
# - Connection refused = REJECT (explicit denial), or the host was reached but nothing is listening on that port
# Capture on firewall or destination
sudo tcpdump -i eth0 host source-ip
# See if packets arrive but aren't responded to
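The drop-versus-reject distinction can also be checked from a script using only the standard `socket` module. This is a minimal sketch: the name `classify_port` and the three-second default timeout are arbitrary choices, and the demonstration uses the local machine so it runs without a network.

```python
import socket


def classify_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and report how it ended."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"    # RST received: REJECT rule, or nothing listening
    except socket.timeout:
        return "timeout"    # no reply at all: likely a DROP rule or routing issue
    except OSError as error:
        return f"error: {error}"   # e.g. no route to host, name resolution failure


# Local demonstration: open a listener, test it, close it, test again.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
listener.listen()
port = listener.getsockname()[1]
print(classify_port("127.0.0.1", port))
listener.close()
print(classify_port("127.0.0.1", port))
```

A "refused" result shows that packets reached something that answered. A "timeout" result means the capture step is still needed to find where packets disappear.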
Common Firewall Issues:
| Symptom Pattern | Likely Cause | Where to Check |
|---|---|---|
| Works internally, fails externally | Perimeter firewall blocking | Edge firewall rules |
| Works one direction, fails reverse | Stateful rule only allows established | Check if rule is unidirectional |
| Worked before, stopped working | Rule change or expiration | Recent firewall changes, time-based rules |
| Works for some users | Source IP ACL | Check source-based rules |
| Intermittent blocks | Rate limiting, connection limits | Firewall connection table, rate rules |
| Returns RST from unexpected IP | Firewall in path sending reset | Capture at both ends to see RST source |
Issue: NAT Problems
Symptoms:
Diagnosis:
# Check if NAT is expected
# Is your source IP private (10.x, 172.16-31.x, 192.168.x)?
ip addr show
# What IP does external service see?
curl ifconfig.me # Shows your external IP
# Check NAT table on router (if accessible)
show ip nat translations # Cisco
conntrack -L # Linux
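The private-range question in the first check can be answered with the standard `ipaddress` module. The sketch below tests only the three RFC 1918 ranges named above. `needs_nat` is an illustrative name, and other special ranges (link-local, carrier-grade NAT) are deliberately left out.

```python
import ipaddress

RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def needs_nat(address: str) -> bool:
    """True if `address` is in a private IPv4 range and must be translated to reach the internet."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


for candidate in ["10.1.2.3", "172.31.255.1", "172.32.0.1", "192.168.1.50", "8.8.8.8"]:
    print(candidate, needs_nat(candidate))
```

The 172.16.0.0/12 range is the one most often misjudged by eye: 172.31.x.x is private, while 172.32.x.x is public.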
Common NAT Issues:
| Issue | Symptom | Solution |
|---|---|---|
| PAT exhaustion | Connections fail intermittently | Increase NAT pool, reduce timeout |
| Missing inbound NAT | External users can't reach internal server | Add port forwarding/DNAT rule |
| Hairpin NAT | Can't access external IP from internal network | Enable NAT reflection/loopback |
| ALG breaking traffic | Complex protocols (SIP, FTP) fail | Disable/fix ALG, use STUN/TURN |
Issue: Security Software False Positives
Symptoms:
Diagnosis:
# Check security software logs
# Windows Defender
Get-WinEvent -LogName 'Microsoft-Windows-Windows Firewall With Advanced Security/Firewall'
# Check IDS/IPS for relevant alerts
grep 'blocked' /var/log/snort/alert # Snort
# Temporarily disable (for testing ONLY) to confirm
# Be very careful with this in production
Resolution:
When suspecting firewall: (1) List all firewalls in the path (host, network, cloud), (2) Test with packet capture at each stage—where does traffic stop?, (3) Check firewall logs for denies at suspicious times, (4) NEVER create 'allow any any' rules—create minimum necessary exceptions.
Wireless networks add a layer of complexity—literally. RF interference, channel congestion, and authentication add failure modes not present in wired networks.
Issue: Slow WiFi or High Packet Loss
Symptoms:
Diagnosis:
# Linux - check WiFi signal and link quality
iwconfig wlan0
# Look for: Signal level, Link Quality, Tx-Power
# Check what channel you're on
iwlist wlan0 channel
# Scan for competing networks
sudo iw dev wlan0 scan | grep -E 'SSID|signal|freq|channel'
# Windows - detailed WiFi report
netsh wlan show interfaces
# Check signal strength (RSSI)
# -30 to -50 dBm = Excellent
# -50 to -60 dBm = Good
# -60 to -70 dBm = Fair
# -70 to -80 dBm = Weak
# Below -80 dBm = Unusable
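When collecting readings from many clients, the bands above can be applied with a small helper. The listed ranges share their boundary values (for example, -50 dBm appears in two bands), so this sketch assumes a boundary reading belongs to the better band. `classify_rssi` is an illustrative name.

```python
def classify_rssi(rssi_dbm: float) -> str:
    """Map a signal level in dBm to the quality bands listed above."""
    if rssi_dbm >= -50:
        return "Excellent"
    if rssi_dbm >= -60:
        return "Good"
    if rssi_dbm >= -70:
        return "Fair"
    if rssi_dbm >= -80:
        return "Weak"
    return "Unusable"


for reading in [-42, -58, -67, -75, -88]:
    print(f"{reading} dBm: {classify_rssi(reading)}")
```

Readings stronger than -30 dBm are also reported as "Excellent" here, although a client that close to the access point is rare in practice.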
| Issue | Symptoms | Common Cause | Solution |
|---|---|---|---|
| Weak Signal | Slow speeds, drops | Distance, obstacles | Move closer, add AP, remove obstacles |
| Channel Congestion | Slow, inconsistent | Neighboring networks on same channel | Change channel, use 5 GHz |
| Interference | Random drops | Microwave, Bluetooth, cordless phone | Identify and remove interferer |
| Authentication Failure | Can't connect | Wrong password, expired cert | Verify credentials, check RADIUS |
| IP Issues Post-Connect | Connected but no internet | DHCP not responding, captive portal | Check DHCP, look for portal |
| Band Steering Issues | Keeps switching, drops | Aggressive band steering | Adjust AP settings, force band |
Issue: Authentication Failures (Enterprise WiFi)
Symptoms:
Enterprise WiFi Components:
Diagnosis:
Common Causes:
WiFi problems can be: RF (signal strength, interference), 802.11 (channel, band, rate), authentication (802.1X, WPA), IP (DHCP, ARP), or application. A weak WiFi signal might look like a DNS problem if DNS packets are being lost. Use WiFi analyzer tools to check RF layer before assuming it's a network issue.
We've covered the most common network issues you'll encounter. But more important than memorizing specific problems is developing the troubleshooting mindset:
| Problem Category | First Tool | Key Check |
|---|---|---|
| No connectivity | ping gateway | Link light, IP config, cables |
| Slow performance | ping + mtr | Latency, loss, congestion, duplex |
| DNS issues | nslookup / dig | Resolution, TTL, correct server |
| MTU/fragmentation | ping with DF bit | Large packet test, tunnel overhead |
| Routing problems | traceroute | Path, loops, missing routes |
| Firewall blocks | nc / telnet to port | Logs, capture at each stage |
| WiFi issues | iwconfig / signal check | RSSI, channel, interference |
Module Complete:
You've completed the Troubleshooting module! You now have:
These skills will serve you throughout your career in networking. Practice them regularly—even on working networks—so they become second nature when you need them most.
Congratulations! You've mastered the fundamentals of network troubleshooting—from methodology to tools to common issue resolution. These skills transform you from someone who can configure networks to someone who can diagnose and repair them under pressure. The next module covers numerical problems and calculations essential for network design and interview preparation.