NFS abstracts network file access to appear local—but the network's presence can never be fully hidden. A local disk operation takes microseconds; a network round-trip takes milliseconds. This thousand-fold latency difference means that naive NFS usage can be painfully slow, while properly tuned NFS can closely approach local disk performance for many workloads.
Performance optimization in NFS requires understanding where time is spent: network latency, server processing, disk I/O, and the interactions between caching layers on client and server. Armed with this understanding, you can tune configurations, select appropriate mount options, and design applications that work with NFS's characteristics rather than against them.
This page provides a comprehensive guide to NFS performance: the fundamental bottlenecks, tuning parameters that matter, caching behaviors, and practical strategies for common workload patterns.
By the end of this page, you will understand NFS performance fundamentals, client and server caching mechanisms, critical tuning parameters, network optimization strategies, and how to diagnose and resolve common performance problems. You'll be able to configure NFS for optimal performance in your specific environment.
Every NFS operation incurs latency from multiple sources. Understanding these sources helps identify bottlenecks and focus optimization efforts effectively.
Latency Components of an NFS Operation
Consider a simple file read that misses all caches:
Total Latency = Client Processing
+ Network Latency (request)
+ Server Processing
+ Disk I/O
+ Network Latency (response)
+ Client Processing
Let's quantify typical values:
| Component | LAN (1Gbps) | WAN (100ms RTT) | Notes |
|---|---|---|---|
| Client RPC/XDR | 10-50 µs | 10-50 µs | CPU-bound, scales with data size |
| Network One-Way | 0.1-0.5 ms | 50 ms | Distance and congestion dependent |
| Server RPC Processing | 10-100 µs | 10-100 µs | Thread availability matters |
| Server Disk I/O (SSD) | 0.1-10 ms | 0.1-10 ms | Depends on cache hit |
| Server Disk I/O (HDD) | 5-15 ms | 5-15 ms | Seek time dominates |
| Total (cache miss) | 5-20 ms | 110-130 ms | Network dominates on WAN |
| Total (cache hit) | 0.5-2 ms | 100-105 ms | Still network-bound on WAN |
Key Insights:
- On LANs, disk I/O often dominates — For cache misses, the server's disk is the bottleneck. SSDs dramatically improve NFS performance.
- On WANs, network latency dominates — Even with SSDs, 100ms network RTT makes every operation slow. Reducing round-trips is critical.
- Caching is everything — The difference between 'data in cache' and 'data on disk' is 10-100x for local disk, but caching also avoids network round-trips entirely.
- Request size matters — Larger read/write sizes amortize per-operation overhead. A 1MB read is much more efficient than 256 × 4KB reads.
The most impactful performance optimization is reducing the number of network round-trips. Every technique we'll discuss—caching, read-ahead, compound operations, larger transfer sizes—ultimately serves this goal. One round-trip that transfers 1MB is far faster than 256 round-trips transferring 4KB each.
The NFS client employs multiple caching layers to reduce network traffic. Understanding and tuning these caches is essential for performance optimization.
Data Cache (Page Cache)
File contents are cached in the kernel's page cache, just like local files. When an application reads data, the NFS client:

1. Checks the page cache for the requested range
2. Returns cached data immediately if it is present and still considered valid
3. Otherwise issues a READ RPC to the server and caches the result
The challenge is cache validity: how does the client know if cached data is still current?
Close-to-Open Consistency
NFS implements close-to-open consistency by default:

- On open(), the client revalidates its cached data against the server (typically by checking the file's attributes, such as mtime)
- On close(), the client flushes all pending writes to the server
This means changes made on one client may not be visible to another client until it reopens the file.
/* Demonstrating close-to-open consistency */

/* Client A */
int fd = open("/mnt/nfs/file.txt", O_RDWR);
// Client validates cache with server - file mtime=10:00:00
char buf[100];
read(fd, buf, 100);  // Reads from cache or server

// ... time passes, Client B modifies file ...

read(fd, buf, 100);  // Still sees old data! (cache not revalidated)
close(fd);

/* Client A reopens */
fd = open("/mnt/nfs/file.txt", O_RDONLY);
// Client validates cache - mtime now 10:05:00, cache invalidated
read(fd, buf, 100);  // Now sees Client B's changes
close(fd);

/* Summary:
 * - Changes are visible after close() + open()
 * - During a single open session, cache may be stale
 * - This is a deliberate design trade-off for performance
 */

Attribute Cache
File attributes (size, mtime, mode, etc.) are cached separately with configurable timeouts:
| Mount Option | Default | Description |
|---|---|---|
| actimeo=n | n/a | Sets all four attribute timeouts to n seconds |
| acregmin=n | 3 | Min cache time for regular file attributes |
| acregmax=n | 60 | Max cache time for regular file attributes |
| acdirmin=n | 30 | Min cache time for directory attributes |
| acdirmax=n | 60 | Max cache time for directory attributes |
| noac | n/a | Disable attribute caching entirely |
The timeout increases from min to max based on how long since the file was modified—files that haven't changed recently get longer cache times.
Directory Cache (dentry cache / DNLC)
Directory lookups (name → inode mapping) are cached:
First access to /mnt/nfs/deep/path/to/file.txt:
LOOKUP "deep" → handle_deep (cached)
LOOKUP "path" → handle_path (cached)
LOOKUP "to" → handle_to (cached)
LOOKUP "file.txt" → handle_file (cached)
Second access:
All lookups hit cache, no network traffic!
Negative entries (non-existent files) are also cached, avoiding repeated lookups for missing files.
Cache Invalidation Triggers:

- The directory's attribute cache timeout (acdirmin/acdirmax) expires and the server reports a changed mtime
- The client itself modifies the directory (create, unlink, rename)
- The lookupcache=none mount option disables directory caching entirely
# Default caching - good for most workloads
mount -t nfs server:/export /mnt

# Aggressive caching - for read-mostly workloads with rare changes
mount -t nfs -o actimeo=3600 server:/export /mnt  # 1-hour attribute cache

# Reduced caching - for workloads needing fresher data
mount -t nfs -o actimeo=1 server:/export /mnt  # 1-second timeouts

# No attribute caching - strict consistency, poor performance
mount -t nfs -o noac server:/export /mnt  # WARNING: Significant performance impact!

# Disable directory caching (NFSv4)
mount -t nfs -o lookupcache=none server:/export /mnt

# /etc/fstab examples
server:/export /mnt/default nfs defaults 0 0
server:/export /mnt/read nfs ro,actimeo=3600 0 0
server:/export /mnt/strict nfs noac,sync 0 0

# View current cache statistics
cat /proc/fs/nfs/*/stats  # Client statistics
nfsstat -c                # Client summary

When facing consistency issues, it's tempting to set noac to disable all caching. This devastates performance—every stat(), open(), and many other operations require network round-trips. Usually, better solutions exist: application-level coordination, proper file locking, or accepting close-to-open semantics.
The NFS client employs sophisticated prefetching and write buffering to hide network latency from applications.
Read-Ahead: Prefetching Sequential Data
When the client detects sequential read patterns, it prefetches data before the application requests it:
Application reads:
read(offset=0, 4KB)
read(offset=4KB, 4KB)
read(offset=8KB, 4KB) ← Pattern detected: sequential!
NFS client behavior:
Application: read(offset=12KB, 4KB)
Client: Issues READ for 12KB AND prefetches 16KB-128KB
By time app asks for 16KB, data is already in cache!
Read-ahead window grows as sequential access continues and shrinks on random access or errors.
# View current read-ahead setting for NFS mount
cat /sys/class/bdi/*/read_ahead_kb
# Default is often 128KB or higher

# Set read-ahead for a specific device/mount
# First find the bdi (backing device info) for your NFS mount
mount | grep nfs
# /dev/nfs4 on /mnt type nfs4 (...)

# Set read-ahead to 1MB for better streaming performance
echo 1024 > /sys/class/bdi/0:XX/read_ahead_kb

# Or use blockdev (some systems)
blockdev --setra 2048 /dev/nfs4  # 2048 sectors = 1MB

# NFS-specific read sizing via mount options
mount -t nfs -o rsize=1048576 server:/export /mnt  # 1MB reads

# Application-level hint (posix_fadvise)
# Tells kernel access pattern for optimization
posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);  // Will read sequentially
posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);      // Random access pattern

Write-Behind: Buffering Writes for Efficiency
The NFS client buffers writes in memory before sending to the server:
- write() copies data into the client's page cache, and write() returns immediately
- Dirty pages are gathered into large WRITE RPCs (up to wsize worth of data)
- Data is pushed to the server in the background, and forcibly when close(), sync() or fsync() is called

Write-behind benefits: applications don't stall on network round-trips, many small writes coalesce into a few large RPCs, and network I/O overlaps with computation.

Flush triggers and their costs:
| Trigger | What Happens | Performance Impact |
|---|---|---|
| Buffer full | Background flush of dirty pages | None - asynchronous |
| Periodic timer | Flush pages older than threshold | None - asynchronous |
| close() | Flush all dirty pages for file | Blocks until server confirms |
| fsync() | Flush + wait for server commit | High - synchronous wait |
| sync() | Flush all NFS dirty pages | Very high - global flush |
| Memory pressure | Writeback to free memory | Variable |
Controlling Write Behavior
# Mount options affecting writes
mount -t nfs -o wsize=1048576 server:/export /mnt # 1MB writes
mount -t nfs -o async server:/export /mnt # Async (default)
mount -t nfs -o sync server:/export /mnt # Sync (each write waits)
# The 'sync' mount option is different from NFSv3 stable writes:
# - sync mount: every write() syscall waits for server ACK
# - async mount + stable=DATA_SYNC: server commits data, not metadata
# - async mount + stable=FILE_SYNC: server commits everything
Write Congestion Control:
When the server can't keep up with writes, the client has congestion control:
# View congestion thresholds
cat /proc/sys/sunrpc/tcp_slot_table_entries # Max concurrent RPCs
cat /proc/sys/fs/nfs/nfs_congestion_kb # Dirty data before throttling
# If application writes faster than server can handle:
# 1. Dirty pages accumulate
# 2. Hits nfs_congestion_kb threshold
# 3. Client throttles writes, application blocks
# This prevents unbounded memory consumption
With async writes, data is in client memory but not on server disk. A client crash loses this data. Applications requiring durability must use fsync() after critical writes, accepting the performance cost. Database applications typically use sync mounts or explicit fsync().
The size of NFS read and write operations has a dramatic impact on throughput. Per-operation overhead (RPC marshalling, network packets, disk I/O setup) is relatively constant, so larger operations are more efficient.
rsize and wsize: Read/Write Size
The rsize and wsize mount options control the maximum size of NFS READ and WRITE operations:
mount -t nfs -o rsize=1048576,wsize=1048576 server:/export /mnt
| Version | Maximum rsize/wsize |
|---|---|
| NFSv2 | 8 KB |
| NFSv3 | 1 MB+ (server negotiated) |
| NFSv4 | 1 MB+ (server negotiated) |
Why Larger is Better (Usually):
| rsize/wsize | Ops for 1GB | Overhead | Typical Throughput |
|---|---|---|---|
| 4 KB | 262,144 | Very High | 10-30 MB/s |
| 32 KB | 32,768 | High | 50-100 MB/s |
| 256 KB | 4,096 | Moderate | 150-300 MB/s |
| 1 MB | 1,024 | Low | 300-800 MB/s |
Negotiated vs. Specified:
Modern NFS clients and servers negotiate transfer sizes automatically. The client proposes its maximum, the server responds with what it supports, and both use the lesser value.
# View negotiated size
nfsstat -m
# Shows: rsize=1048576, wsize=1048576 (for example)
# If you specify smaller values, they're used
mount -t nfs -o rsize=65536 server:/export /mnt
# Result: rsize=65536 even if server supports 1MB
When Smaller Might Be Better:
Larger isn't always optimal:

- Memory-constrained clients: each in-flight request buffers up to rsize/wsize bytes
- UDP transports: a single lost fragment forces retransmission of the entire large RPC
- Small random I/O: requests never fill a large transfer size anyway
#!/bin/bash
# Benchmark different rsize/wsize values

SERVER="nfsserver"
EXPORT="/export/test"
MOUNTPOINT="/mnt/nfs_test"
TESTFILE="$MOUNTPOINT/testfile"

echo "Testing NFS read/write performance at different transfer sizes"
echo "============================================================="

for SIZE in 4096 16384 65536 262144 1048576; do
    SIZE_KB=$((SIZE / 1024))

    # Unmount if mounted
    umount $MOUNTPOINT 2>/dev/null

    # Mount with specific size
    mount -t nfs -o rsize=$SIZE,wsize=$SIZE $SERVER:$EXPORT $MOUNTPOINT

    # Clear caches
    sync
    echo 3 > /proc/sys/vm/drop_caches

    # Write test
    WRITE_SPEED=$(dd if=/dev/zero of=$TESTFILE bs=1M count=256 2>&1 | grep -oP '[\d.]+\s*[MG]B/s' | tail -1)

    # Clear cache for read test
    sync
    echo 3 > /proc/sys/vm/drop_caches

    # Read test
    READ_SPEED=$(dd if=$TESTFILE of=/dev/null bs=1M 2>&1 | grep -oP '[\d.]+\s*[MG]B/s' | tail -1)

    echo "rsize/wsize=${SIZE_KB}KB: Write=$WRITE_SPEED, Read=$READ_SPEED"

    rm -f $TESTFILE
done

# Cleanup
umount $MOUNTPOINT

Modern NFS implementations negotiate optimal transfer sizes automatically. The default (often 1MB) works well for most workloads. Only tune rsize/wsize if benchmarking shows improvement or you have specific constraints like limited memory.
The NFS server's configuration and resources directly impact all clients' performance. Optimizing the server provides benefits that multiply across every connected client.
NFS Server Threads
The number of nfsd kernel threads determines how many operations can be processed concurrently:
# View current thread count
cat /proc/fs/nfsd/threads
# Set thread count (until reboot)
echo 64 > /proc/fs/nfsd/threads
# Permanent configuration: /etc/nfs.conf
[nfsd]
threads=64
Guidelines for thread count:

- The traditional default of 8 threads is too low for busy servers; 32-64 is a common starting point
- More threads allow more parallel operations, but each thread consumes kernel memory
- Monitor /proc/net/rpc/nfsd to see if threads are fully utilized before adding more
# Comprehensive NFS server monitoring

# === Thread Utilization ===
cat /proc/net/rpc/nfsd
# Look for 'th' line: current threads, max threads, threads ever used

# Threads fully utilized example:
# th 8 0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 100.000
# The 100.000 means all threads busy 100% of samples - need more threads!

# === NFS Statistics ===
nfsstat -s  # Server statistics
# Key metrics:
# - ops/sec per operation type
# - null procedure calls (often health checks)
# - getattr calls (high = poor client caching)

# === RPC Statistics ===
cat /proc/net/rpc/nfsd
# rc (reply cache): hits, misses, nocache
# High hits = lots of retransmits, possible network issues
#
# io: bytes read, bytes written
# Use to track throughput over time

# === Per-Operation Latency (requires nfsstat or sar) ===
nfsstat -s -l  # If available: per-op latency histograms

# === Export-specific Statistics ===
cat /proc/fs/nfsd/exports

# === I/O Wait and Disk Performance ===
iostat -x 1  # Watch disk utilization
# High %util on NFS server disks = storage bottleneck

# === Network Performance ===
sar -n DEV 1  # Network utilization per interface
# Check NFS server's network interface isn't saturated

Server Buffer Cache
The server's buffer cache dramatically impacts performance. Frequently-accessed files served from RAM are orders of magnitude faster than disk.
# Linux buffer cache is automatic based on available RAM
free -h
# "buff/cache" column shows caching memory
# For dedicated NFS servers, maximize memory
# 64GB+ RAM recommended for active datasets > 100GB
# Check cache effectiveness
cat /proc/meminfo | grep -E '^(Cached|Buffers|Active|Inactive)'
Export Options Affecting Performance:
| Option | Performance Impact | Recommendation |
|---|---|---|
| sync | Each write waits for disk | Data safety, slower |
| async | Writes buffered in RAM | Fast, but data at risk |
| no_subtree_check | Skip path verification | Faster, recommended |
| no_root_squash | No performance impact | Security consideration |
| wdelay | Batch writes slightly | Default, usually good |
ZFS is an excellent backend for NFS servers: ARC caches reads efficiently, ZIL handles synchronous writes, compression reduces storage needs, and snapshots provide backups. The combination of ZFS + NFS is popular in enterprise environments.
Network configuration can make or break NFS performance, especially for high-throughput or high-latency scenarios.
TCP Tuning for NFS
NFSv4 (and NFSv3 over TCP) benefits from TCP optimization:
# Increase TCP buffer sizes for high-bandwidth networks
# These are maximum values; actual sizes are auto-tuned
# /etc/sysctl.conf or /etc/sysctl.d/nfs.conf
# TCP receive buffer
net.core.rmem_max = 16777216
net.ipv4.tcp_rmem = 4096 262144 16777216
# TCP send buffer
net.core.wmem_max = 16777216
net.ipv4.tcp_wmem = 4096 262144 16777216
# Apply: sysctl -p
For high-latency networks (WAN), larger buffers allow more data in-flight, improving throughput.
MTU and Jumbo Frames
Larger MTU (Maximum Transmission Unit) reduces packet overhead:
| MTU | Packets for 1MB | Overhead |
|---|---|---|
| 1500 (standard) | 683 | High |
| 9000 (jumbo) | 114 | Low |
# Enable jumbo frames (requires switch support)
ip link set eth0 mtu 9000
# Verify with ping
ping -M do -s 8972 nfsserver # 8972 + 28 bytes header = 9000
# Permanent: add MTU=9000 to network config
Important: All devices in the path (client NIC, switch, server NIC) must support jumbo frames. Mismatched MTU causes fragmentation or packet drops.
| Symptom | Likely Cause | Diagnosis | Solution |
|---|---|---|---|
| Slow throughput | Bandwidth saturation | sar -n DEV; check util% | Upgrade network, or compress |
| High latency | Network congestion | ping -f (flood ping) | QoS, dedicated VLAN |
| Variable performance | Packet loss | netstat -s (retransmits) | Fix network, check cables |
| Sudden drops | MTU mismatch | ping -M do -s | Enable PMTUD or reduce MTU |
| Operation timeouts | Firewall issues | tcpdump for timeouts | Check firewall rules |
# Quick network health check for NFS

# 1. Basic connectivity
ping -c 3 nfsserver

# 2. MTU verification (test jumbo frames work)
ping -c 3 -M do -s 8972 nfsserver  # Will fail if jumbo not supported

# 3. TCP performance test (install iperf3 on both ends)
# Server: iperf3 -s
# Client: iperf3 -c nfsserver
# Should show line rate (e.g., 9.4 Gbps for 10GbE)

# 4. NFS-specific round-trip test
time rpcinfo -p nfsserver  # Should be < 1ms on LAN

# 5. Monitor retransmits during NFS activity
watch -n 1 'netstat -s | grep -E "(retrans|timeout)"'

# 6. Capture slow operations
tcpdump -i eth0 -w nfs_trace.pcap host nfsserver and port 2049 &
# Reproduce slow operation, then Ctrl+C
# Analyze with Wireshark: filter "nfs"

# 7. RPC layer diagnostics
rpcdebug -m nfs -s rpc  # Enable RPC debugging (verbose!)
dmesg | tail -50        # View debug output
rpcdebug -m nfs -c rpc  # Disable when done

For performance-critical deployments, use a dedicated network (VLAN or physical) for NFS traffic. This ensures other traffic doesn't compete for NFS bandwidth and simplifies network diagnostics. Many enterprises use 10GbE+ dedicated storage networks.
Experience with NFS reveals common patterns of performance problems. Knowing these patterns helps diagnose issues quickly.
Case Study: The Build System Disaster
A common pattern in software development environments using NFS for source code:
Symptom: Builds take 10x longer over NFS than local disk
Investigation:
nfsstat -c # On client during build
# Shows: GETATTR: 500,000/min (!)
# Shows: READ: 200/min
Diagnosis: Build system (make, ninja) stats every source file to check modification times. With thousands of source files and header dependencies, this generates massive attribute traffic.
Solutions (in order of preference):
- Increase attribute cache timeouts, e.g. actimeo=600, so repeated stat() calls are served from the client cache
- noac won't help here (it forces even more GETATTR traffic), avoid it
#!/bin/bash
# NFS performance diagnostic script

echo "=== NFS Client Statistics ==="
nfsstat -c

echo ""
echo "=== Mount Options ==="
mount | grep nfs

echo ""
echo "=== NFS Mount Detailed Stats ==="
nfsstat -m

echo ""
echo "=== Current NFS Activity (5 second sample) ==="
echo "Before:"
cat /proc/net/rpc/nfs | grep -v '^#'
sleep 5
echo "After:"
cat /proc/net/rpc/nfs | grep -v '^#'

echo ""
echo "=== Network Statistics ==="
netstat -s | grep -E "(retrans|timeout|reset)"

echo ""
echo "=== Diagnosis ==="
# Check for common problems

# High GETATTR rate?
GETATTR=$(nfsstat -c | awk '/getattr/ {print $2}')
if [ "$GETATTR" -gt 10000 ]; then
    echo "WARNING: High GETATTR rate ($GETATTR). Consider actimeo= tuning."
fi

# Check for sync mount
if mount | grep nfs | grep -q 'sync'; then
    echo "WARNING: Sync mount detected. Consider async for performance."
fi

# Check rsize/wsize
RSIZE=$(nfsstat -m | grep rsize | head -1)
if echo "$RSIZE" | grep -Eq 'rsize=[0-9]{1,4}([^0-9]|$)'; then
    echo "WARNING: Small rsize detected. Consider larger transfer sizes."
fi

Don't guess at performance problems—measure them. Use nfsstat, tcpdump, and application profiling to identify actual bottlenecks. A perceived 'NFS is slow' problem might actually be application behavior, network issues, or server disk limits.
We've explored the many facets of NFS performance. Here's a consolidated view of what matters most and best practices to follow:
| Setting | Where | When to Change |
|---|---|---|
| rsize/wsize | Mount option | Default usually optimal; test if different helps |
| actimeo | Mount option | Increase for read-mostly; decrease for frequent changes |
| nfsd threads | Server config | Increase if threads show 100% utilization |
| tcp_rmem/wmem | Kernel sysctl | Increase for high-bandwidth, high-latency networks |
| MTU | Network config | 9000 if all gear supports jumbo frames |
| async/sync export | Server exports | async for performance; sync for data safety |
The Performance Hierarchy
When troubleshooting or optimizing NFS, consider issues in this order:
1. Is the application NFS-appropriate? Some applications (many small random I/Os) will never perform well on NFS.
2. Is the network healthy? Packet loss, high latency, or saturation cause universal slowness.
3. Is the server adequately resourced? CPU, RAM, disk, network—any bottleneck here affects all clients.
4. Are mount options appropriate? rsize/wsize, caching timeouts, hard/soft mounting.
5. Is the application optimized for NFS? Batch operations, use buffering, avoid unnecessary stats.
Address higher-level issues before tuning lower-level parameters. A misconfigured network won't be fixed by adjusting read-ahead settings.
Congratulations! You've completed the Network File Systems module. You now understand NFS architecture from the ground up: the stateless design philosophy, the evolution of NFS versions, and how to optimize for real-world performance. This knowledge enables you to deploy, troubleshoot, and tune NFS in any environment—from simple file sharing to enterprise-scale infrastructure.