NFS

Network File System (NFS) is a network file system protocol originally developed by Sun Microsystems in 1984, allowing a user on a client computer to access files over a network in a manner similar to how local storage is accessed. NFS, like many other protocols, builds on the Open Network Computing Remote Procedure Call (ONC RPC) system. The Network File System is an open standard defined in RFCs, allowing anyone to implement the protocol.

Diagnosing problems

Check your version

First make sure you have an up-to-date copy of NFS installed with

rpm -q nfs-utils 

or

rpm -q -f /usr/sbin/rpc.nfsd

Check dependencies (like portmap) with

rpm -q -R nfs-utils

and check their versions as well. See what files are affected by

rpm -q -l nfs-utils

See that your services are running with

rpcinfo -p [hostname]

On a client machine look for portmapper, nlockmgr and possibly amd or autofs. A server will also run mountd and nfs.
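The check above can be scripted. The following is a minimal sketch, assuming the usual rpcinfo -p layout in which the service name is the last column; the function name missing_rpc_services is made up for this example.

```shell
# Print every expected RPC service that does not appear in "rpcinfo -p" output.
# The listing is read from stdin; expected service names are the arguments.
# missing_rpc_services is a hypothetical helper, not part of nfs-utils.
missing_rpc_services() {
    listing=$(cat)
    for svc in "$@"; do
        # Assumption: the service name is the last column of each row.
        if ! printf '%s\n' "$listing" |
            awk -v s="$svc" '$NF == s { found = 1 } END { exit !found }'; then
            printf '%s\n' "$svc"
        fi
    done
}

# Example on a server:
#   rpcinfo -p | missing_rpc_services portmapper nlockmgr mountd nfs
```

An empty result means every listed service is registered with the portmapper.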

Saturated network?

First exercise your disk with your own code or with a simple write operation like

time dd if=/dev/zero of=testfile bs=4k count=8182
  8182+0 records in
  8182+0 records out
  real    0m8.829s
  user    0m0.000s
  sys     0m0.160s

Writing files should be enough to test network saturation.
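To compare runs it helps to turn the dd timing into a throughput figure. This is a small sketch; the helper name dd_throughput_mib is hypothetical, and it assumes you pass the block count, the block size in bytes and the "real" time in seconds by hand.

```shell
# Convert a dd run into MiB/s: blocks * block_size_bytes / seconds.
# dd_throughput_mib is a hypothetical helper name used for illustration.
dd_throughput_mib() {
    awk -v blocks="$1" -v bs="$2" -v secs="$3" \
        'BEGIN { printf "%.2f\n", (blocks * bs) / (1024 * 1024) / secs }'
}

# For the run shown above (8182 blocks of 4096 bytes in 8.829 s):
#   dd_throughput_mib 8182 4096 8.829
```

For the sample run this works out to roughly 3.62 MiB/s.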

When profiling reads instead of writes, call umount and mount to flush caches, or the read will seem instantaneous.

cd /
umount /mnt/test
mount /mnt/test
cd /mnt/test
dd if=testfile of=/dev/null bs=4k count=8192

Check for failures on a client machine with

nfsstat -c

or

nfsstat -o rpc

If more than 3% of calls are retransmitted, then there are problems with the network or NFS server.
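The 3% rule can be checked without mental arithmetic. The sketch below assumes the client RPC section of nfsstat -c prints a header row beginning with "calls retrans" followed by a row of counters; if your nfsstat prints a different layout, the parsing needs adjusting. The name retrans_percent is made up for this example.

```shell
# Read "nfsstat -c" output on stdin and print retransmissions as a
# percentage of total RPC calls. retrans_percent is a hypothetical helper.
retrans_percent() {
    awk '
        # Assumption: the header row names the columns and the counters
        # are on the line that follows it.
        $1 == "calls" && $2 == "retrans" {
            getline
            calls = $1 + 0
            retrans = $2 + 0
            exit
        }
        END {
            if (calls > 0) printf "%.2f\n", 100 * retrans / calls
            else print "0.00"
        }'
}

# Example:
#   nfsstat -c | retrans_percent
```

A result above 3 matches the threshold given above.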

Look for NFS failures on a shared disk server with

nfsstat -s

or

nfsstat -o rpc

It is not unreasonable to expect 0 badcalls. You should have very few “badcalls” out of the total number of “calls.”

Lost packets

NFS must resend packets that are lost by a busy host. Look for permanently lost packets on the disk server with

head -2 /proc/net/snmp | cut -d' ' -f17

If you can see this number increasing during NFS activity, then you are losing packets.
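Field 17 of the Ip line is the ReasmFails counter (failed fragment reassemblies). Counting fields by position is fragile, so here is a sketch that looks the column up by name instead. The helper name reasm_fails is hypothetical; the optional file argument exists only so the function can be tried on a saved copy.

```shell
# Print the Ip ReasmFails counter, located by column name rather than position.
# reasm_fails is a hypothetical helper; it reads /proc/net/snmp by default.
reasm_fails() {
    awk '
        $1 == "Ip:" && !col {
            # The first "Ip:" line is the header: find the ReasmFails column.
            for (i = 2; i <= NF; i++) if ($i == "ReasmFails") col = i
            next
        }
        # The second "Ip:" line holds the values.
        $1 == "Ip:" && col { print $col; exit }
    ' "${1:-/proc/net/snmp}"
}

# Example: sample the counter before and after a burst of NFS traffic.
#   before=$(reasm_fails); sleep 60; after=$(reasm_fails)
#   echo "lost during interval: $((after - before))"
```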

You can reduce the number of lost packets on the server by increasing the buffer size for fragmented packets.

echo 524288 > /proc/sys/net/ipv4/ipfrag_low_thresh
echo 524288 > /proc/sys/net/ipv4/ipfrag_high_thresh 

This is about double the default.

Server threads

See if your server is receiving too many overlapping requests with

grep th /proc/net/rpc/nfsd
  th 8 594 3733.140 83.850 96.660 0.000 73.510 30.560 16.330 2.380 0.000 2.150

The first number is the number of threads available for servicing requests, and the second number is the number of times that all threads have been needed. The remaining 10 numbers are a histogram showing how many seconds a certain fraction of the threads have been busy, starting with less than 10% of the threads and ending with more than 90% of the threads. If the last few numbers have accumulated a significant amount of time, then your server probably needs more threads.
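The histogram is easier to judge as a percentage. The sketch below assumes the th line has exactly the layout shown above (thread count, all-busy count, then ten buckets) and reports how much of the recorded time fell into the last three buckets. The name nfsd_thread_load and the choice of three buckets are assumptions made for this example.

```shell
# Summarise the "th" line of /proc/net/rpc/nfsd.
# nfsd_thread_load is a hypothetical helper; pass a file to test on a copy.
nfsd_thread_load() {
    awk '$1 == "th" {
        total = 0; top = 0
        # Fields 4..13 are the ten histogram buckets, in seconds.
        for (i = 4; i <= 13; i++) total += $i
        # Fields 11..13 are the last three buckets (most threads busy).
        for (i = 11; i <= 13; i++) top += $i
        printf "threads=%d all_busy=%d top_buckets_pct=%.2f\n",
            $2, $3, (total > 0 ? 100 * top / total : 0)
    }' "${1:-/proc/net/rpc/nfsd}"
}

# Example:
#   nfsd_thread_load
```

For the sample line above this prints threads=8 all_busy=594 top_buckets_pct=0.11, so very little time was spent with nearly all threads busy.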

Increase the number of threads used by the server to 16 by changing RPCNFSDCOUNT=16 in /etc/rc.d/init.d/nfs

Invisible or stale files

If separate clients are sharing information through NFS disks, then you have special problems. You may delete a file on one client node and cause a different client to get a stale file handle. Different clients may have cached inconsistent versions of the same file. A single client may even create a file or directory and be unable to see it immediately. If these problems sound familiar, then you may want to adjust NFS caching parameters and code multiple attempts in your applications.
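"Coding multiple attempts" can be as simple as polling for a path before using it. This is one possible sketch, not a fix for the underlying caching behaviour; the function name wait_for_path and its default of 5 attempts one second apart are arbitrary choices for illustration.

```shell
# Wait until a path becomes visible, retrying a limited number of times.
# Usage: wait_for_path PATH [ATTEMPTS] [DELAY_SECONDS]
# wait_for_path is a hypothetical helper; the defaults are arbitrary.
wait_for_path() {
    path=$1
    attempts=${2:-5}
    delay=${3:-1}
    n=1
    while [ "$n" -le "$attempts" ]; do
        # Succeed as soon as the path can be seen from this client.
        if [ -e "$path" ]; then
            return 0
        fi
        # Pause between attempts, but not after the last one.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}

# Example:
#   wait_for_path /mnt/test/results.dat 10 2 || echo "still not visible" >&2
```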

Changing client mount properties

For more detailed information about tuning client mount properties for performance, see the man page for nfs (“man nfs”).

You can see what parameters you are using with:

cat /proc/mounts

Edit /etc/fstab to change properties.
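On a machine with many mounts, /proc/mounts is noisy. A small filter, sketched below, shows only the NFS entries with their effective options. It assumes the standard column order (device, mount point, type, options); the name nfs_mount_options is made up for this example.

```shell
# List NFS mount points with the options currently in effect.
# nfs_mount_options is a hypothetical helper; pass a file to test on a copy.
nfs_mount_options() {
    # /proc/mounts columns: device, mount point, fs type, options, dump, pass.
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 ": " $4 }' "${1:-/proc/mounts}"
}

# Example:
#   nfs_mount_options
```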

rw

Usually you want the rw flag to allow read-write access. mount treats rw as a default option, but listing it explicitly in /etc/fstab makes the intent clear.

intr

Allow users to interrupt hung processes with this flag (off by default). This might sound risky, but in fact this property is consistent with the original NFS design and is well supported. Unnecessary hangs will be more destabilizing.

lock

If your code needs file locking, then by all means turn this on. But if you are certain that locking is not required (as in my current project), then turn it off. It could create unnecessary opportunities for timeouts.

hard

For simple clusters, avoid the complexity of amd if you can. Use hard mounts.

vers=3

This appears as v2 or v3 in /proc/mounts. The NFS version supposedly defaults to version 2, but version 3 is faster and supports big files. I get v3 by default much of the time.

tcp or udp?

Almost everyone runs NFS under udp for performance. But udp is an unreliable protocol and can perform worse than tcp on a saturated host or network. If nfs errors occur too often, then you may want to try tcp instead of udp.

wsize and rsize

If packets are getting lost on the network then it may help to lower rsize and wsize mount parameters (read and write block sizes) in /etc/fstab.

For reliability, prefer smaller rsize and wsize values in /etc/fstab. I recommend rsize=1024,wsize=1024 instead of the defaults of 4096.

timeo and retrans

If the server is responding too slowly, then either replace the server or increase the timeo or retrans parameters.

For more reliability when the machine stays overloaded, set retrans=10 to retry sending RPC commands 10 times instead of the default 3 times.

The default timeout between retries is timeo=7 (seven tenths of a second). Increase to timeo=20 (two full seconds) to avoid hammering an already overloaded server.

acregmin, acregmax, acdirmin, acdirmax, noac, cto

acregmax and acdirmax are the maximum number of seconds to cache attributes for files and directories respectively. Both default to 60 seconds. A value of 0 disables caching for that type, and noac disables all attribute caching. cto (on by default) guarantees that files will be rechecked after closing and reopening.

Minimum numbers of seconds are set with acregmin and acdirmin. acdirmin defaults to 30 seconds and acregmin to 3 seconds.

I recommend setting acdirmin=0,acdirmax=0 to disable caching of directory information, and reducing acregmax to 10, because we have had so many problems with directories and files not appearing to exist shortly after being created.

noatime or atime

Performance should improve by adding the noatime flag. Every time a client reads from a file, the server must update the inode's time stamp for the most recent access. Most applications don't care about the most recent access time, so you can set noatime with impunity.

Nevertheless, this flag is rarely set on a general purpose machine, and if you are more concerned about reliability, then use the default atime.
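Pulling the recommendations from this section together, the sketch below prints a candidate /etc/fstab line for review. Combining them into one entry is an assumption of this example, not something the text above prescribes; noatime and the lock setting are left out because they depend on your application. The function name nfs_fstab_line and the server, export and mount point in the example are hypothetical.

```shell
# Print a candidate /etc/fstab entry using the options recommended above.
# Usage: nfs_fstab_line SERVER EXPORT MOUNTPOINT
# nfs_fstab_line is a hypothetical helper; review the line before using it.
nfs_fstab_line() {
    # hard + intr, NFS version 3, small blocks, patient retries,
    # no directory attribute caching, short file attribute caching.
    opts="rw,hard,intr,vers=3,rsize=1024,wsize=1024,timeo=20,retrans=10"
    opts="$opts,acdirmin=0,acdirmax=0,acregmax=10"
    printf '%s:%s  %s  nfs  %s  0 0\n' "$1" "$2" "$3" "$opts"
}

# Example with made-up names:
#   nfs_fstab_line fileserver /export /mnt/test
```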

Setup Client Server

IP addresses used in this example setup:

  • Server IP: 192.168.63.233
  • Client IP: 192.168.63.234

Server configuration

Make sure nfs-utils are installed:

yum install nfs-utils

Configure access to the shared volume within /etc/exports

/srv/www/  192.168.63.234/32(rw,fsid=0,insecure,no_subtree_check,sync,anonuid=65534,anongid=65534)

For a detailed explanation of the options above (rw, ro, sync, no_subtree_check etc.) see the man page for exports (“man exports”).

Rescan exports:

exportfs -rv

Register nfs to start automatically at boot and start the service:

chkconfig nfs on
service nfs start

Client configuration

Make sure nfs-utils are installed:

yum install nfs-utils

Create a mount point:

mkdir /mnt/www

Mount the NFS share using NFSv4:

mount -t nfs4  192.168.63.233:/ /mnt/www/

Edit /etc/fstab to mount the NFS share automatically at boot time:

192.168.63.233:/  /mnt/www  nfs4  rw,hard,intr,proto=tcp,port=2049  0 0

After the entry in fstab has been created you can mount and umount the NFS share like this:

mount  -v /mnt/www
umount -v /mnt/www
/srv/wiki.niwos.com/data/pages/linux/applications/nfs.txt · Last modified: 2010/06/14 13:19 (external edit)