NFS Share disconnecting: Stale file handle

Fffrank · 11 November 2021 19:35

I have my NFS share mounted via fstab. It worked fine for over 18 months without any hiccups, but recently I’ve had to reboot nearly every day because of ‘stale file handle’ errors. The share remounts without issue after a restart and works fine until I leave OSMC idle (overnight or during the work day).

Full logs here: https://paste.osmc.tv/ciruzitene

osmc@osmc:/mnt$ ls
ls: cannot access 'dietpi': Stale file handle

osmc@osmc:/mnt$ cat /etc/fstab
# rootfs is not mounted in fstab as we do it via initramfs. Uncomment for remount (slower boot)
#/dev/vero-nand/root  /    ext4      defaults,noatime    0   0
192.168.1.10:/mnt/user/media /mnt/dietpi nfs    noauto,x-systemd.automount,_netdev  0  0

osmc@osmc:/mnt$ mount
devtmpfs on /dev type devtmpfs (rw,relatime,size=1015064k,nr_inodes=253766,mode=755)
proc on /proc type proc (rw,relatime)
sysfs on /sys type sysfs (rw,relatime)
tmpfs on /run type tmpfs (rw,relatime)
/dev/mapper/vero--nand-root on / type ext4 (rw,relatime,stripe=1024,data=ordered)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/debug type cgroup (rw,nosuid,nodev,noexec,relatime,debug)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/schedtune type cgroup (rw,nosuid,nodev,noexec,relatime,schedtune)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=36,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=196)
sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
mqueue on /dev/mqueue type mqueue (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
systemd-1 on /mnt/dietpi type autofs (rw,relatime,fd=42,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=11577)
configfs on /sys/kernel/config type configfs (rw,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
192.168.1.10:/mnt/user/media on /mnt/dietpi type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.1.11,local_lock=none,addr=192.168.1.10,_netdev)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=203728k,mode=700,uid=1000,gid=1000)

Tom_Doyle · 11 November 2021 22:07

Hi,

Has anything outside of the vero4k been changed in your network recently?

Could you give us more details about your network: what’s between the vero4k and 192.168.1.10 (switches or routers)? It may also be worth trying a different network cable on the vero4k.

You could also try autofs to see if it makes any difference:

Also, FYI, the _netdev option is ignored by NFSv4.

Thanks Tom.

Fffrank · 11 November 2021 23:31

I did upgrade my Unraid server software recently – not sure if that caused any issues but I did post over on their boards as well.

They’re both connected to the same HP managed switch and a PFSense router is handling the network.

Good tip! I don’t think this was an option when I first set up my Vero4k (April '18). I’ll have time to break/tinker with it again next week.

Thanks!

bmillham · 12 November 2021 01:31

I’d suggest that you check out the Unraid server. The ‘stale file handle’ error usually means that the physical drive was dismounted and then couldn’t mount to the same /dev/sdX on the server, so NFS is unable to see the remounted drive but is still seeing the old mountpoint.
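
While the server side is being checked, it may be possible to clear the stale mount on the client without a full reboot. The following is only a rough sketch, not something tested on OSMC: the function names is_stale and remount_if_stale are made up for illustration, and it assumes GNU stat, sudo rights, and the x-systemd.automount entry from the fstab shown earlier, so that touching the path after the unmount triggers a fresh mount.

```shell
#!/bin/sh
# Sketch only: is_stale and remount_if_stale are hypothetical helper names.

# Succeeds if stat on the path fails with a stale-handle error.
# LC_ALL=C pins the error text to English; older glibc prints
# "Stale NFS file handle", newer glibc prints "Stale file handle".
is_stale() {
    LC_ALL=C stat -- "$1" 2>&1 | grep -q 'Stale.*file handle'
}

# Lazily detach a stale mount, then access the path so the
# systemd automount unit (assumed to exist) can mount it again.
remount_if_stale() {
    if is_stale "$1"; then
        sudo umount -l "$1" && ls "$1" >/dev/null
    else
        echo "$1 is not stale; nothing to do"
    fi
}
```

Calling `remount_if_stale /mnt/dietpi` would then replace the reboot. If the server keeps handing out handles that go stale, this only treats the symptom; the cause still has to be found on the Unraid side.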