The Problem:
Last night I tried playing a large (~50 GB) 4K video file and playback was unwatchable. It took forever to buffer initially, and it stalled every few minutes to re-buffer. I had just watched a 25–30 GB file and it worked great, and my setup is pretty beefy, so I was surprised.
As part of testing whether the problem was the network or the player itself, I transferred the file over the network to a USB SSD mounted on my MacBook, using macOS's native SMB implementation. It took 18 minutes to transfer the file, which by my math works out to about 46 MB/s (call it 50). So while there's clearly a bottleneck somewhere, that should still be plenty to stream the file (18 minutes is much shorter than the video's length).
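The back-of-the-envelope math, for anyone checking (the file size is approximate, so the rate is too):

```shell
# Rough throughput of the 18-minute SMB transfer of the ~50 GB file
BYTES=$((50 * 1000 * 1000 * 1000))  # ~50 GB
SECS=$((18 * 60))                   # 18 minutes
echo "$((BYTES / SECS / 1000000)) MB/s"  # → 46 MB/s
```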
Setup:
Raspberry Pi 4 + WD Mybook over USB3
SMB mounted on the Vero 4K via fstab and CIFS
Config:
//<Local reserved IP>/galahad /mnt/Galahad cifs x-systemd.automount,rw,iocharset=utf8,vers=3.0,username=<xxx>,password=<xxx>,noperm 0 0
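For experimenting with option changes without editing fstab each time, the equivalent one-off mount looks roughly like this (a sketch; the IP, username, and password are placeholders, same as above):

```shell
# Hypothetical manual mount matching the fstab entry above (run as root)
sudo mount -t cifs "//<Local reserved IP>/galahad" /mnt/Galahad \
  -o rw,iocharset=utf8,vers=3.0,username=xxx,password=xxx,noperm
```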
Network (all Cat6 hardwired): Pi → Switch (office) → Switch (main hub in garage) → Switch (AV Rack) → Vero 4k+
Theoretically, the network can carry 100–125 MB/s (gigabit), and even taking the HDD limitations into account (the biggest constraint), that should be plenty to stream big-ass 4K files. So the potential bottlenecks are the network, disk I/O, and SMB itself.
Network
Quickly and easily ruled out using iperf3. Running between the Pi and the Vero (OSMC) I see:
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 109 MBytes 913 Mbits/sec
[ 5] 1.00-2.00 sec 112 MBytes 940 Mbits/sec
[ 5] 2.00-3.00 sec 111 MBytes 934 Mbits/sec
[ 5] 3.00-4.00 sec 112 MBytes 939 Mbits/sec
[ 5] 4.00-5.00 sec 112 MBytes 939 Mbits/sec
[ 5] 5.00-6.00 sec 112 MBytes 939 Mbits/sec
[ 5] 6.00-7.00 sec 112 MBytes 939 Mbits/sec
[ 5] 7.00-8.00 sec 112 MBytes 939 Mbits/sec
[ 5] 8.00-9.00 sec 112 MBytes 939 Mbits/sec
[ 5] 9.00-10.00 sec 86.9 MBytes 729 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.02 sec 1.07 GBytes 914 Mbits/sec 1 sender
[ 5] 0.00-10.00 sec 1.07 GBytes 915 Mbits/sec receiver
That’s pretty much saturating the Pi and Vero’s gigabit network cards, so everything looks groovy.
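For reference, the test above amounts to the following (assuming iperf3; the Vero's IP is a placeholder):

```shell
# On the Vero (server side):
iperf3 -s

# On the Pi (client side), default 10-second TCP test:
iperf3 -c <vero-ip> -t 10
```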
Disk i/o on Pi
This was a little harder to hunt down. For context, the drive is formatted exFAT; it was previously used as a USB drive on a Mac, and I was too lazy to reformat it. I was concerned that the exFAT FUSE driver was causing issues, so I ran benchmarks with dd and hdparm.
For more context, Wirecutter's benchmarks on the drive I own came in around 140 MB/s read and write (I've seen other folks get closer to 180 MB/s).
In my tests, writes were pretty consistent and much slower than 140 MB/s (I do suspect the FUSE driver is the culprit), but they should still be plenty for streaming large files:
pi@raspberrypi:~ $ dd if=/dev/zero of=/mnt/galahad/test.tst bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB, 391 MiB) copied, 8.13792 s, 50.3 MB/s
Reads varied. When the disk had to wake up, we were looking at more like 18 MB/s (not surprising), but with a warm disk, things looked good (I repeated all these tests many times):
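"OS cache blown away" below means something like the following (this is an assumption about the exact method; any equivalent cache drop works):

```shell
# Flush dirty pages, then drop the Linux page cache so dd reads hit the disk
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
```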
Disk warm (OS cache blown away, disk still potentially cached):
dd if=/mnt/galahad/test.tst of=/dev/null bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB, 391 MiB) copied, 2.41268 s, 170 MB/s
Disk cold (OS cache blown away, assuming disk cache dead):
dd if=/mnt/galahad/test.tst of=/dev/null bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB, 391 MiB) copied, 22.4917 s, 18.2 MB/s
hdparm on /dev/sda1:
Timing cached reads: 1584 MB in 2.00 seconds = 792.65 MB/sec
Timing buffered disk reads: 586 MB in 3.01 seconds = 194.57 MB/sec
So, probably some performance gains to be had here, but nothing showstopping. Ruling this out.
SMB performance
So, on to SMB performance. To test this I used dd again, just reading from the mount points to /dev/null. As noted above, my very unscientific benchmark of transferring the file to my Mac showed around 50 MB/s.
To confirm that quickly, I ran dd from my MacBook:
dd if=/Volumes/galahad/test.tst of=/dev/null bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes transferred in 7.012072 secs (58413547 bytes/sec)
so ~58 MB/s
On to OSMC:
dd if=/mnt/Galahad/test.tst of=/dev/null bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB, 391 MiB) copied, 34.5151 s, 11.9 MB/s
So, this looks to be the culprit: SMB reads are very slow on OSMC.
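One caveat on methodology: dd with bs=4096 issues tiny 4 KiB reads, which is close to a worst case for CIFS, whereas a player streaming a big file reads in much larger chunks. A variant worth running (hypothetical command, same test file) to separate small-read overhead from raw SMB throughput:

```shell
# Same read test with 1 MiB blocks; raising rsize at mount time may also help
dd if=/mnt/Galahad/test.tst of=/dev/null bs=1M count=400
```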
A suggested fix is to enable loose caching via the cache=loose CIFS mount option (with some cache-coherency downsides, of course). I tried this, remounted, and…
dd if=/mnt/Galahad/test.tst of=/dev/null bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB, 391 MiB) copied, 28.1725 s, 14.5 MB/s
Nope.
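For reference, the fstab line with loose caching enabled would look like this (a sketch; placeholders as above):

```shell
//<Local reserved IP>/galahad /mnt/Galahad cifs x-systemd.automount,rw,iocharset=utf8,cache=loose,vers=3.0,username=<xxx>,password=<xxx>,noperm 0 0
```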
Any ideas here? Since everything in my house runs Linux or macOS, I should probably just switch over to NFS (I could even run it side by side with SMB), but I figured I'd ask here in case anyone has ideas, or in case this helps improve something. Any thoughts?
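If I do try NFS side by side, the rough plan would be something like this (a sketch, untested; the IPs and export options are assumptions):

```shell
# On the Pi: add an export for the drive, then reload the NFS server.
# /etc/exports would gain a line like:
#   /mnt/galahad  <vero-ip>(ro,insecure,all_squash)
sudo exportfs -ra

# On the Vero: fstab line analogous to the CIFS one, e.g.
#   <pi-ip>:/mnt/galahad  /mnt/Galahad  nfs  x-systemd.automount,ro  0 0
```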