Watching recordings crash after upgrade from 3 B to 3 B+

My perfectly working setup was a Pi 3 B running OSMC 2018.03-2 (the latest) as a tvheadend server, with recordings on a Synology NAS mounted over NFS via fstab. The client is a Shield TV 2017 with Kodi, and the connection is gigabit Ethernet. I needed the old Pi elsewhere, so I decided to move OSMC to a 3 B+ in a Flirc case. I swapped the SD card into the new box, plugged in the Hauppauge TV tuner stick and everything was as before, except the CPU ran 15-20 °C cooler. As far as I know, the only thing that has changed (other than the Pi hardware) is the IP address of the OSMC box, and I had to make that change in the tvheadend addon configuration in Kodi on the Shield.

But not everything is working as expected. Live TV is still perfect, but when I watch recordings, the connection between the Shield and tvheadend drops randomly. Sometimes immediately, sometimes after 20 minutes of watching. The Kodi log doesn’t provide much information on the client side, and I haven’t been able to pinpoint any cause for this loss of connection in the tvheadend logs either. The loss of connection is reported in both logs but there’s no explanation why.

I thought maybe the SD card is somehow not 1:1 compatible between Pi revisions and made a fresh install. Ok, I cheated a little bit and backed up and restored the ~/.hts directory for tvheadend. But no joy. Watching recordings still causes random connection losses. Then I swapped the SD card back to the old Pi 3 B and everything is back to normal again! No crashes and silky smooth operation.

Any suggestions as to which keywords I should add to tvheadend logging to get useful information? Or any suggestions as to what could have gone wrong between otherwise identical setups on the 3 B and 3 B+? Naturally I’m ready to provide any logs necessary.

I know there are some issues with the Ethernet driver for the 3b+ that I believe will be addressed with a future OSMC update (I’m hesitant to say “next” since I’m not one of the developers). So that certainly could be causing your problem. I also swapped a 3B+ for a 3B. I had more aggressive caching set in my advanced settings (both amount and speed), and removing those two lines seems to have stabilized things for me. I don’t know how much of that is just coincidence and how much might be that the default cache settings are less stress on the network connection.

Ethernet is indeed likely the issue

Sam

Ok, thanks for the quick response! Ethernet driver issues seem a plausible culprit. I’ll give the 3 B+ a try over wifi and see if it works better.

Sam,

Given the number of people having issues with the 3B+ including myself, do you think it would be appropriate to warn people off this version with OSMC for now?

I think I can verify this is definitely an Ethernet driver issue. Having read quite a lot of posts in the Pi forums, I found out that Raspberry Pi engineers acknowledge there is a severe issue with the RX flow control of the driver. I wouldn’t expect a quick solution.

The Pi forum provided me with a workaround that I can live with. When the Ethernet interface is forced to 100 Mbit/s, it doesn’t drop frames, doesn’t get clogged and stays alive.

First I installed ethtool, then did what someone suggested in the Pi forum and added the following command to /etc/rc.local

/sbin/ethtool -s eth0 speed 100 duplex full autoneg on

EDIT: I’m having some problems with this command and /etc/rc.local - have to investigate a little bit…
EDIT2: I had to delay execution of rc.local until network is ready. Luckily there are instructions.

100 Mbit/s is quite all right and I can watch my recordings uninterrupted again.
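
For anyone wanting to try the same thing, here is a minimal sketch of one way to delay the ethtool call until the link is up. It is not necessarily what the instructions mentioned above do. It assumes the interface is called eth0 and that the kernel exposes link state in /sys/class/net/eth0/carrier; wait_for_link and the SYSFS_NET override are names made up for this example.

```shell
# Poll the kernel's carrier flag until the link is up.
# Returns 0 once carrier reads "1", or 1 if it never does within the given tries.
# SYSFS_NET is a hypothetical override, only there so the function can be
# tried against a scratch directory instead of the real /sys.
wait_for_link() {
    iface=$1
    tries=${2:-30}                       # roughly one try per second
    net_dir=${SYSFS_NET:-/sys/class/net}
    while [ "$tries" -gt 0 ]; do
        # Reading carrier fails while the interface is down; treat that as "not ready"
        if [ "$(cat "$net_dir/$iface/carrier" 2>/dev/null)" = "1" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# In /etc/rc.local, above the final "exit 0" (assumption: interface is eth0):
#   wait_for_link eth0 30 && /sbin/ethtool -s eth0 speed 100 duplex full autoneg on
```

Running "ethtool eth0" afterwards should report the negotiated speed, which is an easy way to confirm the setting actually took effect after a reboot.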


The problem is older versions of OSMC don’t support the B+.

We are waiting for fixes from the Pi Foundation.

I read @dappernut’s post as saying you all might suggest not running OSMC on the 3B+ at all, not trying to run an older version of OSMC on it.

OSMC works on the 3 B+. The problem is the new “gigabit” Ethernet, which cannot handle the incoming flow of packets over the USB 2.0 bus that the Ethernet hardware is attached to. If you force Ethernet to a speed the USB 2.0 bus can handle, e.g. 100 Mbit/s, OSMC can be used on the Pi 3 B+. Or use wifi instead of Ethernet.

As far as I understand, this issue is not actually related to OSMC per se; it is a hardware-related Ethernet driver issue affecting all operating systems, distros and applications.


It will likely be fixed in a future update.

Worst case scenario: we have to limit to 100Mbps by default and add an option to My OSMC.

Sam

I have locked the Ethernet interface on my NAS to 100 Mb/s full duplex and all the problems with my 3B+ have gone away, including buffering, audio drop-out and stopping and returning to the main menu.

What I find particularly interesting is that I have previously experienced a lot of problems with various RPi (1, 2 and 3, and now a Vero 4K too) all stopping playback and returning to the main menu, and locking the NAS to 100 Mb/s seems to have cured all those problems too. I know conventional wisdom is that these types of issue are a network problem; however, after replacing various cables, switches, playback devices and NASes over a period of many months, I’m really not convinced that this is necessarily the case, or that there isn’t in fact something fundamentally iffy with the OSMC/Kodi handling of Ethernet. Based on breadcrumbs from around the web, I don’t believe I’m the only one. Note: Using full gigabit from NAS to playback devices never gave me a buffering problem.

As my NAS has multiple Ethernet interfaces, I’ve locked SMB to a 1 Gb/s port for general file storage use and NFS to a 100 Mb/s port purely for Kodi stream use (mounted via fstab). All seems good in this configuration.

The Vero 4K and Pi are completely different hardware. I’d be surprised if there’s a common OSMC problem affecting these platforms.

I’m running a Vero 4K on a Gigabit network without issue; and haven’t had to adjust speeds manually on any equipment. Some logs after an incident may give some clues. I’ve also updated the Pi kernel which has some improvements for the Gigabit based ethernet adapter.

Sam

To be absolutely clear, I’m a long time user and a big fan. 👍 I only want to make things better for everyone. I also forgot that I’d tried OpenELEC and LibreELEC on RPi and got pretty much the same results. “It’s your network for sure” I hear you say, however please bear with me.

Trying to determine root cause, my data points are as follows:-

  1. No other devices on my network (of which there are plenty) seem to be misbehaving.
  2. As mentioned before, I’ve tried swapping out different switches, patch leads, NAS’s, playback devices and versions of KODI.
  3. I’ve borrowed Fluke network test kit and checked my structured cabling which seems fine (then reterminated and checked again for good measure).
  4. iperf3 measurements using PC hardware indicate a high throughput/low packet loss network (though iperf3 on RPi is all over the place).
  5. If there is mild packet loss on my network, TCP/IP retransmission should take care of it with minimal measurable impact on a 30 Mb/s video stream over a gigabit network.
  6. Turning off 802.3x ethernet flow control on my switches makes matters worse, turning it on improves things.
  7. In general I have never suffered a buffering problem (excluding the recent 3B+ problem).
  8. Problems are not related to the video player (in isolation), as files copied to a local thumb drive always play fine.
  9. I have previously provided debug logs for the playback stopping issue from which no useful information could be determined (though more than willing to try this again noting item 12 below).
  10. Limiting the output of the network source device to the 100 Mb/s input limit of the playback device appears to solve all problems.
  11. When a playback issue occurs, ping and ssh connectivity to source and playback devices is unaffected.
  12. The totally random and unpredictable nature of the issue makes it very difficult to reliably draw conclusions when testing configuration A vs configuration B. Sometimes things appear to work perfectly for hours, then fail repeatedly after only a few minutes with no discernible cause or pattern.

Based on all this, my Janet & John hypothesis is:-
When Kodi playback devices get overloaded with incoming Ethernet packets, some data gets lost, and instead of this data being re-fetched from the network source, something internal (within the OS and/or network stack) raises an error which is interpreted by the video player as an end of file. In this scenario, fast networks and source network devices are more troublesome than slower ones. By extension, if this is the case and the internal error can be caught and dealt with appropriately (i.e. by retrying), then this problem could be solved for good.

Sorry for the apparent diatribe, however as you can probably tell I’ve been mythering over this for a long time and I’m no longer buying into a faulty network being the root cause.

Kodi will think it hits EOF if there is a drop in network connectivity.
If you mount a Samba or NFS share via Kodi, you are more likely to suffer from this issue than an fstab based mount. This is because the kernel will read ahead; and therefore if the connection is briefly interrupted, the problem will not be as noticeable, unless there is significant disturbance.

Sam

I have never been able to capture any sign of Ethernet disconnecting at either end, nor of significant packet loss on the network. I can however see cause and effect between the speed at which data is arriving at the playback device, and playback disruption.

Best regards,
Mark W.

From post #11:

Wouldn’t you agree that this at least suggests some kind of network-related issue? You talk about replacing switches, so it looks like there is the possibility you’d been having flow-control and/or packet-buffer issues on the path between the NAS and OSMC device(s)

If you’re up for it, you can always run ifstat on OSMC and see if there’s any discernible pattern to the network data when an error occurs. It’s not part of the standard OSMC build, so you’ll need to install it first. Then run:

nohup ifstat -nt -i eth0 >/home/osmc/ifstat.out 2>/dev/null &

It’ll run in the background even after you’ve logged off and will record around 20 bytes of data each second, so you’re not going to fill up the storage space any time soon. If an error occurs, you should check its time from the Kodi log (/home/osmc/.kodi/temp/kodi.log) and see if there’s anything obviously occurring on the network interface at the same time. I find that importing the data to a spreadsheet and graphing it can often be very revealing.

Remember to kill the job once you no longer need it.
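
If the log gets long, something like the following could pull out just the samples around the time of an error, already in CSV form for a spreadsheet. It's only a sketch: ifstat_window is a name made up here, and it assumes the "ifstat -nt" layout of one HH:MM:SS timestamp followed by the KB/s in and KB/s out columns for a single interface, with both times on the same day.

```shell
# Print ifstat samples whose timestamp lies between two HH:MM:SS values (inclusive).
# $1 = ifstat log file, $2 = start time, $3 = end time.
# Header lines are skipped because they do not start with a numeric timestamp.
ifstat_window() {
    awk -v from="$2" -v to="$3" '
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ && $1 >= from && $1 <= to {
            print $1 "," $2 "," $3      # time, KB/s in, KB/s out
        }' "$1"
}

# Example, for an error logged by Kodi at about 21:03:30:
#   ifstat_window /home/osmc/ifstat.out 21:02:30 21:04:30 > /home/osmc/window.csv
```

The timestamps are compared as plain text, which works for zero-padded times but will not handle a window that spans midnight.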

I absolutely agree that this appears to be related to networking, however it is not necessarily indicative of a faulty network.

I’ll have a look at ifstat and see if that provides another data point.

Update:

Started looking at this again yesterday evening (after several days of blissful faultless playback with the NAS connection at 100 Mb/s) and after reverting to the NAS connection at 1 Gb/s, replay now seems to be perfect. I’ve seen this sort of random OK/not OK thing before however, so the bunting isn’t up yet. Will repeat the testing when things start to play up again, as I’m confident they will.

I have however noted from the output of ifstat that with 100 Mb/s the throughput is nice and steady, and with 1 Gb/s it is constantly bouncing up and down. I can also see a count of pause frames being received by the switch, which during testing at both 100 Mb/s and 1 Gb/s has stayed at zero. When replay problems have been encountered previously, I believe I’ve seen the switch start to receive pause frames from the player, however this hasn’t been during controlled test conditions; watch this space.
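
The pause frame count could also be checked from the player's side rather than the switch's. A rough sketch, assuming ethtool is installed on the player and that its driver exposes pause counters in "ethtool -S" at all (counter names differ from driver to driver, and some drivers expose none); pause_counters is just a name made up for this example.

```shell
# Filter "ethtool -S" style output on stdin down to the pause-frame counters,
# printed as name=value. Matching is case-insensitive on the counter name.
pause_counters() {
    awk -F: 'tolower($1) ~ /pause/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $1)   # trim the counter name
        gsub(/[ \t]/, "", $2)             # strip spaces around the value
        print $1 "=" $2
    }'
}

# On the player (assumption: interface is eth0):
#   ethtool -S eth0 | pause_counters
```

Noting the values before and after a playback failure would show whether the player sent any pause frames in between, which could then be compared with what the switch reports.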

@dappernut, have you updated to latest osmc? I noticed there is a fix for some 3 B+ ethernet issues. I asked for details but haven’t seen any.

Could you clarify which machine you’re using for your latest tests?