Fixed: OSMC alpha 1-4 RPi CEC issue

I am testing new RPi 2 and new (only) OSMC_TGT_rbp2_20150202.img image.
CEC is somehow failing randomly (few seconds to few minutes) after boot. Here is all that I could find from Debug - which is not much:

osmc@osmc:~$ tail -n 1000 -f .kodi/temp/kodi.log | grep -i cec
12:50:44 T:3024531456   DEBUG: LoadMappings - loaded node "CEC Adapter"
12:50:44 T:3024531456   DEBUG: LoadMappings - loaded node "Pulse-Eight CEC Adapter" 
12:50:44 T:3024531456   DEBUG: SECTION:LoadDLL(libcec.so.2)
12:50:44 T:3024531456   DEBUG: Loading: libcec.so.2
12:50:45 T:3024531456   DEBUG: GetMappingForDevice - device (2708:1001) mapped to CEC Adapter (type = cec)
12:50:45 T:3024531456   DEBUG: InitialiseFeature - using libCEC v2.2.0
12:50:45 T:3024531456  NOTICE: Register - new cec device registered on cec->RPI: CEC Adapter (2708:1001)
12:50:45 T:2940658720  NOTICE: Thread CECAdapter start, auto delete: false
12:50:45 T:2940658720   DEBUG: OpenConnection - opening a connection to the CEC adapter: RPI
hdmi_ignore_cec_init=1
hdmi_force_cec_address=65535
12:50:46 T:2940658720   DEBUG: OpenConnection - connection to the CEC adapter opened
12:50:46 T:2903503904  NOTICE: Thread CECAdapterUpdate start, auto delete: false
12:50:48 T:2903503904   DEBUG: UpdateAudioSystemStatus - no CEC capable amplifier found
12:50:50 T:2884244512   DEBUG: OSMC PI hdmi_ignore_cec_init  =
12:50:50 T:2884244512   DEBUG: OSMC PI hdmi_ignore_cec =
12:50:25 T:2928669728   DEBUG: PushCecKeypress - received key a8 duration 0
12:50:25 T:2928669728   DEBUG: PushCecKeypress - received key a8 duration 141
12:50:26 T:2928669728   DEBUG: PushCecKeypress - received key a8 duration 0
12:50:26 T:2928669728   DEBUG: PushCecKeypress - received key a8 duration 139
12:50:30 T:2928669728   DEBUG: PushCecKeypress - received key a7 duration 0
12:50:30 T:2928669728   DEBUG: PushCecKeypress - received key a7 duration 124
12:50:31 T:2928669728   DEBUG: PushCecKeypress - received key a9 duration 0
12:50:31 T:2928669728   DEBUG: PushCecKeypress - received key a9 duration 147
12:50:33 T:2928669728   DEBUG: PushCecKeypress - received key  b duration 0
12:50:33 T:2928669728   DEBUG: PushCecKeypress - received key  b duration 139
12:50:52 T:2903503904   DEBUG: Thread CECAdapterUpdate 2903503904 terminating

After this, there is absolutely nothing that I can do using TV remote. The only solution so far is to restart via ssh.

Iā€™ve been using RPi B for some time now with raspbmc auto updated. Iā€™ve never seen this kind of behaviour.
For testing I used:

  • Two different Power Supplies
  • Two different HDMI cables
  • Two different HDMI ports on TV

I also tried restarting/powering off/on all/some devices - nothing changed.

Just to clarify, everything works fine on:
Old RPi 1 model B running on auto updated software, right now:

pi@raspbmc:~/.kodi/temp$ kodi --version
14.1 Git:2015-02-02-81c4046-dirty Media Center Kodi    
Copyright (C) 2005-2013 Team Kodi - http://kodi.tv

ā€¦connected to SHARP AQUOS LC 40LE830E and using standard TV Remote.

Please can you point me to some more verbose Debug mode. Or maybe there is some bug in current version of libcec. The weird thing is for example inconsistency in libCEC versions:

(...)
12:50:45 T:3024531456   DEBUG: InitialiseFeature - using libCEC v2.2.0
(...)
osmc@osmc:~$ dpkg -l | grep cec
ii  rbp-libcec-osmc                      2.1.4                     armhf        libcec
osmc@osmc:~$ dpkg -L rbp-libcec-osmc
/.
/usr
/usr/lib
/usr/lib/pkgconfig
/usr/lib/pkgconfig/libcec.pc
/usr/lib/libcec.la
/usr/lib/libcec.so.2.0.1
/usr/lib/libcec.a
/usr/bin
/usr/bin/cec-client
/usr/lib/libcec.so
/usr/lib/libcec.so.2

So I can see versions:

  • 2
  • 2.0.1
  • 2.1.4
  • 2.2.0

WTH?

Second - minor thing is that xbmc-send doesnā€™t work:
osmc@osmc:/usr/bin$ ./xbmc-send
Traceback (most recent call last):
File ā€œ/usr/bin/xbmc-sendā€, line 29, in
from xbmcclient import *
ImportError: No module named xbmcclient

It tries to: sys.path.append(ā€™ā€¦/ā€¦/lib/pythonā€™)
and there is no such directory.

This is minor bug and I am sure it will be corrected in near future

Raspbmc and OSMC use the exact same version of libcec, but it might be something to look into here.
My guess is that the fast boot time with OSMC time might be one difference though.
Personally I use 2 different setups and TVĀ“s with no big issues, the difference I see is the behavior close to boot but it never fails though.

One thing with my Panasonic Viera, if I have my old Pi B connected via HDMI at the same time as my Pi2, even if it is not powered (even power chord disconnected) CEC does not work at all.
But this is a problem with the TV though, same thing happens when I connect my laptop.

Regarding the ā€œversionsā€, there is only one version of libcec you are looking at and itĀ“s 2.2.0.
The others are the OSMC package version, actual library name and the rest symlinks (library names and symlink names are not versions).

This is sort of true. Iā€™m also using some popcornmix patches on top of the upstream libCEC version.

Try powering down that TV for a while.

Will look at xbmc-send, thanks for bringing it to my attention.

OK, Iā€™ve confirmed that this is definately software related.

I flashed OSMC_TGT_rbp_20150201.img to new SD card and put it in my otherwise fine working RPi 1 model B. After the initial setup and reboot, CEC froze almost imediately with almost identical error: Thread CECAdapterUpdate 2900804640 terminating. After switching back to SD card with raspmbc, everything went back to normal - without changing anything from TV side (options/power etc.).

So there must be a difference in libraries or firmware versions between raspbmc and osmc.

@sam_nazarko I already dercribed that I tried all as well as selective powering down and rebooting of everything.

@miappa I agree that there might be something related to fast boot, but then again the issue seems random and sometimes it happens after few seconds and sometimes after 1-2 minutes after boot

If I have some time, Iā€™ll try to use older RPi 1 images. Maybe one of them will work correctly - then we may have another variable like some changes in code to analyze.

I tried the earliest image - OSMC_TGT_rbp_20141221.img (I hoped it would work) but unfortunately it is as bad as with the most recent one. I am out of options right now. Correct me if I am wrong, but I think it is pointless to try every other build because clearly the problem was introduced somewhere between the most recent raspbmc and first available osmc build

Please, can someone point me in the right direction so I can further investigate the problem?

Also if there is something that can be done in order to find this bug please acknowledge.

I was going to slowly migrate from raspbmc to central mysql based osmc solution, but with no CEC operational it is not a viable option for me anymore. I would rather try to find the root cause of this, because I am very used to raspbmc and I had great fun using it. Iā€™ve never considered different migration paths then raspbmc->osmc. Right now I am very confused and donā€™t know what to do next.

If there is something that I could test on my side (different libraries/ firmwares/ options), I am willing to do it when provided with brief How to.

I continue my fight with this problem. I dwelt a little deeper in debug options and tried to find out what is causing this.
I pasted-bin 3 kodi logs.All of them with additional CEC and VIdeo debug turned on.
First: kodi-after-shutdown - Pastebin.com - this happens when I shutdown -r the OS - pretty much its the same every time
Second and third:
kodi-after-mediacenter-start1 - Pastebin.com
kodi-after-mediacenter-start2 - Pastebin.com

  • this happens when I stop the mediacenter and start it again using sudo systemctl

I tested it several times with exactly the same outcome (sooner or later) but for the sake of this discussion I wanted to do exact same things and present logs. So after shutdown or restart mediacenter I pressed ā€œLeftā€ on my remote one time,
it begins with entry: CecLogMessage - >> 02:44:03
ā€¦just to proof that it is working. After that I just waited

Basically after shutdown ā€˜Thread CECAdapterUpdate ā€¦ terminatingā€™ entry appears approx after 1,5 minutes of doing nothing.
After stop/start mediacenter it is unfortunately random. I am not very patient and I donā€™t have all day to wait. So after a while I started pressing buttons on my remote. I repeated this procedure about 4-5 times and sometimes it was few minutes, sometimes 10 minutes for CEC to fail. Right now, when I am writing this itā€™s been 35 minutes without a failure, but I know itā€™s just playing me, so I cannot present another log.
OK :slight_smile: failed after 37 minutes

I wanted to confirm is it the CEC that stops working or maybe libCEC or maybe some software inside/related to Kodi. So after the failure i started cec-client and I confirmed that CEC is working perfectly fine - see cec-client-test - cec-client-test - Pastebin.com

I also found this abandoned post on OpenELEC forum: http://openelec.tv/forum/124-raspberry-pi/71270-4-0-1-cec-thread-cecadapterupdate-terminating

It describes exactly the same issue I am facing. Apparently there was no answer to this problem, or maybe it just goes away in future release. The problem was seen when updating from OpenELEC 3.x to 4.x.

I thought, lets give it a shot. I flashed brand new OpenELEC-RPi2.arm-5.0.3.img.
I am testing it for few minutes right now, so far so good. Iā€™ll give it a shot for few hours, maybe days and I report my findings whenever it works properly or not.

Still if someone have some advice for me, how can I test it and make it work properly on osmc - I still hope that it can be done - please let me know - Iā€™ll appreciate any advice/suggestion.

Cheers

Fixed xbmc-send now. Available in the next release.

Re. CEC: These are the only changes since Raspbmc: https://github.com/samnazarko/osmc/blob/master/package/libcec-osmc/patches/rbp-000-add-popcornmix-enhancements.patch

I will CC @popcornmix to look.
Although I am doubtful this patch has caused the issue

Sam

I am not a programmer so it is hard for me to say. It also may be kernel or firmware related. Lets just wait for now to see if the same happens on openelec
OpenELEC:~/.kodi/temp # uname -a
Linux OpenELEC 3.18.6 #1 SMP PREEMPT Tue Feb 10 19:26:14 CET 2015 armv7l GNU/Linux

TIA for bringing it to popcornmix attention.

OpenELEC is running ~ 20 hours. No issues with CEC. I guess there have to be some difference in firmware/drivers/libraries that is causing this issue on OSMC.
@sam_nazarko Is there any possibility, that I could test those (libCEC), or some other libraries that are working on raspbmc somehow on osmc? Maybe it is just a matter of replacing some library files. If you can point me which ones, I can try to do that, so we can confirm if there is a problem with them or not. After all hardware is the same, so it is probably doable. What do you think?

@sam_nazarko I found the root cause of the problem. Hmmm maybe not the ā€œrootā€ but I identified the exact circumstances when this issue happens.

This is what I did. I disabled every service except ssh (I just deleted almost every link in /etc/systemd). So I end up with not usable system without kodi or anything really displayed on the TV.

I tested all of it using only cec-client command - no kodi or any other software involved or running.

Next step was disabling every service in /etc/rcX - one at a time.

Finally I identified the service that is causing the problem - NTP!!!

It seems that after clock synchronization, there is about 2 minutes window - I donā€™t know why 2 minutes - And after 2 minutes CEC stops working.

Iā€™ve just confirmed it. After disabling (deleting S01ntp) from /etc/rc5.d. Everything is working properly. After several hours, my remote is working and I see no issues with CEC.

I confirmed that after even manually syncing time with ntpdate - after 2 minutes CEC stops working.

Now, I donā€™t have time and energy to post all my test results right now - I am to sleepy, but I promise that I would post every my finding tomorrow (or is it today :).

This is definitely software fault and I can imagine that the @popcornmix patches can cause this, because last I checked there was many timing related entries in that patch - though as I wrote before - I am not a programmer so it is really hard for me to say.

Just please be aware that there is something wrong with the CEC libraries/firmware in current release (updated or not). Maybe it is Sharp TV specific, but I doubt it. I think that presenting this release to vast audience may reveal many more problems like mine.

I can see to ways of dealing with this:

  1. the hard way - Iā€™ll disable ntp and try to implement some kind of time sync and then restart of mediacenter evey now and then.
  2. The right way - fixing the libraries/firmware which is definitely broken right now

What do you think?
As I wrote, Iā€™ll try to describe all my tests procedures and present logs today which can definitely point to ntp as the trigger of the issue.

Maybe you have some answers from popcornmix yet?

@sam_nazarko: As promised here is detailed explanation and steps to replicate the issue:

Right now I am confident that the time change is causing this problem, not the ntpd itself. Here is simple testing procedure:
cec-test

Just paste it directly to your terminal at:
osmc@yourhost:~$

It should present following output:
cec-error

At least it does when my Sharp TV is connected.

Runing the same code on raspbmc is totally clean and undisruptive:
raspbmc-ok

Warning: The above script is disrupting CEC communication from kodi, so after running it, you have to run:

sudo systemctl restart mediacenter

or

sudo shutdown -r now

Explanation:
when changing time back 10,15 and 20 seconds, exactly after 10,15,20 seconds CEC suddenly disconnects.
Now while cec-client has auto reconnect built in, Kodi has not (maybe it should have?) and that is why remote becomes unresponding till next shutdown/restart.

Sam If any of this is unclear please let me know, Iā€™ll try to elaborate as much as I can.

1 Like

Try this

sudo apt-get update
sudo apt-get install fake-hwclock```

It didnā€™t help.
I investigate a little deeper though and finnally found out what causes time shifts. It turned out that one stupid application - I suspect connman - is doing independent ntp queries - even when ntpd is disabled, and it is asking default gw/dhcp server for time.

I checked and that particular UNCONFIGURABLE ntp service on that network device (unfortunatelly I cannot change it) is giving wrong time to the osmc. Then OSMC synchronize time with standard debian pool using ntpd and that is why time on the OS is going back and forth. I blocked on fw the ntp communication with the problematic router and - no more time shifts (at least for now).
So with a little workaround I fixed issues with time, which I hope will prevent issues with CEC from reoccuring, but I strongly recommend:

There is something wrong with libCEC and time shifts shouldnā€™t make it reset/brake/shutdown/reconnect. When I have time Iā€™ll look again at the popcornmix patch and Iā€™ll try to find something that can cause this. Now at least I know exactly what I am looking for: I suspect that there can be some counter and variable that is for example dividing by zero when time loops.

EDIT: I looked at the diff but I couldnā€™t find anything even remotely looking like an error causing instruction.
@sam_nazarko: I there any possibility that you can provide those libCEC files without those patches from after the raspbmc, so I can test it just to be sure?

I find it weird there is something wrong with libcec when I donā€™t see anybody replying here apart from sam and miappaā€¦
Iā€™m only using cec for 2 years now ā€¦ apart from one or two breaks it never had any flaws

FWIW Iā€™m following this thread because I have the same issue. (pi 2, osmc, cec works for a few minutes then stops, etc.)

Happy to supply any stats you need from my machine.

Forward time jumps are generally (somewhat) harmless as they cause timers to expire early and code that was supposed to wait for a delay to execute immediately, but backwards time jumps can cause timers to (effectively) never expire, thus code will hang/freeze waiting for a timed event that never happensā€¦

(For example if you are waiting for a 10 second delay by testing for current time + 10 seconds, if the time goes backwards by 40 years during that 10 second period your code will now wait for 40 years before thinking itā€™s 10 second delay is upā€¦)

If the time is jumping backwards a large amount of time (Iā€™m guessing itā€™s 12 hours or more) then that will cause issues with a lot of software. Short of removing all dependencies on system time from the code itā€™s not practical to guard against backwards time jumps.

The real issue is that your time is jumping backwards and forwards. Connman does indeed have a built in ntp client and while itā€™s easy to add additional ntp servers to itā€™s list, itā€™s not clear how that built in ntp client can be disabled without recompiling from source leaving out the ntp functionality completely.

As we are planning to build our own version of connman we will look at disabling the built in ntp client as we already have a separate ntp client running.

In current version of OSMC libraries going back 8+ seconds causes CEC to drop connection. It was not the case with raspbmc and in my opinion it is a bug that may cause other issues in the future if left unfixed.

There is very easy test procedure cec-test. If you run it youā€™ll probably see what I am talking about.

It will not mess with your installation in any way. Just restart of kodi is required after it is done.

if you donā€™t have it already, install cron, tcpdump and ntpdate:

sudo apt-get -y install cron ntpdate tcpdump

then run :

crontab -e

and add following at the end:

* * * * * /usr/sbin/ntpdate -q 128.4.24.98 >> /home/osmc/ntpdateq  

save it (Ctrl+x, then Y)

then restart your box and when the remote (CEC) is running correctly run

sudo tcpdump -i eth0 -e -v -n port 123

When cec stops working, copy the tcpdump output and the content of /home/osmc/ntpdateq and paste it here or on pastebin.com and post the link

EDIT: make sure you log in as quickly as possible and run the above command. It is best to run it while there is no Kodi menu screen visible yet. If you log in to late we may not get the result we want and it will be much more complicated to diagnose the problem.

1 Like

Will do. (It might take me a day or 2)

Edit: Well of course now itā€™s not crashing (yet?). :slight_smile:

@rmgn I am afraid that we are looking at different issue here. In my case it was totally repeatable. Please please try to get the logs though