Vero 4k+ freezes/slows to unusable after a few days

I’ve enjoyed my Vero 4k since April 2017 and recently upgraded to the Vero 4k+. I set up the new unit and enjoyed it for a few days, until suddenly it wasn’t available when I wanted to use it: black screen. I rebooted it, it worked again, and I used it some more. After a few days it locked up again, and this has repeated itself many times.

I started digging into this today. After a few minutes I was finally able to log in via ssh. Everything runs really slowly and the memory is mostly gone. kswapd0 uses most of the CPU. No process appears to use any memory: ps aux shows 0.0% memory usage for every process. I can’t kill kswapd0 since it’s a kernel thread.

free -m:

osmc@Osmc:~$ free -m
              total        used        free      shared  buff/cache   available
Mem:           1788        1647          44          88          96           8
Swap:             0           0           0

ps aux:

osmc@Osmc:~$ ps aux | sort -nk +5
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
osmc       576  7.3  0.0      0     0 ?        Zl   Mar16 540:48 [kodi.bin] <defunct>
root         2  0.0  0.0      0     0 ?        S    Mar16   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S    Mar16   0:51 [ksoftirqd/0]
root         7  0.0  0.0      0     0 ?        S    Mar16   6:18 [rcu_sched]
root         8  0.0  0.0      0     0 ?        S    Mar16   0:00 [rcu_bh]
root         9  0.0  0.0      0     0 ?        S    Mar16   0:00 [migration/0]
root        10  0.0  0.0      0     0 ?        S    Mar16   0:00 [migration/1]
root        11  0.0  0.0      0     0 ?        S    Mar16   0:31 [ksoftirqd/1]
root        13  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kworker/1:0H]
root        14  0.0  0.0      0     0 ?        S    Mar16   0:01 [migration/2]
root        15  0.0  0.0      0     0 ?        S    Mar16   0:24 [ksoftirqd/2]
root        17  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kworker/2:0H]
root        18  0.0  0.0      0     0 ?        S    Mar16   0:00 [migration/3]
root        19  0.0  0.0      0     0 ?        S    Mar16   0:17 [ksoftirqd/3]
root        21  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kworker/3:0H]
root        22  0.0  0.0      0     0 ?        S<   Mar16   0:00 [khelper]
root        23  0.0  0.0      0     0 ?        S    Mar16   0:00 [kdevtmpfs]
root        24  0.0  0.0      0     0 ?        S<   Mar16   0:00 [netns]
root        25  0.0  0.0      0     0 ?        S<   Mar16   0:00 [suspend]
root        26  0.0  0.0      0     0 ?        S<   Mar16   0:00 [writeback]
root        27  0.0  0.0      0     0 ?        S<   Mar16   0:00 [bioset]
root        28  0.0  0.0      0     0 ?        S<   Mar16   0:00 [crypto]
root        29  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kblockd]
root        30  0.0  0.0      0     0 ?        S    Mar16   0:00 [khubd]
root        34  0.0  0.0      0     0 ?        S<   Mar16   0:00 [devfreq_wq]
root        36  0.0  0.0      0     0 ?        S    Mar16   0:00 [gp_pll]
root        38  0.0  0.0      0     0 ?        S<   Mar16   0:00 [rpciod]
root        39 33.8  0.0      0     0 ?        R    Mar16 2505:58 [kswapd0]
root        40  0.0  0.0      0     0 ?        S    Mar16   0:00 [fsnotify_mark]
root        41  0.0  0.0      0     0 ?        S<   Mar16   0:00 [nfsiod]
root        42  0.0  0.0      0     0 ?        S<   Mar16   0:00 [cifsiod]
root        56  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kthrotld]
root        57  0.0  0.0      0     0 ?        S<   Mar16   0:00 [iscsi_eh]
root        58  0.0  0.0      0     0 ?        S<   Mar16   0:00 [eth_moniter_tx_]
root        59  0.0  0.0      0     0 ?        S<   Mar16   0:00 [stmmac_wq]
root        64  0.0  0.0      0     0 ?        S<   Mar16   0:00 [dm_bufio_cache]
root        65  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kmpathd]
root        66  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kmpath_handlerd]
root        67  0.0  0.0      0     0 ?        S    Mar16   0:42 [kthread_hdcp]
root        68  1.0  0.0      0     0 ?        S    Mar16  80:29 [irq/250-sd_emmc]
root        70  0.0  0.0      0     0 ?        S    Mar16   0:00 [irq/249-sd_emmc]
root        71  0.0  0.0      0     0 ?        S    Mar16   0:00 [irq/99-sd_in]
root        72  0.0  0.0      0     0 ?        S    Mar16   0:00 [irq/101-sd_out]
root        73  0.0  0.0      0     0 ?        S    Mar16   0:00 [irq/248-sd_emmc]
root        79  0.0  0.0      0     0 ?        S    Mar16   0:00 [vmalloc_ion]
root        80  0.0  0.0      0     0 ?        S    Mar16   0:00 [codec_mm_ion]
root        81  0.0  0.0      0     0 ?        S    Mar16   0:00 [carveout_ion]
root        82  2.8  0.0      0     0 ?        R    Mar16 210:29 [mmcqd/0]
root        84  0.0  0.0      0     0 ?        S    Mar16   0:00 [mmcqd/0boot0]
root        85  0.0  0.0      0     0 ?        S    Mar16   0:00 [mmcqd/0boot1]
root        86  0.0  0.0      0     0 ?        S    Mar16   0:00 [mmcqd/0rpmb]
root        87  0.0  0.0      0     0 ?        S    Mar16   0:00 [ge2d_monitor]
root        88  0.0  0.0      0     0 ?        S<   Mar16   0:00 [gpio_pwm_wq]
root        89  0.0  0.0      0     0 ?        S    Mar16   0:39 [irq/76-vdec-1]
root        90  0.1  0.0      0     0 ?        D    Mar16   8:14 [vdec-core]
root        91  0.0  0.0      0     0 ?        S    Mar16   0:00 [irq/35-vsync]
root        92  0.0  0.0      0     0 ?        S    Mar16   5:14 [kthread_di]
root        93  0.0  0.0      0     0 ?        S<   Mar16   0:00 [cec_work]
root        95  0.0  0.0      0     0 ?        S<   Mar16   0:00 [ipv6_addrconf]
root        96  0.0  0.0      0     0 ?        S<   Mar16   0:00 [deferwq]
root       115  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kdmflush]
root       116  0.0  0.0      0     0 ?        S<   Mar16   0:00 [bioset]
root       128  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kworker/1:1H]
root       131  0.0  0.0      0     0 ?        S    Mar16   0:02 [jbd2/dm-0-8]
root       132  0.0  0.0      0     0 ?        S<   Mar16   0:00 [ext4-rsv-conver]
root       184  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kworker/3:1H]
root       191  0.0  0.0      0     0 ?        S    Mar16   0:00 [rc0]
root       192  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kworker/u9:0]
root       261  0.0  0.0      0     0 ?        S<   Mar16   0:15 [kworker/2:1H]
root       357  0.0  0.0      0     0 ?        S<   Mar16   0:00 [cfg80211]
root       578  0.0  0.0      0     0 ?        S    Mar16   0:00 [wl_event_handle]
root       579  0.0  0.0      0     0 ?        S    Mar16   0:00 [dhd_watchdog_th]
root       580  0.0  0.0      0     0 ?        S    Mar16   0:00 [dhd_dpc]
root       581  0.0  0.0      0     0 ?        S    Mar16   0:00 [dhd_rxf]
root      1024  0.0  0.0      0     0 ?        S<   Mar21   0:00 [kworker/0:1H]
root      1068  0.0  0.0      0     0 ?        S    Mar17   0:00 [ppmgr]
root      1072  0.0  0.0      0     0 ?        ZN   Mar17   0:00 [sudo] <defunct>
root      1223  0.0  0.0      0     0 ?        S<   Mar21   0:00 [kworker/0:0H]
root      1821  0.2  0.0      0     0 ?        S    00:57   0:08 [kworker/1:0]
root      1848  0.1  0.0      0     0 ?        S    01:00   0:04 [kworker/2:0]
root      1997  0.0  0.0      0     0 ?        S    01:13   0:00 [kworker/u8:0]
root      2269  0.1  0.0      0     0 ?        S    01:52   0:01 [kworker/0:0]
root      2271  0.0  0.0      0     0 ?        S    01:53   0:00 [kworker/3:0]
root      2293  0.0  0.0      0     0 ?        S    01:58   0:00 [kworker/3:2]
root      2302  0.0  0.0      0     0 ?        S    02:01   0:00 [kworker/1:2]
root      2313  0.0  0.0      0     0 ?        S    02:02   0:00 [kworker/2:2]
root      2314  0.0  0.0      0     0 ?        S    02:02   0:00 [kworker/0:1]
root      2325  0.0  0.0      0     0 ?        S    02:06   0:00 [kworker/1:1]
root      2328  0.2  0.0      0     0 ?        S    02:07   0:00 [kworker/0:2]
root      2329  0.0  0.0      0     0 ?        S    02:07   0:00 [kworker/2:1]
root      2715  0.0  0.0      0     0 ?        S    Mar16   0:00 [irq/64-parser]
root      2716  0.0  0.0      0     0 ?        S<   Mar16   0:00 [threadrw]
root      5460  0.0  0.0      0     0 ?        S<   Mar16   0:00 [codec_mm_sc]
root      5465  0.0  0.0      0     0 ?        S<   Mar16   0:00 [kthread_h265]
root     28870  0.0  0.0      0     0 ?        S    Mar21   0:00 [kworker/u8:2]
root       390  0.0  0.0   1776   116 ?        Ss   Mar16   0:05 /usr/sbin/lircd --driver=default --device=/dev/lirc0 --uinput --output=/var/run/lirc/lircd-lirc0 --pidfile=/var/run/lirc/lircd-lirc0.pid /etc/lirc/lircd.conf
root       430  0.0  0.0   1820   108 ttyS0    Ss+  Mar16   0:00 /sbin/agetty --keep-baud 115200,38400,9600 ttyS0 vt220
root       382  0.0  0.0   2016   192 ?        Ss   Mar16   0:00 /usr/sbin/eventlircd --evmap=/etc/eventlircd.d --socket=/var/run/lirc/lircd --repeat-filter --release=_UP -f
root       419  0.0  0.0   2764   268 ?        Ss   Mar16   0:00 /bin/bash /usr/bin/mediacenter
osmc      2218  0.3  0.0   2884   400 pts/0    Ss   01:42   0:05 -bash
osmc      2326 10.3  0.0   5036   528 pts/0    R+   02:07   0:03 ps aux
avahi      380  0.0  0.0   5192   288 ?        S    Mar16   0:00 avahi-daemon: chroot helper
avahi      365  1.3  0.0   5324   408 ?        Ss   Mar16 102:50 avahi-daemon: running [Osmc.local]
message+   366  0.9  0.0   5448   540 ?        Ss   Mar16  71:36 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root       567  0.0  0.0   5644   332 ?        S    Mar16   0:00 sudo -u osmc /usr/lib/kodi/kodi.bin --standalone -fs --lircdev /var/run/lirc/lircd
root       203  0.0  0.0   5788   368 ?        Ss   Mar16   0:12 /sbin/rpcbind -f -w
root       374  0.2  0.0   6056   420 ?        Ss   Mar16  17:53 /lib/systemd/systemd-logind
ntp        909  0.3  0.0   6984   436 ?        Ssl  Mar16  27:30 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 105:107
osmc       594  0.0  0.0   7728   488 ?        Ss   Mar16   0:00 /usr/bin/udisks-glue --foreground
root       547  0.0  0.0   7756   380 ?        Ss   Mar16   0:00 /sbin/wpa_supplicant -u -s -O /run/wpa_supplicant
osmc      2327  0.3  0.0   7832    80 pts/0    S+   02:07   0:00 sort -nk +5
root       480  0.0  0.0   8488   536 ?        Ss   Mar16   0:00 /usr/sbin/sshd -D
root       375  0.0  0.0   8936   572 ?        Ss   Mar16   0:00 /usr/sbin/connmand -n --nodnsproxy --config=/etc/connman.conf
root      2169  1.1  0.0   9644   696 ?        Ss   01:36   0:21 sshd: osmc [priv]
osmc      2213  0.2  0.0   9776   740 ?        S    01:42   0:03 sshd: osmc@pts/0
root         1  2.6  0.0  25336  1036 ?        Ds   Mar16 193:53 /sbin/init
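
As a rough cross-check of the "no process uses any memory" observation, the RSS column of the listing above can be summed and compared with the "used" figure from free -m. The following is only a sketch in Python: the function name total_rss_kib is made up for illustration, and the three sample rows are copied from the listing, so the full ps output would need to be pasted in for a real check.

```python
def total_rss_kib(ps_output: str) -> int:
    """Sum the RSS column (6th field, in KiB) of `ps aux` output."""
    total = 0
    for line in ps_output.splitlines():
        fields = line.split(None, 10)
        # Skip the header, shell prompts, and anything without a numeric RSS.
        if len(fields) < 6 or not fields[5].isdigit():
            continue
        total += int(fields[5])
    return total


# Three rows copied from the listing above; paste the full output for a real check.
sample = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root        39 33.8  0.0      0     0 ?        R    Mar16 2505:58 [kswapd0]
root       480  0.0  0.0   8488   536 ?        Ss   Mar16   0:00 /usr/sbin/sshd -D
root         1  2.6  0.0  25336  1036 ?        Ds   Mar16 193:53 /sbin/init
"""

used_mib = 1647  # "used" column from the free -m output above
rss_mib = total_rss_kib(sample) / 1024
print(f"process RSS: {rss_mib:.1f} MiB of {used_mib} MiB used")
```

RSS counts shared pages once per process, so the sum is an upper bound on what the processes hold. If it comes out far below the "used" figure, the memory is presumably held somewhere ps does not attribute to a process.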

Note that I killed a few processes while trying to regain control and reboot this thing. No go, I can’t even reboot:

shutdown -r now:

Failed to set wall message, ignoring: Connection timed out
Failed to reboot system via logind: Connection timed out
Failed to open /dev/initctl: Permission denied
Failed to talk to init daemon.

sudo shutdown -r now doesn’t do anything:

osmc@Osmc:~$ sudo shutdown -r now
osmc@Osmc:~$

Now what?

Hi,

It’s hard to speculate without a full log.
When this happens again, can you try running grab-logs -A and send us the URL if it works?

Is the 4K set up the same way as the 4K+?

Sam

Will do. Just lost contact with the unit… Looks like I have to pull the power.

Simpler setup on the 4k+, fewer add-ons. The 4k ran like a charm for months on end.

My guess would be an add-on that is leaking memory, but it is hard to say.

Does sudo systemctl stop mediacenter help?
Can you post output of dmesg?

Assuming both devices are on the same version, could you move the existing ~/.kodi directory to ~/.kodi-backup (or archive it for yourself) and then copy ~/.kodi over from the other device in its entirety?

(Run sudo systemctl stop mediacenter on both devices first so the files are in a consistent state.)

I reformatted some of your original post to make it more readable.

Given that the system seems to have been up for around 5-6 days, the total CPU times for some processes look excessively high. A few examples:

avahi-daemon: 102 minutes 50 seconds
dbus-daemon: 71 minutes 36 seconds
/sbin/init: 193 minutes 53 seconds
systemd-logind: 17 minutes 53 seconds
ntpd: 27 minutes 30 seconds
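
To put these figures in perspective, the TIME column can be converted to seconds and divided by the %CPU value, which ps reports as CPU time used over the time the process has existed. That gives a rough estimate of how long the process has been around. A minimal sketch in Python, assuming the MMM:SS format that ps aux prints (the function names are made up for illustration):

```python
def cpu_time_seconds(time_field: str) -> int:
    """Convert a ps TIME field of the form MMM:SS into seconds."""
    minutes, seconds = time_field.split(":")
    return int(minutes) * 60 + int(seconds)


def estimated_elapsed_days(time_field: str, pcpu: float) -> float:
    """Estimate process lifetime in days from TIME and %CPU.

    ps computes %CPU as cputime / elapsed time, so
    elapsed = cputime / (%CPU / 100).
    """
    return cpu_time_seconds(time_field) / (pcpu / 100.0) / 86400.0


# Values copied from the ps aux listing in the original post.
for name, time_field, pcpu in [
    ("kswapd0", "2505:58", 33.8),
    ("/sbin/init", "193:53", 2.6),
]:
    days = estimated_elapsed_days(time_field, pcpu)
    print(f"{name}: {cpu_time_seconds(time_field)} s CPU, ~{days:.1f} days elapsed")
```

For kswapd0 this works out to a little over five days, which is consistent with the uptime estimate above. Since %CPU is only printed to one decimal place, the estimate is coarse for processes with low CPU usage.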

Also, kodi.bin has become a zombie process, possibly as a result of memory issues. All in all, it’s a bit of a mess.

If moving the .kodi directory, as suggested by Sam, makes no difference, your machine might have been compromised.

It’s running fine now, so there isn’t much to look for errors in. I’ve disabled the Apple TV screensaver-video-thingy since I found it to be a bit buggy and sluggish. We’ll see in a few days how it holds up.