Can ping RPi, but can't SSH into it

If you want to upload a text file that’s under 10 MB, run paste-log <filename>.

While NextPVR is quite likely to be the culprit, I still think it would be useful to switch it off each night, as already discussed.

Since we think that it’s likely to be a memory-related problem, once you’re satisfied that the problem is with NextPVR, you could run a cron job each minute that reports on system memory usage. That will help us to see if the out-of-memory problem occurs after a slow build-up or if it occurs very quickly, such as you might get when the EPG info is downloaded.
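A minimal sketch of such a job, assuming a timestamped free -m reading appended to a file once a minute is all that’s needed (the log path is only a suggestion, not anything OSMC-specific):

    # entry added with: crontab -e
    * * * * * (date; free -m) >> /home/osmc/memlog.txt

After the next lockup, the tail of that file should show whether the available figure collapsed suddenly or drained away slowly.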

Thanks for the help. As you know, my Linux skills are very limited, so I was unaware of the paste-log command. The latest (Kodi) log file is about 2 MB. It’s here: paste.osmc.tv/wapucaxawi

I think you’ll see that this log file is dominated by messages that it could not connect to the server. I don’t understand that, because my network was up at that time and all indications were that my server was up and running as well. Of course, I might have missed something.

I’ll switch NextPVR off every night and we’ll see what happens.

Cron is something I don’t understand, so it will take me a while to come up to speed. I know I have to install cron, then write a script and put it somewhere, then add a crontab entry that points to that script. I know the big idea, but the devil is always in the details for me!

I had been making a habit of running free -m a couple times a day when I stopped restarting OSMC. Here’s the latest result of that:

osmc@osmc:~$ free -m
              total        used        free      shared  buff/cache   available
Mem:            746         234         176           5         335         553
Swap:             0           0           0

The last time I ran it before it locked up gave this:

              total        used        free      shared  buff/cache   available
Mem:            746         662          30           5          53          31
Swap:             0           0           0

It was in the above range for at least 24 hours, so I’d conjecture that something “bad” happens, things calm down for a while, then something bad happens again and it locks up. But you see my amateurism showing. :blush:

Thanks again for the help.

We/I can step you through the process of getting a cron job running if/when the need arises.

The first set of free figures are a bit strange, since available memory is usually greater than free memory – but perhaps it’s just a copy/paste issue and should be 53x.

The second set of free figures show that the machine is right on the limits, with only 31 MB estimated to be available. (It’s a guesstimate, so could potentially be even less in reality.) One EPG and you’re toast. :slight_smile:

You’re right. It should have been 553. I messed up entering the second set of numbers and inadvertently changed that number. I corrected it above.

This type of behavior hasn’t happened to me before with NextPVR. I am using their newish v5 and I did recently begin using comskip (I probably noted that above), and it seems like this bad stuff started happening around then, but again we can’t be sure.

I think I see where we’re headed: the proximate cause of the lockup (e.g., an EPG download) may not be the real cause, which we’ve yet to determine. What if we started fresh (with a reboot or restart of OSMC), then collected and posted the full log file documenting everything that happened between the fresh start and the lockup? I’ll go ahead and keep debug logging on until it locks up again, if it does with NextPVR switched off nightly.

Thanks very much for your support.


Perhaps, as a last resort, you could try to capture the kernel messages from the period between boot and the freeze by activating persistent kernel logs and provide them here, though there’s no guarantee this will help:

Unfortunately, we need kernel messages from previous boots, which are disabled by default on OSMC. To activate and provide such information, please follow the steps below:

  1. login via SSH to the OSMC device, user osmc, password osmc
  2. cd /var/log
  3. sudo mkdir journal
  4. (from now, kernel messages are written to new directories for every boot)
  5. sudo shutdown -r now
  6. now wait for the issue/event which is the problem of this topic
  7. once it happens again and you are forced to reboot the OSMC device, or it rebooted automatically, you have to identify the right kernel message log:
    7.a) login via SSH and invoke
    sudo journalctl --list-boots --no-pager
    7.b) the lines start with an index id like 0, -1, -2, etc. and contain the date and time when the log was started (see the worked example after these steps)
  8. upload the appropriate kernel log using
    sudo journalctl -k -b <identified index> --no-pager|paste-log
    (replace <identified index> with the real index id, see above)
  9. also, upload the appropriate full log using
    sudo journalctl -b <identified index> --no-pager|paste-log
    (replace <identified index> with the real index id, see above)
  10. provide the returned URLs here
  11. don’t forget to remove the created journal directory, otherwise your system’s root file system will gradually fill up
    11.a) login via SSH
    11.b) cd /var/log
    11.c) sudo rm -R -f journal && sudo reboot (repeat this line if you get a ‘cannot remove’ error, until it works and your SSH connection is dropped by the reboot)
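For illustration, assuming the freeze happened during the previous boot (index -1), steps 7 to 9 would look something like this:

    sudo journalctl --list-boots --no-pager            # pick the boot whose time range covers the freeze
    sudo journalctl -k -b -1 --no-pager | paste-log    # kernel messages from that boot
    sudo journalctl -b -1 --no-pager | paste-log       # full journal from that boot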

@JimKnopf
Thank you for the advice. I can do this, but I wonder how I can monitor the size of the root filesystem. df -h currently produces the following:

osmc@osmc:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        368M     0  368M   0% /dev
tmpfs           374M  5.1M  369M   2% /run
/dev/mmcblk0p2   29G  1.2G   27G   5% /
tmpfs           374M     0  374M   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           374M     0  374M   0% /sys/fs/cgroup
/dev/mmcblk0p1  316M   29M  287M  10% /boot
tmpfs            75M     0   75M   0% /run/user/1000

@dillthedog
I disabled NextPVR last night at about 22:00. free -m currently produces the following:

osmc@osmc:~$ free -m
              total        used        free      shared  buff/cache   available
Mem:            746         271         121           5         352         415
Swap:             0           0           0

My plan is to enable NextPVR at around 18:00, watch a couple programs, then disable it again. It could take a very long time with this plan for the Pi to lock up again.

Thank you both again.

Very long time? Didn’t it take 2 days to lock up last time? I note that in post #9 you mention it was hanging “about once per week”, but if you reach 3-4 days without it locking up when NextPVR is disabled each night, then you’re probably ready to proceed to the next phase.

Sorry for not being clear.

Initially it seemed to be hanging about once per week. When I enabled debug logging and began to restart OSMC each night to manage log file size, it went two weeks and didn’t hang at all. At that point, I decided to let it run without restarting OSMC and it hung in two days, then hung again the very next day. That led to my most recent log posts here.

My thinking behind “very long time” is that if something about NextPVR is the culprit, then it won’t hang unless NextPVR is running. The approach we’re using now runs NextPVR a minimum amount of time, so it could take a “very” long time to hang – or so it would seem to me. Maybe my logic is faulty here?

On the other hand, if it does hang quickly without restarts with NextPVR disabled most of the time, then perhaps something else is the culprit.

Thanks!

Thanks for clarifying.

I agree with this bit.

While absence of evidence cannot be taken 100% to mean evidence of absence, if it doesn’t lock up after 3-4 days, I’d suggest that this can be taken as being an “indicative” pointer. You’re of course free to extend the test period, if you wish.

Understood and agreed. After 3-4 days if it doesn’t lock up, let’s proceed to the next phase.

Well, my Pi hung last night, 2/12/2021 around 22:00, after running since 2/10/2021 at 9:20 – about 3 days.

TL;DR: I don’t think the problem is being caused by NextPVR. The Pi seems to be leaking memory at the same rate whether or not NextPVR is enabled. More detail follows.

During the time I was evaluating the impact of NextPVR, I disabled NextPVR for most of the day, enabling it only for about 3-4 hours each day to watch some recorded TV. The pattern was to enable NextPVR around 18:00, then disable it around 22:00 each day. I ran free -m right before enabling NextPVR and right after disabling it, and once again in the morning when it was disabled and had been disabled all night.

My hypothesis going in was that I would see a drop in “available” memory after enabling NextPVR, and near-constant available memory while NextPVR was disabled. That was not the case. Instead, I saw constant dropping of available memory throughout the period, independent of whether NextPVR was enabled or not. The following chart shows results of free -m during the period:

Date        Time   Total  Used  Free  Shared  Buff/Cache  Avail  Notes
2/10/2021   09:20    746   211   212       5         323    477  After disable NPVR and restart OSMC
2/10/2021   18:30    746   301    91       5         353    386  Before enable NPVR
2/10/2021   22:00    746   348   112       5         285    338  After disable NPVR
2/11/2021   09:40    746   443    56       5         247    246  NPVR still disabled
2/11/2021   18:10    746   467    27       5         251    222  Before enable NPVR
2/11/2021   21:45    746   524    49       5         173    166  After disable NPVR
2/12/2021   10:00    746   576    28       5         141    114  NPVR still disabled
2/12/2021   19:00    746   602    29       9         114     83  Before enable NPVR

(Swap was 0 total, 0 used, 0 free at every observation.)

Apologies for the sloppy display of the table. I’m missing something on how to do that correctly.

In this table, the difference in available memory between “Before enable NPVR” and “After disable NPVR” on the same day represents the change in available memory while NPVR was running. The difference between “Before enable NPVR” on a day and “After disable NPVR” on the previous day represents the change in available memory while the OSMC/Pi was idle with NPVR disabled. There were two observations of each. The drops while NPVR was running were 48 MB and 56 MB. The drops while the OSMC/Pi was idle and NPVR was disabled were 116 MB and 83 MB. The totals were higher when the OSMC/Pi was idle, but the time period was much longer. I conclude from this that NPVR is not responsible for the memory leakage I’m experiencing, but I’m probably missing something. In total my Pi leaked almost 400 MB of memory in just under 3 days.
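Spelling the arithmetic out from the “Avail” column of the table above:

    NPVR running,  2/10 18:30 → 22:00:       386 → 338  (drop of 48 MB)
    NPVR running,  2/11 18:10 → 21:45:       222 → 166  (drop of 56 MB)
    NPVR disabled, 2/10 22:00 → 2/11 18:10:  338 → 222  (drop of 116 MB)
    NPVR disabled, 2/11 21:45 → 2/12 19:00:  166 → 83   (drop of 83 MB)
    Overall,       2/10 09:20 → 2/12 19:00:  477 → 83   (drop of 394 MB)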

I have Kodi debug logs for most of this period. I’ll post them if there is any interest. I didn’t try to save off kernel messages during this period as @JimKnopf advised. I will do so if that’s needed for a next iteration. Alternatively or in addition, I can post up full logs from “My OSMC” before the Pi runs out of memory.

I’ve been experimenting with a second possible cause of the memory leakage – the skin helper service. I installed and enabled that a while back because I had designs on cleaning up the display of movies in my library. I didn’t manage to do that and don’t care to now, but I forgot to disable or uninstall the skin helper service. I’m wondering if that might be the problem. I’ve disabled it and I’m collecting results of free -m again. So far, results are tentative and inconclusive. I’ll post the results I obtain in a couple of days, or whenever my Pi runs out of memory again. In the meantime, I’m open to any ideas, thoughts, and suggestions.

Thanks again for the support.

Good work. Don’t worry about the formatting; I can see that the available memory is progressively shrinking.

  1. The reduction in available memory while NextPVR is disabled is interesting. We all assume that NextPVR is completely inactive once disabled, but perhaps that’s not the case. It might be worth restarting Kodi (systemctl restart mediacenter) once the NextPVR add-on has been disabled. This should mean that any memory it was using has been released. Running free -m before and after the restart should show an increase in available memory.

  2. We’re assuming that it’s the Kodi process that’s chewing up the memory. That’s probably true, but it needs to be confirmed. Run:

    ps aux --sort -rss | head -3

    to get the top two memory users on the system whenever you run free -m (a combined sketch follows below).
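Putting the two together, a minimal one-liner you could run at each observation point (or from a cron job), assuming you want everything appended to a single file for comparison (the path is just a suggestion):

    (date; free -m; ps aux --sort -rss | head -3) >> /home/osmc/memlog.txt

That gives a timestamped history of overall memory alongside the biggest consumers at each point.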

OK. Thanks for the response. I’ll start a new data collection effort today with a fresh restart of OSMC and then restart OSMC each time I disable NextPVR. I’ll also run ps aux as per your recommendation each time I run free -m. Just for grins, I ran each command just now and got the following results:

Mem:            746         382          63           5         299         305
Swap:             0           0           0

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
osmc       466 21.8 32.5 684136 248680 ?       Rl   Feb12 500:58 /usr/lib/kodi/kodi.bin --standalone -fs
root       201  0.0  0.9  15528  7100 ?        Ss   Feb12   0:02 /lib/systemd/systemd-journald

I’m not sure how to interpret this, but I wanted to ask whether this is what we’re looking for before I begin my efforts.

Also, NextPVR is running right now, and I’ve uninstalled the skin helper service. As far as I can tell, NextPVR is now the only significant addon I’m running.

Thanks again. I appreciate your help very much!

That ps aux output looks good. We’re interested in the RSS figure.

Noted.

Hello again,

My system (and my house) was down for 4 days due to the widespread power outages here in Texas, so I was unable to collect data per our ongoing discussion. I’m back online now and have a few observations that might be useful. I’ll continue to collect data in case more is necessary.

I’ve posted the data I’ve collected as “pictures” to improve formatting. Below are the results of the free command. The “NextPVR” column notes whether NextPVR was enabled or disabled when the command was run. Each time that column switches from Enabled to Disabled, I also restarted OSMC. I’m seeing that memory is being chewed up when NextPVR has been disabled, even after OSMC has been restarted.

Here is a picture of the corresponding runs of ps aux.

I realize there’s not much data here, but I’m primarily interested now in whether I’m collecting sufficient data to diagnose the problem. Also, I continue to run debug logs, but we’re not using them. I wonder if debug logging is consuming memory? Should I turn it off?

Thanks again for all the help!

So you’re in Texas! I do hope all’s well with you all after your recent weather-related troubles.

The figures you’ve provided are very interesting.

  1. From 07:32 to 18:00, when NextPVR is disabled, the VSZ and RSS are totally unchanged and, while the available memory diminishes, the change is very small. That’s different from your findings in post #53, where you hadn’t restarted Kodi.
  2. At 20:42, when you enable NextPVR, there is a large increase in memory use, which increases significantly in the next four minutes (to 20:46).
  3. When you disable NextPVR and restart Kodi, at 23:06, VSZ and RSS drop but have increased again by 08:37, though the numbers remain below those of the period from 07:32 to 18:00. It might be that the VSZ/RSS figures eventually stabilize and reach a ceiling close to those for the period 07:32 to 18:00. We’ll possibly know a bit later.

Since you’re running on a Pi, you have the option of putting a fresh installation on a separate SD card. You can always return to the original SD card at any time.

Thanks again for the support.

I’ll continue collecting this data for a while to see if things stabilize and whether we can get some useful insights. For now, it seems like something unrelated to NextPVR is chewing up memory.

Since the debug logs are not producing any helpful insights, I’ve stopped collecting them. I made the change to advancedsettings.xml and restarted OSMC, collecting several observations as I did. Although I thought free memory would increase as a result, it didn’t; it declined into the 485 range.

I will try a new installation whenever I can actually buy an SD card. The installation I’m running is actually pretty new, but I guess it’s possible that I screwed something up and that’s what’s causing my memory leak.

I found this old thread searching for “osmc memory leak”. I did not see a resolution there, so, rather than revive that thread, I thought I’d try to connect it with this one in case there might be anything useful to be gained from it.

Each case is likely to be different. I could find very little detail in that thread.

@FrogFan: You’ve invested so much time to get this far … can you find a little more to post a kernel log from such a hang, as requested?