Unable to play certain video files properly

I occasionally download short videos from X, and my new Vero V cannot properly play certain files originating from Instagram or TikTok.

They play fine on my laptop and on the video player of my video projector (running some older Android TV), but on my Vero V they show interruptions and video/audio going out of sync (audio always too fast or video too slow).

I have already played around with various settings in the player menu, but nothing helped.

I have uploaded such a video file here so you can try for yourself.

Any idea what the problem may be?

Can you provide the actual X link?

Here you go.

I downloaded it using yt-dlp. Other videos downloaded that way work without problems, except, it seems, content originating from Instagram and TikTok.

Not forgotten. Will try and check by end of the week.

Can you also confirm the version you are using? I will test against:

root@osmc:/home/osmc# ./yt-dlp_linux_armv7l --version
2025.02.19

Sam

I tested your file and had no issues.
I also downloaded that video from your link using other methods (Chrome plugins like dwhelper, Stream Recorder, etc.).
They all played fine.
Maybe I’m missing it though.
Can you be more specific about what happens in that video and at what timestamp?

I always use the latest stable version of yt-dlp, so yes, currently that’s 2025.02.19. However, the problem has nothing to do with the yt-dlp version; it is about the source video/audio format.

Did you download the file that I uploaded via the link in my initial post? If you downloaded the video from that Twitter/X link using a method other than yt-dlp, maybe it has been transcoded? And did you test playing it on a Vero V?

I can see no possibility that the described problem has anything to do with my setup. A lot of files I download from Twitter/X using yt-dlp work just fine, but any that seemingly originate from TikTok and/or are shot in portrait mode (so unusual video dimensions) don’t. In those cases video and audio get badly out of sync, and the video resets at intervals, likely to resync with the audio that’s running too fast.

In the specific case of the video with the girl dancing and the guy in the background getting a slap:

The downloaded video’s data (I do not use any yt-dlp option or transcoding):

File: Slap.mp4
Length: 00:00:20
Video Resolution: 720 x 1280
Video Codec: H.264 (High Profile)
Frames per second: 30
Audio Codec: MPEG-4 AAC
Rate: 48000 Hz
Channels: 2

The file itself is OK, it plays fine on my laptop and the video player of my projector (running on Android TV 10), but not on the Vero V.

During some testing I have converted this video file using Handbrake as follows:

File: Slap.mkv
Length: 00:00:20
Video Resolution: 720 x 1080
Video Codec: H.265 (Main Profile)
Frames per second: 30
Audio Codec: MPEG-4 AAC
Rate: 48000 Hz
Channels: 2

That file does better on the Vero V, at least no interruptions / video resets due to loss of A/V sync, but it’s still not OK because the video ends about a second early (so the video ends before the girl reaches for the glass). Again, that file plays fine and till the very end on my laptop and the player of the projector as well.
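To compare the two files beyond the summaries above, the per-stream start times and durations can be read with ffprobe. This is only a minimal sketch, assuming ffprobe (shipped with ffmpeg) is installed; the FFPROBE variable and the stream_timing helper are names made up for this example and do not come from the thread:

```shell
# Hypothetical helper: print codec type, start time and duration of every
# stream in a file, so video and audio timing can be compared side by side.
FFPROBE=${FFPROBE:-ffprobe}   # override if ffprobe lives elsewhere

stream_timing() {
    "$FFPROBE" -v error \
        -show_entries stream=codec_type,start_time,duration \
        -of csv=p=0 "$1"
}

# Usage (file names as in the post above):
#   stream_timing Slap.mp4
#   stream_timing Slap.mkv
```

A noticeable difference between the video and audio start times or durations would be one thing worth reporting alongside the MediaInfo output.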

I will try to make a video of the exact behavior when playing that video on the Vero V using my mobile phone.

That’s probably to be expected.

Do you have ffmpeg installed on your system?

My output is:

osmc@osmc:~$ ./yt-dlp_linux_armv7l https://x.com/PicturesFoIder/status/1881033214367498274
[twitter] Extracting URL: https://x.com/PicturesFoIder/status/1881033214367498274
[twitter] 1881033214367498274: Downloading guest token
[twitter] 1881033214367498274: Downloading GraphQL JSON
[twitter] 1881033214367498274: Downloading m3u8 information
WARNING: ffmpeg not found. The downloaded format may not be the best available. Installing ffmpeg is strongly recommended: https://github.com/yt-dlp/yt-dlp#dependencies
[info] 1881033186316021760: Downloading 1 format(s): http-2176
[download] Destination: non aesthetic things - Guy is too caught up [1881033186316021760].mp4
[download] 100% of    3.85MiB in 00:00:00 at 8.91MiB/s

and the MediaInfo is:

General
Complete name                            : non aesthetic things - Guy is too caught up [1881033186316021760].mp4
Format                                   : MPEG-4
Format profile                           : Base Media
Codec ID                                 : isom (isom/iso4)
File size                                : 3.85 MiB
Duration                                 : 20 s 245 ms
Overall bit rate mode                    : Variable
Overall bit rate                         : 1 597 kb/s
Encoded date                             : UTC 2025-01-19 17:37:21
Tagged date                              : UTC 2025-01-19 17:37:21

Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : High@L3.1
Format settings                          : CABAC / 5 Ref Frames
Format settings, CABAC                   : Yes
Format settings, Reference frames        : 5 frames
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 20 s 200 ms
Bit rate                                 : 1 463 kb/s
Maximum bit rate                         : 1 834 kb/s
Width                                    : 720 pixels
Height                                   : 1 280 pixels
Display aspect ratio                     : 0.562
Frame rate mode                          : Constant
Frame rate                               : 30.000 FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.053
Stream size                              : 3.52 MiB (91%)
Title                                    : Twitter-vork muxer
Writing library                          : x264 core 164 r3095 baee400
Encoding settings                        : cabac=1 / ref=5 / deblock=1:0:0 / analyse=0x3:0x113 / me=hex / subme=2 / psy=0 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=0 / threads=4 / lookahead_threads=1 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / stitchable=1 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=infinite / keyint_min=30 / scenecut=0 / intra_refresh=0 / rc_lookahead=40 / rc=crf / mbtree=0 / crf=28.0 / qcomp=0.60 / qpmin=10 / qpmax=69 / qpstep=4 / vbv_maxrate=2048 / vbv_bufsize=2048 / crf_max=0.0 / nal_hrd=none / filler=0 / ip_ratio=1.40 / pb_ratio=1.30 / aq=2:1.00
Tagged date                              : UTC 2025-01-19 17:37:21
Codec configuration box                  : avcC

Audio
ID                                       : 2
Format                                   : AAC LC
Format/Info                              : Advanced Audio Codec Low Complexity
Codec ID                                 : mp4a-40-2
Duration                                 : 20 s 245 ms
Bit rate mode                            : Variable
Bit rate                                 : 128 kb/s
Maximum bit rate                         : 134 kb/s
Channel(s)                               : 2 channels
Channel layout                           : L R
Sampling rate                            : 48.0 kHz
Frame rate                               : 46.875 FPS (1024 SPF)
Compression mode                         : Lossy
Stream size                              : 316 KiB (8%)
Title                                    : Twitter-vork muxer
Default                                  : Yes
Alternate group                          : 1
Tagged date                              : UTC 2025-01-19 17:37:21
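As a sanity check on the numbers in this report: the audio "frame rate" is simply the sampling rate divided by the samples per frame, and the audio track is slightly longer than the video track. A small sketch of that arithmetic, assuming awk is available (the helper names are made up for illustration):

```shell
# Audio frames per second = sampling rate / samples per frame.
aac_frame_rate() {
    awk -v rate="$1" -v spf="$2" 'BEGIN { printf "%.3f\n", rate / spf }'
}

# Difference between two durations given in seconds.
duration_gap() {
    awk -v a="$1" -v v="$2" 'BEGIN { printf "%.3f\n", a - v }'
}

aac_frame_rate 48000 1024    # sampling rate and SPF from the Audio section
duration_gap 20.245 20.200   # audio duration minus video duration
```

This prints 46.875 and 0.045, so the audio track runs about 45 ms longer than the video track in this file.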

Yes.
That’s the first thing I tested.

General
Complete name                            : J:\Demo\non aesthetic things on X - Guy is too caught up\Slap.mp4
Format                                   : MPEG-4
Format profile                           : Base Media
Codec ID                                 : isom (isom/iso2/avc1/mp41)
File size                                : 3.86 MiB
Duration                                 : 20 s 246 ms
Overall bit rate                         : 1 597 kb/s
Frame rate                               : 30.000 FPS
Writing application                      : Lavf59.27.100

Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : High@L3.1
Format settings                          : CABAC / 5 Ref Frames
Format settings, CABAC                   : Yes
Format settings, Reference frames        : 5 frames
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 20 s 200 ms
Bit rate                                 : 1 463 kb/s
Width                                    : 720 pixels
Height                                   : 1 280 pixels
Display aspect ratio                     : 0.562
Frame rate mode                          : Constant
Frame rate                               : 30.000 FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.053
Stream size                              : 3.52 MiB (91%)
Title                                    : Twitter-vork muxer
Writing library                          : x264 core 164 r3095 baee400
Encoding settings                        : cabac=1 / ref=5 / deblock=1:0:0 / analyse=0x3:0x113 / me=hex / subme=2 / psy=0 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=0 / threads=4 / lookahead_threads=1 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / stitchable=1 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=infinite / keyint_min=30 / scenecut=0 / intra_refresh=0 / rc_lookahead=40 / rc=crf / mbtree=0 / crf=28.0 / qcomp=0.60 / qpmin=10 / qpmax=69 / qpstep=4 / vbv_maxrate=2048 / vbv_bufsize=2048 / crf_max=0.0 / nal_hrd=none / filler=0 / ip_ratio=1.40 / pb_ratio=1.30 / aq=2:1.00
Codec configuration box                  : avcC

Audio
ID                                       : 2
Format                                   : AAC LC
Format/Info                              : Advanced Audio Codec Low Complexity
Codec ID                                 : mp4a-40-2
Duration                                 : 20 s 246 ms
Bit rate mode                            : Constant
Bit rate                                 : 128 kb/s
Channel(s)                               : 2 channels
Channel layout                           : L R
Sampling rate                            : 48.0 kHz
Frame rate                               : 46.875 FPS (1024 SPF)
Compression mode                         : Lossy
Stream size                              : 316 KiB (8%)
Title                                    : Twitter-vork muxer
Default                                  : Yes
Alternate group                          : 1


Yes

Please describe the problem in more detail on this clip.
Maybe it’s something subtle I’m missing.
I didn’t see anything painfully obvious when I watched it.

For reference, here are the other files I downloaded myself using other methods to see if there was some difference I was missing.

General
Complete name                            : J:\Demo\non aesthetic things on X - Guy is too caught up\dwhelper.mp4
Format                                   : MPEG-4
Format profile                           : Base Media
Codec ID                                 : isom (isom/iso2/avc1/mp41)
File size                                : 3.86 MiB
Duration                                 : 20 s 246 ms
Overall bit rate mode                    : Variable
Overall bit rate                         : 1 597 kb/s
Frame rate                               : 30.000 FPS
Writing application                      : Lavf60.16.100

Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : High@L3.1
Format settings                          : CABAC / 5 Ref Frames
Format settings, CABAC                   : Yes
Format settings, Reference frames        : 5 frames
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 20 s 200 ms
Bit rate                                 : 1 463 kb/s
Width                                    : 720 pixels
Height                                   : 1 280 pixels
Display aspect ratio                     : 0.562
Frame rate mode                          : Constant
Frame rate                               : 30.000 FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.053
Stream size                              : 3.52 MiB (91%)
Title                                    : Twitter-vork muxer
Writing library                          : x264 core 164 r3095 baee400
Encoding settings                        : cabac=1 / ref=5 / deblock=1:0:0 / analyse=0x3:0x113 / me=hex / subme=2 / psy=0 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=0 / threads=4 / lookahead_threads=1 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / stitchable=1 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=infinite / keyint_min=30 / scenecut=0 / intra_refresh=0 / rc_lookahead=40 / rc=crf / mbtree=0 / crf=28.0 / qcomp=0.60 / qpmin=10 / qpmax=69 / qpstep=4 / vbv_maxrate=2048 / vbv_bufsize=2048 / crf_max=0.0 / nal_hrd=none / filler=0 / ip_ratio=1.40 / pb_ratio=1.30 / aq=2:1.00
Codec configuration box                  : avcC

Audio
ID                                       : 2
Format                                   : AAC LC
Format/Info                              : Advanced Audio Codec Low Complexity
Codec ID                                 : mp4a-40-2
Duration                                 : 20 s 246 ms
Bit rate mode                            : Variable
Bit rate                                 : 129 kb/s
Channel(s)                               : 2 channels
Channel layout                           : L R
Sampling rate                            : 48.0 kHz
Frame rate                               : 46.875 FPS (1024 SPF)
Compression mode                         : Lossy
Stream size                              : 316 KiB (8%)
Title                                    : Twitter-vork muxer
Default                                  : Yes
Alternate group                          : 1

And:

General
Complete name                            : J:\Demo\non aesthetic things on X - Guy is too caught up\Stream Recorder.mp4
Format                                   : MPEG-4
Format profile                           : Base Media / Version 2
Codec ID                                 : mp42 (isom/iso2/avc1/mp41/mp42)
File size                                : 3.84 MiB
Duration                                 : 20 s 246 ms
Overall bit rate                         : 1 593 kb/s
Frame rate                               : 30.000 FPS

Video
ID                                       : 1
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : High@L3.1
Format settings                          : CABAC / 5 Ref Frames
Format settings, CABAC                   : Yes
Format settings, Reference frames        : 5 frames
Codec ID                                 : avc1
Codec ID/Info                            : Advanced Video Coding
Duration                                 : 20 s 200 ms
Bit rate                                 : 1 463 kb/s
Width                                    : 720 pixels
Height                                   : 1 280 pixels
Display aspect ratio                     : 0.562
Frame rate mode                          : Constant
Frame rate                               : 30.000 FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.053
Stream size                              : 3.52 MiB (92%)
Title                                    : Twitter-vork muxer
Writing library                          : x264 core 164 r3095 baee400
Encoding settings                        : cabac=1 / ref=5 / deblock=1:0:0 / analyse=0x3:0x113 / me=hex / subme=2 / psy=0 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=0 / threads=4 / lookahead_threads=1 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / stitchable=1 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=infinite / keyint_min=30 / scenecut=0 / intra_refresh=0 / rc_lookahead=40 / rc=crf / mbtree=0 / crf=28.0 / qcomp=0.60 / qpmin=10 / qpmax=69 / qpstep=4 / vbv_maxrate=2048 / vbv_bufsize=2048 / crf_max=0.0 / nal_hrd=none / filler=0 / ip_ratio=1.40 / pb_ratio=1.30 / aq=2:1.00
Color range                              : Limited
Codec configuration box                  : avcC

Audio
ID                                       : 2
Format                                   : AAC LC
Format/Info                              : Advanced Audio Codec Low Complexity
Codec ID                                 : mp4a-40-2
Duration                                 : 20 s 246 ms
Bit rate mode                            : Constant
Bit rate                                 : 128 kb/s
Channel(s)                               : 2 channels
Channel layout                           : L R
Sampling rate                            : 48.0 kHz
Frame rate                               : 46.875 FPS (1024 SPF)
Compression mode                         : Lossy
Stream size                              : 316 KiB (8%)
Title                                    : Twitter-vork muxer
Default                                  : Yes
Alternate group                          : 1

Come to think of it, we’re probably not even on the same OSMC build since I am testing other things too.
So that might make some difference, maybe, but not likely.

Out of curiosity, could you try downloading it using another lossless method like the Chrome Stream Recorder plugin which can be found here: https://www.hlsloader.com/index.html
After looking closer now I noticed a slight hiccup, maybe a dropped frame, when she is first standing up before she starts dancing.
Also, you could maybe upload a video of the issue using your cell phone.

And here’s a report showing the difference between your file and what the Stream Recorder generates: Slap Difference Report

How come? I mean, why would it play to the very end on my laptop (using SMPlayer) and the player of the projector, but not on the Vero V?

Yes, ffmpeg version n5.1.2

Here the output of yt-dlp looks as follows:

$ yt-dlp https://x.com/PicturesFoIder/status/1881033214367498274
[twitter] Extracting URL: https://x.com/PicturesFoIder/status/1881033214367498274
[twitter] 1881033214367498274: Downloading guest token
[twitter] 1881033214367498274: Downloading GraphQL JSON
[twitter] 1881033214367498274: Downloading m3u8 information
[info] 1881033186316021760: Downloading 1 format(s): hls-1600+hls-audio-128000-Audio
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 7
[download] Destination: /mnt/bt/youtube-dl/non aesthetic things - Guy is too caught up.fhls-1600.mp4
[download] 100% of    3.53MiB in 00:00:04 at 847.53KiB/s

Looks identical here, except for the audio:

Duration                                 : 20 s 246 ms
Bit rate mode                            : Constant
Bit rate                                 : 128 kb/s

Well, the problem I am experiencing cannot be overlooked: A/V gets out of sync every few seconds, resulting in an interruption with a black screen when the player seemingly resyncs A/V, then playback continues and the same thing happens again. At the end the audio finishes prematurely (since audio is too fast or video too slow) and the video keeps going without audio until it’s over.

I’ll do that this evening.

I don’t have ffmpeg installed. I suspect yt-dlp is doing some unwanted conversion.

I think it’s the container.

It’s odd, because the output of yt-dlp does not suggest any conversion, which would have to take place after the download, though I can see that video and audio are downloaded separately and then muxed. The full output is as follows:

$ yt-dlp https://x.com/PicturesFoIder/status/1881033214367498274
[twitter] Extracting URL: https://x.com/PicturesFoIder/status/1881033214367498274
[twitter] 1881033214367498274: Downloading guest token
[twitter] 1881033214367498274: Downloading GraphQL JSON
[twitter] 1881033214367498274: Downloading m3u8 information
[info] 1881033186316021760: Downloading 1 format(s): hls-1600+hls-audio-128000-Audio
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 7
[download] Destination: /mnt/bt/youtube-dl/non aesthetic things - Guy is too caught up.fhls-1600.mp4
[download] 100% of    3.53MiB in 00:00:03 at 967.11KiB/s
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 7
[download] Destination: /mnt/bt/youtube-dl/non aesthetic things - Guy is too caught up.fhls-audio-128000-Audio.mp4
[download] 100% of  321.74KiB in 00:00:01 at 162.61KiB/s
[Merger] Merging formats into "/mnt/bt/youtube-dl/non aesthetic things - Guy is too caught up.mp4"
Deleting original file /mnt/bt/youtube-dl/non aesthetic things - Guy is too caught up.fhls-audio-128000-Audio.mp4 (pass -k to keep)
Deleting original file /mnt/bt/youtube-dl/non aesthetic things - Guy is too caught up.fhls-1600.mp4 (pass -k to keep)
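The log itself notes that the two intermediate files are deleted after merging unless -k is passed. Keeping them would make it possible to test the video-only and audio-only downloads separately from the merged MP4. A rough sketch, assuming yt-dlp is on the PATH; the YTDLP variable and the helper name are made up for this example:

```shell
YTDLP=${YTDLP:-yt-dlp}   # e.g. ./yt-dlp_linux_armv7l on an OSMC box

# Download as before, but keep the separate video and audio files (-k)
# in addition to the merged result.
download_keep_parts() {
    "$YTDLP" -k "$1"
}

# Usage:
#   download_keep_parts https://x.com/PicturesFoIder/status/1881033214367498274
```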

In that case no video downloaded that way would work properly, but it really is only affecting those that were created in portrait mode and/or originate from TikTok.

Unfortunately I forgot to make that video showing the problem with my mobile phone yesterday, I’ll try to do it this evening.

OK, so I have started downloading videos from Twitter/X using yt-dlp with Matroska as the output container (instead of Quicktime), and the main problem of A/V getting out of sync every few seconds is gone. However, about 1 second of each video is still missing at the beginning and the end.

@sam_nazarko You wrote that’s probably to be expected, but why? Why does this happen on the Vero V, but not on the player of the projector or on my laptop?

And nevertheless, how can those previous MP4 files cause the described problem on the Vero V, but not on the player of the projector or on my laptop?

All those files just play fine without interruptions and from the beginning till the end, just not on the Vero V.

I mean, the matter would be clear if the video files were to blame and wouldn’t play correctly on any device, but it drives me nuts that they play fine everywhere except on the Vero V.

P.S. Yeah, I forgot to record that video using my mobile phone again yesterday. Will try this evening.
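The post does not say which option was used to get Matroska output. One way that should work is yt-dlp's --merge-output-format option, which sets the container used when separately downloaded video and audio streams are merged. A minimal sketch under that assumption (YTDLP and the helper name are made up for this example):

```shell
YTDLP=${YTDLP:-yt-dlp}   # e.g. ./yt-dlp_linux_armv7l on an OSMC box

# Merge the downloaded video and audio streams into an .mkv file
# instead of the default .mp4. This remuxes; it does not re-encode.
download_as_mkv() {
    "$YTDLP" --merge-output-format mkv "$1"
}

# Usage:
#   download_as_mkv https://x.com/PicturesFoIder/status/1881033214367498274
```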

There might be something about them that isn’t playing nice with the hardware decoder. You might try playing them with software decoding by changing the settings under settings>player>videos>

It’s definitely going to be the HW decoder. I just haven’t ascertained if ffmpeg post processing is causing the issue.

I suggest either removing ffmpeg or (if possible) taking a less extreme approach and avoiding conversion. I am sure yt-dlp will have a flag to facilitate this.
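One candidate is explicit format selection. Without ffmpeg, the log earlier in the thread shows yt-dlp picking the single-file format http-2176, while with ffmpeg it picks separate HLS video and audio streams and merges them. Requesting the single-file format directly should avoid the merge step while leaving ffmpeg installed. This is a sketch only; whether it changes playback on the Vero V is untested here, and YTDLP plus the helper names are made up for the example:

```shell
YTDLP=${YTDLP:-yt-dlp}   # e.g. ./yt-dlp_linux_armv7l on an OSMC box

# List every format yt-dlp can see for a URL (-F lists only, no download).
list_formats() {
    "$YTDLP" -F "$1"
}

# Download one specific format ID (-f) so no separate streams need merging.
# Format IDs differ from video to video, so check list_formats first.
download_format() {
    "$YTDLP" -f "$1" "$2"
}

# Usage, with the format ID taken from the log in this thread:
#   list_formats https://x.com/PicturesFoIder/status/1881033214367498274
#   download_format http-2176 https://x.com/PicturesFoIder/status/1881033214367498274
```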

I have a feeling yt-dlp is doing something wrong with the container.
@i11 can you try this file and report back on whether it has the same issues?
I used the Chrome extension Stream Recorder to capture the file.
Alternatively, you can try the plugin yourself also.

Wow, it really was the hardware decoder! The setting for “h264” is set to “Always” by default, changing it to “HD and up” solved the issue. :smiley:

Though, why exactly is that? Is it a hardware fault? I mean, shouldn’t it all work with always using the hardware decoder for all codecs (so all decoder related settings set to “Always”)?

No, it more likely indicates that the encoding is bad and that the hardware decoder is simply stricter than the software decoder.
