Support for TV-led Dolby Vision on all devices

I have worked out what is required to enable TV-led Dolby Vision on any device and have a very basic proof of concept, which currently has some show-stopping limitations for any practical use. I did this testing in CoreELEC, as detailed in this thread. To save you going through the whole thread, the most relevant posts are here, here, here, and here.

In short, all that is needed to enable TV-led Dolby Vision is to embed the metadata in the top pixels of the 12-bit 4:2:2 signal that the device outputs, and to send the correct HDMI vendor-specific infoframe and AVI packet (these are known). The metadata to embed in these pixels can easily be formed from the metadata that dovi_tool extracts. The major problem I am stuck on is how to embed the required data into the top pixels that are actually output.

Given the concept is not CoreELEC specific, I am posting here for two reasons,

  1. To raise awareness, in case this info is useful to you guys in enabling TV-led DV on your devices.
  2. More selfishly ( :grinning:), in the hope that a developer here gets interested enough and is able to overcome the problems I am having in embedding the metadata properly, so that I can copy that work and get TV-led DV on my device.

You are doing some great work, there. If we knew how to embed those bits without a DV-enabled chip we would probably have done it by now.

Are you saying that you previously knew exactly what you needed to do for TV-led Dolby Vision, but just didn’t know how to get the bits embedded?

If so, that’s an interesting bit of information that I couldn’t find shared anywhere before…

ETSI GS CCM 001 may help you.

Thanks, but I’ve already read that. I can decode the metadata from captures and know how to put the metadata in (tested using a computer and a losslessly encoded video). I know what to do, just not how to make it practical.

Sorry I misunderstood the reason for your question. The answer then is yes, it’s just the how that’s missing. I’m told it may be impossible to write to the secured video buffer.

By who? and where is the secured part?

I can write on the fly to the decoded video stream if I set double_write_mode to 1. This results in an NV21 format that I can read and write to embed metadata on the fly, and I get working TV-led DV and responses to the embedded metadata, but unfortunately it only produces an 8-bit output.

The problems are that this only works for 16:9 videos, as the metadata needs to go into the top pixels that are transmitted, and that for 10-bit videos (i.e., actual use cases) the decoded video stream is in a compressed and scattered format that I can’t work out how to read, let alone edit on the fly.
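For anyone following along, here is a minimal sketch of how the top rows of an NV21 frame can be addressed. It assumes a single contiguous buffer in which the interleaved V/U plane directly follows the Y plane and shares its stride, with even width and height; the actual Amlogic canvases may place the planes separately, so treat `nv21_view`, `nv21_map`, `nv21_luma` and `nv21_vu` as hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view onto an NV21 frame: a full-resolution Y plane followed by
 * an interleaved V,U plane at half resolution in both directions.
 * Assumption: one contiguous buffer, both planes use the same stride. */
struct nv21_view {
    uint8_t *y;      /* start of the luma plane */
    uint8_t *vu;     /* start of the interleaved V,U plane */
    size_t stride;   /* bytes per row, at least width */
    unsigned width;
    unsigned height;
};

static struct nv21_view nv21_map(uint8_t *buf, unsigned width,
                                 unsigned height, size_t stride)
{
    struct nv21_view v;

    assert(buf != NULL && stride >= width);
    v.y = buf;
    v.vu = buf + stride * height;   /* chroma starts right after luma */
    v.stride = stride;
    v.width = width;
    v.height = height;
    return v;
}

/* Pointer to the luma sample at (x, y). */
static uint8_t *nv21_luma(const struct nv21_view *v, unsigned x, unsigned y)
{
    assert(x < v->width && y < v->height);
    return v->y + (size_t)y * v->stride + x;
}

/* Pointer to the V,U byte pair shared by the 2x2 luma block containing (x, y). */
static uint8_t *nv21_vu(const struct nv21_view *v, unsigned x, unsigned y)
{
    assert(x < v->width && y < v->height);
    return v->vu + (size_t)(y / 2) * v->stride + (size_t)(x / 2) * 2;
}
```

Each luma sample in the top row has its own byte, while each chroma pair is shared by a 2x2 block of luma samples, which is why per-sample control is much easier on the luma plane.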

There are devs that know more about these things than me. I suspect you are bumping up against AFBC. It could be the same issue that screws up screenshots for HEVC.

Is your code on github, maybe? It all sounds very encouraging.

Problems with AFBC are about as far as I got as well. I only got to where I am by noticing that Hyperion works for 10-bit videos and tracking down why (it changes the double write mode).

Not yet on GitHub.

If interested, the metadata injection is currently being done by adding the following:

injection code
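	if (inject_metadata){
		/* Swap between the two preformed metadata buffers on request. */
		if (metadata_toggle){
			metadata_select = !metadata_select;
			metadata_toggle = false;
			pr_info("  metadata_select: %d\n", metadata_select);
		}
		/* Start of the luma plane of the frame about to be displayed. */
		psrc = phys_to_virt(dispbuf->canvas0_config[0].phy_addr);
		line_start = psrc;
		/* Write the whole metadata sequence three times, back to back. */
		for(i=0; i < 3; i++){
			for(j=0; j < num_metadata_buffer; j++){
				if(metadata_select){
					injected_byte = (char) metadata_buffer2[j];
				} else {
					injected_byte = (char) metadata_buffer[j];
				}
				/* One luma sample per metadata bit, MSB first; each sample is set to 0 or 1. */
				for(k=0; k < 8; k++){
					line_start[i*num_metadata_buffer*8 + j*8 + k] = ((injected_byte) >> (7 - k)) & 1;
				}
			}
		}
	}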

before this line. The metadata buffers are module params I added that contain preformed byte sequences to inject. To simplify things, I’m only writing to the luma channel and am using test videos where I have zeroed all the data in the top few rows, to avoid dealing with the scrambling and with issues that could occur with chroma up-sampling.
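The bit expansion done by the loop above can be tried outside the kernel. The following is a minimal user-space sketch, not the actual driver code: `expand_metadata_bits` is a hypothetical helper that turns each metadata byte into eight luma samples of value 0 or 1, MSB first, and repeats the whole sequence a given number of times. It assumes the destination row is already zeroed and is long enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: expand each metadata byte into 8 luma samples
 * (value 0 or 1, most significant bit first) and repeat the whole
 * sequence `repeats` times, back to back.
 * Returns the number of samples written, or 0 if `luma` is too small. */
static size_t expand_metadata_bits(const uint8_t *meta, size_t meta_len,
                                   unsigned repeats,
                                   uint8_t *luma, size_t luma_len)
{
    size_t needed = (size_t)repeats * meta_len * 8;
    size_t out = 0;

    assert(meta != NULL && luma != NULL);
    if (needed > luma_len)
        return 0;   /* refuse to write past the end of the row */

    for (unsigned r = 0; r < repeats; r++) {
        for (size_t j = 0; j < meta_len; j++) {
            for (int k = 0; k < 8; k++)
                luma[out++] = (meta[j] >> (7 - k)) & 1;
        }
    }
    return out;
}
```

Calling it with the two bytes 0xA5 and 0x01 and three repeats writes 48 samples, with the pattern 1010010100000001 appearing three times. The bounds check is an addition for safety in this sketch; the kernel snippet relies on the buffer being large enough.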

… but the metadata should be in the chroma??

I’ve been keeping an eye on your thread.
We recently got Profile 5 tonemapping working and we’re keen to progress further with Dolby Vision support.

Since the release of ‘newer’ AMLogic SoCs such as the S905X4, AMLogic has got significantly tougher with the security model. All devices now use a trusted video pathway as the only way of providing hardware-accelerated video decoding; it is also used for applications like Widevine L1, which have security requirements. Previously this was optional. It is implemented with OpTEE, and during playback a secure memory instance is created and allocated. Video firmware is uploaded and verified by a Trusted Application.

Furthermore, on the S905X4 and later, AMLogic now operate an over-under signing method. So the days where you could sign your own TAs, pack the keys into BL32 and have complete control of the OpTEE environment are gone. Previously one could tie the keys to a specific device, as the device maker authored all of the keys. Now any TA developed has to be signed by AMLogic as well.

This provides stronger security for AML and authorised TAs, but means that TAs are now SoC bound and not device bound. It took significant time for us to get AMLogic to sign our TAs which we use for video enhancement, security and 3D.

Double write mode is useful for applications like Hyperion but impairs the quality and resolution. I don’t think it’s appropriate for DV.

It’s not that it’s compressed, it is secured. You will not be able to access it trivially from the REE (Linux) world.

Nominally correct, but perhaps not the best description. The doc you linked says it should go into the LSB of alternating chroma channels (remembering that a 4:2:2 format is used), but this is done with bit scrambling, i.e., to embed a bit value of 1 from the metadata, either a 1 or a 0 is written to the LSB of the chroma channel. Which one depends on the combined parity of the other 11 bits of the chroma channel and the 12 bits of the luma channel.

As such, it is perhaps more accurate to say that a 1 bit of the metadata is embedded if the 12 bits of the luma sample and the 12 bits of the corresponding chroma sample have odd combined parity, and a 0 bit if the combined parity is even. In practice, this means that you can embed the metadata in any bit of either the luma or chroma channels; it is just the final combined parity that matters. The LSB of the alternating chroma channels of the final 4:2:2 signal would still be the best bit to modify w.r.t. minimising visual impact.

As I only have access to the video signal when it is still in 4:2:0 form, I have been setting all the chroma samples to 0 (which means that, regardless of the chroma upsampling used, they will stay 0) and embedding in the luma channel, as that doesn’t go through chroma upsampling. This gives me the control over the parity of each sample that is needed.
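The parity rule described here can be written down in a few lines. This is only a sketch of the rule as stated in this post, not code taken from any Dolby or ETSI reference: `parity12`, `extract_bit` and `embed_bit` are hypothetical names, and samples are assumed to be held in the low 12 bits of a `uint16_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Parity of the low 12 bits of v: 1 if an odd number of bits are set. */
static unsigned parity12(uint16_t v)
{
    v &= 0x0FFF;
    v ^= v >> 8;
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}

/* Metadata bit carried by one luma/chroma sample pair:
 * the combined parity of all 24 bits (odd = 1, even = 0). */
static unsigned extract_bit(uint16_t luma, uint16_t chroma)
{
    return parity12(luma) ^ parity12(chroma);
}

/* Return the chroma sample with only its LSB adjusted so that the
 * combined parity of luma and chroma equals `bit`. */
static uint16_t embed_bit(uint16_t luma, uint16_t chroma, unsigned bit)
{
    uint16_t upper = chroma & 0x0FFE;                /* other 11 chroma bits */
    unsigned p = parity12(luma) ^ parity12(upper);   /* parity of the other 23 bits */

    assert(bit <= 1);
    return (uint16_t)(upper | (p ^ bit));
}
```

With all chroma samples at 0, `extract_bit` reduces to the parity of the luma sample alone, which is why writing luma values of 0 or 1 is enough to carry each bit.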

Now I see where that comment about a secured part came from… Glad I’m playing with an older device at the moment.

I am aware of the problems with using double write mode (though setting it to 1 doesn’t impair the resolution, as it has a 1:1 ratio), which is part of the reason I said the current implementation isn’t suitable for actual use.

Understood.

Updated the original post.

You are a genius! Keep it up!

And it’s now working… at least for L1 at the moment.

Very interesting indeed :+1::love_you_gesture:t2::black_joker::ok_hand: