The problem is that the H.264 spec is not clear about how to encode and signal non-progressive video. To improve compression, the two fields of an interlaced frame are stored and compressed together as a single frame, just like in every codec before it. But a 30 frame per second interlaced video is also 60 fields per second, and the spec is unclear as to whether this should be signaled as 30i or 60i.
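To make the ambiguity concrete, here is a minimal Python sketch of the arithmetic. The function name `interlaced_rates` is made up for illustration and is not part of any codec API; the point is only that the same stream can honestly be described by either number.

```python
from fractions import Fraction


def interlaced_rates(frame_rate):
    """Return (frames per second, fields per second) for interlaced video.

    Hypothetical helper: every interlaced frame carries two fields,
    so the field rate is always twice the frame rate.
    """
    frame_rate = Fraction(frame_rate)
    return frame_rate, frame_rate * 2


# NTSC-style timing: 30000/1001 frames per second.
frames, fields = interlaced_rates(Fraction(30000, 1001))
print(f"{float(frames):.2f} frames/s = {float(fields):.2f} fields/s")
# prints: 29.97 frames/s = 59.94 fields/s
```

A label of "30i" counts frames and "60i" counts fields, so a player that guesses the wrong convention is off by a factor of two.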
The second problem is the signaling of field order and repeats, which is again not clearly specified. A really smart non-realtime de-interlacer (like many of the AviSynth filters) can look at enough frames/fields to figure out the cadence and do things perfectly with no signaling information (other than a top/bottom first hint). But it’s hard to do this in realtime and maintain audio sync.
The end product is a video that contains 24 frames per second progressive content that has been telecined to 25 frames per second interlaced video and marked as 24p, when it should be marked as either 25i or 50i, depending on how you read the spec.
The absolute best solution is to use AviSynth (or a wizard-like program such as HandBrake) to convert it back to 24p and re-encode. I have one region A Blu-ray that is a 24p film that had 2:3 pulldown added to make it 30 frames per second interlaced, and then, for some bizarre reason, they doubled every frame to turn it into 60 frames per second.
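The 2:3 pulldown, the frame doubling, and the conversion back to 24p can be illustrated with a toy Python model. This is only a sketch under strong assumptions: frames are plain labels rather than pixel data, field parity (top/bottom first) is ignored, and the function names are invented for the example. It is not how AviSynth or HandBrake implement inverse telecine; real filters have to compare actual field content to find the repeats.

```python
def telecine(film_frames, cadence=(2, 3)):
    """Spread progressive frames over interlaced fields.

    Each film frame is held for the number of fields given by the
    repeating cadence, then consecutive fields are paired into
    interlaced frames as (first_field, second_field).
    """
    fields = []
    for i, frame in enumerate(film_frames):
        fields.extend([frame] * cadence[i % len(cadence)])
    # Pair fields two at a time; a trailing unpaired field is dropped.
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


def double_frames(frames):
    """Repeat every frame once, e.g. 30 frames/s becomes 60 frames/s."""
    return [frame for frame in frames for _ in range(2)]


def drop_repeats(items):
    """Remove consecutive duplicates, keeping the first of each run."""
    out = []
    for item in items:
        if not out or out[-1] != item:
            out.append(item)
    return out


def inverse_telecine(frames):
    """Toy recovery of the film frames: undo doubling, then undo pulldown."""
    unique_frames = drop_repeats(frames)  # undo the frame doubling
    fields = [field for pair in unique_frames for field in pair]
    return drop_repeats(fields)  # undo the repeated fields


film = ["A", "B", "C", "D"]           # 4 film frames (1/6 s at 24p)
interlaced = telecine(film)           # 5 interlaced frames (1/6 s at 30i)
doubled = double_frames(interlaced)   # 10 frames (1/6 s at 60 frames/s)

print(interlaced)
# [('A', 'A'), ('B', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'D')]
print(inverse_telecine(doubled))
# ['A', 'B', 'C', 'D']
```

Note the two mixed frames, ('B', 'C') and ('C', 'D'): those are the ones that show combing when played back as if they were progressive, which is why getting the original 24p frames back before re-encoding gives the cleanest result.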