feat(packet): add keyframe detection for VP8, VP9 and AV1 #869
farit2000 wants to merge 1 commit into algesten:main from
Add public functions to detect keyframes from raw RTP payloads without fully depacketizing:

- detect_vp8_keyframe: parses the VP8 RTP descriptor (RFC 7741), checks the P bit in the payload header
- detect_vp9_keyframe: checks the P bit in the VP9 RTP descriptor
- detect_av1_keyframe: checks the N bit in the AV1 aggregation header

Useful for SFU-style forwarding where you need to identify keyframes for PLI handling or layer switching without decoding.
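For illustration, the two single-bit checks listed above could look roughly like the sketch below. This is not the code from this PR: the function names `sketch_vp9_keyframe` and `sketch_av1_keyframe` and the `&[u8] -> bool` signature are assumptions made for the example. The bit positions follow the published payload formats, where P is mask `0x40` in the first VP9 descriptor byte and N is mask `0x08` in the AV1 aggregation header.

```rust
/// Hypothetical sketch (not the PR's implementation): VP9 check.
/// Byte 0 of the VP9 RTP descriptor is laid out as I|P|L|F|B|E|V|Z.
/// P (mask 0x40) is set for inter-picture predicted frames, so a
/// cleared P bit is treated as "keyframe" here.
pub fn sketch_vp9_keyframe(payload: &[u8]) -> bool {
    // An empty payload cannot be classified, so report "not a keyframe".
    payload.first().map_or(false, |b| (*b & 0x40) == 0)
}

/// Hypothetical sketch (not the PR's implementation): AV1 check.
/// The AV1 aggregation header is laid out as Z|Y|W W|N|-|-|-.
/// N (mask 0x08) is set on the first packet of a new coded video sequence.
pub fn sketch_av1_keyframe(payload: &[u8]) -> bool {
    payload.first().map_or(false, |b| (*b & 0x08) != 0)
}
```

Both checks read only the first payload byte, which is why they can run per packet without any reassembly. A production check for VP9 may also want to look at the B (start of frame) bit; that is left out here because the description above only mentions the P bit.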
@farit2000 Is this exposing functions because you don't get enough data in existing output from str0m? Or is it because you want to reuse functions in other contexts?
Maybe these could just be unversioned so users don't have to implement their own versions of this. Also add H264 and H265 :) But this is useful indeed.
In rtp_mode we forward raw RTP packets without depacketizing, so we don't get CodecExtra. But we still need to know when a keyframe arrives: for PLI handling, to know when a new subscriber can start decoding, and for layer switching decisions. These functions let us check that from the raw payload without running the full depacketizer.
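As a rough illustration of the new-subscriber case, an SFU could hold back packets until the first keyframe arrives and ask for one in the meantime. The `SubscriberGate` and `Action` types below are hypothetical and not part of str0m; the `is_keyframe` flag is assumed to come from a per-packet detector such as the ones proposed here.

```rust
/// Hypothetical per-subscriber state, not part of str0m.
#[derive(Debug, Default)]
pub struct SubscriberGate {
    /// True once a keyframe has been forwarded to this subscriber.
    started: bool,
}

/// What the forwarding loop should do with the current packet.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Pass the packet on to the subscriber.
    Forward,
    /// Drop the packet and ask the publisher for a keyframe (PLI).
    DropAndRequestPli,
}

impl SubscriberGate {
    /// Decide what to do with one packet, given the detector's verdict.
    pub fn on_packet(&mut self, is_keyframe: bool) -> Action {
        if self.started {
            // The decoder already has a reference frame; forward everything.
            return Action::Forward;
        }
        if is_keyframe {
            // First keyframe seen: the subscriber can start decoding here.
            self.started = true;
            Action::Forward
        } else {
            // Interframes are useless before the first keyframe.
            Action::DropAndRequestPli
        }
    }
}
```

A real forwarder would likely rate-limit the PLI requests rather than emit one per dropped packet; the sketch leaves that out to keep the state machine visible.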
Good idea, I can add H264 and H265 too. And yeah marking them unversioned makes sense since the detection logic is straightforward and unlikely to change.
@farit2000 It feels like we're doubling up on functionality we already have. We already detect keyframes in the media level API. Do you see any way we could avoid having separate keyframe detector logic for RTP level vs media level?
The existing keyframe detection in the depacketizers works on accumulated frame data after reassembly. In rtp_mode there's no depacketization - we get individual RTP packets and forward them as-is. These functions work on single RTP packets (checking the P bit in VP8/VP9 descriptor, N bit in AV1 aggregation header), which is fundamentally different from the bitstream-level detection in the depacketizers. We could potentially refactor the depacketizers to call these functions internally, but they serve different layers - RTP header inspection vs bitstream parsing.
Fair enough. If we can have detectors for all the supported codecs, then let's merge.
We're using str0m as an SFU and need to detect keyframes from forwarded RTP packets without fully depacketizing them. This is needed for things like PLI request handling, layer switching decisions, and knowing when a new participant can start decoding.
Added three public functions:

- `detect_vp8_keyframe`: parses the VP8 RTP payload descriptor per RFC 7741 (handles all the X/I/L/T/K extension combinations and 7/16-bit PictureID), then checks the P bit in the VP8 payload header
- `detect_vp9_keyframe`: checks the P bit in byte 0 of the VP9 RTP descriptor, works with both flexible and non-flexible mode
- `detect_av1_keyframe`: checks the N bit (new coded video sequence) in the AV1 RTP aggregation header

All three are exported from `packet` and re-exported from `format` alongside the existing `CodecExtra` types.

Tests cover keyframes, interframes, various extension combinations, truncated payloads, and edge cases.
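The VP8 case is the only one that has to skip a variable-length descriptor, so a sketch of that walk may help reviewers follow the description. The code below is an assumption-laden illustration written from RFC 7741, not the code in this PR: the name `sketch_vp8_keyframe` and its signature are made up for the example, and it additionally requires S=1 and PID=0, since the VP8 payload header is only present at the start of the first partition.

```rust
/// Hypothetical sketch (not the PR's implementation) of VP8 keyframe
/// detection on a single RTP payload, following the RFC 7741 layout.
pub fn sketch_vp8_keyframe(payload: &[u8]) -> bool {
    // Mandatory first descriptor byte: X|R|N|S|R|PID(3 bits).
    let Some(&first) = payload.first() else {
        return false;
    };
    let has_extension = (first & 0x80) != 0; // X bit
    let start_of_partition = (first & 0x10) != 0; // S bit
    let partition_id = first & 0x07; // PID

    // The VP8 payload header only follows the descriptor on the first
    // packet of the first partition.
    if !start_of_partition || partition_id != 0 {
        return false;
    }

    let mut offset = 1;
    if has_extension {
        // Extension byte: I|L|T|K|RSV(4 bits).
        let Some(&ext) = payload.get(offset) else {
            return false;
        };
        offset += 1;

        if (ext & 0x80) != 0 {
            // I: PictureID is 7 bits, or 15 bits when its M bit is set.
            let Some(&picture_id) = payload.get(offset) else {
                return false;
            };
            offset += if (picture_id & 0x80) != 0 { 2 } else { 1 };
        }
        if (ext & 0x40) != 0 {
            offset += 1; // L: TL0PICIDX byte
        }
        if (ext & 0x30) != 0 {
            offset += 1; // T and/or K: shared TID|Y|KEYIDX byte
        }
    }

    // First byte of the VP8 payload header: bit 0 is P, the inverse
    // keyframe flag (0 means keyframe). Truncated payloads yield false.
    payload.get(offset).map_or(false, |hdr| (*hdr & 0x01) == 0)
}
```

For example, the payload `[0x10, 0x00]` (S=1, PID=0, no extensions, P=0) is reported as a keyframe, while `[0x90, 0x80, 0x81]` is rejected because the payload ends before the 15-bit PictureID and payload header can be read.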