Extend isH264IDRStart to handle STAP-A aggregates (NAL type 24, RFC 6184
§5.7.1). The first NAL in the aggregate starts at byte 3 (after the
2-byte size field); if its type is 5 (IDR slice) the packet is treated
as an IDR start and the burst cache is reset.
This closes the gap noted in NOTES.md: a publisher using STAP-A for IDR
(e.g. a custom GStreamer pipeline or hardware encoder) will now correctly
reset the burst rather than accumulating packets until hitting the 512-
packet / 2 MiB capacity cap.
Introduces keyFrameCache — a bounded ring buffer that retains all RTP
packets from the most recent H.264 IDR NAL unit until the packet just
before the next one. New WHEP subscribers receive this burst immediately
on Subscribe(), cutting first-frame latency from up to one IDR interval
(typically ~2 s at GOP=60/30fps) to nearly zero.
Design notes:
- Detection covers single-NAL (type 5) and FU-A start (type 28, start
bit set, inner type 5). STAP-A IDR leading is not handled — FFmpeg
never uses STAP-A for IDR slices in practice.
- Bounded at 512 packets / 2 MiB per source to cap memory per stream.
- push() is called only from the single-goroutine readLoop; the lock
it holds is tiny and brief.
- snapshot() returns a shallow copy; *rtp.Packet values are immutable
after being placed in the cache so sharing is safe.