fix(capture): derive deltacast audio PTS from sample count, not wall-clock

Remove -use_wallclock_as_timestamps from the SDI audio input. The bridge writes
SDI-clock-paced samples, so PTS derived from the 48 kHz sample count shares the
video's clock domain and the audio length tracks the video length exactly.
Wall-clock timestamps made the audio length equal to real elapsed time, which
drifted ~1% longer than the frame-count-based video length whenever the encoder
dipped under realtime (heard as a pitch-up).
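
The arithmetic behind the drift can be sketched as follows. This is only an illustration, not code from the capture pipeline: the helper names, the 25 fps frame rate, and the 60.6 s wall-clock figure are assumptions chosen to reproduce a ~1% mismatch, while the 48 kHz rate, s16le format, and 16-channel layout come from the diff below.

```javascript
// Stream parameters taken from the diff: 48 kHz, s16le, 16 channels in the FIFO.
const SAMPLE_RATE = 48000;
const CHANNELS = 16;
const BYTES_PER_SAMPLE = 2; // s16le = 2 bytes per sample

// Hypothetical helper: audio timestamp (seconds) implied by the number of raw
// PCM bytes read so far. This is the "sample count" clock.
function ptsFromByteCount(bytesRead) {
  const sampleFrames = Math.floor(bytesRead / (CHANNELS * BYTES_PER_SAMPLE));
  return sampleFrames / SAMPLE_RATE;
}

// Hypothetical helper: video length for a constant-frame-rate input.
function videoDuration(frameCount, fps) {
  return frameCount / fps;
}

// Assumed scenario: 60 s of SDI time at 25 fps, but the pipeline took
// 60.6 s of wall-clock time to consume it (encoder slightly under realtime).
const sdiSeconds = 60;
const fps = 25;
const frames = sdiSeconds * fps;
const audioBytes = sdiSeconds * SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE;
const wallClockSeconds = 60.6;

const videoLen = videoDuration(frames, fps);
const audioLenSampleCount = ptsFromByteCount(audioBytes);
const audioLenWallClock = wallClockSeconds; // wall-clock stamping follows elapsed time
const driftPercent = (audioLenWallClock / videoLen - 1) * 100;

console.log({ videoLen, audioLenSampleCount, audioLenWallClock, driftPercent });
```

With sample-count timestamps the audio and video lengths agree, because both are counted in units of the SDI clock. With wall-clock timestamps the audio length follows elapsed time instead, so any slowdown in the pipeline shows up as extra audio duration.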
Zac Gaetano 2026-06-04 04:01:54 +00:00
parent 51f939b1fe
commit 8e5405c3f9


@@ -706,13 +706,18 @@ class CaptureManager {
 '-video_size', fcSize,
 '-framerate', fcFps,
 '-i', 'pipe:0',
-// Audio FIFO → ffmpeg input 1. Keep wallclock on audio so A/V sync
-// aligns by arrival time; aresample=async=1 (applied on the master
-// output) resamples audio to match the video CFR timestamps.
+// Audio FIFO → ffmpeg input 1. The bridge writes EXACTLY the SDI-clock
+// paced samples (group 0 is the reference, same slot clock as video),
+// so we DERIVE audio PTS from the sample count at 48 kHz — NOT from
+// wall-clock arrival. Wall-clock timestamping made the audio stream's
+// length equal real elapsed time while video length = frame_count/fps;
+// when the encoder ran a hair under realtime the audio ended up ~1%
+// longer than video (heard as a pitch-up). Reading the raw stream at
+// its natural rate keeps both in the same SDI clock domain; the
+// master-output aresample=async=1 still soaks up any micro-jitter.
 // The FIFO carries the full 16ch the bridge publishes; channel
 // SELECTION (keep first N) is applied as an output filter so the
 // discrete broadcast channels are preserved, not downmixed.
-'-use_wallclock_as_timestamps', '1',
 '-thread_queue_size', '512',
 '-f', 's16le',
 '-ar', '48000',