Removing wallclock made A/V length drift far worse (audio 11.8% long). The
known-clean config used wallclock + master aresample=async=1; the leading
silence is a standby backlog artifact addressed by the bridge live-edge flush +
record-start audio FIFO drain, not by changing the timestamp source.
The persistent ~2.5s of leading silence was the master aresample=async=1 PADDING
the audio to reconcile a PTS-origin mismatch: video PTS starts at frame 0
(-framerate), but -use_wallclock_as_timestamps stamped the first audio chunk at
its wall-clock arrival time (~2.5s after the ffmpeg graph opened). aresample
filled the gap with silence.
Drop wallclock: audio PTS now comes from the 48kHz sample count starting at 0 —
the same origin as video frame 0 — so the streams align with no pad. The bridge
already hands live audio (backlog flushed on attach), so no rate reference is
needed from wallclock.
Root cause of 'silent first ~1s then clean' + ~0.5% audio-too-long: in standby
the bridge keeps filling the audio FIFO while the idle-preview consumes only
video, so when recording starts ffmpeg reads a ~0.5s backlog of stale audio,
AND the video-only pre-roll discards video frames the audio never had.
Fix: (1) skip the video-only pre-roll in standby (warm slot = no unstable
frames), (2) drain the audio FIFO non-blocking immediately before ffmpeg opens
it, so audio starts at the live edge aligned with the first real video frame.
The 16ch interleave in the deltacast bridge produced audio at HALF the correct
sample rate (measured 24224 vs 48000 samples/s/ch), which broke A/V sync and
pitch. Per the working baseline (audio was clean before the channel selector),
revert the bridge audio thread to the original single-group 2ch extraction and
the capture-manager audio input to -ac 2 + wallclock + aresample.
KEPT the good fixes: long-GOP HEVC for non-growing (NVENC realtime, no frame
drops) and GPU-only codec list. 16ch/channel-select is shelved for a separate,
properly-validated change.
The pre-roll drained only the video pipe (fc_pipe) while the audio FIFO kept
buffering, so ffmpeg read ~PRE_ROLL_SECONDS of surplus pre-roll audio — making
audio longer than video, which when synced compresses audio ~0.5% (pitch-up,
measured: 2591573 audio samples vs 2579395 expected for the video duration).
In standby the framecache slot is already warm (no unstable startup frames), so
the drain is unnecessary; skipping it lets ffmpeg open video and audio together
from the same instant. Cold on-demand spawns keep the brief drain.
- restore -use_wallclock_as_timestamps on audio input: without it ffmpeg's raw
s16le reader stalled the graph (NVENC idle at 9%, ~half frames dropped). With
it + long-GOP HEVC the encoder runs realtime and A/V length stays locked.
- remove all CPU codec options (prores*, dnxh*, libx264/265) from recorder UI;
GPU NVENC only (hevc_nvenc / h264_nvenc). 3x L4 cluster, no reason for CPU.
- GPU codec defaults in env builders + proxy default h264_nvenc.
The hevc_nvenc codec was hardcoded to all-intra (-force_key_frames expr:1), which
is ~4x the NVENC load. Applied to every recording it exceeded the L4's realtime
budget at 1080p59.94 10-bit -> fc_pipe dropped ~half the frames -> video came out
shorter than the (correct) audio -> A/V drift + pitch-up on playback.
Now all-intra is used ONLY when growing-files is on (where it's required for the
editable head). Normal recordings use efficient long-GOP HEVC (2s GOP, 2 B-frames)
which NVENC sustains in realtime with zero drops.
Removing -use_wallclock_as_timestamps on the SDI audio input. The bridge writes
SDI-clock-paced samples, so PTS from the 48kHz sample count shares the video's
clock domain and the audio length tracks the video length exactly. Wall-clock
timestamps made audio length = real elapsed time, which drifted ~1% longer than
the frame-count video when the encoder dipped under realtime (pitch-up).
- capture-manager: remove dead legacy deltacast FIFO video path (FC_SLOT_ID
is now always set by node-agent, framecache mandatory on all SDI nodes)
- node-agent: correct stale comment about legacy FIFO fallback
- onboard-node.sh: harden detect_sdi (device-node checks, not just lspci) and
persist COMPOSE_PROFILES so framecache survives every redeploy on SDI nodes
- remove committed capture.js.bak
Root cause of this session's outage: zampp3 came up without the capture
compose profile, so framecache never started; the bridge published to shm
with no consumer and recorders showed 'receiving' with no real capture.
The framecache ring delivers frame-accurate frames at exactly the SDI clock
rate. -use_wallclock_as_timestamps was wrong for this source — it stamped
frames by ffmpeg arrival time rather than capture time, causing the recorded
file to report wrong framerates (e.g. 56.06 instead of 59.94) and a
glitchy first second at startup (NVENC cold-start backlog bunched timestamps).
Fix: remove -use_wallclock_as_timestamps from the rawvideo (pipe:0) input
and rely on -framerate for correct CFR timestamps from frame 0.
Audio keeps its FIFO wallclock; aresample=async=1 on the master output
resamples audio to align with the CFR video PTS.
Sidecars now spawn at recorder CREATE time instead of /start time.
The container boots in STANDBY=1 mode (idle preview only, no ffmpeg master).
On /start, mam-api sends per-session params (CLIP_NAME, ASSET_ID, PROJECT_ID)
to the running sidecar via HTTP POST /capture/start — ffmpeg starts in <1s.
On /stop, mam-api calls HTTP POST /capture/stop — container stays alive in
standby, ready for the next take immediately.
Container is only killed on recorder DELETE.
This eliminates: Docker create/start overhead (~1-2s), bridge startup (~2-5s),
and pre-roll wait (~5s). Latency from 'record' click to first encoded frame
drops from ~10s to ~1s.
Changes:
- capture/src/index.js: boot in standby when STANDBY=1 env is set; still
start idle preview (live thumbnail visible before recording)
- capture/src/routes/capture.js: POST /start accepts full codec params and
asset_id in body (skips mam-api asset creation when asset_id provided)
- node-agent/index.js: handleSidecarStandby() + POST /sidecar/standby route;
warms bridge at recorder create time
- recorders.js POST /: spawn standby sidecar after DB insert (non-fatal)
- recorders.js POST /:id/start: HTTP fast-path to standby sidecar; falls
back to on-demand spawn if standby not available
- recorders.js POST /:id/stop: HTTP /capture/stop, keep container in standby
- recorders.js GET /:id/status: use port-based URL for local capture status
Reverts the local-temp+faststart approach from 549ca6c. Masters now stream
ffmpeg stdout directly to S3 via multipart upload — no local disk consumed
on the worker. Uses +frag_keyframe+empty_moov+default_base_moof which
Premiere Pro 25.x handles natively (to be confirmed separately).
Zero /tmp/capture files. Worker disk stays flat during recording.
- services/capture/src/capture-manager.js:
- Added 5s pre-roll delay for Deltacast, Blackmagic, and SDI capture paths.
- Activates when is present.
- Spawns the process immediately, drains/discards the unstable frames for 5 seconds, and then pipes the stdout of the to the actual process.
- Keeps the master output perfectly clean from frame 0.
- services/web-ui/public/screens-ingest.jsx:
- Removed currentFps framerate display from both recording and idle status states.
- services/framecache/src/net_ingest.c: new network ingest process
- Spawns ffmpeg to decode SRT/RTMP → raw UYVY422 stdout
- Reads decoded frames and writes into framecache slot via shm
- Registers slot with framecache HTTP API on startup
- Deregisters slot on clean exit (SIGTERM)
- Reconnect loop for listener mode (stays alive between sessions)
- --url, --slot-id, --fc-url, --width, --height, --fps-num/den,
--source-type, --listen, --listen-port, --stream-key args
- Emits format JSON to stderr on first frame
- services/framecache/CMakeLists.txt: add net_ingest target
- services/framecache/Dockerfile: copy net_ingest to runtime image
- services/node-agent/index.js:
- startNetIngest() / stopNetIngest(): lifecycle management per recorder
- Spawns net_ingest before sidecar start for srt/rtmp sourceTypes
- Injects FC_SLOT_ID=net-<containerId> into sidecar env
- Sets IpcMode=host for network sidecars using framecache
- Maps temp id → real containerId after container create
- stopNetIngest() called on sidecar stop
- NET_INGEST_BIN env var (default: docker exec framecache net_ingest)
- services/capture/src/capture-manager.js:
- _buildInputArgs srt/rtmp: framecache path when FC_SLOT_ID set
(spawns fc_pipe, uses pipe:0 rawvideo input — same as SDI path)
- Falls back to direct URL when FC_SLOT_ID not set (legacy path)
- audioMap: network via framecache uses '0🅰️0?' (video-only fc_pipe,
no audio FIFO — audio-in-shm is roadmap)
- HLS tee: sdiHlsDir covers network-via-framecache; legacy tee gated
on !FC_SLOT_ID to avoid duplicate HLS outputs
- fc_pipe piped to ffmpeg stdin for network framecache path
- docker-compose.worker.yml: FC_URL + NET_INGEST_BIN in node-agent env
fs.writeFile/fs.readFile/fs.stat are callback-based and don't return
Promises in the UXP sandbox. await on them resolves immediately, causing
race conditions where files aren't written before import.
Added _writeFile/_readFile/_stat wrappers that use fs.promises when
available and fall back to manual Promise wrapping otherwise.
Also bumped version to 2.2.3 to match web-ui data.jsx.
The live-thumbnail and manual /start,/stop sidecar->mam-api calls hit the CSRF
guard (403 missing X-Requested-With). Match the working pattern in index.js:
send Authorization: Bearer $MAM_API_TOKEN (= CAPTURE_TOKEN, injected by
recorders.js), which is CSRF-exempt. Falls back to the UI header only when no
token is set (dev). Fixes [livethumb] failed ... 403 — posters now persist.
🤖 Generated with Claude Code
Replace the HLS 'connecting…' player in the library with a real frame grabbed
from the start of the recording, while the recording is still live.
Flow:
- recorders.js already pre-creates the asset as status='live' + ASSET_ID env
- capture-manager.start() fires _publishLiveThumbnail() (non-blocking): polls
/live/<id> for the first seg-*.ts, extracts frame 0 via ffmpeg (scaled JPEG,
yuvj420p), uploads to S3 thumbnails/<id>.jpg, then POSTs the key to mam-api
- new mam-api POST /assets/:id/live-thumbnail sets thumbnail_s3_key on the still
-live row (status untouched); idempotent no-op once finalized
- visuals.jsx AssetThumb: for live assets, show the static poster once the key /
signed URL is available, else fall back to the live HLS preview. Pulsing LIVE
border kept either way
- POST /assets gains an optional status param (default 'processing'); 'live'
skips the proxy/thumbnail queue
- capture /stop route now finalizes the pre-created asset by id (guarded) instead
of POSTing a duplicate
🤖 Generated with Claude Code
ROOT CAUSE of 'connecting' hangs and intermittent port failures:
The DELTA-12G-e-h 8c is a bidirectional card. Without calling
VHD_SetBiDirCfg(board_index, VHD_BIDIR_80) before streaming, the
board remains in its default bi-dir config (likely 4RX/4TX) — so
RX stream opens fail with VHDERR_RESOURCEUNAVAILABLE on channels
configured as TX, causing random 'connecting' hangs per the SDK docs.
Per SDK Tools.cpp SetNbChannels() pattern:
1. Open temporary board handle
2. Check IS_BIDIR + channel counts
3. Call VHD_SetBiDirCfg(board_index, VHD_BIDIR_80) for 8ch bidir
4. Close temp handle, then open real board handle for streaming
Also add VHD_SetChannelProperty(VHD_CHANNEL_MODE_SDI) for ASI-type
channels per Sample_RX.cpp — required for 12G-ASI/3G-ASI channel
types to correctly detect incoming video standard.
🤖 Generated with Claude Code
The old architecture spawned one deltacast-capture per recorder port; each
called VHD_OpenBoardHandle, triggering a BufMngr.c:781 OOB fault in the
delta_x300 kernel driver whenever two opens raced.
Fix: a single deltacast-bridge daemon opens the board once, opens RX
streams for all requested ports concurrently, and writes each port's
video/audio to named FIFOs (/dev/shm/deltacast/video-<N>.fifo,
/dev/shm/deltacast/audio-<N>.fifo). Capture sidecars read from those
FIFOs directly — no board handle, no race, no flock.
Changes:
services/capture/deltacast-bridge/main.c
- Complete rewrite: --ports csv arg, board opened once, one
video+audio thread pair per port, FIFO paths per port, format
JSON emitted per port on signal lock, SIGTERM clean shutdown.
- flock/serialize logic removed (no longer needed).
- --port single-port compat alias retained.
services/capture/deltacast-bridge/CMakeLists.txt
- Rename target deltacast-capture -> deltacast-bridge.
- POST_BUILD symlink deltacast-capture -> deltacast-bridge for compat.
services/capture/src/capture-manager.js
- deltacast _buildInputArgs: remove bridge spawn; wait up to 30s
for FIFOs to exist (bridge may be starting); return rawvideo +
s16le FIFO inputArgs. bridgeProcess=null.
- audioMap: keyed on sourceType instead of bridgeProcess (both
inputs are always present for deltacast).
- Remove readFirstStderrLine helper (no longer needed).
- Remove bridgeProcess.stdout.pipe / processes.bridge stop signal.
services/node-agent/index.js
- Add import spawn for bridge daemon management.
- Add startDeltacastBridge / stopDeltacastBridge: host-process
lifecycle for the shared bridge, ref-counted by sidecar count.
- handleSidecarStart: on deltacast, increment counter + start bridge;
decrement on container create/start failure.
- handleSidecarStop: decrement counter; stop bridge when last sidecar.
- _containerSourceType map tracks containerId->sourceType for stop.
- Old acquireDcLock mutex retained but no longer called.
With 8 deltacast bridges serializing via flock (each holding the lock
for ~35s during signal wait + settle), the last bridge in queue waits
~280s before getting the lock. The 35s readFirstStderrLine timeout was
firing before those bridges could even open the board, causing them to
fail silently while the bridge was still queued. 300s (5min) covers
8 bridges * 35s each with margin.
The flock-based board serialization in deltacast-bridge emits [board] log
lines to stderr before the JSON format line. readFirstStderrLine was
failing on the first non-JSON line. Now loops over complete lines,
skips any not starting with {, and waits for the actual JSON.
The deltacast bridge now emits [board] log lines before the format JSON
(while waiting for flock). readFirstStderrLine was parsing the first line
only and failing with 'invalid JSON'. Now it accumulates all lines and
skips any that do not start with '{', continuing to wait for the JSON
format line. Error lines ({\"error\":...}) still reject immediately.
25 Mbps is sufficient for XDCAM HD422 1080i/1080p at broadcast quality
and halves storage use. Operators can still override via recording_video_bitrate.
The growing-master ffmpeg orchestrator declared split=2[vhi][vlo] but only
consumed [vlo] inside the `if (hlsDir)` block. For deltacast sources the
caller passed hlsDir=null (the ternary only matched sourceType==='sdi'), so
[vlo] was left unconnected → ffmpeg aborted with "Filter 'split' has output 1
(vlo) unconnected" / "Error binding filtergraph inputs/outputs" → 0 frames →
no HLS → "playback failed" on all deltacast previews.
Fix:
- Pass sdiHlsDir for deltacast as well as sdi (deltacast also produces the
2nd-output HLS preview from the single SDI read).
- Make the orchestrator filter_complex conditional: split=2[vhi][vlo] when an
HLS dir is present, split=1[vhi] (master only) otherwise, so no split output
is ever orphaned regardless of source type.
Restores deltacast growing-master capture (master MXF + HLS preview). No poster
tap (the incomplete recorder-thumbnails poster on the deploy node added an
mjpeg output that destabilised the shared ffmpeg; tracked separately on the
feature/recorder-thumbnails branch).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Parse the recorder's SOURCE_CONFIG JSON in the bootstrap and pass the deltacast
capture channel (`port`) and optional `board` into captureManager.start(), so a
recorder can select which of the board's 8 channels to capture.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Audio map: the deltacast bridge delivers audio on a separate FIFO wired as
ffmpeg input 1, so the finalized master + HLS preview (and the growing
orchestrator) now map audio via `audioMap` (1🅰️0? for deltacast, 0🅰️0? for
DeckLink SDI / network) instead of an unconditional 0🅰️0?. Without this the
deltacast master/preview carried no audio.
- Channel/port: spawn the bridge with --device = board index (default 0) and
--port = source_config.port (falling back to the device index), so a recorder
can capture from any of the board's 8 channels. Adds `port`/`board` params to
start() and _buildInputArgs().
- Bridge stdin: the finalized-master ffmpeg reads the bridge's raw video from
pipe:0, so its stdin must be 'pipe' when a bridge is present (was 'ignore',
which made hiresProcess.stdin null and threw "Cannot read properties of null
(reading 'on')" at bridgeProcess.stdout.pipe(...)).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
On-node empirical testing of this bmx v1.6 build showed that raw2bmx's
rdd9 writer with --part already maintains a live, correct header Duration
as the file grows: ffprobe reads a growing duration mid-write (e.g. 2.04s
of a 10s clip while still recording), and the structural-metadata
Duration fields (tags 02020008 / 30020008) hold the real frame count
(0x33 = 51), not -1.
The dur-patch.py added in the previous commit searched the header for
Duration=-1 (0xFF*8) and found 0 fields on rdd9 ("[dur-patch] 0 Duration
fields"), so it was a no-op. Worse, opening the MXF r+b to patch it while
raw2bmx appends over CIFS is a concurrency hazard. Remove it entirely and
rely on raw2bmx's native growing Duration. rdd9 + --index-follows remains
the Premiere-recommended growing flavour (Sony XDCAM essence, index in the
essence partition).
Verified on-node (ffprobe/byte-probe). Live edit-while-record in Premiere
itself still requires user confirmation.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The previous sed/python in-place edits on the node broke capture: the
hires stderr parser was written with literal 0x08 BACKSPACE bytes instead
of regex word boundaries, so it never matched ffmpeg output.
framesReceived stayed 0, the shutdown handler saw "no frames" and marked
every asset as an error even though video was captured. The ffmpeg base
args had also been changed to -progress pipe:2, whose key=value output
puts frame= and fps= on separate lines and does not match a combined
regex.
Fixes:
- Parser: single robust regex matching ffmpeg's classic -stats line
(frame= and fps= together). No backspace bytes, no word boundaries.
- ffmpeg base args back to -stats (drop -progress pipe:2).
Growing-file (Premiere edit-while-record), per bmx thread 87ac5750 and
Drastic/Softron edit-while-ingest docs:
- raw2bmx clip type op1a -> rdd9 (Sony XDCAM / RDD-9, the flavour Premiere
reads while growing) with --index-follows so the IndexTableSegment is
written in the same partition as the essence it indexes (lets a reader
re-scanning body partitions seek toward the record head). NOT --avid-gf
(Avid OP-Atom, Media-Composer-only, needs a companion AAF).
- dur-patch.py: overwrite header Duration=-1 to 0 immediately at
clip-open (Premiere rejects -1 on import), then track the live frame
count every 3s from the last body partition IndexTableSegment. Shipped
as services/capture/dur-patch.py (/app/dur-patch.py in the image).
Deployed to wild-dragon-capture:latest on zampp2 via overlay build.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>