Root cause A (main.c): audio_thread set the global g_stop flag on EPIPE
(ffmpeg reader died). This killed ALL port threads across the entire bridge
process. Bridge process then exited with all 8 ports gone.
Root cause B (node-agent/index.js): startDeltacastBridge() skipped respawn
when FIFOs existed in /dev/shm/deltacast, even if the bridge process was dead.
Next ffmpeg opened the audio FIFO read-end and blocked forever (no writer) →
no audio (or video) for any new recording.
Fix A (main.c):
- Add per-port atomic g_port_stop[MAX_PORTS] flags.
- Audio thread: on EPIPE, close the FIFO fd and loop back to reopen it.
The VHD ANC stream stays open across reconnects. Other ports unaffected.
- Video thread: on EPIPE or stream error, set only g_port_stop[port], not
the global g_stop. Other ports keep running.
- MAX_PORTS #define moved before globals so g_port_stop[MAX_PORTS] compiles.
Fix B (node-agent/index.js):
- Add _dcBridgeProcessAlive() — scans /proc/<pid>/cmdline for deltacast-bridge.
- startDeltacastBridge(): if FIFOs exist but no live bridge process is found,
spawn a fresh bridge instead of silently returning. Detects bridges started
externally (e.g. sudo on the host before node-agent started).
Requires: bridge rebuild + restart on zampp3. No capture image rebuild needed.
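For illustration only, the liveness check described in Fix B could be sketched as below. This is not the actual _dcBridgeProcessAlive() body: the function name bridgeProcessAlive, its parameters, and the choice to match the needle against every argv entry (so a bridge launched through sudo is still found) are assumptions.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: true if any PID directory under procRoot has a cmdline
// that mentions `needle`. procRoot is a parameter so the scan can be tested.
function bridgeProcessAlive(needle = 'deltacast-bridge', procRoot = '/proc') {
  let entries;
  try {
    entries = fs.readdirSync(procRoot);
  } catch {
    return false; // no procfs available: treat the bridge as not running
  }
  for (const name of entries) {
    if (!/^\d+$/.test(name)) continue; // only numeric PID directories
    try {
      // /proc/<pid>/cmdline holds argv separated by NUL bytes
      const raw = fs.readFileSync(path.join(procRoot, name, 'cmdline'), 'utf8');
      if (raw.split('\0').some((arg) => arg.includes(needle))) return true;
    } catch {
      // the process exited between readdir and read: ignore it
    }
  }
  return false;
}
```

A substring match also hits unrelated commands that merely mention the name (for example a grep for it), so a stricter check may be preferable in practice.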
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Capture bridge emits per-port format JSON on signal lock. Node-agent
now caches these by port and injects DELTACAST_VIDEO_SIZE, DELTACAST_FRAMERATE,
DELTACAST_INTERLACED into the sidecar env so capture-manager uses
actual signal dimensions instead of hardcoded 1920x1080/25fps defaults.
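A rough sketch of the cache-and-inject step follows. The JSON field names (port, width, height, fps, interlaced), the function names, and the '1'/'0' encoding of DELTACAST_INTERLACED are assumptions made for illustration; only the three environment variable names come from this change.

```javascript
// Latest format reported by the bridge, keyed by port number.
const formatByPort = new Map();

// Assumed line shape:
// {"port":0,"width":1920,"height":1080,"fps":25,"interlaced":true}
function cacheFormatLine(line) {
  let msg;
  try {
    msg = JSON.parse(line);
  } catch {
    return false; // not JSON (for example a plain log line)
  }
  if (!msg || !Number.isInteger(msg.port) || !msg.width || !msg.height) return false;
  formatByPort.set(msg.port, msg);
  return true;
}

// Environment fragment to merge into the sidecar env for one port.
function deltacastEnvForPort(port) {
  const f = formatByPort.get(port);
  if (!f) return {}; // nothing cached: capture-manager keeps its defaults
  return {
    DELTACAST_VIDEO_SIZE: `${f.width}x${f.height}`,
    DELTACAST_FRAMERATE: String(f.fps),
    DELTACAST_INTERLACED: f.interlaced ? '1' : '0',
  };
}
```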
The old architecture spawned one deltacast-capture per recorder port; each
called VHD_OpenBoardHandle, triggering a BufMngr.c:781 OOB fault in the
delta_x300 kernel driver whenever two opens raced.
Fix: a single deltacast-bridge daemon opens the board once, opens RX
streams for all requested ports concurrently, and writes each port's
video/audio to named FIFOs (/dev/shm/deltacast/video-<N>.fifo,
/dev/shm/deltacast/audio-<N>.fifo). Capture sidecars read from those
FIFOs directly — no board handle, no race, no flock.
Changes:
services/capture/deltacast-bridge/main.c
- Complete rewrite: --ports csv arg, board opened once, one
video+audio thread pair per port, FIFO paths per port, format
JSON emitted per port on signal lock, SIGTERM clean shutdown.
- flock/serialize logic removed (no longer needed).
- --port single-port compat alias retained.
services/capture/deltacast-bridge/CMakeLists.txt
- Rename target deltacast-capture -> deltacast-bridge.
- POST_BUILD symlink deltacast-capture -> deltacast-bridge for compat.
services/capture/src/capture-manager.js
- deltacast _buildInputArgs: remove bridge spawn; wait up to 30s
for FIFOs to exist (bridge may be starting); return rawvideo +
s16le FIFO inputArgs. bridgeProcess=null.
- audioMap: keyed on sourceType instead of bridgeProcess (both
inputs are always present for deltacast).
- Remove readFirstStderrLine helper (no longer needed).
- Remove bridgeProcess.stdout.pipe / processes.bridge stop signal.
services/node-agent/index.js
- Add import spawn for bridge daemon management.
- Add startDeltacastBridge / stopDeltacastBridge: host-process
lifecycle for the shared bridge, ref-counted by sidecar count.
- handleSidecarStart: on deltacast, increment counter + start bridge;
decrement on container create/start failure.
- handleSidecarStop: decrement counter; stop bridge when last sidecar.
- _containerSourceType map tracks containerId->sourceType for stop.
- Old acquireDcLock mutex retained but no longer called.
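The ref-counted lifecycle in the node-agent changes above can be sketched roughly as follows. createBridgeLifecycle and the injected spawnBridge factory are hypothetical names; the real startDeltacastBridge / stopDeltacastBridge may be structured differently.

```javascript
// Ref-counted lifecycle for one shared child process. `spawnBridge` is a
// hypothetical injected factory returning an object with a kill() method
// (for example the ChildProcess returned by child_process.spawn).
function createBridgeLifecycle(spawnBridge) {
  let refs = 0;
  let bridge = null;
  return {
    // Called when a deltacast sidecar starts.
    acquire() {
      refs += 1;
      if (!bridge) bridge = spawnBridge();
      return refs;
    },
    // Called when a sidecar stops, or when container create/start fails.
    release() {
      if (refs === 0) return 0; // unbalanced release: ignore
      refs -= 1;
      if (refs === 0 && bridge) {
        bridge.kill('SIGTERM'); // last sidecar gone: stop the shared bridge
        bridge = null;
      }
      return refs;
    },
    get running() {
      return bridge !== null;
    },
  };
}
```

Injecting the spawn function keeps the counting logic testable without launching a real bridge.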
Remove 'Capture' from the Operations nav section in shell.jsx — users
configure recorders via the Recorders page; the Capture route/component
is left intact for any remaining references.
Also remove 'capture' from the ingest open-group list (it was listed
as an ingest child despite living in Operations, now moot).
Add a prominent amber warning banner at the top of the Playout page
body (screens-playout.jsx) to make clear the feature is in testing and
not ready for production use.
No cherry-pick from fix/library-and-signal-indicator — all commits on
that branch are already present on main.
The signal timeout deadline was set at process start before waiting for
the flock. Bridges queued behind earlier ports waited minutes for the
lock, then found their 30s signal deadline had already expired before
they even opened the board, causing false "no signal" failures on ports
that have live signal.
Fix: move clock_gettime deadline initialisation to AFTER flock acquired
and board opened, so the full sig_timeout is always available for signal
detection regardless of queue wait time.
With 8 deltacast bridges serializing via flock (each holding the lock
for ~35s during signal wait + settle), the last bridge in queue waits
~280s before getting the lock. The 35s readFirstStderrLine timeout was
firing before those bridges could even open the board, causing them to
fail silently while the bridge was still queued. 300s (5min) covers
8 bridges * 35s each with margin.
The flock-based board serialization in deltacast-bridge emits [board] log
lines to stderr before the JSON format line. readFirstStderrLine was
failing on the first non-JSON line. Now loops over complete lines,
skips any not starting with {, and waits for the actual JSON.
The deltacast bridge now emits [board] log lines before the format JSON
(while waiting for flock). readFirstStderrLine was parsing the first line
only and failing with 'invalid JSON'. Now it accumulates all lines and
skips any that do not start with '{', continuing to wait for the JSON
format line. Error lines ({"error":...}) still reject immediately.
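A minimal sketch of the line-skipping behaviour, written as a pure function so it can be exercised without a child process. createFormatLineParser and its return shape are assumptions, as is the choice to skip (rather than reject) a line that starts with '{' but fails to parse.

```javascript
// Returns a feed(chunk) function. Each call returns one of:
//   { done: false }              keep waiting for more stderr
//   { done: true, format }       first JSON line without an "error" key
//   { done: true, error }        JSON line carrying an "error" key
function createFormatLineParser() {
  let buffer = '';
  return function feed(chunk) {
    buffer += chunk;
    let nl;
    // Only complete lines are examined; a partial tail stays buffered.
    while ((nl = buffer.indexOf('\n')) !== -1) {
      const line = buffer.slice(0, nl).trim();
      buffer = buffer.slice(nl + 1);
      if (!line.startsWith('{')) continue; // e.g. "[board] waiting for lock"
      let msg;
      try {
        msg = JSON.parse(line);
      } catch {
        continue; // assumption: malformed JSON is skipped, not fatal
      }
      if (msg.error) return { done: true, error: String(msg.error) };
      return { done: true, format: msg };
    }
    return { done: false };
  };
}
```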
Concurrent VHD_OpenBoardHandle calls from multiple capture sidecars
trigger delta_x300 BufMngr.c:781 array-index-out-of-bounds, wedging all
RX channels until the module is reloaded. The node-agent stagger only
delays container start — the bridge binary starts ~2s later and can still
race. This fix acquires an exclusive flock on /dev/shm/deltacast/bridge.lock
before VHD_OpenBoardHandle and holds it until signal lock succeeds (then
adds a 4s settle before releasing so the board's buffer queues stabilize).
Lock is released on signal failure too so the next bridge is never
permanently blocked. All 8 channels can now start safely by serializing
through the same lock file mounted into every sidecar.
Simultaneous VHD_OpenBoardHandle calls from 8 sidecars trigger a kernel
array-index-out-of-bounds in delta_x300 BufMngr.c:781 that wedges all
RX channels. Serialize deltacast-only sidecar starts through a
promise-chain mutex with a configurable settle delay
(DELTACAST_START_STAGGER_MS, default 3500ms). All other source types
(SDI, SRT, RTMP) are unaffected — they bypass the mutex entirely.
25 Mbps is sufficient for XDCAM HD422 1080i/1080p at broadcast quality
and halves storage use. Operators can still override via recording_video_bitrate.
Simultaneous VHD_OpenBoardHandle calls from 8 sidecars trip a kernel
array-index-out-of-bounds in BufMngr.c:781 (delta_x300 v6.34.1). Fix:
a process-wide promise-chain mutex gates deltacast sidecar starts so only
one board open is in flight at a time, with a configurable settle delay
(DELTACAST_START_STAGGER_MS, default 3500ms) before releasing the lock.
SDI, SRT, RTMP and all other source types are unaffected.
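The promise-chain mutex pattern can be sketched as below. createSerialGate and runExclusive are illustrative names, not the node-agent's own; the settle delay would come from DELTACAST_START_STAGGER_MS.

```javascript
// Promise-chain mutex: each task is queued behind the previous one, and the
// next task may only start after the previous task settles plus settleMs.
function createSerialGate(settleMs) {
  let tail = Promise.resolve();
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  return function runExclusive(task) {
    const result = tail.then(task);
    // Swallow failures on the chain itself so one failed start never blocks
    // the queue; the caller still sees the rejection through `result`.
    tail = result.catch(() => {}).then(() => sleep(settleMs));
    return result;
  };
}

// Example wiring (hypothetical): only deltacast starts go through the gate.
// const gate = createSerialGate(Number(process.env.DELTACAST_START_STAGGER_MS || 3500));
// const start = sourceType === 'deltacast' ? gate(startSidecar) : startSidecar();
```

Other source types call their start function directly and never touch the chain, which is why they are unaffected.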
The growing-master ffmpeg orchestrator declared split=2[vhi][vlo] but only
consumed [vlo] inside the `if (hlsDir)` block. For deltacast sources the
caller passed hlsDir=null (the ternary only matched sourceType==='sdi'), so
[vlo] was left unconnected → ffmpeg aborted with "Filter 'split' has output 1
(vlo) unconnected" / "Error binding filtergraph inputs/outputs" → 0 frames →
no HLS → "playback failed" on all deltacast previews.
Fix:
- Pass sdiHlsDir for deltacast as well as sdi (deltacast also produces the
2nd-output HLS preview from the single SDI read).
- Make the orchestrator filter_complex conditional: split=2[vhi][vlo] when an
HLS dir is present, split=1[vhi] (master only) otherwise, so no split output
is ever orphaned regardless of source type.
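The two fixes can be sketched as a pair of small helpers. pickHlsDir and buildSplitFilter are hypothetical names and the [0:v] input label is an assumption; the split=2[vhi][vlo] / split=1[vhi] strings follow the description above.

```javascript
// Fix 1: deltacast gets the HLS preview directory as well as sdi.
function pickHlsDir(sourceType, sdiHlsDir) {
  return sourceType === 'sdi' || sourceType === 'deltacast' ? sdiHlsDir : null;
}

// Fix 2: declare only as many split outputs as will be consumed. The returned
// labels are exactly the ones the caller must map, so none can be orphaned.
function buildSplitFilter(hlsDir) {
  if (hlsDir) {
    return { filter: '[0:v]split=2[vhi][vlo]', labels: ['vhi', 'vlo'] };
  }
  return { filter: '[0:v]split=1[vhi]', labels: ['vhi'] };
}
```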
Restores deltacast growing-master capture (master MXF + HLS preview). No poster
tap (the incomplete recorder-thumbnails poster on the deploy node added an
mjpeg output that destabilised the shared ffmpeg; tracked separately on the
feature/recorder-thumbnails branch).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Deltacast picker's selected index is the capture channel on the single
board. Write it into source_config.port (in addition to device_index) so the
capture sidecar maps "pick channel N" to the bridge's --port N. device_index is
retained for backward-compatible display/fallback.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Parse the recorder's SOURCE_CONFIG JSON in the bootstrap and pass the deltacast
capture channel (`port`) and optional `board` into captureManager.start(), so a
recorder can select which of the board's 8 channels to capture.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Audio map: the deltacast bridge delivers audio on a separate FIFO wired as
ffmpeg input 1, so the finalized master + HLS preview (and the growing
orchestrator) now map audio via `audioMap` (1:a:0? for deltacast, 0:a:0? for
DeckLink SDI / network) instead of an unconditional 0:a:0?. Without this the
deltacast master/preview carried no audio.
- Channel/port: spawn the bridge with --device = board index (default 0) and
--port = source_config.port (falling back to the device index), so a recorder
can capture from any of the board's 8 channels. Adds `port`/`board` params to
start() and _buildInputArgs().
- Bridge stdin: the finalized-master ffmpeg reads the bridge's raw video from
pipe:0, so its stdin must be 'pipe' when a bridge is present (was 'ignore',
which made hiresProcess.stdin null and threw "Cannot read properties of null
(reading 'on')" at bridgeProcess.stdout.pipe(...)).
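As a sketch of the audio-map and stdin choices above: audioMapFor and stdioFor are illustrative names, and the stderr/stdout entries of the stdio triple are assumptions. The map strings use ffmpeg's stream-specifier syntax (input index, audio, stream index, optional).

```javascript
// Pick the ffmpeg -map argument for audio.
function audioMapFor(sourceType) {
  // Deltacast: video arrives on input 0 and the audio FIFO is input 1.
  // The trailing "?" makes the mapping optional, so a source without audio
  // does not abort ffmpeg.
  return sourceType === 'deltacast' ? '1:a:0?' : '0:a:0?';
}

// Pick the child_process.spawn stdio triple for the finalized-master ffmpeg.
function stdioFor(hasBridge) {
  // With 'ignore', child.stdin is null and piping the bridge into it throws.
  return [hasBridge ? 'pipe' : 'ignore', 'pipe', 'pipe'];
}
```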
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ffmpeg opens all inputs before processing; input 1 is the audio FIFO. The
bridge previously opened the FIFO writer only after VHD_OpenStreamHandle +
VHD_StartStream succeeded, returning early on failure / no embedded audio and
never opening the FIFO -> ffmpeg blocked forever on input 1 -> 0 fps and an
empty HLS preview. Now the FIFO writer is opened unconditionally and first,
and the audio thread feeds a continuous, wall-clock-paced s16le stereo stream
(real samples when available, otherwise silence). SIGPIPE is ignored so a
dying ffmpeg returns EPIPE instead of killing the bridge.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
On-node empirical testing of this bmx v1.6 build showed that raw2bmx's
rdd9 writer with --part already maintains a live, correct header Duration
as the file grows: ffprobe reads a growing duration mid-write (e.g. 2.04s
of a 10s clip while still recording), and the structural-metadata
Duration fields (tags 02020008 / 30020008) hold the real frame count
(0x33 = 51), not -1.
The dur-patch.py added in the previous commit searched the header for
Duration=-1 (0xFF*8) and found 0 fields on rdd9 ("[dur-patch] 0 Duration
fields"), so it was a no-op. Worse, opening the MXF r+b to patch it while
raw2bmx appends over CIFS is a concurrency hazard. Remove it entirely and
rely on raw2bmx's native growing Duration. rdd9 + --index-follows remains
the Premiere-recommended growing flavour (Sony XDCAM essence, index in the
essence partition).
Verified on-node (ffprobe/byte-probe). Live edit-while-record in Premiere
itself still requires user confirmation.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The cluster heartbeat upserts cluster_nodes ON CONFLICT (hostname), so two
machines reporting the same os.hostname() clobber each other's row. A cloned
capture VM whose /etc/hostname was "zampp1" (same as the primary) caused its
4 DeckLink cards to land on the primary's row, then get overwritten by the
primary's cardless heartbeat — so the New Recorder modal showed "No SDI
devices auto-detected" despite healthy hardware.
- node-agent now reports process.env.NODE_NAME || os.hostname() as its cluster
identity, so node identity is explicit and collision-proof.
- docker-compose.worker.yml exposes NODE_NAME to the container.
- onboard-node.sh always writes NODE_NAME to the node .env (defaults to the OS
hostname) so future onboarding pins identity even on cloned images.
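The identity rule amounts to the following. resolveNodeName is a hypothetical wrapper around the process.env.NODE_NAME || os.hostname() expression; trimming whitespace is an added assumption.

```javascript
const os = require('os');

// Explicit NODE_NAME wins; the OS hostname is only a fallback.
function resolveNodeName(env = process.env) {
  const pinned = (env.NODE_NAME || '').trim();
  return pinned || os.hostname();
}
```

With NODE_NAME=zampp2 in the node .env, a cloned VM whose /etc/hostname still says zampp1 reports itself as zampp2.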
Live remediation already applied to the zampp2 capture node: compose hostname
pinned to zampp2 and its node token rebound to zampp2; DB now reports bmd=4
for zampp2.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The previous sed/python in-place edits on the node broke capture: the
hires stderr parser was written with literal 0x08 BACKSPACE bytes instead
of regex word boundaries, so it never matched ffmpeg output.
framesReceived stayed 0, the shutdown handler saw "no frames" and marked
every asset as an error even though video was captured. The ffmpeg base
args had also been changed to -progress pipe:2, whose key=value output
puts frame= and fps= on separate lines and does not match a combined
regex.
Fixes:
- Parser: single robust regex matching ffmpeg's classic -stats line
(frame= and fps= together). No backspace bytes, no word boundaries.
- ffmpeg base args back to -stats (drop -progress pipe:2).
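A sketch of such a parser is shown below. The exact regex used in capture-manager.js is not reproduced here; STATS_RE and parseStatsLine are assumed names.

```javascript
// Matches ffmpeg's classic -stats line, where frame= and fps= share a line.
// No \b anchors, so nothing here can be mangled into a backspace byte.
const STATS_RE = /frame=\s*(\d+)\s+fps=\s*([\d.]+)/;

function parseStatsLine(line) {
  const m = STATS_RE.exec(line);
  if (!m) return null; // e.g. -progress output, one key=value per line
  return { frame: Number(m[1]), fps: Number(m[2]) };
}
```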
Growing-file (Premiere edit-while-record), per bmx thread 87ac5750 and
Drastic/Softron edit-while-ingest docs:
- raw2bmx clip type op1a -> rdd9 (Sony XDCAM / RDD-9, the flavour Premiere
reads while growing) with --index-follows so the IndexTableSegment is
written in the same partition as the essence it indexes (lets a reader
re-scanning body partitions seek toward the record head). NOT --avid-gf
(Avid OP-Atom, Media-Composer-only, needs a companion AAF).
- dur-patch.py: overwrite header Duration=-1 to 0 immediately at
clip-open (Premiere rejects -1 on import), then track the live frame
count every 3s from the last body partition IndexTableSegment. Shipped
as services/capture/dur-patch.py (/app/dur-patch.py in the image).
Deployed to wild-dragon-capture:latest on zampp2 via overlay build.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
raw2bmx v1.6 does not have a --growing-file option; using it causes
'Unknown Input Option' and immediately crashes the pipeline. The
--part interval alone is sufficient — body partitions with updated
IndexDuration are written every 30 frames, and the file has no footer
(open state) while recording, which is what Premiere's growing-file
reader polls for.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without --growing-file, raw2bmx writes body partitions via --part but
does NOT mark them as closed partitions with self-contained index table
segments. Premiere Pro's growing-file reader requires closed partitions
to safely parse an in-progress MXF and detect that the duration has
advanced — without this flag the file imports fine but never shows
growth in the timeline.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Added useEffect to parse location.hash and update route state.
Fixes deep links like /#/library not rendering correct screen.
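The hash parsing inside that effect might look roughly like this. routeFromHash and its fallback parameter are illustrative assumptions; the real effect would call it with window.location.hash and pass the result to the route state setter.

```javascript
// Extract the first path segment from a hash such as "#/library".
function routeFromHash(hash, fallback = '') {
  const m = /^#\/([^/?#]+)/.exec(hash || '');
  return m ? decodeURIComponent(m[1]) : fallback;
}
```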
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Cap monitor column at 960px width so full GUI fits 1920x1080 without scroll.
Preview now ~960×540px (16:9), leaves room for 300px rail + margins.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
ffmpeg's mxf muxer cannot write a growing file — its header/index duration
stays N/A until the footer at close (proven: file grows on disk but readable
duration never advances), so Premiere never sees growth. Replace the growing
master muxer with bmx/raw2bmx --growing-file, the reference growing-OP1a writer.
Capture image builds bmx (bbc/bmx v1.6) from source (bmxlib-tools absent in
bookworm). Growing pipeline: one ffmpeg decodes SDI -> split into MPEG-2 422
essence + PCM (to named FIFOs) + the H.264 HLS preview; raw2bmx muxes the
growing OP1a MXF to the share, updating IndexDuration incrementally. FIFO
open-order deadlock fixed by parent-priming both FIFOs. Stop forwards SIGINT
so ffmpeg EOFs and raw2bmx finalizes the footer; stop() awaits raw2bmx exit
before the promotion worker uploads. Raster/fps -> raw2bmx essence flag via
deriveGrowingRaster (default 1080i59.94).
Proven live (zampp2): IndexDuration grows 43->223->403 frames at 3/8/15s
mid-write (ffmpeg stayed N/A); finalized file valid; HLS preview intact.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The promotion worker promoted on mtime-idle (>=8s), but CIFS attribute caching
makes an actively-growing MXF look idle, so it grabbed the live file ~15s into
recording, uploaded it, flipped the asset live->ready, and unlinked it ("a
worker is stealing the file"). Gate promotion on the recorder's live status:
the growing asset's display_name is the recorder's current_session_id, so skip
promotion while a recorder with that session is status='recording'. Only
promote once recording has stopped.
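The gate can be sketched as a predicate over plain row objects. shouldPromote is a hypothetical name; the field names display_name, current_session_id and status are taken from the description above, while the in-memory array stands in for whatever query the worker actually runs.

```javascript
// A growing asset may be promoted only when no recorder is still recording
// the session it belongs to. mtime idleness is deliberately not consulted.
function shouldPromote(asset, recorders) {
  const stillRecording = recorders.some(
    (r) => r.status === 'recording' && r.current_session_id === asset.display_name,
  );
  return !stillRecording;
}
```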
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Growing root cause (4th attempt): Premiere doesn't import H.264-in-.ts
("unsupported compression type"); its growing-file support is MXF OP1a.
Prior MXF/DNxHR failed because DNxHR is VBR and never flushes the incremental
index — XDCAM HD422 (mpeg2video, CBR) DOES write index segments into body
partitions mid-record (proven live via SIGKILL: 5 index segments, readable,
no footer). Growing master is now MXF OP1a / XDCAM HD422 4:2:2 CBR + PCM s16le,
operator bitrate as CBR (default 50M). live-path returns .mxf to match.
GUI: bitrate input is now always editable in growing mode (was hidden for
ProRes-selected codecs); codec menu shown disabled-with-explanation under
growing (it had only looked "missing" due to a stale served bundle).
Requires Premiere prefs: Media > "Automatically refresh growing files" ON,
and disable the two XMP-write-on-import options.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
screens-playout.jsx declared a top-level function fmtDuration(secs) that, in
the shared global script scope, overwrote data.jsx's fmtDuration(ms). After
the playout redesign loaded, normalizeAsset(duration_ms) hit the seconds-based
version, rendering every asset duration x1000 (15000ms shown as 4:10:00).
Rename the playout-local helpers to playoutFmtDur/playoutFmtTC.
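To illustrate the unit mix-up, here are two helpers with distinct names. The clock format and the helper bodies are assumptions; only the names fmtDuration and playoutFmtDur and the ms/seconds units come from this change.

```javascript
// Shared formatter: h:mm:ss when at least one hour, otherwise m:ss.
function fmtClock(totalSeconds) {
  const s = Math.floor(totalSeconds);
  const h = Math.floor(s / 3600);
  const m = Math.floor((s % 3600) / 60);
  const pad = (n) => String(n).padStart(2, '0');
  return h > 0 ? `${h}:${pad(m)}:${pad(s % 60)}` : `${m}:${pad(s % 60)}`;
}

// data.jsx-style helper: input is milliseconds.
function fmtDuration(ms) {
  return fmtClock(ms / 1000);
}

// Playout helper under its own name: input is seconds.
function playoutFmtDur(secs) {
  return fmtClock(secs);
}
```

Feeding a millisecond value to the seconds-based helper reproduces the bug: playoutFmtDur(15000) gives 4:10:00, while fmtDuration(15000) gives 0:15.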
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>