fix(growing-files): MPEG-TS growing master + promotion-worker share mount

Root cause: MXF OP1a writes its index/duration only in the footer partition
on finalize, so a growing MXF has no footer and VLC/Premiere/ffmpeg-strict
refuse it ("Unable to open file on disk"). Separately, the proxy job pointed
at a .mov S3 key that never existed (the promotion worker watched an empty
local disk, not the SMB share), so stopping a capture caused an instant proxy
failure.

Fix: growing master is now MPEG-TS (H.264 high422 all-intra + AAC), which is
readable from the first PAT/PMT while still growing (verified mid-write decode).
hiresKey derives from the actual produced extension. Capture skips finalize for
growing recorders (leaves asset live for promotion). Promotion worker CIFS-
mounts the same growing_smb share before scanning; worker image gets cifs-utils
and worker-p4 runs privileged (local /growing bind removed). /live-path uses .ts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Zac Gaetano 2026-05-31 19:41:28 -04:00
parent 08499b93b2
commit 8a958046ef
6 changed files with 148 additions and 41 deletions


@ -112,6 +112,11 @@ services:
dockerfile: Dockerfile.gpu
image: wild-dragon-worker-gpu:latest
runtime: nvidia
# Privileged so the promotion scanner can mount the growing-files CIFS share
# at /growing (same Approach A as the capture sidecar). Without the share
# mounted the scanner watches an empty local dir and never promotes growing
# captures to S3.
privileged: true
depends_on:
- queue
- db
@ -136,7 +141,9 @@ services:
WORKER_LABEL: "zampp1 / Tesla P4"
NVIDIA_DRIVER_CAPABILITIES: video,compute,utility
volumes:
- /mnt/NVME/MAM/wild-dragon-growing:/growing
# NOTE: /growing is NOT a host bind anymore — the promotion scanner mounts
# the CIFS landing-zone share there itself (a bind would shadow it). The
# mount needs rshared propagation so the in-container CIFS mount is visible.
- /mnt/NVME/MAM/wild-dragon-media:/media
networks:
- wild-dragon


@ -210,30 +210,43 @@ const CONTAINER_EXT = {
// Growing-file (edit-while-record) master format.
//
// Premiere's "open capture in progress" / grow-on-disk support is FORMAT-
// SPECIFIC. A fragmented MP4/MOV (`+frag_keyframe+empty_moov+default_base_moof`)
// is NOT openable by Premiere as a growing file — its QuickTime importer needs
// the classic stco/stsz/stts sample tables in a single top-level moov, which a
// fragmented MOV never has while growing (samples live in moof/trun fragments).
// Symptom: "Unable to open file on disk." (Confirmed via ffprobe on zampp2: the
// growing .mov is ftyp + empty moov + repeating moof/mdat pairs, no sample
// tables.)
// Two prior attempts FAILED at the file level, both proven on zampp2:
//
// The robust, broadcast-standard growing format Premiere DOES ingest is
// MXF OP1a (`-f mxf`) carrying a Premiere-native intra codec. We use DNxHR HQ
// (4:2:2 8-bit) which ffmpeg's MXF muxer accepts (HEVC/ProRes-in-MXF are
// rejected by this build), every frame is intra so a partially-written file is
// decodable to its last complete frame, and MXF writes header + body partitions
// incrementally so readers see valid essence mid-write. The same finalized .mxf
// is also a clean, Premiere-native asset, so the promotion/finalized path stays
// valid.
// 1) Fragmented MP4/MOV (`+frag_keyframe+empty_moov+default_base_moof`):
// NOT openable by Premiere — its QuickTime importer needs the classic
// stco/stsz/stts sample tables in a single top-level moov, which a
// fragmented MOV never has while growing (samples live in moof/trun).
//
// Trade-off: DNxHR HQ is large (~22 GB/min at 1080p). Switch the profile to
// dnxhr_sq below (~half the bitrate) if disk is the constraint.
// 2) MXF OP1a / DNxHR HQ (`-f mxf`): ffmpeg can read it, but MXF OP1a writes
// its index + duration ONLY in the FOOTER partition, emitted on clean
// finalize. While growing, `ffprobe` reports `duration=N/A` and there is
// no footer/index, so VLC and Premiere REFUSE to open the in-progress
// file ("Unable to open file on disk"). Verified live: a growing .mxf
// probes `duration=N/A`; the same file after stop probes a real duration.
// MXF-while-growing on a CIFS target is therefore fundamentally unreliable
// for edit-while-record.
//
// FIX — growing MPEG-TS carrying H.264 ALL-INTRA + AAC.
// * MPEG-TS has NO footer/moov: it is a stream of fixed 188-byte packets with
// the PAT/PMT repeated periodically, so the file is valid from the first
// PAT/PMT onward. VLC and Premiere both open a
// still-growing .ts, and `ffprobe` reports a real (growing) duration from
// the continuous PCR — no finalize step is required for readability.
// * H.264 High 4:2:2 with `-g 1` makes every frame an IDR (all-intra), so a
// partially-written file decodes cleanly to its last complete frame — the
// prerequisite for edit-while-record — and Premiere ingests H.264-in-TS
// natively. (DNxHD/ProRes/PCM are NOT valid MPEG-TS payloads; verified
// ffmpeg rejects DNxHD-in-TS, hence H.264 + AAC.)
// * Verified on zampp2 MID-WRITE: ffprobe succeeds (format=mpegts,
// duration readable), and `ffmpeg -f null -` decodes both the H.264 video
// and AAC audio with exit 0 while the file is still being written.
//
// Audio: AAC (a TS-native codec Premiere imports). PCM-in-TS is tagged as
// SMPTE-302M `bin_data`, which Premiere does not reliably import as audio.
const GROWING_VIDEO_ARGS = [
'-c:v', 'dnxhd', '-profile:v', 'dnxhr_hq', '-pix_fmt', 'yuv422p',
'-c:v', 'libx264', '-profile:v', 'high422', '-pix_fmt', 'yuv422p',
'-preset', 'veryfast', '-g', '1',
];
const GROWING_EXT = 'mxf';
const GROWING_EXT = 'ts';
// ── Source-backend abstraction (issue #168) ──────────────────────────────
// The capture input was historically hard-wired to a single `-f decklink -i …`
@ -323,18 +336,19 @@ function buildEncodeArgs({
container, isNetwork, isProxy = false,
growing = false,
}) {
// ── Growing master: force MXF OP1a + DNxHR, ignoring the configured MOV/
// ProRes container/codec. This is the only combination Premiere opens as a
// growing file (see GROWING_VIDEO_ARGS above). Audio is forced to PCM,
// which MXF carries natively and Premiere ingests.
// ── Growing master: force MPEG-TS + H.264 all-intra, ignoring the configured
// MOV/ProRes container/codec. This is a format VLC and Premiere both open
// WHILE GROWING (no footer/moov to wait on — see GROWING_VIDEO_ARGS above).
// Audio is forced to AAC, a TS-native codec Premiere imports (PCM-in-TS is
// tagged as bin_data and not reliably importable as audio).
if (growing) {
const args = [];
if (isNetwork) args.push('-map', '0:v:0?', '-map', '0:a:0?');
args.push(...GROWING_VIDEO_ARGS);
if (framerate && framerate !== 'native') args.push('-r', framerate);
args.push('-c:a', 'pcm_s24le');
args.push('-c:a', 'aac', '-b:a', '256k');
if (audioChannels) args.push('-ac', String(audioChannels));
args.push('-f', 'mxf');
args.push('-f', 'mpegts');
return args;
}
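
The comment above GROWING_VIDEO_ARGS relies on MPEG-TS being readable from its first packets with no footer. A minimal sketch of what that means at the byte level follows. `inspectTsPrefix` is a hypothetical helper, not part of this commit, and it assumes plain 188-byte packets starting at offset 0 (not 192-byte M2TS). It checks the 0x47 sync byte on every whole packet, ignores a trailing partial packet as a reader of a growing file would, and reports whether a PAT (PID 0) was seen.

```javascript
// Hypothetical helper (not in this commit): structural check that the bytes
// read so far from a still-growing file look like an MPEG transport stream.
const TS_PACKET_SIZE = 188; // fixed transport packet size in bytes
const TS_SYNC_BYTE = 0x47;  // first byte of every transport packet

function inspectTsPrefix(buf) {
  // Only whole packets are examined; a partially written tail is skipped.
  const wholePackets = Math.floor(buf.length / TS_PACKET_SIZE);
  let sawPat = false;
  for (let i = 0; i < wholePackets; i++) {
    const off = i * TS_PACKET_SIZE;
    if (buf[off] !== TS_SYNC_BYTE) {
      return { ok: false, packets: i, sawPat };
    }
    // 13-bit PID: low 5 bits of byte 1 followed by all 8 bits of byte 2.
    const pid = ((buf[off + 1] & 0x1f) << 8) | buf[off + 2];
    if (pid === 0) sawPat = true; // PID 0 carries the PAT
  }
  return { ok: wholePackets > 0, packets: wholePackets, sawPat };
}
```

This is only a sanity check on container framing. It does not replace the ffprobe and `ffmpeg -f null -` decode tests described in the comment, which are what actually confirm that the video and audio decode mid-write.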
@ -511,12 +525,10 @@ class CaptureManager {
}
const sessionId = uuidv4();
const hiresExt = CONTAINER_EXT[container] || 'mov';
const proxyExt = CONTAINER_EXT[proxyContainer] || 'mp4';
const hiresKey = `projects/${projectId}/masters/${clipName}.${hiresExt}`;
// Growing-files: write master to the local SMB share instead of streaming
// to S3. Path is relative to the container's GROWING_PATH mount.
// Growing-files: write master to the SMB share instead of streaming to S3.
// Path is relative to the container's GROWING_PATH mount.
//
// Approach A: if a CIFS source is configured, mount it now. A mount failure
// is non-fatal — we fall back to S3 streaming so the recording is never
@ -525,12 +537,21 @@ class CaptureManager {
if (growingActive && GROWING_SMB_MOUNT) {
if (!mountGrowingShare()) growingActive = false; // fall back to S3
}
// Growing master is always MXF OP1a (the only Premiere-growable format here),
// regardless of the recorder's configured container — so it gets a .mxf
// extension, not hiresExt.
// Growing master is always MPEG-TS (the format VLC + Premiere open while
// growing — see GROWING_VIDEO_ARGS), regardless of the recorder's configured
// container — so it gets a .ts extension, not the container's.
const growingPath = growingActive
? `${GROWING_PATH}/${projectId}/${clipName}.${GROWING_EXT}`
: null;
// hiresKey MUST match the actual master format/destination:
// - growing active → the promotion worker uploads the on-share .ts to this
// key, so it has the .ts extension. (A stale .mov key here would make the
// proxy job download a nonexistent object → "unable to open the file on
// disk".)
// - growing fell back to S3 → the normal container extension.
const hiresExt = growingPath ? GROWING_EXT : (CONTAINER_EXT[container] || 'mov');
const hiresKey = `projects/${projectId}/masters/${clipName}.${hiresExt}`;
if (growingPath) {
try { mkdirSync(dirname(growingPath), { recursive: true }); }
catch (err) { console.error('[capture] could not create growing dir:', err.message); }
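
The key-derivation rule in this hunk can be sketched as a pure function, which makes the two cases (growing active, or fallen back to S3) easy to check in isolation. `resolveMasterTargets` and its `growingRoot` parameter are illustrative names, and the `CONTAINER_EXT` entries shown are an assumed subset, not the real table.

```javascript
// Illustrative sketch only: mirrors the hiresKey rule from this hunk.
const CONTAINER_EXT = { mov: 'mov', mp4: 'mp4', mxf: 'mxf' }; // assumed subset
const GROWING_EXT = 'ts';

function resolveMasterTargets({
  projectId, clipName, container, growingActive, growingRoot = '/growing',
}) {
  // The growing master always lands on the share as <clip>.ts.
  const growingPath = growingActive
    ? `${growingRoot}/${projectId}/${clipName}.${GROWING_EXT}`
    : null;
  // The S3 key must carry the extension of the file that will actually exist.
  const hiresExt = growingPath ? GROWING_EXT : (CONTAINER_EXT[container] || 'mov');
  const hiresKey = `projects/${projectId}/masters/${clipName}.${hiresExt}`;
  return { growingPath, hiresKey };
}
```

With `growingActive: true` and a recorder configured for `mov`, the key ends in `.ts`. With `growingActive: false` (for example after a failed mount) it ends in `.mov`, matching what the S3 streaming path produces.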


@ -135,6 +135,15 @@ async function gracefulShutdown(signal) {
console.error('[shutdown] failed to flag empty asset:', e.message);
}
}
} else if (completed.growingPath) {
// Growing-files recorder: the master lives on the SMB share as a .ts,
// NOT in S3 yet. The promotion worker (which watches the same share)
// uploads it to S3 and enqueues the proxy from the real, finalized key.
// We must NOT call /finalize here: that sets original_s3_key to a key
// that doesn't exist yet and enqueues a proxy that instantly fails with
// "unable to open the file on disk." Leave the asset 'live' for the
// promotion worker to flip to 'ready'.
console.log(`[shutdown] growing capture finalized on share (${completed.growingPath}); leaving promotion worker to upload + proxy`);
} else if (liveAssetId) {
// Finalise the pre-created live asset by id (avoids POST / 409 collision).
try {


@ -742,11 +742,14 @@ router.get('/:id/live-path', async (req, res, next) => {
const cfg = {};
for (const { key, value } of s.rows) cfg[key] = value;
if (!cfg.growing_smb_url) return res.status(409).json({ error: 'No SMB URL configured — set the editor SMB URL in Settings → Storage' });
const rec = await pool.query(
`SELECT recording_container FROM recorders WHERE current_session_id = $1 ORDER BY updated_at DESC LIMIT 1`,
[asset.id]
);
const ext = rec.rows[0]?.recording_container || 'mov';
// The growing master is ALWAYS MPEG-TS (.ts) on the share, regardless of the
// recorder's configured finalized container — that is the format VLC and
// Premiere can open WHILE it is still growing (no footer/moov to wait on).
// Pointing the editor at the recorder's `.mov`/`.mxf` container here was a
// bug: the file on the share is `<clip>.ts`, so the editor got "file not
// found / unable to open." Keep this in lock-step with GROWING_EXT in
// services/capture/src/capture-manager.js.
const ext = 'ts';
const smbRoot = cfg.growing_smb_url.replace(/\/+$/, '');
const winPath = smbRoot.replace(/^smb:\/\//, '\\\\').replace(/\//g, '\\') + `\\${asset.project_id}\\${asset.display_name}.${ext}`;
const posix = smbRoot.replace(/^smb:\/\//, '//') + `/${asset.project_id}/${asset.display_name}.${ext}`;
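
To make the path construction above concrete, here is a minimal sketch that pulls the same string transforms into a standalone function. `buildLivePaths` is a hypothetical name and the host, share and clip names in the usage note are made up for illustration.

```javascript
// Illustrative sketch: the same replace() chain as the route handler,
// isolated so the resulting editor paths can be inspected.
function buildLivePaths(smbUrl, projectId, displayName, ext = 'ts') {
  // Drop trailing slashes so joins do not produce doubled separators.
  const smbRoot = smbUrl.replace(/\/+$/, '');
  // Windows UNC form: smb://host/share -> \\host\share\<project>\<clip>.<ext>
  const winPath = smbRoot.replace(/^smb:\/\//, '\\\\').replace(/\//g, '\\')
    + `\\${projectId}\\${displayName}.${ext}`;
  // POSIX-style form: smb://host/share -> //host/share/<project>/<clip>.<ext>
  const posix = smbRoot.replace(/^smb:\/\//, '//')
    + `/${projectId}/${displayName}.${ext}`;
  return { winPath, posix };
}
```

For `buildLivePaths('smb://nas01/growing/', 'p1', 'clipA')` this yields the Windows path `\\nas01\growing\p1\clipA.ts` and the POSIX path `//nas01/growing/p1/clipA.ts`, both pointing at the `.ts` file that exists on the share.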


@ -16,8 +16,10 @@ FROM nvcr.io/nvidia/cuda:12.3.1-base-ubuntu22.04
# track YouTube's frequent changes, so we pull the latest self-contained
# release binary at build time. /usr/local/bin precedes /usr/bin on PATH, so
# `yt-dlp` resolves to this one. Rebuild the worker image to refresh it.
# cifs-utils: the promotion scanner mounts the growing-files SMB landing zone
# at /growing so it watches the SAME share the capture sidecars write to.
RUN apt-get update && apt-get install -y --no-install-recommends \
curl ca-certificates ffmpeg python3 \
curl ca-certificates ffmpeg python3 cifs-utils \
&& curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
&& apt-get install -y --no-install-recommends nodejs \
&& curl -fsSL https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp_linux \


@ -6,7 +6,8 @@
// Why a poll loop and not chokidar: NFS/SMB mounts don't reliably surface
// inotify events through the kernel; mtime polling is the boring-but-works
// answer for fairness across all storage backends.
import { readdir, stat, unlink } from 'node:fs/promises';
import { readdir, stat, unlink, mkdir, writeFile } from 'node:fs/promises';
import { execFileSync } from 'node:child_process';
import { join, relative, basename } from 'node:path';
import { createReadStream } from 'node:fs';
import { Queue } from 'bullmq';
@ -16,6 +17,66 @@ import { uploadStreamToS3 } from '../s3/client.js';
const GROWING_PATH = process.env.GROWING_PATH || '/growing';
const S3_BUCKET = process.env.S3_BUCKET || 'wild-dragon';
const POLL_MS = 5000;
const SMB_CREDS_FILE = '/run/promotion-smb-creds';
// Normalize a Windows / smb:// share path to the //host/share UNC that
// mount.cifs accepts (mirrors services/capture/src/capture-manager.js).
function toUncShare(raw) {
if (!raw) return '';
let s = String(raw).trim().replace(/\\/g, '/');
s = s.replace(/^smb:\/\//i, '//');
if (!s.startsWith('//')) s = '//' + s.replace(/^\/+/, '');
return s;
}
function isMounted(path) {
try { execFileSync('mountpoint', ['-q', path]); return true; }
catch { return false; }
}
// Mount the growing-files CIFS share at GROWING_PATH so the promotion scanner
// sees the SAME files the capture sidecar writes on the remote node. Without
// this the worker was watching a LOCAL empty /growing and never promoted any
// growing capture — the master never reached S3 and the only proxy that fired
// was the bogus one from capture's finalize call (against a key that doesn't
// exist) → "unable to open the file on disk". Best-effort + idempotent.
async function ensureGrowingShareMounted() {
const r = await query(
`SELECT key, value FROM settings WHERE key = ANY($1)`,
[['growing_smb_mount', 'growing_smb_username', 'growing_smb_password', 'growing_smb_vers']]
).catch(() => ({ rows: [] }));
const cfg = {};
for (const { key, value } of r.rows) cfg[key] = value;
const share = toUncShare(cfg.growing_smb_mount || '');
if (!share) {
console.log('[promotion] no growing_smb_mount configured — using local GROWING_PATH');
return;
}
try {
if (isMounted(GROWING_PATH)) {
console.log('[promotion] growing share already mounted at', GROWING_PATH);
return;
}
await mkdir(GROWING_PATH, { recursive: true }).catch(() => {});
await writeFile(
SMB_CREDS_FILE,
`username=${cfg.growing_smb_username || ''}\npassword=${cfg.growing_smb_password || ''}\n`,
{ mode: 0o600 }
);
const opts = [
`credentials=${SMB_CREDS_FILE}`,
'uid=0', 'gid=0', 'file_mode=0664', 'dir_mode=0775',
`vers=${cfg.growing_smb_vers || '3.0'}`,
].join(',');
execFileSync('mount', ['-t', 'cifs', share, GROWING_PATH, '-o', opts],
{ stdio: ['ignore', 'ignore', 'pipe'] });
console.log('[promotion] mounted CIFS growing share', share, '->', GROWING_PATH);
} catch (err) {
const stderr = err.stderr ? err.stderr.toString().trim() : err.message;
console.error('[promotion] CIFS mount failed (promotion will not see growing files):', stderr);
}
}
let inflight = new Set();
let idleThresholdMs = 8000;
@ -137,6 +198,10 @@ async function scan() {
// and close the queue connection during SIGTERM.
export function startPromotionWorker() {
loadIdleThreshold();
// Start mounting the SMB landing zone before scheduling scans so we watch the
// SAME share the capture sidecars write to. The mount is not awaited: the first
// scan only fires POLL_MS later, and a failed mount is best-effort (the scanner
// then falls back to local GROWING_PATH).
ensureGrowingShareMounted().catch((e) =>
console.error('[promotion] mount bootstrap failed:', e.message));
thresholdInterval = setInterval(loadIdleThreshold, 60_000);
scanInterval = setInterval(scan, POLL_MS);
console.log(`[promotion] watching ${GROWING_PATH} (idle threshold ${idleThresholdMs}ms)`);
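
As a rough illustration of what the mount bootstrap passes to `mount`, the sketch below reuses the `toUncShare` normalization from this file and assembles the argument vector without executing anything. `cifsMountArgs` is a hypothetical helper introduced here, and the share name in the usage note is made up.

```javascript
// Same normalization as toUncShare in this file: accept \\host\share,
// smb://host/share or host/share and return the //host/share form.
function toUncShare(raw) {
  if (!raw) return '';
  let s = String(raw).trim().replace(/\\/g, '/');
  s = s.replace(/^smb:\/\//i, '//');
  if (!s.startsWith('//')) s = '//' + s.replace(/^\/+/, '');
  return s;
}

// Hypothetical helper: build the argv for `mount` without running it, so the
// option string can be logged or unit-tested.
function cifsMountArgs({ share, vers }, mountPoint, credsFile) {
  const opts = [
    `credentials=${credsFile}`,
    'uid=0', 'gid=0', 'file_mode=0664', 'dir_mode=0775',
    `vers=${vers || '3.0'}`,
  ].join(',');
  return ['-t', 'cifs', toUncShare(share), mountPoint, '-o', opts];
}
```

Calling `cifsMountArgs({ share: 'smb://nas01/growing' }, '/growing', '/run/promotion-smb-creds')` produces a share argument of `//nas01/growing` and an option string ending in `vers=3.0`. Keeping the argv as an array for `execFileSync` avoids shell quoting problems, and passing credentials through a file keeps the password out of the process list.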