Phase 0.2 of the NVENC All-Intra HEVC ingest plan.
node-agent/handleSidecarStart:
- Accept useGpu: true in the sidecar start body
- When useGpu is true, add Runtime=nvidia and DeviceRequests=[gpu], and inject
NVIDIA_VISIBLE_DEVICES=all and NVIDIA_DRIVER_CAPABILITIES=video,compute,utility
into the container env. CPU-codec recorders are unaffected (useGpu defaults to false).
mam-api/recorders (start endpoint):
- Derive useGpu from recorder.recording_codec — true for hevc_nvenc/h264_nvenc
- Pass useGpu to remote sidecar start body
- Apply same Runtime/DeviceRequests to the local Docker spawn path
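A minimal sketch of the two steps above, assuming the sidecar is created from a Docker Engine API create payload. The helper names (deriveUseGpu, withGpu) are hypothetical, and the exact DeviceRequests entry is an assumption modelled on what `--gpus all` requests; the commit only states DeviceRequests=[gpu].

```javascript
// Codecs that need the NVIDIA runtime (from the list above).
const GPU_CODECS = new Set(['hevc_nvenc', 'h264_nvenc']);

// Derive the useGpu flag from a recorder row.
function deriveUseGpu(recorder) {
  return GPU_CODECS.has(recorder.recording_codec);
}

// Return a copy of a Docker create payload with GPU access added.
// Env entries are "KEY=value" strings, as the Docker Engine API expects.
function withGpu(createBody, useGpu = false) {
  if (!useGpu) return createBody; // CPU-codec recorders are unaffected
  const hostConfig = createBody.HostConfig || {};
  return {
    ...createBody,
    Env: [
      ...(createBody.Env || []),
      'NVIDIA_VISIBLE_DEVICES=all',
      'NVIDIA_DRIVER_CAPABILITIES=video,compute,utility',
    ],
    HostConfig: {
      ...hostConfig,
      Runtime: 'nvidia',
      // Assumed shape: request all GPUs with the "gpu" capability.
      DeviceRequests: [
        ...(hostConfig.DeviceRequests || []),
        { Count: -1, Capabilities: [['gpu']] },
      ],
    },
  };
}
```

Returning a copy rather than mutating the payload keeps the local spawn path and the remote sidecar body free to share one base object.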
capture/capture-manager:
- Update hevc_nvenc codec entry with all-intra flags:
-g 1 -bf 0 (every frame IDR, no B-frames — required for growing-file
edit-while-record), -rc vbr, -profile:v main10, pixFmt p010le (10-bit 4:2:0)
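For illustration, the entry could expand to ffmpeg arguments roughly as follows. The object layout and the toFfmpegArgs helper are hypothetical; only the flag values and the pixel format come from the list above.

```javascript
// Hypothetical codec-table entry; flag values are taken from the commit text.
const hevcNvencAllIntra = {
  codec: 'hevc_nvenc',
  pixFmt: 'p010le', // 10-bit 4:2:0
  flags: ['-g', '1', '-bf', '0', '-rc', 'vbr', '-profile:v', 'main10'],
};

// Expand an entry into ffmpeg output arguments.
function toFfmpegArgs(entry) {
  return ['-c:v', entry.codec, '-pix_fmt', entry.pixFmt, ...entry.flags];
}

console.log(toFfmpegArgs(hevcNvencAllIntra).join(' '));
```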
Next: validation gate (§8) — test MXF OP1a then fragmented MOV on one
DeckLink channel, mount in Premiere while recording.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Stuck-live fix: capture sidecar now finalises the pre-created live asset by id (new POST /assets/:id/finalize) instead of POSTing a new asset (409 collision); node-agent gives the sidecar a 180s stop grace so the S3 upload + callback complete; node-agent logs sidecar start/stop for diagnostics.
Live SDI monitor: HLS preview is now a 2nd output of the hires ffmpeg (single DeckLink read, split to ProRes/S3 + H.264/HLS); node-agent serves /live over HTTP; mam-api proxies GET /recorders/:id/live/* to the recorder node; web-ui HlsPreview loads from the proxied URL.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
## capture service
- capture-manager.js: add 'deltacast' source_type to _buildInputArgs.
Uses 'deltacast://<index>' with ffmpeg deltacast demuxer when
/dev/deltacast<N> exists; falls back to lavfi testsrc2 + sine test card
(matching deltacast-sdi-recorder standalone app) when hardware absent.
- routes/capture.js: add GET /devices/deltacast endpoint (enumerates
/dev/deltacast* + DELTACAST_PORT_COUNT env fallback). Extend /probe to
handle source_type=deltacast.
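The hardware/test-card fallback might look like the sketch below. It is not the actual _buildInputArgs code: the function name, the injected deviceExists callback, the `-f deltacast` selector, and the test-card size, rate, and tone values are all assumptions.

```javascript
// deviceExists is injected so the sketch runs without hardware;
// a real caller would likely pass fs.existsSync.
function buildDeltacastInputArgs(index, deviceExists) {
  if (deviceExists(`/dev/deltacast${index}`)) {
    // Assumption: the demuxer is selected with -f deltacast.
    return ['-f', 'deltacast', '-i', `deltacast://${index}`];
  }
  // Hardware absent: lavfi testsrc2 video plus a sine tone as a test card.
  return [
    '-f', 'lavfi', '-i', 'testsrc2=size=1920x1080:rate=25',
    '-f', 'lavfi', '-i', 'sine=frequency=1000:sample_rate=48000',
  ];
}
```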
## node-agent
- detectHardware(): add 'deltacast' array to capabilities payload.
Enumerates /dev/deltacast* nodes; falls back to DELTACAST_PORT_COUNT env.
Adds DELTACAST_MODEL env support. Logs dc= count in heartbeat line.
- sidecar /start: bind /dev/deltacast* device nodes into capture containers
when sourceType='deltacast'.
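A sketch of the enumeration with its env fallback, written as a pure function so it can be exercised without hardware. The function name and the shape of the returned objects are assumptions; only the /dev/deltacast* scan, DELTACAST_PORT_COUNT, and DELTACAST_MODEL come from the notes above.

```javascript
// entries: filenames found in /dev (e.g. from fs.readdirSync('/dev')).
function detectDeltacast(entries, env = {}) {
  const model = env.DELTACAST_MODEL || 'unknown';
  const found = entries
    .map((name) => /^deltacast(\d+)$/.exec(name))
    .filter(Boolean)
    .map((m) => Number(m[1]))
    .sort((a, b) => a - b);
  if (found.length > 0) {
    // Real device nodes win over the env override.
    return found.map((index) => ({
      index, device: `/dev/deltacast${index}`, model, present: true,
    }));
  }
  // No device nodes: fall back to the configured port count.
  const count = Number.parseInt(env.DELTACAST_PORT_COUNT || '0', 10) || 0;
  return Array.from({ length: Math.max(count, 0) }, (_, index) => ({
    index, device: null, model, present: false,
  }));
}
```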
## mam-api
- cluster.js: add GET /cluster/devices/deltacast and
GET /cluster/devices/deltacast/signal endpoints — same shape as
blackmagic equivalents for UI parity.
- recorders.js /start: pass DELTACAST_PORT_COUNT env to capture container;
bind /dev/deltacast* device nodes on local spawn.
- migration 024: ALTER TYPE source_type ADD VALUE 'deltacast' (idempotent).
- schema.sql: add 'deltacast' to source_type ENUM for fresh installs.
## web-ui
- modal-new-recorder.jsx: add 'Deltacast' source type card; fetch
/cluster/devices/deltacast on selection; port picker with TEST CARD
badge when hardware absent; falls through to manual index entry if
no devices detected.
Frontend / UX / a11y
- Sidebar collapse/expand toggle with localStorage persistence (#142)
- Settings sections wrap inputs in <form> with Enter-to-submit + native
validation; password autocomplete=new-password (#141, #138)
- Asset thumbnails get descriptive alt text (#140)
- Production deploy now precompiles JSX via esbuild and loads the
production React UMD instead of dev builds + in-browser Babel (#139,
#122)
- Search wrapper gets role=search; global search input gets aria-label,
role=combobox, aria-controls/aria-expanded/aria-activedescendant
wiring (#137, #135)
- Dashboard and Library no longer share the same nav icon (#136)
- Sidebar collapses off-canvas with a topbar menu button below 768 px;
mobile default is collapsed (#134)
- --text-3 bumped to #8B92A0 for WCAG AA contrast on --bg-0 (#133)
- Schedule and Library routes were rendering empty inside the .main
flex container; switched to flex:1 + min-height:0. Editor and asset
detail get the same fix (#131, #132)
- Jobs nav badge now polls /jobs?status=active every 10 s and reflects
the live count (#130, #113)
- aria-label sweep on every icon-only button (#126)
- Premiere panel release list moved to window.PREMIERE_RELEASES in
data.jsx; Editor + Settings read from the same source (#125)
- Typo setPgMclips → setPgmClips (#124)
- Stray console.error / console.warn calls gated behind
window.DF_LOG.{warn,error} (#123)
- Hardcoded /api/v1 paths route through window.ZAMPP_API_PREFIX (#115)
- Schedule rows no longer crash on null recorder_id (#117)
- EditorKeyboard guards against document.activeElement === null (#116)
- Unmount-safe timers for PasswordResetModal, Containers, Editor (#111)
- Player seek clamps below totalMs, server-side range clamping +
uncached 416 on EOF, client-side EOF-stall watchdog (#143)
- Duration badge overlap fix on narrow asset cards (#52)
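The seek and range clamping from #143 can be sketched as two pure helpers. This is an illustration under assumptions, not the shipped code: the function names, the 1 ms margin, and the choice of Cache-Control: no-store for the "uncached 416" are all hypothetical.

```javascript
// Resolve a single "bytes=start-end" Range header against a file size.
// Returns { status: 200 } when there is no usable range.
function resolveRange(header, size) {
  const m = /^bytes=(\d+)-(\d*)$/.exec(header || '');
  if (!m) return { status: 200 };
  const start = Number(m[1]);
  if (start >= size) {
    // Seek landed at or past EOF: answer 416 and keep it out of caches.
    return {
      status: 416,
      headers: { 'Content-Range': `bytes */${size}`, 'Cache-Control': 'no-store' },
    };
  }
  const requestedEnd = m[2] === '' ? size - 1 : Number(m[2]);
  if (requestedEnd < start) return { status: 200 }; // invalid range, ignore it
  const end = Math.min(requestedEnd, size - 1); // clamp to the last byte
  return {
    status: 206,
    start,
    end,
    headers: { 'Content-Range': `bytes ${start}-${end}/${size}` },
  };
}

// Client side: keep seeks strictly below the end of the clip.
function clampSeek(ms, totalMs, marginMs = 1) {
  return Math.max(0, Math.min(ms, totalMs - marginMs));
}
```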
Backend / security / reliability
- GET /recorders fixed N+1: single LATERAL JOIN for live_asset_id;
Docker inspects bounded to actually-recording rows (#121)
- Uploads now use multer.diskStorage and stream parts to S3 instead of
buffering 500 MB in RAM (#120)
- /assets list clamps limit to MAX_LIMIT=500 to prevent OOM (#119)
- SDK upload archive listing + post-extract sanitize block zip-slip /
tar-slip and symlink escapes (#118)
- Migrations track applied state in schema_migrations, run in a
transaction, and exit non-zero on failure (#107)
- node-agent BMD_COUNT override uses BMD_DEVICE_PREFIX; filesystem
detection wins (#109, #127)
- GPU_COUNT override now merges with nvidia-smi enrichment (#108)
- /cluster/heartbeat requires a node-bound token or admin user;
tokens carry bound_hostname (#106)
- /recorders/:id/start error responses no longer echo the Docker
create payload — env vars stay out of client responses (#105)
- /recorders/probe restricts schemes (srt/rtmp/rtsp/udp/rtp), blocks
private + loopback hosts for non-admins, denies common service
ports (#104)
- Scheduler tick guarded by a Postgres advisory lock; pending/running
rows claimed via UPDATE...RETURNING + FOR UPDATE SKIP LOCKED to
survive multi-node deploys (#103)
- validateUuid('id') param middleware on every /:id route (#102)
- Error handler scrubs Postgres error messages and 5xx detail (#101)
- Graceful SIGTERM/SIGINT shutdown — stops scheduler, drains the HTTP
server, ends the pool, 25 s force-exit watchdog (#100)
- AMPP sync moved from fire-and-forget to a persisted retry queue
(ampp_sync_status / attempts / next_attempt_at + scheduler retry
loop with exponential backoff) (#77)
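Three of the guards above are small enough to sketch as pure functions: the list limit clamp (#119), the archive path containment check (#118), and the retry backoff (#77). The helper names, the default limit of 50, and the 30 s base / 1 h cap are assumptions; only MAX_LIMIT=500 and the use of exponential backoff come from the notes.

```javascript
const path = require('node:path');

const MAX_LIMIT = 500;

// Clamp a client-supplied ?limit= value into [1, MAX_LIMIT].
function clampLimit(raw, fallback = 50) {
  const n = Number.parseInt(raw, 10);
  if (!Number.isFinite(n) || n < 1) return fallback;
  return Math.min(n, MAX_LIMIT);
}

// Zip-slip / tar-slip check: does an archive entry resolve inside destDir?
// This covers path traversal only; symlink escapes need a separate check.
function isInsideDir(destDir, entryName) {
  const root = path.resolve(destDir);
  const target = path.resolve(root, entryName);
  return target === root || target.startsWith(root + path.sep);
}

// Exponential backoff for the retry queue: base * 2^attempts, capped.
function nextAttemptAt(attempts, now = Date.now(), baseMs = 30000, capMs = 3600000) {
  const delay = Math.min(baseMs * 2 ** attempts, capMs);
  return new Date(now + delay);
}
```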
Migrations
- 019: api_tokens.bound_hostname (#106)
- 020: assets.ampp_sync_status + retry bookkeeping (#77)
Other
- Deferred to v1.3: #92 Growing-files per-upload toggle, #80 Audio tab,
#57 Dashboard redesign, #56 Editor SPA polish phase 3, #114 S3
migration tool
Replaced the synchronous execFileSync('docker') approach (the container has
no docker CLI) with async calls to the Docker socket HTTP API:
- POST /containers/create with nvidia runtime + DeviceRequests
- POST /containers/:id/start
- Poll inspect until not running
- GET /containers/:id/logs, strip 8-byte frame headers, parse csv
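The frame stripping in the last step can be sketched as below, assuming the container runs without a TTY so the logs endpoint returns Docker's multiplexed stream. The function names are hypothetical, and frame() exists only to build sample input.

```javascript
// Strip the 8-byte frame headers from a Docker log stream (non-TTY).
// Header layout: byte 0 = stream id (1 stdout, 2 stderr), bytes 1-3 zero,
// bytes 4-7 = payload length as a big-endian uint32.
function demuxDockerLogs(buf) {
  const out = { stdout: '', stderr: '' };
  let offset = 0;
  while (offset + 8 <= buf.length) {
    const stream = buf[offset];
    const size = buf.readUInt32BE(offset + 4);
    // Decoding per frame is fine for ASCII output such as nvidia-smi CSV.
    const payload = buf.subarray(offset + 8, offset + 8 + size).toString('utf8');
    if (stream === 2) out.stderr += payload;
    else out.stdout += payload;
    offset += 8 + size;
  }
  return out;
}

// Build one framed chunk, used here only to produce sample input.
function frame(stream, text) {
  const body = Buffer.from(text, 'utf8');
  const header = Buffer.alloc(8);
  header[0] = stream;
  header.writeUInt32BE(body.length, 4);
  return Buffer.concat([header, body]);
}
```

Keeping stdout and stderr separate means driver warnings on stderr do not end up in the CSV that gets parsed.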
probeGpusViaSmi() runs once at startup before the first heartbeat.
Result cached in _gpuCache; detectHardware() reads cache on every heartbeat.
Falls back to /dev/nvidia* scan if probe fails or runtime unavailable.
nsenter approach failed (requires SYS_ADMIN in container).
nvidia-smi bind-mount failed (Alpine vs Ubuntu glibc incompatibility).
Working solution: spawn 'docker run --rm --gpus all ubuntu:22.04 nvidia-smi'
via the Docker socket. The NVIDIA Container Runtime injects nvidia-smi and
driver libs into any container with --gpus all, regardless of the base image.
ubuntu:22.04 is already cached on GPU nodes.
Result: GPU reported with name, memory_mb, driver_version — shows as BOUND
in the cluster UI.
nvidia-smi bind-mount failed due to Alpine vs Ubuntu glibc incompatibility.
Fix: nsenter --mount=/proc/1/ns/mnt -- nvidia-smi runs in the host's mount
namespace where glibc and all NVIDIA driver libs are present.
Requires pid: host in docker-compose.worker.yml (already has network: host).
nsenter is provided by util-linux in Alpine — already in the image.
Falls back to direct nvidia-smi call (for glibc-based containers), then
to /dev/nvidia* file scan if all attempts fail.
index.js:
- detectGpusViaSmi(): runs nvidia-smi --query-gpu=index,name,memory.total,
driver_version and parses the output into structured GPU objects with
name, memory_mb, driver, device — the same fields the cluster UI uses
to determine BOUND status
- Falls back to /dev/nvidia* file scan if nvidia-smi isn't available
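The parsing step might look like this sketch. It assumes nvidia-smi was run with --format=csv,noheader,nounits (the format flags are not stated above) and that the device path is derived from the GPU index; both are assumptions, as is the function name.

```javascript
// Parse nvidia-smi CSV rows of the form: index, name, memory.total, driver_version
function parseSmiCsv(text) {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const [index, name, memory, driver] = line.split(',').map((s) => s.trim());
      return {
        name,
        memory_mb: Number(memory),
        driver,
        device: `/dev/nvidia${index}`, // assumed mapping from index to node
      };
    })
    // Drop rows whose memory column is not numeric (e.g. a header line).
    .filter((gpu) => Number.isFinite(gpu.memory_mb));
}
```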
docker-compose.worker.yml:
- Bind-mount /usr/bin/nvidia-smi and libnvidia-ml.so.1 from host into
node-agent container (read-only). These are the minimum binaries needed
for nvidia-smi to execute inside the container.
- Mounts are optional — Docker ignores them silently if paths don't exist
(e.g. on nodes without NVIDIA hardware)
Two bugs fixed:
1. SDI capture sidecar never had /dev/blackmagic bound — ffmpeg opened the
decklink input inside a container with no device nodes, so frame=0.
Fix: local spawns now push '/dev/blackmagic:/dev/blackmagic' onto Binds
when source_type='sdi'.
2. recorders.js always spawned sidecars against the local Docker socket
(zampp1), even when a recorder's node_id pointed at zampp2 (where the
card is). Fix: resolveNodeTarget() looks up the recorder's cluster node;
if it's a different hostname the sidecar is spawned via a new
POST /sidecar/start endpoint on the remote node-agent.
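The routing decision in fix 2 reduces to a hostname comparison. The sketch below is illustrative only: the node row fields (hostname, ip), the return shape, and the agent port are assumptions, not the actual resolveNodeTarget() signature.

```javascript
// Decide where a recorder's sidecar should be spawned.
// node: the recorder's cluster node row, or null when none is assigned.
// agentPort is a placeholder; the real node-agent port is not stated above.
function resolveNodeTarget(node, localHostname, agentPort = 9000) {
  if (!node || node.hostname === localHostname) {
    return { remote: false }; // spawn against the local Docker socket
  }
  return {
    remote: true,
    url: `http://${node.ip}:${agentPort}/sidecar/start`,
  };
}
```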
node-agent gains three new routes (all talk to the local Docker socket):
POST /sidecar/start — create + start container (host network,
privileged, /dev/blackmagic bind for sdi)
DELETE /sidecar/:id — stop + remove
GET /sidecar/:id/status — inspect + poll capture service
docker-compose.worker.yml: add /var/run/docker.sock and LIVE_DIR to
node-agent so it can spawn sidecars, and document build-capture prerequisite.
index.js:
In bridge mode the agent was reporting the container's 172.x address
because the first non-internal interface in os.networkInterfaces() was
docker0. Now honours NODE_IP, skips lo/docker*/br-*/veth*/etc, and
down-ranks the 172.16-31 range so real LAN IPs win. Also exposes the
detected IP on /health for the onboarding script to print.
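A sketch of the selection order described above, taking the os.networkInterfaces() result as a parameter so it can be tested with fake data. The function names are hypothetical, and the skip pattern covers only the prefixes named in the message (the "etc" is left out).

```javascript
// Interface name prefixes that never carry the node's LAN address.
const SKIP = /^(lo|docker|br-|veth)/;

// True for 172.16.0.0 - 172.31.255.255, the range that gets down-ranked.
function isDockerRange(ip) {
  const m = /^172\.(\d+)\./.exec(ip);
  return m !== null && Number(m[1]) >= 16 && Number(m[1]) <= 31;
}

// interfaces has the shape returned by os.networkInterfaces().
function pickNodeIp(interfaces, env = {}) {
  if (env.NODE_IP) return env.NODE_IP; // explicit override wins
  const candidates = [];
  for (const [name, addrs] of Object.entries(interfaces)) {
    if (SKIP.test(name)) continue;
    for (const a of addrs || []) {
      // Some Node 18 releases report family as the number 4.
      const isV4 = a.family === 'IPv4' || a.family === 4;
      if (isV4 && !a.internal) candidates.push(a.address);
    }
  }
  // Stable sort: other addresses first, 172.16-31 addresses last.
  candidates.sort((a, b) => Number(isDockerRange(a)) - Number(isDockerRange(b)));
  return candidates[0] || null;
}
```

Down-ranking instead of excluding the 172.16-31 range keeps the agent working on LANs that genuinely use those addresses.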