dragonflight

Author	SHA1	Message	Date
Zac Gaetano	f2542bc929	feat(nvenc): GPU sidecar passthrough + All-Intra HEVC capture codec Phase 0.2 of the NVENC All-Intra HEVC ingest plan. node-agent/handleSidecarStart: - Accept useGpu: true in the sidecar start body - When useGpu: adds Runtime=nvidia, DeviceRequests=[gpu], and injects NVIDIA_VISIBLE_DEVICES=all + NVIDIA_DRIVER_CAPABILITIES=video,compute,utility into the container env. CPU-codec recorders are unaffected (useGpu defaults false). mam-api/recorders (start endpoint): - Derive useGpu from recorder.recording_codec — true for hevc_nvenc/h264_nvenc - Pass useGpu to remote sidecar start body - Apply same Runtime/DeviceRequests to the local Docker spawn path capture/capture-manager: - Update hevc_nvenc codec entry with all-intra flags: -g 1 -bf 0 (every frame IDR, no B-frames — required for growing-file edit-while-record), -rc vbr, -profile:v main10, pixFmt p010le (10-bit 4:2:0) Next: validation gate (§8) — test MXF OP1a then fragmented MOV on one DeckLink channel, mount in Premiere while recording. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 12:35:23 -04:00
Zac Gaetano	92b460f503	fix(recorder): finalise live asset on stop + add live SDI monitor Stuck-live fix: capture sidecar now finalises the pre-created live asset by id (new POST /assets/:id/finalize) instead of POSTing a new asset (409 collision); node-agent gives the sidecar a 180s stop grace so the S3 upload + callback complete; node-agent logs sidecar start/stop for diagnostics. Live SDI monitor: HLS preview is now a 2nd output of the hires ffmpeg (single DeckLink read, split to ProRes/S3 + H.264/HLS); node-agent serves /live over HTTP; mam-api proxies GET /recorders/:id/live/* to the recorder node; web-ui HlsPreview loads from the proxied URL. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-29 03:20:20 +00:00
ZGaetano	888ca65045	feat(capture): Deltacast SDI framework — test-card capture, cluster detection, UI ## capture service - capture-manager.js: add 'deltacast' source_type to _buildInputArgs. Uses 'deltacast://<index>' with ffmpeg deltacast demuxer when /dev/deltacast<N> exists; falls back to lavfi testsrc2 + sine test card (matching deltacast-sdi-recorder standalone app) when hardware absent. - routes/capture.js: add GET /devices/deltacast endpoint (enumerates /dev/deltacast* + DELTACAST_PORT_COUNT env fallback). Extend /probe to handle source_type=deltacast. ## node-agent - detectHardware(): add 'deltacast' array to capabilities payload. Enumerates /dev/deltacast* nodes; falls back to DELTACAST_PORT_COUNT env. Adds DELTACAST_MODEL env support. Logs dc= count in heartbeat line. - sidecar /start: bind /dev/deltacast* device nodes into capture containers when sourceType='deltacast'. ## mam-api - cluster.js: add GET /cluster/devices/deltacast and GET /cluster/devices/deltacast/signal endpoints — same shape as blackmagic equivalents for UI parity. - recorders.js /start: pass DELTACAST_PORT_COUNT env to capture container; bind /dev/deltacast* device nodes on local spawn. - migration 024: ALTER TYPE source_type ADD VALUE 'deltacast' (idempotent). - schema.sql: add 'deltacast' to source_type ENUM for fresh installs. ## web-ui - modal-new-recorder.jsx: add 'Deltacast' source type card; fetch /cluster/devices/deltacast on selection; port picker with TEST CARD badge when hardware absent; falls through to manual index entry if no devices detected.	2026-05-28 23:12:40 +00:00
opencode	04ce096e67	chore: 1.2 ship-prep sweep, close 38 issues Frontend / UX / a11y - Sidebar collapse/expand toggle with localStorage persistence (#142) - Settings sections wrap inputs in <form> with Enter-to-submit + native validation; password autocomplete=new-password (#141, #138) - Asset thumbnails get descriptive alt text (#140) - Production deploy now precompiles JSX via esbuild and loads the production React UMD instead of dev builds + in-browser Babel (#139, #122) - Search wrapper gets role=search; global search input gets aria-label, role=combobox, aria-controls/aria-expanded/aria-activedescendant wiring (#137, #135) - Dashboard and Library no longer share the same nav icon (#136) - Sidebar collapses off-canvas with a topbar menu button below 768 px; mobile default is collapsed (#134) - --text-3 bumped to #8B92A0 for WCAG AA contrast on --bg-0 (#133) - Schedule and Library routes were rendering empty inside the .main flex container; switched to flex:1 + min-height:0 (#131, #132, editor + asset detail get the same fix) - Jobs nav badge now polls /jobs?status=active every 10 s and reflects the live count (#130, #113) - aria-label sweep on every icon-only button (#126) - Premiere panel release list moved to window.PREMIERE_RELEASES in data.jsx; Editor + Settings read from the same source (#125) - Typo setPgMclips → setPgmClips (#124) - Stray console.error / console.warn calls gated behind window.DF_LOG.{warn,error} (#123) - Hardcoded /api/v1 paths route through window.ZAMPP_API_PREFIX (#115) - Schedule rows no longer crash on null recorder_id (#117) - EditorKeyboard guards against document.activeElement === null (#116) - Unmount-safe timers for PasswordResetModal, Containers, Editor (#111) - Player seek clamps below totalMs, server-side range clamping + uncached 416 on EOF, client-side EOF-stall watchdog (#143) - Duration badge overlap fix on narrow asset cards (#52) Backend / security / reliability - GET /recorders fixed N+1: single LATERAL JOIN for live_asset_id; Docker inspects bounded to actually-recording rows (#121) - Upload disk-storage (multer.diskStorage) streams parts to S3 instead of buffering 500 MB in RAM (#120) - /assets list clamps limit to MAX_LIMIT=500 to prevent OOM (#119) - SDK upload archive listing + post-extract sanitize block zip-slip / tar-slip and symlink escapes (#118) - Migrations track applied state in schema_migrations, run in a transaction, and exit non-zero on failure (#107) - node-agent BMD_COUNT override uses BMD_DEVICE_PREFIX; filesystem detection wins (#109, #127) - GPU_COUNT override now merges with nvidia-smi enrichment (#108) - /cluster/heartbeat requires a node-bound token or admin user; tokens carry bound_hostname (#106) - /recorders/:id/start error responses no longer echo the Docker create payload; env vars stay out of client responses (#105) - /recorders/probe restricts schemes (srt/rtmp/rtsp/udp/rtp), blocks private + loopback hosts for non-admins, denies common service ports (#104) - Scheduler tick guarded by a Postgres advisory lock; pending/running rows claimed via UPDATE...RETURNING + FOR UPDATE SKIP LOCKED to survive multi-node deploys (#103) - UUID validateUuid('id') param middleware on every /:id route (#102) - Error handler scrubs Postgres error messages and 5xx detail (#101) - Graceful SIGTERM/SIGINT shutdown: stops scheduler, drains the HTTP server, ends the pool, 25 s force-exit watchdog (#100) - AMPP sync moved from fire-and-forget to a persisted retry queue (ampp_sync_status / attempts / next_attempt_at + scheduler retry loop with exponential backoff) (#77) Migrations - 019: api_tokens.bound_hostname (#106) - 020: assets.ampp_sync_status + retry bookkeeping (#77) Other - Defer #92 Growing-files per-upload toggle, #80 Audio tab, #57 Dashboard redesign, #56 Editor SPA polish phase 3, #114 S3 migration tool to v1.3	2026-05-27 02:06:14 +00:00
ZGaetano	a6f045b3d7	fix(node-agent): probe GPU via Docker API async at startup, cache result Replaced sync execFileSync('docker') approach (no docker CLI in container) with async Docker socket HTTP API calls: - POST /containers/create with nvidia runtime + DeviceRequests - POST /containers/:id/start - Poll inspect until not running - GET /containers/:id/logs, strip 8-byte frame headers, parse csv probeGpusViaSmi() runs once at startup before the first heartbeat. Result cached in _gpuCache; detectHardware() reads cache on every heartbeat. Falls back to /dev/nvidia* scan if probe fails or runtime unavailable.	2026-05-26 18:28:03 +00:00
ZGaetano	558c18e417	fix(node-agent): detect GPUs via docker run --gpus all ubuntu:22.04 nsenter approach failed (requires SYS_ADMIN in container). nvidia-smi bind-mount failed (Alpine vs Ubuntu glibc incompatibility). Working solution: spawn 'docker run --rm --gpus all ubuntu:22.04 nvidia-smi' via the Docker socket. The NVIDIA Container Runtime injects nvidia-smi and driver libs into any container with --gpus all, regardless of the base image. ubuntu:22.04 is already cached on GPU nodes. Result: GPU reported with name, memory_mb, driver_version — shows as BOUND in the cluster UI.	2026-05-26 18:25:44 +00:00
ZGaetano	5ff507b81b	fix(node-agent): use nsenter to run nvidia-smi in host mount namespace nvidia-smi bind-mount failed due to Alpine vs Ubuntu glibc incompatibility. Fix: nsenter --mount=/proc/1/ns/mnt -- nvidia-smi runs in the host's mount namespace where glibc and all NVIDIA driver libs are present. Requires pid: host in docker-compose.worker.yml (already has network: host). nsenter is provided by util-linux in Alpine — already in the image. Falls back to direct nvidia-smi call (for glibc-based containers), then to /dev/nvidia* file scan if all attempts fail.	2026-05-26 18:22:11 +00:00
ZGaetano	726343db96	fix(node-agent): bind nvidia-smi for full GPU info (name, VRAM, driver) index.js: - detectGpusViaSmi(): runs nvidia-smi --query-gpu=index,name,memory.total,driver_version and parses the output into structured GPU objects with name, memory_mb, driver, device, the same fields the cluster UI uses to determine BOUND status - Falls back to /dev/nvidia* file scan if nvidia-smi isn't available docker-compose.worker.yml: - Bind-mount /usr/bin/nvidia-smi and libnvidia-ml.so.1 from host into node-agent container (read-only). These are the minimum binaries needed for nvidia-smi to execute inside the container. - Mounts are optional: Docker ignores them silently if paths don't exist (e.g. on nodes without NVIDIA hardware)	2026-05-26 18:19:23 +00:00
ZGaetano	8186b181cc	fix(decklink): mount /dev/blackmagic in sidecar + remote node routing via node-agent Two bugs fixed: 1. SDI capture sidecar never had /dev/blackmagic bound: ffmpeg opened the decklink input inside a container with no device nodes, so frame=0. Fix: local spawns now push '/dev/blackmagic:/dev/blackmagic' onto Binds when source_type='sdi'. 2. recorders.js always spawned sidecars against the local Docker socket (zampp1), even when a recorder's node_id pointed at zampp2 (where the card is). Fix: resolveNodeTarget() looks up the recorder's cluster node; if it's a different hostname the sidecar is spawned via a new POST /sidecar/start endpoint on the remote node-agent. node-agent gains three new routes (all talk to the local Docker socket): POST /sidecar/start (create + start container; host network, privileged, /dev/blackmagic bind for sdi), DELETE /sidecar/:id (stop + remove), GET /sidecar/:id/status (inspect + poll capture service). docker-compose.worker.yml: add /var/run/docker.sock and LIVE_DIR to node-agent so it can spawn sidecars, and document build-capture prerequisite.	2026-05-21 18:51:09 -04:00
ZGaetano	3b4af6ef11	node-agent: prefer NODE_IP and skip docker bridge interfaces In bridge mode the agent was reporting the container's 172.x address because the first non-internal interface in os.networkInterfaces() was docker0. Now honours NODE_IP, skips lo/docker/br-/veth*/etc, and down-ranks the 172.16-31 range so real LAN IPs win. Also exposes the detected IP on /health for the onboarding script to print.	2026-05-21 00:15:03 -04:00
ZGaetano	cc8ee63639	fix(node-agent): replace express with built-in http — no external deps needed	2026-05-20 22:59:03 -04:00
ZGaetano	a941f609f0	feat: node-agent detects NVIDIA GPUs and Blackmagic DeckLink cards, reports in heartbeat	2026-05-20 14:18:07 -04:00
ZGaetano	c5a358888b	feat(node-agent): heartbeat agent — CPU/mem stats, health endpoint, bearer token auth	2026-05-20 13:48:18 -04:00
ZGaetano	0bc1ac9161	feat(node-agent): add Dockerfile	2026-05-20 13:47:57 -04:00
ZGaetano	feb78b8bcb	feat(node-agent): add package.json for cluster heartbeat agent	2026-05-20 13:47:53 -04:00
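
Commit 3b4af6ef11 describes the node-agent's address selection: honour NODE_IP, skip loopback and Docker-managed interfaces, and down-rank the 172.16-31 range so a real LAN IP wins. A minimal sketch of that ordering follows. The function name `pickNodeIp`, the helper `isDockerRange`, and the exact prefix list are assumptions for illustration, not the agent's actual code.

```javascript
const os = require('os');

// Interface name prefixes to ignore (assumed list; the commit says "lo/docker/br-/veth*/etc").
const SKIP_PREFIXES = ['lo', 'docker', 'br-', 'veth'];

// True for 172.16.0.0 - 172.31.255.255, the range Docker bridges commonly use.
function isDockerRange(address) {
  const match = /^172\.(\d+)\./.exec(address);
  return match !== null && Number(match[1]) >= 16 && Number(match[1]) <= 31;
}

// Hypothetical helper: returns the best IPv4 address to report, or null if none is found.
function pickNodeIp(interfaces = os.networkInterfaces(), env = process.env) {
  // An explicit override always wins.
  if (env.NODE_IP) return env.NODE_IP;

  const candidates = [];
  for (const [name, addrs] of Object.entries(interfaces)) {
    if (SKIP_PREFIXES.some((prefix) => name.startsWith(prefix))) continue;
    for (const addr of addrs || []) {
      // family is the string 'IPv4' in most Node releases and the number 4 in a few.
      const isV4 = addr.family === 'IPv4' || addr.family === 4;
      if (addr.internal || !isV4) continue;
      candidates.push(addr.address);
    }
  }

  // Stable sort: addresses outside 172.16-31 come first, original order otherwise kept.
  candidates.sort((a, b) => Number(isDockerRange(a)) - Number(isDockerRange(b)));
  return candidates[0] || null;
}
```

Passing the interface map and environment as parameters keeps the function testable without touching the host's real network configuration; in the agent it could be called with no arguments.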

15 commits
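
Commit a6f045b3d7 mentions reading GET /containers/:id/logs, stripping the 8-byte frame headers, and parsing the CSV. For a container started without a TTY, the Docker Engine API multiplexes stdout and stderr into frames whose header is one stream-type byte, three zero bytes, and a big-endian uint32 payload length. The sketch below shows one way to demultiplex such a buffer and turn nvidia-smi CSV rows into objects. The function names and the assumption that nvidia-smi ran with `--format=csv,noheader,nounits` are illustrative, not taken from the repository.

```javascript
// Hypothetical helper: split a multiplexed Docker log buffer into stdout and stderr text.
function demuxDockerLogs(buf) {
  const stdout = [];
  const stderr = [];
  let offset = 0;
  while (offset + 8 <= buf.length) {
    const streamType = buf[offset];              // 1 = stdout, 2 = stderr
    const size = buf.readUInt32BE(offset + 4);   // payload length, big-endian
    const start = offset + 8;
    const end = Math.min(start + size, buf.length); // tolerate a truncated final frame
    const chunk = buf.subarray(start, end);
    (streamType === 2 ? stderr : stdout).push(chunk);
    offset = start + size;
  }
  return {
    stdout: Buffer.concat(stdout).toString('utf8'),
    stderr: Buffer.concat(stderr).toString('utf8'),
  };
}

// Hypothetical helper: parse rows of "index, name, memory.total, driver_version".
// Output field names follow the commit messages (name, memory_mb, driver_version).
function parseGpuCsv(text) {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean)
    .map((line) => {
      const [index, name, memory, driver] = line.split(',').map((s) => s.trim());
      return {
        index: Number(index),
        name,
        memory_mb: parseInt(memory, 10), // parseInt also accepts a trailing " MiB"
        driver_version: driver,
      };
    });
}
```

A caller would typically feed the raw response body of the logs request to `demuxDockerLogs` and pass the resulting `stdout` string to `parseGpuCsv`, falling back to the /dev/nvidia* scan when the result is empty.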