Commit graph

9 commits

Author SHA1 Message Date
92b460f503 fix(recorder): finalise live asset on stop + add live SDI monitor
Stuck-live fix: capture sidecar now finalises the pre-created live asset by id (new POST /assets/:id/finalize) instead of POSTing a new asset (409 collision); node-agent gives the sidecar a 180s stop grace so the S3 upload + callback complete; node-agent logs sidecar start/stop for diagnostics.

Live SDI monitor: HLS preview is now a 2nd output of the hires ffmpeg (single DeckLink read, split to ProRes/S3 + H.264/HLS); node-agent serves /live over HTTP; mam-api proxies GET /recorders/:id/live/* to the recorder node; web-ui HlsPreview loads from the proxied URL.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 03:20:20 +00:00
opencode
04ce096e67 chore: 1.2 ship-prep sweep — close 38 issues
Frontend / UX / a11y
- Sidebar collapse/expand toggle with localStorage persistence (#142)
- Settings sections wrap inputs in <form> with Enter-to-submit + native
  validation; password autocomplete=new-password (#141, #138)
- Asset thumbnails get descriptive alt text (#140)
- Production deploy now precompiles JSX via esbuild and loads the
  production React UMD instead of dev builds + in-browser Babel (#139,
  #122)
- Search wrapper gets role=search; global search input gets aria-label,
  role=combobox, aria-controls/aria-expanded/aria-activedescendant
  wiring (#137, #135)
- Dashboard and Library no longer share the same nav icon (#136)
- Sidebar collapses off-canvas with a topbar menu button below 768 px;
  mobile default is collapsed (#134)
- --text-3 bumped to #8B92A0 for WCAG AA contrast on --bg-0 (#133)
- Schedule and Library routes were rendering empty inside the .main
  flex container — switched to flex:1 + min-height:0 (#131, #132,
  editor + asset detail get the same fix)
- Jobs nav badge now polls /jobs?status=active every 10 s and reflects
  the live count (#130, #113)
- aria-label sweep on every icon-only button (#126)
- Premiere panel release list moved to window.PREMIERE_RELEASES in
  data.jsx; Editor + Settings read from the same source (#125)
- Typo setPgMclips → setPgmClips (#124)
- Stray console.error / console.warn calls gated behind
  window.DF_LOG.{warn,error} (#123)
- Hardcoded /api/v1 paths route through window.ZAMPP_API_PREFIX (#115)
- Schedule rows no longer crash on null recorder_id (#117)
- EditorKeyboard guards against document.activeElement === null (#116)
- Unmount-safe timers for PasswordResetModal, Containers, Editor (#111)
- Player seek clamps below totalMs, server-side range clamping +
  uncached 416 on EOF, client-side EOF-stall watchdog (#143)
- Duration badge overlap fix on narrow asset cards (#52)

Backend / security / reliability
- GET /recorders fixed N+1: single LATERAL JOIN for live_asset_id;
  Docker inspects bounded to actually-recording rows (#121)
- Upload disk-storage (multer.diskStorage) streams parts to S3 instead
  of buffering 500 MB in RAM (#120)
- /assets list clamps limit to MAX_LIMIT=500 to prevent OOM (#119)
- SDK upload archive listing + post-extract sanitize block zip-slip /
  tar-slip and symlink escapes (#118)
- Migrations track applied state in schema_migrations, run in a
  transaction, and exit non-zero on failure (#107)
- node-agent BMD_COUNT override uses BMD_DEVICE_PREFIX; filesystem
  detection wins (#109, #127)
- GPU_COUNT override now merges with nvidia-smi enrichment (#108)
- /cluster/heartbeat requires a node-bound token or admin user;
  tokens carry bound_hostname (#106)
- /recorders/:id/start error responses no longer echo the Docker
  create payload — env vars stay out of client responses (#105)
- /recorders/probe restricts schemes (srt/rtmp/rtsp/udp/rtp), blocks
  private + loopback hosts for non-admins, denies common service
  ports (#104)
- Scheduler tick guarded by a Postgres advisory lock; pending/running
  rows claimed via UPDATE...RETURNING + FOR UPDATE SKIP LOCKED to
  survive multi-node deploys (#103)
- UUID validateUuid('id') param middleware on every /:id route (#102)
- Error handler scrubs Postgres error messages and 5xx detail (#101)
- Graceful SIGTERM/SIGINT shutdown — stops scheduler, drains the HTTP
  server, ends the pool, 25 s force-exit watchdog (#100)
- AMPP sync moved from fire-and-forget to a persisted retry queue
  (ampp_sync_status / attempts / next_attempt_at + scheduler retry
  loop with exponential backoff) (#77)

Migrations
- 019: api_tokens.bound_hostname (#106)
- 020: assets.ampp_sync_status + retry bookkeeping (#77)

Other
- Defer #92 Growing-files per-upload toggle, #80 Audio tab, #57
  Dashboard redesign, #56 Editor SPA polish phase 3, #114 S3
  migration tool to v1.3
2026-05-27 02:06:14 +00:00
558c18e417 fix(node-agent): detect GPUs via docker run --gpus all ubuntu:22.04
nsenter approach failed (requires SYS_ADMIN in container).
nvidia-smi bind-mount failed (Alpine vs Ubuntu glibc incompatibility).

Working solution: spawn 'docker run --rm --gpus all ubuntu:22.04 nvidia-smi'
via the Docker socket. The NVIDIA Container Runtime injects nvidia-smi and
driver libs into any container with --gpus all, regardless of the base image.
ubuntu:22.04 is already cached on GPU nodes.

Result: GPU reported with name, memory_mb, driver_version — shows as BOUND
in the cluster UI.
2026-05-26 18:25:44 +00:00
5ff507b81b fix(node-agent): use nsenter to run nvidia-smi in host mount namespace
nvidia-smi bind-mount failed due to Alpine vs Ubuntu glibc incompatibility.
Fix: nsenter --mount=/proc/1/ns/mnt -- nvidia-smi runs in the host's mount
namespace where glibc and all NVIDIA driver libs are present.

Requires pid: host in docker-compose.worker.yml (already has network: host).
nsenter is provided by util-linux in Alpine — already in the image.

Falls back to direct nvidia-smi call (for glibc-based containers), then
to /dev/nvidia* file scan if all attempts fail.
2026-05-26 18:22:11 +00:00
726343db96 fix(node-agent): bind nvidia-smi for full GPU info (name, VRAM, driver)
index.js:
- detectGpusViaSmi(): runs nvidia-smi --query-gpu=index,name,memory.total,
  driver_version and parses the output into structured GPU objects with
  name, memory_mb, driver, device — the same fields the cluster UI uses
  to determine BOUND status
- Falls back to /dev/nvidia* file scan if nvidia-smi isn't available

docker-compose.worker.yml:
- Bind-mount /usr/bin/nvidia-smi and libnvidia-ml.so.1 from host into
  node-agent container (read-only). These are the minimum binaries needed
  for nvidia-smi to execute inside the container.
- Mounts are optional — Docker ignores them silently if paths don't exist
  (e.g. on nodes without NVIDIA hardware)
2026-05-26 18:19:23 +00:00
486e3c27e4 fix(decklink): mount /dev/blackmagic in sidecar + remote node routing via node-agent
Two bugs fixed:
1. SDI capture sidecar never had /dev/blackmagic bound — ffmpeg opened the
   decklink input inside a container with no device nodes, so frame=0.
   Fix: local spawns now push '/dev/blackmagic:/dev/blackmagic' onto Binds
   when source_type='sdi'.

2. recorders.js always spawned sidecars against the local Docker socket
   (zampp1), even when a recorder's node_id pointed at zampp2 (where the
   card is). Fix: resolveNodeTarget() looks up the recorder's cluster node;
   if it's a different hostname the sidecar is spawned via a new
   POST /sidecar/start endpoint on the remote node-agent.

node-agent gains three new routes (all talk to the local Docker socket):
  POST   /sidecar/start         — create + start container (host network,
                                   privileged, /dev/blackmagic bind for sdi)
  DELETE /sidecar/:id           — stop + remove
  GET    /sidecar/:id/status    — inspect + poll capture service

docker-compose.worker.yml: add /var/run/docker.sock and LIVE_DIR to
node-agent so it can spawn sidecars, and document build-capture prerequisite.: docker-compose.worker.yml
2026-05-21 18:51:11 -04:00
40a66bae03 worker compose: run node-agent in host network mode
The agent now uses network_mode: host so os.hostname() and the network
interface list reflect the actual host instead of a per-container random
hex hostname (root cause of duplicate Zampp2 rows) and a 172.x docker
bridge IP. Also passes NODE_IP and BMD_MODEL through for explicit overrides.
2026-05-21 00:14:33 -04:00
7486539b32 feat: worker compose adds capture profile (BMD DeckLink) and GPU env vars 2026-05-20 14:19:21 -04:00
0b255e063f feat(deploy): docker-compose.worker.yml for cluster worker nodes 2026-05-20 13:48:27 -04:00