nsenter approach failed (requires SYS_ADMIN in container).
nvidia-smi bind-mount failed (Alpine vs Ubuntu glibc incompatibility).
Working solution: spawn 'docker run --rm --gpus all ubuntu:22.04 nvidia-smi'
via the Docker socket. The NVIDIA Container Runtime injects nvidia-smi and
driver libs into any container with --gpus all, regardless of the base image.
ubuntu:22.04 is already cached on GPU nodes.
Result: GPU reported with name, memory_mb, driver_version — shows as BOUND
in the cluster UI.
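For reference, a minimal sketch of the container-create body that corresponds to the `docker run` command above, assuming the agent posts it to the Docker Engine API over the socket instead of shelling out to the CLI. `buildGpuProbeSpec` is a hypothetical helper name, not taken from the code base.

```javascript
// Sketch only: request body for POST /containers/create that mirrors
//   docker run --rm --gpus all ubuntu:22.04 nvidia-smi
// buildGpuProbeSpec is a hypothetical name used for illustration.
function buildGpuProbeSpec(image = "ubuntu:22.04") {
  return {
    Image: image,
    Cmd: ["nvidia-smi"],
    HostConfig: {
      AutoRemove: true, // equivalent of --rm
      DeviceRequests: [
        // equivalent of --gpus all: Count -1 requests every available GPU
        { Driver: "nvidia", Count: -1, Capabilities: [["gpu"]] },
      ],
    },
  };
}
```

The NVIDIA runtime acts on the device request, so the base image does not need to ship nvidia-smi itself.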
nvidia-smi bind-mount failed due to Alpine vs Ubuntu glibc incompatibility.
Fix: nsenter --mount=/proc/1/ns/mnt -- nvidia-smi runs in the host's mount
namespace where glibc and all NVIDIA driver libs are present.
Requires pid: host in docker-compose.worker.yml (already has network_mode: host).
nsenter is provided by util-linux in Alpine — already in the image.
Falls back to direct nvidia-smi call (for glibc-based containers), then
to /dev/nvidia* file scan if all attempts fail.
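The fallback order described above could look roughly like the sketch below. The function name `detectGpus`, the injectable `run` and `listDev` parameters, the 10 s timeout, and the `/^nvidia\d+$/` device filter are assumptions made for illustration; the real index.js may be structured differently.

```javascript
const { execFileSync } = require("node:child_process");
const fs = require("node:fs");

// Commands tried in order: host mount namespace first, then a direct call.
const SMI_COMMANDS = [
  ["nsenter", ["--mount=/proc/1/ns/mnt", "--", "nvidia-smi"]],
  ["nvidia-smi", []],
];

// run and listDev are injectable (an assumption) so the chain can be
// exercised on machines without a GPU.
function detectGpus(
  run = (cmd, args) => execFileSync(cmd, args, { encoding: "utf8", timeout: 10000 }),
  listDev = () => fs.readdirSync("/dev"),
) {
  for (const [cmd, args] of SMI_COMMANDS) {
    try {
      return { source: cmd, output: run(cmd, args) };
    } catch {
      // Command missing or failed: fall through to the next strategy.
    }
  }
  try {
    // Last resort: count /dev/nvidia0, /dev/nvidia1, ... device nodes.
    const devices = listDev().filter((name) => /^nvidia\d+$/.test(name));
    return { source: "dev-scan", devices };
  } catch {
    return { source: "dev-scan", devices: [] };
  }
}
```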
index.js:
- detectGpusViaSmi(): runs nvidia-smi --query-gpu=index,name,memory.total,
driver_version and parses the output into structured GPU objects with
name, memory_mb, driver, device — the same fields the cluster UI uses
to determine BOUND status
- Falls back to /dev/nvidia* file scan if nvidia-smi isn't available
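A minimal sketch of the parsing step, assuming nvidia-smi is invoked with `--format=csv,noheader,nounits` (the commit text does not state the format flag) and that `device` maps to `/dev/nvidia<index>`. `parseSmiCsv` is a hypothetical name.

```javascript
// Sketch: turn nvidia-smi CSV output (one GPU per line, columns
// index,name,memory.total,driver_version) into structured objects.
// Assumes --format=csv,noheader,nounits so memory is a bare MiB number.
function parseSmiCsv(stdout) {
  return stdout
    .split("\n")
    .map((line) => line.trim())
    .filter(Boolean) // drop blank lines, e.g. the trailing newline
    .map((line) => {
      const [index, name, memory, driver] = line.split(",").map((s) => s.trim());
      return {
        name,
        memory_mb: Number.parseInt(memory, 10),
        driver,
        device: `/dev/nvidia${index}`, // assumed mapping from GPU index
      };
    });
}
```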
docker-compose.worker.yml:
- Bind-mount /usr/bin/nvidia-smi and libnvidia-ml.so.1 from host into
node-agent container (read-only). These are the minimum binaries needed
for nvidia-smi to execute inside the container.
- Mounts are meant to be optional on nodes without NVIDIA hardware. Note that
Docker does not skip a missing source path: short-syntax binds create an
empty directory there, and long-syntax binds fail.
Two bugs fixed:
1. SDI capture sidecar never had /dev/blackmagic bound — ffmpeg opened the
decklink input inside a container with no device nodes, so frame=0.
Fix: local spawns now push '/dev/blackmagic:/dev/blackmagic' onto Binds
when source_type='sdi'.
2. recorders.js always spawned sidecars against the local Docker socket
(zampp1), even when a recorder's node_id pointed at zampp2 (where the
card is). Fix: resolveNodeTarget() looks up the recorder's cluster node;
if it's a different hostname the sidecar is spawned via a new
POST /sidecar/start endpoint on the remote node-agent.
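The routing decision in fix 2 can be sketched as follows. The shape of the node records (`id`, `hostname`, `agent_url`) and the returned object are assumptions for illustration; only the function name and the hostname comparison come from the description above.

```javascript
const os = require("node:os");

// Sketch: decide whether a recorder's sidecar runs on the local Docker
// socket or on a remote node-agent. `nodes` is a hypothetical array of
// { id, hostname, agent_url } cluster records.
function resolveNodeTarget(recorder, nodes, localHostname = os.hostname()) {
  const node = nodes.find((n) => n.id === recorder.node_id);
  if (!node || node.hostname === localHostname) {
    return { kind: "local" };
  }
  // Different host: spawn through the remote node-agent instead.
  return { kind: "remote", url: `${node.agent_url}/sidecar/start` };
}
```

Falling back to a local spawn when the node is unknown is one possible choice; rejecting the request would be equally defensible.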
node-agent gains three new routes (all talk to the local Docker socket):
POST /sidecar/start — create + start container (host network,
privileged, /dev/blackmagic bind for sdi)
DELETE /sidecar/:id — stop + remove
GET /sidecar/:id/status — inspect + poll capture service
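As an illustration of what POST /sidecar/start might hand to the Docker Engine API, here is a sketch of the container spec with host networking, privileged mode, and the conditional /dev/blackmagic bind. `buildSidecarSpec` and its parameter names are hypothetical.

```javascript
// Sketch: container-create body for a capture sidecar.
// buildSidecarSpec and its parameters are illustrative names only.
function buildSidecarSpec({ image, sourceType, binds = [] }) {
  const Binds = [...binds]; // copy so the caller's array is not mutated
  if (sourceType === "sdi") {
    // SDI capture needs the DeckLink device nodes inside the container.
    Binds.push("/dev/blackmagic:/dev/blackmagic");
  }
  return {
    Image: image,
    HostConfig: { NetworkMode: "host", Privileged: true, Binds },
  };
}
```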
docker-compose.worker.yml: add /var/run/docker.sock and LIVE_DIR to
node-agent so it can spawn sidecars, and document the build-capture
prerequisite.
index.js:
In bridge mode the agent was reporting the container's 172.x address
because the first non-internal interface in os.networkInterfaces() was
docker0. Now honours NODE_IP, skips lo/docker*/br-*/veth*/etc, and
down-ranks the 172.16.0.0/12 range (172.16-31.x.x) so real LAN IPs win.
Also exposes the
detected IP on /health for the onboarding script to print.
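A rough sketch of the selection logic described above. `pickNodeIp`, the exact skip pattern (the "etc" in the description is not covered), and the injectable arguments are assumptions; the input has the shape returned by `os.networkInterfaces()`.

```javascript
const os = require("node:os");

// Interface name prefixes to ignore; a partial list based on the description.
const SKIP_IFACE = /^(lo|docker|br-|veth)/;

// True for 172.16.0.0/12, the range Docker bridge networks use by default.
function isDockerishRange(address) {
  const [a, b] = address.split(".").map(Number);
  return a === 172 && b >= 16 && b <= 31;
}

function pickNodeIp(interfaces = os.networkInterfaces(), env = process.env) {
  if (env.NODE_IP) return env.NODE_IP; // explicit override always wins
  const candidates = [];
  for (const [name, addrs] of Object.entries(interfaces)) {
    if (SKIP_IFACE.test(name)) continue;
    for (const addr of addrs || []) {
      // family is "IPv4" in most Node versions, 4 in a few 18.x releases
      const isV4 = addr.family === "IPv4" || addr.family === 4;
      if (!isV4 || addr.internal) continue;
      candidates.push(addr.address);
    }
  }
  // Stable sort: non-172.16/12 addresses first, original order otherwise.
  candidates.sort((x, y) => Number(isDockerishRange(x)) - Number(isDockerishRange(y)));
  return candidates[0] ?? null;
}
```

Down-ranking instead of excluding the 172.16.0.0/12 range keeps hosts working whose real LAN happens to sit in that range.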