dragonflight/services/node-agent
ZGaetano a6f045b3d7 fix(node-agent): probe GPU via Docker API async at startup, cache result
Replaced sync execFileSync('docker') approach (no docker CLI in container)
with async Docker socket HTTP API calls:
- POST /containers/create with nvidia runtime + DeviceRequests
- POST /containers/:id/start
- Poll inspect until not running
- GET /containers/:id/logs, strip 8-byte frame headers, parse csv

probeGpusViaSmi() runs once at startup before the first heartbeat.
Result cached in _gpuCache; detectHardware() reads cache on every heartbeat.
Falls back to /dev/nvidia* scan if probe fails or runtime unavailable.
2026-05-26 18:28:03 +00:00
..
Dockerfile feat(node-agent): add Dockerfile 2026-05-20 13:47:57 -04:00
index.js fix(node-agent): probe GPU via Docker API async at startup, cache result 2026-05-26 18:28:03 +00:00
package.json feat(node-agent): add package.json for cluster heartbeat agent 2026-05-20 13:47:53 -04:00