# Capability-Routed GPU Worker Pool + Job Node Attribution

Date: 2026-05-28 | Status: approved (design), pending implementation

## Problem

All transcode/proxy jobs run on a single zampp1 worker configured with
`NVENC_ENABLED=` (empty), so they encode on CPU with libx264 despite a Tesla P4
in the box. zampp2's L4 runs no worker (0% util). There is no visibility into
which node/GPU ran a job. Idle hardware: Tesla P4 + 2x Quadro P400 (zampp1),
L4 (zampp2).

## Goals

1. Use all GPUs, routed by capability (heavy encodes on strong cards, light
   decode-only jobs on weak cards).
2. Distribute jobs across nodes automatically.
3. Show which node + GPU ran each job in the Jobs UI.

Non-goals: autoscaling, custom scheduler beyond BullMQ competing-consumers,
multi-GPU selection inside one worker process.

## Current architecture (facts)

- BullMQ on shared Redis; queues already type-named: proxy, thumbnail,
  filmstrip, conform, trim. mam-api enqueues by type -> NO mam-api change.
- worker/src/index.js creates a Worker per queue in one process; per-queue
  `*_CONCURRENCY` envs already exist.
- proxy.js picks `gpu_codec || (gpuEnabled ? h264_nvenc : libx264)` and falls
  back to libx264 on GPU encode failure (proxy.js:181).
- Chain: proxy -> thumbnail -> filmstrip. Workers are competing consumers.

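The codec rule in the proxy.js bullet can be sketched as below. This is an illustration only, not the contents of proxy.js: `pickCodec`, `encodeWithFallback`, and the `encode` callback are hypothetical names, and `encode` stands in for whatever actually invokes ffmpeg.

```javascript
// Sketch only: names here are assumptions, not the real proxy.js API.
const CPU_CODEC = "libx264";

// Mirrors the documented rule: an explicit gpu_codec wins, otherwise
// h264_nvenc when the GPU is enabled, otherwise the CPU codec.
function pickCodec(gpuCodec, gpuEnabled) {
  return gpuCodec || (gpuEnabled ? "h264_nvenc" : CPU_CODEC);
}

// Try the selected codec once. If it fails and it was not already the CPU
// codec, retry once on CPU. Resolves to the codec that actually succeeded.
async function encodeWithFallback(encode, gpuCodec, gpuEnabled) {
  const codec = pickCodec(gpuCodec, gpuEnabled);
  try {
    await encode(codec);
    return codec;
  } catch (err) {
    if (codec === CPU_CODEC) throw err; // nothing left to fall back to
    await encode(CPU_CODEC);
    return CPU_CODEC;
  }
}
```

Returning the codec that succeeded makes a silent CPU fallback visible to the caller, which is useful when checking that heavy jobs really run on NVENC.
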
## Design

### Tiers (by queue subscription)

- HEAVY: subscribes to proxy, conform, trim. Cards: Tesla P4 (zampp1), L4
  (zampp2). NVENC_ENABLED=true -> h264_nvenc.
- LIGHT: subscribes to thumbnail, filmstrip. Cards: 2x Quadro P400 (zampp1).

P400s never subscribe to heavy queues, so a weak card cannot bottleneck a heavy
job. Strong cards do not subscribe to light queues in v1 (clean tiers; revisit
if the light backlog ever starves while the P4/L4 sit idle).

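As a rough illustration of the tier rule, the sketch below maps a worker's queue list to a tier and rejects lists that mix tiers. The queue names come from this design; the `TIERS` object and `tierOf` helper are hypothetical and not part of the current worker code.

```javascript
// Queue names are from the design; the data shape and helper are illustrative.
const TIERS = {
  heavy: ["proxy", "conform", "trim"],
  light: ["thumbnail", "filmstrip"],
};

// Return the single tier that a worker's queue list belongs to.
// Throws if the list is empty, names an unknown queue, or mixes tiers.
function tierOf(queues) {
  const tiers = new Set();
  for (const q of queues) {
    const tier = Object.keys(TIERS).find((t) => TIERS[t].includes(q));
    if (!tier) throw new Error(`unknown queue: ${q}`);
    tiers.add(tier);
  }
  if (tiers.size !== 1) {
    throw new Error(`expected exactly one tier, got ${tiers.size}`);
  }
  return [...tiers][0];
}
```

A check like this only makes sense for tiered containers; a back-compat worker that subscribes to every queue would deliberately fail it.
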
### Worker change (only code change)

Add a `WORKER_QUEUES` env (comma-separated list) to worker/src/index.js: only
create Workers for the listed queues; unset = all (back-compat). No
GPU-selection code change is needed: each container is pinned to one GPU via
`NVIDIA_VISIBLE_DEVICES` and sees it as device 0.

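A minimal sketch of the `WORKER_QUEUES` parsing, assuming the five queue names listed under "Current architecture". `ALL_QUEUES` and `selectQueues` are hypothetical names. Treating an empty string the same as unset, and rejecting unknown names, are choices made for this sketch rather than decisions recorded in the design.

```javascript
// Queue names are from the design; helper and constant names are assumptions.
const ALL_QUEUES = ["proxy", "thumbnail", "filmstrip", "conform", "trim"];

// Parse a comma-separated WORKER_QUEUES value.
// Unset (or blank) means "all queues", preserving today's behaviour.
function selectQueues(raw, allQueues = ALL_QUEUES) {
  if (raw === undefined || raw.trim() === "") return [...allQueues];
  const wanted = raw.split(",").map((s) => s.trim()).filter(Boolean);
  const unknown = wanted.filter((q) => !allQueues.includes(q));
  if (unknown.length > 0) {
    throw new Error(`unknown queue(s) in WORKER_QUEUES: ${unknown.join(", ")}`);
  }
  // Keep the canonical order so startup logs are stable across containers.
  return allQueues.filter((q) => wanted.includes(q));
}
```

In index.js this would be called once at startup, for example `selectQueues(process.env.WORKER_QUEUES)`, and Workers would be created only for the returned names. Failing fast on a typo avoids a container that starts cleanly but consumes nothing.
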
### Topology

zampp1 (docker-compose.yml + gpu overlay):

- worker-p4: NVIDIA_VISIBLE_DEVICES=<P4 uuid>, NVENC_ENABLED=true, WORKER_QUEUES=proxy,conform,trim
- worker-p400a: NVIDIA_VISIBLE_DEVICES=<P400a uuid>, WORKER_QUEUES=thumbnail,filmstrip
- worker-p400b: NVIDIA_VISIBLE_DEVICES=<P400b uuid>, WORKER_QUEUES=thumbnail,filmstrip

zampp2 (docker-compose.worker.yml + gpu overlay):

- worker-l4: NVIDIA_VISIBLE_DEVICES=<L4 uuid>, NVENC_ENABLED=true, WORKER_QUEUES=proxy,conform,trim.
  Needs REDIS_URL / DATABASE_URL / `S3_*` in zampp2 .env pointing at zampp1.

Pin by GPU UUID (`nvidia-smi -L`), not index, so index reordering does not remap cards.

### Concurrency (initial, tunable via *_CONCURRENCY)

P4 proxy 2; L4 proxy 3; conform/trim 1; P400 thumbnail 2, filmstrip 2.

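The initial values above could be expressed as per-container env and resolved as in the sketch below. The env names (`PROXY_CONCURRENCY` and so on) are an assumption based on the `*_CONCURRENCY` pattern; the real names in index.js may differ. `concurrencyFor` and its default of 1 are hypothetical.

```javascript
// Resolve a queue's concurrency from env, falling back when the value is
// missing or not a positive integer. Env names are assumed, not confirmed.
function concurrencyFor(queue, env, fallback = 1) {
  const raw = env[`${queue.toUpperCase()}_CONCURRENCY`];
  const n = Number.parseInt(raw, 10);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

// Initial values from the design, written as the env each container carries.
const INITIAL = {
  "worker-p4": { PROXY_CONCURRENCY: "2", CONFORM_CONCURRENCY: "1", TRIM_CONCURRENCY: "1" },
  "worker-l4": { PROXY_CONCURRENCY: "3", CONFORM_CONCURRENCY: "1", TRIM_CONCURRENCY: "1" },
  "worker-p400a": { THUMBNAIL_CONCURRENCY: "2", FILMSTRIP_CONCURRENCY: "2" },
  "worker-p400b": { THUMBNAIL_CONCURRENCY: "2", FILMSTRIP_CONCURRENCY: "2" },
};
```

Because BullMQ workers are competing consumers, the effective parallelism per queue is the sum across containers: with these values, up to five proxy jobs (2 on the P4, 3 on the L4) can run at once.
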
### Node/GPU attribution (phase 2)

Stamp each job with node hostname + GPU + tier on start. The mechanism is
pending confirmation of whether the Jobs screen reads the jobs DB table or live
BullMQ (appeared empty during runs):

- If DB-backed: add node/gpu cols (or result jsonb) and stamp in createWorker.
- If BullMQ-backed: include node/gpu in job data and surface via the jobs API.

Show "zampp2 / L4" in the Jobs UI row.

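Whichever storage path is chosen, the stamp itself could look like the sketch below. `NODE_NAME`, `GPU_LABEL`, and `WORKER_TIER` are hypothetical env vars that each container would have to set; they are not defined anywhere in the current design. The `NODE_NAME` override is included because `os.hostname()` inside a Docker container returns the container's hostname (the container ID by default), not the host's.

```javascript
const os = require("node:os");

// Build the attribution record for this worker process.
// All three env var names are assumptions made for this sketch.
function buildAttribution(env = process.env, hostname = os.hostname()) {
  return {
    node: env.NODE_NAME || hostname,
    gpu: env.GPU_LABEL || "unknown",
    tier: env.WORKER_TIER || "unknown",
  };
}

// Render the record the way the Jobs UI row is meant to show it.
function formatAttribution({ node, gpu }) {
  return `${node} / ${gpu}`;
}
```

With `NODE_NAME=zampp2` and `GPU_LABEL=L4`, `formatAttribution(buildAttribution())` yields `zampp2 / L4`. The same record could be written to new DB columns, a jsonb field, or job data once the Jobs screen's data source is confirmed.
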
## Risks

- zampp2: heavy NVENC job can contend with an active recording's CPU work
  (decode/mux uses CPU; GPU itself separate). Mitigate via L4 concurrency and,
  if needed, pausing heavy intake during active capture.
- P400 NVENC session caps -> covered by libx264 fallback.
- zampp2 worker needs reach to zampp1 Redis/Postgres/S3 (already proven for API).

## Verification

- Enqueue several proxies; nvidia-smi shows encoder util on P4 (zampp1) and L4
  (zampp2); P400s only on thumbnail/filmstrip.
- A heavy job never lands on a P400 (worker logs / attribution).
- Assets still reach ready with proxy+thumbnail+filmstrip.
- Phase 2: Jobs UI shows correct node/GPU per job.

## Rollout order

1. GPU tier (WORKER_QUEUES + topology + NVENC), FIRST.
2. Node/GPU attribution + Jobs UI.