feat(ui): version badge, polling fixes, asset browser hygiene, project ctx fixes
parent 2a43deb0be
commit 1e206a55fa
8 changed files with 3044 additions and 16 deletions
.env.worker (Normal file, 6 lines changed)
@@ -0,0 +1,6 @@
+
+ FC_SHM_SIZE_GB=40
+ FC_URL=http://127.0.0.1:7435
+
+ FC_SHM_SIZE_GB=40
+ FC_URL=http://127.0.0.1:7435
docs/superpowers/specs/2026-05-28-gpu-worker-pool-design.md (Normal file, 79 lines changed)
@@ -0,0 +1,79 @@
+ # Capability-Routed GPU Worker Pool + Job Node Attribution
+
+ Date: 2026-05-28 | Status: approved (design), pending implementation
+
+ ## Problem
+ All transcode/proxy jobs run on a single zampp1 worker configured with
+ `NVENC_ENABLED=` (empty) -> CPU libx264, despite a Tesla P4 in the box.
+ zampp2's L4 runs no worker (0% util). No visibility into which node/GPU ran a
+ job. Idle hardware: Tesla P4 + 2x Quadro P400 (zampp1), L4 (zampp2).
+
+ ## Goals
+ 1. Use all GPUs, routed by capability (heavy encodes on strong cards, light
+ decode-only jobs on weak cards).
+ 2. Distribute jobs across nodes automatically.
+ 3. Show which node + GPU ran each job in the Jobs UI.
+ Non-goals: autoscaling, custom scheduler beyond BullMQ competing-consumers,
+ multi-GPU selection inside one worker process.
+
+ ## Current architecture (facts)
+ - BullMQ on shared Redis; queues already type-named: proxy, thumbnail,
+ filmstrip, conform, trim. mam-api enqueues by type -> NO mam-api change.
+ - worker/src/index.js creates a Worker per queue in one process; per-queue
+ *_CONCURRENCY envs already exist.
+ - proxy.js picks gpu_codec || (gpuEnabled ? h264_nvenc : libx264) and falls
+ back to libx264 on GPU encode failure (proxy.js:181).
+ - Chain: proxy -> thumbnail -> filmstrip. Workers are competing consumers.
+
+ ## Design
+
+ ### Tiers (by queue subscription)
+ - HEAVY: subscribes proxy, conform, trim. Cards: Tesla P4 (zampp1), L4
+ (zampp2). NVENC_ENABLED=true -> h264_nvenc.
+ - LIGHT: subscribes thumbnail, filmstrip. Cards: 2x Quadro P400 (zampp1).
+ P400s never subscribe to heavy queues, so a weak card cannot bottleneck a heavy
+ job. Strong cards do not subscribe to light queues in v1 (clean tiers; revisit
+ if light backlog ever starves while P4/L4 idle).
+
+ ### Worker change (only code change)
+ Add WORKER_QUEUES env (comma list) to worker/src/index.js: only create Workers
+ for listed queues; unset = all (back-compat). No GPU-selection code change —
+ each container pinned to one GPU via NVIDIA_VISIBLE_DEVICES (sees it as dev 0).
+
+ ### Topology
+ zampp1 (docker-compose.yml + gpu overlay):
+ - worker-p4 : VISIBLE=<P4 uuid>, NVENC_ENABLED=true, WORKER_QUEUES=proxy,conform,trim
+ - worker-p400a : VISIBLE=<P400a uuid>, WORKER_QUEUES=thumbnail,filmstrip
+ - worker-p400b : VISIBLE=<P400b uuid>, WORKER_QUEUES=thumbnail,filmstrip
+ zampp2 (docker-compose.worker.yml + gpu overlay):
+ - worker-l4 : VISIBLE=<L4 uuid>, NVENC_ENABLED=true, WORKER_QUEUES=proxy,conform,trim
+ needs REDIS_URL / DATABASE_URL / S3_* in zampp2 .env pointing at zampp1.
+ Pin by GPU UUID (nvidia-smi -L) not index, so reordering does not remap cards.
+
+ ### Concurrency (initial, tunable via *_CONCURRENCY)
+ P4 proxy 2 ; L4 proxy 3 ; conform/trim 1 ; P400 thumbnail 2, filmstrip 2.
+
+ ### Node/GPU attribution (phase 2)
+ Stamp each job with node hostname + GPU + tier on start. Mechanism pending
+ confirmation of whether Jobs screen reads the jobs DB table or live BullMQ
+ (appeared empty during runs): if DB-backed add node/gpu cols (or result jsonb)
+ and stamp in createWorker; if BullMQ-backed include node/gpu in job data and
+ surface via jobs API. Show "zampp2 / L4" in the Jobs UI row.
+
+ ## Risks
+ - zampp2: heavy NVENC job can contend with an active recording's CPU work
+ (decode/mux uses CPU; GPU itself separate). Mitigate via L4 concurrency and,
+ if needed, pausing heavy intake during active capture.
+ - P400 NVENC session caps -> covered by libx264 fallback.
+ - zampp2 worker needs reach to zampp1 Redis/Postgres/S3 (already proven for API).
+
+ ## Verification
+ - Enqueue several proxies; nvidia-smi shows encoder util on P4 (zampp1) and L4
+ (zampp2); P400s only on thumbnail/filmstrip.
+ - A heavy job never lands on a P400 (worker logs / attribution).
+ - Assets still reach ready with proxy+thumbnail+filmstrip.
+ - Phase 2: Jobs UI shows correct node/GPU per job.
+
+ ## Rollout order
+ 1. GPU tier (WORKER_QUEUES + topology + NVENC) — FIRST.
+ 2. Node/GPU attribution + Jobs UI.
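Review note on the spec's "Worker change" section: the spec marks the implementation as pending, so the following is only a minimal sketch of how a WORKER_QUEUES value could be resolved into a queue list. The helper name `resolveWorkerQueues`, the `ALL_QUEUES` constant, and the choice to throw on unknown names are assumptions for illustration; only the five queue names and the "unset = all" rule come from the spec.

```javascript
// Hypothetical helper (not taken from worker/src/index.js): resolves which
// queues a worker process should subscribe to from a WORKER_QUEUES-style value.
const ALL_QUEUES = ['proxy', 'thumbnail', 'filmstrip', 'conform', 'trim'];

function resolveWorkerQueues(raw, allQueues = ALL_QUEUES) {
  // Split the comma list, trim whitespace, and drop empty entries.
  const requested = (raw || '')
    .split(',')
    .map((name) => name.trim())
    .filter(Boolean);

  // Unset or empty means "all queues", matching the spec's back-compat rule.
  if (requested.length === 0) return [...allQueues];

  // Assumption: fail fast on typos instead of silently subscribing to nothing.
  const unknown = requested.filter((name) => !allQueues.includes(name));
  if (unknown.length > 0) {
    throw new Error(`Unknown queue(s) in WORKER_QUEUES: ${unknown.join(', ')}`);
  }

  // De-duplicate while preserving the order given by the operator.
  return [...new Set(requested)];
}

// Example: the HEAVY tier from the spec's topology.
console.log(resolveWorkerQueues('proxy,conform,trim'));
```

In a worker entry point, the returned list would decide which per-queue Workers get created; with the variable unset, behaviour stays as it is today.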
services/web-ui/package-lock.json (generated, Normal file, 2912 lines changed)
File diff suppressed because it is too large
@@ -89,8 +89,8 @@ function AssetDetail({ asset, onClose }) {
  if (!streamUrl || streamType !== 'hls' || !videoRef.current) return;
  if (!window.Hls) return;
  const hls = new window.Hls();
- hls.loadSource(streamUrl);
  hls.attachMedia(videoRef.current);
+ hls.on(window.Hls.Events.MEDIA_ATTACHED, function() { hls.loadSource(streamUrl); });
  return function() { hls.destroy(); };
  }, [streamUrl, streamType]);

@@ -500,7 +500,7 @@ function AssetDetail({ asset, onClose }) {
  const msg = err ? `MediaError code=${err.code} message=${err.message || '(none)'}` : 'unknown error';
  setPlayerState('error');
  setPlayerError(msg);
- console.error('[player]', msg, e);
+ window.DF_LOG?.debug('[player]', msg, e);
  }}
  onEnded={function() { setPlaying(false); setPlayerState('paused'); }}
  />

@@ -991,7 +991,10 @@ function RecorderRow({ recorder: initialRecorder, onRefresh, onConfigure, nodeOn
  // Project override for this take. Defaults to the recorder's configured project.
  const [takeProjectId, setTakeProjectId] = React.useState(initialRecorder.project_id || PROJECTS[0]?.id || '');
  const [confirm, confirmModal] = window.useConfirm();
- const isRec = recorder.status === 'recording';
+ // Override status immediately on toggle (prevents stale badge until next poll)
+ const [statusOverride, setStatusOverride] = React.useState(null);
+ const displayStatus = statusOverride || recorder.status;
+ const isRec = displayStatus === 'recording';

  // Keep takeProjectId in sync if the recorder row changes (e.g. after a refresh).
  React.useEffect(() => {
@@ -1058,6 +1061,7 @@ function RecorderRow({ recorder: initialRecorder, onRefresh, onConfigure, nodeOn
  const toggle = () => {
  if (pending) return;
  const action = isRec ? 'stop' : 'start';
+ setStatusOverride(action === 'stop' ? 'standby' : 'recording'); // optimistic
  setPending(true);
  setErr(null);
  setRecorder(r => ({ ...r, status: isRec ? 'idle' : 'recording' }));
@@ -1073,6 +1077,7 @@ function RecorderRow({ recorder: initialRecorder, onRefresh, onConfigure, nodeOn
  setPending(false);
  // Clear the clip name on a successful stop so the next take starts fresh.
  // Leave takeProjectId as-is (operator likely wants the same project for the next take).
+ setStatusOverride(null); // clear override, let real status take over
  if (action === 'stop') setClipName('');
  onRefresh();
  window.dispatchEvent(new CustomEvent('df:recorders-changed'));
@@ -1082,7 +1087,7 @@ function RecorderRow({ recorder: initialRecorder, onRefresh, onConfigure, nodeOn
  window.dispatchEvent(new CustomEvent('df:assets-changed'));
  }
  })
- .catch(e => { setPending(false); setErr(e.message || 'Failed'); setRecorder(initialRecorder); });
+ .catch(e => { setStatusOverride(null); setPending(false); setErr(e.message || 'Failed'); setRecorder(initialRecorder); });
  };

  const isEnabled = recorder.enabled === true;
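Review note on the RecorderRow hunks: the statusOverride logic can be read as a small state machine, sketched below as pure functions so it can be checked outside React. The function names `displayStatus`, `beginToggle`, and `settleToggle` are hypothetical; the status strings are the ones used in the hunks. One thing the sketch makes visible: the override uses 'standby' for a stop, while the unchanged `setRecorder` line still writes 'idle'. Whether both are valid recorder statuses cannot be confirmed from this diff.

```javascript
// Hypothetical pure helpers mirroring the statusOverride logic in RecorderRow.

// The badge shows the override when one is set, otherwise the polled status.
function displayStatus(serverStatus, override) {
  return override || serverStatus;
}

// On toggle: derive the action from what is currently displayed and set an
// optimistic override so the badge changes before the next poll arrives.
function beginToggle(serverStatus, override) {
  const isRec = displayStatus(serverStatus, override) === 'recording';
  const action = isRec ? 'stop' : 'start';
  return { action, override: action === 'stop' ? 'standby' : 'recording' };
}

// On success and on failure alike, the override is cleared so the real
// status takes over again.
function settleToggle() {
  return null;
}

// Walk through one start request against a recorder in standby.
const started = beginToggle('standby', null);
console.log(started.action, displayStatus('standby', started.override));
console.log(displayStatus('recording', settleToggle()));
```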
@@ -191,7 +191,7 @@ function Library({ navigate, onOpenAsset, openProject, onClearProject, onOpenPro
  // to 'ready' (with thumbnail) without a manual reload. Also pull once on
  // mount so uploads/imports created on other screens appear immediately.
  const hasLive = React.useMemo(
- () => allAssets.some(a => a.status === 'live' || a.status === 'processing' || a.status === 'ingesting'),
+ () => allAssets.some(a => ['live','processing','ingesting','recording'].includes(a.status)),
  [allAssets]
  );
  React.useEffect(() => {
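Review note on the `hasLive` change: the new predicate adds 'recording' to the statuses that keep polling active. A minimal standalone sketch of the same check follows, hoisting the status list into a Set so it is not rebuilt per asset. The names `IN_FLIGHT_STATUSES` and `hasInFlightAssets` are assumptions; the four status strings are taken from the hunk.

```javascript
// Statuses that should keep the Library polling, per the new hasLive predicate.
const IN_FLIGHT_STATUSES = new Set(['live', 'processing', 'ingesting', 'recording']);

// Hypothetical equivalent of the hasLive callback, usable outside React.
function hasInFlightAssets(assets) {
  return assets.some((asset) => IN_FLIGHT_STATUSES.has(asset.status));
}

// Example: a library where one asset is still being recorded.
console.log(hasInFlightAssets([{ status: 'ready' }, { status: 'recording' }]));
```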
@@ -701,16 +701,25 @@ function AssetCard({ asset, onOpen, onContextMenu, onDownload, onDragStart, drag
  setHovered(false);
  };

- // HLS wiring
+ // HLS wiring - safe: attachMedia before loadSource
  React.useEffect(function() {
- if (!hovered || !hoverStream || hoverStream.type !== 'hls' || !videoRef.current) return;
- if (!window.Hls) return;
- hlsRef.current = new window.Hls({ maxBufferLength: 10 });
- hlsRef.current.loadSource(hoverStream.url);
- hlsRef.current.attachMedia(videoRef.current);
- return function() {
- if (hlsRef.current) { hlsRef.current.destroy(); hlsRef.current = null; }
- };
+ if (!hovered || !hoverStream || !videoRef.current) return;
+ var vid = videoRef.current;
+ if (hoverStream.type !== 'hls') {
+ vid.src = hoverStream.url; vid.play().catch(function() {});
+ return function() { vid.pause(); vid.src = ''; };
+ }
+ if (!window.Hls || !window.Hls.isSupported()) {
+ vid.src = hoverStream.url; vid.play().catch(function() {});
+ return function() { vid.pause(); vid.src = ''; };
+ }
+ var hls = new window.Hls({ maxBufferLength: 8, startLevel: 0, autoStartLoad: true });
+ hls.attachMedia(vid);
+ hls.on(window.Hls.Events.MEDIA_ATTACHED, function() { hls.loadSource(hoverStream.url); });
+ hls.on(window.Hls.Events.MANIFEST_PARSED, function() { vid.play().catch(function() {}); });
+ hls.on(window.Hls.Events.ERROR, function(_e, data) { if (data.fatal) hls.destroy(); });
+ hlsRef.current = hls;
+ return function() { hls.destroy(); hlsRef.current=null; vid.pause(); vid.removeAttribute('src'); };
  }, [hovered, hoverStream]);

  const showVideo = hovered && hoverStream;
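Review note on the HLS wiring: both AssetDetail and AssetCard now attach the media element first and call `loadSource` only once `MEDIA_ATTACHED` fires. The sketch below isolates that ordering so it can be exercised without a browser. `startHls` and `FakeHls` are hypothetical names; `FakeHls` is a stand-in that only records calls and is not hls.js itself. In the UI the injected class would be `window.Hls`.

```javascript
// Sketch of the attach-then-load ordering adopted in the hunks above.
// `Hls` is injected so the function can run outside a browser.
function startHls(Hls, video, url) {
  const hls = new Hls({ maxBufferLength: 8 });
  hls.attachMedia(video);
  // Load the manifest only after the media element is attached.
  hls.on(Hls.Events.MEDIA_ATTACHED, () => hls.loadSource(url));
  // Tear down on fatal errors, as the AssetCard hunk does.
  hls.on(Hls.Events.ERROR, (_event, data) => { if (data.fatal) hls.destroy(); });
  return { hls, cleanup: () => hls.destroy() };
}

// Minimal stand-in for hls.js, for illustration only: it records the calls
// it receives and lets the caller fire events by hand.
class FakeHls {
  constructor() { this.calls = []; this.handlers = {}; }
  attachMedia() { this.calls.push('attachMedia'); }
  loadSource(url) { this.calls.push('loadSource:' + url); }
  destroy() { this.calls.push('destroy'); }
  on(event, handler) { this.handlers[event] = handler; }
  emit(event, data) { if (this.handlers[event]) this.handlers[event](event, data); }
}
FakeHls.Events = { MEDIA_ATTACHED: 'hlsMediaAttached', ERROR: 'hlsError' };

// Demonstration: attach, simulate the attached event, then clean up.
const demo = startHls(FakeHls, {}, 'stream.m3u8');
demo.hls.emit(FakeHls.Events.MEDIA_ATTACHED);
demo.cleanup();
console.log(demo.hls.calls.join(' -> '));
```

The demonstration logs the recorded call order, which shows `loadSource` running only after the attach event.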
@@ -729,7 +738,6 @@ function AssetCard({ asset, onOpen, onContextMenu, onDownload, onDragStart, drag
  <video
  key={hoverStream.url}
  ref={videoRef}
- src={hoverStream.type !== 'hls' ? hoverStream.url : undefined}
  autoPlay
  muted
  loop
@@ -796,8 +804,9 @@ function ProjectContextMenu({ project, x, y, onClose, onRename }) {
  onClick={function(e) { e.stopPropagation(); }}
  onContextMenu={function(e) { e.preventDefault(); e.stopPropagation(); }}>
  <div className="ctx-header">{project.name}</div>
+ <button onClick={function() { onClose(); if (window._dfOpenProject) window._dfOpenProject(project); }}><Icon name="folder" size={11} />Open project</button>
  <button onClick={function() { onClose(); onRename(project); }}><Icon name="edit" size={11} />Rename project…</button>
- <button onClick={function() { onClose(); window.ZAMPP_API.fetch('/projects/' + project.id, { method: 'DELETE' }).then(function() { window.location.reload(); }).catch(function(e) { alert('Delete failed: ' + e.message); }); }} className="danger"><Icon name="trash" size={11} />Delete project</button>
+ <button onClick={function() { window.ZAMPP_API.fetch('/projects/' + project.id, { method: 'DELETE' }).then(function() { onClose(); window.dispatchEvent(new CustomEvent('df:projects-changed')); }).catch(function(e) { alert('Delete failed: ' + e.message); }); }} className="danger"><Icon name="trash" size={11} />Delete project</button>
  </div>
  );
  }

@@ -213,6 +213,7 @@ function Sidebar({ active, onNavigate, me, collapsed, onToggle }) {
  </button>
  )}
  </div>
+ <div className="app-version">β 0.56</div>
  </aside>
  );
  }

@@ -469,3 +469,19 @@ a { color: inherit; text-decoration: none; }
  pointer-events: none;
  border: 1px solid var(--border-strong);
  }
+
+ /* ========== App version badge ========== */
+ .app-version {
+ position: fixed;
+ bottom: 8px;
+ left: 0;
+ width: var(--sidebar-w, 232px);
+ text-align: center;
+ font-size: 10px;
+ font-family: var(--font-mono);
+ color: var(--text-3);
+ opacity: 0.5;
+ pointer-events: none;
+ user-select: none;
+ z-index: 10;
+ }