WildDragonLLC/dragonflight

Author	SHA1	Message	Date
Zac Gaetano	8ea750f5df	feat(playback): HLS VOD rendition for browser (supplements MP4 proxy) Browser playback of recorded assets moves to HLS, retiring the MP4 range-stitching path for VOD. MP4 proxy is kept for the Premiere panel. - worker/hls.js: remuxToHls() stream-copies the proxy MP4 → fMP4 HLS (playlist.m3u8 + init.mp4 + segment_*.m4s) via existing segmentToHls, uploads to hls/<id>/, sets assets.hls_s3_key. hlsWorker backfills from an existing proxy. - proxy.js: generate HLS inline after the MP4 upload (local file, no re-download, no re-encode); best-effort/non-fatal. - worker/index.js: register 'hls' worker wherever 'proxy' runs. - mam-api: GET /assets/:id/hls/:file serves playlist/init/segments as whole-object GETs (no Range → sidesteps RustFS bug), strict filename validation. /stream prefers hls_s3_key (type:'hls'). reprocess?type=hls backfills. Migration 025 adds assets.hls_s3_key. - Frontend unchanged: hls.js path already handles type:'hls'. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-29 16:18:15 -04:00
Zac Gaetano	fdec2e307d	feat(worker): capability-routed GPU worker pool + per-node job attribution WORKER_QUEUES env lets a worker subscribe to a subset of queues. Deploy one GPU-pinned container per card: heavy encodes (proxy/conform/trim) on Tesla P4 (zampp1) + L4 (zampp2) via NVENC; light jobs (thumbnail/filmstrip) on the 2x Quadro P400 (zampp1). BullMQ competing-consumers distribute across nodes. RUN_PROMOTION gates the growing-files scanner to one worker. Each worker stamps WORKER_LABEL onto job data so the Jobs UI Node column shows which node/GPU ran each job. Redis/DB/S3 for the zampp2 worker come from its .env (pointed at zampp1). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-29 04:00:10 +00:00
ZGaetano	a03c85f08a	feat: server-side filmstrip worker + fix scheduler crash + fix clip freeze Root causes found: 1. Scheduler crashing every 15s: assets table has no error_message column. Fix: remove error_message from UPDATE in scheduler.js (#66 regression). 2. Clip freezing: client-side filmstrip seek loop runs on main thread, seeks same proxy the player is streaming → both stall → freeze. Fix: replace browser seek loop entirely with server-side FFmpeg worker. 3. No dedicated filmstrip worker: filmstrip was never pre-built server-side. Changes: - services/mam-api/src/db/migrations/018-add-filmstrip-s3-key.sql Add filmstrip_s3_key TEXT column to assets table - services/worker/src/workers/filmstrip.js (new) BullMQ worker: downloads proxy, runs FFmpeg fps filter to extract 28 evenly-spaced JPEG frames, base64-encodes them, uploads JSON array to S3 at filmstrips/<assetId>.json, stores key in DB - services/worker/src/workers/thumbnail.js Queue filmstrip job automatically after thumbnail completes - services/worker/src/index.js Register filmstrip worker (concurrency=2), export filmstripQueue singleton, close it on SIGTERM - services/mam-api/src/routes/assets.js - filmstripQueue added - POST /reprocess?type=filmstrip now supported - GET /:id/filmstrip returns signed S3 URL for JSON frames - services/mam-api/src/routes/jobs.js filmstrip queue visible in Jobs UI - services/web-ui/public/screens-asset.jsx Replace browser seek loop with fetch of /assets/:id/filmstrip → fetch S3 JSON → render frames. Zero browser-side video seeking. Right-click and Files tab re-generate via API endpoint.	2026-05-26 16:39:44 +00:00
ZGaetano	bacdb9f49c	fix(worker): close all Queue singletons + promotion intervals on SIGTERM (issue #94 bugs 4, 7, 10)	2026-05-26 07:38:08 -04:00
Zac Gaetano	c312991bac	feat: implement advanced features (conform, auto-relink, GUI redesign, docs, tests) - #30 FCP XML Export & Conform: slide panel UI, preset system, FCP XML generation, conform job submission with progress polling via BullMQ - #31 Hi-Res Auto-Relink: clip list with checkboxes, batch-trim server endpoint, trimWorker with frame-accurate FFmpeg trimming, auto-relink in Premiere via ExtendScript, temp segment signed URL endpoint - #32 GUI Redesign: complete rewrite with Wild Dragon OKLCH design tokens (accent oklch(45% 0.20 266)), slide panels, preset cards, chip components - #34 Cleanup Task: existing task validated and properly registered - #35 Testing: comprehensive 33-scenario E2E test plan - #36 Documentation: advanced features guide with workflows, troubleshooting, presets table, and architecture overview - #24 PR merge: verified mergeable All server endpoints, worker queues, and ExtendScript functions wired together	2026-05-24 13:19:24 -04:00
Zac Gaetano	91325a4267	fix(jobs): real cancel for active jobs + multi-threaded thumbnail worker DELETE /jobs/:id was throwing "404 not found" when the operator tried to cancel a running job. BullMQ refuses job.remove() while a job is in the active state; the route caught that error and fell through to the 404 branch, which was misleading because the job actually exists — the queue was just refusing to drop it from under the worker. Fix: - Detect 'active' state explicitly and call moveToFailed(err, '0', false) first. Token '0' bypasses the per-worker lock check (the operator-side cancel doesn't hold the worker lock). That transitions active -> failed and frees the queue's concurrency slot. - If moveToFailed itself fails (lock owned by a live worker), fall back to job.discard() so at least the result is thrown away. - If remove() then fails (stalled, broken state), drop the job's Redis key directly via queue.client. Last-resort obliteration. - Stop swallowing getJob() errors — if Redis is sad, surface it via next(err) instead of returning a misleading 404. - Return { cancelled: true } when the job was active, so the client can show "Cancelled" rather than "Removed" in any future toast. While here: thumbnail jobs now run with concurrency 4 by default (proxy 2, conform 1, import 1 unchanged). Every queue defaulted to concurrency 1 before, so a single stalled job blocked the entire queue. All three are overridable via PROXY_CONCURRENCY / THUMBNAIL_CONCURRENCY / CONFORM_CONCURRENCY env vars for nodes with more headroom. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 17:23:07 -04:00
Zac Gaetano	9ad88e4df4	feat(ingest): YouTube importer — paste link, asset travels normal pipeline Adds Ingest → YouTube. UI takes a URL + project, API enqueues a BullMQ "import" job, worker shells out to yt-dlp, lands the MP4 in S3 at the same originals/{assetId}/... path uploads use, then hands off to the existing proxy queue. Imported assets share one lifecycle with uploads from that point on. Worker container picks up yt-dlp + python3 (apk on alpine, apt on the GPU variant). The new 'import' queue is registered in jobs.js so it appears in the Jobs SSE stream and retry/delete work for free. Spec: docs/superpowers/specs/2026-05-23-youtube-importer-design.md Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-23 16:05:41 -04:00
Zac Gaetano	328f7b4f31	feat: live HLS preview, proxy worker fixes, Settings tabs, growing-files + Premiere panel - worker/proxy: scale-to-even filter, analyzeduration 100M, skip images, hasAudio - worker/promotion: SMB landing zone -> S3 on idle, queues proxy job, status='ready' - web-ui screens-ingest: HlsPreview component replaces fake LiveStrip/FauxFrame - web-ui screens-admin: functional Settings tabs (S3, GPU, Growing, SDI, AMPP) - mam-api /settings/growing: GET/PUT growing-files config - mam-api /assets/:id/live-path: SMB UNC/POSIX path for live growing assets - capture-manager: GROWING_ENABLED -> write hires to /growing instead of S3 stream - recorders.js: pass GROWING_ENABLED to capture container, bind /growing mount - docker-compose: mount /mnt/NVME/MAM/wild-dragon-growing on mam-api + worker - premiere-plugin: Mount Live button, Relink-to-HiRes, live->ready status poll	2026-05-22 19:12:53 -04:00
Zac	562881f0db	fix(jobs): stall detection + manual kill button so 5h-stuck actives can't happen A thumbnail job from earlier stayed 'active' for 6+ hours: worker was restarted at 70% progress, BullMQ left it in the active set, and there was no stall reaper because the worker was created with only the default options. Worker now passes stalledInterval: 30000, lockDuration: 60000, lockRenewTime: 15000, maxStalledCount: 1 to the Worker constructor. If a run dies, BullMQ reclaims the job back to waiting within 30s and a 'stalled' event is logged. Otherwise the lock is renewed mid-job. Jobs UI gains a 'Kill' button per row next to Details. Calls DELETE /api/v1/jobs/:id which already removes the job from Redis. Use it on any row that looks stuck.	2026-05-17 19:10:19 -04:00
ZGaetano	cc174c4977	Fix worker/index.js: job.progress is a property not a function in BullMQ v3+	2026-05-16 00:46:53 -04:00
Zac Gaetano	1a1f34a468	add services/worker/src/index.js	2026-04-07 21:58:18 -04:00
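
Note on 8ea750f5df (strict filename validation): the commit says GET /assets/:id/hls/:file only serves the playlist, the init segment, and media segments. A minimal sketch of such an allow-list follows. The function names are hypothetical, and the assumption that segment names are `segment_` plus digits is mine; the commit only states `segment_*.m4s`.

```javascript
// Hypothetical allow-list for the :file parameter. Anything that is not one
// of the three object shapes named in the commit is rejected outright.
// Assumption: segments are numbered with decimal digits only.
const HLS_FILE_PATTERN = /^(playlist\.m3u8|init\.mp4|segment_\d+\.m4s)$/;

function isValidHlsFile(file) {
  return typeof file === 'string' && HLS_FILE_PATTERN.test(file);
}

// Builds the object key under the hls/<id>/ prefix described in the commit,
// or returns null when the filename fails validation.
function hlsObjectKey(assetId, file) {
  if (!isValidHlsFile(file)) return null;
  return `hls/${assetId}/${file}`;
}
```

Because the pattern is anchored at both ends, path separators and traversal sequences such as `../` can never match, so the route could answer 400 or 404 before touching S3.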
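
Note on fdec2e307d (WORKER_QUEUES): the commit says the variable lets a worker subscribe to a subset of queues. One way that selection might look is sketched below. The comma-separated format, the "empty means all queues" fallback, and the function name are assumptions; the queue names are the ones mentioned elsewhere in this log.

```javascript
// Queue names taken from the commit messages in this history.
const ALL_QUEUES = ['proxy', 'conform', 'trim', 'thumbnail', 'filmstrip', 'hls', 'import'];

// Hypothetical parser: returns the queues this worker should subscribe to.
// Assumption: WORKER_QUEUES is a comma-separated list; unset or blank
// means "run every queue".
function selectQueues(rawValue, allQueues = ALL_QUEUES) {
  if (!rawValue || !rawValue.trim()) return [...allQueues];
  const wanted = rawValue.split(',').map((s) => s.trim()).filter(Boolean);
  const unknown = wanted.filter((q) => !allQueues.includes(q));
  if (unknown.length > 0) {
    // Fail fast so a typo does not silently leave a GPU node idle.
    throw new Error(`Unknown queue(s) in WORKER_QUEUES: ${unknown.join(', ')}`);
  }
  return [...new Set(wanted)]; // drop duplicates, keep first-seen order
}
```

Under these assumptions a heavy-encode container would be started with `WORKER_QUEUES=proxy,conform,trim` and a light one with `WORKER_QUEUES=thumbnail,filmstrip`.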
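
Note on a03c85f08a (28 evenly-spaced frames): the commit extracts the frames with an FFmpeg fps filter. The arithmetic behind "evenly spaced" can be sketched as below. Both helper names are hypothetical, and sampling at the midpoint of each slice is my assumption, not something the commit states.

```javascript
// Frames per second needed so that a clip of the given duration yields
// roughly `frameCount` frames when sampled at a constant rate.
function filmstripSampleRate(durationSeconds, frameCount = 28) {
  if (!(durationSeconds > 0)) throw new RangeError('duration must be positive');
  return frameCount / durationSeconds;
}

// Timestamps (in seconds) at the midpoint of each of `frameCount` equal
// slices of the clip. Assumption: midpoints, to avoid the first and last frame.
function filmstripTimestamps(durationSeconds, frameCount = 28) {
  if (!(durationSeconds > 0)) throw new RangeError('duration must be positive');
  const step = durationSeconds / frameCount;
  return Array.from({ length: frameCount }, (_, i) => (i + 0.5) * step);
}
```

For a 56-second proxy this gives a sample rate of 0.5 frames per second, i.e. one frame every two seconds.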
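
Note on 91325a4267 (per-queue concurrency): the commit gives defaults of thumbnail 4, proxy 2, conform 1, import 1, with three of them overridable through environment variables. A sketch of how those values might be resolved follows. The function name and the rule that invalid or non-positive values fall back to the default are assumptions.

```javascript
// Defaults and override variable names as stated in the commit message.
const DEFAULT_CONCURRENCY = { proxy: 2, thumbnail: 4, conform: 1, import: 1 };
const OVERRIDE_VARS = {
  proxy: 'PROXY_CONCURRENCY',
  thumbnail: 'THUMBNAIL_CONCURRENCY',
  conform: 'CONFORM_CONCURRENCY',
};

// Hypothetical resolver: returns the concurrency to use for each queue.
// Assumption: anything that does not parse to a positive integer is ignored.
function resolveConcurrency(env = process.env) {
  const resolved = { ...DEFAULT_CONCURRENCY };
  for (const [queue, varName] of Object.entries(OVERRIDE_VARS)) {
    const parsed = Number.parseInt(env[varName], 10);
    if (Number.isInteger(parsed) && parsed > 0) resolved[queue] = parsed;
  }
  return resolved;
}
```

The resulting number would then be passed as the `concurrency` option when each BullMQ Worker is constructed.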