Commit graph

9 commits

Author SHA1 Message Date
a03c85f08a feat: server-side filmstrip worker + fix scheduler crash + fix clip freeze
Root causes found:
1. Scheduler crashing every 15s: assets table has no error_message column.
   Fix: remove error_message from UPDATE in scheduler.js (#66 regression).

2. Clip freezing: client-side filmstrip seek loop runs on main thread,
   seeks same proxy the player is streaming → both stall → freeze.
   Fix: replace browser seek loop entirely with server-side FFmpeg worker.

3. No dedicated filmstrip worker: filmstrip was never pre-built server-side.

Changes:
- services/mam-api/src/db/migrations/018-add-filmstrip-s3-key.sql
  Add filmstrip_s3_key TEXT column to assets table

- services/worker/src/workers/filmstrip.js (new)
  BullMQ worker: downloads proxy, runs FFmpeg fps filter to extract
  28 evenly-spaced JPEG frames, base64-encodes them, uploads JSON
  array to S3 at filmstrips/<assetId>.json, stores key in DB

- services/worker/src/workers/thumbnail.js
  Queue filmstrip job automatically after thumbnail completes

- services/worker/src/index.js
  Register filmstrip worker (concurrency=2), export filmstripQueue
  singleton, close it on SIGTERM

- services/mam-api/src/routes/assets.js
  - filmstripQueue added
  - POST /reprocess?type=filmstrip now supported
  - GET /:id/filmstrip returns signed S3 URL for JSON frames

- services/mam-api/src/routes/jobs.js
  filmstrip queue visible in Jobs UI

- services/web-ui/public/screens-asset.jsx
  Replace browser seek loop with fetch of /assets/:id/filmstrip
  → fetch S3 JSON → render frames. Zero browser-side video seeking.
  Right-click and Files tab re-generate via API endpoint.
2026-05-26 16:39:44 +00:00
bacdb9f49c fix(worker): close all Queue singletons + promotion intervals on SIGTERM (issue #94 bugs 4, 7, 10) 2026-05-26 07:38:08 -04:00
c312991bac feat: implement advanced features (conform, auto-relink, GUI redesign, docs, tests)
- #30 FCP XML Export & Conform: slide panel UI, preset system, FCP XML generation,
  conform job submission with progress polling via BullMQ
- #31 Hi-Res Auto-Relink: clip list with checkboxes, batch-trim server endpoint,
  trimWorker with frame-accurate FFmpeg trimming, auto-relink in Premiere via
  ExtendScript, temp segment signed URL endpoint
- #32 GUI Redesign: complete rewrite with Wild Dragon OKLCH design tokens
  (accent oklch(45% 0.20 266)), slide panels, preset cards, chip components
- #34 Cleanup Task: existing task validated and properly registered
- #35 Testing: comprehensive 33-scenario E2E test plan
- #36 Documentation: advanced features guide with workflows, troubleshooting,
  presets table, and architecture overview
- #24 PR merge: verified mergeable

All server endpoints, worker queues, and ExtendScript functions wired together
2026-05-24 13:19:24 -04:00
91325a4267 fix(jobs): real cancel for active jobs + multi-threaded thumbnail worker
DELETE /jobs/:id was throwing "404 not found" when the operator tried to
cancel a running job. BullMQ refuses job.remove() while a job is in the
active state; the route caught that error and fell through to the
404 branch, which was misleading because the job actually exists — the
queue was just refusing to drop it from under the worker.

Fix:
- Detect 'active' state explicitly and call moveToFailed(err, '0', false)
  first. Token '0' bypasses the per-worker lock check (the operator-side
  cancel doesn't hold the worker lock). That transitions active -> failed
  and frees the queue's concurrency slot.
- If moveToFailed itself fails (lock owned by a live worker), fall back
  to job.discard() so at least the result is thrown away.
- If remove() then fails (stalled, broken state), drop the job's Redis
  key directly via queue.client. Last-resort obliteration.
- Stop swallowing getJob() errors — if Redis is sad, surface it via
  next(err) instead of returning a misleading 404.
- Return { cancelled: true } when the job was active, so the client
  can show "Cancelled" rather than "Removed" in any future toast.

While here: thumbnail jobs now run with concurrency 4 by default
(proxy 2, conform 1, import 1 unchanged). Every queue defaulted to
concurrency 1 before, so a single stalled job blocked the entire queue.
All three are overridable via PROXY_CONCURRENCY / THUMBNAIL_CONCURRENCY
/ CONFORM_CONCURRENCY env vars for nodes with more headroom.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 17:23:07 -04:00
9ad88e4df4 feat(ingest): YouTube importer — paste link, asset travels normal pipeline
Adds Ingest → YouTube. UI takes a URL + project, API enqueues a BullMQ
"import" job, worker shells out to yt-dlp, lands the MP4 in S3 at the
same originals/{assetId}/... path uploads use, then hands off to the
existing proxy queue. Imported assets share one lifecycle with uploads
from that point on.

Worker container picks up yt-dlp + python3 (apk on alpine, apt on the
GPU variant). The new 'import' queue is registered in jobs.js so it
appears in the Jobs SSE stream and retry/delete work for free.

Spec: docs/superpowers/specs/2026-05-23-youtube-importer-design.md

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 16:05:41 -04:00
328f7b4f31 feat: live HLS preview, proxy worker fixes, Settings tabs, growing-files + Premier panel
- worker/proxy: scale-to-even filter, analyzeduration 100M, skip images, hasAudio
- worker/promotion: SMB landing zone -> S3 on idle, queues proxy job, status='ready'
- web-ui screens-ingest: HlsPreview component replaces fake LiveStrip/FauxFrame
- web-ui screens-admin: functional Settings tabs (S3, GPU, Growing, SDI, AMPP)
- mam-api /settings/growing: GET/PUT growing-files config
- mam-api /assets/:id/live-path: SMB UNC/POSIX path for live growing assets
- capture-manager: GROWING_ENABLED -> write hires to /growing instead of S3 stream
- recorders.js: pass GROWING_ENABLED to capture container, bind /growing mount
- docker-compose: mount /mnt/NVME/MAM/wild-dragon-growing on mam-api + worker
- premiere-plugin: Mount Live button, Relink-to-HiRes, live->ready status poll
2026-05-22 19:12:53 -04:00
Zac
562881f0db fix(jobs): stall detection + manual kill button so 5h-stuck actives can't happen
A thumbnail job from earlier stayed 'active' for 6+ hours: worker was restarted at 70% progress, BullMQ left it in the active set, and there was no stall reaper because the worker was created with only the default options.

Worker now passes stalledInterval: 30000, lockDuration: 60000, lockRenewTime: 15000, maxStalledCount: 1 to the Worker constructor. If a run dies, BullMQ reclaims the job back to waiting within 30s and a 'stalled' event is logged. Otherwise the lock is renewed mid-job.

Jobs UI gains a 'Kill' button per row next to Details. Calls DELETE /api/v1/jobs/:id which already removes the job from Redis. Use it on any row that looks stuck.
2026-05-17 19:10:19 -04:00
cc174c4977 Fix worker/index.js: job.progress is a property not a function in BullMQ v3+ 2026-05-16 00:46:53 -04:00
1a1f34a468 add services/worker/src/index.js 2026-04-07 21:58:18 -04:00