dragonflight/services/worker/src
Zac 562881f0db fix(jobs): stall detection + manual kill button so 5h-stuck actives can't happen
A thumbnail job from earlier stayed 'active' for 6+ hours: worker was restarted at 70% progress, BullMQ left it in the active set, and there was no stall reaper because the worker was created with only the default options.

Worker now passes stalledInterval: 30000, lockDuration: 60000, lockRenewTime: 15000, maxStalledCount: 1 to the Worker constructor. If a run dies, BullMQ reclaims the job back to waiting within 30s and a 'stalled' event is logged. Otherwise the lock is renewed mid-job.

Jobs UI gains a 'Kill' button per row next to Details. Calls DELETE /api/v1/jobs/:id which already removes the job from Redis. Use it on any row that looks stuck.
2026-05-17 19:10:19 -04:00
..
db add services/worker/src/db/client.js 2026-04-07 21:58:20 -04:00
edl add services/worker/src/edl/parser.js 2026-04-07 21:58:20 -04:00
ffmpeg fix(infra+workers): S3 creds, ffprobe, BullMQ awaits, thumbnail seek, bin optional, docker-compose vars, jobs Redis, recorders stop codes: executor.js 2026-05-16 00:29:49 -04:00
s3 fix(infra+workers): S3 creds, ffprobe, BullMQ awaits, thumbnail seek, bin optional, docker-compose vars, jobs Redis, recorders stop codes: client.js 2026-05-16 00:29:48 -04:00
workers fix(infra+workers): S3 creds, ffprobe, BullMQ awaits, thumbnail seek, bin optional, docker-compose vars, jobs Redis, recorders stop codes: thumbnail.js 2026-05-16 00:29:51 -04:00
index.js fix(jobs): stall detection + manual kill button so 5h-stuck actives can't happen 2026-05-17 19:10:19 -04:00