Commit graph

6 commits

Author SHA1 Message Date
3c7cc1a77f fix(worker): retry transient S3 aborts + reuse one keep-alive client
Burn test: 5 assets errored during proxy with 'aborted'/'socket hang up'
during the master DOWNLOAD. The masters all exist in S3 (262-269MB) — it's
the connection-limited RustFS backend dropping streams when 8 jobs hammer it
at once. Two fixes:

1. downloadFromS3/uploadToS3 now retry transient failures (aborted, socket
   hang up, ECONNRESET, timeout, 5xx, throttle) up to 5x with exponential
   backoff, cleaning the partial file between download attempts. A single
   mid-stream abort no longer errors the whole asset.

2. Reuse ONE shared S3 client instead of createS3Client()+client.destroy()
   per call. The per-call destroy tore down the keep-alive agent's sockets
   every time, so connection pooling never happened and each transfer opened
   fresh connections — exactly what overwhelmed RustFS. A long-lived client
   lets the keep-alive pool actually be reused.
2026-06-04 16:56:11 +00:00
b27b9f6909 fix(s3): keep-alive agents + long timeouts to end socket starvation
Root cause of stuck 'processing', failed deletes, and dead playback:

The mam-api proxies media (/video, /hls pipe the full S3 body through
Express), holding long-lived streaming sockets. With the SDK's default
http agents (no keep-alive, unbounded but unpooled) those streams starved
control-plane calls — DeleteObject and the proxy worker's master download
— which timed out (10s connectionTimeout) in bursts.

Fixes:
- mam-api S3 client: dedicated keep-alive http/https Agents (maxSockets 256)
  + requestTimeout raised 30s→300s so large master GETs finish.
- worker S3 client: previously had NO handler config at all (SDK defaults).
  Added keep-alive agents + 600s requestTimeout so proxy/conform master
  downloads (hundreds of MB) don't stall and leave assets in 'processing'.
2026-06-04 12:53:28 +00:00
opencode
a86c1c72f9 fix(player): stitch S3 ranges around RustFS empty-body bug (#143)
RustFS returns empty bodies for ranged GETs whose start offset is past
~5.9 MB on single-file proxy MP4s. HEAD reports correct size, full GET
(`bytes=0-`) works, but `bytes=8179166-` comes back 206 + correct
Content-Range header with zero bytes. Confirmed via direct S3 probe
against broadcastmgmt.cloud/dragonmam (see scratch tests).

Workaround in mam-api `GET /api/v1/assets/:id/video` until the proxy
worker emits HLS (planned v1.2.1):

  - HEAD the object first to learn total size (also gives ETag /
    Last-Modified for conditional requests).
  - No-Range / unparseable-Range / pre-EOF requests \u2192 plain pipe.
  - Parsed `bytes=N-M` requests below RUSTFS_RANGE_SAFE_START
    (default 5_500_000) \u2192 direct ranged GET, RustFS handles fine.
  - Anything reaching into the broken zone \u2192 stream from offset 0,
    drop bytes below start, stop at end. Memory stays flat; extra
    bandwidth = (end+1 - requested-size) per seek.
  - Genuinely out-of-range \u2192 416 with Cache-Control: no-store so the
    browser doesn't poison its cache.

Also stashes (not yet wired up) the HLS pieces we'll need for the
follow-up: `segmentToHls` ffmpeg helper + `uploadDirectoryToS3`
worker s3 helper. Harmless additions; not referenced by any code path
yet.

Confirmed against the affected asset (a72aaa03-...): bytes=0-100k +
50% +100k native pass-through; 70% +100k and near-EOF previously hung
the browser, now stream correctly via the stitched path.

Refs #143.
2026-05-27 02:38:42 +00:00
328f7b4f31 feat: live HLS preview, proxy worker fixes, Settings tabs, growing-files + Premier panel
- worker/proxy: scale-to-even filter, analyzeduration 100M, skip images, hasAudio
- worker/promotion: SMB landing zone -> S3 on idle, queues proxy job, status='ready'
- web-ui screens-ingest: HlsPreview component replaces fake LiveStrip/FauxFrame
- web-ui screens-admin: functional Settings tabs (S3, GPU, Growing, SDI, AMPP)
- mam-api /settings/growing: GET/PUT growing-files config
- mam-api /assets/:id/live-path: SMB UNC/POSIX path for live growing assets
- capture-manager: GROWING_ENABLED -> write hires to /growing instead of S3 stream
- recorders.js: pass GROWING_ENABLED to capture container, bind /growing mount
- docker-compose: mount /mnt/NVME/MAM/wild-dragon-growing on mam-api + worker
- premiere-plugin: Mount Live button, Relink-to-HiRes, live->ready status poll
2026-05-22 19:12:53 -04:00
b2da06b4cc fix(infra+workers): S3 creds, ffprobe, BullMQ awaits, thumbnail seek, bin optional, docker-compose vars, jobs Redis, recorders stop codes: client.js 2026-05-16 00:29:48 -04:00
0537b7ab44 add services/worker/src/s3/client.js 2026-04-07 21:58:20 -04:00