dragonflight/services/worker/src/s3
ZGaetano 3c7cc1a77f fix(worker): retry transient S3 aborts + reuse one keep-alive client
Burn test: 5 assets errored during proxy with 'aborted'/'socket hang up'
while downloading the master. The masters all exist in S3 (262-269MB), so
the data is not missing: the connection-limited RustFS backend drops streams
when 8 jobs hammer it at once. Two fixes:

1. downloadFromS3/uploadToS3 now retry transient failures (aborted, socket
   hang up, ECONNRESET, timeout, 5xx, throttle) up to 5x with exponential
   backoff, cleaning the partial file between download attempts. A single
   mid-stream abort no longer errors the whole asset.

2. Reuse ONE shared S3 client instead of createS3Client()+client.destroy()
   per call. The per-call destroy tore down the keep-alive agent's sockets
   every time, so connection pooling never happened and each transfer opened
   fresh connections — exactly what overwhelmed RustFS. A long-lived client
   lets the keep-alive pool actually be reused.
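
A minimal sketch of the two fixes above, not the actual contents of client.js.
Several things are assumptions: the helper names (isTransient, withRetry,
getSharedClient), the 500 ms base delay, reading "up to 5x" as 5 attempts in
total, and the $metadata.httpStatusCode check, which presumes AWS SDK v3 style
errors.

```javascript
// Hypothetical classifier for the transient failures listed in the commit:
// aborted, socket hang up, ECONNRESET, timeout, 5xx, throttle.
function isTransient(err) {
  const msg = String((err && err.message) || "").toLowerCase();
  // Assumption: AWS SDK v3 style errors expose the HTTP status here.
  const status = err && err.$metadata && err.$metadata.httpStatusCode;
  return (
    msg.includes("aborted") ||
    msg.includes("socket hang up") ||
    msg.includes("timeout") ||
    msg.includes("throttl") ||
    Boolean(err && err.code === "ECONNRESET") ||
    (typeof status === "number" && status >= 500)
  );
}

// Runs `attempt` up to `maxAttempts` times. After each transient failure it
// calls the optional `cleanup` (e.g. delete the partial download) and waits
// baseDelayMs, 2x, 4x, ... before trying again. Non-transient errors and the
// final failure are rethrown unchanged.
async function withRetry(attempt, options = {}) {
  const { maxAttempts = 5, baseDelayMs = 500, cleanup } = options;
  for (let n = 1; ; n++) {
    try {
      return await attempt(n);
    } catch (err) {
      if (n >= maxAttempts || !isTransient(err)) throw err;
      if (cleanup) await cleanup();
      const delayMs = baseDelayMs * 2 ** (n - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Lazily creates one client and returns the same instance to every caller,
// so the keep-alive sockets behind it survive between transfers. The factory
// is injected here only to keep the sketch self-contained.
let sharedClient = null;
function getSharedClient(createClient) {
  if (sharedClient === null) sharedClient = createClient();
  return sharedClient;
}
```

For a download, the cleanup callback could remove the partial file with
fs.promises.rm(path, { force: true }) so that each attempt starts from an
empty destination. The shared client is never destroyed per call, which is
the point: destroying it would close the pooled sockets again.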
2026-06-04 16:56:11 +00:00
client.js fix(worker): retry transient S3 aborts + reuse one keep-alive client 2026-06-04 16:56:11 +00:00