No graceful shutdown handler — SIGTERM kills mam-api mid-tick, leaks Redis + Docker sockets #100

Closed
opened 2026-05-26 18:18:01 -04:00 by zgaetano · 1 comment
Owner

Fixed in 04ce096. SIGTERM and SIGINT now run gracefulShutdown, which stops the scheduler tick, drains in-flight HTTP requests via server.close(), ends the PG pool, and exits 0. A 25 s force-exit watchdog (unref'd) ends the process if any of those steps gets stuck.
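
To illustrate the unref'd watchdog mentioned above, here is a minimal sketch. It is not the code from 04ce096: armWatchdog is a hypothetical helper name, and the idea is only that the timer is armed when shutdown starts and is unref'd so it cannot keep the process alive once cleanup has finished.

```javascript
// Hypothetical helper (not from the commit): arm a force-exit timer
// that does not, by itself, keep the Node.js event loop alive.
function armWatchdog(ms, onTimeout) {
  const timer = setTimeout(onTimeout, ms);
  // unref() lets the process exit normally if cleanup finishes first;
  // the callback only runs if something else is still holding the loop open.
  timer.unref();
  return timer;
}

// Arm it at the start of shutdown, not at module load: an unref'd timer
// still fires while the HTTP server keeps the event loop busy.
const watchdog = armWatchdog(25000, () => {
  console.error("Shutdown timed out, forcing exit");
  process.exit(1);
});

console.log(watchdog.hasRef()); // false
```

Because the timer is unref'd, a shutdown that completes in time needs no explicit clearTimeout before the process exits.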

Author
Owner

Fix Plan — #100 No graceful shutdown handler

Root cause: mam-api has no SIGTERM/SIGINT handler, so docker stop kills the process mid-tick. The result is corrupted scheduler state, leaked BullMQ Redis connections, a leaked PG pool, and leaked intervals.

Fix — add to src/index.js:

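let shuttingDown = false;

const shutdown = async (signal) => {
  // Ignore repeated signals while a shutdown is already in progress
  if (shuttingDown) return;
  shuttingDown = true;
  console.log(`Received ${signal}, shutting down gracefully...`);

  // Hard timeout: force exit if shutdown takes longer than 10s.
  // Armed here rather than at module load, where it would fire 10s after startup.
  setTimeout(() => process.exit(1), 10000).unref();

  try {
    // 1. Stop accepting new requests; resolves once in-flight requests finish
    const serverClosed = server
      ? new Promise((resolve) => server.close(resolve))
      : Promise.resolve();

    // 2. Clear intervals
    if (heartbeatInterval) clearInterval(heartbeatInterval);
    if (loginPruneInterval) clearInterval(loginPruneInterval);

    // 3. Stop scheduler
    if (schedulerInterval) clearInterval(schedulerInterval);

    await serverClosed;

    // 4. Close all BullMQ queues (drop missing queues before calling close())
    const queues = [
      proxyQueue,
      thumbnailQueue,
      conformQueue,
      uploadQueue,
      importQueue,
      trimQueue,
    ].filter(Boolean);
    await Promise.all(queues.map((queue) => queue.close()));

    // 5. Close PG pool
    if (pool) await pool.end();

    process.exit(0);
  } catch (err) {
    console.error("Graceful shutdown failed:", err);
    process.exit(1);
  }
};

process.on("SIGTERM", () => shutdown("SIGTERM"));
process.on("SIGINT", () => shutdown("SIGINT"));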

Files: src/index.js, src/routes/jobs.js, src/routes/assets.js, src/routes/upload.js, src/routes/auth.js
Effort: ~2h
Priority: P0 (data integrity)
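
One detail of the queue-closing step may be worth a sketch: if a single close() rejects inside Promise.all, the remaining shutdown steps (including the PG pool) never run. The snippet below is an assumption-laden illustration, not part of the fix: closeAll is a hypothetical helper, and the fake queue objects stand in for BullMQ queues. It uses Promise.allSettled so every resource gets a chance to close and failures are collected instead of aborting the sequence.

```javascript
// Hypothetical helper (not from the plan): close every resource that exists
// and return the errors, rather than stopping at the first rejection.
async function closeAll(resources) {
  // Drop entries that were never created (undefined/null) before calling close()
  const present = resources.filter(Boolean);
  // The async arrow turns a synchronous throw inside close() into a rejection
  const results = await Promise.allSettled(
    present.map(async (resource) => resource.close())
  );
  return results
    .filter((result) => result.status === "rejected")
    .map((result) => result.reason);
}

// Stand-ins for real queues, for illustration only
const fakeQueues = [
  { close: async () => {} },
  undefined,
  { close: async () => { throw new Error("redis unreachable"); } },
];

closeAll(fakeQueues).then((errors) => {
  console.log(errors.length); // 1
});
```

A shutdown handler could log the returned errors and still go on to end the PG pool, choosing the exit code based on whether any close failed.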

Reference: WildDragonLLC/dragonflight#100