duration_ms/file_size are int8; node-postgres returned them as strings,
a footgun for any consumer doing arithmetic/sorting/comparison (already
hand-patched once in playout totals). Register a global int8 type parser
so the API emits real numbers. All such values are < 2^53 (no precision loss).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Drop in the redesigned timeline-centric Playout (PGM monitor, transport,
SCTE-35 card, as-run drawer) from the on-node redesign, fully wired to the
real playout API (channels/transport/HLS preview w/ error-recovery/as-run);
no mock data. In-page ConfirmModal for destructive actions.
SCTE-35: new playout_scte_breaks table (migration 033), endpoints to
schedule/trigger/list/cancel breaks (POST/GET/DELETE /channels/:id/scte[/trigger]),
scheduler due-break sweep, engine triggerScte + auto-return + as-run 'scte'
rows + on-air SCTE-BREAK state and timeline AD markers. In-stream SCTE-35
cue injection is a documented stub (CasparCG FFMPEG consumer exposes no
scte35 muxer) — scheduling/triggering/countdown/as-run are functional.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Playout failover queries cluster_nodes.last_seen_at to find healthy nodes
for channel re-placement. Column missing from original cluster schema.
Migration 031 adds column + backfills existing nodes to NOW().
Fixes scheduler error: column "last_seen_at" does not exist
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Six tables: channels, playlists, items, sidecars (sidecar registry for
health-check), schedule (Phase B), as-run log.
- video_format default 1080p5994 (house standard, capture cadence)
- restart_count / last_restart_at / last_heartbeat_at on channels for
auto-failover bookkeeping
- audio_normalized flag on items so re-stages skip the loudnorm pass
- unique partial index on (channel_id) for running sidecars
Review of the v2 auth landing turned up four weak spots in the MFA path.
All four are now fixed; behaviour is unchanged for the password-correct
+ correct-TOTP happy path.
1. TOTP brute-force gate (the big one). /login was calling
ipBackoff.recordSuccess(ip) the instant the password hashed correctly,
*before* the second factor was proven. That cleared the per-IP failure
counter, so each /login retry let an attacker with a known password
hammer the 6-digit /login/totp space (10^6) at full speed.
Now recordSuccess fires only inside establishSession() — i.e. after
every required factor has actually passed (password [+TOTP] or
OAuth [+TOTP]).
2. MFA ticket binding. Tickets issued by /login (and the Google callback)
were unbound — a stolen ticket replayed from a different origin still
worked. Tickets now carry SHA-256 hashes of the issuing request's IP
and User-Agent; redeemTicket rejects on mismatch. The ticket is burned
even on mismatch so a wrong-binding probe can't be retried.
3. TOTP replay within the same 30s step (RFC 6238 §5.2). The verifier
accepted the same code as many times as you submitted it. Now
verifyToken returns the matched counter, and /login/totp does a CAS
UPDATE on users.totp_last_counter — codes at counters <= the last
accepted value are rejected. New migration 030 adds totp_last_counter,
seeded on /totp/enable so the enrollment code itself can't be reused
at first login, and zeroed on /totp/disable.
4. Google OAuth domain check no longer falls back to the email suffix
when the hd (hosted-domain) claim is missing. Email-suffix matching
let consumer (non-Workspace) Google accounts whose email happens to
end in the allowed domain through; if GOOGLE_ALLOWED_DOMAIN is set,
the operator means "only this Workspace", so accounts without a
verified hd must be rejected.
Tests: new mfa-tickets.test.js covers ip/UA binding, single-use on
mismatch, and bindings-absent back-compat. totp.test.js updated for the
new verifyToken return shape (counter on success, null on failure;
truthiness still works at call sites) and adds an explicit
matched-counter check.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Optional "Sign in with Google" with auto-provisioning, fully config-gated:
without GOOGLE_CLIENT_ID/SECRET and OAUTH_REDIRECT_URL the routes 404 and the
button is hidden, so deployments without SSO are unaffected.
- migration 028: users.google_sub (unique) + email; password_hash nullable
for OAuth-only accounts
- src/auth/google-oauth.js: lazy google-auth-library, ID-token verify,
GOOGLE_ALLOWED_DOMAIN enforcement, requires email_verified === true
- auth routes: /auth/google (state-CSRF redirect), /auth/google/callback,
/auth/google/enabled; reuses establishSession
- web-ui: "Sign in with Google" on the login screen (shown only when enabled),
friendly callback error handling
- .env.example documents all new vars
Security hardening (from review of this + the TOTP work):
- resolveGoogleUser links ONLY by google_sub, never by email — a Google login
can never seize a pre-existing local account (account-takeover fix)
- a Google-linked account with TOTP still requires the second factor (ticket
in session, /?mfa=1 step) instead of bypassing it
- /login/totp now applies the per-IP login backoff
- recovery-code consumption is atomic (WHERE used_at IS NULL + rowCount)
- concurrent first-login race on google_sub is caught and re-resolved
- tests: google-oauth config helpers + google-link takeover/dedup regression
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
## capture service
- capture-manager.js: add 'deltacast' source_type to _buildInputArgs.
Uses 'deltacast://<index>' with ffmpeg deltacast demuxer when
/dev/deltacast<N> exists; falls back to lavfi testsrc2 + sine test card
(matching deltacast-sdi-recorder standalone app) when hardware absent.
- routes/capture.js: add GET /devices/deltacast endpoint (enumerates
/dev/deltacast* + DELTACAST_PORT_COUNT env fallback). Extend /probe to
handle source_type=deltacast.
## node-agent
- detectHardware(): add 'deltacast' array to capabilities payload.
Enumerates /dev/deltacast* nodes; falls back to DELTACAST_PORT_COUNT env.
Adds DELTACAST_MODEL env support. Logs dc= count in heartbeat line.
- sidecar /start: bind /dev/deltacast* device nodes into capture containers
when sourceType='deltacast'.
## mam-api
- cluster.js: add GET /cluster/devices/deltacast and
GET /cluster/devices/deltacast/signal endpoints — same shape as
blackmagic equivalents for UI parity.
- recorders.js /start: pass DELTACAST_PORT_COUNT env to capture container;
bind /dev/deltacast* device nodes on local spawn.
- migration 024: ALTER TYPE source_type ADD VALUE 'deltacast' (idempotent).
- schema.sql: add 'deltacast' to source_type ENUM for fresh installs.
## web-ui
- modal-new-recorder.jsx: add 'Deltacast' source type card; fetch
/cluster/devices/deltacast on selection; port picker with TEST CARD
badge when hardware absent; falls through to manual index entry if
no devices detected.
Code-review feedback: ON CONFLICT (id) only catches id collisions; a pre-existing
'dev' username would trigger a unique_violation on the username index and roll
back the migration, hard-failing the mam-api boot. Switch to bare ON CONFLICT
DO NOTHING so any unique conflict is no-op-safe.
Scope (locked in via planning Q&A):
- Identity: local accounts only (PG users table) + existing bearer
tokens for headless callers.
- Transport: httpOnly cookie session for browser, Bearer for API.
- RBAC: admin / editor / viewer roles, plus an orthogonal
is_client flag for external (agency, talent, customer) accounts.
- Bootstrap: ADMIN_BOOTSTRAP_USER + ADMIN_BOOTSTRAP_PASSWORD env
seed the first admin on a clean install. Set ADMIN_BOOTSTRAP_RESET
to force-reset the named user (break-glass).
- Rate limit: in-memory, 10 fails per 15min per (IP, username).
- Password policy: \u22658 chars, mixed case, digit, symbol; small
blocklist of common passwords; cannot equal username.
- Self-service: change own display name + password. Everything
else (role, is_client, other-user mgmt) is admin only.
- Audit log: append-only table, indexed by actor + event_type +
created_at, populated by every auth/admin event.
Files added:
- services/mam-api/src/db/migrations/022-auth-rework.sql
users.is_client + last_login_at + failed_attempts; audit_log
table with FK to users (ON DELETE SET NULL).
- services/mam-api/src/middleware/audit.js
Fire-and-forget audit() helper. Caller never awaits, failure
logs but never throws — auditing cannot break the request
that triggered it.
- services/mam-api/src/middleware/passwordPolicy.js
Shared checkPassword(pw, { username }) used by setup, user
create/update, and self-service password change.
- services/mam-api/src/tasks/bootstrapAdmin.js
Runs after migrations. No-ops unless ADMIN_BOOTSTRAP_USER +
ADMIN_BOOTSTRAP_PASSWORD are set AND (users table empty OR
ADMIN_BOOTSTRAP_RESET=true).
- services/mam-api/src/routes/audit.js
Admin-only GET /audit (paginated, filter by event_type /
actor / target / date) and GET /audit/event-types.
- services/web-ui/public/modal-account-settings.jsx
Profile + Password tabs. Triggered by sidebar user button.
Files rewritten:
- services/mam-api/src/routes/auth.js
- POST /login: regenerate(), no manual save(); audit success/
fail/lockout; updates last_login_at + failed_attempts.
- POST /logout: destroys session, audits logout.
- GET /me: returns is_client + last_login_at. Synthetic admin
when AUTH_ENABLED=false.
- GET /setup-status: drives login.html UI state.
- POST /setup: blocked once any user exists; password policy.
- POST /password: self-service. Requires current pw, runs
policy, audits, invalidates other sessions implicitly via
users.js if changed by admin.
- PATCH /me: self-service display_name update.
- services/mam-api/src/routes/users.js
- is_client field in create/update/list/get.
- Guardrails: cannot delete or demote last admin, cannot
delete self, admins cannot be flagged is_client.
- Password change invalidates all sessions for that user
(DELETE FROM sessions WHERE sess->>'userId' = id).
- Audit on every mutation.
- Password policy enforced.
- services/mam-api/src/middleware/auth.js
- requireAuth now exposes req.user.is_client.
- New requireRole(["admin","editor"], { rejectClients: true })
helper. Applied to cluster, sdk, capture routes (infra).
- Synthetic user when AUTH_ENABLED=false has is_client=false.
- services/mam-api/src/index.js
- Loads bootstrap admin after migrations.
- Wires /api/v1/audit.
- Cleans up an earlier comment block.
- services/web-ui/public/login.html
- Password hint added next to setup-mode password field.
- services/web-ui/public/shell.jsx
- Sidebar user footer is a button that opens AccountSettings.
- CLIENT badge next to role when is_client=true.
- Nav filters: clients lose ingest tree + jobs + editor;
viewers lose ingest + editor; only admins see the Admin
section. Power button hidden when synthetic user.
- services/web-ui/public/screens-admin.jsx
- Users table: new Client column with inline toggle.
- InviteUserModal: Client checkbox + password hint, gated
off when role=admin.
- Last login column replaces Created in primary view.
- CSV export includes client + last_login.
- services/web-ui/public/data.jsx
- ZAMPP_DATA.ME carries is_client + display_name.
- services/web-ui/public/index.html
- Loads dist/modal-account-settings.js.
- services/web-ui/public/styles-rest.css
- .user-row grid widened to 6 columns.
- docker-compose.yml
- Plumbs SESSION_COOKIE_SECURE + ADMIN_BOOTSTRAP_* env vars.
Deploy:
cd /opt/wild-dragon
git pull origin main
# In .env:
# AUTH_ENABLED=true
# SESSION_SECRET=<openssl rand -hex 48>
# ADMIN_BOOTSTRAP_USER=admin
# ADMIN_BOOTSTRAP_PASSWORD=<strong>
docker compose build mam-api web-ui
docker compose up -d --force-recreate --no-deps mam-api web-ui
The redirect loop after successful login was almost certainly the
`sessions` table never being created. `schema.sql` defines it but
only runs on first-init via the postgres entrypoint; instances
bootstrapped via mam-api's own migration loop never got the table.
express-session's `req.session.save()` then failed silently and the
cookie pointed at a sid that wasn't in the store — every subsequent
request looked like a brand-new visitor.
- New migration 021-ensure-sessions-table.sql (idempotent).
- connect-pg-simple now configured with `createTableIfMissing: true`
as belt-and-braces.
- `POST /auth/login` now explicitly waits for session.save() and
surfaces both regenerate() and save() errors instead of treating
them as 'success'. Logs sid + req.secure + req.protocol so we can
confirm trust-proxy is doing the right thing behind NPM.
Frontend / UX / a11y
- Sidebar collapse/expand toggle with localStorage persistence (#142)
- Settings sections wrap inputs in <form> with Enter-to-submit + native
validation; password autocomplete=new-password (#141, #138)
- Asset thumbnails get descriptive alt text (#140)
- Production deploy now precompiles JSX via esbuild and loads the
production React UMD instead of dev builds + in-browser Babel (#139,
#122)
- Search wrapper gets role=search; global search input gets aria-label,
role=combobox, aria-controls/aria-expanded/aria-activedescendant
wiring (#137, #135)
- Dashboard and Library no longer share the same nav icon (#136)
- Sidebar collapses off-canvas with a topbar menu button below 768 px;
mobile default is collapsed (#134)
- --text-3 bumped to #8B92A0 for WCAG AA contrast on --bg-0 (#133)
- Schedule and Library routes were rendering empty inside the .main
flex container — switched to flex:1 + min-height:0 (#131, #132,
editor + asset detail get the same fix)
- Jobs nav badge now polls /jobs?status=active every 10 s and reflects
the live count (#130, #113)
- aria-label sweep on every icon-only button (#126)
- Premiere panel release list moved to window.PREMIERE_RELEASES in
data.jsx; Editor + Settings read from the same source (#125)
- Typo setPgMclips → setPgmClips (#124)
- Stray console.error / console.warn calls gated behind
window.DF_LOG.{warn,error} (#123)
- Hardcoded /api/v1 paths route through window.ZAMPP_API_PREFIX (#115)
- Schedule rows no longer crash on null recorder_id (#117)
- EditorKeyboard guards against document.activeElement === null (#116)
- Unmount-safe timers for PasswordResetModal, Containers, Editor (#111)
- Player seek clamps below totalMs, server-side range clamping +
uncached 416 on EOF, client-side EOF-stall watchdog (#143)
- Duration badge overlap fix on narrow asset cards (#52)
Backend / security / reliability
- GET /recorders fixed N+1: single LATERAL JOIN for live_asset_id;
Docker inspects bounded to actually-recording rows (#121)
- Upload disk-storage (multer.diskStorage) streams parts to S3 instead
of buffering 500 MB in RAM (#120)
- /assets list clamps limit to MAX_LIMIT=500 to prevent OOM (#119)
- SDK upload archive listing + post-extract sanitize block zip-slip /
tar-slip and symlink escapes (#118)
- Migrations track applied state in schema_migrations, run in a
transaction, and exit non-zero on failure (#107)
- node-agent BMD_COUNT override uses BMD_DEVICE_PREFIX; filesystem
detection wins (#109, #127)
- GPU_COUNT override now merges with nvidia-smi enrichment (#108)
- /cluster/heartbeat requires a node-bound token or admin user;
tokens carry bound_hostname (#106)
- /recorders/:id/start error responses no longer echo the Docker
create payload — env vars stay out of client responses (#105)
- /recorders/probe restricts schemes (srt/rtmp/rtsp/udp/rtp), blocks
private + loopback hosts for non-admins, denies common service
ports (#104)
- Scheduler tick guarded by a Postgres advisory lock; pending/running
rows claimed via UPDATE...RETURNING + FOR UPDATE SKIP LOCKED to
survive multi-node deploys (#103)
- UUID validateUuid('id') param middleware on every /:id route (#102)
- Error handler scrubs Postgres error messages and 5xx detail (#101)
- Graceful SIGTERM/SIGINT shutdown — stops scheduler, drains the HTTP
server, ends the pool, 25 s force-exit watchdog (#100)
- AMPP sync moved from fire-and-forget to a persisted retry queue
(ampp_sync_status / attempts / next_attempt_at + scheduler retry
loop with exponential backoff) (#77)
Migrations
- 019: api_tokens.bound_hostname (#106)
- 020: assets.ampp_sync_status + retry bookkeeping (#77)
Other
- Defer #92 Growing-files per-upload toggle, #80 Audio tab, #57
Dashboard redesign, #56 Editor SPA polish phase 3, #114 S3
migration tool to v1.3
Root causes found:
1. Scheduler crashing every 15s: assets table has no error_message column.
Fix: remove error_message from UPDATE in scheduler.js (#66 regression).
2. Clip freezing: client-side filmstrip seek loop runs on main thread,
seeks same proxy the player is streaming → both stall → freeze.
Fix: replace browser seek loop entirely with server-side FFmpeg worker.
3. No dedicated filmstrip worker: filmstrip was never pre-built server-side.
Changes:
- services/mam-api/src/db/migrations/018-add-filmstrip-s3-key.sql
Add filmstrip_s3_key TEXT column to assets table
- services/worker/src/workers/filmstrip.js (new)
BullMQ worker: downloads proxy, runs FFmpeg fps filter to extract
28 evenly-spaced JPEG frames, base64-encodes them, uploads JSON
array to S3 at filmstrips/<assetId>.json, stores key in DB
- services/worker/src/workers/thumbnail.js
Queue filmstrip job automatically after thumbnail completes
- services/worker/src/index.js
Register filmstrip worker (concurrency=2), export filmstripQueue
singleton, close it on SIGTERM
- services/mam-api/src/routes/assets.js
- filmstripQueue added
- POST /reprocess?type=filmstrip now supported
- GET /:id/filmstrip returns signed S3 URL for JSON frames
- services/mam-api/src/routes/jobs.js
filmstrip queue visible in Jobs UI
- services/web-ui/public/screens-asset.jsx
Replace browser seek loop with fetch of /assets/:id/filmstrip
→ fetch S3 JSON → render frames. Zero browser-side video seeking.
Right-click and Files tab re-generate via API endpoint.
Bug fixes:
- #91: dockerApi() 10s socket timeout (Docker daemon hang)
- #77: await syncToAmpp() with .catch() — no longer fire-and-forget
- #75: migration 016 — add 'proxy','import' to job_type enum; add 'completed' to job_status
- #73: BullMQ orphan job cleanup on hard asset delete
- #70: batch-trim jobs table gets expires_at; trim-status auto-expires stale rows
- #66: scheduler tick marks stale live assets (>2h) as error
- #63: migration 017 — partial unique index prevents concurrent live asset overwrite
- #61: recorders.js uses getS3Bucket() not stale process.env.S3_BUCKET
- #60: already fixed (copy nulls proxy/thumbnail keys, requeues proxy)
- #40: already fixed (All projects clears openProject)
- #64: already fixed (sourceType/needsProxy handled)
- #90: GET /jobs now includes DB jobs table (trim jobs visible in UI)
- #74: nginx Content-Type header preserved; multer 500MB file size limit
- #68: GET /upload returns in-progress ingesting assets
- #58: /stream and /video endpoints fall back to original file for all video types
- #55: recorder poll .catch() logs auth errors cleanly; redirect stops interval
- #52: thumb-status and thumb-duration moved inside position:relative wrapper
- #50: ProjectCard gets onContextMenu handler with rename/delete menu
- #49: project context menu dismisses on contextmenu + scroll events
Features:
- #93: POST /assets/:id/reprocess?type=proxy|thumbnail — force re-queue any asset
Asset ⋯ menu now shows 'Re-generate proxy' and 'Re-generate thumbnail' buttons
UI:
- Logo: brightness(0) invert(1) filter applied consistently in sidebar, launcher,
and login — white logo pops on dark UI; inline style removed from login.html
- Fix 500 error when creating bins: missing updated_at column on bins table
(migration 013 adds the column, schema.sql updated)
- Add drag-and-drop support for moving asset cards/list rows onto bin rail items
with visual droppable highlight
- Add right-click context menu on project rail items (Rename/Delete)
- Expose RenameProjectModal via window so Library screen can reuse it
- Bins context menu already existed — was hidden by the 500 error
Add 'trim' to job_type enum, create temp_segments table with
expiry/job/asset indexes, and add conform_source_sequence_id
to assets for lineage tracking.
Closes#33
Adds Ingest → YouTube. UI takes a URL + project, API enqueues a BullMQ
"import" job, worker shells out to yt-dlp, lands the MP4 in S3 at the
same originals/{assetId}/... path uploads use, then hands off to the
existing proxy queue. Imported assets share one lifecycle with uploads
from that point on.
Worker container picks up yt-dlp + python3 (apk on alpine, apt on the
GPU variant). The new 'import' queue is registered in jobs.js so it
appears in the Jobs SSE stream and retry/delete work for free.
Spec: docs/superpowers/specs/2026-05-23-youtube-importer-design.md
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- migration 010: asset_comments table (id, asset_id, user_id, body,
frame_ms, resolved, timestamps) with index on asset_id+created_at
- new routes mounted at /api/v1/assets/:assetId/comments — GET/POST/
PATCH/DELETE with author join (display_name + initials), nullable
user_id so comments still attach when AUTH_ENABLED is off
- Asset detail loads comments from the API on mount instead of the
empty ZAMPP_DATA.COMMENTS seed; addComment POSTs and merges the
returned row; resolved-toggle and delete are wired
- CommentsList: new trash-icon delete action per comment, helpful
empty-state copy ('Add one below to mark a frame'), tooltips on
the timestamp and resolved buttons
Now editor comments survive page reload, are visible to other users
via the same API, and pin reliably to frame_ms (integer) instead of
a parsed HH:MM:SS:FF string.
Expands recorders with video bitrate, framerate, audio codec / bitrate
/ channels, container format, and a node_id/device_index pair so the UI
can pin SDI recorders to a specific node + DeckLink port instead of
relying on a flat "BM1/BM2" index. capture-manager.js consumes these via
env vars and builds ffmpeg args from them.
Migration 004 wrapped table creation in IF NOT EXISTS, so deploys with
a pre-existing cluster_nodes table never picked up the inline UNIQUE
constraint and accumulated duplicate hostnames on every container
restart. This migration purges older duplicates and adds the unique
index idempotently so the ON CONFLICT (hostname) upsert finally works.
While a recorder is running, the capture container tees an HLS
stream into /live/<assetId>/ alongside the ProRes master upload.
The asset row is pre-created at recorder start with status='live'
so the clip appears in the library immediately. /api/v1/assets/:id/stream
returns the HLS playlist URL until recording stops, then proxy.
* docker-compose: shared wild-dragon-live mount on api/capture/web-ui
* migration 001-add-live-status: idempotent ALTER TYPE for asset_status
* mam-api: runMigrations() on boot; recorders.js pre-creates live asset
+ passes ASSET_ID; assets.js POST upserts on existing live row instead
of inserting a duplicate, and stream route returns HLS for live assets
* capture: parallel HLS ffmpeg into /live/<assetId>/; ASSET_ID env
* web-ui: nginx serves /live/, preview.js loads hls.js, LIVE badge added
pool.js was using DB_HOST/DB_USER/etc which were never set.
The docker-compose.yml passes DATABASE_URL. Parse that if present,
fall back to individual vars for local dev.