Commit graph

203 commits

Author SHA1 Message Date
c2409bd037 fix(mam-api): add last_seen_at to cluster_nodes for playout failover
Playout failover queries cluster_nodes.last_seen_at to find healthy nodes
for channel re-placement. Column missing from original cluster schema.

Migration 031 adds column + backfills existing nodes to NOW().

Fixes scheduler error: column "last_seen_at" does not exist

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-05-30 13:39:06 -04:00
Zac
9d098e9778 feat(auth-ui): interactive permissions matrix, admin 2FA reset, Downloads button
Backend (routes/users.js):
- GET / now returns totp_enabled so the UI can show 2FA status
- GET /:id/access — admin-only effective per-project access (MAX over direct +
  group grants), labels via=direct|group:<name>; admins report all/edit
- POST /:id/totp/disable — admin clears a locked-out user's 2FA without their
  password (self-service disable still requires it); dev user blocked
- role validated against {admin,editor,viewer} on create + PATCH (was unchecked)
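
The "MAX over direct + group grants" rule for GET /:id/access can be sketched
roughly as below. This is an illustration only, not the code in
routes/users.js: the grant shape, the LEVEL_RANK table, the function name and
the 'admin' via label are all assumptions.

```javascript
// Hypothetical grant shape: { level: 'view' | 'edit', via: 'direct' | 'group:<name>' }.
const LEVEL_RANK = { view: 1, edit: 2 };

// Returns the strongest grant for one project, or null when there is none.
function effectiveAccess(user, grants) {
  // Admins report edit on everything, regardless of individual grants.
  if (user.role === 'admin') return { level: 'edit', via: 'admin' };
  let best = null;
  for (const g of grants) {
    // Strictly greater: on a tie the first grant seen keeps its label.
    if (!best || LEVEL_RANK[g.level] > LEVEL_RANK[best.level]) best = g;
  }
  return best ? { level: best.level, via: best.via } : null;
}
```

With one direct view grant and one group edit grant, the sketch reports edit
with the group label, which is the behaviour the via= field is meant to expose.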

Frontend:
- Users>Policies tab: static prose replaced with interactive per-user matrix —
  inline role select, 2FA badge, Reset-2FA action, lazy per-user access expander
- Home "Premiere panel" tile -> "Downloads"; modal renamed, adds Teams ISO row
  (disabled "coming soon" until the .exe is supplied); UXP .ccx link unchanged
- data.jsx: window.TEAMS_ISO placeholder ({available:false})

Not runtime-tested in browser yet. Teams ISO .exe still pending from user.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 15:59:27 +00:00
Zac
ca71e47035 fix(playout): repair failover, authenticate scheduler self-calls, fix playlist walk + CasparCG consumer syntax
Post-review fixes for the 8-commit playout-mcr drop:

- Scheduler self-calls (callSelf -> /recorders, /playout) carried no auth, so
  under AUTH_ENABLED=true requireUiHeader 403'd every mutating POST. This broke
  playout failover AND scheduled recordings. Add a per-boot in-process service
  token (x-internal-token) the scheduler attaches; requireAuth/requireUiHeader
  treat it as the seeded admin. No env/compose config needed.

- Failover deadlocked: restartChannel set status='starting' then the scheduler
  called the guarded /start route, which 409s on 'starting'. Extract the spawn
  body into spawnChannelSidecar() shared by /start and restartChannel; failover
  now spawns directly with no self-call.

- Phase A playlist stalled after 2 clips: _scheduleAdvance cued the next clip
  via LOADBG AUTO but never advanced the pointer. Pass asset_duration_ms in the
  /play payload and arm a duration-based timer that advances currentIndex and
  cues subsequent clips, keeping as-run in sync for arbitrary-length playlists.

- CasparCG consumer syntax was invalid: "ADD <ch> FFMPEG" is the producer name,
  not a consumer keyword, and old -vcodec/-acodec short args are rejected. Use
  STREAM/FILE with -codec:v / -codec:a / -preset:v / -tune:v and a format=yuv420p
  filter ahead of libx264 (channel output is RGBA).
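
A minimal sketch of the per-boot service token from the first fix above,
assuming a 32-byte random value and a constant-time comparison. Only the
x-internal-token header name comes from this commit; the helper names are
hypothetical and the real middleware wiring is not shown.

```javascript
const crypto = require('node:crypto');

// Generated once per process start; never read from env or written to disk.
const INTERNAL_TOKEN = crypto.randomBytes(32).toString('hex');

// Headers the scheduler would attach to its own self-calls.
function selfCallHeaders() {
  return { 'x-internal-token': INTERNAL_TOKEN };
}

// True only when the request carries this process's token.
function isInternalCall(headers) {
  const presented = headers['x-internal-token'];
  if (typeof presented !== 'string') return false;
  const a = Buffer.from(presented);
  const b = Buffer.from(INTERNAL_TOKEN);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

Because the token lives only in process memory, a restart invalidates it and
no env/compose configuration is involved, matching the note in the fix.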

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 14:51:35 +00:00
Zac
5538683d78 feat(mam-api): /playout control plane + auto-failover
Routes: channel + playlist CRUD, start/stop/play/pause/skip transport, as-run
log. RBAC via assertProjectAccess on channel.project_id; null project ⇒
admin-only (recorder convention).

Sidecar orchestration mirrors recorders.js: Docker socket for local node,
node-agent /sidecar/start for remote. Channel start passes CHANNEL_ID env so
the sidecar can write HLS preview to /media/live/<id>.

DeckLink port-contention guard: blocks starting a decklink channel when a
recorder or another channel on the same node+device_index is active.

restartChannel(id) helper picks another healthy cluster node and re-places
non-decklink channels; decklink is alert-only. Exposed for the scheduler.

Scheduler tick adds step 6: poll each running channel's sidecar /status,
update last_heartbeat_at, and after ~3 misses trigger restartChannel +
self-call /start. Reuses the existing PG advisory lock so multi-replica
deploys don't double-fire failovers.
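
The miss-counting part of scheduler step 6 might look like the sketch below.
It is a pure function under stated assumptions: the state shape, the function
name and the exact threshold of 3 (the commit says "~3") are hypothetical,
and the advisory lock and the sidecar /status call are left out.

```javascript
const MISS_THRESHOLD = 3; // assumed; the commit only says "~3 misses"

// state: { misses, lastHeartbeatAt }; ok: whether the /status poll succeeded.
function recordPoll(state, ok, now) {
  if (ok) {
    // A successful poll refreshes the heartbeat and clears the miss count.
    return { misses: 0, lastHeartbeatAt: now, failover: false };
  }
  const misses = state.misses + 1;
  return {
    misses,
    lastHeartbeatAt: state.lastHeartbeatAt, // unchanged on a miss
    failover: misses >= MISS_THRESHOLD,     // caller would trigger restartChannel
  };
}
```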
2026-05-30 14:02:25 +00:00
Zac
29187a90df feat(mam-api): migration 029 — playout schema
Six tables: channels, playlists, items, sidecars (sidecar registry for
health-check), schedule (Phase B), as-run log.

- video_format default 1080p5994 (house standard, capture cadence)
- restart_count / last_restart_at / last_heartbeat_at on channels for
  auto-failover bookkeeping
- audio_normalized flag on items so re-stages skip the loudnorm pass
- unique partial index on (channel_id) for running sidecars
2026-05-30 14:02:25 +00:00
Zac
72fc608d8a fix(mam-api): harden TOTP login flow + tighten Google domain check
Review of the v2 auth landing turned up four weak spots in the MFA path.
All four are now fixed; behaviour is unchanged for the password-correct
+ correct-TOTP happy path.

1. TOTP brute-force gate (the big one). /login was calling
   ipBackoff.recordSuccess(ip) the instant the password hashed correctly,
   *before* the second factor was proven. That cleared the per-IP failure
   counter, so each /login retry let an attacker with a known password
   hammer the 6-digit /login/totp space (10^6) at full speed.
   Now recordSuccess fires only inside establishSession() — i.e. after
   every required factor has actually passed (password [+TOTP] or
   OAuth [+TOTP]).

2. MFA ticket binding. Tickets issued by /login (and the Google callback)
   were unbound — a stolen ticket replayed from a different origin still
   worked. Tickets now carry SHA-256 hashes of the issuing request's IP
   and User-Agent; redeemTicket rejects on mismatch. The ticket is burned
   even on mismatch so a wrong-binding probe can't be retried.

3. TOTP replay within the same 30s step (RFC 6238 §5.2). The verifier
   accepted the same code as many times as you submitted it. Now
   verifyToken returns the matched counter, and /login/totp does a CAS
   UPDATE on users.totp_last_counter — codes at counters <= the last
   accepted value are rejected. New migration 030 adds totp_last_counter,
   seeded on /totp/enable so the enrollment code itself can't be reused
   at first login, and zeroed on /totp/disable.

4. Google OAuth domain check no longer falls back to the email suffix
   when the hd (hosted-domain) claim is missing. Email-suffix matching
   let consumer (non-Workspace) Google accounts whose email happens to
   end in the allowed domain through; if GOOGLE_ALLOWED_DOMAIN is set,
   the operator means "only this Workspace", so accounts without a
   verified hd must be rejected.
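
Item 3 (replay within a time step) can be illustrated with a small RFC 6238
sketch on node:crypto. It is not the project's src/auth/totp.js: the
signatures, the ±1 step window and the in-memory acceptOnce() stand-in for
the CAS UPDATE on users.totp_last_counter are assumptions, and the plain
string comparison is not constant-time.

```javascript
const crypto = require('node:crypto');

// RFC 4226 HOTP: HMAC-SHA1 over the 8-byte big-endian counter, then
// dynamic truncation to `digits` decimal digits.
function hotp(secret, counter, digits = 6) {
  const msg = Buffer.alloc(8);
  msg.writeBigUInt64BE(BigInt(counter));
  const mac = crypto.createHmac('sha1', secret).update(msg).digest();
  const offset = mac[mac.length - 1] & 0x0f;
  const code = (mac.readUInt32BE(offset) & 0x7fffffff) % 10 ** digits;
  return String(code).padStart(digits, '0');
}

// Returns the matched 30s counter on success, null on failure.
function verifyToken(secret, token, nowMs, window = 1) {
  const current = Math.floor(nowMs / 1000 / 30);
  for (let c = current - window; c <= current + window; c++) {
    if (c >= 0 && hotp(secret, c) === token) return c;
  }
  return null;
}

// In-memory stand-in for the compare-and-set on totp_last_counter:
// a counter at or below the last accepted one is rejected.
function acceptOnce(user, counter) {
  if (counter === null || counter <= user.totpLastCounter) return false;
  user.totpLastCounter = counter;
  return true;
}
```

Submitting the same valid code twice passes verifyToken both times, but only
the first submission passes acceptOnce, which is the replay guard described
in item 3.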

Tests: new mfa-tickets.test.js covers ip/UA binding, single-use on
mismatch, and bindings-absent back-compat. totp.test.js updated for the
new verifyToken return shape (counter on success, null on failure;
truthiness still works at call sites) and adds an explicit
matched-counter check.
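
The ticket-binding behaviour those tests cover can be sketched as follows.
This is a simplified illustration, not src/auth/mfa-tickets.js: expiry and
the bindings-absent back-compat path are omitted, and the function signatures
are assumptions.

```javascript
const crypto = require('node:crypto');

const tickets = new Map(); // in-memory, single-instance
const sha256 = (s) => crypto.createHash('sha256').update(String(s)).digest('hex');

// Issue a ticket bound to hashes of the issuing request's IP and User-Agent.
function issueTicket(userId, ip, userAgent) {
  const ticket = crypto.randomBytes(24).toString('hex');
  tickets.set(ticket, { userId, ipHash: sha256(ip), uaHash: sha256(userAgent) });
  return ticket;
}

// Redeem once. The ticket is deleted before the binding check, so a
// wrong-binding probe burns it and cannot be retried.
function redeemTicket(ticket, ip, userAgent) {
  const entry = tickets.get(ticket);
  if (!entry) return null;
  tickets.delete(ticket);
  if (entry.ipHash !== sha256(ip) || entry.uaHash !== sha256(userAgent)) return null;
  return entry.userId;
}
```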

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 12:52:53 +00:00
Zac
3fe7d6bba2 fix(mam-api): close cross-project authz gaps in assets/bins/jobs/upload
Review of the v2 auth landing found gaps in four routers where the
per-project RBAC helpers weren't applied to destination/source projects,
letting a scoped editor write into projects they don't have access to:

- assets PATCH /:id: bin_id moved with no check, so an editor in project A
  could stuff their asset into a bin in project B. Now validates the bin's
  project_id matches the asset's own project (assets don't change project).
- assets POST /:id/copy: body's projectId/binId never checked, so any
  reachable asset could be cloned into an arbitrary project. Now asserts
  edit on the destination project and validates binId belongs there.
- bins POST /:id/assets: requireBinEdit checks edit on the bin's project but
  not on the source asset's project, so an asset from project B could be
  pulled into A's bin tree (and surfaced in A's views). Now the asset must
  belong to the bin's own project.
- jobs POST /conform: project_id from body never gated, so any logged-in
  user could enqueue conform jobs against any project. Now asserts edit.
- upload POST /init, POST /simple: projectId/binId from body never gated,
  same class of bug. Now asserts edit on projectId and validates binId.
- upload GET /: returned every in-progress upload globally, leaking
  filenames across projects. Now scoped via accessibleProjectIds.

These are the same pattern as the holes 2615143 closed on recorders/
sequences/imports/comments — these routes existed before the RBAC commit
landed and were never marked TODO(authz), so the broad sweep missed them.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 12:52:29 +00:00
Zac
2615143c6d feat(mam-api): finish per-project authz on the deferred routers
Phase 1 scoped only projects/assets/bins and left recorders, sequences,
imports, comments carrying TODO(authz) markers. A scoped editor/viewer could
still read and mutate those across every project. This closes the gap using
the existing authz.js helpers — no open TODO(authz) markers remain.

- recorders: param('id') resolves project + view baseline, requireRecorderEdit
  on PATCH/DELETE/start/stop, GET / filtered by accessibleProjectIds, POST /
  asserts edit on the target project (null project = admin-only)
- sequences: same param pattern + requireSequenceEdit on PUT /:id, /clips,
  conform and DELETE; GET / and POST / assert on the query/body project
- imports: POST /youtube asserts edit on the body projectId
- comments: router.use guard resolves project via the asset (view to read, edit
  to write); also fixes the author bug (req.session.userId -> req.user.id, which
  was always NULL so comments had no recorded author)
- capture: intentionally any-logged-in (shared hardware, asset scoped on
  registration) — TODO replaced with a rationale note

Security fixes from review of this change:
- recorders POST /:id/start: a per-take projectId override could route a live
  asset into a project the caller lacks edit on — now asserts edit on the
  override target
- sequences PUT /:id/clips: spliced asset_ids weren't checked, so an editor
  could pull in (and via GET /:id leak signed proxy URLs for) assets from a
  project they can't access — now every clip asset must belong to the
  sequence's project; pre-transaction queries moved inside try/catch so a DB
  error returns 500 instead of hanging the request

- tests: recorders-access, sequences-access (incl. cross-project clip guard),
  comments-access (incl. author-id regression)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 03:48:02 +00:00
Zac
0c3a4b625f feat(mam-api,web-ui): Google OAuth (OIDC) sign-in
Optional "Sign in with Google" with auto-provisioning, fully config-gated:
without GOOGLE_CLIENT_ID/SECRET and OAUTH_REDIRECT_URL the routes 404 and the
button is hidden, so deployments without SSO are unaffected.

- migration 028: users.google_sub (unique) + email; password_hash nullable
  for OAuth-only accounts
- src/auth/google-oauth.js: lazy google-auth-library, ID-token verify,
  GOOGLE_ALLOWED_DOMAIN enforcement, requires email_verified === true
- auth routes: /auth/google (state-CSRF redirect), /auth/google/callback,
  /auth/google/enabled; reuses establishSession
- web-ui: "Sign in with Google" on the login screen (shown only when enabled),
  friendly callback error handling
- .env.example documents all new vars

Security hardening (from review of this + the TOTP work):
- resolveGoogleUser links ONLY by google_sub, never by email — a Google login
  can never seize a pre-existing local account (account-takeover fix)
- a Google-linked account with TOTP still requires the second factor (ticket
  in session, /?mfa=1 step) instead of bypassing it
- /login/totp now applies the per-IP login backoff
- recovery-code consumption is atomic (WHERE used_at IS NULL + rowCount)
- concurrent first-login race on google_sub is caught and re-resolved
- tests: google-oauth config helpers + google-link takeover/dedup regression

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 02:51:59 +00:00
Zac
fff0828d79 feat(mam-api,web-ui): TOTP two-factor authentication
Optional time-based 2FA on top of password login. TOTP core is hand-rolled
on node:crypto (RFC 6238) — no runtime dep — and verified against the RFC
test vectors.

- migration 027: users.totp_secret/totp_enabled + user_recovery_codes
- src/auth/totp.js: base32, secret gen, RFC 6238 verify, otpauth URI,
  recovery codes
- src/auth/mfa-tickets.js: short-lived single-use tickets bridging the two
  login steps (in-memory, single-instance like the rate-limiter)
- auth routes: /totp/setup, /totp/enable (returns recovery codes once),
  /totp/disable (password-confirmed); login returns {mfa_required, ticket}
  when enabled, /login/totp completes with a code or recovery code
- /auth/me and loadUser surface totp_enabled
- web-ui: login second-factor step; Settings -> Account TOTP enroll (QR +
  manual secret + recovery codes + disable)
- qrcode added as an optional dep; setup degrades to manual entry if absent
- tests: totp unit (RFC vectors) + integration (enable/login/recovery/disable)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 02:42:57 +00:00
Zac
ec026195eb feat(mam-api,web-ui): per-project RBAC (v2 auth layer)
Adds per-project access control on top of the flat v1 auth. admin keeps
global access; editor/viewer are scoped to projects granted to them (direct
or via group) at view (read-only) or edit (read-write) level.

- migration 026: project_access table + access_level enum
- src/auth/authz.js: central isAdmin/accessibleProjectIds/projectLevel/
  assertProjectAccess
- requireAdmin middleware; admin-gate /users, /auth/users, /groups
- enforce scoping on projects, assets, bins (list filter + per-resource
  view/edit + create checks); gate bulk asset maintenance + batch-trim
- grant API: GET/POST/DELETE /projects/:id/access
- web-ui: hide admin nav for non-admins, admin-route bounce, project
  "Manage access" modal, rewrite Policies tab
- tests: authz, project-access, assets-access (node:test, skip w/o DB)
- deferred routers carry TODO(authz) markers; .env.example documents the
  service-token-needs-admin/grants requirement

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 02:37:36 +00:00
9d6bbf8112 fix(mam-api): /stream returns MP4 url + separate hls_url (fixes Premiere import)
The HLS-VOD work made GET /assets/:id/stream return the HLS playlist URL as
`url` whenever hls_s3_key was set. The Premiere plugin's "Import Proxy"
downloads `url` to a file and imports it — so it was saving an .m3u8 playlist
as .mp4, and Premiere rejected it ("unsupported compression type"). This hit
every YouTube asset (all get HLS generated), regardless of codec.

/stream now returns the directly-downloadable MP4 proxy as `url` (type mp4)
and the HLS playlist as a separate `hls_url`. The web player prefers `hls_url`
(so in-browser HLS playback is unchanged), while the already-installed plugin
gets a real MP4 again — no plugin reinstall needed.
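
The response shape described above could be sketched like this. Only url,
hls_url, type and hls_s3_key come from the commit; proxy_s3_key, signUrl and
both function names are hypothetical stand-ins.

```javascript
// Build the /stream body: MP4 proxy as `url`, HLS playlist as `hls_url`.
function buildStreamResponse(asset, signUrl) {
  const body = { url: signUrl(asset.proxy_s3_key), type: 'mp4' };
  if (asset.hls_s3_key) body.hls_url = signUrl(asset.hls_s3_key);
  return body;
}

// Web player side: prefer HLS when present, otherwise fall back to the MP4.
function pickPlayerSource(resp) {
  return resp.hls_url
    ? { src: resp.hls_url, type: 'hls' }
    : { src: resp.url, type: resp.type };
}
```

A client that only reads `url`, like the installed Premiere plugin, always
gets the downloadable MP4 under this shape.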

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 21:44:52 -04:00
0818f15498 fix(s3): land NodeHttpHandler request/connection timeout in main
The s3 client request-timeout fix (the original browser playback-hang fix)
was applied directly on zampp1 but never committed to main. Without it a
stalled RustFS GET hangs /video and /hls indefinitely. Landing it so a clean
deploy from main no longer regresses playback.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 17:26:59 -04:00
4473427515 Merge remote-tracking branch 'wilddragon/feat/recorder-codec-bitrate' into integrate 2026-05-29 17:25:28 -04:00
9b47250388 feat(recorder): default All-Intra HEVC (NVENC) + custom bitrate, auto fps/res, source-bitrate warning
#2 Recorder codec/bitrate:
- Default recorder codec → hevc_nvenc (All-Intra HEVC NVENC); ProRes/H.264/DNxHR
  still selectable. recorders.js default flips prores_hq → hevc_nvenc.
- Custom target bitrate (Mbps) input, shown only for bitrate-controlled codecs
  (NVENC/x264/x265/DNxHD); ProRes shows quality-based (no bitrate).
- Framerate + resolution are auto-detected from source (manual fields removed).
- Container derived from codec (HEVC/ProRes/DNxHR → fragmented MOV, H.264 → MP4);
  drops the stub container picker (closes #150 direction).

#3 SRT/RTMP customization + bitrate warning:
- Same codec/bitrate/auto controls apply to network recorders (shared form).
- Warns in the modal when the configured target bitrate exceeds the probed
  source stream bitrate (via /recorders/probe) — re-encoding above source adds
  storage, not quality.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 17:04:00 -04:00
8ea750f5df feat(playback): HLS VOD rendition for browser (supplements MP4 proxy)
Browser playback of recorded assets moves to HLS, retiring the MP4
range-stitching path for VOD. MP4 proxy is kept for the Premiere panel.

- worker/hls.js: remuxToHls() stream-copies the proxy MP4 → fMP4 HLS
  (playlist.m3u8 + init.mp4 + segment_*.m4s) via existing segmentToHls,
  uploads to hls/<id>/, sets assets.hls_s3_key. hlsWorker backfills from
  an existing proxy.
- proxy.js: generate HLS inline after the MP4 upload (local file, no
  re-download, no re-encode); best-effort/non-fatal.
- worker/index.js: register 'hls' worker wherever 'proxy' runs.
- mam-api: GET /assets/:id/hls/:file serves playlist/init/segments as
  whole-object GETs (no Range → sidesteps RustFS bug), strict filename
  validation. /stream prefers hls_s3_key (type:'hls'). reprocess?type=hls
  backfills. Migration 025 adds assets.hls_s3_key.
- Frontend unchanged: hls.js path already handles type:'hls'.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 16:18:15 -04:00
f2542bc929 feat(nvenc): GPU sidecar passthrough + All-Intra HEVC capture codec
Phase 0.2 of the NVENC All-Intra HEVC ingest plan.

node-agent/handleSidecarStart:
- Accept useGpu: true in the sidecar start body
- When useGpu: adds Runtime=nvidia, DeviceRequests=[gpu], and injects
  NVIDIA_VISIBLE_DEVICES=all + NVIDIA_DRIVER_CAPABILITIES=video,compute,utility
  into the container env. CPU-codec recorders are unaffected (useGpu defaults false).

mam-api/recorders (start endpoint):
- Derive useGpu from recorder.recording_codec — true for hevc_nvenc/h264_nvenc
- Pass useGpu to remote sidecar start body
- Apply same Runtime/DeviceRequests to the local Docker spawn path

capture/capture-manager:
- Update hevc_nvenc codec entry with all-intra flags:
  -g 1 -bf 0 (every frame IDR, no B-frames — required for growing-file
  edit-while-record), -rc vbr, -profile:v main10, pixFmt p010le (10-bit 4:2:0)

Next: validation gate (§8) — test MXF OP1a then fragmented MOV on one
DeckLink channel, mount in Premiere while recording.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 12:35:23 -04:00
92b460f503 fix(recorder): finalise live asset on stop + add live SDI monitor
Stuck-live fix: capture sidecar now finalises the pre-created live asset by id (new POST /assets/:id/finalize) instead of POSTing a new asset (409 collision); node-agent gives the sidecar a 180s stop grace so the S3 upload + callback complete; node-agent logs sidecar start/stop for diagnostics.

Live SDI monitor: HLS preview is now a 2nd output of the hires ffmpeg (single DeckLink read, split to ProRes/S3 + H.264/HLS); node-agent serves /live over HTTP; mam-api proxies GET /recorders/:id/live/* to the recorder node; web-ui HlsPreview loads from the proxied URL.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 03:20:20 +00:00
634f1842bd fix: add Bearer auth to capture sidecar callback and pass CAPTURE_TOKEN
- capture/src/index.js: read MAM_API_TOKEN from env; include
  Authorization: Bearer header in shutdown callback fetches to mam-api
  (POST /assets and POST /assets/:id/mark-empty). Without this, mam-api
  AUTH_ENABLED=true rejects the callback with 401, leaving assets stuck in live
- recorders.js: pass MAM_API_TOKEN=${CAPTURE_TOKEN} in sidecar env so the
  capture container receives the token at boot
- api_tokens: inserted capture-sidecar token (unbound, prefix b3d3d3c4)
2026-05-29 01:57:39 +00:00
453103aee6 fix: use external MAM_API_URL for remote capture sidecars; add cluster metrics endpoint and dashboard resource graphs
- recorders.js: when isRemote=true, replace MAM_API_URL in sidecar env with
  http://<NODE_IP>:<PORT_MAM_API> so capture containers on worker host network
  can reach mam-api (fixes assets stuck in live status after recorder stop)
- cluster.js: add GET /api/v1/cluster/metrics endpoint returning per-node
  cpu/ram/gpu utilization; update heartbeat handler to persist metrics JSONB
- web-ui: add Resources panel to dashboard with live CPU/RAM/GPU bars per node,
  polling /api/v1/cluster/metrics every 5s
2026-05-29 01:04:24 +00:00
888ca65045 feat(capture): Deltacast SDI framework — test-card capture, cluster detection, UI
## capture service
- capture-manager.js: add 'deltacast' source_type to _buildInputArgs.
  Uses 'deltacast://<index>' with ffmpeg deltacast demuxer when
  /dev/deltacast<N> exists; falls back to lavfi testsrc2 + sine test card
  (matching deltacast-sdi-recorder standalone app) when hardware absent.
- routes/capture.js: add GET /devices/deltacast endpoint (enumerates
  /dev/deltacast* + DELTACAST_PORT_COUNT env fallback). Extend /probe to
  handle source_type=deltacast.

## node-agent
- detectHardware(): add 'deltacast' array to capabilities payload.
  Enumerates /dev/deltacast* nodes; falls back to DELTACAST_PORT_COUNT env.
  Adds DELTACAST_MODEL env support. Logs dc= count in heartbeat line.
- sidecar /start: bind /dev/deltacast* device nodes into capture containers
  when sourceType='deltacast'.

## mam-api
- cluster.js: add GET /cluster/devices/deltacast and
  GET /cluster/devices/deltacast/signal endpoints — same shape as
  blackmagic equivalents for UI parity.
- recorders.js /start: pass DELTACAST_PORT_COUNT env to capture container;
  bind /dev/deltacast* device nodes on local spawn.
- migration 024: ALTER TYPE source_type ADD VALUE 'deltacast' (idempotent).
- schema.sql: add 'deltacast' to source_type ENUM for fresh installs.

## web-ui
- modal-new-recorder.jsx: add 'Deltacast' source type card; fetch
  /cluster/devices/deltacast on selection; port picker with TEST CARD
  badge when hardware absent; falls through to manual index entry if
  no devices detected.
2026-05-28 23:12:40 +00:00
354731a363 fix(capture): fix DeckLink device name enumeration for SDI port 2+; add per-take project selector on Recorders page
- capture-manager.js, routes/capture.js: fix ffmpeg -sources decklink
  parse regex from v4l2 hex-address format (never matched DeckLink output)
  to correct indented-line format. Port 2+ (index 1+) was falling through
  to a wrong model-name fallback, causing ffmpeg to open the wrong input
  and produce black frames. Now logs the detected device list and the
  selected name at start.

- recorders.js (/start): accept per-take projectId override in request
  body. If provided, clips go to that project instead of the recorder's
  default project_id. Used for both the live-asset INSERT and the
  PROJECT_ID env var passed to the capture container.

- screens-ingest.jsx (RecorderRow): add project dropdown shown when
  recorder is stopped. Defaults to the recorder's configured project;
  operator can change it before hitting Record without editing the
  recorder config.
2026-05-28 22:26:08 +00:00
Claude
56d7479a35 fix(mam-api): pass project_id into conform job so render can register the asset
The conform worker's final step INSERTs the rendered output into the
assets table:

  INSERT INTO assets (project_id, filename, display_name, …)
  VALUES ($1, …)
  -- project_id NOT NULL

It reads projectId from job.data, but the /sequences/:id/conform
endpoint never set it. Render finished cleanly, ffmpeg ran, output
uploaded to S3, then the final asset row INSERT failed:
  null value in column "project_id" of relation "assets"

Pass seq.project_id from the loaded sequence row. The rendered output
lands as an asset under the same project as its source sequence —
the natural target.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 14:24:04 -04:00
Claude
0abef056e7 fix(uxp+mam-api): Export Timeline render — xmeml schema + BullMQ job poll
Two cooperating bugs left Export Timeline stuck at "Rendering Hi-Res"
forever:

A. worker emitted "Invalid FCP XML: no sequence element" because
   Timeline.generateFcpXml produced fcpxml (FCP X schema:
   <fcpxml><resources>/<library>/...) while the worker's parseFcpXml
   expects xmeml (FCP 7 schema: <xmeml><sequence>...). Two completely
   different formats.

   Rewrite generateFcpXml to emit xmeml v5 with the structure the
   parser walks:
     xmeml/sequence/{name,duration,rate{timebase,ntsc},
                     media/video/{format/samplecharacteristics,
                                  track[@currentExplodedTrackIndex]
                                  /clipitem/{name,duration,rate,in,out,
                                             start,end,file/{name,pathurl}}}}
   Clipitem in/out are SOURCE frames (the underlying media in/out);
   start/end are TIMELINE frames (the cut position). The worker uses
   the rate timebase to parse them.

B. /api/v1/jobs/:id rejected the panel's polls with
   "Invalid id — must be a UUID". The handlers below correctly parse
   BullMQ-prefixed ids ("conform:42"), but router.param('id',
   validateUuid('id')) ran first and 400'd everything that wasn't a
   UUID. The panel's pollConform swallows the resulting fetch error
   silently and polls forever.

   Drop the validator. Comment in the file explains why.
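
Why the UUID validator had to go can be seen in a small sketch. The regexes
and parseJobId() below are illustrative assumptions, not the handler code in
the jobs router.

```javascript
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Accept either a BullMQ-style "<queue>:<n>" id or a plain UUID.
function parseJobId(raw) {
  const m = /^([a-z0-9_-]+):(\d+)$/i.exec(raw);
  if (m) return { kind: 'bullmq', queue: m[1], jobId: m[2] };
  if (UUID_RE.test(raw)) return { kind: 'uuid', id: raw };
  return null;
}
```

"conform:42" parses as a BullMQ id here but fails UUID_RE, so a UUID-only
param validator running first would reject it before any handler sees it.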

Bumps panel to v2.2.2.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 13:58:13 -04:00
Zac Gaetano
7e3e6b2a28 fix(auth): force HTTPS on dragonflight.live so login cookies stick
User reported infinite login loop on dragonflight.live. Root cause: openresty
fronts both http:// and https:// without redirecting, and a user landing on
http:// gets the Set-Cookie response silently dropped — cookies are Secure-only
when TRUST_PROXY=true, and the CORS allowlist refuses the http:// origin.
Result: login appears to succeed, next request has no session cookie, AuthGate
bounces back to login.

Two defensive layers (the openresty box is not in our reach):
- web-ui index.html: tiny inline redirect; if location is http://dragonflight.live,
  rewrite to https:// before anything else runs. Bounded to that exact hostname
  so local / LAN access on http://172.18.91.x stays as-is.
- mam-api: emit Strict-Transport-Security on HTTPS responses when AUTH_ENABLED=true.
  After one successful HTTPS visit, browsers auto-upgrade future http:// requests
  on their own — closes the loophole even if someone bypasses the index.html JS.
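
The hostname-bounded redirect in the first layer can be sketched as a pure
function. The real change is an inline script in index.html acting on
window.location; the function name and the use of the URL class here are
assumptions made so the logic can run outside a browser.

```javascript
// Returns the https:// URL to redirect to, or null when no redirect applies.
function httpsRedirectTarget(href) {
  const u = new URL(href);
  // Bounded to the exact public hostname so LAN/local http access is untouched.
  if (u.protocol === 'http:' && u.hostname === 'dragonflight.live') {
    u.protocol = 'https:';
    return u.href;
  }
  return null;
}
```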

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 22:00:35 -04:00
Zac Gaetano
8028c4c4dd feat(auth): bound-hostname tokens for node-agent + return role from /me
- requireAuth bearer path now selects api_tokens.bound_hostname and users.role,
  populates req.tokenBoundHostname and req.user.role. /cluster/heartbeat can
  now authenticate via a bound api_token (issued via POST /auth/tokens with
  bound_hostname).
- routes/tokens.js POST accepts bound_hostname; GET returns it so users can
  see which tokens are bound.
- Remove /cluster/heartbeat from SERVICE_PATHS so requireAuth runs on it (the
  bearer auth handles the gate; the heartbeat handler still enforces the
  body.hostname === bound match).
- /auth/me now returns role (final-review I2). Closes the gap where every
  signed-in user appeared as 'viewer' in the UI regardless of actual role.
- loadUser SELECTs role for session auth.
- Backend tests still 37/15/0/22 — no test changes needed; existing token
  CRUD tests stay passing since bound_hostname is optional.
2026-05-27 19:27:59 -04:00
Zac Gaetano
c8e98ffa0d fix(auth): sync DEV_USER_ID with migration 023 — use all-zeros UUID
Migration 023 was fixed in 9dc572b to use '00000000-0000-4000-8000-000000000000'
because 'v' isn't a valid hex digit, but the DEV_USER_ID constant in
middleware/auth.js still referenced the original '...000000000dev'. Every
route that passes DEV_USER_ID as a query parameter (users list, login lookup,
setup-required count) was throwing 22P02 invalid input syntax for type uuid.
The errors were swallowed by Promise.allSettled in the SPA's data load so the
app appeared to work in dev mode, but enabling AUTH_ENABLED=true would have
broken login entirely.
2026-05-27 19:08:07 -04:00
9dc572b913 fix(migration): replace invalid UUID in 023 dev user seed 2026-05-27 18:45:21 -04:00
Zac Gaetano
03d0d098f5 fix(auth): final-review integration fixes — Users page alias + PATCH, CSRF on uploads + heartbeat, drop .bak
Final-review findings:
- Mount usersRouter at /api/v1/users in addition to /api/v1/auth/users so the
  existing SPA Users page works; add PATCH /:id for inline edits (display_name,
  role, password).
- Add X-Requested-With: dragonflight-ui to raw XHR/fetch paths that bypass
  apiFetch (file uploads, SDK uploads, EDL export) — without it, requireUiHeader
  403s before reaching the route.
- Exempt SERVICE_PATHS (/cluster/heartbeat) from requireUiHeader so node-agent
  heartbeats keep working when NODE_TOKEN is unset.
- Remove stale auth.js.bak.
2026-05-27 15:42:42 -04:00
Zac Gaetano
8ede44ae87 docs(auth): flip AUTH_ENABLED default + document setup + recovery 2026-05-27 15:25:29 -04:00
Zac Gaetano
96effaaa3c fix(mam-api): TRUST_PROXY boot warning + CSRF integration tests + bounded rate-limit map
Fixes three issues in the authentication system:

C1: Add boot-time warning when AUTH_ENABLED=true but TRUST_PROXY!=true.
    Without TRUST_PROXY=true behind nginx, req.ip becomes the proxy IP for all
    clients, collapsing per-IP rate limiting into a shared pool. Operators must
    explicitly set TRUST_PROXY=true to make per-IP rate limiting effective.

C2: Mount requireUiHeader middleware in test helpers (auth.test.js,
    users.test.js, tokens.test.js). The CSRF header validation was not being
    exercised in the test suite. Tests now send X-Requested-With: dragonflight-ui
    headers that are actually validated by the middleware.

I1: Implement bounded rate-limit Map with MAX_ENTRIES=10000 and LRU eviction.
    Unbounded Maps are vulnerable to spray attacks: attackers can force memory
    exhaustion by requesting with distinct IPs. Now we evict the oldest entry
    (by insertion order) when the map reaches capacity.
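
A minimal sketch of the bounded map from I1, assuming a plain JavaScript Map
and a hypothetical boundedSet() helper. Note that evicting by insertion order
alone is FIFO; it only behaves as LRU if entries are deleted and re-inserted
on access, which this sketch does not do.

```javascript
const MAX_ENTRIES = 10000;

// Insert or update a key, evicting the oldest-inserted entry at capacity.
function boundedSet(map, key, value, maxEntries = MAX_ENTRIES) {
  // Updating an existing key does not grow the map, so no eviction is needed.
  if (!map.has(key) && map.size >= maxEntries) {
    // Map iterates in insertion order, so the first key is the oldest.
    const oldest = map.keys().next().value;
    map.delete(oldest);
  }
  map.set(key, value);
}
```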
2026-05-27 15:03:35 -04:00
Zac Gaetano
d209a192c3 feat(mam-api): login rate limit + X-Requested-With CSRF header check 2026-05-27 14:58:02 -04:00
Zac Gaetano
56b661ef65 feat(mam-api): API token CRUD — show raw once, bearer-authenticate via SHA-256 lookup 2026-05-27 14:52:07 -04:00
Zac Gaetano
b7f5a84d2d feat(mam-api): user CRUD + admin password reset + last-user delete guard 2026-05-27 14:47:03 -04:00
Zac Gaetano
0bbaf80d2a feat(mam-api): GET /auth/me + POST /auth/password 2026-05-27 14:42:53 -04:00
Zac Gaetano
d75a0241eb feat(mam-api): POST /auth/logout 2026-05-27 14:38:05 -04:00
Zac Gaetano
bcfc19e530 fix(mam-api): real dummy bcrypt hash + log last_login_at failures
Code-review feedback:
- Dummy hash for user-enumeration-defense timing was 63 chars (bcrypt strings
  are 60 chars). Worked by accident because bcrypt 5.x is lenient about
  trailing chars; a future tightening would silently regress the timing
  defense. Replaced with a real pre-computed bcrypt hash.
- last_login_at UPDATE now logs errors instead of silently swallowing them,
  matching the pattern in requireAuth for api_tokens.last_used_at.
- Removed dead import of comparePassword from auth.test.js.
2026-05-27 14:35:59 -04:00
Zac Gaetano
f8b6f7d5ef feat(mam-api): POST /auth/login + redirect-loop regression test 2026-05-27 14:28:18 -04:00
Zac Gaetano
c9f9698b58 feat(mam-api): POST /auth/setup — first-run admin creation 2026-05-27 14:24:56 -04:00
Zac Gaetano
49a9543942 feat(mam-api): auth router skeleton + setup-required endpoint 2026-05-27 14:21:32 -04:00
Zac Gaetano
cb7cc9a43e fix(mam-api): narrow cluster carve-out to /cluster/heartbeat only
Code-review feedback: startsWith('/cluster') was a prefix match that exposed
destructive operator endpoints (POST /containers/:id/restart, DELETE /:id,
GET /devices/blackmagic/*) unauthenticated. Only POST /heartbeat is genuine
node-agent traffic; everything else in cluster.js is operator/UI surface
that should go through requireAuth. Long-term: issue node-agent a bound
api_token and drop the carve-out entirely.
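
The difference between the prefix match and the narrowed carve-out, as a
rough sketch. SERVICE_PATHS is the name used in later commits; the function
names, the method check and the container id in the usage note are
assumptions.

```javascript
// Old behaviour: any path under /cluster skipped auth.
function isCarvedOutOld(path) {
  return path.startsWith('/cluster');
}

// Narrowed behaviour: only the node-agent heartbeat is exempt.
const SERVICE_PATHS = new Set(['/cluster/heartbeat']);

function isCarvedOut(method, path) {
  return method === 'POST' && SERVICE_PATHS.has(path);
}
```

Under the old check a path such as /cluster/containers/abc/restart is exempt
from auth; under the narrowed check only POST /cluster/heartbeat is.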
2026-05-27 14:18:27 -04:00
Zac Gaetano
9de4fe9ab9 feat(mam-api): mount requireAuth gate at /api/v1 with auth + cluster carve-outs 2026-05-27 14:13:21 -04:00
Zac Gaetano
88c3aa5149 fix(mam-api): SESSION_SECRET boot guard + cleaner CORS rejection
Code-review feedback:
- Hard-fail boot when AUTH_ENABLED=true and SESSION_SECRET is unset, so
  express-session can't silently use an in-memory random secret that
  invalidates sessions on restart and breaks multi-node clusters.
- CORS rejection now returns cb(null, false) instead of cb(new Error)
  so misconfigured origins surface as clean CORS errors in the browser
  instead of HTTP 500s. Log a warn line for operator visibility.
- Add a units comment on pruneSessionInterval.
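
The CORS change above can be sketched as an origin callback in the shape the
cors middleware accepts, function (origin, callback). The factory name, the
allowlist argument, the log text and the choice to allow requests with no
Origin header are assumptions.

```javascript
// Returns a function usable as the `origin` option of the cors middleware.
function makeOriginCheck(allowlist, warn = console.warn) {
  return function origin(requestOrigin, cb) {
    // No Origin header (curl, same-origin navigation): allowed in this sketch.
    if (!requestOrigin || allowlist.includes(requestOrigin)) return cb(null, true);
    warn(`CORS: rejected origin ${requestOrigin}`);
    // cb(null, false) omits the CORS headers; cb(new Error(...)) would
    // instead reach the error handler and surface as an HTTP 500.
    return cb(null, false);
  };
}
```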
2026-05-27 14:11:09 -04:00
Zac Gaetano
a094df03ea feat(mam-api): wire express-session + tighten CORS allowlist 2026-05-27 14:06:41 -04:00
Zac Gaetano
1a723fe4c2 fix(mam-api): requireAuth — stamp last_seen_at after user confirmation
Code-review feedback: writing last_seen_at = now before loadUser() lets
the stamp persist if the lookup throws (resave:false still writes when
modified), extending the idle window without confirming the user exists.
Also clarify DEV_USER_ID is a specific placeholder, not a generic sentinel.
2026-05-27 14:04:15 -04:00
Zac Gaetano
0248a68f57 feat(mam-api): requireAuth middleware — session + bearer + idle/absolute timeout 2026-05-27 13:59:50 -04:00
Zac Gaetano
3bca290e09 fix(mam-api): test glob — use find so npm test picks up files at any depth
/bin/sh (which npm uses) doesn't expand ** recursively. Task 1's smoke test
under test/ stopped being discovered once Task 3 added tests under test/auth/.
find + sort keeps depth-agnostic discovery portable across shells.
2026-05-27 13:54:12 -04:00
Zac Gaetano
3fc8116dd3 feat(mam-api): auth utilities — password hash/compare + token gen/hash/parse 2026-05-27 13:51:15 -04:00
Zac Gaetano
14931d6362 fix(mam-api): migration 023 — broaden ON CONFLICT + document password_updated_at backfill
Code-review feedback: ON CONFLICT (id) only catches id collisions; a pre-existing
'dev' username would trigger a unique_violation on the username index and roll
back the migration, hard-failing the mam-api boot. Switch to bare ON CONFLICT
DO NOTHING so any unique conflict is no-op-safe.
2026-05-27 13:48:08 -04:00
Zac Gaetano
1d3c0385dd feat(mam-api): migration 023 — auth timestamps + idempotent dev user seed 2026-05-27 13:44:07 -04:00