Commit graph

248 commits

Author SHA1 Message Date
179a740453 feat(admin): cluster-wide Logs page + fix container log demux + poll containers
- mam-api: dockerLogs() + demuxDockerStream() — the local container-log path
  JSON.parsed Docker's raw multiplexed stream and always returned '(no logs)';
  now strips stdcopy framing and returns readable text (tail configurable).
- web-ui: new Logs admin page — every container across every node grouped by
  node in a left rail, live-follow log viewer with filter + copy on the right.
  Reuses the now-working /cluster/containers/:node/:id/logs endpoint.
- web-ui: Containers screen now polls every 5s (was load-once) so the
  cross-cluster view stays live without manual refresh.
- icons: add server + file glyphs (were referenced but missing -> blank).
- nav: Logs wired into the Admin sidebar section + routes + breadcrumbs.
2026-06-04 05:28:17 +00:00
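
The stdcopy framing mentioned in 179a740453 can be illustrated with a minimal sketch. Docker multiplexes non-TTY container output into frames with an 8-byte header: one stream-type byte, three zero bytes, and a 4-byte big-endian payload length. The names `demuxFrames` and `frame` below are hypothetical and are not the repo's dockerLogs() or demuxDockerStream(); this only shows why JSON.parse on the raw stream could never succeed and how the framing can be stripped.

```javascript
// Sketch of stripping Docker's stdcopy framing from a non-TTY log stream.
// Frame layout: [type(1 byte: 1=stdout, 2=stderr)][0,0,0][size(uint32 BE)][payload].
// `demuxFrames` is a hypothetical name, not the repo's demuxDockerStream().
function demuxFrames(buf) {
  const chunks = [];
  let offset = 0;
  while (offset + 8 <= buf.length) {
    const size = buf.readUInt32BE(offset + 4);
    const start = offset + 8;
    const end = Math.min(start + size, buf.length); // tolerate a truncated last frame
    chunks.push(buf.subarray(start, end));
    offset = start + size;
  }
  return Buffer.concat(chunks).toString('utf8');
}

// Hypothetical helper that builds one frame, used only for the demonstration.
function frame(streamType, text) {
  const payload = Buffer.from(text, 'utf8');
  const header = Buffer.alloc(8); // zero-filled, so bytes 1..3 stay 0
  header.writeUInt8(streamType, 0);
  header.writeUInt32BE(payload.length, 4);
  return Buffer.concat([header, payload]);
}

const sample = Buffer.concat([frame(1, 'ready\n'), frame(2, 'warn: slow disk\n')]);
console.log(demuxFrames(sample));
```

This sketch interleaves stdout and stderr in arrival order; whether the real implementation keeps them separate is not stated in the commit.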
07eea02109 fix(capture): restore audio wallclock (throughput) + remove CPU codec options
- restore -use_wallclock_as_timestamps on audio input: without it ffmpeg's raw
  s16le reader stalled the graph (NVENC idle at 9%, ~half frames dropped). With
  it + long-GOP HEVC the encoder runs realtime and A/V length stays locked.
- remove all CPU codec options (prores*, dnxh*, libx264/265) from recorder UI;
  GPU NVENC only (hevc_nvenc / h264_nvenc). 3x L4 cluster, no reason for CPU.
- GPU codec defaults in env builders + proxy default h264_nvenc.
2026-06-04 04:14:59 +00:00
51f939b1fe fix(deltacast-bridge): use group-0 sample count as authoritative audio length
Taking the MAX sample count across the 4 audio groups could emit more audio
frames per slot than group 0 (the SDI-clock reference), drifting the audio
stream slightly longer than video — heard as a ~1% pitch-up. Group 0 paces the
timeline exactly as the original 2ch path did; shorter groups are silence-padded
to its length, never extending it.
2026-06-04 04:01:25 +00:00
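
A rough sketch of the length rule described in 51f939b1fe, written in JavaScript for illustration only (the actual bridge code is not shown in the log). `alignToGroupZero` is a hypothetical name, samples are modeled as Int16Array (s16le), and trimming a group that is longer than group 0 is an assumption: the commit only says shorter groups are silence-padded and the timeline is never extended.

```javascript
// Hypothetical helper: force every audio group to group 0's sample count.
// Zero-valued samples represent silence in signed 16-bit PCM.
function alignToGroupZero(groups) {
  const target = groups[0].length; // group 0 is the SDI-clock reference
  return groups.map((g) => {
    if (g.length === target) return g;
    const out = new Int16Array(target); // zero-filled, i.e. silence
    // Copy at most `target` samples: short groups end up padded with silence,
    // long groups are trimmed (the trimming is an assumption of this sketch).
    out.set(g.subarray(0, target));
    return out;
  });
}
```

Using the maximum length across groups instead of `groups[0].length` would reproduce the drift the commit describes, since any group that ran long would stretch the audio timeline past video.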
de509c66ab feat(recorders): hardware-identity model with Enable/Disable lifecycle
Recorders are now physical capture ports, not user-created rows:
- migration 036: label, enabled, auto_provisioned + UNIQUE(node_id,device_index)
  (the structural fix that makes two recorders sharing a port impossible)
- mam-api: auto-provision one recorder row per port from heartbeat capabilities
  (reconcileRecordersForNode); create-once, never overwrites operator config
- mam-api: POST /:id/enable + /:id/disable (provision/teardown standby sidecar);
  PATCH accepts label; config persists across enable/disable
- node-agent: freeCapturePort() force-removes any container on a capture port
  before standby/start — eliminates the EADDRINUSE collisions
- web-ui: recorder menu grouped by node (online/offline), Enable/Disable toggle,
  per-recorder config modal (codec/bitrate/growing/label/project), friendly
  label over hardware name, no destructive delete

Fixes the delete/recreate churn that orphaned standby sidecars and collided on
capture ports during this session's outage.
2026-06-04 03:14:43 +00:00
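
The create-once behaviour described in de509c66ab can be sketched in memory as below. This is not reconcileRecordersForNode itself: `reconcileRecorders` is a hypothetical name, the Map key stands in for the UNIQUE(node_id,device_index) constraint, and the defaults for new rows (label taken from the port name, enabled false) are assumptions.

```javascript
// Hypothetical in-memory sketch of "one recorder row per capture port".
// existingRows: rows already stored; ports: capabilities from a node heartbeat.
function reconcileRecorders(existingRows, nodeId, ports) {
  const keyOf = (nodeIdPart, deviceIndex) => `${nodeIdPart}:${deviceIndex}`;
  const byPort = new Map(existingRows.map((r) => [keyOf(r.node_id, r.device_index), r]));
  const created = [];
  for (const port of ports) {
    const key = keyOf(nodeId, port.device_index);
    if (byPort.has(key)) continue; // create-once: never overwrite operator config
    const row = {
      node_id: nodeId,
      device_index: port.device_index,
      label: port.name,        // assumed default
      enabled: false,          // assumed default
      auto_provisioned: true,
    };
    byPort.set(key, row);
    created.push(row);
  }
  return { rows: [...byPort.values()], created };
}
```

Running the same heartbeat through the function twice creates nothing the second time, which is the property that stops two recorders from sharing a port.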
bf4632b911 feat(mam-api): extract ensureStandbySidecar + add POST /recorders/reconcile-standby
Re-provisions the persistent standby sidecar for SDI/deltacast recorders that
lost theirs (manual cleanup, node redeploy, wiped /dev/shm). Without this the
recorder falls back to slow on-demand spawn on /start, which can collide on the
capture port (EADDRINUSE). Idempotent; { force:true } recreates even when a
container_id is already set.
2026-06-04 03:05:00 +00:00
Wild Dragon Dev
2f13c8d8b1 feat(mam-api): aggregate containers from all nodes + proxy logs 2026-06-04 01:42:13 +00:00
Wild Dragon Dev
a5aed86349 fix(recorders): kill stale standby container before on-demand respawn to prevent EADDRINUSE 2026-06-03 23:04:17 +00:00
Wild Dragon Dev
a22bda44a7 fix(recorders): set PRE_ROLL_SECONDS=1 for sdi/deltacast/blackmagic sidecars 2026-06-03 22:07:16 +00:00
Wild Dragon Dev
ef57900583 feat(recorders): always-on standby sidecars for deltacast, sdi, blackmagic
Sidecars now spawn at recorder CREATE time instead of /start time.
The container boots in STANDBY=1 mode (idle preview only, no ffmpeg master).
On /start, mam-api sends per-session params (CLIP_NAME, ASSET_ID, PROJECT_ID)
to the running sidecar via HTTP POST /capture/start — ffmpeg starts in <1s.
On /stop, mam-api calls HTTP POST /capture/stop — container stays alive in
standby, ready for the next take immediately.
Container is only killed on recorder DELETE.

This eliminates: Docker create/start overhead (~1-2s), bridge startup (~2-5s),
and pre-roll wait (~5s). Latency from 'record' click to first encoded frame
drops from ~10s to ~1s.

Changes:
- capture/src/index.js: boot in standby when STANDBY=1 env is set; still
  start idle preview (live thumbnail visible before recording)
- capture/src/routes/capture.js: POST /start accepts full codec params and
  asset_id in body (skips mam-api asset creation when asset_id provided)
- node-agent/index.js: handleSidecarStandby() + POST /sidecar/standby route;
  warms bridge at recorder create time
- recorders.js POST /: spawn standby sidecar after DB insert (non-fatal)
- recorders.js POST /:id/start: HTTP fast-path to standby sidecar; falls
  back to on-demand spawn if standby not available
- recorders.js POST /:id/stop: HTTP /capture/stop, keep container in standby
- recorders.js GET /:id/status: use port-based URL for local capture status
2026-06-03 21:59:33 +00:00
Wild Dragon Dev
9dc86aa3b6 Merge branch 'feat/unified-framecache' 2026-06-03 16:59:44 +00:00
Wild Dragon Dev
07d1fc9e72 fix(scheduler): allow 'starting' and 'stopping' statuses in DB
The scheduler tick loop updates a schedule's status to 'starting' and 'stopping'
in the database while it initiates the API calls to the recorder container. The
original CHECK constraint in recorder_schedules rejected these two statuses,
causing the scheduler to crash on constraint violation and never start the job.
2026-06-03 16:54:35 +00:00
d654f7c8a1 fix(mam-api): remove stitchedS3Stream workaround — RustFS range bug fixed in beta.6 (#143) 2026-06-03 16:07:32 +00:00
c269468014 fix(scheduler): orphan grace window must use recorder.updated_at not asset.updated_at — asset is created at recording START not STOP 2026-06-03 14:03:32 +00:00
108390e823 fix(scheduler): add 90s grace before marking stopped-recorder live assets as error 2026-06-03 12:51:41 +00:00
7704988978 fix(recorders): resolve syntax error caused by double declaration of proto variable 2026-06-03 12:17:06 +00:00
c21260c9b0 fix(ampp): require auth on AMPP endpoint 2026-06-03 10:42:57 +00:00
d16d19c26d fix(node-agent): use timingSafeEqual for token comparison 2026-06-03 10:42:57 +00:00
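
For context on d16d19c26d, a minimal sketch of a constant-time token check with Node's crypto.timingSafeEqual. `tokensMatch` is a hypothetical helper, not the node-agent's code. timingSafeEqual throws when the two buffers differ in length, so this sketch hashes both values to fixed-size SHA-256 digests first; whether the repo does the same is not stated.

```javascript
const { createHash, timingSafeEqual } = require('node:crypto');

// Hypothetical helper: compare a presented token with the expected one
// without leaking where the first differing byte is.
function tokensMatch(presented, expected) {
  if (typeof presented !== 'string' || typeof expected !== 'string') return false;
  // Hash to equal-length digests; timingSafeEqual requires equal byte lengths.
  const a = createHash('sha256').update(presented).digest();
  const b = createHash('sha256').update(expected).digest();
  return timingSafeEqual(a, b);
}
```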
63f05cd652 fix(audit): critical security hardening and ops reliability fixes 2026-06-03 10:42:57 +00:00
bd662f6917 fix(migration): wrap ALTER TYPE ADD VALUE in DO block with IF NOT EXISTS check 2026-06-03 00:41:04 +00:00
a04ef2de3a feat(promotion): implement manual growing files promotion via BullMQ queue + pending_migration status + right click Move to S3 2026-06-03 00:38:50 +00:00
62b9a90291 fix(recorders): stop capture containers in the background to prevent API TimeoutError on large file uploads 2026-06-03 00:22:36 +00:00
a8b59f087d fix(recorders): pre-create live asset with .mxf key when growing_enabled (was .mov, broke proxy lookup -> error) 2026-06-02 22:10:30 +00:00
3eacb35c1e fix(capture): replace continuous idle preview with 1fps JPEG snapshot to stop FIFO contention halving capture fps 2026-06-02 21:40:52 +00:00
Claude
a2790601c9 feat(library): first-frame poster thumbnail for live recordings
Replace the HLS 'connecting…' player in the library with a real frame grabbed
from the start of the recording, while the recording is still live.

Flow:
- recorders.js already pre-creates the asset as status='live' + ASSET_ID env
- capture-manager.start() fires _publishLiveThumbnail() (non-blocking): polls
  /live/<id> for the first seg-*.ts, extracts frame 0 via ffmpeg (scaled JPEG,
  yuvj420p), uploads to S3 thumbnails/<id>.jpg, then POSTs the key to mam-api
- new mam-api POST /assets/:id/live-thumbnail sets thumbnail_s3_key on the
  still-live row (status untouched); idempotent no-op once finalized
- visuals.jsx AssetThumb: for live assets, show the static poster once the key /
  signed URL is available, else fall back to the live HLS preview. Pulsing LIVE
  border kept either way
- POST /assets gains an optional status param (default 'processing'); 'live'
  skips the proxy/thumbnail queue
- capture /stop route now finalizes the pre-created asset by id (guarded) instead
  of POSTing a duplicate

🤖 Generated with Claude Code
2026-06-02 15:21:05 +00:00
Claude
f218650b85 fix(scheduler): mark orphaned live assets error immediately when recorder stops
When a capture sidecar crashes before finalize() runs (e.g. wrong node,
filter error, hardware fault), the asset stays 'live' indefinitely — library
shows 'Recording' badge for up to 120 minutes until the stale-timeout fires.

Add an orphan check that runs every scheduler tick: if an asset is 'live'
and its recorder is 'stopped', mark it 'error' immediately. This runs before
the 120-minute staleness guard so the library clears within 15 seconds.

🤖 Generated with Claude Code
2026-06-02 11:43:22 +00:00
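
The orphan check in f218650b85 can be sketched as a pure filter. `findOrphanedLiveAssets`, `recorder_id`, and `updated_at_ms` are hypothetical names. The optional grace window reflects the later commits 108390e823 and c269468014 (90s grace, anchored on the recorder's updated_at); with the default of 0 the check behaves as this commit describes. Skipping assets whose recorder cannot be found is an assumption.

```javascript
// Hypothetical sketch: select 'live' assets whose recorder is already 'stopped'.
// recordersById: Map of recorder id -> { status, updated_at_ms }.
function findOrphanedLiveAssets(assets, recordersById, nowMs, graceMs = 0) {
  return assets.filter((asset) => {
    if (asset.status !== 'live') return false;
    const recorder = recordersById.get(asset.recorder_id);
    if (!recorder || recorder.status !== 'stopped') return false; // unknown recorder: skip (assumption)
    // Measure the grace window from the recorder's last change, not the asset's.
    return nowMs - recorder.updated_at_ms >= graceMs;
  });
}
```

The caller would then mark each returned asset as 'error'; that write is omitted here.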
Claude
858c9f7b97 fix(deltacast-bridge): call VHD_SetBiDirCfg before board open + set channel SDI mode
ROOT CAUSE of 'connecting' hangs and intermittent port failures:
The DELTA-12G-e-h 8c is a bidirectional card. Without calling
VHD_SetBiDirCfg(board_index, VHD_BIDIR_80) before streaming, the
board remains in its default bi-dir config (likely 4RX/4TX) — so
RX stream opens fail with VHDERR_RESOURCEUNAVAILABLE on channels
configured as TX, causing random 'connecting' hangs per the SDK docs.

Per SDK Tools.cpp SetNbChannels() pattern:
1. Open temporary board handle
2. Check IS_BIDIR + channel counts
3. Call VHD_SetBiDirCfg(board_index, VHD_BIDIR_80) for 8ch bidir
4. Close temp handle, then open real board handle for streaming

Also add VHD_SetChannelProperty(VHD_CHANNEL_MODE_SDI) for ASI-type
channels per Sample_RX.cpp — required for 12G-ASI/3G-ASI channel
types to correctly detect incoming video standard.

🤖 Generated with Claude Code
2026-06-02 11:23:39 +00:00
32a2d0329e fix(growing+gui): growing file = MXF XDCAM HD422 (Premiere-growable) + GUI fixes
Growing root cause (4th attempt): Premiere doesn't import H.264-in-.ts
("unsupported compression type"); its growing-file support is MXF OP1a.
Prior MXF/DNxHR failed because DNxHR is VBR and never flushes the incremental
index — XDCAM HD422 (mpeg2video, CBR) DOES write index segments into body
partitions mid-record (proven live via SIGKILL: 5 index segments, readable,
no footer). Growing master is now MXF OP1a / XDCAM HD422 4:2:2 CBR + PCM s16le,
operator bitrate as CBR (default 50M). live-path returns .mxf to match.

GUI: bitrate input is now always editable in growing mode (was hidden for
ProRes-selected codecs); codec menu shown disabled-with-explanation under
growing (it had only looked "missing" due to a stale served bundle).

Requires Premiere prefs: Media > "Automatically refresh growing files" ON,
and disable the two XMP-write-on-import options.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 22:13:01 -04:00
64bbb221f7 fix(api): parse Postgres bigint (int8) as Number, not string
duration_ms/file_size are int8; node-postgres returned them as strings,
a footgun for any consumer doing arithmetic/sorting/comparison (already
hand-patched once in playout totals). Register a global int8 type parser
so the API emits real numbers. All such values are < 2^53 (no precision loss).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 21:47:45 -04:00
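
A sketch of the kind of int8 parser 64bbb221f7 describes. `parseInt8` is a hypothetical name. node-postgres lets callers register a parser per type OID through `types.setTypeParser`, and int8 is OID 20; the registration is shown only as a comment so the block runs without the `pg` package. The explicit range check is an addition of this sketch: the commit relies on all values being below 2^53 rather than enforcing it.

```javascript
// Hypothetical parser for the text form of a Postgres int8 (type OID 20).
// Registration with node-postgres would look like:
//   require('pg').types.setTypeParser(20, parseInt8);
function parseInt8(text) {
  if (text === null) return null;
  const n = Number(text);
  if (!Number.isSafeInteger(n)) {
    // Beyond 2^53 - 1 a Number can no longer represent every integer exactly.
    throw new RangeError(`int8 value ${text} is outside Number's safe integer range`);
  }
  return n;
}
```

With the parser in place, arithmetic such as `row.duration_ms + 1` yields a number instead of concatenating strings.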
984a73e8ec feat(playout): redesigned MCR screen + SCTE-35 end-to-end
Drop in the redesigned timeline-centric Playout (PGM monitor, transport,
SCTE-35 card, as-run drawer) from the on-node redesign, fully wired to the
real playout API (channels/transport/HLS preview w/ error-recovery/as-run);
no mock data. In-page ConfirmModal for destructive actions.

SCTE-35: new playout_scte_breaks table (migration 033), endpoints to
schedule/trigger/list/cancel breaks (POST/GET/DELETE /channels/:id/scte[/trigger]),
scheduler due-break sweep, engine triggerScte + auto-return + as-run 'scte'
rows + on-air SCTE-BREAK state and timeline AD markers. In-stream SCTE-35
cue injection is a documented stub (CasparCG FFMPEG consumer exposes no
scte35 muxer) — scheduling/triggering/countdown/as-run are functional.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 19:58:02 -04:00
8a958046ef fix(growing-files): MPEG-TS growing master + promotion-worker share mount
Root cause: MXF OP1a writes its index/duration only in the footer partition
on finalize, so a growing MXF has no footer and VLC/Premiere/ffmpeg-strict
refuse it ("Unable to open file on disk"). Separately the proxy job pointed
at a .mov S3 key that never existed (promotion worker watched a local empty
disk, not the SMB share), so stop -> instant proxy failure.

Fix: growing master is now MPEG-TS (H.264 high422 all-intra + AAC), which is
readable from the first PAT/PMT while still growing (verified mid-write decode).
hiresKey derives from the actual produced extension. Capture skips finalize for
growing recorders (leaves asset live for promotion). Promotion worker
CIFS-mounts the same growing_smb share before scanning; worker image gets cifs-utils
and worker-p4 runs privileged (local /growing bind removed). /live-path uses .ts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 19:41:28 -04:00
08499b93b2 feat(gpu+capture): nvenc HLS preview, source-backend abstraction, GPU affinity+telemetry
#164 HLS preview uses h264_nvenc (forced-IDR, GOP=segment) when the sidecar
has the GPU, else keeps libx264 fallback.
#168 source-backend abstraction in capture-manager (blackmagic implemented as
a behavior-preserving refactor; deltacast/aja stubbed pending hardware).
#167 per-recorder gpu_uuid (migration 032) plumbed mam-api->agent->
NVIDIA_VISIBLE_DEVICES (defaults to 'all').
#166 node-agent reports encoder util + NVENC session count per GPU; Cluster
screen renders per-GPU GPU/ENC util, VRAM, sessions.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 18:38:56 -04:00
ca1eec0600 fix/feat: recorder finalize-grace + codec validation, cluster mem/version, library download
#162 local-spawn stop now uses /stop?t=180 + waits for asset to leave 'live'
before removing the container (no more SIGKILL-corrupted masters / stuck-live).
#163 validateRecorderConfig guard (PCM!=MP4, HEVC!=MXF, NVENC needs GPU) on
create+PATCH; codec presets in new-recorder modal.
#159 container list reads Docker /stats memory (N/A when null) + UI render.
#160 primary node self-populates version + uptime on the Cluster screen.
#145 asset-detail Download original gated by dismissable size warning.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 18:34:36 -04:00
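
The three rules listed under #163 in ca1eec0600 (PCM!=MP4, HEVC!=MXF, NVENC needs GPU) could be expressed roughly as follows. This is not the repo's validateRecorderConfig: the function name, the config field names (`container`, `videoCodec`, `audioCodec`, `hasGpu`), and the way codecs are matched by name are all assumptions.

```javascript
// Hypothetical sketch of the three compatibility rules; returns a list of errors.
function validateRecorderConfigSketch(cfg) {
  const errors = [];
  const container = String(cfg.container || '').toLowerCase();
  const video = String(cfg.videoCodec || '').toLowerCase();
  const audio = String(cfg.audioCodec || '').toLowerCase();

  if (audio.startsWith('pcm') && container === 'mp4') {
    errors.push('PCM audio cannot be written to MP4');
  }
  if (video.includes('hevc') && container === 'mxf') {
    errors.push('HEVC video cannot be written to MXF');
  }
  if (video.endsWith('_nvenc') && !cfg.hasGpu) {
    errors.push('NVENC codecs require a GPU');
  }
  return errors;
}
```

Returning every violation at once, rather than throwing on the first, lets a create or PATCH handler report all problems in a single response.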
fffff1c016 feat(cluster): install capture-card drivers/SDKs from the admin screen
Per-node "Capture Drivers / SDKs" panel installs Blackmagic / AJA / Deltacast
/ NDI drivers without SSH. node-agent gains NODE_TOKEN-gated /driver/install
+ /driver/status (spawns a one-shot privileged ubuntu container that
bind-mounts host kernel paths + the repo and runs deploy/install-driver.sh);
mam-api adds admin-gated /cluster/:id/install-driver + /driver-status.
Driver files live in-repo under sdk/<vendor>/ (private repo); binaries are
admin-supplied per each sdk/<vendor>/README.md. Vendor allowlist throughout.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 18:14:59 -04:00
43011bd794 Merge feat/playout-mcr into main
Playout/MCR, as-run log, redesigned dashboard, capture CIFS/growing-files,
SDI settings, cluster Add Node wizard, homepage refresh.

# Conflicts:
#	services/mam-api/src/routes/cluster.js
#	services/mam-api/src/routes/playout.js
#	services/mam-api/src/scheduler.js
#	services/playout/Dockerfile
#	services/playout/entrypoint.sh
#	services/web-ui/public/screens-home.jsx
2026-05-31 17:46:12 -04:00
19f0abeabe feat(cluster+home): Add Node wizard, homepage tagline/logo/settings tweaks
Cluster: AddNodeModal on Admin->Cluster mints a node token via /auth/tokens
and emits a ready-to-paste curl|bash onboarding command. New admin-only
GET /cluster/onboard-info returns apiUrl/scriptUrl/branch. Role->PROFILES
mapping (worker/capture/gpu); gate worker-l4 behind compose profile [gpu].

Home: restore "Let's Create" kicker + one-line "Media Asset Management &
Production Platform" tagline; animated accent pulse behind the dragon logo
(reduced-motion safe); move Settings tile to a centered bottom row.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 17:37:37 -04:00
f21bc490e8 feat(web-ui): redesigned Dashboard + playout as-run log
Dashboard (screens-home.jsx): rebuild to new design, fully live-wired.
Dropped fabricated figures per "real data" rule (object-store %, uptime,
storage breakdown); repurposed ingest cell to real Assets-24h count.
Fixed undefined refs and double-rendered Resources section.

Playout: as-run writer in scheduler.js writeAsRun() off the health-tick
/status poll; AsRunPanel UI + missing CSS in styles-playout.css.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 17:15:32 -04:00
d3ad2397fb fix(playout): serve preview m3u8 via /api to bypass the proxy's static cache
Root cause of the persistent black preview, fully isolated: ZAMPP1's nginx
serves the live .m3u8 fresh on every request (no-store works there), but
the PUBLIC reverse proxy (159.112.211.103 -> ZAMPP1) caches the static
.m3u8 by path with a multi-second TTL, ignoring both the origin's no-store
and query params. hls.js reloads the playlist ~every second, always landing
inside that TTL, so it sees the live playlist as never advancing
("live playlist MISSED" forever), never establishes the timeline, and never
loads a fragment -> readyState 0 (black). Proven: rapid reads via ZAMPP1
localhost advance (404->405); the same rapid reads via the public URL are
stuck; query-busting doesn't help (proxy caches by path).

Fix: serve the playlist through GET /api/v1/playout/channels/:id/hls/index.m3u8
instead of the static /media/live path. /api/ is not proxy-cached (the live
status poll already updates fine through it), so hls.js always gets the fresh
live edge. Segment (.ts) lines are rewritten to absolute /media/live/<id>/
URLs so they still load from the static path (immutable; caching them is
correct). ProgramMonitor points hls.js at the /api playlist and sends the
session cookie (xhrSetup withCredentials) since /api is auth-gated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 15:57:41 -04:00
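
The playlist rewrite in d3ad2397fb, where segment lines become absolute /media/live/<id>/ URLs, might look like the sketch below. `rewritePlaylist` is a hypothetical name and the segment file names in the usage example are illustrative; the route handler, auth, and file reading are omitted.

```javascript
// Hypothetical helper: turn relative segment URIs in an HLS media playlist
// into absolute static paths, leaving tags and blank lines untouched.
function rewritePlaylist(m3u8Text, channelId) {
  return m3u8Text
    .split('\n')
    .map((line) => {
      const trimmed = line.trim();
      // Lines starting with '#' are tags or comments; blank lines pass through.
      if (trimmed === '' || trimmed.startsWith('#')) return line;
      // Already-absolute URIs are left as they are.
      if (trimmed.startsWith('/') || /^https?:\/\//.test(trimmed)) return trimmed;
      return `/media/live/${channelId}/${trimmed}`;
    })
    .join('\n');
}

const examplePlaylist = [
  '#EXTM3U',
  '#EXT-X-TARGETDURATION:2',
  '#EXTINF:2.0,',
  'seg-00404.ts',
  '#EXTINF:2.0,',
  'seg-00405.ts',
  '',
].join('\n');
console.log(rewritePlaylist(examplePlaylist, 'ch1'));
```

Only the playlist goes through the uncached /api path; the rewritten segment URLs keep pointing at the static path, where caching is harmless because segments are immutable.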
e0e0b83810 fix(assets): live-path no longer gates on removed global growing_enabled
Growing-files mode became per-recorder (recorders.growing_enabled); the
global growing_enabled setting was removed. GET /assets/:id/live-path —
which the Premiere plugin calls to mount a still-growing master — still
required growing_enabled==='true', so Mount Live would 409 "Growing-files
mode is disabled" on any deploy where that stale key isn't set. Drop the
global gate: a status='live' asset already proves a growing recorder is
producing the file; only the editor-facing growing_smb_url is required.
Response contract is unchanged, so the plugin needs no update.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 15:05:06 -04:00
5968d4f681 feat(settings/growing): storage warning, SMB auth + CIFS mount, per-recorder growing
Implements docs/superpowers/specs/2026-05-31-storage-settings-growing-smb-design.md.

1. Storage warning banner at the top of Settings → Storage (set-once /
   path-change-corrupts-data warning).

2. Growing-files SMB credentials + system CIFS mount (Approach A):
   - settings.js: new global keys growing_smb_mount / growing_smb_username /
     growing_smb_vers; growing_smb_password is write-only (GET returns only
     growing_smb_password_exists; growing_smb_password_clear:true removes it).
   - GrowingSettingsCard: SMB mount/username/password (masked, "saved" state) +
     CIFS version fields.
   - capture Dockerfile: add cifs-utils + util-linux.
   - capture-manager: on growing start, mount //host/share at /growing using a
     root-only credentials file (creds never on the command line); unmount on
     stop; mount failure falls back to S3 streaming so a recording is never lost.
   - recorders.js: pass GROWING_SMB_* env; don't host-bind /growing when a CIFS
     mount is configured (an empty mountpoint is required).

3. Per-recorder growing mode (global toggle removed):
   - Removed the global "capture writes to local SMB share first" checkbox; the
     growing card is now SMB-infrastructure-only.
   - recorders.js reads the per-recorder recorders.growing_enabled column
     (already present from migration 014) instead of the global setting;
     RECORDER_FIELDS += growing_enabled.
   - New-recorder modal: "Growing-files mode" toggle.
   - storage.js overview: "enabled" now means the SMB landing zone is configured
     (mount source set), surfaced as smb_mount; health strip labels updated.

No DB migration required (recorders.growing_enabled exists; new settings are
key/value rows).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 14:50:36 -04:00
e8f91cf4b4 fix(playout): immediate failover on new channels + play 502 vs 409
- spawnChannelSidecar: set last_heartbeat_at = NOW() when flipping
  channel to 'running'. Without this, last_heartbeat_at is NULL so
  the first scheduler tick sees ageMs = (now - epoch) >> TIMEOUT_MS
  and triggers failover before the sidecar has had a single chance
  to respond.
- scheduler playoutHealthTick: when last_heartbeat_at is NULL fall
  back to updated_at as the baseline (belt-and-suspenders with the
  spawnChannelSidecar fix). Also include updated_at in the query.
- POST /channels/:id/play: catch callSidecar errors explicitly and
  return 502 Bad Gateway instead of delegating to next(err) which
  the error middleware maps to 409 Conflict.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-31 12:34:41 -04:00
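
The NULL-heartbeat problem fixed in e8f91cf4b4 (and earlier in 0e844c0fc3) comes down to which timestamp the age is measured from. A minimal sketch, assuming a hypothetical `heartbeatAgeMs` helper and row fields named as in the commit message; treating a row with no baseline at all as age 0 is an assumption.

```javascript
// Hypothetical helper: age of the last sign of life for a playout channel.
// Falls back to updated_at when last_heartbeat_at is NULL, so a freshly
// spawned channel is not measured from the Unix epoch.
function heartbeatAgeMs(channel, nowMs) {
  const baseline = channel.last_heartbeat_at ?? channel.updated_at;
  if (baseline == null) return 0; // assumption: no baseline means "just created"
  return nowMs - new Date(baseline).getTime();
}
```

Without the fallback, a NULL timestamp treated as 0 gives an age equal to the current epoch time in milliseconds, which exceeds any realistic timeout on the very first tick.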
e51cf1aa9c feat(jobs): surface playout-stage queue in Jobs screen
- jobs.js: add playout-stage BullMQ queue to QUEUES; asset_id from
  job data is already resolved to a name by attachAssetNames
- screens-jobs.jsx: map type 'playout-stage' -> kind 'Stage' with
  monitor icon

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-31 12:06:08 -04:00
3578c7b4e9 fix(playout): Privileged only for decklink (SRT/NDI/RTMP/HLS crashed when GPU exposed without driver) 2026-05-30 18:59:27 -04:00
cddcc9a29e fix(mam-api): selfHeartbeat writes last_seen_at so primary node isn't stale-failover-killed 2026-05-30 18:57:20 -04:00
0e844c0fc3 fix(scheduler): use updated_at as grace anchor when last_heartbeat_at NULL
Without this, a freshly-spawned channel with NULL last_heartbeat_at was
instantly failover-killed by the playoutHealthTick because `0` was used as
the lastSeen timestamp, making ageMs huge on the very first tick.
2026-05-30 17:32:15 -04:00
b4f2fb12ff fix(mam-api): heartbeat writes last_seen_at so playout failover sees healthy nodes 2026-05-30 16:32:11 -04:00
c2409bd037 fix(mam-api): add last_seen_at to cluster_nodes for playout failover
Playout failover queries cluster_nodes.last_seen_at to find healthy nodes
for channel re-placement. Column missing from original cluster schema.

Migration 031 adds column + backfills existing nodes to NOW().

Fixes scheduler error: column "last_seen_at" does not exist

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-05-30 13:39:06 -04:00
Zac
9d098e9778 feat(auth-ui): interactive permissions matrix, admin 2FA reset, Downloads button
Backend (routes/users.js):
- GET / now returns totp_enabled so the UI can show 2FA status
- GET /:id/access — admin-only effective per-project access (MAX over direct +
  group grants), labels via=direct|group:<name>; admins report all/edit
- POST /:id/totp/disable — admin clears a locked-out user's 2FA without their
  password (self-service disable still requires it); dev user blocked
- role validated against {admin,editor,viewer} on create + PATCH (was unchecked)

Frontend:
- Users>Policies tab: static prose replaced with interactive per-user matrix —
  inline role select, 2FA badge, Reset-2FA action, lazy per-user access expander
- Home "Premiere panel" tile -> "Downloads"; modal renamed, adds Teams ISO row
  (disabled "coming soon" until the .exe is supplied); UXP .ccx link unchanged
- data.jsx: window.TEAMS_ISO placeholder ({available:false})

Not runtime-tested in browser yet. Teams ISO .exe still pending from user.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 15:59:27 +00:00
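
The "MAX over direct + group grants" rule behind GET /:id/access in 9d098e9778 can be sketched as below. `effectiveAccess`, the grant shape, and the level ordering none < view < edit are assumptions for illustration; only the `direct` and `group:<name>` labels come from the commit message. The admin shortcut (all projects at edit) is omitted.

```javascript
// Hypothetical ordering of access levels, lowest first.
const LEVELS = ['none', 'view', 'edit'];

// Hypothetical sketch: per project, keep the highest level seen across
// direct grants and group grants, and record where it came from.
function effectiveAccess(directGrants, groupGrants) {
  const best = new Map(); // projectId -> { level, via }
  const consider = (projectId, level, via) => {
    const current = best.get(projectId);
    if (!current || LEVELS.indexOf(level) > LEVELS.indexOf(current.level)) {
      best.set(projectId, { level, via });
    }
  };
  // Direct grants are considered first, so they win ties with group grants.
  for (const g of directGrants) consider(g.projectId, g.level, 'direct');
  for (const g of groupGrants) consider(g.projectId, g.level, `group:${g.group}`);
  return Object.fromEntries(best);
}
```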
Zac
ca71e47035 fix(playout): repair failover, authenticate scheduler self-calls, fix playlist walk + CasparCG consumer syntax
Post-review fixes for the 8-commit playout-mcr drop:

- Scheduler self-calls (callSelf -> /recorders, /playout) carried no auth, so
  under AUTH_ENABLED=true requireUiHeader 403'd every mutating POST. This broke
  playout failover AND scheduled recordings. Add a per-boot in-process service
  token (x-internal-token) the scheduler attaches; requireAuth/requireUiHeader
  treat it as the seeded admin. No env/compose config needed.

- Failover deadlocked: restartChannel set status='starting' then the scheduler
  called the guarded /start route, which 409s on 'starting'. Extract the spawn
  body into spawnChannelSidecar() shared by /start and restartChannel; failover
  now spawns directly with no self-call.

- Phase A playlist stalled after 2 clips: _scheduleAdvance cued the next clip
  via LOADBG AUTO but never advanced the pointer. Pass asset_duration_ms in the
  /play payload and arm a duration-based timer that advances currentIndex and
  cues subsequent clips, keeping as-run in sync for arbitrary-length playlists.

- CasparCG consumer syntax was invalid: "ADD <ch> FFMPEG" is the producer name,
  not a consumer keyword, and old -vcodec/-acodec short args are rejected. Use
  STREAM/FILE with -codec:v / -codec:a / -preset:v / -tune:v and a format=yuv420p
  filter ahead of libx264 (channel output is RGBA).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-30 14:51:35 +00:00
Zac
5538683d78 feat(mam-api): /playout control plane + auto-failover
Routes: channel + playlist CRUD, start/stop/play/pause/skip transport, as-run
log. RBAC via assertProjectAccess on channel.project_id; null project ⇒
admin-only (recorder convention).

Sidecar orchestration mirrors recorders.js: Docker socket for local node,
node-agent /sidecar/start for remote. Channel start passes CHANNEL_ID env so
the sidecar can write HLS preview to /media/live/<id>.

DeckLink port-contention guard: blocks starting a decklink channel when a
recorder or another channel on the same node+device_index is active.

restartChannel(id) helper picks another healthy cluster node and re-places
non-decklink channels; decklink is alert-only. Exposed for the scheduler.

Scheduler tick adds step 6: poll each running channel's sidecar /status,
update last_heartbeat_at, and after ~3 misses trigger restartChannel +
self-call /start. Reuses the existing PG advisory lock so multi-replica
deploys don't double-fire failovers.
2026-05-30 14:02:25 +00:00
Zac
29187a90df feat(mam-api): migration 029 — playout schema
Six tables: channels, playlists, items, sidecars (sidecar registry for
health-check), schedule (Phase B), as-run log.

- video_format default 1080p5994 (house standard, capture cadence)
- restart_count / last_restart_at / last_heartbeat_at on channels for
  auto-failover bookkeeping
- audio_normalized flag on items so re-stages skip the loudnorm pass
- unique partial index on (channel_id) for running sidecars
2026-05-30 14:02:25 +00:00