apt-get install ffmpeg pulls in ~80 transitive shared libs (libav*,
libx264, libdrm, libva...) that perturb CasparCG 2.4.0's headless
runtime linking and make it abort with SIGABRT (exit 134) on almost
every launch. Replace it with john van sickle's self-contained static
ffmpeg/ffprobe binaries in /usr/local/bin — the standalone CLI the HLS
re-muxer needs, with zero new shared libraries, keeping CasparCG's
environment identical to the known-good image.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The 2.4.x server aborts at startup if its configured log-path isn't
writable/creatable. /opt/casparcg is a read-only-ish symlinked install
dir; the entrypoint already mkdirs /media/casparcg/{log,data}. Point the
config there to match (the working image used these paths).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CasparCG's bundled FFMPEG/HLS consumer muxes a broken audio track
(aac sample_rate=0, time_base 1/0) into the preview, and silently drops
every arg that would remove it (-an, -codec:a, -g, -r all "Unused
option"). That corrupt audio black-screens the browser preview because
neither ffmpeg nor hls.js can decode the playlist.
Re-architect the preview path: CasparCG now STREAMs plain mpegts to a
UDP loopback port, and a Node-spawned STANDALONE ffmpeg (where -an
actually works) re-muxes it to clean, video-only HLS with -c:v copy.
The child process is tracked, auto-respawned while running, and killed
in stopChannel(). The PRIMARY SRT/RTMP/SDI/NDI output (with program
audio) is untouched.
Also fix the Dockerfile to match the working image: ubuntu:22.04 base +
CasparCG 2.4.0 ubuntu22 zip + NodeSource Node 20, and add a standalone
ffmpeg CLI. The old 2.3.3 tarball URL 404s. entrypoint.sh updated for
the 2.4.x bin/casparcg layout + bundled lib/ LD_LIBRARY_PATH.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Definitive root cause of the black preview, found via server-side ffmpeg
decode of the live playlist:
Error while decoding stream #0:1: Invalid data found (x57)
[abuffer] Value inf for parameter 'time_base' ... time_base to value 1/0
Stream #0:1 is the AAC audio. CasparCG's real-time channel feeds the HLS
consumer an audio stream whose muxed time_base is 1/0 (infinity). ffmpeg
itself cannot decode the playlist, and hls.js silently fails to append the
fragment after demux, so the <video> stays at readyState 0 (black) even
though the video PTS is perfectly continuous and segments serve 200.
Fix: drop audio from the HLS confidence monitor (-an). The video track is
clean h264 and plays in hls.js. Program audio still rides the primary
SRT/RTMP/SDI/NDI output, which is unaffected.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CasparCG's FFMPEG consumer ignores -force_key_frames ("Unused option")
because it routes args to the muxer, not the encoder. Revert to the
frame-based GOP (-g 60 -keyint_min 60) but keep the forced CFR rate
(-r 30000/1001): at 29.97fps a 60-frame GOP is exactly 2.0s, so keyframes
and HLS splits land on clean 2s boundaries. CFR is what was missing
originally — with the channel's irregular feed rate, "60 frames" drifted.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Root cause of the black preview: CasparCG's real-time channel feeds the
HLS consumer frames with irregular timestamps (the "packet with pts X has
duration 0" warnings). With frame-count GOPs (-g 60) the muxer split
points drift, producing erratic segment durations (0.4s-4.2s) that exceed
the declared TARGETDURATION. hls.js parses the resulting live playlist but
can never establish a fragment timeline — it reloads forever
("sliding 0.00 / prev-sn na / MISSED") and never appends a fragment, so
the video element stays at readyState 0 (black). Verified live via the
browser: manifest + segments serve 200, segment is valid h264/aac with a
keyframe start, yet hls.js logs zero FRAG_LOADED.
Fix: force a constant output frame rate (-r 30000/1001, regenerates
uniform PTS) and time-based keyframes every 2s (-force_key_frames
expr:gte(t,n_forced*2)), so every segment is a clean keyframe-aligned 2.0s
chunk. Yields a spec-compliant playlist (TARGETDURATION 2, stable
8-segment/16s window) identical in shape to the capture/VOD HLS the rest
of the app already plays successfully through the same hls.js.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- nginx.conf: add /media/live/ location serving from the media volume
mount. CasparCG sidecar writes HLS preview to /media/live/<id>/ but
nginx only had /live/ (capture volume). Without this, preview
requests returned the SPA shell instead of the .m3u8 playlist.
- ProgramMonitor: add live elapsed counter (MM:SS, ticks every 500ms)
driven by engine.currentItemStartedAt. Shows alongside clip index.
Adds a ⚠ pip when lastError is set (e.g. NDI SDK missing) without
blocking operation.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- startChannel: make primary consumer ADD non-fatal. CasparCG decodes
and routes media without an output consumer, so NDI channels (no SDK)
and misconfigured SRT/RTMP channels still load/play clips and expose
the HLS preview. state.lastError carries the consumer error for UI
visibility without blocking operation.
- loadPlaylist: throw early if state.running=false (channel/start was
never called or failed hard) with a clear error instead of a cryptic
CasparCG AMCP error propagating to the operator.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- spawnChannelSidecar: set last_heartbeat_at = NOW() when flipping
channel to 'running'. Without this, last_heartbeat_at is NULL so
the first scheduler tick sees ageMs = (now - epoch) >> TIMEOUT_MS
and triggers failover before the sidecar has had a single chance
to respond.
- scheduler playoutHealthTick: when last_heartbeat_at is NULL fall
back to updated_at as the baseline (belt-and-suspenders with the
spawnChannelSidecar fix). Also include updated_at in the query.
- POST /channels/:id/play: catch callSidecar errors explicitly and
return 502 Bad Gateway instead of delegating to next(err) which
the error middleware maps to 409 Conflict.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- jobs.js: add playout-stage BullMQ queue to QUEUES; asset_id from
job data is already resolved to a name by attachAssetNames
- screens-jobs.jsx: map type 'playout-stage' -> kind 'Stage' with
monitor icon
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- playout-stage: skip loudnorm pass 2 when measured_I=-inf (silent or
no-audio clip); fall back to plain AAC transcode so staging completes
instead of erroring out
- screens-home: add Playout tile; replace Premiere panel tile with
Downloads tile opening a combined modal (Premiere panel releases +
Dragon-ISO link to forge.wilddragon.net/WildDragonLLC/dragon-iso)
- screens-playout: add Delete channel button (visible only when stopped);
removes channel from list and selects next on confirm
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix event bubbling: e.stopPropagation() in onItemDrop prevents
duplicate POST when dropping on an existing playlist item
- Wrap all drop handlers in try/catch with inline error display
- ProgramMonitor: replace text placeholder with hls.js video player
loading /media/live/<channel_id>/index.m3u8; falls back to native
HLS on Safari; destroys Hls instance on channel stop/unmount
- Playlist: per-item duration (MM:SS), staging progress bar with
animated stripe while staging, now-playing highlight + ▶ indicator
driven by engine.currentIndex from 4s status poll
- Playlist footer: clip count + total duration sum
- Transport: Play button disabled + shows '⏳ N staging' until all
items are media_status=ready, eliminating the staging-not-ready 409
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without this, a freshly-spawned channel with NULL last_heartbeat_at was
instantly failover-killed by the playoutHealthTick because `0` was used as
the lastSeen timestamp, making ageMs huge on the very first tick.
Playout failover queries cluster_nodes.last_seen_at to find healthy nodes
for channel re-placement. Column missing from original cluster schema.
Migration 031 adds column + backfills existing nodes to NOW().
Fixes scheduler error: column "last_seen_at" does not exist
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Backend (routes/users.js):
- GET / now returns totp_enabled so the UI can show 2FA status
- GET /:id/access — admin-only effective per-project access (MAX over direct +
group grants), labels via=direct|group:<name>; admins report all/edit
- POST /:id/totp/disable — admin clears a locked-out user's 2FA without their
password (self-service disable still requires it); dev user blocked
- role validated against {admin,editor,viewer} on create + PATCH (was unchecked)
Frontend:
- Users>Policies tab: static prose replaced with interactive per-user matrix —
inline role select, 2FA badge, Reset-2FA action, lazy per-user access expander
- Home "Premiere panel" tile -> "Downloads"; modal renamed, adds Teams ISO row
(disabled "coming soon" until the .exe is supplied); UXP .ccx link unchanged
- data.jsx: window.TEAMS_ISO placeholder ({available:false})
Not runtime-tested in browser yet. Teams ISO .exe still pending from user.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
First 2.5 build got past the deb install but the binary-discovery step
produced an empty $BIN (test -n failed): the 2.5 deb names its executable
casparcg-server-2.5, which the old case pattern (*/casparcg, */CasparCG
Server) didn't match. Broaden the match to /usr/bin/*casparcg*server*, fall
back to the known /usr/bin/casparcg-server-2.5, symlink it to
/usr/local/bin/casparcg, and make /opt/casparcg a real dir for our config
(no longer symlinked onto /usr/bin). Entrypoint launches `casparcg <config>`
from PATH instead of ./casparcg in a cwd.
Still NOT runtime-validated: 2.5 may reject the 2.3-era casparcg.config
schema (a bad config shows up as "Configuration file --version was not
found"); the deb ships a reference config at
/usr/share/casparcg-server-2.5/casparcg.config to diff against at smoke time.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The image never built: CASPAR_URL pointed at a v2.3.3-stable Linux tarball
that CasparCG never published (2.3.x is Windows-only; Linux builds start at
2.4.0, and 2.4.1+ ship only as .deb). Rewrite to install the 2.5.0 noble
server + CEF debs on an ubuntu:24.04 base (Node 20 via nodesource), letting
apt resolve the GL/ffmpeg/openal runtime deps. Binary install dir is
discovered from the deb file list and symlinked to /opt/casparcg so the
entrypoint + config still run from there. Move CasparCG log/data dirs to
/media (writable mount) since the install dir may be read-only.
NOT runtime-validated: the 2.5 casparcg.config schema and the AMCP consumer
syntax (ADD <ch> STREAM/FILE) were authored against 2.3 and must be smoke-
tested against 2.5 before a channel start can be trusted.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Fetches /playout/channels separately and degrades silently when the
endpoint or schema is absent.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Post-review fixes for the 8-commit playout-mcr drop:
- Scheduler self-calls (callSelf -> /recorders, /playout) carried no auth, so
under AUTH_ENABLED=true requireUiHeader 403'd every mutating POST. This broke
playout failover AND scheduled recordings. Add a per-boot in-process service
token (x-internal-token) the scheduler attaches; requireAuth/requireUiHeader
treat it as the seeded admin. No env/compose config needed.
- Failover deadlocked: restartChannel set status='starting' then the scheduler
called the guarded /start route, which 409s on 'starting'. Extract the spawn
body into spawnChannelSidecar() shared by /start and restartChannel; failover
now spawns directly with no self-call.
- Phase A playlist stalled after 2 clips: _scheduleAdvance cued the next clip
via LOADBG AUTO but never advanced the pointer. Pass asset_duration_ms in the
/play payload and arm a duration-based timer that advances currentIndex and
cues subsequent clips, keeping as-run in sync for arbitrary-length playlists.
- CasparCG consumer syntax was invalid: "ADD <ch> FFMPEG" is the producer name,
not a consumer keyword, and old -vcodec/-acodec short args are rejected. Use
STREAM/FILE with -codec:v / -codec:a / -preset:v / -tune:v and a format=yuv420p
filter ahead of libx264 (channel output is RGBA).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replaces the earlier aspirational "complete" log with the actual commit
sequence on feat/playout-mcr, the §7 decisions as built, the media-flow
diagram, port-contention + failover scope, and a runtime testing checklist
(migration → image build → SRT smoke → failover kill test).
- Add /mnt/NVME/MAM/wild-dragon-media:/media to mam-api (rw) and worker-p4
(rw); web-ui (ro, for serving HLS preview segments).
- worker-p4 WORKER_QUEUES gains 'playout-stage' so master-tier nodes pick up
the loudnorm stage jobs (they already have ffmpeg + the media mount).
- New build-only 'playout' service with profile ["build-only"] so
`docker compose --profile build-only build playout` produces the
wild-dragon-playout:latest image without compose trying to up it as a
long-running service. mam-api spawns these on demand.
- mam-api env adds PLAYOUT_IMAGE + PLAYOUT_AMCP_BASE_PORT (5250 default).
- .env.example: PLAYOUT_IMAGE, PLAYOUT_AMCP_BASE_PORT.
screens-playout.jsx + styles-playout.css: program monitor (HLS preview from
the sidecar), media bin, drag-drop playlist editor, transport controls. Plain
HTML5 drag-drop, no extra library. Talks to /api/v1/playout via
ZAMPP_API.fetch.
Wired into the shell: "Playout" under Operations, breadcrumb mapping, route
case in app.jsx, stylesheet + dist/screens-playout.js script in index.html.
Format dropdown defaults to 1080p5994 (matches the new channel default).
Routes: channel + playlist CRUD, start/stop/play/pause/skip transport, as-run
log. RBAC via assertProjectAccess on channel.project_id; null project ⇒
admin-only (recorder convention).
Sidecar orchestration mirrors recorders.js: Docker socket for local node,
node-agent /sidecar/start for remote. Channel start passes CHANNEL_ID env so
the sidecar can write HLS preview to /media/live/<id>.
DeckLink port-contention guard: blocks starting a decklink channel when a
recorder or another channel on the same node+device_index is active.
restartChannel(id) helper picks another healthy cluster node and re-places
non-decklink channels; decklink is alert-only. Exposed for the scheduler.
Scheduler tick adds step 6: poll each running channel's sidecar /status,
update last_heartbeat_at, and after ~3 misses trigger restartChannel +
self-call /start. Reuses the existing PG advisory lock so multi-replica
deploys don't double-fire failovers.
One container per channel. Built like capture/build-with-decklink: NDI +
DeckLink SDKs fetched at build, runs --privileged with Xvfb for the GL
context where no real display is present.
Components:
- entrypoint.sh: Xvfb + CasparCG launch, creates /media/live/<CHANNEL_ID>
- src/amcp.js: TCP AMCP client
- src/playout-manager.js: channel lifecycle, playlist walk via LOADBG AUTO
for gapless transitions; primary consumer (decklink/ndi/srt/rtmp) plus a
second FFMPEG HLS consumer (~600 kbps, 2s segments) for the UI preview
- src/index.js: HTTP shim — /channel/start, /playlist/load, transport
- frame-rate helper picks fps from video_format (59.94 → 60000/1001) so
SEEK / LENGTH frame math is correct
Stages playlist items from S3 to the shared CasparCG media volume. Pass 1
measures, pass 2 applies linear loudnorm (I=-23 LUFS, TP=-1 dBTP, LRA=11);
output is AAC 192k @ 48 kHz, video stream copied. Atomic rename on success
so CasparCG never sees a partial file. Per-item audio_normalized flag means
re-stages of the same asset skip the loudnorm pass.
Wired into worker/src/index.js behind WORKER_QUEUES=playout-stage so
capability-routed deploys can pin it to nodes that already have ffmpeg +
the media mount.
Six tables: channels, playlists, items, sidecars (sidecar registry for
health-check), schedule (Phase B), as-run log.
- video_format default 1080p5994 (house standard, capture cadence)
- restart_count / last_restart_at / last_heartbeat_at on channels for
auto-failover bookkeeping
- audio_normalized flag on items so re-stages skip the loudnorm pass
- unique partial index on (channel_id) for running sidecars
Review of the v2 auth landing turned up four weak spots in the MFA path.
All four are now fixed; behaviour is unchanged for the password-correct
+ correct-TOTP happy path.
1. TOTP brute-force gate (the big one). /login was calling
ipBackoff.recordSuccess(ip) the instant the password hashed correctly,
*before* the second factor was proven. That cleared the per-IP failure
counter, so each /login retry let an attacker with a known password
hammer the 6-digit /login/totp space (10^6) at full speed.
Now recordSuccess fires only inside establishSession() — i.e. after
every required factor has actually passed (password [+TOTP] or
OAuth [+TOTP]).
2. MFA ticket binding. Tickets issued by /login (and the Google callback)
were unbound — a stolen ticket replayed from a different origin still
worked. Tickets now carry SHA-256 hashes of the issuing request's IP
and User-Agent; redeemTicket rejects on mismatch. The ticket is burned
even on mismatch so a wrong-binding probe can't be retried.
3. TOTP replay within the same 30s step (RFC 6238 §5.2). The verifier
accepted the same code as many times as you submitted it. Now
verifyToken returns the matched counter, and /login/totp does a CAS
UPDATE on users.totp_last_counter — codes at counters <= the last
accepted value are rejected. New migration 030 adds totp_last_counter,
seeded on /totp/enable so the enrollment code itself can't be reused
at first login, and zeroed on /totp/disable.
4. Google OAuth domain check no longer falls back to the email suffix
when the hd (hosted-domain) claim is missing. Email-suffix matching
let consumer (non-Workspace) Google accounts whose email happens to
end in the allowed domain through; if GOOGLE_ALLOWED_DOMAIN is set,
the operator means "only this Workspace", so accounts without a
verified hd must be rejected.
Tests: new mfa-tickets.test.js covers ip/UA binding, single-use on
mismatch, and bindings-absent back-compat. totp.test.js updated for the
new verifyToken return shape (counter on success, null on failure;
truthiness still works at call sites) and adds an explicit
matched-counter check.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Review of the v2 auth landing found four places where the per-project RBAC
helpers weren't applied to destination/source projects, letting a scoped
editor write into projects they don't have access to:
- assets PATCH /🆔 bin_id moved with no check, so an editor in project A
could stuff their asset into a bin in project B. Now validates the bin's
project_id matches the asset's own project (assets don't change project).
- assets POST /:id/copy: body's projectId/binId never checked, so any
reachable asset could be cloned into an arbitrary project. Now asserts
edit on the destination project and validates binId belongs there.
- bins POST /:id/assets: requireBinEdit checks edit on the bin's project but
not on the source asset's project, so an asset from project B could be
pulled into A's bin tree (and surfaced in A's views). Now the asset must
belong to the bin's own project.
- jobs POST /conform: project_id from body never gated, so any logged-in
user could enqueue conform jobs against any project. Now asserts edit.
- upload POST /init, POST /simple: projectId/binId from body never gated,
same class of bug. Now asserts edit on projectId and validates binId.
- upload GET /: returned every in-progress upload globally, leaking
filenames across projects. Now scoped via accessibleProjectIds.
These are the same pattern as the holes 2615143 closed on recorders/
sequences/imports/comments — these routes existed before the RBAC commit
landed and were never marked TODO(authz), so the broad sweep missed them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 1 scoped only projects/assets/bins and left recorders, sequences,
imports, comments carrying TODO(authz) markers. A scoped editor/viewer could
still read and mutate those across every project. This closes the gap using
the existing authz.js helpers — no open TODO(authz) markers remain.
- recorders: param('id') resolves project + view baseline, requireRecorderEdit
on PATCH/DELETE/start/stop, GET / filtered by accessibleProjectIds, POST /
asserts edit on the target project (null project = admin-only)
- sequences: same param pattern + requireSequenceEdit on PUT/:id,/clips,conform
and DELETE; GET//POST/ assert on the query/body project
- imports: POST /youtube asserts edit on the body projectId
- comments: router.use guard resolves project via the asset (view to read, edit
to write); also fixes the author bug (req.session.userId -> req.user.id, which
was always NULL so comments had no recorded author)
- capture: intentionally any-logged-in (shared hardware, asset scoped on
registration) — TODO replaced with a rationale note
Security fixes from review of this change:
- recorders POST /:id/start: a per-take projectId override could route a live
asset into a project the caller lacks edit on — now asserts edit on the
override target
- sequences PUT /:id/clips: spliced asset_ids weren't checked, so an editor
could pull in (and via GET /:id leak signed proxy URLs for) assets from a
project they can't access — now every clip asset must belong to the
sequence's project; pre-transaction queries moved inside try/catch so a DB
error returns 500 instead of hanging the request
- tests: recorders-access, sequences-access (incl. cross-project clip guard),
comments-access (incl. author-id regression)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>