dragonflight/WORK_LOG_PLAYOUT.md
Zac 34352e3299 docs(playout): work log — commit map, decisions, testing checklist
Replaces the earlier aspirational "complete" log with the actual commit
sequence on feat/playout-mcr, the §7 decisions as built, the media-flow
diagram, port-contention + failover scope, and a runtime testing checklist
(migration → image build → SRT smoke → failover kill test).
2026-05-30 14:05:57 +00:00

Playout / Master Control — Implementation Work Log

Branch: feat/playout-mcr (off main)
Started: 2026-05-30
Status: Code complete, awaiting runtime validation

Tracks the build of the playout (MCR) subsystem against the design at docs/superpowers/specs/2026-05-30-playout-mcr-design.md.


Commit sequence

#  Commit                                                 Scope
1  docs(playout)                                          Design spec, §7 questions answered
2  feat(mam-api): migration 029                           Six tables, failover columns, audio_normalized flag
3  feat(worker): playout-stage                            S3 → /media + EBU R128 loudnorm + index.js wiring
4  feat(playout): sidecar                                 CasparCG image + AMCP shim, HLS preview consumer, fps-aware frame math
5  feat(mam-api): /playout control plane + auto-failover  Routes + scheduler health tick + restartChannel helper
6  feat(web-ui): MCR page                                 screens-playout, styles, app/shell/index.html wiring
7  build(playout): compose wiring + .env knobs            /media volume, queue addition, build-only service
8  docs(playout): work log                                This file

Resolved §7 decisions (2026-05-30)

  • Audio loudness: pre-normalize at stage time. ffmpeg loudnorm two-pass (I=-23 LUFS, TP=-1 dBTP, LRA=11), linear mode preserves dynamics. Output AAC 192k @ 48 kHz, video stream copied. Per-item audio_normalized flag so re-stages of the same asset skip the pass.
  • Frame rate: 1080p5994 default (was 1080i5994). Per-channel override allowed via video_format. fpsFor(videoFormat) helper in the sidecar drives SEEK / LENGTH / transition-frames math.
  • Preview latency: HLS v1. CasparCG runs a second FFMPEG consumer alongside the primary output, writing /media/live/<channel_id>/index.m3u8 (~600 kbps, 2 s segments, six-segment playlist window). Web UI plays via the existing HLS plumbing.
  • Failover: auto-restart on healthy node for NDI/SRT/RTMP. Alert-only for DeckLink (device-index pinning makes blind re-placement risky). Scheduler tick (PG advisory lock, same lock as recorder schedules) polls sidecar /status; ~3 missed checks → restartChannel(id) picks the most recently-seen-online other node, bumps restart_count, calls /start.
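
The fps-aware frame math from the frame-rate decision can be sketched as follows. This is a hypothetical illustration, not the sidecar's actual fpsFor: it assumes video_format strings follow the CasparCG naming pattern (height, scan type, rate in hundredths), that non-integer rates belong to the NTSC family (nominal × 1000/1001), and that interlaced names carry the field rate. secondsToFrames is an invented helper name.

```javascript
// Hypothetical sketch of an fps lookup; the sidecar's real fpsFor may differ.
function fpsFor(videoFormat) {
  const m = /^(\d+)([pi])(\d{4})$/.exec(videoFormat);
  if (!m) throw new Error(`unrecognized video format: ${videoFormat}`);
  const scan = m[2];
  const hundredths = Number(m[3]);
  // Rates such as 5994 or 2997 are treated as NTSC-family: nominal * 1000/1001.
  const rate = hundredths % 100 === 0
    ? hundredths / 100
    : (Math.round(hundredths / 100) * 1000) / 1001;
  // Assumption: interlaced names state the field rate, so frames are half of it.
  return scan === "i" ? rate / 2 : rate;
}

// Convert a duration in seconds to a whole frame count, e.g. for SEEK / LENGTH.
function secondsToFrames(seconds, videoFormat) {
  return Math.round(seconds * fpsFor(videoFormat));
}
```

Under these assumptions a 10-second offset is 599 frames on a 1080p5994 channel and 500 frames on a 1080p5000 channel, which is why a per-channel override has to feed every frame-based AMCP parameter.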

Architecture notes

Sidecar model. One CasparCG container per channel. Spawned by mam-api via local Docker socket (primary node) or remote node-agent /sidecar/start. Tracked in playout_sidecars plus playout_channels.container_id. Killed on /stop or by restartChannel during failover.

Media flow.

S3 master/proxy → playout-stage worker → /media/playout/<assetId>.<ext>
                                         (loudnormed, AAC@-23 LUFS)
                                                ↓
                                       CasparCG channel #1
                                                ↓
                                   primary consumer  HLS consumer
                                  (DeckLink/NDI/         ↓
                                   SRT/RTMP)        /media/live/<ch_id>/*.m3u8
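
The loudnorm step in the flow above could be built as two ffmpeg invocations: a measurement pass, then a pass that feeds the measured values back in. The sketch below only assembles argument lists and parses the measurement JSON; the function names are hypothetical and the worker's actual code may differ. The filter options (measured_I, measured_TP, measured_LRA, measured_thresh, offset, linear, print_format) are those of ffmpeg's loudnorm filter.

```javascript
// Loudness targets taken from the §7 decision.
const TARGET = { I: -23, TP: -1, LRA: 11 };
const BASE = `loudnorm=I=${TARGET.I}:TP=${TARGET.TP}:LRA=${TARGET.LRA}`;

// Pass 1: analysis only. ffmpeg prints a JSON measurement block on stderr.
function loudnormPass1Args(input) {
  return ["-hide_banner", "-i", input, "-af", `${BASE}:print_format=json`, "-f", "null", "-"];
}

// Pull the trailing JSON object out of the pass-1 stderr text.
function parseLoudnormStats(stderr) {
  const start = stderr.lastIndexOf("{");
  const end = stderr.lastIndexOf("}");
  if (start < 0 || end < start) throw new Error("no loudnorm JSON found in stderr");
  return JSON.parse(stderr.slice(start, end + 1));
}

// Pass 2: linear normalization using the measured values; video is copied.
function loudnormPass2Args(input, output, stats) {
  const filter = [
    BASE,
    `measured_I=${stats.input_i}`,
    `measured_TP=${stats.input_tp}`,
    `measured_LRA=${stats.input_lra}`,
    `measured_thresh=${stats.input_thresh}`,
    `offset=${stats.target_offset}`,
    "linear=true",
    "print_format=summary",
  ].join(":");
  return ["-hide_banner", "-i", input, "-c:v", "copy", "-af", filter,
          "-c:a", "aac", "-b:a", "192k", "-ar", "48000", output];
}
```

Either list can be passed to child_process.spawn("ffmpeg", args). Skipping both passes when audio_normalized is already set is what makes re-stages cheap.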

Port contention. assertDeckLinkFree() blocks starting an SDI channel when a recorder or another channel on the same node+device_index is active.
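
A minimal sketch of that check over in-memory data, assuming hypothetical record shapes (kind, id, nodeId, deviceIndex) and a hypothetical ConflictError carrying the 409 status; the real assertDeckLinkFree() presumably queries the database instead.

```javascript
// Hypothetical error type that a route handler could map to HTTP 409.
class ConflictError extends Error {
  constructor(message) {
    super(message);
    this.name = "ConflictError";
    this.status = 409;
  }
}

// channel: { id, outputType, nodeId, deviceIndex }
// activeUsers: [{ kind: "recorder" | "channel", id, nodeId, deviceIndex }]
function assertDeckLinkFreeSketch(channel, activeUsers) {
  // Only DeckLink (SDI) outputs are tied to a physical device.
  if (channel.outputType !== "decklink") return;
  const holder = activeUsers.find(
    (u) =>
      u.nodeId === channel.nodeId &&
      u.deviceIndex === channel.deviceIndex &&
      // A channel does not conflict with itself.
      !(u.kind === "channel" && u.id === channel.id)
  );
  if (holder) {
    throw new ConflictError(
      `DeckLink device ${channel.deviceIndex} on node ${channel.nodeId} is held by ${holder.kind} ${holder.id}`
    );
  }
}
```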

Failover scope. NDI/SRT/RTMP have no hardware tie, so any healthy cluster_node is eligible. DeckLink channels surface an alert in the UI (status='error' + error_message) and require operator intervention.
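
The failover rules can be condensed into one pure decision function. This is a sketch under assumptions: the field names (missedChecks, outputType, lastSeen, online) and the planFailover name are hypothetical, and the limit of 3 comes from the "~3 missed checks" figure in the decisions list.

```javascript
const MISSED_CHECK_LIMIT = 3; // from "~3 missed checks"
const AUTO_RESTART_OUTPUTS = new Set(["ndi", "srt", "rtmp"]);

// channel: { nodeId, outputType, missedChecks, restartCount }
// nodes:   [{ id, online, lastSeen }] where lastSeen is a ms timestamp
function planFailover(channel, nodes) {
  if (channel.missedChecks < MISSED_CHECK_LIMIT) return { action: "none" };

  // DeckLink is pinned to a device index, so it is never re-placed blindly.
  if (!AUTO_RESTART_OUTPUTS.has(channel.outputType)) {
    return { action: "alert", status: "error", errorMessage: "sidecar unreachable; operator intervention required" };
  }

  // Pick the most recently seen online node other than the failed one.
  const candidates = nodes
    .filter((n) => n.online && n.id !== channel.nodeId)
    .sort((a, b) => b.lastSeen - a.lastSeen);
  if (candidates.length === 0) {
    return { action: "alert", status: "error", errorMessage: "no healthy node available for restart" };
  }
  return {
    action: "restart",
    targetNodeId: candidates[0].id,
    restartCount: channel.restartCount + 1,
  };
}
```

Keeping the decision free of I/O means the scheduler tick only has to gather inputs and act on the result, and the branch logic can be unit-tested without a cluster.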

Testing checklist

  • Apply migration 029 on dev DB
  • Build playout image: docker compose --profile build-only build playout
  • Build web-ui (screens-playout joins the esbuild list automatically)
  • Create channel via POST /api/v1/playout/channels (SRT first, no HW)
  • Stage 2-3 assets to a playlist, verify loudnorm metadata in stderr
  • Start channel → sidecar container appears in docker ps
  • AMCP smoke: telnet <host> 5250, VERSION, INFO
  • Play playlist; verify HLS at /media/live/<channel_id>/index.m3u8
  • Skip / pause / resume / stop
  • As-run log: GET /api/v1/playout/channels/:id/asrun
  • Kill sidecar container → scheduler should restart on another node within ~3 ticks (~45s), restart_count increments
  • DeckLink channel kill: status flips to 'error', NO restart attempt
  • Try starting a decklink channel on a device_index already held by a recorder → 409
  • MCR UI smoke: nav entry visible, page renders, drag-drop adds items, transport buttons hit the API
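
The HLS verification step in the checklist could be partly automated with a small playlist check. The sketch below parses the text of index.m3u8 and compares it against the preview settings (2 s segments, six-entry window). The function names and the 0.5 s tolerance are assumptions, not part of the codebase.

```javascript
// Extract the target duration and per-segment durations from a media playlist.
function inspectPlaylist(text) {
  const lines = text.split(/\r?\n/).map((l) => l.trim()).filter(Boolean);
  if (lines[0] !== "#EXTM3U") throw new Error("not an HLS playlist");
  let targetDuration = null;
  const segmentDurations = [];
  for (const line of lines) {
    if (line.startsWith("#EXT-X-TARGETDURATION:")) {
      targetDuration = Number(line.slice("#EXT-X-TARGETDURATION:".length));
    } else if (line.startsWith("#EXTINF:")) {
      // "#EXTINF:2.000000," -> 2 (parseFloat stops at the comma)
      segmentDurations.push(parseFloat(line.slice("#EXTINF:".length)));
    }
  }
  return { targetDuration, segmentCount: segmentDurations.length, segmentDurations };
}

// Assumed thresholds: at most 6 segments, each within 0.5 s of the 2 s target.
function playlistLooksHealthy(info, maxSegments = 6, segmentSeconds = 2) {
  return (
    info.segmentCount > 0 &&
    info.segmentCount <= maxSegments &&
    info.segmentDurations.every((d) => d <= segmentSeconds + 0.5)
  );
}
```

Reading the file twice a few seconds apart and checking that the segment names advance would additionally confirm that the consumer is still writing.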

Known gaps (deferred)

  • No WebRTC preview (HLS-only v1 — 4-6s lag, fine for confidence monitor).
  • No graphics/CG overlay layer in Phase A (templates land in Phase B).
  • No Phase B scheduler / 24/7 wall-clock channel (schema is in place, scheduler tick is not).
  • No multi-channel grid view (one channel at a time per page).
  • No timecode / remaining-duration overlay (would need CasparCG INFO poll).
  • No audio level meters on the UI.
  • restartChannel updates DB state and triggers /start; if the new node also fails repeatedly, there's no exponential backoff yet — bounded only by the manual stop button.
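
For the last gap, one possible shape for a backoff is a capped doubling keyed on restart_count. This is a sketch only: restartDelayMs is a hypothetical name, the 15 s base is inferred from the "~3 ticks (~45s)" figure in the checklist, and the 5-minute cap is an arbitrary assumption.

```javascript
// Delay before the next restart attempt, doubling per prior restart, capped.
function restartDelayMs(restartCount, baseMs = 15000, capMs = 300000) {
  const exponent = Math.max(0, restartCount - 1);
  return Math.min(capMs, baseMs * 2 ** exponent);
}
```

With these defaults the first restart waits 15 s, the third 60 s, and anything from the sixth onward is held at the 5-minute cap.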