datarhei-dragonfork-core/docs/design/2026-04-17-datarhei-dragon-fork-m2-webrtc-core-integration.md
Zac Gaetano 86bae816c1 docs(m2): WebRTC into Core proper — design spec
M2 promotes the M1 standalone PoC into the datarhei Core binary so
WebRTC becomes a first-class output alongside RTMP/SRT/HLS, surfaced
in the core-ui dashboard.

Architecture: new app/webrtc sibling subsystem + two small hooks on
restream (ProcessHooks + AppendOutput), reusing the untouched M1
core/webrtc package. WHEP served under /api/v3/process/{id}/whep,
inheriting JWT auth. A new "Live (WebRTC)" tab on the process detail
view provides the embedded browser player.

Covers: purpose, architecture diagram, decision table, components,
data flow (enable/subscribe/stop/disable/restart), error handling,
testing strategy (unit/integration/e2e), acceptance criteria,
rollback, and a seven-milestone sanity breakdown.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-17 09:42:16 -04:00


M2 — WebRTC into datarhei Core proper

Status: Design approved, implementation pending
Date: 2026-04-17
Author: Zac (zgaetano@wilddragon.net), Dragon Fork
Depends on: M1 (2026-04-16-datarhei-dragon-fork-m1-webrtc-poc.md)
Branch: m2-webrtc-core-integration

1. Purpose

M1 produced a standalone cmd/webrtc-poc binary that proved the Pion-based WHEP egress path end-to-end on TrueNAS. M2 promotes that work into the datarhei Core binary so WebRTC becomes a first-class output alongside RTMP, SRT, and HLS, surfaced in the core-ui dashboard.

After M2 a user can:

  1. Create or edit a process in core-ui.
  2. Toggle a "WebRTC" switch on that process's config.
  3. Save → Core restarts the process with an extra RTP output leg.
  4. Open the process's "Live (WebRTC)" tab and watch the feed in the browser with sub-second latency, authenticated by the user's JWT.

Out of scope for M2 (explicit):

  • Public / unauthenticated embeds (handled in M3 via signed URLs).
  • A separate "broadcast center" dashboard page (per-process tab is enough).
  • Lazy / on-demand Source binding — eager binding only.
  • WHIP ingest — that's M4.

2. High-level architecture

                  ┌────────────────────────────────────────────┐
                  │               datarhei Core                │
                  │                                            │
  FFmpeg (per     │   ┌──────────────┐      ┌──────────────┐   │
  process,        │   │   restream   │─────▶│  app/webrtc  │   │
  spawned by      │──▶│              │◀─────│  (NEW)       │   │
  restream) ───┐  │   │ - lifecycle  │hooks │              │   │
               │  │   │ - AppendOut  │      │ - registry   │   │
               │  │   │ - config     │      │ - sources    │   │
               │  │   │   (now incl. │      │ - PeerFactory│   │
               │  │   │    WebRTC)   │      │ - WHEP mux   │   │
               │  │   └──────────────┘      └──────┬───────┘   │
               │  │                                │           │
  udp://       │  │   ┌──────────────┐             │           │
  127.0.0.1:   └─▶│   │ core/webrtc  │◀────uses────┘           │
  <auto>rtp       │   │ (from M1,    │                         │
                  │   │  unchanged)  │    ┌────────────────┐   │
                  │   └──────────────┘    │   http/server  │   │
                  │                       │                │   │
                  │                       │ mounts         │   │
                  │                       │ /api/v3/process│   │
                  │                       │ /:id/whep      │   │
                  │                       └────────┬───────┘   │
                  └────────────────────────────────┼───────────┘
                                                   │
                         (DTLS-SRTP over ICE)      │
                                                   ▼
                                           Browser (core-ui
                                           player tab, RTCPeer)

Three boxes matter:

  • existing restream — grows two tiny hooks.
  • existing core/webrtc (from M1) — unchanged.
  • new app/webrtc — the glue subsystem.

3. Key decisions (settled during brainstorming)

| # | Decision | Choice |
|---|----------|--------|
| 1 | Scope | Backend + full UI with embedded player |
| 2 | Stream addressing | Per-process: /api/v3/process/{id}/whep |
| 3 | HTTP listener | Under Core's /api/v3 group (inherits JWT) |
| 4 | Viewer auth | JWT only in M2; public embeds are M3 |
| 5 | FFmpeg wiring | Auto-inject UDP RTP output; re-encode when needed |
| 6 | Enable state | Field on restream.Config.WebRTC |
| 7 | UI surface | New "Live (WebRTC)" tab on process detail view |
| 8 | Lifecycle | Eager: Source bound when process starts |
| 9 | Code placement | New app/webrtc sibling subsystem (not inside restream) |

4. Components

4.1 Config — config/data.go + restream/app/process.go

Per-process:

// restream/app/process.go — new sibling of ConfigIO on Config
type ConfigWebRTC struct {
    Enabled        bool   // master switch for this process
    VideoPT        uint8  // default 102 (H.264)
    AudioPT        uint8  // default 111 (Opus)
    ForceTranscode bool   // default false — true => always re-encode
}

Global (Core config, one block):

// config/data.go
type DataWebRTC struct {
    Enable     bool     // master feature flag; default false for safety
    PublicIP   string   // NAT1To1 / ICE host candidate rewrite (e.g. LAN IP)
    NAT1To1IPs []string // advanced: multiple public IPs
    UDPMuxPort int      // optional: single UDP port for all ICE traffic
                        // (0 = ephemeral per peer, default)
}

Registered through the existing vars.Register mechanism in config/config.go.
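
To make the per-process defaults concrete, here is a minimal sketch of how they might be applied and sanity-checked. The WithDefaults and Validate helpers are hypothetical and not part of the spec; treating a zero payload type as "unset" and requiring distinct video and audio payload types are assumptions of this sketch. The 96-127 bound is the dynamic RTP payload type range.

```go
package main

import "fmt"

// ConfigWebRTC mirrors the per-process struct from section 4.1.
type ConfigWebRTC struct {
	Enabled        bool
	VideoPT        uint8
	AudioPT        uint8
	ForceTranscode bool
}

const (
	DefaultVideoPT uint8 = 102 // H.264, per the spec
	DefaultAudioPT uint8 = 111 // Opus, per the spec
)

// WithDefaults returns a copy with unset (zero) payload types replaced by
// the documented defaults. Hypothetical helper; zero-means-unset is assumed.
func (c ConfigWebRTC) WithDefaults() ConfigWebRTC {
	if c.VideoPT == 0 {
		c.VideoPT = DefaultVideoPT
	}
	if c.AudioPT == 0 {
		c.AudioPT = DefaultAudioPT
	}
	return c
}

// Validate rejects payload types outside the dynamic range and, as an
// assumption of this sketch, identical video and audio payload types.
func (c ConfigWebRTC) Validate() error {
	for _, pt := range []uint8{c.VideoPT, c.AudioPT} {
		if pt < 96 || pt > 127 {
			return fmt.Errorf("webrtc: payload type %d outside dynamic range 96-127", pt)
		}
	}
	if c.VideoPT == c.AudioPT {
		return fmt.Errorf("webrtc: video and audio share payload type %d", c.VideoPT)
	}
	return nil
}

func main() {
	cfg := ConfigWebRTC{Enabled: true}.WithDefaults()
	fmt.Printf("%+v valid=%v\n", cfg, cfg.Validate() == nil)
}
```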

4.2 New package — app/webrtc/

| File | Responsibility |
|------|----------------|
| subsystem.go | type WebRTC struct with Start() / Stop(); owns the core/webrtc.Registry and a single core/webrtc.PeerFactory. Implements the same shape as other Core subsystems. |
| lifecycle.go | OnProcessStart(id, cfg) / OnProcessStop(id) callbacks registered with restream. Allocates a UDP port, calls restream.AppendOutput, binds a core/webrtc.Source, registers it. |
| portalloc.go | Alloc() (int, error): binds :0 on loopback, reads the port, closes the listener, returns the number. The race window is microseconds; NewSourceOn re-binds immediately. If the rebind fails (rare: another process grabbed the port in the gap), OnProcessStart returns the error, restream aborts the start, and the operator retries. Tested with a 100× tight loop. |
| ffmpeg_args.go | BuildArgs(cfg ConfigWebRTC, port int) []string: emits the -map, -c:v, -c:a, -f rtp, and udp://127.0.0.1:PORT?pkt_size=1316 fragments. Branches on ForceTranscode. |
| handler.go | HTTP handler for WHEP: wraps the M1 core/webrtc.NewWHEPHandler, but looks up the Source by the processID path param. Adds DELETE /api/v3/process/:id/whep/:peerid. |

4.3 Two additions to restream

  1. Lifecycle callback pair. Added as fields on the restream manager:

    type ProcessHook func(id string, cfg *app.Config) error
    type ProcessHooks struct {
        OnStart ProcessHook  // fires after args are assembled, before exec
        OnStop  ProcessHook  // fires after wait() returns
    }
    

    Single consumer is fine — no event bus yet. app/webrtc registers itself at subsystem start.

  2. AppendOutput(id string, extra []string) error — mutates the pending FFmpeg args for a process that has fired OnStart but has not yet exec'd. Inside OnStart, the subsystem calls AppendOutput to add the -f rtp udp://… fragment; restream then exec's with the augmented args. Outside the OnStart window AppendOutput returns an error — Core does not mutate running FFmpeg processes.

These two additions are useful beyond WebRTC (stats consumers, future sidecar modules), so the surface cost is justified.

4.4 One route in http/server.go

Inside the existing /api/v3 group (inherits JWT auth):

api.POST("/process/:id/whep",          webrtcHandler.Subscribe)
api.DELETE("/process/:id/whep/:peerid", webrtcHandler.Unsubscribe)

4.5 UI — core-ui/src/views/Edit/LiveTab.jsx (new)

  • Shown only when process.config.webrtc.enabled === true.
  • <video autoplay muted playsinline /> driven by a small useWHEP() hook that does:
    1. new RTCPeerConnection({ iceServers: [] })
    2. pc.addTransceiver('video', { direction: 'recvonly' })
    3. pc.addTransceiver('audio', { direction: 'recvonly' })
    4. await pc.setLocalDescription(await pc.createOffer())
    5. POST offer SDP to /api/v3/process/{id}/whep with the JWT.
    6. pc.setRemoteDescription(answer).
    7. pc.ontrack → attach stream to the <video>.
  • "Copy WHEP URL" button.
  • Status line derived from pc.connectionState + pc.getStats() (codec, bitrate).
  • No external WebRTC dependency — browser-native RTCPeerConnection.

5. Data flow

5.1 Enabling WebRTC (write)

core-ui  ──PUT /api/v3/process/{id} { ..., config: { webrtc: { enabled: true }}}──▶ http
http     ──restream.UpdateProcess(id, cfg)──▶ restream
restream ──persist → stop old → about to exec new──▶ OnProcessStart(id, cfg)
app/webrtc ─port P = Alloc()
app/webrtc ─restream.AppendOutput(id, BuildArgs(cfg.WebRTC, P))
app/webrtc ─NewSourceOn(id, "127.0.0.1", P).Start() → registry[id] = src
restream ─exec ffmpeg with augmented args

Ordering guarantee: Source is bound before FFmpeg execs. No race window.
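
For the BuildArgs step in the flow above, one plausible shape is sketched below. The exact fragments the real function emits are not given in this document, so everything here is an assumption: the sketch emits one RTP output per track to the same loopback port, distinguished by FFmpeg's -payload_type option, copies codecs unless ForceTranscode is set, and picks libx264 and libopus as the encoders. It only covers the case where both a video and an audio stream exist.

```go
package main

import (
	"fmt"
	"strconv"
)

// ConfigWebRTC mirrors the per-process struct from section 4.1.
type ConfigWebRTC struct {
	Enabled        bool
	VideoPT        uint8
	AudioPT        uint8
	ForceTranscode bool
}

// BuildArgs returns FFmpeg output fragments for one video and one audio RTP
// output on the given loopback port. Stream selectors, encoder names, and the
// two-outputs-one-port layout are assumptions of this sketch.
func BuildArgs(cfg ConfigWebRTC, port int) []string {
	url := fmt.Sprintf("udp://127.0.0.1:%d?pkt_size=1316", port)

	vcodec, acodec := "copy", "copy"
	if cfg.ForceTranscode {
		vcodec, acodec = "libx264", "libopus"
	}

	return []string{
		// Video leg.
		"-map", "0:v:0", "-c:v", vcodec,
		"-payload_type", strconv.Itoa(int(cfg.VideoPT)),
		"-f", "rtp", url,
		// Audio leg.
		"-map", "0:a:0", "-c:a", acodec,
		"-payload_type", strconv.Itoa(int(cfg.AudioPT)),
		"-f", "rtp", url,
	}
}

func main() {
	cfg := ConfigWebRTC{Enabled: true, VideoPT: 102, AudioPT: 111}
	fmt.Println(BuildArgs(cfg, 5004))
}
```

Returning a plain []string keeps the function pure, which is what lets the unit test in 7.1 assert the exact arg slice.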

5.2 WHEP subscribe (read)

browser  ──POST /api/v3/process/{id}/whep (SDP offer, JWT)──▶ http
http (JWT ok) ──handler.Subscribe──▶ app/webrtc
app/webrtc ─src = registry[id]  (404 if absent)
app/webrtc ─peer, answer = factory.NewPeer(src, offer)
app/webrtc ─go forwarder: src.Subscribe(ch) → peer.WriteRTP
http     ──201 Created, Location: .../whep/{peerid}, body=answer──▶ browser
browser  ──ICE, DTLS-SRTP──▶ peer ──▶ <video>

5.3 Process stop (teardown)

restream ─kill ffmpeg, wait()──▶ OnProcessStop(id)
app/webrtc ─for each peer in peers[id]: peer.Close()
app/webrtc ─src = registry.Remove(id); src.Close()
app/webrtc ─delete peers[id]
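
A minimal sketch of the teardown, assuming peers and Sources can both be treated as closers. The state type and countingCloser are hypothetical. Unlike the step order above, the sketch detaches the map entries under the lock before closing anything, so a concurrent WHEP POST sees a 404 instead of a Source that is mid-close, and a second OnProcessStop for the same id is a no-op.

```go
package main

import (
	"fmt"
	"sync"
)

// closer is the only behaviour the teardown needs from peers and Sources.
type closer interface{ Close() error }

// countingCloser records how often Close was called (for illustration).
type countingCloser struct{ n int }

func (c *countingCloser) Close() error { c.n++; return nil }

// state is a hypothetical slice of the app/webrtc subsystem.
type state struct {
	mu       sync.Mutex
	registry map[string]closer   // process ID -> Source
	peers    map[string][]closer // process ID -> subscribed peers
}

// OnProcessStop closes every peer of the process, then its Source.
func (s *state) OnProcessStop(id string) error {
	s.mu.Lock()
	peers := s.peers[id]
	src, ok := s.registry[id]
	delete(s.peers, id)
	delete(s.registry, id)
	s.mu.Unlock()

	// Close outside the lock so slow peers do not block other processes.
	for _, p := range peers {
		_ = p.Close()
	}
	if ok {
		return src.Close()
	}
	return nil
}

func main() {
	src := &countingCloser{}
	peer := &countingCloser{}
	s := &state{
		registry: map[string]closer{"p1": src},
		peers:    map[string][]closer{"p1": {peer}},
	}
	fmt.Println(s.OnProcessStop("p1"), s.OnProcessStop("p1"))
	fmt.Println("source closes:", src.n, "peer closes:", peer.n)
}
```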

5.4 Disabling WebRTC on a running process

Same update path as 5.1, with the new cfg carrying webrtc.enabled = false. Restream persists the config, stops the process (which fires OnProcessStop, so 5.3 runs), then starts it again without the RTP leg.

5.5 Core restart

Restream enumerates stored configs at boot and starts each process. OnProcessStart fires inside that loop for every webrtc.enabled = true process. WebRTC state rebuilds from the persisted config — no separate bootstrap path.

6. Error handling

| Failure | Surface |
|---------|---------|
| Port alloc fails | OnProcessStart returns error → restream aborts start, logs webrtc: port alloc failed. Process shows failed in UI. |
| FFmpeg wiring fails (bad codec + !ForceTranscode) | Source binds; RTP counter stays zero. Log after N seconds of silence; expose RTPPacketsReceived to UI. |
| WHEP POST for unknown id | 404 stream not found (same as M1). |
| Peer DELETE unknown peerid | 204 No Content (idempotent). |
| JWT missing / invalid | 401, inherited from the /api/v3 group. No code in handler. |
| ICE fails on client | Browser fires iceconnectionstatechange with iceConnectionState = failed → UI retry button. Server no-op. |
| Subsystem Start fails at boot (bad PublicIP, etc.) | Subsystem logs the error and declines to start; the hooks are never registered; restream runs all processes without the RTP leg. Core does not exit because WebRTC is non-critical. |
| Subscriber backpressure | Already handled in core/webrtc.Source: a full channel drops packets. No change. |

Design rule: a WebRTC subsystem failure must not prevent a process's RTMP/SRT/HLS outputs from running. Hooks wrap their own errors and log; restream does not abort a start because of a WebRTC problem unless the AppendOutput itself fails (wrong args shape — a programming bug, not a runtime condition).

7. Testing strategy

7.1 Unit (fast, in-package, no network)

  • app/webrtc/ffmpeg_args_test.go — table-driven: video-only, audio-only, both, transcode on/off. Asserts exact arg slice.
  • app/webrtc/portalloc_test.go: Alloc() returns a port that a subsequent ListenUDP can bind; run 100× to catch races.
  • app/webrtc/lifecycle_test.go — fake restream calls OnProcessStart / OnProcessStop; asserts registry state transitions and Source is closed exactly once.

7.2 Integration (in-process, real HTTP, no FFmpeg)

  • app/api/api_webrtc_whep_test.go — boot a Core with a fake process that has webrtc.enabled=true; inject synthetic RTP on the allocated port; POST a WHEP offer using the M1 test/whep-client.Subscribe helper (now imported as a library); assert both tracks receive a packet within 2s.
  • app/api/api_webrtc_auth_test.go — POST without JWT → 401; POST for unknown id → 404; DELETE unknown peerid → 204.
  • app/api/config_persist_test.go — create process with webrtc.enabled, simulate Core restart, assert Source is re-bound and WHEP still works.

7.3 End-to-end (manual, TrueNAS)

  • Replace the M1 test/publish.sh workflow with a real Core process configured via core-ui (testsrc2 as input), flip WebRTC on, open the Live tab, verify the test pattern plays.
  • Use chrome://webrtc-internals to confirm ICE completes and SRTP is flowing.

No new test dependencies. test/whep-client graduates from binary to importable helper package.

8. Acceptance criteria

M2 is done when, on a fresh TrueNAS deploy of the Core binary:

  1. PUT /api/v3/config with a webrtc.enable=true global block succeeds.
  2. Creating a process with config.webrtc.enabled=true via core-ui persists and starts.
  3. POST /api/v3/process/{id}/whep with a valid JWT returns 201 with an SDP answer, and the peer connection's iceConnectionState reaches connected.
  4. The core-ui "Live (WebRTC)" tab plays video within 3 seconds of opening.
  5. Disabling WebRTC in the UI stops the stream and subsequent WHEP POSTs return 404.
  6. Restarting the Core binary keeps the stream working without manual reconfiguration.
  7. All unit and integration tests pass with -race.

9. Rollback

Each layer has a rollback lever:

  • Operator: set global webrtc.enable = false in Core config → subsystem declines to start (no hooks registered); processes run without the RTP leg; existing RTMP/SRT/HLS unaffected. Core continues to serve normally.
  • Per-process: toggle config.webrtc.enabled = false in the process config → restream restarts the process without the leg.
  • Code: the app/webrtc subsystem is a single import in main.go. Removing that import and the two restream hook wires restores pre-M2 behavior. core/webrtc stays in the tree as inert code.

10. Milestones inside M2

Not the full plan — that lives in a separate plan doc after this spec is approved. This is a sanity breakdown:

  1. Config wiring — add DataWebRTC and ConfigWebRTC; tests for marshal/unmarshal and defaults.
  2. Restream hooks — add ProcessHooks and AppendOutput; unit tests using the existing restream test harness.
  3. app/webrtc package — subsystem, lifecycle, portalloc, ffmpeg_args, handler; unit tests per the testing strategy.
  4. Core main.go wiring — instantiate subsystem, register hooks, mount HTTP route.
  5. Integration tests — in-process WHEP end-to-end, auth, persistence.
  6. core-ui LiveTab — new React tab + WHEP hook.
  7. TrueNAS smoke test — rebuild Core image, redeploy, verify live.

Each milestone ends with a commit. The feature branch is m2-webrtc-core-integration (created from m1-webrtc-poc).