M2: WebRTC into datarhei Core proper #4

Closed
zgaetano wants to merge 0 commits from m2-webrtc-core-integration into main
Summary

Promotes the M1 standalone cmd/webrtc-poc PoC into a first-class output of the datarhei Core binary. After this PR a process whose config.webrtc.enabled = true automatically gets two RTP output legs (video + audio) and a JWT-protected WHEP endpoint at POST /api/v3/whep/{processID}.

Design: docs/design/2026-04-17-datarhei-dragon-fork-m2-webrtc-core-integration.md
Plan: docs/superpowers/plans/2026-04-17-m2-webrtc-core-integration.md

Architecture (recap)

  • New app/webrtc/ subsystem — owns the core/webrtc.Registry, PeerFactory, port allocator, and Echo-mounted WHEP handler.
  • Two small additions to restream: ProcessHooks (OnStart/OnStop) and AppendOutput for injecting the RTP legs at exec time.
  • Per-process ConfigWebRTC on restream/app.Config (json webrtc).
  • Global WebRTC block on config.Data with CORE_WEBRTC_* env bindings.
  • core/webrtc from M1 is unchanged.

What's in this PR

  • docs/design/ + docs/superpowers/plans/ — full design + step-by-step plan
  • restream/app/process.go — ConfigWebRTC type, Config.Clone() carries it
  • config/data.go + config/config.go — global block + env vars
  • restream/restream.go — ProcessHook, ProcessHooks, AppendOutput, OnStart/OnStop wired into the task lifecycle
  • app/webrtc/{subsystem,lifecycle,portalloc,ffmpeg_args,handler}.go + their tests
  • app/api/api.go — subsystem instantiation, hook registration
  • http/server.go — WHEP routes mounted under /api/v3 (inherits JWT)
  • deploy/truenas/core/ — Dockerfile + compose for the production deploy
  • app/webrtc/integration_test.go — synthetic-RTP smoke test

Bug fixes already in this PR

| Commit | What | How found |
| ------- | ---- | --------- |
| 2d29dc9 | Config.Clone() was dropping the WebRTC section | TrueNAS deploy: enable=true env, but WHEP returned 404 anyway |
| f6d36bf | API DTO ProcessConfig was dropping webrtc.enabled on Marshal() | Acceptance test: POST /api/v3/process succeeded but webrtc block silently zero'd before reaching restream |
| 0417aff | test/whep-client couldn't talk to a JWT-gated endpoint | Smoke test required -token for /api/v3/whep/... |

Acceptance criteria from the design (§ 8)

| # | Criterion | Status |
| - | --------- | ------ |
| 1 | webrtc.enable=true global block accepted | ✅ verified live |
| 2 | Process with config.webrtc.enabled=true persists and starts | ✅ verified live |
| 3 | POST /api/v3/process/{id}/whep returns 201 with valid JWT, ICE reaches connected | ⚠️ partial — WHEP route + Source registration verified, but a real Pion subscribe needs a real RTMP/SRT publisher. See #2 for the contrived-test blocker. |
| 4 | core-ui "Live (WebRTC)" tab plays video | ⏳ deferred — UI lives in the separate core-ui repo |
| 5 | Disabling stops the stream and WHEP returns 404 | ✅ verified live (peer cleanup runs in OnProcessStop) |
| 6 | Core restart preserves WebRTC state | ✅ verified live (already running for 2 weeks since Apr 17 deploy) |
| 7 | All unit + integration tests pass with -race | ✅ go test -race ./core/webrtc/... ./app/webrtc/... ./config/... ./restream/... clean |

Live deploy proof

The TrueNAS deploy at dragonfork-core has been running build 2d29dc9 since 2026-04-17, surviving daily traffic. After commit f6d36bf it has been rebuilt and verified:

  • POST /api/v3/process with webrtc.enabled=true → 200, persisted with WebRTC block intact
  • Process starts, log shows WebRTC egress registered for process audio_port=49878 audio_pt=111 id="smoke" video_port=49877 video_pt=102
  • FFmpeg command-line includes the two RTP legs: ... -f rtp udp://127.0.0.1:49877 ... -f rtp udp://127.0.0.1:49878
  • POST /api/v3/whep/<id> returns 201 (verified with empty SDP → expected set remote: failed to unmarshal SDP: EOF, route+auth+lookup all working)

Known gaps tracked as follow-ups

  • #2 — BuildArgs hardcodes -map 0:v:0/-map 0:a:0. Fine for RTMP/SRT (single combined input); breaks multi-input lavfi test pipelines. M3.
  • #3 — Swagger doesn't list /api/v3/whep/... (docs.go pre-dates these routes). M4.

Tests

$ go test -race ./...
ok  	github.com/datarhei/core/v16/core/webrtc       1.333s
ok  	github.com/datarhei/core/v16/app/webrtc        1.196s
ok  	github.com/datarhei/core/v16/config            1.010s
ok  	github.com/datarhei/core/v16/restream         15.113s
ok  	github.com/datarhei/core/v16/http/api          0.003s   ← new ProcessConfigWebRTC roundtrip test
ok  	github.com/datarhei/core/v16/http/handler/api  0.781s
ok  	github.com/datarhei/core/v16/test/whep-client  0.183s

Commits

0417aff test(whep-client): add -token flag for JWT-gated /api/v3/whep endpoints
f6d36bf fix(http/api): carry process WebRTC config through the API DTO
2d29dc9 fix(config): preserve WebRTC section in Config.Clone()
d96aa70 deploy(truenas): Core image + compose for M2 WebRTC rollout
b030102 test(webrtc): add M2 integration smoke test
83eaa28 feat(webrtc): wire app/webrtc subsystem into Core lifecycle
f6d5b33 feat(webrtc): add Echo WHEP handler for app/webrtc subsystem
9d38e9c feat(webrtc): add app/webrtc subsystem + lifecycle hooks
46531bb feat(restream): add ProcessHooks for WebRTC subsystem integration
16ae17d feat(app/webrtc): port allocator + FFmpeg arg builder
80db028 feat(config): add webrtc global config block
eaeefee feat(restream): add ConfigWebRTC per-process field
c38036d docs(m2): implementation plan
86bae81 docs(m2): WebRTC into Core proper — design spec

(Co-authored with Claude Opus 4.7.)

zgaetano added 27 commits 2026-05-03 01:10:48 -04:00
Pion webrtc/v4 (v4.2.11) requires Go 1.24+. Upstream datarhei was at
go 1.21.0. Bumping to go 1.24.0 pulls minor bumps across testify,
golang.org/x/{crypto,net,sync,sys,text,time,tools,mod}; vendor/ is
regenerated via 'go mod vendor' to reflect the new versions.

No application code changes; pure dep bump to unblock M1.
Adds github.com/pion/rtp v1.10.1 as a direct dependency (vendored).
Vendors github.com/pion/webrtc/v4 v4.2.11 and its transitive
dependencies (datachannel, dtls/v3, ice/v4, interceptor, logging,
mdns/v2, sctp, sdp/v3, srtp/v3, stun/v3, transport/v4, turn/v4).
Minimal egress-only server that wires Source, Registry, PeerFactory and
WHEPHandler together on a single stream id. Listens for RTP on a local
UDP port (default 127.0.0.1:10000) and serves WHEP on :8787.

Not part of the Core binary — will be demoted to an internal test helper
once M2 integrates WebRTC output into the process-graph.
Generates a synthetic testsrc2 video + sine audio and pushes H.264/Opus
RTP to the webrtc-poc's UDP port, using the hard-coded payload types
(102 video, 111 audio) the M1 forwarder dispatches on. Intended to be
run alongside test/whep-client (M1 Task 11) for end-to-end verification.
test(webrtc): add Pion WHEP subscriber client + e2e test
Some checks failed
tests / build (push) Failing after 13s
CodeQL / Analyze (pull_request) Failing after 2s
tests / build (pull_request) Failing after 2s
413d0f24b6
whep-client/main.go: minimal Pion subscriber that POSTs a recvonly
offer, applies the answer, and waits for one RTP packet on each of
the video and audio tracks. Used as M1's end-to-end verifier.

whep-client/main_test.go: in-process e2e wiring — stands up Source,
Registry, PeerFactory and WHEPHandler behind an httptest server,
injects synthetic PT=102/111 RTP on the Source's UDP port and calls
Subscribe. Validates the full egress pipeline without requiring
FFmpeg or external network. Skipped under -short.
feat(webrtc): add -rtp-host flag + TrueNAS Docker deploy
Some checks failed
tests / build (push) Failing after 3s
CodeQL / Analyze (pull_request) Failing after 3s
tests / build (pull_request) Failing after 3s
9e3f031f95
- core/webrtc: NewSourceOn(streamID, host, port) allows binding the
  RTP UDP socket on something other than 127.0.0.1, required when the
  PoC runs in a container and must accept RTP from LAN publishers.
  NewSource(streamID, port) stays as a convenience wrapper on
  127.0.0.1 for existing tests and loopback-only local runs.

- cmd/webrtc-poc: new -rtp-host flag (default 127.0.0.1 for safety).

- deploy/docker/Dockerfile: two-stage build, scratch runtime, ~14 MB.

- deploy/truenas/docker-compose.yml: host-networked stack template
  driven by a .env file. Host networking is required for WebRTC ICE
  to work without NAT rewriting per-candidate.

- deploy/truenas/README.md: operator runbook with port picking,
  bring-up, verification curls, and security notes.
M2 promotes the M1 standalone PoC into the datarhei Core binary so
WebRTC becomes a first-class output alongside RTMP/SRT/HLS, surfaced
in the core-ui dashboard.

Architecture: new app/webrtc sibling subsystem + two small hooks on
restream (ProcessHooks + AppendOutput), reusing the untouched M1
core/webrtc package. WHEP served under /api/v3/process/{id}/whep,
inheriting JWT auth. A new "Live (WebRTC)" tab on the process detail
view provides the embedded browser player.

Covers: purpose, architecture diagram, decision table, components,
data flow (enable/subscribe/stop/disable/restart), error handling,
testing strategy (unit/integration/e2e), acceptance criteria,
rollback, and a seven-milestone sanity breakdown.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds the per-process WebRTC egress toggle + codec/payload-type knobs
described in the M2 spec. Clone() carries it forward. No behavior
change yet; the subsystem wiring comes later in M2.
Adds webrtc.enable, webrtc.public_ip, webrtc.nat_1_to_1_ips, and
webrtc.udp_mux_port to the Core Data struct and registers each via
the existing vars system. Default is disabled; no behavior change
without explicit opt-in.
Adds Alloc(), the ephemeral loopback UDP port grabber the subsystem
uses to pick the RTP port it will hand to FFmpeg and then re-bind with
core/webrtc.NewSourceOn. Covered by a 100x rebind test.

Adds BuildArgs(), which emits the -f rtp output fragments (video on
the passed port, audio on port+1) with copy codecs by default and an
H.264 baseline / libopus re-encode leg when ForceTranscode is set.
Covered by three unit tests.
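
The shape of the fragments BuildArgs() emits can be sketched roughly as follows. This is an illustration only: `rtpOpts` and `buildRTPArgs` are hypothetical names, the real signature is not shown in this PR text, and the re-encode flags (`libx264`, `-profile:v baseline`, `libopus`) are an assumed reading of "H.264 baseline / libopus". The `udp://127.0.0.1:<port>` target and the `0:v:0` / `0:a:0` maps follow the log line and issue #2 quoted in the description.

```go
package main

import (
	"fmt"
	"strings"
)

// rtpOpts is a hypothetical stand-in for the per-process WebRTC settings.
type rtpOpts struct {
	VideoPort      int  // audio is sent to VideoPort+1
	ForceTranscode bool // false: stream copy; true: re-encode both legs
}

// buildRTPArgs returns the FFmpeg output fragments for the two RTP legs:
// video on VideoPort, audio on VideoPort+1.
func buildRTPArgs(o rtpOpts) []string {
	vcodec := []string{"-c:v", "copy"}
	acodec := []string{"-c:a", "copy"}
	if o.ForceTranscode {
		// Assumed encoder flags; the real ones may differ.
		vcodec = []string{"-c:v", "libx264", "-profile:v", "baseline"}
		acodec = []string{"-c:a", "libopus"}
	}
	var args []string
	args = append(args, "-map", "0:v:0")
	args = append(args, vcodec...)
	args = append(args, "-f", "rtp", fmt.Sprintf("udp://127.0.0.1:%d", o.VideoPort))
	args = append(args, "-map", "0:a:0")
	args = append(args, acodec...)
	args = append(args, "-f", "rtp", fmt.Sprintf("udp://127.0.0.1:%d", o.VideoPort+1))
	return args
}

func main() {
	fmt.Println(strings.Join(buildRTPArgs(rtpOpts{VideoPort: 49877}), " "))
}
```

Keeping the result a flat `[]string` matches the later note that splitRTPLegs splits it on the second `-map` token.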
Adds a pair of lifecycle callbacks the app/webrtc subsystem installs
via SetHooks:

- OnStart fires synchronously just before ffmpeg.Start(). It receives
  the task config and may return []ConfigIO extras to append to the
  output list. When extras are appended, startProcess rebuilds the
  FFmpeg command and the underlying process.Process before starting.
  A non-nil error aborts the start.

- OnStop fires synchronously just after ffmpeg.Stop() so subsystems
  can tear down per-process state.

Hooks run with the restream write lock held; they must not call back
into Restreamer methods or they will deadlock. This is the pattern
app/webrtc uses to inject per-process RTP output legs without having
to reach into restream internals from outside.
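
A minimal sketch of the hook contract described above, under simplifying assumptions: the types are stand-ins rather than the real restream ones, OnStart here takes only a process id (the real hook receives the task config), and `startWithHooks` / `stopWithHooks` are hypothetical helpers that only show the ordering.

```go
package main

import (
	"errors"
	"fmt"
)

// ConfigIO is a simplified stand-in for restream's output descriptor.
type ConfigIO struct {
	ID      string
	Address string
}

// ProcessHooks mirrors the OnStart/OnStop pair in simplified form.
type ProcessHooks struct {
	OnStart func(processID string) ([]ConfigIO, error)
	OnStop  func(processID string)
}

// errNoPorts is an illustrative failure a hook might report.
var errNoPorts = errors.New("no free port pair")

// startWithHooks runs OnStart first, appends any extra outputs, and aborts
// the start when the hook returns an error.
func startWithHooks(h ProcessHooks, id string, outputs []ConfigIO) ([]ConfigIO, error) {
	if h.OnStart != nil {
		extras, err := h.OnStart(id)
		if err != nil {
			return nil, fmt.Errorf("start aborted by hook: %w", err)
		}
		outputs = append(outputs, extras...)
	}
	// The real code would rebuild the FFmpeg command here, then start it.
	return outputs, nil
}

// stopWithHooks runs OnStop after the process has been stopped.
func stopWithHooks(h ProcessHooks, id string) {
	// The real code would call ffmpeg.Stop() before this point.
	if h.OnStop != nil {
		h.OnStop(id)
	}
}

func main() {
	hooks := ProcessHooks{
		OnStart: func(id string) ([]ConfigIO, error) {
			return []ConfigIO{{ID: id + "-video"}, {ID: id + "-audio"}}, nil
		},
		OnStop: func(id string) { fmt.Println("stopped", id) },
	}
	outs, err := startWithHooks(hooks, "smoke", []ConfigIO{{ID: "rtmp"}})
	fmt.Println(len(outs), err)
	stopWithHooks(hooks, "smoke")
}
```

Because the real hooks run with the restream write lock held, a hook body like the ones above must stay self-contained and never call back into the Restreamer.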
Introduces the subsystem layer that sits alongside api.API and wires
the M1 core/webrtc primitives into the per-process restream lifecycle.

app/webrtc/subsystem.go:
  - Subsystem struct holding the global WebRTC config, core PeerFactory,
    per-process stream map, and logger
  - New(config.DataWebRTC, logger) constructor
  - Enabled(), Hooks(), Close(), lookup() methods

app/webrtc/lifecycle.go:
  - onProcessStart: allocates an adjacent UDP port pair, binds two
    Pion Sources (video on V, audio on V+1), registers them under the
    process id, and returns the two RTP output legs to append to the
    FFmpeg command.
  - onProcessStop: tears down the pair.
  - allocAdjacentPair: retries up to 10 times to find a free (V, V+1)
    pair since the kernel's ephemeral picker can hand us an odd port.
  - splitRTPLegs: converts BuildArgs' flat []string into two ConfigIO
    entries by splitting on the second -map token.

core/webrtc/peer.go + forward.go:
  - Adds PeerFactory.CreatePeerFromSources for the M2 two-source
    forwarding mode (video and audio on separate UDP ports, no
    payload-type sniffing). Leaves CreatePeer intact for the M1 PoC.
  - Adds forwardRTPSplit companion goroutine.

config/data.go:
  - Promote anonymous WebRTC struct to named type DataWebRTC so
    app/webrtc can accept it by value.
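
The retry loop behind allocAdjacentPair can be sketched like this. It is an assumption-laden illustration: the function below keeps both sockets open and returns them, whereas the real subsystem binds Pion Sources on the ports; `portOf` is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// portOf returns the local UDP port a connection is bound to.
func portOf(c *net.UDPConn) int {
	return c.LocalAddr().(*net.UDPAddr).Port
}

// allocAdjacentPair asks the kernel for an ephemeral loopback port V, then
// tries to bind V+1 too. If the neighbour is taken it releases V and retries.
func allocAdjacentPair(maxTries int) (*net.UDPConn, *net.UDPConn, error) {
	loopback := net.IPv4(127, 0, 0, 1)
	for i := 0; i < maxTries; i++ {
		v, err := net.ListenUDP("udp4", &net.UDPAddr{IP: loopback, Port: 0})
		if err != nil {
			return nil, nil, err
		}
		port := portOf(v)
		if port >= 65535 {
			v.Close() // no room for V+1
			continue
		}
		a, err := net.ListenUDP("udp4", &net.UDPAddr{IP: loopback, Port: port + 1})
		if err != nil {
			v.Close() // neighbour busy; pick a new V
			continue
		}
		return v, a, nil
	}
	return nil, nil, errors.New("no free adjacent UDP port pair")
}

func main() {
	v, a, err := allocAdjacentPair(10)
	if err != nil {
		fmt.Println("alloc failed:", err)
		return
	}
	defer v.Close()
	defer a.Close()
	fmt.Println("video port", portOf(v), "audio port", portOf(a))
}
```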
Introduces the HTTP surface the browser (or OBS WebRTC clients)
target when subscribing to a process's egress:

  POST   /whep/:id              -> answer SDP + Location header
  DELETE /whep/:id/:resource    -> tear down a specific peer

The handler looks up the per-process stream pair via the Subsystem,
validates SDP offer shape, and delegates peer creation to the core
PeerFactory's CreatePeerFromSources (two-source forwarding).

WHEP routes are left unauthenticated in M2 — browsers and OBS don't
carry the Core JWT, and per-process signed-URL tokens are an M3
enhancement. Deployments should place the endpoint behind an
authenticated reverse-proxy for now.

Tests cover:
  - 404 for POSTs against unregistered streams
  - 400 for empty/invalid SDP offers once a stream is registered
  - 404 for DELETE against unknown resource ids
Installs the WebRTC egress subsystem at Core boot when
cfg.WebRTC.Enable is true and the subsystem constructs cleanly:

- http.Config gains an optional WebRTC *appwebrtc.Handler field;
  server.setRoutesV3 mounts its WHEP routes on the JWT-protected
  /api/v3 group.
- api.start() constructs the Subsystem, registers its ProcessHooks
  with the restreamer, and builds a Handler. A construction failure
  is logged and Core continues without WebRTC — consistent with
  disabling the subsystem outright.
- api.stop() closes the Handler (tearing down active peers) before
  closing the Subsystem (releasing per-process UDP sockets), mirroring
  the RTMP/SRT teardown pattern.

Verified: go build ./... clean; go test ./app/webrtc/...
./core/webrtc/... ./restream/... ./http/... all pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
End-to-end exercise of the M2 pipeline — subsystem hook, port
allocation, two-track forwarding, WHEP handshake — without
spinning up a full Core HTTP server:

- Fire onProcessStart directly to get the two RTP legs back
- Parse video + audio UDP ports out of the leg addresses,
  assert adjacency
- Mount the Handler on an Echo httptest server
- Build a Pion PeerConnection (recvonly video + audio), POST
  its offer, feed the answer back in
- Spray synthetic RTP packets at both loopback sockets
- Assert both OnTrack callbacks fire and each delivers at least
  one RTP packet within 10s
- DELETE via the returned Location header to confirm teardown

Passes cleanly under -race in ~1s. Catches regressions across
the whole M2 wiring from a single fixture.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
deploy(truenas): Core image + compose for M2 WebRTC rollout
Some checks failed
tests / build (push) Failing after 3s
d96aa70c27
Adds a dedicated deploy bundle under deploy/truenas/core/ so the
real root Core binary — with the M2 WebRTC subsystem wired in —
can replace the M1 webrtc-poc stack on the TrueNAS host.

- Dockerfile: two-stage build on golang:1.24-alpine3.20 + alpine:3.20
  runtime. FFmpeg is bundled so restream processes have their
  subprocess path ready. Copies the core binary from core/core
  (Go places the output file inside the core/ package directory
  because it can't overwrite a directory with a file) plus import
  and ffmigrate from the repo root.
- docker-compose.yml: host-networked Core service, env-driven
  config (CORE_ADDRESS, CORE_API_AUTH_*, CORE_WEBRTC_ENABLE,
  CORE_WEBRTC_PUBLIC_IP), with config/ and data/ bind mounts.
- README.md: M1→M2 cutover notes, one-time setup, JWT smoke test
  against /api/v3/whep/:id, and teardown.

Verified: make release + make import + make ffmigrate all
cross-compile cleanly for linux/amd64; go build ./... and
go test ./... pass on the branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
fix(config): preserve WebRTC section in Config.Clone()
Some checks failed
tests / build (push) Failing after 3s
2d29dc9c4a
Config.Clone() copied every top-level Data section except WebRTC.
Because api.go receives a clone (not the original), cfg.WebRTC.Enable
was always the zero value at runtime, the subsystem was skipped, and
the WHEP route was never mounted — regardless of CORE_WEBRTC_ENABLE.

Caught on the first live M2 TrueNAS deploy: env said enable=true,
container listened fine, but /api/v3/whep/:id returned Echo's default
JSON 404 (from router) instead of the handler's plain-text
'webrtc: stream not found' (which it would return for an unknown id).

- Add data.WebRTC = d.WebRTC in the struct-copy block.
- Deep-copy NAT1To1IPs alongside the other []string sections.
- Regression test TestConfigCopyWebRTC covers both.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
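
The Config.Clone() fix above comes down to two lines: copy the WebRTC section, then give the clone its own NAT1To1IPs slice. A reduced sketch, assuming a cut-down `Data` struct; `PublicIP` and `UDPMuxPort` are guessed field names for the `public_ip` and `udp_mux_port` settings.

```go
package main

import "fmt"

// DataWebRTC is a cut-down stand-in for the global WebRTC config block.
type DataWebRTC struct {
	Enable     bool
	PublicIP   string
	NAT1To1IPs []string
	UDPMuxPort int
}

// Data is a cut-down stand-in for the Core config data.
type Data struct {
	Name   string
	WebRTC DataWebRTC
}

// Clone copies each section explicitly, so a section that is not listed
// here silently comes back as its zero value.
func (d *Data) Clone() *Data {
	c := &Data{Name: d.Name}
	c.WebRTC = d.WebRTC // the struct copy whose absence caused the bug
	// Deep-copy the slice so the clone does not share its backing array.
	c.WebRTC.NAT1To1IPs = append([]string(nil), d.WebRTC.NAT1To1IPs...)
	return c
}

func main() {
	orig := &Data{Name: "core", WebRTC: DataWebRTC{Enable: true, NAT1To1IPs: []string{"203.0.113.7"}}}
	c := orig.Clone()
	c.WebRTC.NAT1To1IPs[0] = "changed"
	fmt.Println(c.WebRTC.Enable, orig.WebRTC.NAT1To1IPs[0])
}
```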
fix(http/api): carry process WebRTC config through the API DTO
Some checks failed
tests / build (push) Failing after 3s
f6d36bfa66
ProcessConfig in http/api/process.go shipped without a WebRTC field, so
JSON arriving at POST /api/v3/process was silently stripped of
"webrtc":{"enabled":true}. Marshal() handed restream a zero
ConfigWebRTC, the OnProcessStart hook no-op'd, and every WHEP request
returned 404 — even with a running webrtc-enabled process.

Caught on the M2 TrueNAS deploy at acceptance time: GET /process/{id}/config
came back without the webrtc block, despite the inbound JSON having it.
This is the API-layer twin of the earlier 'fix(config): preserve WebRTC
section in Config.Clone()' — same class of bug (drop-on-copy), different
struct.

- Add ProcessConfigWebRTC mirroring app.ConfigWebRTC.
- Marshal: copy DTO -> app.Config.WebRTC.
- Unmarshal: copy app.Config.WebRTC -> DTO.
- Regression tests cover both the JSON->DTO->Config path and the
  default (no webrtc block) case.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
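
The DTO fix above can be illustrated with a reduced roundtrip. The types below are stand-ins with a single `enabled` knob; only the names ProcessConfig, ProcessConfigWebRTC, Marshal and Unmarshal come from the commit message, and `decode` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ConfigWebRTC and Config stand in for the restream-side types.
type ConfigWebRTC struct{ Enabled bool }

type Config struct {
	ID     string
	WebRTC ConfigWebRTC
}

// ProcessConfigWebRTC is the API-side mirror. Without this field on the DTO,
// the JSON decoder silently discards the "webrtc" object.
type ProcessConfigWebRTC struct {
	Enabled bool `json:"enabled"`
}

type ProcessConfig struct {
	ID     string              `json:"id"`
	WebRTC ProcessConfigWebRTC `json:"webrtc"`
}

// Marshal copies DTO -> app config.
func (p *ProcessConfig) Marshal() *Config {
	return &Config{ID: p.ID, WebRTC: ConfigWebRTC{Enabled: p.WebRTC.Enabled}}
}

// Unmarshal copies app config -> DTO.
func (p *ProcessConfig) Unmarshal(c *Config) {
	p.ID = c.ID
	p.WebRTC.Enabled = c.WebRTC.Enabled
}

// decode parses a request body and hands back the restream-side config.
func decode(body string) (*Config, error) {
	var dto ProcessConfig
	if err := json.Unmarshal([]byte(body), &dto); err != nil {
		return nil, err
	}
	return dto.Marshal(), nil
}

func main() {
	cfg, err := decode(`{"id":"smoke","webrtc":{"enabled":true}}`)
	fmt.Println(cfg.WebRTC.Enabled, err)
}
```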
test(whep-client): add -token flag for JWT-gated /api/v3/whep endpoints
Some checks failed
tests / build (push) Failing after 2s
CodeQL / Analyze (pull_request) Failing after 2s
tests / build (pull_request) Failing after 1s
0417aff3b1
The M2 WHEP route lives under /api/v3 and inherits Core's JWT auth.
The M1 test client was written for the unauth'd PoC port; without
this flag it's useless against the real Core build.

- Subscribe() and postOffer() take a token string; empty means no
  Authorization header (M1 behavior preserved).
- main.go gains a -token flag.
- main_test.go passes an empty token (existing tests run against an
  in-process unauth'd handler).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
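
The optional-token behaviour above can be sketched as a request builder. `newOfferRequest` is a hypothetical name, and the `Bearer` scheme is an assumption about how Core expects the JWT; the `application/sdp` content type is what WHEP uses for the offer.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newOfferRequest builds the WHEP POST carrying the SDP offer. An empty
// token leaves the Authorization header off, preserving the M1 behaviour.
func newOfferRequest(url, offerSDP, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, url, strings.NewReader(offerSDP))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/sdp")
	if token != "" {
		// Assumed scheme; adjust if the deployment expects something else.
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	req, err := newOfferRequest("http://127.0.0.1:8080/api/v3/whep/smoke", "v=0\r\n", "example-token")
	if err != nil {
		fmt.Println("build failed:", err)
		return
	}
	fmt.Println(req.Method, req.URL.Path, req.Header.Get("Authorization") != "")
}
```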
zgaetano added 16 commits 2026-05-03 08:27:30 -04:00
Two small additions to support the M3 handler:

- Peer.Done() — read-only view of the existing 'done' channel,
  closed on Close(). Lets external indexes (Handler, admin API)
  await peer teardown without polling.
- Peer.AddICECandidate — passthrough so the WHEP PATCH handler
  can forward trickle-ICE candidates without reaching into the
  PeerConnection directly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
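
The Done() pattern above is the usual receive-only channel view. A minimal sketch with a stand-in `Peer` type; the `sync.Once` guard is an assumption added here so that a repeated Close() cannot panic, and the real Peer may handle that differently.

```go
package main

import (
	"fmt"
	"sync"
)

// Peer is a stand-in holding only the teardown signal.
type Peer struct {
	done chan struct{}
	once sync.Once
}

func NewPeer() *Peer { return &Peer{done: make(chan struct{})} }

// Done exposes the channel as receive-only, so callers can wait on it but
// cannot close it or send on it.
func (p *Peer) Done() <-chan struct{} { return p.done }

// Close signals teardown exactly once.
func (p *Peer) Close() { p.once.Do(func() { close(p.done) }) }

func main() {
	p := NewPeer()
	finished := make(chan struct{})
	go func() {
		<-p.Done() // blocks until Close, no polling needed
		close(finished)
	}()
	p.Close()
	<-finished
	fmt.Println("peer closed")
}
```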
Subsystem.SetTeardownHook installs a callback the subsystem invokes
just before closing per-stream Sources in onProcessStop. Used by the
WHEP Handler in M3 to drain its per-stream peer index before the
underlying Sources go away — closes the 'subscribers fan out into a
closed channel' race the design's §6 error matrix calls out as
'Publisher disconnects / FFmpeg exits'.

Single consumer by design (one subsystem, one handler). Calling
SetTeardownHook again replaces the previous callback; nil detaches.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
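
A sketch of the single-consumer teardown hook described above, using a stand-in `Subsystem` that holds nothing but the callback. The mutex and the copy-then-call ordering are assumptions about a reasonable implementation, not a description of the real one.

```go
package main

import (
	"fmt"
	"sync"
)

// Subsystem is a stand-in that only tracks the teardown callback.
type Subsystem struct {
	mu       sync.Mutex
	teardown func(streamID string)
}

// SetTeardownHook replaces any previous callback; nil detaches it.
func (s *Subsystem) SetTeardownHook(fn func(streamID string)) {
	s.mu.Lock()
	s.teardown = fn
	s.mu.Unlock()
}

// onProcessStop invokes the hook before the per-stream sources would be
// closed, so the handler can drain its peers first.
func (s *Subsystem) onProcessStop(streamID string) {
	s.mu.Lock()
	fn := s.teardown
	s.mu.Unlock()
	if fn != nil {
		fn(streamID) // called outside the lock
	}
	// The real code would close the stream's Sources here.
}

func main() {
	s := &Subsystem{}
	s.SetTeardownHook(func(id string) { fmt.Println("draining peers for", id) })
	s.onProcessStop("smoke")
}
```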
Major Handler rewrite implementing the design's M3 acceptance
criteria ('5 concurrent viewers, all error paths correct, clean
teardown'):

Multi-viewer correctness:
- streamID -> resourceID -> Peer two-level index (was flat)
- per-stream peer cap alongside total cap, defaults match the
  design's '5–8 viewer' target (8/stream, total from corewebrtc)
- per-peer awaitPeerClose goroutine watches Peer.Done() so ICE
  failures yank the index entry + decrement the counter (no leaks)
- tearDownStreamPeers callback (registered with Subsystem in
  NewHandler) drives all peer closes when the source process stops

Error matrix from design §6:
- 406 on codec mismatch (offer missing H264 or Opus rtpmap)
- 504 on ICE gathering timeout (passthrough from CreatePeerFromSources)
- 204 on DELETE unknown resource (idempotent per WHEP spec; was 404)
- 503 on per-stream cap reached (separate body from total-cap 503)
- 400 on missing/empty body (unchanged)
- 404 on unknown stream (unchanged)

WHEP spec compatibility:
- PATCH /whep/:id/:resource for trickle-ICE
- OPTIONS preflight on every WHEP path
- CORS Allow-Origin/Methods/Headers + Expose-Headers (Location, ETag)
- ETag header on Subscribe response

Defensive nil-peer guards in tearDown / Close paths so a partial
state doesn't panic.

Refactor: 134 -> 341 lines on handler.go but the surface is the
same (NewHandler/Register/Subscribe/Unsubscribe/Close); existing
callers continue to work. Pre-M3 test 'Unsubscribe_404WhenUnknown'
renamed and updated to the new 204 expectation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
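
The two-level index with a per-stream cap and a total cap, as described in the handler rewrite above, might look roughly like this. `peerIndex` and its methods are hypothetical names, and the index stores empty structs where the real one holds Peers.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	errTotalCap  = errors.New("total peer cap reached")
	errStreamCap = errors.New("per-stream peer cap reached")
)

// peerIndex maps streamID -> resourceID and enforces both caps.
type peerIndex struct {
	mu        sync.Mutex
	perStream int
	total     int
	count     int
	peers     map[string]map[string]struct{}
}

func newPeerIndex(perStream, total int) *peerIndex {
	return &peerIndex{perStream: perStream, total: total, peers: map[string]map[string]struct{}{}}
}

// add registers a peer or reports which cap rejected it.
func (x *peerIndex) add(streamID, resourceID string) error {
	x.mu.Lock()
	defer x.mu.Unlock()
	if x.count >= x.total {
		return errTotalCap
	}
	if len(x.peers[streamID]) >= x.perStream {
		return errStreamCap
	}
	if x.peers[streamID] == nil {
		x.peers[streamID] = map[string]struct{}{}
	}
	if _, dup := x.peers[streamID][resourceID]; !dup {
		x.peers[streamID][resourceID] = struct{}{}
		x.count++
	}
	return nil
}

// remove drops one peer; it reports false for unknown ids so a DELETE
// handler can stay idempotent.
func (x *peerIndex) remove(streamID, resourceID string) bool {
	x.mu.Lock()
	defer x.mu.Unlock()
	if _, ok := x.peers[streamID][resourceID]; !ok {
		return false
	}
	delete(x.peers[streamID], resourceID)
	if len(x.peers[streamID]) == 0 {
		delete(x.peers, streamID)
	}
	x.count--
	return true
}

// drainStream removes every peer of a stream and returns how many it dropped.
func (x *peerIndex) drainStream(streamID string) int {
	x.mu.Lock()
	defer x.mu.Unlock()
	n := len(x.peers[streamID])
	delete(x.peers, streamID)
	x.count -= n
	return n
}

func main() {
	idx := newPeerIndex(8, 32)
	fmt.Println(idx.add("smoke", "r1"), idx.drainStream("smoke"))
}
```

Both cap errors would map to 503 in the handler, with distinct bodies as the error matrix notes.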
Covers each new code path that the design's §6 table requires:
- Subscribe -> 406 on non-H264 / non-Opus offer (TestHandler_Subscribe_406OnCodecMismatch)
- Subscribe -> 503 when total cap exhausted (TestHandler_Subscribe_503OnTotalCap)
- Subscribe -> 503 when per-stream cap exhausted (TestHandler_Subscribe_503OnPerStreamCap)
- Trickle -> 404 on unknown resource (TestHandler_Trickle_404WhenUnknown)
- preflight -> 204 + CORS headers (TestHandler_PreflightCORS)
- Register installs all 5 routes (TestHandler_RegisterMountsAllRoutes)
- Close drains the index without panicking (TestHandler_Close_DrainsPeers)
- requireH264AndOpus table-driven (TestRequireH264AndOpus)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
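
The codec gate that TestRequireH264AndOpus exercises can be approximated by scanning the offer's `a=rtpmap` lines. This is a sketch of the idea only; the real requireH264AndOpus may parse the SDP properly instead of matching strings, and its signature is assumed.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// requireH264AndOpus reports an error unless the offer advertises both an
// H264 and an Opus rtpmap entry. A handler would map the error to 406.
func requireH264AndOpus(sdp string) error {
	var h264, opus bool
	for _, line := range strings.Split(sdp, "\n") {
		line = strings.TrimSpace(line) // also strips the trailing \r
		if !strings.HasPrefix(line, "a=rtpmap:") {
			continue
		}
		upper := strings.ToUpper(line)
		if strings.Contains(upper, " H264/") {
			h264 = true
		}
		if strings.Contains(upper, " OPUS/") {
			opus = true
		}
	}
	switch {
	case !h264 && !opus:
		return errors.New("offer has neither H264 nor Opus")
	case !h264:
		return errors.New("offer is missing H264")
	case !opus:
		return errors.New("offer is missing Opus")
	}
	return nil
}

func main() {
	offer := "v=0\r\na=rtpmap:102 H264/90000\r\na=rtpmap:111 opus/48000/2\r\n"
	fmt.Println(requireH264AndOpus(offer))
}
```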
TestIntegration_FiveViewerFanout drives the M3 acceptance criterion
in the wide direction: spin up the subsystem, register one process,
attach 5 Pion subscribers in parallel via the real Echo handler,
spray synthetic RTP at the allocated UDP ports, and assert each
subscriber's video + audio track receive at least one packet inside
a 15s window. After onProcessStop, the per-stream peer index must
drain to zero within 3s.

TestSubsystem_TeardownHookFiresOnProcessStop is the unit-level
counterpart — confirms the callback registered via
SetTeardownHook actually fires when a process is torn down, even
without a full Pion handshake.

Together these cover the acceptance language: '5 concurrent viewers,
all error paths correct, clean teardown'.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
chore: ignore the whep-client test binary (top-level build artifact)
Some checks failed
tests / build (push) Failing after 2s
tests / build (pull_request) Failing after 2s
de4b215123
fix(webrtc): make WebRTC FFmpeg stream maps configurable (closes #2)
Some checks failed
tests / build (push) Failing after 2s
tests / build (pull_request) Failing after 1s
49677fbd3d
BuildArgs hardcoded -map 0:v:0 / -map 0:a:0 for the two RTP legs.
Correct for production RTMP/SRT publishers (single combined input),
but breaks any process whose audio lives on a different input index
— multi-input lavfi test scaffolds, multi-camera pipelines, SDI +
file-audio mixes, etc.

Adds VideoMap and AudioMap fields to ConfigWebRTC (and the API DTO),
defaulting to the prior literals so existing deployments are
unaffected. BuildArgs reads them.

Tests:
- TestBuildArgs_DefaultMaps locks the empty-string default behavior
- TestBuildArgs_CustomMaps drives the multi-input override path
- TestProcessConfigWebRTCMapsRoundtrip extends the DTO roundtrip

Closes #2.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
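
The defaulting rule for VideoMap and AudioMap described above is small enough to sketch in full. `resolveMaps` is a hypothetical helper, and the `1:a:0` override is just an example of audio living on a second input.

```go
package main

import "fmt"

// resolveMaps falls back to the previous hardcoded literals when a field is
// left empty, so existing process configs keep their behaviour.
func resolveMaps(videoMap, audioMap string) (string, string) {
	if videoMap == "" {
		videoMap = "0:v:0"
	}
	if audioMap == "" {
		audioMap = "0:a:0"
	}
	return videoMap, audioMap
}

func main() {
	v, a := resolveMaps("", "1:a:0") // audio taken from the second input
	fmt.Println("-map", v, "-map", a)
}
```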
fix(webrtc): swagger annotations for WHEP routes, regenerate docs (closes #3)
Some checks failed
tests / build (push) Failing after 1s
tests / build (pull_request) Failing after 2s
c8bcf75227
The WHEP routes were mounted by http/server.go via the app/webrtc
Handler.Register(), but Subscribe and Unsubscribe carried no swag
annotations. The Swagger UI at /api/swagger/index.html therefore
didn't list /api/v3/whep/* — programmatic API consumers and humans
browsing the docs couldn't discover the endpoints.

Adds the standard upstream-shaped @Summary / @Tags / @ID / @Router
annotations on Subscribe and Unsubscribe (matching the rtmp.go and
srt.go pattern) and regenerates docs/{docs.go,swagger.json,swagger.yaml}
via 'make swagger'. Verified: swagger.json now contains both paths,
swagger UI renders them under the v16.16.0 tag.

Closes #3.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ci+test: forgejo workflow, browser WHEP player, TESTING.md (M4 part 1)
Some checks failed
ci / vet + build (push) Successful in 9m50s
ci / vet + build (pull_request) Successful in 9m49s
ci / race tests (push) Failing after 8m4s
ci / WebRTC smoke (5-viewer fanout) (push) Successful in 9m48s
ci / race tests (pull_request) Failing after 6m28s
ci / WebRTC smoke (5-viewer fanout) (pull_request) Successful in 9m46s
927ccc6ced
Three artifacts that close out the easier half of the M4 milestone:

1. .forgejo/workflows/test.yml — CI on every push and PR. Three jobs:
     - lint-and-vet: go vet + go build (~30s)
     - test:        go test -race -short ./... + a no-race coverage
                    pass that uploads coverage.out as an artifact
     - webrtc-smoke: TestIntegration_FiveViewerFanout and the rest of
                     the WebRTC subsystem tests in isolation, so a
                     failure on the egress path stays readable in the
                     log.
   Pinned to Go 1.24 to match go.mod. The forge has a
   forgejo-runner sibling container; this YAML uses GitHub Actions
   syntax which Forgejo Actions accepts unchanged.

2. test/whep-player.html — self-contained browser WHEP subscriber for
   manual smoke testing. RTCPeerConnection (recvonly V+A) + fetch()
   POST/DELETE/PATCH against /api/v3/whep/:id, ICE/PC state pills,
   inbound-bitrate sampling at 1 Hz, codec hint pulled from the answer
   SDP, JWT token field, ?url=&token= shareable query string. No
   external deps; works from file:// or any static host.

3. test/TESTING.md — short doc that ties together the in-process race
   tests, the browser player, and the existing Pion CLI helper at
   test/whep-client/. Notes the latency p95 gate as a follow-up.

Latency gate (FFmpeg drawtext frame counter + decode-side pixel
sampling, p95 < 300ms RTMP / < 200ms SRT) is queued for a separate
PR — it's a several-hundred-line addition in its own right and
shouldn't block CI from landing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ci(webrtc): server-hop latency p95 gate
Some checks failed
ci / vet + build (push) Successful in 9m54s
ci / vet + build (pull_request) Successful in 9m49s
ci / race tests (push) Failing after 8m1s
ci / WebRTC smoke (5-viewer fanout) (push) Successful in 9m45s
ci / WebRTC latency p95 gate (push) Successful in 10m3s
ci / race tests (pull_request) Failing after 7m59s
ci / WebRTC smoke (5-viewer fanout) (pull_request) Successful in 9m45s
ci / WebRTC latency p95 gate (pull_request) Successful in 10m4s
b7afd0f08a
Adds an end-to-end RTP-arrival latency probe that runs as a dedicated
CI job and asserts p95 < 50ms.

Implementation
--------------
A build-tagged test (-tags latency, off by default) sends 1000
synthetic RTP packets at 60Hz into corewebrtc.Source and reads them
back via a Pion subscriber's track.ReadRTP(). Each packet's payload
starts with the publisher's UnixNano send time; the subscriber diffs
against time.Now() at arrival and accumulates p50/p95/p99.

This exercises every link of the egress hop: Source UDP read,
subscriber fan-out, forwardRTPSplit, Pion's TrackLocalStaticRTP
write, DTLS-SRTP encrypt, ICE socket write, decrypt at the
subscriber, RTP unmarshal at ReadRTP. Pure server-side; no FFmpeg
or codecs involved.

Why not glass-to-glass
----------------------
The design's §7 calls for FFmpeg drawtext frame counters + decode-
side pixel sampling, p95<300ms RTMP / <200ms SRT. Implementing that
in pure Go needs a cgo H.264 decoder or an FFmpeg sidecar pipe — a
significantly bigger lift for a marginal regression-detection win
(encode/decode latency is roughly fixed by the codec stack and
isn't moved by Core code changes). The server-hop measurement
captures everything Core code can actually regress.

Threshold
---------
50ms p95. Locally observed on a quiet host:
  p50=110µs, p95=237µs, p99=318µs.
The 50ms gate leaves ~200x headroom over the locally observed p95
(50ms / 237µs ≈ 211): generous enough to absorb CI runner noise
without false alarms, tight enough to catch a real slowdown.

Race-clean: latencySamples uses a sync.Mutex around the slice append
(initial draft had a slice racing with the receive goroutine; vet
caught it).

Documented in test/TESTING.md and wired to .forgejo/workflows/test.yml
as the latency-gate job (depends on lint-and-vet, parallel with test
and webrtc-smoke).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
feat(branding): Dragon Fork identity for v0.1.0-dragonfork release
Some checks failed
tests / build (push) Failing after 2s
tests / build (pull_request) Failing after 2s
671f64ca56
M5 / final M2-stack work. The fork now identifies itself unambiguously
in logs, the API, and the README without changing the Go module path
(internal imports stay at github.com/datarhei/core/v16 — see NOTES.md
for the rationale).

Identity surfaces:

- app/version.go gains Variant ('dragonfork') and Fork ('Datarhei —
  Dragon Fork') as vars (overridable via -ldflags for downstream
  re-packagers).
- api.About + the /api endpoint expose 'variant' and 'fork' fields;
  Swagger docs regenerated.
- Startup banner logs 'variant' + 'fork' alongside the existing
  application + version fields, so a TrueNAS sysadmin tail-following
  /var/log can tell at a glance which fork is running.
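
A minimal sketch of the var-plus-ldflags mechanism, assuming string vars as described above. It uses `package main` so it runs standalone; in the repo the vars live in app/version.go, so the `-X` target would use that package's import path (presumably `github.com/datarhei/core/v16/app`, which is an assumption). The `banner` helper is hypothetical.

```go
package main

import "fmt"

// Declared as vars rather than consts so the linker's -X flag can
// overwrite them at build time. Defaults are taken from the commit message.
var (
	Variant = "dragonfork"
	Fork    = "Datarhei — Dragon Fork"
)

// banner formats the identity fields the way a startup log line might.
// Hypothetical helper; the real banner format is not shown in this PR.
func banner() string {
	return fmt.Sprintf("variant=%s fork=%q", Variant, Fork)
}

func main() {
	// Override example for this standalone file:
	//   go build -ldflags "-X 'main.Variant=custom' -X 'main.Fork=My Repack'"
	fmt.Println(banner())
}
```

The `-X` flag only works on package-level string variables, which is why these are vars and not consts.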

Documentation:

- README.md rewritten with a Dragon Fork header and Quick start; the
  upstream feature surface is summarised in 'From upstream Datarhei'
  with a clear additivity statement. Sample process JSON, multi-input
  pipeline guidance, link to the design + testing docs.
- NOTICE: Apache 2.0 §4(d) attribution to upstream datarhei Core,
  Pion, Echo, FFmpeg.
- CREDITS: enumerated dependency list with licenses.
- CHANGELOG.md prepended with a 'Datarhei — Dragon Fork' section
  starting at v0.1.0-dragonfork; upstream's '# Core' history preserved
  below.

Module path stays github.com/datarhei/core/v16 by design — the fork is
distinguished by repo location and branch history, not import path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Conflict resolution: keep M3's full handler.go rewrite (per-stream
index, error matrix, PATCH, CORS, auto-cleanup) and re-apply the
swagger annotations from #7 onto the new function declarations,
including a fresh annotation for the M3-introduced Trickle endpoint.
Swagger docs regenerated to pick up all three.

Race-clean: go test -race ./app/webrtc/... green.
Brings in both halves of M4: PR #8 (CI workflow + browser player +
TESTING.md) and PR #9 (server-hop latency p95 gate).
Merge branch 'm5-branding-release' into m2-webrtc-core-integration
Some checks failed
ci / vet + build (push) Successful in 9m49s
ci / vet + build (pull_request) Successful in 9m59s
ci / race tests (push) Failing after 8m1s
ci / WebRTC smoke (5-viewer fanout) (push) Successful in 9m45s
ci / WebRTC latency p95 gate (push) Successful in 10m3s
ci / race tests (pull_request) Failing after 8m6s
ci / WebRTC smoke (5-viewer fanout) (pull_request) Successful in 9m45s
ci / WebRTC latency p95 gate (pull_request) Successful in 10m5s
fd391b5ca4
# Conflicts:
#	docs/docs.go
#	docs/swagger.json
#	docs/swagger.yaml
Author
Owner

Merged into main via direct push as part of the v0.1.0-dragonfork release. Branch commits are reachable from main; closing this PR. Release: https://forge.wilddragon.net/zgaetano/datarhei-dragonfork-core/releases/tag/v0.1.0-dragonfork

zgaetano closed this pull request 2026-05-03 08:28:57 -04:00

Pull request closed

Reference: zgaetano/datarhei-dragonfork-core#4