# Datarhei — Dragon Fork
## v0.3.0-dragonfork (2026-05-10)
WebRTC ingest (WHIP) milestone. Browsers and OBS can now push a
WebRTC stream into a channel, and WHEP viewers get their first frame
sooner: the in-memory keyframe cache hands new peers a complete
reference frame on join instead of making them wait up to one GOP.
Resolves issues #15, #16, #17.
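
As an illustration of the publisher side, the sketch below builds the HTTP request a WHIP client would send to `POST /api/v3/whip/{id}`. It is a minimal sketch, not Core's code: the helper name `newWHIPRequest`, the base URL, and the channel ID are hypothetical, and the bearer token assumes WHIP is protected by the same JWT auth as the WHEP routes, which this changelog does not state. The `application/sdp` content type follows the WHIP specification.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// newWHIPRequest builds (but does not send) a WHIP publish request.
// baseURL, channelID and token are placeholders supplied by the caller.
func newWHIPRequest(baseURL, channelID, token, sdpOffer string) (*http.Request, error) {
	endpoint := strings.TrimRight(baseURL, "/") + "/api/v3/whip/" + url.PathEscape(channelID)
	req, err := http.NewRequest(http.MethodPost, endpoint, strings.NewReader(sdpOffer))
	if err != nil {
		return nil, err
	}
	// WHIP carries the SDP offer as the raw request body.
	req.Header.Set("Content-Type", "application/sdp")
	if token != "" {
		// Assumption: WHIP accepts the same bearer JWT as the WHEP routes.
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	// The SDP body here is a stub; a real offer comes from the WebRTC stack.
	req, err := newWHIPRequest("http://localhost:8080", "my-channel", "", "v=0\r\n")
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String(), req.Header.Get("Content-Type"))
}
```

Sending the request with `http.DefaultClient.Do(req)` would return Core's SDP answer in the response body. The publisher is later removed with `DELETE /api/v3/whip/{id}/{resource}`.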
### Added
- **WHIP ingest path** — browsers and OBS Studio can push a WebRTC
stream (H.264 + Opus) into any Dragon Fork channel via
`POST /api/v3/whip/{id}`. The publisher sends an SDP offer; Core
answers, allocates a loopback UDP pair, and injects RTP input legs
into the FFmpeg command line — the exact mirror of the WHEP egress
path. `DELETE /api/v3/whip/{id}/{resource}` tears down the publisher
cleanly. Closes #16.
- **`ProcessConfigWHIPIngest` API struct** in `http/api/process.go`
mapping `whip_ingest.{enabled,video_pt,audio_pt}` between the JSON
API and `app.ConfigWHIPIngest`. Without this struct, `WHIPIngest.Enabled`
was always false and WHIP could never activate via the API.
- **WHIP ingest lifecycle hooks** — `onWHIPProcessStart` /
`onWHIPProcessStop` in `app/webrtc/whip_lifecycle.go` allocate and
tear down the ingest UDP port pair, controlled by the per-process
`whip_ingest.enabled` flag. Merged via `MergedHooks()` alongside the
existing WHEP egress hooks.
- **Wild Dragon UI — WHIP toggle control** (`overlay/src/misc/controls/WHIP.js`
in the `wilddragon-restreamer-ui` overlay). Mirrors `WHEP.js` exactly.
Renders an Enable checkbox with caption in the channel edit view.
- **Wild Dragon UI — Edit/index.js wiring** — renders the WHIP control
in the Edit view and patches `props.restreamer._upsertProcess` in the
`save()` handler to inject `whip_ingest.enabled` into the process
config before the SDK PUT reaches Core. The patch is required because
the Restreamer SDK's `UpsertIngest` does not forward `webrtc` or
`whip_ingest` fields (SDK gap).
- **In-memory H.264 keyframe cache** in `core/webrtc/keyframecache.go`.
Retains the most recent IDR burst (all RTP packets from the first IDR
NAL fragment until the next one) per video Source. Bounded at 512
packets / 2 MiB. Detects single-NAL IDR (type 5) and FU-A start
fragments (type 28, start bit set, inner type 5). Closes #17.
- **Subscribe pre-fill** — `Source.Subscribe()` snapshots the keyframe
cache before registering the new subscriber, then drains the burst
into the channel immediately. New WHEP peers receive a complete
reference frame on join instead of waiting up to one GOP (≈ 2 s at
30 fps / GOP=60).
- **`Source.EnableKeyFrameCache()`** — opt-in method; called only on
video sources in `allocAdjacentPair()`. Audio sources are
intentionally uncached (Opus payloads would accumulate without ever
triggering a reset).
- **Test suite for `core/webrtc`** — `keyframecache_test.go` (18
functions) and `source_test.go` (5 functions). Covers IDR detection
in all packetisation modes, cache reset, burst accumulation, capacity
caps, snapshot independence, concurrent read/write under `-race`, and
Subscribe pre-fill behaviour. All 34 tests in `core/webrtc` pass
under `go test -race`.
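
The IDR detection and bounded burst cache described above can be sketched as follows. This is an illustrative reconstruction from the changelog text and RFC 6184, not the contents of `keyframecache.go`: the names `isIDRStart`, `keyFrameCache`, `push`, and `snapshot` are hypothetical, the behaviour when a cap is hit (further packets are dropped) is an assumption, and locking is omitted for brevity.

```go
package main

import "fmt"

const (
	nalTypeIDR = 5       // coded slice of an IDR picture
	nalTypeFUA = 28      // fragmentation unit (FU-A), RFC 6184
	maxPackets = 512     // packet cap quoted in the changelog
	maxBytes   = 2 << 20 // 2 MiB byte cap quoted in the changelog
)

// isIDRStart reports whether an H.264 RTP payload begins an IDR picture:
// either a single-NAL IDR, or an FU-A fragment with the start bit set
// whose inner NAL type is IDR.
func isIDRStart(payload []byte) bool {
	if len(payload) == 0 {
		return false
	}
	switch payload[0] & 0x1F {
	case nalTypeIDR:
		return true
	case nalTypeFUA:
		if len(payload) < 2 {
			return false
		}
		fuHeader := payload[1]
		return fuHeader&0x80 != 0 && fuHeader&0x1F == nalTypeIDR
	}
	return false
}

// keyFrameCache holds the packets of the most recent IDR burst.
type keyFrameCache struct {
	packets [][]byte
	size    int
}

// push resets the cache on an IDR start and appends packets up to the caps.
func (c *keyFrameCache) push(payload []byte) {
	if isIDRStart(payload) {
		c.packets, c.size = nil, 0
	} else if len(c.packets) == 0 {
		return // nothing to anchor to until the first IDR arrives
	}
	if len(c.packets) >= maxPackets || c.size+len(payload) > maxBytes {
		return // assumption: packets beyond the caps are dropped
	}
	c.packets = append(c.packets, append([]byte(nil), payload...))
	c.size += len(payload)
}

// snapshot returns a copy that later pushes cannot modify.
func (c *keyFrameCache) snapshot() [][]byte {
	out := make([][]byte, len(c.packets))
	copy(out, c.packets)
	return out
}

func main() {
	c := &keyFrameCache{}
	c.push([]byte{0x41, 0x00})       // non-IDR before any IDR: ignored
	c.push([]byte{0x7C, 0x85, 0x01}) // FU-A start fragment of an IDR
	c.push([]byte{0x7C, 0x45, 0x02}) // FU-A end fragment of the same NAL
	fmt.Println("cached packets:", len(c.snapshot()))
}
```

A subscriber pre-fill along the lines of `Source.Subscribe()` would take such a snapshot first and write it into the new subscriber's channel before live packets start to flow.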
### Fixed
- **`deploy/truenas/core/seed-data.sh`** — the old no-clobber-only
approach kept stale JS bundles alive on the data volume after image
rebuilds (`static/` was never refreshed because it already existed).
Fixed by splitting into two phases: always-overwrite for `index.html`,
`asset-manifest.json`, and `static/`; no-clobber for everything else
(channel data, player bundles, operator content). Prevents a class of
"new code never runs" deployment bugs.
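
The two-phase policy can be restated as a small decision function. The real logic lives in the `seed-data.sh` shell script; the Go sketch below only models which files get copied, and the function names and example paths are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// alwaysOverwrite reports whether a path (relative to the bundle root)
// belongs to the phase that is refreshed on every container start.
func alwaysOverwrite(rel string) bool {
	return rel == "index.html" ||
		rel == "asset-manifest.json" ||
		rel == "static" ||
		strings.HasPrefix(rel, "static/")
}

// shouldCopy applies the two phases: always-overwrite for the UI bundle
// entry points, no-clobber for everything else.
func shouldCopy(rel string, existsOnVolume bool) bool {
	if alwaysOverwrite(rel) {
		return true
	}
	return !existsOnVolume
}

func main() {
	// Example paths are illustrative only.
	for _, p := range []string{"index.html", "static/js/main.js", "channels/a.json"} {
		fmt.Printf("%-20s exists=true -> copy=%v\n", p, shouldCopy(p, true))
	}
}
```

Under the old no-clobber-only behaviour every path would follow the second branch, which is why an existing `static/` directory was never refreshed.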
### Upgrade (from v0.2)
```sh
cd deploy/truenas/core
git pull
docker compose build --no-cache core
docker compose up -d core
```
The `seed-data.sh` fix means there is no longer a need to manually
`docker exec` a static-bundle copy after rebuilds — it happens
automatically on container start.
---
## v0.2 backlog (2026-05-06)
Completes the open v0.2 issues from the post-GUI-ship backlog.
Resolves issues #11, #12, #13, #14.
### Added
- **WebRTC Prometheus metrics** — eleven metrics in the
`dragonfork_webrtc_*` namespace using RED-method principles.
Hybrid instrumentation: direct `client_golang` counters/histograms
for hot-path WHEP routes and ICE establishment in `app/webrtc/metrics.go`,
plus a snapshot collector for gauges in `prometheus/webrtc.go`.
Metrics include `whep_requests_total`, `whep_request_duration_seconds`,
`ice_establishment_duration_seconds`, `ice_failures_total`,
`codec_mismatches_total`, `cap_rejections_total`,
`ffmpeg_leg_failures_total`, `active_streams`, `active_peers`,
`udp_ports_in_use`. Closes #11.
- **Grafana observability stack** in `deploy/truenas/core/`:
Prometheus v2.55 and Grafana OSS 11.3 containers on a `dragonfork-mon`
bridge network reaching Core via `host.docker.internal`. Pre-loaded
WebRTC Health dashboard (5 rows: WHEP API, ICE, streams/peers, capacity,
silent-degradation canary). Four pre-loaded Prometheus alert rules.
To upgrade the deployment, add `GRAFANA_ADMIN_PASSWORD` to `.env`, then run
`docker compose pull && docker compose up -d`. Closes #11.
- **Docker image CI publish workflow** at `.forgejo/workflows/publish.yml`.
Triggers on semver tags. Builds multi-arch (`linux/amd64` + `linux/arm64`)
and pushes to the configured registry (`REGISTRY` repo variable,
defaults to `ghcr.io`). Requires `REGISTRY_TOKEN` secret and optional
`REGISTRY_USER` / `IMAGE_NAME` variables. Layer cache via GitHub Actions
cache. Closes #12.
- **Upstream rebase policy** at `docs/REBASE.md`. Documents monthly
cadence, rebase-not-merge strategy, Dragon Fork divergence boundaries,
pre/post-rebase checklist, vendored-dependency procedure, first-rebase
runbook, and record-keeping table. First rebase against upstream is
pending (to be run locally per the procedure in `docs/REBASE.md`).
Closes #13.
- **WHEP sustained load test** at `test/load/sustained.go`.
Headless Go program (`//go:build ignore`, run with `go run`) that drives
N concurrent WHEP subscribers against a single stream for a configurable
duration. Measures: ICE establishment (p50/p95), jitter (RFC 3550 running
average), packet loss estimate (sequence-number gaps), packets received.
Outputs a markdown report to `test/load/results/`. Staggered connection
setup, trickle-ICE, and graceful DELETE on teardown. Closes #14.
- **`core/webrtc.Peer.Connected()` channel** — closed on first
`PeerConnectionStateConnected` event. Required by the ICE establishment
histogram (allows async measurement after the WHEP POST returns).
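
The `Connected()` pattern from the last item can be sketched as a channel that is closed exactly once. This is a minimal illustration, not the code in `core/webrtc`: the `peer` type, `newPeer`, and `onConnectionState` are hypothetical stand-ins, and the state callback is simulated with a timer instead of a real peer connection.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// peer is a stand-in for a WebRTC peer wrapper.
type peer struct {
	connected chan struct{}
	once      sync.Once
}

func newPeer() *peer { return &peer{connected: make(chan struct{})} }

// Connected returns a channel that is closed on the first "connected" event.
func (p *peer) Connected() <-chan struct{} { return p.connected }

// onConnectionState would be called from the connection-state callback.
// sync.Once guards against closing the channel twice on reconnects.
func (p *peer) onConnectionState(isConnected bool) {
	if isConnected {
		p.once.Do(func() { close(p.connected) })
	}
}

func main() {
	p := newPeer()
	start := time.Now() // e.g. taken when the WHEP POST is handled

	// Simulate ICE completing some time after the HTTP response was sent.
	go func() {
		time.Sleep(10 * time.Millisecond)
		p.onConnectionState(true)
	}()

	select {
	case <-p.Connected():
		fmt.Println("ICE established after", time.Since(start).Round(time.Millisecond))
	case <-time.After(time.Second):
		fmt.Println("ICE timeout")
	}
}
```

Because a closed channel can be received from any number of times, several observers (a histogram recorder, a timeout watchdog) can wait on the same `Connected()` channel without coordinating with each other.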
### Changed
- `deploy/truenas/core/docker-compose.yml`: adds `prom` and `grafana`
services + `dragonfork-mon` bridge network + named volumes. `core`
service is unchanged (stays on `network_mode: host`).
- `app/webrtc/handler.go`: WHEP route handlers now record request duration,
status code, codec mismatch, and cap rejection metrics. `tearDownStreamPeers`
records FFmpeg leg failures when peers were active at stop time.
- `app/webrtc/subsystem.go`: adds `StreamCount()` accessor for the
snapshot collector.
### Known limitations (remaining v0.2 open items)
- **Restreamer UI fork** (#15): separate repo, not started.
The upstream Restreamer UI does not yet have a WebRTC toggle; use
`/wilddragon-webrtc.html` in the meantime.
- **First upstream rebase** (#13, partially done): `docs/REBASE.md`
is committed; the actual `git rebase upstream/main` must be run
locally per the procedure. Record the result in the REBASE.md table.
### Upgrade (from v0.2.0-dragonfork)
```sh
cd deploy/truenas/core
git pull
# Add new lines to .env:
# GRAFANA_ADMIN_PASSWORD=$(openssl rand -base64 24)
# GRAFANA_PORT=3000
# PROM_PORT=9090
docker compose pull # pulls prom + grafana images
docker compose up -d # core unchanged, prom + grafana start fresh
```
To publish an image for the first time, set `REGISTRY`, `REGISTRY_USER`,
`IMAGE_NAME`, and `REGISTRY_TOKEN` in repo settings, then tag:
```sh
git tag v0.2.1-dragonfork && git push origin v0.2.1-dragonfork
```
---
## v0.2.0-dragonfork (2026-05-03)
The "GUI ship" release. Everything from v0.1 is preserved; this round
documents and ships a usable graphical surface for the WebRTC feature
that v0.1 only exposed through the API.
### Added
- **Wild Dragon WebRTC admin page** at `/wilddragon-webrtc.html`. Single-file
HTML/JS; no build step. Sign in with the `API_AUTH_USERNAME` / `API_AUTH_PASSWORD`
credentials, see every process, toggle `webrtc.enabled` per-process with one
click, restart on change, copy the WHEP URL, jump straight to the
smoke player. Closes the v0.1 GUI gap — the upstream Restreamer UI
ships with v0.2 but doesn't know about Core's `webrtc` config block,
so toggling WebRTC previously required direct API calls.
### Documented (was present, just unannounced)
- **Restreamer UI bundle** in the TrueNAS deploy. The `deploy/truenas/core/`
Dockerfile builds the upstream `datarhei/restreamer-ui` v1.14.0 React
bundle with the Wild Dragon overlay applied (logo / favicon / header
title / welcome card), copies the result into Core's disk filesystem
via `seed-data.sh`, and Core serves it at `/`. Was added during M2
but not called out in the v0.1 CHANGELOG.
- **WHEP smoke player** at `/whep-player.html`. Standalone WebRTC
subscriber with ICE/codec/bitrate diagnostics. Was added during M4.
---
## v0.1.0-dragonfork (2026-05-03)
The first tagged Dragon Fork release. Forked from upstream datarhei
Core v16.16.0; everything upstream does is preserved unchanged. New:
WebRTC (WHEP) egress, integrated with the existing process supervisor.
### Added
- **WebRTC subsystem** under `app/webrtc/`, mirroring the shape of
upstream's RTMP and SRT servers (Server interface, Echo handlers,
process-graph hooks, admin endpoints).
- **Per-process opt-in** via `config.webrtc.enabled` on every restream
process; resolver auto-injects two RTP output legs and allocates
loopback UDP ports.
- **`POST /api/v3/whep/{id}`** — WebRTC-HTTP Egress Protocol subscribe.
JWT-protected by the existing Core auth.
- **`DELETE /api/v3/whep/{id}/{resource}`** — idempotent teardown
(returns 204 even on unknown resource per WHEP spec).
- **`PATCH /api/v3/whep/{id}/{resource}`** — trickle ICE.
- **CORS preflight** on every WHEP route + `Access-Control-Expose-Headers`
for `Location` and `ETag` so browser-side WHEP players work
cross-origin.
- **Configurable stream maps** via `webrtc.video_map` / `webrtc.audio_map`
on the per-process config — defaults to `0:v:0` / `0:a:0` for
RTMP/SRT publishers, overridable for multi-input pipelines.
- **`webrtc.*` global config block** with `CORE_WEBRTC_*` env-var
bindings parallel to RTMP and SRT.
- **Admin API:** `GET /api/v3/webrtc/streams` + `/streams/{id}/peers`.
- **Browser smoke player** at `test/whep-player.html` with ICE / codec
/ bitrate diagnostics, JWT field, and `?url=&token=` shareable
URLs.
- **Server-hop latency p95 gate** in CI (`-tags latency`), enforced at
50 ms on the runner; locally observed p95 ≈ 240 µs.
- **TrueNAS deploy bundle** at `deploy/truenas/core/` — host-networked
Docker stack with bundled FFmpeg, env-driven config.
- **Multi-viewer correctness:** per-stream peer cap, ICE-failure
auto-cleanup goroutines, process-stop broadcast tear-down.
- **Error matrix:** 406 codec mismatch, 504 ICE timeout, 503 cap
reached (separate body for total vs per-stream), 204 DELETE
idempotent.
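
A client could interpret the error matrix above along these lines. The sketch only restates the status codes listed in this changelog; the function name and the message strings are hypothetical, and telling the two 503 cases apart would require inspecting the response body, which is not modelled here.

```go
package main

import (
	"fmt"
	"net/http"
)

// describeWHEPStatus maps the status codes from the error matrix to a
// human-readable meaning. Codes outside the matrix are reported as-is.
func describeWHEPStatus(code int) string {
	switch code {
	case http.StatusNoContent: // 204
		return "DELETE succeeded (idempotent, also for unknown resources)"
	case http.StatusNotAcceptable: // 406
		return "codec mismatch"
	case http.StatusServiceUnavailable: // 503
		return "peer cap reached (total or per-stream, see body)"
	case http.StatusGatewayTimeout: // 504
		return "ICE timeout"
	default:
		return fmt.Sprintf("outside the documented matrix: %d %s", code, http.StatusText(code))
	}
}

func main() {
	for _, code := range []int{204, 406, 503, 504} {
		fmt.Println(code, "->", describeWHEPStatus(code))
	}
}
```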
### Fixed
- `Config.Clone()` now preserves the `WebRTC` section.
- `http/api.ProcessConfig` Marshal/Unmarshal now carry the per-process
`webrtc` block.
---
# Core (upstream)
### Core v16.15.0 > v16.16.0
- Add ConnectionIdleTimeout to RTMP server
- Add WithLevel() to Logger interface
- Fix datarhei/restreamer#759
- Fix various RTMP bugs
- Fix wrong log output when receiving an RTMP stream
- Fix skipping session handling if collectors are nil
- Update dependencies
### Core v16.14.0 > v16.15.0
- Add migration to FFmpeg 6
- Fix missing process data if process has been deleted meanwhile
- Fix maintaining the metadata on process config update (datarhei/restreamer#698)
- Fix placeholder parsing
- Fix concurrent memfs accesses
- Fix memfs concurrent read and write performance
### Core v16.13.1 > v16.14.0
- Add support for SRTv4 clients
- Add support for Enhanced RTMP in internal RTMP server
- Fix require positive persist interval (session)
- Fix race condition (process)
- Update dependencies
### Core v16.13.0 > v16.13.1
- Fix transfer of reports to updated process
- Fix calling Wait after process has been read
- Fix 509 return code if non-existing stream is requested
- Fix default search paths for config file
- Fix sized filesystem
- Update dependencies
### Core v16.12.0 > v16.13.0
- Add updated_at field in process infos
- Add preserve process log history when updating a process
- Add support for input framerate data from jsonstats patch
- Add number of keyframes and extradata size to process progress data
- Mod bumps FFmpeg to v5.1.3 (datarhei/core:tag bundles)
- Fix better naming for storage endpoint documentation
- Fix freeing up S3 mounts
- Fix URL validation if the path contains FFmpeg specific placeholders
- Fix purging default file from HTTP cache
- Fix parsing S3 storage definition from environment variable
- Fix checking length of CPU time array ([#10](https://github.com/datarhei/core/issues/10))
- Fix possible infinite loop with HLS session rewriter
- Fix not propagating process limits
- Fix RTMP DoS attack (thx Johannes Frank)
- Deprecate ENV names that do not correspond to JSON name