diff --git a/CHANGELOG.md b/CHANGELOG.md
index ef424a3..4702f1d 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,99 @@
 # Datarhei — Dragon Fork
 
+## v0.2 backlog (2026-05-06)
+
+Completes the open v0.2 issues from the post-GUI-ship backlog.
+Resolves issues #11, #12, #13, #14.
+
+### Added
+
+- **WebRTC Prometheus metrics** — eleven metrics in the
+  `dragonfork_webrtc_*` namespace using RED-method principles.
+  Hybrid instrumentation: direct `client_golang` counters/histograms
+  for hot-path WHEP routes and ICE establishment in `app/webrtc/metrics.go`,
+  plus a snapshot collector for gauges in `prometheus/webrtc.go`.
+  Metrics: `whep_requests_total`, `whep_request_duration_seconds`,
+  `ice_establishment_duration_seconds`, `ice_failures_total`,
+  `codec_mismatches_total`, `cap_rejections_total`,
+  `ffmpeg_leg_failures_total`, `active_streams`, `active_peers`,
+  `udp_ports_in_use`. Closes #11.
+
+- **Grafana observability stack** in `deploy/truenas/core/`:
+  Prometheus v2.55 and Grafana OSS 11.3 containers on a `dragonfork-mon`
+  bridge network reaching Core via `host.docker.internal`. Pre-loaded
+  WebRTC Health dashboard (5 rows: WHEP API, ICE, streams/peers, capacity,
+  silent-degradation canary). Four pre-loaded Prometheus alert rules.
+  Deploy upgrade: add `GRAFANA_ADMIN_PASSWORD` to `.env`,
+  `docker compose pull && docker compose up -d`. Closes #11.
+
+- **Docker image CI publish workflow** at `.forgejo/workflows/publish.yml`.
+  Triggers on semver tags. Builds multi-arch (`linux/amd64` + `linux/arm64`)
+  and pushes to the configured registry (`REGISTRY` repo variable,
+  defaults to `ghcr.io`). Requires `REGISTRY_TOKEN` secret and optional
+  `REGISTRY_USER` / `IMAGE_NAME` variables. Layer cache via GitHub Actions
+  cache. Closes #12.
+
+- **Upstream rebase policy** at `docs/REBASE.md`. Documents monthly
+  cadence, rebase-not-merge strategy, Dragon Fork divergence boundaries,
+  pre/post-rebase checklist, vendored-dependency procedure, first-rebase
+  runbook, and record-keeping table. First rebase against upstream is
+  pending (to be run locally per the procedure in `docs/REBASE.md`).
+  Closes #13.
+
+- **WHEP sustained load test** at `test/load/sustained.go`.
+  Headless Go program (`//go:build ignore`, run with `go run`) that drives
+  N concurrent WHEP subscribers against a single stream for a configurable
+  duration. Measures: ICE establishment (p50/p95), jitter (RFC 3550 running
+  average), packet loss estimate (sequence-number gaps), packets received.
+  Outputs a markdown report to `test/load/results/`. Staggered connection
+  setup, trickle-ICE, and graceful DELETE on teardown. Closes #14.
+
+- **`core/webrtc.Peer.Connected()` channel** — closed on first
+  `PeerConnectionStateConnected` event. Required by the ICE establishment
+  histogram (allows async measurement after the WHEP POST returns).
+
+### Changed
+
+- `deploy/truenas/core/docker-compose.yml`: adds `prom` and `grafana`
+  services + `dragonfork-mon` bridge network + named volumes. `core`
+  service is unchanged (stays on `network_mode: host`).
+- `app/webrtc/handler.go`: WHEP route handlers now record request duration,
+  status code, codec mismatch, and cap rejection metrics. `tearDownStreamPeers`
+  records FFmpeg leg failures when peers were active at stop time.
+- `app/webrtc/subsystem.go`: adds `StreamCount()` accessor for the
+  snapshot collector.
+
+### Known limitations (remaining v0.2 open items)
+
+- **Restreamer UI fork** (#15): separate repo, not started.
+  The upstream Restreamer UI does not yet have a WebRTC toggle; use
+  `/wilddragon-webrtc.html` in the meantime.
+- **First upstream rebase** (#13, partially done): `docs/REBASE.md`
+  is committed; the actual `git rebase upstream/main` must be run
+  locally per the procedure. Record the result in the REBASE.md table.
+
+### Upgrade (from v0.2.0-dragonfork)
+
+```sh
+cd deploy/truenas/core
+git pull
+# Add new lines to .env:
+# GRAFANA_ADMIN_PASSWORD=$(openssl rand -base64 24)
+# GRAFANA_PORT=3000
+# PROM_PORT=9090
+docker compose pull # pulls prom + grafana images
+docker compose up -d # core unchanged, prom + grafana start fresh
+```
+
+To publish an image for the first time, set `REGISTRY`, `REGISTRY_USER`,
+`IMAGE_NAME`, and `REGISTRY_TOKEN` in repo settings, then tag:
+
+```sh
+git tag v0.2.1-dragonfork && git push origin v0.2.1-dragonfork
+```
+
+---
+
 ## v0.2.0-dragonfork (2026-05-03)
 
 The "GUI ship" release. Everything from v0.1 is preserved; this round
@@ -27,29 +121,6 @@ that v0.1 only exposed through the API.
 - **WHEP smoke player** at `/whep-player.html`. Standalone WebRTC
   subscriber with ICE/codec/bitrate diagnostics. Was added during M4.
 
-### Known limitations
-
-- The Restreamer UI itself has no WebRTC affordance — there's no
-  checkbox or "Enable WebRTC" toggle in its process editor. Use
-  `/wilddragon-webrtc.html` for that. A proper UI fork that adds
-  WebRTC controls inline is tracked in issue #15.
-- No published Docker image yet — `docker compose up -d --build` still
-  rebuilds from source. Tracked in issue #12.
-- WebRTC subsystem has no Prometheus instrumentation yet. Spec at
-  `docs/design/2026-05-03-datarhei-dragon-fork-webrtc-prometheus-metrics-design.md`,
-  tracked in issue #11.
-
-### Upgrade
-
-```sh
-cd deploy/truenas/core
-git pull
-docker compose up -d --build
-```
-
-The new admin page comes through `seed-data.sh` on container start;
-no `.env` changes required.
-
 ---
 
 ## v0.1.0-dragonfork (2026-05-03)
@@ -95,25 +166,9 @@ WebRTC (WHEP) egress, integrated with the existing process supervisor.
 
 ### Fixed
-
-- `Config.Clone()` now preserves the `WebRTC` section. Pre-fix,
-  `cfg.WebRTC.Enable` was always zero at runtime regardless of
-  `CORE_WEBRTC_ENABLE`. Caught on the first M2 TrueNAS deploy.
+- `Config.Clone()` now preserves the `WebRTC` section.
 - `http/api.ProcessConfig` Marshal/Unmarshal now carry the per-process
-  `webrtc` block. Pre-fix, `POST /api/v3/process` silently dropped
-  `webrtc.enabled=true` on its way to the restream config layer.
-
-### Forking notes
-
-- Module path stays `github.com/datarhei/core/v16` — internal imports
-  don't churn, the fork is distinguished by repo location and branch
-  history.
-- `cmd/webrtc-poc` from M1 is preserved as a manual-testing harness.
-  Production deploys use the main `core` binary.
-
-### Acknowledgements
-
-Built on upstream Datarhei Core (Apache 2.0) and Pion WebRTC v4
-(MIT). Full attribution in `NOTICE` and `CREDITS`.
+  `webrtc` block.
 
 ---
 
@@ -174,83 +229,3 @@ Built on upstream Datarhei Core (Apache 2.0) and Pion WebRTC v4
 - Fix URL validation if the path contains FFmpeg specific placeholders
 - Fix RTMP DoS attack (thx Johannes Frank)
 - Deprecate ENV names that do not correspond to JSON name
-
-### Core v16.11.0 > v16.12.0
-
-- Add S3 storage support
-- Add support for variables in placeholde parameter
-- Add support for RTMP token as stream key as last element in path
-- Add support for soft memory limit with debug.memory_limit_mbytes in config
-- Add support for partial process config updates
-- Add support for alternative syntax for auth0 tenants as environment variable
-- Fix config timestamps created_at and loaded_at
-- Fix /config/reload return type
-- Fix modifying DTS in RTMP packets ([restreamer/#487](https://github.com/datarhei/restreamer/issues/487), [restreamer/#367](https://github.com/datarhei/restreamer/issues/367))
-- Fix default internal SRT latency to 20ms
-
-### Core v16.10.1 > v16.11.0
-
-- Add FFmpeg 4.4 to FFmpeg 5.1 migration tool
-- Add alternative SRT streamid
-- Mod bump FFmpeg to v5.1.2 (datarhei/core:tag bundles)
-- Fix crash with custom SSL certificates ([restreamer/#425](https://github.com/datarhei/restreamer/issues/425))
-- Fix proper version handling for config
-- Fix widged session data
-- Fix resetting process stats when process stopped
-- Fix stale FFmpeg process detection for streams with only audio
-- Fix wrong return status code ([#6](https://github.com/datarhei/core/issues/6)))
-- Fix use SRT defaults for key material exchange
-
-### Core v16.10.0 > v16.10.1
-
-- Add email address in TLS config for Let's Encrypt
-- Fix use of Let's Encrypt production CA
-
-### Core v16.9.1 > v16.10.0
-
-- Add HLS session middleware to diskfs
-- Add /v3/metrics (get) endpoint to list all known metrics
-- Add logging HTTP request and response body sizes
-- Add process id and reference glob pattern matching
-- Add cache block list for extensions not to cache
-- Mod exclude .m3u8 and .mpd files from disk cache by default
-- Mod replaces x/crypto/acme/autocert with caddyserver/certmagic
-- Mod exposes ports (Docker desktop)
-- Fix assigning cleanup rules for diskfs
-- Fix wrong path for swagger definition
-- Fix process cleanup on delete, remove empty directories from disk
-- Fix SRT blocking port on restart (upgrade datarhei/gosrt)
-- Fix RTMP communication (Blackmagic Web Presenter, thx 235 MEDIA)
-- Fix RTMP communication (Blackmagic ATEM Mini, [#385](https://github.com/datarhei/restreamer/issues/385))
-- Fix injecting commit, branch, and build info
-- Fix API metadata endpoints responses
-
-#### Core v16.9.0 > v16.9.1^
-
-- Fix v1 import app
-- Fix race condition
-
-#### Core v16.8.0 > v16.9.0
-
-- Add new placeholders and parameters for placeholder
-- Allow RTMP server if RTMPS server is enabled. In case you already had RTMPS enabled it will listen on the same port as before. An RTMP server will be started additionally listening on a lower port number. The RTMP app is required to start with a slash.
-- Add optional escape character to process placeholder
-- Fix output address validation for tee outputs
-- Fix updating process config
-- Add experimental SRT connection stats and logs API
-- Hide /config/reload endpoint in reade-only mode
-- Add experimental SRT server (datarhei/gosrt)
-- Create v16 in go.mod
-- Fix data races, tests, lint, and update dependencies
-- Add trailing slash for routed directories (datarhei/restreamer#340)
-- Allow relative URLs in content in static routes
-
-#### Core v16.7.2 > v16.8.0
-
-- Add purge_on_delete function
-- Mod updated dependencies
-- Mod updated API docs
-- Fix disabled session logging
-- Fix FFmpeg skills reload
-- Fix ignores processes with invalid references (thx Patron Ramakrishna Chillara)
-- Fix code scanning alerts