datarhei-dragonfork-core/NOTES.md

118 lines
5.2 KiB
Markdown
Raw Normal View History

# Datarhei - Dragon Fork — Implementation Notes
This file tracks observations, gotchas, and decisions made during the Dragon
Fork WebRTC egress implementation. Keep entries chronological; each milestone
adds a new section.
## Baseline (M1, 2026-04-17)
- Forked from upstream `datarhei/core` commit `0de97f4` ("Add linux/arm/v8 build").
- Upstream module path: `github.com/datarhei/core/v16`. The Dragon Fork keeps
this module path so internal imports don't churn; the fork is distinguished
by its repo location (`forge.wilddragon.net/zgaetano/datarhei-dragonfork-core`)
and branch history, not its Go module identity.
- Toolchain: Go 1.22.8, FFmpeg 4.4.2 in the sandbox. FFmpeg 6.x recommended
for publishers in Task 10; 4.4.2 is sufficient for the PoC (libx264 +
libopus + RTP muxer all present).
- `go build ./...` on the clean fork: succeeds.
- `go test -short ./...` on the clean fork: all packages pass. No upstream
flakes observed.
### Pre-existing state of note
- None flagged.
---
## v0.2 (2026-05-06)
### Restreamer SDK gap — `UpsertIngest` does not forward `webrtc` or `whip_ingest`
The datarhei Restreamer SDK's `UpsertIngest` builds the FFmpeg process config
from `control.hls`, `control.rtmp`, `control.srt`, and `control.process`, but
silently discards `control.webrtc` and `control.whip_ingest`. This was confirmed
by inspecting the minified SDK bundle inside the running container.
**Implication:** toggling WebRTC egress or WHIP ingest from the Restreamer UI
required a monkey-patch. The Edit view (`overlay/src/views/Edit/index.js`)
overrides `props.restreamer._upsertProcess` immediately before the
`UpsertIngest` call to inject `whip_ingest.enabled` (and in future,
`webrtc.enabled`) into the process config JSON, then restores the original
method in a `finally` block. The patch is narrow and reversible.
**Why not patch the SDK?** The SDK is a vendored minified bundle inside the
upstream Restreamer UI. Patching it would require either maintaining a fork of
the minifier input (the full SDK source is not in the UI repo) or deobfuscating
and re-minifying. The monkey-patch is pragmatic and self-documenting.
### UI state uses `enable` (no d); Core API uses `enabled` (with d)
WHEP.js and WHIP.js use `enable` (matching existing convention in the UI
controls). The monkey-patch performs the mapping:
```js
config.whip_ingest = { enabled: !!(control.whip_ingest && control.whip_ingest.enable) };
```
If this causes confusion in the future, unify on `enabled` in both places.
---
## v0.3 (2026-05-10)
### `ProcessConfigWHIPIngest` was the critical backend gap
`http/api/process.go` had `ProcessConfigWebRTC` for WHEP but no corresponding
struct for WHIP. Without it, the JSON field `whip_ingest` was dropped on
unmarshal and `app.Config.WHIPIngest.Enabled` was always `false`. WHIP
could never activate via the API regardless of what the UI sent. Adding
`ProcessConfigWHIPIngest` and wiring it into `Marshal()`/`Unmarshal()` was
the essential fix (commit `4d94c88`).
### seed-data.sh no-clobber bug
`seed-data.sh` used `cp -n` (no-clobber) for all files including the built
JS bundle. On first deploy the `static/` directory doesn't exist and the copy
succeeds. On every subsequent deploy the directory already exists and the copy
is skipped — so a rebuilt container silently serves the old bundle. The symptom
was confusing: the binary had new code, but the UI didn't reflect it.
**Fix:** split into two phases. `index.html`, `asset-manifest.json`, and
`static/` are always overwritten (`cp -Rfp`). All other content (channel
database, player bundles, operator-uploaded media) remains no-clobber to
protect live data.
### Keyframe cache lock ordering
`keyFrameCache` has its own mutex `c.mu` distinct from `Source.mu`. The two
locks are never nested:
- `readLoop` calls `cache.push(pkt)` (acquires/releases `c.mu`), then
acquires `s.mu` for the subscriber fanout — sequential, not nested.
- `Source.Subscribe()` takes the snapshot outside `s.mu` (acquires/releases
`c.mu`), then acquires `s.mu` to register the subscriber — also sequential.
This means there is no deadlock risk even though two separate mutexes are
involved. The snapshot-before-lock pattern in `Subscribe` was chosen for
clarity, not necessity; but it documents the intent explicitly.
### STAP-A IDR detection is not implemented
`isH264IDRStart` handles single-NAL (type 5) and FU-A start (type 28, start
bit, inner type 5). It does **not** handle STAP-A aggregates (type 24) that
happen to lead with an IDR NAL.
In practice, FFmpeg and GStreamer never emit IDR slices inside STAP-A — IDR
frames are large and STAP-A is designed for small NALs (SPS + PPS combos).
If a publisher that does use STAP-A for IDR ever appears, the cache will miss
the keyframe boundary; the worst outcome is a larger-than-expected burst (the
cache grows until the next correctly-detected IDR) rather than a crash or
incorrect video. Add STAP-A handling in a future revision if needed.
### `go test -race ./core/webrtc/...` baseline
All 34 tests in `core/webrtc` pass under the race detector as of v0.3
(commit `228ed4b`). The suite covers config, ICE, registry, peer creation,
WHEP handler, keyframe cache, and Source subscribe/pre-fill/close. Total
runtime ≈ 16 s (dominated by the two 5-second ICE gathering timeouts in
`TestPeerFactory_*`).