datarhei-dragonfork-core/test/TESTING.md
Zac Gaetano b7afd0f08a
ci(webrtc): server-hop latency p95 gate
Adds an end-to-end RTP-arrival latency probe that runs as a dedicated
CI job and asserts p95 < 50ms.

Implementation
--------------
A build-tagged test (-tags latency, off by default) sends 1000
synthetic RTP packets at 60Hz into corewebrtc.Source and reads them
back via a Pion subscriber's track.ReadRTP(). Each packet's payload
starts with the publisher's UnixNano send time; the subscriber diffs
against time.Now() at arrival and accumulates p50/p95/p99.

This exercises every link of the egress hop: Source UDP read,
subscriber fan-out, forwardRTPSplit, Pion's TrackLocalStaticRTP
write, DTLS-SRTP encrypt, ICE socket write, decrypt at the
subscriber, RTP unmarshal at ReadRTP. Pure server-side; no FFmpeg
or codecs involved.

Why not glass-to-glass
----------------------
The design's §7 calls for FFmpeg drawtext frame counters + decode-
side pixel sampling, p95<300ms RTMP / <200ms SRT. Implementing that
in pure Go needs a cgo H.264 decoder or an FFmpeg sidecar pipe — a
significantly bigger lift for a marginal regression-detection win
(encode/decode latency is roughly fixed by the codec stack and
isn't moved by Core code changes). The server-hop measurement
captures everything Core code can actually regress.

Threshold
---------
50ms p95. Locally observed on a quiet host:
  p50=110µs, p95=237µs, p99=318µs.
The 50ms gate is ~200x headroom — generous enough to absorb CI
runner noise without false alarms, tight enough to catch a real
slowdown.

Race-clean: latencySamples uses a sync.Mutex around the slice append
(initial draft had a slice racing with the receive goroutine; vet
caught it).

Documented in test/TESTING.md and wired to .forgejo/workflows/test.yml
as the latency-gate job (depends on lint-and-vet, parallel with test
and webrtc-smoke).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 12:18:57 +00:00


Testing the WebRTC egress path

In-process (CI)

go test -race -count=1 ./app/webrtc/... ./core/webrtc/...

The integration tests under app/webrtc/ allocate UDP ports on loopback, spin up an Echo handler, attach a Pion subscriber, and spray synthetic RTP into the registered Source. TestIntegration_FiveViewerFanout covers the 5-concurrent-viewer acceptance path from the M3 design.
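The loopback-port and synthetic-RTP steps above can be sketched with the standard library alone. This is a minimal illustration, not the repository's test code: `buildRTP` and `sprayLoopback` are hypothetical helper names, the payload type 96 and the SSRC are arbitrary, and the receiving socket merely stands in for the Source's ingest port.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
	"time"
)

// buildRTP returns a minimal RTP packet: the fixed 12-byte header from
// RFC 3550 (no CSRCs, no extension, marker bit clear) followed by payload.
func buildRTP(pt uint8, seq uint16, ts, ssrc uint32, payload []byte) []byte {
	pkt := make([]byte, 12+len(payload))
	pkt[0] = 0x80      // version 2, no padding, no extension, CC = 0
	pkt[1] = pt & 0x7f // payload type, marker bit clear
	binary.BigEndian.PutUint16(pkt[2:4], seq)
	binary.BigEndian.PutUint32(pkt[4:8], ts)
	binary.BigEndian.PutUint32(pkt[8:12], ssrc)
	copy(pkt[12:], payload)
	return pkt
}

// sprayLoopback binds an ephemeral UDP port on 127.0.0.1 (port 0 lets the
// kernel pick a free one), sends n packets to it, and returns the RTP
// sequence numbers it read back.
func sprayLoopback(n int) ([]uint16, error) {
	rx, err := net.ListenUDP("udp4", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1), Port: 0})
	if err != nil {
		return nil, err
	}
	defer rx.Close()

	tx, err := net.DialUDP("udp4", nil, rx.LocalAddr().(*net.UDPAddr))
	if err != nil {
		return nil, err
	}
	defer tx.Close()

	seqs := make([]uint16, 0, n)
	buf := make([]byte, 1500)
	for i := 0; i < n; i++ {
		// 90 kHz video clock at 60 packets per second: 1500 ticks per packet.
		pkt := buildRTP(96, uint16(i), uint32(i)*1500, 0xdecafbad, []byte("synthetic"))
		if _, err := tx.Write(pkt); err != nil {
			return nil, err
		}
		if err := rx.SetReadDeadline(time.Now().Add(time.Second)); err != nil {
			return nil, err
		}
		m, _, err := rx.ReadFromUDP(buf)
		if err != nil {
			return nil, err
		}
		if m < 12 {
			return nil, fmt.Errorf("short packet: %d bytes", m)
		}
		seqs = append(seqs, binary.BigEndian.Uint16(buf[2:4]))
	}
	return seqs, nil
}

func main() {
	seqs, err := sprayLoopback(5)
	if err != nil {
		panic(err)
	}
	fmt.Println("received sequence numbers:", seqs)
}
```

Binding to port 0 avoids hard-coded ports, which is what lets several such tests run in parallel on one CI runner without colliding.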

Manual / browser

whep-player.html is a self-contained WHEP subscriber a human can point at any live deploy. Open it directly in a browser:

file:///path/to/datarhei-dragonfork-core/test/whep-player.html

…or copy it onto a static host (no server-side dependency). It accepts the WHEP URL and an optional bearer token (the deploy uses Core's JWT, so paste an access_token from POST /api/login). It POSTs an SDP offer with recvonly video and audio transceivers, applies the answer, and renders the stream in <video>. The stats panel shows the ICE and PeerConnection states, the codec pulled from the answer SDP, and a 1 Hz inbound-bitrate sample. Disconnect issues a WHEP DELETE on the resource URL the server returned in the Location header.

Shareable URL:

file:///.../whep-player.html?url=http://10.0.0.25:8090/api/v3/whep/myStream&token=eyJhbGciOi...

Pion CLI helper

test/whep-client/ is the same handshake in Go, useful for scripting or running on the same machine as Core for an apples-to-apples loopback test:

cd test/whep-client
go build -o /tmp/whep-client .
/tmp/whep-client -url http://10.0.0.25:8090/api/v3/whep/myStream -token "$JWT" -timeout 15s

The client exits 0 once both the video and audio tracks have received their first RTP packet. It was used in the M2 deploy verification on TrueNAS.

Latency p95 gate

Wired into CI via the latency-gate job in .forgejo/workflows/test.yml. Run locally:

go test -tags latency -timeout 90s -race -count=1 \
  -run TestLatencyServerHop ./app/webrtc/...

What it measures

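Server-hop latency from corewebrtc.Source ingest through Pion's DTLS-SRTP egress to a subscriber's track.ReadRTP(). The test sends 1000 synthetic RTP packets at 60 Hz; the publisher embeds a wall-clock UnixNano timestamp at the start of each RTP payload, and the subscriber reads it on arrival, diffs it against time.Now(), and accumulates p50/p95/p99. No FFmpeg or codecs are involved.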

What it does NOT measure

True glass-to-glass latency would include FFmpeg encode and a real H.264 decoder on the subscriber side. The design (webrtc-design.md §7) calls for drawtext-burned frame counters plus decode-side pixel sampling, with targets of p95 < 300 ms for RTMP and < 200 ms for SRT. Implementing that in pure Go would require a cgo H.264 decoder or an FFmpeg-as-sidecar pipe, neither of which pays off for the dominant CI question ("did anybody regress the server hop?"). Encode/decode latency is roughly fixed by the codec stack, so Core code changes won't move it.

Threshold

p95 < 50 ms on the CI runner. Locally observed on a quiet host: p50 ≈ 110 µs, p95 ≈ 240 µs, p99 ≈ 320 µs. The 50 ms gate sits roughly 200× above the local p95, which is generous enough to absorb CI runner noise; a run that crosses it indicates a genuine slowdown.