datarhei-dragonfork-core/.forgejo/workflows/test.yml
Zac Gaetano b7afd0f08a
Some checks failed
ci / vet + build (push) Successful in 9m54s
ci / vet + build (pull_request) Successful in 9m49s
ci / race tests (push) Failing after 8m1s
ci / WebRTC smoke (5-viewer fanout) (push) Successful in 9m45s
ci / WebRTC latency p95 gate (push) Successful in 10m3s
ci / race tests (pull_request) Failing after 7m59s
ci / WebRTC smoke (5-viewer fanout) (pull_request) Successful in 9m45s
ci / WebRTC latency p95 gate (pull_request) Successful in 10m4s
ci(webrtc): server-hop latency p95 gate
Adds an end-to-end RTP-arrival latency probe that runs as a dedicated
CI job and asserts p95 < 50ms.

Implementation
--------------
A build-tagged test (-tags latency, off by default) sends 1000
synthetic RTP packets at 60Hz into corewebrtc.Source and reads them
back via a Pion subscriber's track.ReadRTP(). Each packet's payload
starts with the publisher's UnixNano send time; the subscriber diffs
against time.Now() at arrival and accumulates p50/p95/p99.

This exercises every link of the egress hop: Source UDP read,
subscriber fan-out, forwardRTPSplit, Pion's TrackLocalStaticRTP
write, DTLS-SRTP encrypt, ICE socket write, decrypt at the
subscriber, RTP unmarshal at ReadRTP. Pure server-side; no FFmpeg
or codecs involved.

Why not glass-to-glass
----------------------
The design's §7 calls for FFmpeg drawtext frame counters + decode-
side pixel sampling, p95<300ms RTMP / <200ms SRT. Implementing that
in pure Go needs a cgo H.264 decoder or an FFmpeg sidecar pipe — a
significantly bigger lift for a marginal regression-detection win
(encode/decode latency is roughly fixed by the codec stack and
isn't moved by Core code changes). The server-hop measurement
captures everything Core code can actually regress.

Threshold
---------
50ms p95. Locally observed on a quiet host:
  p50=110µs, p95=237µs, p99=318µs.
The 50ms gate is ~200x headroom — generous enough to absorb CI
runner noise without false alarms, tight enough to catch a real
slowdown.

Race-clean: latencySamples uses a sync.Mutex around the slice append
(initial draft had a slice racing with the receive goroutine; vet
caught it).

Documented in test/TESTING.md and wired to .forgejo/workflows/test.yml
as the latency-gate job (depends on lint-and-vet, parallel with test
and webrtc-smoke).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 12:18:57 +00:00


# Forgejo Actions CI for Datarhei — Dragon Fork.
#
# Mirrors the shape of the upstream go-tests.yml (GitHub Actions syntax),
# but is pinned to Go 1.24 to match go.mod and adds the M3 race-detector
# pass. The forgejo-runner picks this up automatically.
#
# Triggered on pushes to main, m[0-9]*-* and fix/** branches, and on
# every pull request. Four jobs:
#   - lint-and-vet: cheap, fast feedback (~30s)
#   - test:         full test suite with -race, ~3 minutes including
#                   the integration tests in app/webrtc that bind UDP
#                   sockets and run a real Pion handshake.
#   - webrtc-smoke: WebRTC integration tests only (5-viewer fanout).
#   - latency-gate: build-tagged server-hop latency check, p95 < 50ms.
name: ci

on:
  push:
    branches:
      - main
      - 'm[0-9]*-*'
      - 'fix/**'
  pull_request:

jobs:
  lint-and-vet:
    name: vet + build
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: '1.24'
          cache: true
      - name: go vet
        run: go vet ./...
      - name: go build
        run: go build ./...

  test:
    name: race tests
    runs-on: ubuntu-22.04
    needs: lint-and-vet
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: '1.24'
          cache: true
      # Integration tests need ephemeral UDP ports above 32768; the
      # default sysctl on ubuntu runners covers this, so no extra
      # setup is required.
      - name: go test -race -short
        run: go test -race -short -count=1 ./...
        env:
          # The integration tests start Pion peers. Exit on the first
          # reported data race so a racy test fails fast instead of
          # running on for the rest of the job.
          GORACE: 'halt_on_error=1'
      - name: go test (coverage, no race)
        # Race detector + coverage in one pass slows things meaningfully;
        # do them separately. This step's purpose is the coverage.out
        # artifact, not a second correctness signal.
        run: go test -coverprofile=coverage.out -covermode=atomic -count=1 ./...
      - name: Upload coverage artifact
        uses: actions/upload-artifact@v4
        if: success() || failure()
        with:
          name: coverage-go-${{ github.sha }}
          path: coverage.out
          if-no-files-found: warn
          retention-days: 14

  # --- WebRTC subsystem-only smoke ---------------------------------
  # The 5-viewer fanout test catches the largest class of regressions
  # for the egress path. Promoted to its own job so a failure on the
  # WebRTC side reads cleanly in the actions log instead of being
  # buried among ~80 packages of unrelated Core tests.
  webrtc-smoke:
    name: WebRTC smoke (5-viewer fanout)
    runs-on: ubuntu-22.04
    needs: lint-and-vet
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: '1.24'
          cache: true
      - name: WebRTC integration tests (race)
        run: |
          go test -race -count=1 -v \
            -run 'TestIntegration_|TestSubsystem_TeardownHookFiresOnProcessStop|TestHandler_' \
            ./app/webrtc/... ./core/webrtc/...

  # --- Latency gate ----------------------------------------------------
  # Server-hop p95 latency check. Build-tagged so it doesn't run in the
  # default `go test ./...` invocation; this dedicated job exists to
  # catch regressions that would otherwise hide behind 'all tests pass'.
  # Threshold: p95 < 50ms (locally observed: sub-ms; gate is generous
  # to absorb CI runner noise without false alarms).
  latency-gate:
    name: WebRTC latency p95 gate
    runs-on: ubuntu-22.04
    needs: lint-and-vet
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: '1.24'
          cache: true
      - name: Server-hop latency p95 < 50ms
        run: |
          go test -tags latency -timeout 90s -race -count=1 \
            -run TestLatencyServerHop \
            ./app/webrtc/... -v