v0.2: WebRTC Prometheus metrics + Grafana stack #11

Closed
opened 2026-05-03 14:53:25 -04:00 by zgaetano · 0 comments
Owner

Tracking issue for the WebRTC observability work. Design spec is committed:

docs/design/2026-05-03-datarhei-dragon-fork-webrtc-prometheus-metrics-design.md (commit 949daa2)

Scope at a glance

  • Eleven metrics in dragonfork_webrtc_* namespace (RED-method on the WHEP surface plus state gauges)
  • Hybrid instrumentation: direct client_golang in app/webrtc/ for histograms, snapshot collector in prometheus/webrtc.go for gauges
  • Two new containers in deploy/truenas/core/ (Prometheus + Grafana) with provisioned datasource and dashboard
  • Four pre-loaded alert rules; no Alertmanager bundling
  • Backwards-compatible — additive only

Status

  • [x] Design spec drafted
  • [ ] Design spec reviewed by Zac
  • [ ] Implementation plan (writing-plans skill)
  • [ ] app/webrtc/metrics.go + tests
  • [ ] prometheus/webrtc.go + tests
  • [ ] Subsystem.Stats() method
  • [ ] Deploy bundle additions (compose + prom config + grafana provisioning)
  • [ ] Dashboard JSON
  • [ ] test/TESTING.md verification section
  • [ ] CHANGELOG v0.2 section
  • [ ] Tag v0.2.0-dragonfork

Why now

v0.1 has been running on TrueNAS since 2026-04-17 with zero per-subsystem signal. If anything has been quietly degrading, we can't see it. This closes that gap before the v0.2 feature work (load test, UI fork) lands.

Closes the v0.1 observability gap.

Reference: zgaetano/datarhei-dragonfork-core#11