Migration 023 was fixed in 9dc572b to use '00000000-0000-4000-8000-000000000000'
because 'v' isn't a valid hex digit, but the DEV_USER_ID constant in
middleware/auth.js still referenced the original '...000000000dev'. Every
route that passes DEV_USER_ID as a query parameter (users list, login lookup,
setup-required count) was throwing 22P02 invalid input syntax for type uuid.
The errors were swallowed by Promise.allSettled in the SPA's data load so the
app appeared to work in dev mode, but enabling AUTH_ENABLED=true would have
broken login entirely.
Final-review findings:
- Mount usersRouter at /api/v1/users in addition to /api/v1/auth/users so the
existing SPA Users page works; add PATCH /:id for inline edits (display_name,
role, password).
- Add X-Requested-With: dragonflight-ui to raw XHR/fetch paths that bypass
apiFetch (file uploads, SDK uploads, EDL export) — without it, requireUiHeader
403s before reaching the route.
- Exempt SERVICE_PATHS (/cluster/heartbeat) from requireUiHeader so node-agent
heartbeats keep working when NODE_TOKEN is unset.
- Remove stale auth.js.bak.
Fixes three issues in the authentication system:
C1: Add boot-time warning when AUTH_ENABLED=true but TRUST_PROXY!=true.
Without TRUST_PROXY=true behind nginx, req.ip becomes the proxy IP for all
clients, collapsing per-IP rate limiting into a shared pool. Operators must
explicitly set TRUST_PROXY=true to make per-IP rate limiting effective.
C2: Mount requireUiHeader middleware in test helpers (auth.test.js,
users.test.js, tokens.test.js). The CSRF header validation was not being
exercised in the test suite. Tests now send X-Requested-With: dragonflight-ui
headers that are actually validated by the middleware.
I1: Implement bounded rate-limit Map with MAX_ENTRIES=10000 and LRU eviction.
Unbounded Maps are vulnerable to spray attacks: attackers can force memory
exhaustion by requesting with distinct IPs. Now we evict the oldest entry
(by insertion order) when the map reaches capacity.
Code-review feedback:
- Dummy hash for user-enumeration-defense timing was 63 chars (bcrypt strings
are 60 chars). Worked by accident because bcrypt 5.x is lenient about
trailing chars; a future tightening would silently regress the timing
defense. Replaced with a real pre-computed bcrypt hash.
- last_login_at UPDATE now logs errors instead of silently swallowing them,
matching the pattern in requireAuth for api_tokens.last_used_at.
- Removed dead import of comparePassword from auth.test.js.
Code-review feedback: startsWith('/cluster') was a prefix match that exposed
destructive operator endpoints (POST /containers/:id/restart, DELETE /:id,
GET /devices/blackmagic/*) unauthenticated. Only POST /heartbeat is genuine
node-agent traffic; everything else in cluster.js is operator/UI surface
that should go through requireAuth. Long-term: issue node-agent a bound
api_token and drop the carve-out entirely.
Code-review feedback:
- Hard-fail boot when AUTH_ENABLED=true and SESSION_SECRET is unset, so
express-session can't silently use an in-memory random secret that
invalidates sessions on restart and breaks multi-node clusters.
- CORS rejection now returns cb(null, false) instead of cb(new Error)
so misconfigured origins surface as clean CORS errors in the browser
instead of HTTP 500s. Log a warn line for operator visibility.
- pruneSessionInterval units comment.
Code-review feedback: writing last_seen_at = now before loadUser() lets
the stamp persist if the lookup throws (resave:false still writes when
modified), extending the idle window without confirming the user exists.
Also clarify DEV_USER_ID is a specific placeholder, not a generic sentinel.
Code-review feedback: ON CONFLICT (id) only catches id collisions; a pre-existing
'dev' username would trigger a unique_violation on the username index and roll
back the migration, hard-failing the mam-api boot. Switch to bare ON CONFLICT
DO NOTHING so any unique conflict is no-op-safe.
CEP's embedded Chromium (used by Premiere Pro panels) does not support
oklch() color syntax. All color tokens were rendering as invalid/transparent,
causing the panel to appear unstyled. Converted all oklch() values to their
precise hex/rgba equivalents via OKLab→sRGB math. No design changes.
- Capture screen now polls /cluster/devices/blackmagic/signal every 3s
- Per-port chips show signal state (RECEIVING/CONNECTING/LOST/ERROR/IDLE) with pulsing dot
- BMD SVG card diagram rendered per node card
- Sidebar nav badge on Capture item shows live/total port count (pulsing green dot)
The Dashboard page was rendering as plaintext because all .dash-* CSS
classes (dash-section, dash-onair-*, dash-jobs-*, dash-cluster-*,
dash-statusbar, etc.) were missing. Added them with the full dark-theme
design-system styling matching the rest of the app.
The Cluster page's .stat-row and .stat-card classes were also missing,
causing node statistics (counts, CPU, GPUs, memory) to render unstyled.
Added grid-based stat row and card styles.
- remove requireAuth from all route files
- delete auth.js, tokens.js, users.js routes
- delete auth middleware
- remove session middleware and all auth deps from index.js
- delete login.html and auth-guard.js from web-ui
Scope (locked in via planning Q&A):
- Identity: local accounts only (PG users table) + existing bearer
tokens for headless callers.
- Transport: httpOnly cookie session for browser, Bearer for API.
- RBAC: admin / editor / viewer roles, plus an orthogonal
is_client flag for external (agency, talent, customer) accounts.
- Bootstrap: ADMIN_BOOTSTRAP_USER + ADMIN_BOOTSTRAP_PASSWORD env
seed the first admin on a clean install. Set ADMIN_BOOTSTRAP_RESET
to force-reset the named user (break-glass).
- Rate limit: in-memory, 10 fails per 15min per (IP, username).
- Password policy: \u22658 chars, mixed case, digit, symbol; small
blocklist of common passwords; cannot equal username.
- Self-service: change own display name + password. Everything
else (role, is_client, other-user mgmt) is admin only.
- Audit log: append-only table, indexed by actor + event_type +
created_at, populated by every auth/admin event.
Files added:
- services/mam-api/src/db/migrations/022-auth-rework.sql
users.is_client + last_login_at + failed_attempts; audit_log
table with FK to users (ON DELETE SET NULL).
- services/mam-api/src/middleware/audit.js
Fire-and-forget audit() helper. Caller never awaits, failure
logs but never throws — auditing cannot break the request
that triggered it.
- services/mam-api/src/middleware/passwordPolicy.js
Shared checkPassword(pw, { username }) used by setup, user
create/update, and self-service password change.
- services/mam-api/src/tasks/bootstrapAdmin.js
Runs after migrations. No-ops unless ADMIN_BOOTSTRAP_USER +
ADMIN_BOOTSTRAP_PASSWORD are set AND (users table empty OR
ADMIN_BOOTSTRAP_RESET=true).
- services/mam-api/src/routes/audit.js
Admin-only GET /audit (paginated, filter by event_type /
actor / target / date) and GET /audit/event-types.
- services/web-ui/public/modal-account-settings.jsx
Profile + Password tabs. Triggered by sidebar user button.
Files rewritten:
- services/mam-api/src/routes/auth.js
- POST /login: regenerate(), no manual save(); audit success/
fail/lockout; updates last_login_at + failed_attempts.
- POST /logout: destroys session, audits logout.
- GET /me: returns is_client + last_login_at. Synthetic admin
when AUTH_ENABLED=false.
- GET /setup-status: drives login.html UI state.
- POST /setup: blocked once any user exists; password policy.
- POST /password: self-service. Requires current pw, runs
policy, audits, invalidates other sessions implicitly via
users.js if changed by admin.
- PATCH /me: self-service display_name update.
- services/mam-api/src/routes/users.js
- is_client field in create/update/list/get.
- Guardrails: cannot delete or demote last admin, cannot
delete self, admins cannot be flagged is_client.
- Password change invalidates all sessions for that user
(DELETE FROM sessions WHERE sess->>'userId' = id).
- Audit on every mutation.
- Password policy enforced.
- services/mam-api/src/middleware/auth.js
- requireAuth now exposes req.user.is_client.
- New requireRole(["admin","editor"], { rejectClients: true })
helper. Applied to cluster, sdk, capture routes (infra).
- Synthetic user when AUTH_ENABLED=false has is_client=false.
- services/mam-api/src/index.js
- Loads bootstrap admin after migrations.
- Wires /api/v1/audit.
- Cleans up an earlier comment block.
- services/web-ui/public/login.html
- Password hint added next to setup-mode password field.
- services/web-ui/public/shell.jsx
- Sidebar user footer is a button that opens AccountSettings.
- CLIENT badge next to role when is_client=true.
- Nav filters: clients lose ingest tree + jobs + editor;
viewers lose ingest + editor; only admins see the Admin
section. Power button hidden when synthetic user.
- services/web-ui/public/screens-admin.jsx
- Users table: new Client column with inline toggle.
- InviteUserModal: Client checkbox + password hint, gated
off when role=admin.
- Last login column replaces Created in primary view.
- CSV export includes client + last_login.
- services/web-ui/public/data.jsx
- ZAMPP_DATA.ME carries is_client + display_name.
- services/web-ui/public/index.html
- Loads dist/modal-account-settings.js.
- services/web-ui/public/styles-rest.css
- .user-row grid widened to 6 columns.
- docker-compose.yml
- Plumbs SESSION_COOKIE_SECURE + ADMIN_BOOTSTRAP_* env vars.
Deploy:
cd /opt/wild-dragon
git pull origin main
# In .env:
# AUTH_ENABLED=true
# SESSION_SECRET=<openssl rand -hex 48>
# ADMIN_BOOTSTRAP_USER=admin
# ADMIN_BOOTSTRAP_PASSWORD=<strong>
docker compose build mam-api web-ui
docker compose up -d --force-recreate --no-deps mam-api web-ui
Auth work is parked until after ship. While AUTH_ENABLED=false:
- login.html now auto-redirects to / on load (no one should ever see
the login screen while auth is off; it was confusing).
- sidebar power button is hidden entirely when /auth/me returns a
synthetic user, so there's no broken-feeling no-op control.
- Removed connect-pg-simple createTableIfMissing flag in case
v9.0.1's handling of that option was responsible for the recent
boot 502 (the schema is created by migration 021 anyway).
The /auth/login + session.regenerate() + cookie fix from c34a721
stays in place — when we re-enable auth it'll work end-to-end. The
sessions table from migration 021 stays. Operator action to restore
auth later: set AUTH_ENABLED=true + SESSION_SECRET=<random> in the
mam-api environment and restart.
Login was returning 200 + correct user JSON + writing a row to the
sessions table, but emitting zero Set-Cookie headers. Root cause:
session.regenerate() → set fields → session.save() → res.json()
Calling session.save() manually writes the store but bypasses
express-session's res.end() hook, which is the only path that adds
the Set-Cookie header to the response. The cookie was never sent to
the browser even though the session existed server-side — hence the
redirect loop.
Fix: remove the manual save(). Set the session fields and call
res.json() directly inside regenerate()'s callback; express-session
handles store write + Set-Cookie automatically on res.end().
The redirect loop after successful login was almost certainly the
`sessions` table never being created. `schema.sql` defines it but
only runs on first-init via the postgres entrypoint; instances
bootstrapped via mam-api's own migration loop never got the table.
express-session's `req.session.save()` then failed silently and the
cookie pointed at a sid that wasn't in the store — every subsequent
request looked like a brand-new visitor.
- New migration 021-ensure-sessions-table.sql (idempotent).
- connect-pg-simple now configured with `createTableIfMissing: true`
as belt-and-braces.
- `POST /auth/login` now explicitly waits for session.save() and
surfaces both regenerate() and save() errors instead of treating
them as 'success'. Logs sid + req.secure + req.protocol so we can
confirm trust-proxy is doing the right thing behind NPM.
Three concrete issues kept the login flow broken on dragonflight.live:
1. mam-api trusted no proxy headers, so behind nginx/Cloudflare the
session cookie's `secure` flag and the rate-limiter's IP keying
both saw the wrong values. Now sets `app.set('trust proxy', 1)`.
2. Session config was tied to NODE_ENV and lacked sameSite/name. Now:
- SESSION_COOKIE_SECURE env (default: true when AUTH_ENABLED) so a
site behind HTTPS gets Secure cookies regardless of NODE_ENV.
- `sameSite: 'lax'` for predictable post-login redirects.
- Renamed to `df.sid` so it's obvious in DevTools.
- `rolling: true` extends the 7-day TTL on active use.
- SESSION_SECRET is now required when AUTH_ENABLED=true; the
server refuses to start with a dev default in prod.
3. login.html silently showed the sign-in panel even when no users
exist or auth is off:
- New GET /auth/setup-status reports {needs_setup, user_count,
auth_enabled}.
- login.html calls it on load and auto-flips into setup mode when
needs_setup is true, or shows an explicit "auth is off" flash
when auth_enabled is false (the previous symptom: logout button
did nothing because /auth/me returned a synthetic admin no matter
what).
- Added a `.flash.info` style for the new neutral notice.
4. Sidebar logout used to call /auth/logout then `window.location
.reload()`. With auth off that reload landed back on the synthetic-
admin app and looked like nothing happened. It now redirects to
/login.html in all states so the operator sees feedback (and the
server-side messaging about auth being off) instead of a no-op.
Deploy notes for zampp1:
- Set AUTH_ENABLED=true and a random SESSION_SECRET in the
mam-api environment (e.g. /opt/wild-dragon/.env).
- Restart mam-api.
- First load of /login.html will auto-route to the setup form so
you can create the first admin.
RustFS returns empty bodies for ranged GETs whose start offset is past
~5.9 MB on single-file proxy MP4s. HEAD reports correct size, full GET
(`bytes=0-`) works, but `bytes=8179166-` comes back 206 + correct
Content-Range header with zero bytes. Confirmed via direct S3 probe
against broadcastmgmt.cloud/dragonmam (see scratch tests).
Workaround in mam-api `GET /api/v1/assets/:id/video` until the proxy
worker emits HLS (planned v1.2.1):
- HEAD the object first to learn total size (also gives ETag /
Last-Modified for conditional requests).
- No-Range / unparseable-Range / pre-EOF requests \u2192 plain pipe.
- Parsed `bytes=N-M` requests below RUSTFS_RANGE_SAFE_START
(default 5_500_000) \u2192 direct ranged GET, RustFS handles fine.
- Anything reaching into the broken zone \u2192 stream from offset 0,
drop bytes below start, stop at end. Memory stays flat; extra
bandwidth = (end+1 - requested-size) per seek.
- Genuinely out-of-range \u2192 416 with Cache-Control: no-store so the
browser doesn't poison its cache.
Also stashes (not yet wired up) the HLS pieces we'll need for the
follow-up: `segmentToHls` ffmpeg helper + `uploadDirectoryToS3`
worker s3 helper. Harmless additions; not referenced by any code path
yet.
Confirmed against the affected asset (a72aaa03-...): bytes=0-100k +
50% +100k native pass-through; 70% +100k and near-EOF previously hung
the browser, now stream correctly via the stitched path.
Refs #143.