Review of the v2 auth landing turned up four weak spots in the MFA path.
All four are now fixed; behaviour is unchanged for the password-correct
+ correct-TOTP happy path.
1. TOTP brute-force gate (the big one). /login was calling
ipBackoff.recordSuccess(ip) the instant the password hashed correctly,
*before* the second factor was proven. That cleared the per-IP failure
counter, so each /login retry let an attacker with a known password
hammer the 6-digit /login/totp space (10^6) at full speed.
Now recordSuccess fires only inside establishSession() — i.e. after
every required factor has actually passed (password [+TOTP] or
OAuth [+TOTP]).
2. MFA ticket binding. Tickets issued by /login (and the Google callback)
were unbound — a stolen ticket replayed from a different origin still
worked. Tickets now carry SHA-256 hashes of the issuing request's IP
and User-Agent; redeemTicket rejects on mismatch. The ticket is burned
even on mismatch so a wrong-binding probe can't be retried.
3. TOTP replay within the same 30s step (RFC 6238 §5.2). The verifier
accepted the same code as many times as you submitted it. Now
verifyToken returns the matched counter, and /login/totp does a CAS
UPDATE on users.totp_last_counter — codes at counters <= the last
accepted value are rejected. New migration 030 adds totp_last_counter,
seeded on /totp/enable so the enrollment code itself can't be reused
at first login, and zeroed on /totp/disable.
4. Google OAuth domain check no longer falls back to the email suffix
when the hd (hosted-domain) claim is missing. Email-suffix matching
let consumer (non-Workspace) Google accounts whose email happens to
end in the allowed domain through; if GOOGLE_ALLOWED_DOMAIN is set,
the operator means "only this Workspace", so accounts without a
verified hd must be rejected.
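The ticket binding from item 2 can be sketched roughly as below. This is
an in-memory illustration under stated assumptions, not the shipped
code: issueTicket, the Map store, and the field names are hypothetical;
only redeemTicket, the SHA-256 hashes of IP and User-Agent, and the
burn-on-mismatch rule come from the notes above. The bindings-absent
back-compat path is not modeled.

```javascript
const crypto = require('node:crypto');

// Hypothetical in-memory ticket store; the real one may live elsewhere.
const tickets = new Map();

const sha256 = (value) =>
  crypto.createHash('sha256').update(String(value)).digest('hex');

// Issue a ticket bound to hashes of the issuing request's IP and User-Agent.
function issueTicket(userId, ip, userAgent) {
  const id = crypto.randomBytes(32).toString('hex');
  tickets.set(id, { userId, ipHash: sha256(ip), uaHash: sha256(userAgent) });
  return id;
}

// Redeem a ticket. It is deleted before the binding check, so a
// wrong-binding probe burns the ticket and cannot be retried.
function redeemTicket(id, ip, userAgent) {
  const ticket = tickets.get(id);
  if (!ticket) return null;
  tickets.delete(id);
  const ok = ticket.ipHash === sha256(ip) && ticket.uaHash === sha256(userAgent);
  return ok ? ticket.userId : null;
}
```

Deleting before comparing keeps the single-use guarantee independent of
the outcome of the check.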
Tests: new mfa-tickets.test.js covers ip/UA binding, single-use on
mismatch, and bindings-absent back-compat. totp.test.js updated for the
new verifyToken return shape (counter on success, null on failure;
truthiness still works at call sites) and adds an explicit
matched-counter check.
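For reference, the new verifyToken return shape and the replay guard
could look roughly like this. A minimal sketch assuming HMAC-SHA1,
6 digits, a 30s step, and a +/-1 step window (the window size is an
assumption); hotp and acceptCounter are hypothetical names, and the
in-memory acceptCounter only stands in for the CAS UPDATE.

```javascript
const crypto = require('node:crypto');

// RFC 4226 HOTP value for a counter, as a zero-padded 6-digit string.
function hotp(secret, counter) {
  const msg = Buffer.alloc(8);
  msg.writeBigUInt64BE(BigInt(counter));
  const mac = crypto.createHmac('sha1', secret).update(msg).digest();
  const offset = mac[mac.length - 1] & 0x0f;
  const code = (mac.readUInt32BE(offset) & 0x7fffffff) % 1000000;
  return String(code).padStart(6, '0');
}

// Returns the matched time-step counter, or null when nothing matches.
// Plain === is used for brevity; production code would want a
// constant-time comparison.
function verifyToken(secret, token, nowMs = Date.now(), window = 1) {
  const current = Math.floor(nowMs / 1000 / 30);
  for (let c = Math.max(0, current - window); c <= current + window; c++) {
    if (hotp(secret, c) === token) return c;
  }
  return null;
}

// In-memory stand-in for the compare-and-swap. In SQL, roughly:
//   UPDATE users SET totp_last_counter = $1
//    WHERE id = $2 AND totp_last_counter < $1   -- accept iff rowCount === 1
function acceptCounter(user, counter) {
  if (counter === null || counter <= user.totp_last_counter) return false;
  user.totp_last_counter = counter;
  return true;
}
```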
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Optional "Sign in with Google" with auto-provisioning, fully config-gated:
without GOOGLE_CLIENT_ID/SECRET and OAUTH_REDIRECT_URL the routes 404 and the
button is hidden, so deployments without SSO are unaffected.
- migration 028: users.google_sub (unique) + email; password_hash nullable
for OAuth-only accounts
- src/auth/google-oauth.js: lazy google-auth-library, ID-token verify,
GOOGLE_ALLOWED_DOMAIN enforcement, requires email_verified === true
- auth routes: /auth/google (state-CSRF redirect), /auth/google/callback,
/auth/google/enabled; reuses establishSession
- web-ui: "Sign in with Google" on the login screen (shown only when enabled),
friendly callback error handling
- .env.example documents all new vars
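A rough sketch of the claim checks listed for google-oauth.js, written
in the stricter hd-only form (no email-suffix fallback).
isAllowedGoogleIdentity is a hypothetical name; hd and email_verified
are standard Google ID-token claims, and the payload is assumed to be
signature-verified already.

```javascript
// payload: decoded, already-verified Google ID-token claims.
// allowedDomain: value of GOOGLE_ALLOWED_DOMAIN, or empty/undefined when unset.
function isAllowedGoogleIdentity(payload, allowedDomain) {
  if (!payload || payload.email_verified !== true) return false;
  if (!allowedDomain) return true; // no domain restriction configured
  // Compare the hd (hosted-domain) claim only; the email suffix is ignored.
  return (
    typeof payload.hd === 'string' &&
    payload.hd.toLowerCase() === allowedDomain.toLowerCase()
  );
}
```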
Security hardening (from review of this + the TOTP work):
- resolveGoogleUser links ONLY by google_sub, never by email — a Google login
can never seize a pre-existing local account (account-takeover fix)
- a Google-linked account with TOTP still requires the second factor (ticket
in session, /?mfa=1 step) instead of bypassing it
- /login/totp now applies the per-IP login backoff
- recovery-code consumption is atomic (WHERE used_at IS NULL + rowCount)
- concurrent first-login race on google_sub is caught and re-resolved
- tests: google-oauth config helpers + google-link takeover/dedup regression
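The link-by-google_sub rule can be illustrated with an in-memory
stand-in for the users table. Only the resolveGoogleUser name and the
"never by email" rule come from the notes above; the array, the
generated username, and the field layout are assumptions.

```javascript
// Hypothetical in-memory user table standing in for the PG users table.
const users = [
  { id: 1, username: 'alice', email: 'alice@example.com', google_sub: null },
];
let nextId = 2;

// Look up by google_sub only. A matching email on a local account is
// ignored, so a Google login provisions a new account instead of
// taking over an existing one.
function resolveGoogleUser({ sub, email }) {
  const existing = users.find((u) => u.google_sub === sub);
  if (existing) return existing;
  const created = { id: nextId++, username: `google:${sub}`, email, google_sub: sub };
  users.push(created);
  return created;
}
```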
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- requireAuth bearer path now selects api_tokens.bound_hostname and users.role,
populates req.tokenBoundHostname and req.user.role. /cluster/heartbeat can
now authenticate via a bound api_token (issued via POST /auth/tokens with
bound_hostname).
- routes/tokens.js POST accepts bound_hostname; GET returns it so users can
see which tokens are bound.
- Remove /cluster/heartbeat from SERVICE_PATHS so requireAuth runs on it (the
bearer auth handles the gate; the heartbeat handler still enforces the
body.hostname === bound match).
- /auth/me now returns role (final-review I2). Closes the gap where every
signed-in user appeared as 'viewer' in the UI regardless of actual role.
- loadUser SELECTs role for session auth.
- Backend tests still 37/15/0/22 — no test changes needed; existing token
CRUD tests stay passing since bound_hostname is optional.
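The hostname match the heartbeat handler enforces might look roughly
like this. checkHeartbeatBinding is a hypothetical helper, and both the
403 status and the rejection of unbound tokens are assumptions; only
req.tokenBoundHostname and the body.hostname comparison come from the
notes above.

```javascript
// Returns an HTTP status for a heartbeat authenticated by bearer token.
// req.tokenBoundHostname is expected to be populated by requireAuth.
function checkHeartbeatBinding(req) {
  const bound = req.tokenBoundHostname;
  if (!bound) return 403; // assumption: unbound tokens may not heartbeat
  if (!req.body || req.body.hostname !== bound) return 403; // must match binding
  return 200;
}
```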
Code-review feedback:
- Dummy hash for user-enumeration-defense timing was 63 chars (bcrypt strings
are 60 chars). Worked by accident because bcrypt 5.x is lenient about
trailing chars; a future tightening would silently regress the timing
defense. Replaced with a real pre-computed bcrypt hash.
- last_login_at UPDATE now logs errors instead of silently swallowing them,
matching the pattern in requireAuth for api_tokens.last_used_at.
- Removed dead import of comparePassword from auth.test.js.
- remove requireAuth from all route files
- delete auth.js, tokens.js, users.js routes
- delete auth middleware
- remove session middleware and all auth deps from index.js
- delete login.html and auth-guard.js from web-ui
Scope (locked in via planning Q&A):
- Identity: local accounts only (PG users table) + existing bearer
tokens for headless callers.
- Transport: httpOnly cookie session for browser, Bearer for API.
- RBAC: admin / editor / viewer roles, plus an orthogonal
is_client flag for external (agency, talent, customer) accounts.
- Bootstrap: ADMIN_BOOTSTRAP_USER + ADMIN_BOOTSTRAP_PASSWORD env
seed the first admin on a clean install. Set ADMIN_BOOTSTRAP_RESET
to force-reset the named user (break-glass).
- Rate limit: in-memory, 10 fails per 15min per (IP, username).
- Password policy: ≥8 chars, mixed case, digit, symbol; small
blocklist of common passwords; cannot equal username.
- Self-service: change own display name + password. Everything
else (role, is_client, other-user mgmt) is admin only.
- Audit log: append-only table, indexed by actor + event_type +
created_at, populated by every auth/admin event.
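The rate-limit item above (in-memory, 10 fails per 15min per
(IP, username)) could be sketched as follows. The sliding-window
behaviour, the lowercased username key, and all function names are
assumptions made for illustration.

```javascript
const MAX_FAILS = 10;
const WINDOW_MS = 15 * 60 * 1000;
const failures = new Map(); // "ip|username" -> array of failure timestamps

const keyOf = (ip, username) => `${ip}|${String(username).toLowerCase()}`;

// Drop timestamps older than the window and return what is left.
function recentFailures(key, now) {
  const recent = (failures.get(key) || []).filter((t) => now - t < WINDOW_MS);
  if (recent.length) failures.set(key, recent);
  else failures.delete(key);
  return recent;
}

function isLocked(ip, username, now = Date.now()) {
  return recentFailures(keyOf(ip, username), now).length >= MAX_FAILS;
}

function recordFailure(ip, username, now = Date.now()) {
  const key = keyOf(ip, username);
  const recent = recentFailures(key, now);
  recent.push(now);
  failures.set(key, recent);
}

// Clear the counter for this (IP, username) pair.
function clearFailures(ip, username) {
  failures.delete(keyOf(ip, username));
}
```

Being in-memory, the counters reset on restart and are not shared
between processes.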
Files added:
- services/mam-api/src/db/migrations/022-auth-rework.sql
users.is_client + last_login_at + failed_attempts; audit_log
table with FK to users (ON DELETE SET NULL).
- services/mam-api/src/middleware/audit.js
Fire-and-forget audit() helper. Caller never awaits, failure
logs but never throws — auditing cannot break the request
that triggered it.
- services/mam-api/src/middleware/passwordPolicy.js
Shared checkPassword(pw, { username }) used by setup, user
create/update, and self-service password change.
- services/mam-api/src/tasks/bootstrapAdmin.js
Runs after migrations. No-ops unless ADMIN_BOOTSTRAP_USER +
ADMIN_BOOTSTRAP_PASSWORD are set AND (users table empty OR
ADMIN_BOOTSTRAP_RESET=true).
- services/mam-api/src/routes/audit.js
Admin-only GET /audit (paginated, filter by event_type /
actor / target / date) and GET /audit/event-types.
- services/web-ui/public/modal-account-settings.jsx
Profile + Password tabs. Triggered by sidebar user button.
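A minimal sketch of the shared checkPassword(pw, { username }) helper,
following the policy in the scope list. The { ok, errors } return shape,
the error strings, and the blocklist entries are assumptions; the real
blocklist is not shown in these notes.

```javascript
// Tiny illustrative blocklist (entries are placeholders).
const COMMON_PASSWORDS = new Set(['password1!', 'passw0rd!', 'qwerty123!']);

function checkPassword(pw, { username } = {}) {
  const s = typeof pw === 'string' ? pw : '';
  const errors = [];
  if (s.length < 8) errors.push('at least 8 characters');
  if (!/[a-z]/.test(s) || !/[A-Z]/.test(s)) errors.push('upper and lower case letters');
  if (!/[0-9]/.test(s)) errors.push('a digit');
  if (!/[^A-Za-z0-9]/.test(s)) errors.push('a symbol');
  if (COMMON_PASSWORDS.has(s.toLowerCase())) errors.push('not a common password');
  if (username && s.toLowerCase() === String(username).toLowerCase()) {
    errors.push('must differ from the username');
  }
  return { ok: errors.length === 0, errors };
}
```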
Files rewritten:
- services/mam-api/src/routes/auth.js
- POST /login: regenerate(), no manual save(); audit success/
fail/lockout; updates last_login_at + failed_attempts.
- POST /logout: destroys session, audits logout.
- GET /me: returns is_client + last_login_at. Synthetic admin
when AUTH_ENABLED=false.
- GET /setup-status: drives login.html UI state.
- POST /setup: blocked once any user exists; password policy.
- POST /password: self-service. Requires the current password,
runs the policy, and audits. Other sessions are invalidated only
when an admin changes the password via users.js.
- PATCH /me: self-service display_name update.
- services/mam-api/src/routes/users.js
- is_client field in create/update/list/get.
- Guardrails: cannot delete or demote last admin, cannot
delete self, admins cannot be flagged is_client.
- Password change invalidates all sessions for that user
(DELETE FROM sessions WHERE sess->>'userId' = id).
- Audit on every mutation.
- Password policy enforced.
- services/mam-api/src/middleware/auth.js
- requireAuth now exposes req.user.is_client.
- New requireRole(["admin","editor"], { rejectClients: true })
helper. Applied to cluster, sdk, capture routes (infra).
- Synthetic user when AUTH_ENABLED=false has is_client=false.
- services/mam-api/src/index.js
- Loads bootstrap admin after migrations.
- Wires /api/v1/audit.
- Cleans up an earlier comment block.
- services/web-ui/public/login.html
- Password hint added next to setup-mode password field.
- services/web-ui/public/shell.jsx
- Sidebar user footer is a button that opens AccountSettings.
- CLIENT badge next to role when is_client=true.
- Nav filters: clients lose ingest tree + jobs + editor;
viewers lose ingest + editor; only admins see the Admin
section. Power button hidden when synthetic user.
- services/web-ui/public/screens-admin.jsx
- Users table: new Client column with inline toggle.
- InviteUserModal: Client checkbox + password hint, gated
off when role=admin.
- Last login column replaces Created in primary view.
- CSV export includes client + last_login.
- services/web-ui/public/data.jsx
- ZAMPP_DATA.ME carries is_client + display_name.
- services/web-ui/public/index.html
- Loads dist/modal-account-settings.js.
- services/web-ui/public/styles-rest.css
- .user-row grid widened to 6 columns.
- docker-compose.yml
- Plumbs SESSION_COOKIE_SECURE + ADMIN_BOOTSTRAP_* env vars.
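The requireRole helper listed under middleware/auth.js could be shaped
roughly like this Express-style middleware factory. A sketch only: it
assumes requireAuth has already set req.user with { role, is_client },
and the 401/403 responses are illustrative choices.

```javascript
// Allow the request only for the listed roles; optionally reject
// accounts flagged is_client even when their role would qualify.
function requireRole(roles, { rejectClients = false } = {}) {
  return function roleGuard(req, res, next) {
    const user = req.user;
    if (!user) return res.status(401).json({ error: 'unauthenticated' });
    if (!roles.includes(user.role)) return res.status(403).json({ error: 'forbidden' });
    if (rejectClients && user.is_client) return res.status(403).json({ error: 'forbidden' });
    return next();
  };
}
```

Usage would then match the call shown above, e.g.
router.use(requireRole(["admin","editor"], { rejectClients: true })).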
Deploy:
cd /opt/wild-dragon
git pull origin main
# In .env:
# AUTH_ENABLED=true
# SESSION_SECRET=<openssl rand -hex 48>
# ADMIN_BOOTSTRAP_USER=admin
# ADMIN_BOOTSTRAP_PASSWORD=<strong>
docker compose build mam-api web-ui
docker compose up -d --force-recreate --no-deps mam-api web-ui
Login was returning 200 + correct user JSON + writing a row to the
sessions table, but emitting zero Set-Cookie headers. Root cause:
session.regenerate() → set fields → session.save() → res.json()
Calling session.save() manually writes the store but bypasses
express-session's res.end() hook, which is the only path that adds
the Set-Cookie header to the response. The cookie was never sent to
the browser even though the session existed server-side — hence the
redirect loop.
Fix: remove the manual save(). Set the session fields and call
res.json() directly inside regenerate()'s callback; express-session
handles store write + Set-Cookie automatically on res.end().
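In outline, the fixed handler body looks something like the sketch
below. finishLogin and the response fields are hypothetical; req.session
is assumed to follow express-session's API, where regenerate(cb) swaps
in a fresh session before invoking cb(err).

```javascript
// Set session fields inside regenerate()'s callback and respond
// directly, with no manual req.session.save() call.
function finishLogin(req, res, user) {
  req.session.regenerate((err) => {
    if (err) return res.status(500).json({ error: 'session error' });
    req.session.userId = user.id;
    // The store write and Set-Cookie are left to express-session when
    // the response ends, as described above.
    return res.json({ id: user.id, username: user.username });
  });
}
```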
The redirect loop after successful login was almost certainly the
`sessions` table never being created. `schema.sql` defines it but
only runs on first-init via the postgres entrypoint; instances
bootstrapped via mam-api's own migration loop never got the table.
express-session's `req.session.save()` then failed silently and the
cookie pointed at a sid that wasn't in the store — every subsequent
request looked like a brand-new visitor.
- New migration 021-ensure-sessions-table.sql (idempotent).
- connect-pg-simple now configured with `createTableIfMissing: true`
as belt-and-braces.
- `POST /auth/login` now explicitly waits for session.save() and
surfaces both regenerate() and save() errors instead of treating
them as 'success'. Logs sid + req.secure + req.protocol so we can
confirm trust-proxy is doing the right thing behind NPM.
Four concrete issues kept the login flow broken on dragonflight.live:
1. mam-api trusted no proxy headers, so behind nginx/Cloudflare the
session cookie's `secure` flag and the rate-limiter's IP keying
both saw the wrong values. Now sets `app.set('trust proxy', 1)`.
2. Session config was tied to NODE_ENV and lacked sameSite/name. Now:
- SESSION_COOKIE_SECURE env (default: true when AUTH_ENABLED) so a
site behind HTTPS gets Secure cookies regardless of NODE_ENV.
- `sameSite: 'lax'` for predictable post-login redirects.
- Renamed to `df.sid` so it's obvious in DevTools.
- `rolling: true` extends the 7-day TTL on active use.
- SESSION_SECRET is now required when AUTH_ENABLED=true; the
server refuses to start with a dev default in prod.
3. login.html silently showed the sign-in panel even when no users
exist or auth is off:
- New GET /auth/setup-status reports {needs_setup, user_count,
auth_enabled}.
- login.html calls it on load and auto-flips into setup mode when
needs_setup is true, or shows an explicit "auth is off" flash
when auth_enabled is false (the previous symptom: logout button
did nothing because /auth/me returned a synthetic admin no matter
what).
- Added a `.flash.info` style for the new neutral notice.
4. Sidebar logout used to call /auth/logout and then
`window.location.reload()`. With auth off, that reload landed back on
the synthetic-admin app and looked like nothing happened. It now
redirects to /login.html in all states so the operator sees feedback
(and the server-side messaging about auth being off) instead of a no-op.
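The session settings from item 2 can be summarised as an options
builder. buildSessionOptions is a hypothetical helper; name, secret,
resave, saveUninitialized, rolling and cookie are express-session
options, but the resave/saveUninitialized values and the fallback
secret used when auth is off are assumptions.

```javascript
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Build express-session options from environment variables.
// Pair with app.set('trust proxy', 1) so Secure cookies work behind a proxy.
function buildSessionOptions(env) {
  const authEnabled = env.AUTH_ENABLED === 'true';
  if (authEnabled && !env.SESSION_SECRET) {
    throw new Error('SESSION_SECRET is required when AUTH_ENABLED=true');
  }
  // An explicit SESSION_COOKIE_SECURE wins; otherwise Secure follows AUTH_ENABLED.
  const secure =
    env.SESSION_COOKIE_SECURE !== undefined
      ? env.SESSION_COOKIE_SECURE === 'true'
      : authEnabled;
  return {
    name: 'df.sid',
    secret: env.SESSION_SECRET || 'dev-only-secret',
    resave: false,
    saveUninitialized: false,
    rolling: true, // refresh the 7-day TTL on active use
    cookie: { httpOnly: true, sameSite: 'lax', secure, maxAge: SEVEN_DAYS_MS },
  };
}
```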
Deploy notes for zampp1:
- Set AUTH_ENABLED=true and a random SESSION_SECRET in the
mam-api environment (e.g. /opt/wild-dragon/.env).
- Restart mam-api.
- First load of /login.html will auto-route to the setup form so
you can create the first admin.
- recorders: dispatch df:recorders-changed on create/start/stop/delete so the
list updates immediately instead of waiting for the 10s poll tick
- library: poll every 4s while any asset is live/processing (15s otherwise) and
listen for df:assets-changed so a stopped recorder's LIVE badge drops and
the thumbnail appears without a manual refresh
- auth: synthetic /auth/me (AUTH_ENABLED=false) now uses LOCAL_OPERATOR / USER /
USERNAME instead of hardcoding "Admin", and flags synthetic:true
- shell: Sidebar takes `me` as a prop, drops the misleading "Admin" fallback,
and surfaces an "auth off" hint when the response is synthetic
- jobs: replace the always-empty ETA column with a Time column that shows
queued/started/done/failed N ago (full timestamp on hover); widen column
- schedule: new month-calendar view (default) with events plotted on day cells
by status; clicking a day pre-fills the new-schedule modal with a 30-min
window on that day; List view kept behind a toggle
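The library polling rule and the recorder change event can be sketched
as below. The asset.status values 'live' and 'processing', the helper
names, and the EventTarget stand-in for window (so the snippet also
runs under Node) are assumptions; the 4s/15s intervals and the
df:recorders-changed event name come from the notes above.

```javascript
const FAST_POLL_MS = 4000;
const SLOW_POLL_MS = 15000;

// Poll quickly while any asset is live or processing, slowly otherwise.
function nextPollDelay(assets) {
  const busy = assets.some((a) => a.status === 'live' || a.status === 'processing');
  return busy ? FAST_POLL_MS : SLOW_POLL_MS;
}

// Use window in the browser; fall back to a plain EventTarget elsewhere.
const bus = typeof window !== 'undefined' ? window : new EventTarget();

// Called after create/start/stop/delete so listeners can refresh at once.
function notifyRecordersChanged() {
  bus.dispatchEvent(new Event('df:recorders-changed'));
}
```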
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>