How to Test MFA and 2FA Flows

Short answer

MFA adds TOTP clock skew, one-time backup codes, SMS delivery, and step-up prompts to the test surface. A login spec that passes without exercising the second factor does not prove that enforcement works. Seed TOTP secrets and generate codes with otplib, stub SMS in CI, use the Playwright clock for browser-side time windows, and assert against protected APIs. Do not depend on a shared authenticator app tied to one phone.

Who this is for

Teams shipping TOTP authenticator apps, SMS OTP, WebAuthn/passkeys, or backup codes (Auth0 MFA, Okta MFA, Firebase multi-factor, Duo, custom TOTP) who need Playwright E2E that covers enrollment, step-up, recovery, and lockout—not tests that disable MFA globally in CI.

Typical stacks: Auth0 Guardian, Okta Verify, Firebase multiFactor, AWS Cognito MFA, self-hosted TOTP with @otplib/preset-default.

Why testing MFA matters

MFA bugs are high severity because they bypass your last line of defense:

Revenue loss — "Remember this device" lasts forever; step-up never triggers on wire transfer after 90 days.
Security incidents — backup codes reusable; TOTP window accepts codes from ±24 hours; SMS OTP logged in plaintext; MFA skipped when X-Forwarded-For spoofed.
Support load — clock skew on VMs rejects valid codes; users burn backup codes during failed enrollment; recovery email loop when device lost.
Compliance exposure — PCI/SOC2 requires proof MFA enforced for admin roles; audit finds MFA toggled off via client-side flag manipulation.

E2E tests must assert that the API rejects sensitive actions without step-up, not only that a 6-digit input appears on screen.

Complexity map

Scenario	Edge case	Why tests break	Approach
TOTP enrollment	QR secret not captured	Cannot generate codes	Seed secret via API; otplib
TOTP clock skew	VM time drift	Valid code rejected	NTP sync; wider window in test env only
TOTP replay	Same code twice	Should fail second use	Submit code twice
Backup codes	Single-use	Reuse accepted	Second attempt 401
SMS OTP	Real SMS cost	Untested	Twilio test creds / stub webhook
WebAuthn	Requires authenticator	Cannot run headless	Virtual authenticator OR API bypass tier
Step-up on action	Transfer vs read	Only login MFA tested	Probe POST /transfer without step-up
Remember device	Cookie lasts too long	No re-prompt	Clock forward 31 days
MFA bypass flag	`?skipMfa=1` in test	Ships to prod	Negative URL tamper test
Lost device recovery	Backup email link	Untested	Mailtrap recovery path
Admin MFA policy	Role without MFA	Privileged access	Seed admin without MFA → 403
Rate limit OTP	Brute force 6-digit	Lockout untested	Rapid wrong codes
Push notification MFA	Cannot automate Duo push	Skipped entirely	Nightly manual or vendor test API
Concurrent sessions	MFA on one device only	Confusing UX	Two browser contexts
Firebase MFA	Phone second factor	SMS stub needed	Emulator test numbers
Okta MFA enrollment	Factor setup redirect	Flaky UI	Okta API enroll factor in Arrange
Disable MFA	Re-auth required	Attacker disables	Probe requires password + MFA
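
The "Rate limit OTP" row is easier to test once the expected lockout contract is written down. Below is a minimal sketch of that contract as an in-memory model. The class name `OtpAttemptLimiter`, the threshold of 5 failures, and the 15-minute lock are all assumptions for illustration, not values from any vendor. Your server's real policy decides what the "rapid wrong codes" spec should assert.

```typescript
type AttemptResult = 'ok' | 'invalid' | 'locked';

// Hypothetical model of an OTP lockout policy (all thresholds are assumptions).
class OtpAttemptLimiter {
  private failures = new Map<string, number>();
  private lockedUntil = new Map<string, number>();
  private readonly maxFailures: number;
  private readonly lockMs: number;

  constructor(maxFailures = 5, lockMs = 15 * 60_000) {
    this.maxFailures = maxFailures;
    this.lockMs = lockMs;
  }

  // Records one verification attempt at time nowMs (epoch milliseconds).
  attempt(userId: string, codeIsValid: boolean, nowMs: number): AttemptResult {
    // While locked, even a correct code is refused.
    if ((this.lockedUntil.get(userId) ?? 0) > nowMs) return 'locked';
    if (codeIsValid) {
      this.failures.delete(userId);
      return 'ok';
    }
    const count = (this.failures.get(userId) ?? 0) + 1;
    this.failures.set(userId, count);
    if (count >= this.maxFailures) {
      // Start the lock and reset the counter for the next window.
      this.lockedUntil.set(userId, nowMs + this.lockMs);
      this.failures.delete(userId);
    }
    return 'invalid';
  }
}
```

The property worth probing end to end is the one in the first line of `attempt`: once the account is locked, a valid code must still be rejected until the lock expires.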

Tools and libraries

Tool	Use case	Docs
otplib	Generate TOTP codes from secret in tests	otplib API
Playwright `clock`	Advance time for remember-device expiry	Playwright clock
Twilio test credentials	SMS without real send	Magic numbers
Auth0/Okta Management API	Enroll MFA factors programmatically	Vendor docs
WebAuthn virtual authenticator	Chrome CDP in headed tests	Playwright WebAuthn

TOTP enrollment and login

Seed secret in Arrange (preferred)

Expose a test route or use the Management API to set a known TOTP secret:

import { test, expect } from '@playwright/test';
import { authenticator } from 'otplib'; // assumes the otplib v12 API

// runId: unique per-run identifier provided by your own fixtures
test('login with seeded TOTP secret', async ({ page, request }) => {
  // Arrange: enroll user with known secret
  const secret = authenticator.generateSecret();
  await request.post('/api/test/enroll-totp', {
    data: { runId, secret, verified: true },
  });

  // Act: login with MFA
  await page.goto('/login');
  await page.getByLabel('Email').fill(`e2e-${runId}@test.local`);
  await page.getByLabel('Password').fill(`pw-${runId}`);
  await page.getByRole('button', { name: 'Sign in' }).click();
  // Generate the code right before use so it belongs to the current time step
  await page.getByLabel('Authentication code').fill(authenticator.generate(secret));
  await page.getByRole('button', { name: 'Verify' }).click();

  // Assert: the session works against the API, not only the URL
  await page.waitForURL('/dashboard');
  expect((await page.request.get('/api/me')).status()).toBe(200);
});

Test enrollment UI separately

Reserve UI enrollment specs for QR display, manual secret entry, and invalid-code errors. Use a known secret from a test shim that mirrors the production enrollment API.

test('invalid TOTP rejected during enrollment', async ({ page }) => {
  await loginAsNewUser(page, runId);
  await page.goto('/settings/security/mfa');
  await page.getByRole('button', { name: 'Set up authenticator' }).click();
  await page.getByLabel('Verification code').fill('000000');
  await page.getByRole('button', { name: 'Verify' }).click();
  await expect(page.getByText(/invalid code/i)).toBeVisible();
  const factors = await page.request.get('/api/me/mfa-factors');
  expect((await factors.json()).totp).toBeFalsy();
});

Clock skew and time windows

TOTP validators typically accept the current 30-second step plus ±1 adjacent step. CI VMs whose clocks drift beyond that tolerance cause flakes. Note that page.clock controls only the browser page; the server validates codes against its own clock, so the expired-code spec mints a code for a past timestamp instead of fast-forwarding the page.

test('TOTP accepts code within valid window', async ({ page }) => {
  const secret = await enrollTotpForRunId(runId);
  // otplib v12: generate(secret) uses the current time of the test process
  const token = authenticator.generate(secret);
  // ... submit token
});

test('TOTP rejects expired code after window', async ({ page }) => {
  const secret = await enrollTotpForRunId(runId);
  // Mint a code from two minutes ago, well past a 30s step plus ±1 skew.
  // Assumes otplib v12, where clone() overrides options and epoch is in milliseconds.
  const stale = authenticator.clone({ epoch: Date.now() - 2 * 60 * 1000 }).generate(secret);
  await submitTotp(page, stale);
  await expect(page.getByText(/invalid/i)).toBeVisible();
});

Do not widen TOTP windows in production to fix tests—fix CI time sync or use Arrange secrets.
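
To see why a ±1 step tolerance absorbs small drift but not minutes of it, here is a minimal sketch of TOTP generation and windowed verification following RFC 6238 and RFC 4226, using only Node's crypto module. The function names `hotp`, `totp`, and `verifyTotp` are illustrative, and the key is passed as raw bytes (otplib's authenticator takes a base32 string instead). It is a reading aid, not a replacement for your validator.

```typescript
import { createHmac } from 'node:crypto';

// RFC 4226 HOTP: HMAC-SHA1 over an 8-byte big-endian counter, then dynamic truncation.
function hotp(key: Buffer, counter: number, digits = 6): string {
  const msg = Buffer.alloc(8);
  msg.writeBigUInt64BE(BigInt(counter));
  const mac = createHmac('sha1', key).update(msg).digest();
  const offset = mac[mac.length - 1] & 0x0f;
  const bin = mac.readUInt32BE(offset) & 0x7fffffff;
  return (bin % 10 ** digits).toString().padStart(digits, '0');
}

// RFC 6238 TOTP: the counter is the number of stepSec intervals since the Unix epoch.
function totp(key: Buffer, epochMs: number, stepSec = 30): string {
  return hotp(key, Math.floor(epochMs / 1000 / stepSec));
}

// Accepts a code that matches the current step or any step within ±window.
function verifyTotp(key: Buffer, code: string, nowMs: number, window = 1, stepSec = 30): boolean {
  for (let drift = -window; drift <= window; drift++) {
    if (totp(key, nowMs + drift * stepSec * 1000, stepSec) === code) return true;
  }
  return false;
}
```

With the RFC test key `12345678901234567890` (ASCII) and a timestamp of 59 seconds, `totp` yields the RFC 4226 value for counter 1. That code still verifies 30 seconds later under `window = 1` and fails two minutes later, which is the behaviour the two specs above expect from the server.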

Backup codes

Test	Assert
Generate backup codes	8–10 codes returned exactly once; a follow-up API probe does not return them in plaintext
Login with backup code	Session established; code marked used
Reuse backup code	Rejected; probe 401
Regenerate codes	Old codes invalidated

test('backup code is single-use', async ({ page, request }) => {
  const { codes } = await request.post('/api/test/mint-backup-codes', {
    data: { runId, count: 5 },
  }).then(r => r.json());

  await loginWithBackupCode(page, runId, codes[0]);
  await page.context().clearCookies();

  await loginWithBackupCode(page, runId, codes[0]);
  await expect(page.getByText(/invalid|already used/i)).toBeVisible();
});
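
The assertions in the table describe a server-side contract: codes are shown once, each works once, and regeneration invalidates the old set. The sketch below models that contract in memory so the expected outcomes are unambiguous. `BackupCodeStore` and its 8-digit code format are hypothetical, and hashing with plain SHA-256 is an assumption for brevity. It is not how any particular vendor stores codes.

```typescript
import { createHash, randomInt } from 'node:crypto';

// Hypothetical reference model of single-use backup codes.
class BackupCodeStore {
  // Only hashes are kept, so a probe of stored state never yields plaintext codes.
  private hashes = new Set<string>();

  private static hash(code: string): string {
    return createHash('sha256').update(code).digest('hex');
  }

  // Replaces any existing codes; the plaintext is returned once, here, and never again.
  regenerate(count = 8): string[] {
    this.hashes.clear();
    const codes: string[] = [];
    while (codes.length < count) {
      const code = randomInt(0, 10 ** 8).toString().padStart(8, '0');
      const digest = BackupCodeStore.hash(code);
      if (this.hashes.has(digest)) continue; // skip the rare duplicate
      this.hashes.add(digest);
      codes.push(code);
    }
    return codes;
  }

  // Returns true exactly once per valid code, because redeeming deletes it.
  redeem(code: string): boolean {
    return this.hashes.delete(BackupCodeStore.hash(code));
  }
}
```

Each row of the table maps to one call sequence on this model: `redeem` twice with the same code for reuse, and `regenerate` followed by `redeem` with an old code for invalidation.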

SMS OTP stubbing

Never send real SMS in CI for every spec.

Twilio test credentials — use magic numbers that simulate delivery
Vendor stub — Auth0/Okta test tenant with fixed OTP in lab mode
Webhook capture — test server records OTP for expect.poll retrieval

// Poll test OTP endpoint populated by your SMS webhook in staging
let otp = '';
await expect.poll(async () => {
  const res = await request.get(`/api/test/last-sms-otp?phone=${encodeURIComponent(phone)}`);
  otp = (await res.json()).code ?? '';
  return otp;
}, { timeout: 15_000 }).toMatch(/^\d{6}$/);
// otp now holds the 6-digit code; the expect.poll chain itself resolves to undefined

Firebase phone MFA: use Auth emulator test numbers.
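
If your webhook capture stores the raw SMS body rather than the code itself, the test server needs to pull the code out before the poll endpoint can return it. A minimal sketch, assuming the message contains exactly one standalone 6-digit number; `extractOtp` and the sample message text are hypothetical:

```typescript
// Returns the first standalone 6-digit run in the message, or null if none exists.
// Longer digit runs such as phone numbers are ignored by the lookarounds.
function extractOtp(smsBody: string): string | null {
  const match = smsBody.match(/(?<!\d)\d{6}(?!\d)/);
  return match ? match[0] : null;
}
```

For a body like "Your code is 482913. It expires in 10 minutes." this returns the six digits, and it returns null for a message that contains only a phone number.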

Step-up authentication

Sensitive actions (password change, API key create, billing update) should require fresh MFA even when session exists:

test('wire transfer requires step-up MFA', async ({ page }) => {
  await loginWithMfa(page, runId); // completed MFA at login

  // page.request shares cookies with the logged-in browser context;
  // the standalone `request` fixture would send these calls unauthenticated
  const api = page.request;
  const transferBody = { amount: 10000, toAccount: 'external' };

  const transfer = await api.post('/api/transfers', { data: transferBody });
  expect(transfer.status()).toBe(403);
  expect(await transfer.json()).toMatchObject({ code: 'STEP_UP_REQUIRED' });

  const secret = await getTotpSecret(runId);
  const stepUp = await api.post('/api/auth/step-up', {
    data: { totp: authenticator.generate(secret) },
  });
  expect(stepUp.status()).toBe(200);

  const retry = await api.post('/api/transfers', { data: transferBody });
  expect(retry.status()).toBe(200);
});

Probe the API, not only a modal appearance.
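
The spec above implies a rule on the server: login MFA alone does not authorize a sensitive action, and a step-up grant is only good for a limited time. The sketch below writes that rule down as a pure function. The `Session` shape, the field `stepUpVerifiedAt`, and the 5-minute freshness window are assumptions for illustration; only the 403 status and the `STEP_UP_REQUIRED` code come from the spec.

```typescript
// Hypothetical session shape: stepUpVerifiedAt is epoch ms of the last step-up, or null.
interface Session {
  userId: string;
  stepUpVerifiedAt: number | null;
}

type Decision = { status: 200 } | { status: 403; code: 'STEP_UP_REQUIRED' };

// Allows the action only when a step-up happened within maxAgeMs of nowMs.
function authorizeSensitiveAction(session: Session, nowMs: number, maxAgeMs = 5 * 60_000): Decision {
  const fresh =
    session.stepUpVerifiedAt !== null && nowMs - session.stepUpVerifiedAt <= maxAgeMs;
  return fresh ? { status: 200 } : { status: 403, code: 'STEP_UP_REQUIRED' };
}
```

A session straight after login has `stepUpVerifiedAt: null` and gets the 403. The same session gets 200 shortly after step-up and 403 again once the window has passed, which is a third case worth adding to the spec.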

WebAuthn / passkeys (tiered)

WebAuthn in headless CI requires a virtual authenticator, attached here through a CDP session (Chromium only):

import { test, expect } from '@playwright/test';

test('register passkey and login', async ({ page, context }) => {
  const client = await context.newCDPSession(page);
  await client.send('WebAuthn.enable');
  const { authenticatorId } = await client.send('WebAuthn.addVirtualAuthenticator', {
    options: {
      protocol: 'ctap2',
      transport: 'internal',
      hasResidentKey: true,
      hasUserVerification: true,
      isUserVerified: true,
    },
  });

  await page.goto('/settings/security/passkeys');
  await page.getByRole('button', { name: 'Add passkey' }).click();
  // WebAuthn prompt auto-satisfied by virtual authenticator
  await expect(page.getByText(/passkey added/i)).toBeVisible();
});

Default PR CI: TOTP via otplib for coverage; nightly headed job for WebAuthn if prod relies on passkeys.

MFA policy by role

Map to RBAC guide:

Admin role must have MFA enrolled — probe admin API 403 without MFA
Viewer role optional MFA — login without second factor allowed
MFA enrollment deadline — grace period then blocked
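
The three rules above can be read as one decision function, which makes the seed data for each probe obvious. This is a sketch under assumptions: the `User` shape and the `enrollBy` field are hypothetical, and how the grace period interacts with roles is a guess (here it applies only to non-admin users). Adjust it to your actual policy.

```typescript
type Role = 'admin' | 'viewer';

// Hypothetical user shape; enrollBy is an optional grace deadline in epoch ms.
interface User {
  role: Role;
  mfaEnrolled: boolean;
  enrollBy?: number;
}

// Returns the HTTP status a protected endpoint is expected to produce.
function accessStatus(user: User, nowMs: number): 200 | 403 {
  if (user.mfaEnrolled) return 200;
  // Admins are never allowed in without MFA.
  if (user.role === 'admin') return 403;
  // Other roles are blocked only once their grace period has ended.
  if (user.enrollBy !== undefined && nowMs > user.enrollBy) return 403;
  return 200;
}
```

Each branch corresponds to one seeded user: an admin without MFA, a viewer without MFA and no deadline, and a viewer whose deadline has passed.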

CI checklist

otplib generates codes from Arrange secrets—no shared Google Authenticator
Unique user per worker with own TOTP secret
SMS stubbed; Firebase emulator for phone MFA
Step-up specs probe API, not modal only
Backup code single-use and regeneration tested
Playwright clock for remember-device expiry
Document nightly WebAuthn / push MFA jobs

Anti-patterns

Anti-pattern	Why it fails	Better approach
`DISABLE_MFA=true` in CI	Ships without MFA	Test tenant + otplib
Shared TOTP on engineer phone	Cannot parallelize	Per-run secrets
Assert 6-digit input visible	Bypass via API	Probe protected action
Skip backup code reuse	Account recovery hack	Single-use spec
Ignore step-up	Wire fraud	POST sensitive action probe
Real SMS every spec	Cost + flake	Stub webhook
`waitForTimeout` for OTP SMS	Slow/flaky	expect.poll inbox/webhook

Example scenario

Situation: Logged-in user with MFA attempts a $10,000 wire transfer without completing step-up.

Expected outcome: Transfer blocked with STEP_UP_REQUIRED—no funds moved.

Why UI-only automation breaks: Transfer button disabled in UI but API POST succeeds—test never calls API.

Arrange: User with MFA enrolled; valid session cookie from login 2 hours ago.
Act: POST /api/transfers directly without step-up token.
Assert: 403 STEP_UP_REQUIRED; after valid TOTP step-up, POST returns 200 and audit log records both events.

TestChimp workflow: Instrument mfa_challenge with mfa_method and recovery_path; compare prod step-up rate vs test.

Same Arrange/Act/Assert pattern as expired-coupon checkout.

Connect scenarios to your QA workflow

Capture business rules in markdown test plans and enforce them with seed routes and probe Assert. Link SmartTests with // @Scenario: for requirement traceability. Use /testchimp test on PRs; /testchimp explore on SmartTest paths for non-functional gaps (ExploreChimp).

Auth0 and Okta SSO — IdP-enforced MFA
Firebase Authentication — phone second factor
RBAC permissions — MFA required for admin role
Session timeout — step-up vs session expiry
Magic links — passwordless + MFA combo
Transactional email — recovery emails

Frequently asked questions

How do I generate TOTP codes in Playwright without a phone?

Enroll users with a known secret via test API or Management API, then use otplib authenticator.generate(secret) in the test. Never share one Google Authenticator entry across parallel workers.

Should I disable MFA in CI?

No. Use test tenants, known TOTP secrets, and SMS stubs. Disabling MFA in CI means step-up and enrollment regressions ship to prod—exactly what MFA protects against.

How do I test step-up MFA for sensitive actions?

With a valid post-login session, POST to a sensitive API endpoint and expect 403 STEP_UP_REQUIRED. Complete step-up via TOTP API, then retry and expect 200. Do not rely on disabled buttons alone.

How do I test backup codes without burning real ones?

Use a test mint route that generates codes tied to runId. Assert first use succeeds, second fails, and regeneration invalidates old codes. Probe server state—not only UI.

Can WebAuthn run in headless CI?

Yes, on Chromium: attach a virtual authenticator through a CDP session, as in the passkey example. For default PR pipelines, cover MFA with TOTP via otplib and run WebAuthn nightly if passkeys dominate prod; check TrueCoverage mfa_method.

SMS OTP is slow and flaky in tests—what should I do?

Stub SMS webhooks to a test OTP retrieval endpoint, use Twilio test credentials, or Firebase Auth emulator phone numbers. Poll with expect.poll; never send real SMS per spec.

How does TestChimp help track MFA coverage?

TrueCoverage compares mfa_method and challenge_context in prod vs test. Use /testchimp evolve to add step-up and backup-code scenarios when admin MFA adoption rises—link SmartTests with // @Scenario: for audit traceability.

Apply these patterns in your repo

Run `/testchimp init` to connect TestChimp to your repo, then `/testchimp test` on PRs to turn these patterns into maintained SmartTests. Use `/testchimp evolve` when you want to expand coverage as your app grows.
