How to Test MFA and 2FA Flows

Short answer

MFA adds TOTP clock skew, one-time backup codes, SMS delivery, and step-up prompts, so a login test that passes without exercising the second factor does not prove enforcement works. Seed TOTP secrets and generate codes with otplib, stub SMS in CI, control time deliberately for window and expiry checks (the Playwright clock covers browser-side timers), and assert against protected APIs rather than relying on a shared authenticator app tied to one phone.

Who this is for

Teams shipping TOTP authenticator apps, SMS OTP, WebAuthn/passkeys, or backup codes (Auth0 MFA, Okta MFA, Firebase multi-factor, Duo, custom TOTP) who need Playwright E2E that covers enrollment, step-up, recovery, and lockout—not tests that disable MFA globally in CI.

Typical stacks: Auth0 Guardian, Okta Verify, Firebase multiFactor, AWS Cognito MFA, self-hosted TOTP with @otplib/preset-default.

Why testing MFA matters

MFA bugs are high severity because they bypass your last line of defense:

  • Revenue loss — "Remember this device" lasts forever; step-up never triggers on wire transfer after 90 days.
  • Security incidents — backup codes reusable; TOTP window accepts codes from ±24 hours; SMS OTP logged in plaintext; MFA skipped when X-Forwarded-For spoofed.
  • Support load — clock skew on VMs rejects valid codes; users burn backup codes during failed enrollment; recovery email loop when device lost.
  • Compliance exposure — PCI/SOC2 requires proof MFA enforced for admin roles; audit finds MFA toggled off via client-side flag manipulation.

E2E must assert API rejects sensitive actions without step-up—not only that a 6-digit input appears on screen.

Complexity map

| Scenario | Edge case | Why tests break | Approach |
| --- | --- | --- | --- |
| TOTP enrollment | QR secret not captured | Cannot generate codes | Seed secret via API; otplib |
| TOTP clock skew | VM time drift | Valid code rejected | NTP sync; wider window in test env only |
| TOTP replay | Same code twice | Should fail second use | Submit code twice |
| Backup codes | Single-use | Reuse accepted | Second attempt 401 |
| SMS OTP | Real SMS cost | Untested | Twilio test creds / stub webhook |
| WebAuthn | Requires authenticator | Cannot run headless | Virtual authenticator OR API bypass tier |
| Step-up on action | Transfer vs read | Only login MFA tested | Probe POST /transfer without step-up |
| Remember device | Cookie lasts too long | No re-prompt | Clock forward 31 days |
| MFA bypass flag | ?skipMfa=1 in test | Ships to prod | Negative URL tamper test |
| Lost device recovery | Backup email link | Untested | Mailtrap recovery path |
| Admin MFA policy | Role without MFA | Privileged access | Seed admin without MFA → 403 |
| Rate limit OTP | Brute force 6-digit | Lockout untested | Rapid wrong codes |
| Push notification MFA | Cannot automate Duo push | Skipped entirely | Nightly manual or vendor test API |
| Concurrent sessions | MFA on one device only | Confusing UX | Two browser contexts |
| Firebase MFA | Phone second factor | SMS stub needed | Emulator test numbers |
| Okta MFA enrollment | Factor setup redirect | Flaky UI | Okta API enroll factor in Arrange |
| Disable MFA | Re-auth required | Attacker disables | Probe requires password + MFA |

Tools and libraries

| Tool | Use case | Docs |
| --- | --- | --- |
| otplib | Generate TOTP codes from secret in tests | otplib API |
| Playwright clock | Advance time for remember-device expiry | Playwright clock |
| Twilio test credentials | SMS without real send | Magic numbers |
| Auth0/Okta Management API | Enroll MFA factors programmatically | Vendor docs |
| WebAuthn virtual authenticator | Chrome CDP in headed tests | Playwright WebAuthn |

TOTP enrollment and login

Seed secret in Arrange (preferred)

Expose test route or use Management API to set known TOTP secret:

import { test, expect } from '@playwright/test';
import { authenticator } from 'otplib';

test('login with seeded TOTP secret', async ({ page, request }) => {
  // Arrange: enroll the user with a known secret through a test-only route
  const secret = authenticator.generateSecret();
  await request.post('/api/test/enroll-totp', {
    data: { runId, secret, verified: true },
  });

  // Act: login with MFA
  await page.goto('/login');
  await page.getByLabel('Email').fill(`e2e-${runId}@test.local`);
  await page.getByLabel('Password').fill(`pw-${runId}`);
  await page.getByRole('button', { name: 'Sign in' }).click();
  // Generate the code right before use so it cannot expire during navigation
  const token = authenticator.generate(secret);
  await page.getByLabel('Authentication code').fill(token);
  await page.getByRole('button', { name: 'Verify' }).click();

  // Assert: the session works against the API, not only in the UI
  await page.waitForURL('/dashboard');
  expect((await page.request.get('/api/me')).status()).toBe(200);
});

Test enrollment UI separately

Reserve UI enrollment specs for QR display, manual secret entry, and invalid-code errors. Where a spec needs a valid code, use a known secret from a test shim that mirrors the production enrollment API.

test('invalid TOTP rejected during enrollment', async ({ page }) => {
  await loginAsNewUser(page, runId);
  await page.goto('/settings/security/mfa');
  await page.getByRole('button', { name: 'Set up authenticator' }).click();
  await page.getByLabel('Verification code').fill('000000');
  await page.getByRole('button', { name: 'Verify' }).click();
  await expect(page.getByText(/invalid code/i)).toBeVisible();
  // Probe server state: no TOTP factor should have been saved
  const factors = await page.request.get('/api/me/mfa-factors');
  expect((await factors.json()).totp).toBeFalsy();
});

Clock skew and time windows

TOTP validators typically accept codes from ±1 time step around the current one, with a 30-second step. CI VMs whose clocks drift by more than that reject valid codes and cause flakes.

// page.clock only changes time inside the browser page. The server validates
// TOTP against its own clock, so the stale code below is produced with otplib's
// epoch option (milliseconds, otplib v12 API) instead of fast-forwarding the page.
test('TOTP accepts code within valid window', async ({ page }) => {
  const secret = await enrollTotpForRunId(runId);
  const token = authenticator.generate(secret); // code for the current step
  // ... submit token
});

test('TOTP rejects expired code after window', async ({ page }) => {
  const secret = await enrollTotpForRunId(runId);
  // Code for a step two minutes in the past: well outside a ±1 step window
  const stale = authenticator.clone({ epoch: Date.now() - 2 * 60 * 1000 });
  const token = stale.generate(secret);
  await submitTotp(page, token);
  await expect(page.getByText(/invalid/i)).toBeVisible();
});

Do not widen TOTP windows in production to fix tests—fix CI time sync or use Arrange secrets.
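
To make the ±1 step window concrete, the sketch below computes RFC 6238 codes with Node's built-in crypto module and checks a token against neighbouring steps. It illustrates what a validator does and is not otplib's implementation; `totpAt` and `verifyWithWindow` are hypothetical names, and the key is passed as raw bytes rather than the base32 string an authenticator app would use.

```typescript
import { createHmac } from 'node:crypto';

const STEP_SECONDS = 30;

// RFC 6238 TOTP (HMAC-SHA1) for a raw key at a given time in milliseconds.
export function totpAt(key: Buffer, epochMs: number, digits = 6): string {
  const counter = Math.floor(epochMs / 1000 / STEP_SECONDS);
  const msg = Buffer.alloc(8);
  msg.writeBigUInt64BE(BigInt(counter));
  const mac = createHmac('sha1', key).update(msg).digest();
  // Dynamic truncation: the low nibble of the last byte picks a 4-byte slice.
  const offset = mac[mac.length - 1] & 0x0f;
  const binary = mac.readUInt32BE(offset) & 0x7fffffff;
  return (binary % 10 ** digits).toString().padStart(digits, '0');
}

// Accept the token if it matches any step within `window` steps of now.
export function verifyWithWindow(
  key: Buffer,
  token: string,
  nowMs: number,
  window = 1,
): boolean {
  for (let i = -window; i <= window; i++) {
    const t = nowMs + i * STEP_SECONDS * 1000;
    if (t < 0) continue; // no steps exist before the Unix epoch
    if (totpAt(key, t) === token) return true;
  }
  return false;
}
```

With a window of 1, a token minted one step earlier still verifies, while one minted three steps earlier does not. That is the behavior an expired-code spec should probe.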

Backup codes

| Test | Assert |
| --- | --- |
| Generate backup codes | 8–10 codes returned once; not stored plaintext in probe |
| Login with backup code | Session established; code marked used |
| Reuse backup code | Rejected; probe 401 |
| Regenerate codes | Old codes invalidated |

test('backup code is single-use', async ({ page, request }) => {
  const { codes } = await request.post('/api/test/mint-backup-codes', {
    data: { runId, count: 5 },
  }).then(r => r.json());

  // First use succeeds; drop the session so the second login starts clean
  await loginWithBackupCode(page, runId, codes[0]);
  await page.context().clearCookies();

  // Second use of the same code must be rejected
  await loginWithBackupCode(page, runId, codes[0]);
  await expect(page.getByText(/invalid|already used/i)).toBeVisible();
});
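
For reference, the server-side behavior these specs probe can be sketched as a small store that keeps only hashes, consumes each code once, and replaces the whole set on regeneration. This is a minimal in-memory illustration under assumed names (`BackupCodeStore`, `regenerate`, `consume`); a real service would persist the hashes and scope them per user.

```typescript
import { createHash, randomBytes } from 'node:crypto';

// Hypothetical in-memory store; a real service would persist hashes per user.
export class BackupCodeStore {
  private unused = new Set<string>();

  private static hash(code: string): string {
    return createHash('sha256').update(code).digest('hex');
  }

  // Replaces any previous codes, so regeneration invalidates the old set.
  regenerate(count = 10): string[] {
    const codes = Array.from({ length: count }, () => randomBytes(5).toString('hex'));
    this.unused = new Set(codes.map((c) => BackupCodeStore.hash(c)));
    return codes; // returned once to the caller; only hashes are kept
  }

  // Returns true exactly once per valid code, false on reuse or unknown codes.
  consume(code: string): boolean {
    return this.unused.delete(BackupCodeStore.hash(code));
  }
}
```

If the real implementation behaves like this, the second login attempt with the same code fails and every code from before a regeneration is rejected.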

SMS OTP stubbing

Never send real SMS in CI for every spec.

  1. Twilio test credentials: use magic numbers that simulate delivery
  2. Vendor stub: Auth0/Okta test tenant with fixed OTP in lab mode
  3. Webhook capture: test server records OTP for expect.poll retrieval

// Poll the test OTP endpoint populated by your SMS webhook in staging.
// expect.poll resolves to void, so capture the code in an outer variable.
let otp = '';
await expect.poll(async () => {
  const res = await request.get(`/api/test/last-sms-otp?phone=${encodeURIComponent(phone)}`);
  otp = (await res.json()).code ?? '';
  return otp;
}, { timeout: 15_000 }).toMatch(/^\d{6}$/);

Firebase phone MFA: use Auth emulator test numbers.
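
The webhook-capture option needs something on the test server to record the code. A minimal sketch, assuming the stubbed SMS provider hands your handler a phone number and message body; `recordSms`, `lastOtp`, and `extractOtp` are hypothetical helpers, and the in-memory map stands in for whatever backs a route such as /api/test/last-sms-otp.

```typescript
// Hypothetical capture store for a test-only OTP retrieval route.
const lastOtpByPhone = new Map<string, string>();

// Pull the first standalone 6-digit run out of an SMS body.
export function extractOtp(body: string): string | undefined {
  return body.match(/\b\d{6}\b/)?.[0];
}

// Called by the stubbed SMS webhook instead of delivering a real message.
export function recordSms(phone: string, body: string): void {
  const code = extractOtp(body);
  if (code) lastOtpByPhone.set(phone, code);
}

// Backs the retrieval endpoint that the spec polls.
export function lastOtp(phone: string): string | undefined {
  return lastOtpByPhone.get(phone);
}
```

Keying by phone number keeps parallel workers from reading each other's codes, provided each worker uses its own test number.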

Step-up authentication

Sensitive actions (password change, API key create, billing update) should require fresh MFA even when session exists:

test('wire transfer requires step-up MFA', async ({ page }) => {
  await loginWithMfa(page, runId); // completed MFA at login

  // page.request shares the browser context's cookies; the standalone
  // `request` fixture would send these calls without the session.
  const api = page.request;
  const transfer = await api.post('/api/transfers', {
    data: { amount: 10000, toAccount: 'external' },
  });
  expect(transfer.status()).toBe(403);
  expect(await transfer.json()).toMatchObject({ code: 'STEP_UP_REQUIRED' });

  const secret = await getTotpSecret(runId);
  const stepUp = await api.post('/api/auth/step-up', {
    data: { totp: authenticator.generate(secret) },
  });
  expect(stepUp.status()).toBe(200);

  const retry = await api.post('/api/transfers', {
    data: { amount: 10000, toAccount: 'external' },
  });
  expect(retry.status()).toBe(200);
});

Probe the API, not only a modal appearance.
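
The rule that spec exercises can be sketched as a guard that ignores login-time MFA and only honors a recent step-up challenge. Everything here is an assumption for illustration: the `Session` shape, the `authorizeSensitiveAction` name, and the five-minute freshness window are not taken from any vendor SDK.

```typescript
// Hypothetical session shape; field names are illustrative only.
interface Session {
  userId: string;
  mfaAtLogin: boolean;
  lastStepUpAtMs: number | null; // when the user last passed a step-up challenge
}

type Decision = { status: 200 } | { status: 403; code: 'STEP_UP_REQUIRED' };

const STEP_UP_MAX_AGE_MS = 5 * 60 * 1000; // assumed freshness window

// Login-time MFA alone never authorizes a sensitive action.
export function authorizeSensitiveAction(session: Session, nowMs: number): Decision {
  const fresh =
    session.lastStepUpAtMs !== null &&
    nowMs - session.lastStepUpAtMs <= STEP_UP_MAX_AGE_MS;
  return fresh ? { status: 200 } : { status: 403, code: 'STEP_UP_REQUIRED' };
}
```

A guard like this belongs in the API handler, which is why the spec posts to the endpoint directly instead of checking whether a modal appears.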

WebAuthn / passkeys (tiered)

WebAuthn in headless CI requires a virtual authenticator, which Playwright reaches through a Chromium-only CDP session:

import { test, expect } from '@playwright/test';

test('register passkey', async ({ page, context }) => {
  // CDP sessions are available in Chromium only
  const client = await context.newCDPSession(page);
  await client.send('WebAuthn.enable');
  const { authenticatorId } = await client.send('WebAuthn.addVirtualAuthenticator', {
    options: {
      protocol: 'ctap2',
      transport: 'internal',
      hasResidentKey: true,
      hasUserVerification: true,
      isUserVerified: true,
    },
  });

  await page.goto('/settings/security/passkeys');
  await page.getByRole('button', { name: 'Add passkey' }).click();
  // WebAuthn prompt auto-satisfied by virtual authenticator
  await expect(page.getByText(/passkey added/i)).toBeVisible();

  // The virtual authenticator should now hold the registered credential
  const { credentials } = await client.send('WebAuthn.getCredentials', { authenticatorId });
  expect(credentials).toHaveLength(1);
});

Default PR CI: TOTP via otplib for coverage; nightly headed job for WebAuthn if prod relies on passkeys.

MFA policy by role

Map to RBAC guide:

  • Admin role must have MFA enrolled — probe admin API 403 without MFA
  • Viewer role optional MFA — login without second factor allowed
  • MFA enrollment deadline — grace period then blocked
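
One way to read these three rules as a single decision is sketched below. The `Account` fields, the `mfaPolicy` name, and the idea of a per-account enrollment deadline are assumptions made for illustration; your RBAC model may express the grace period differently.

```typescript
type Role = 'admin' | 'viewer';
type Outcome = 'allow' | 'deny';

// Hypothetical account shape for illustration.
interface Account {
  role: Role;
  mfaEnrolled: boolean;
  enrollmentDeadlineMs: number | null; // null means no deadline applies
}

export function mfaPolicy(account: Account, nowMs: number): Outcome {
  if (account.mfaEnrolled) return 'allow';
  // Admins get no grace period: unenrolled admin requests map to 403.
  if (account.role === 'admin') return 'deny';
  // Other roles are blocked only once their enrollment deadline has passed.
  if (account.enrollmentDeadlineMs !== null && nowMs > account.enrollmentDeadlineMs) {
    return 'deny';
  }
  return 'allow';
}
```

Each branch corresponds to one spec: seed an unenrolled admin and expect 403, seed an unenrolled viewer and expect a normal login, then seed an account whose deadline has passed and expect it to be blocked.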

CI checklist

  1. otplib generates codes from Arrange secrets—no shared Google Authenticator
  2. Unique user per worker with own TOTP secret
  3. SMS stubbed; Firebase emulator for phone MFA
  4. Step-up specs probe API, not modal only
  5. Backup code single-use and regeneration tested
  6. Playwright clock for remember-device expiry
  7. Document nightly WebAuthn / push MFA jobs

Anti-patterns

| Anti-pattern | Why it fails | Better approach |
| --- | --- | --- |
| DISABLE_MFA=true in CI | Ships without MFA | Test tenant + otplib |
| Shared TOTP on engineer phone | Cannot parallelize | Per-run secrets |
| Assert 6-digit input visible | Bypass via API | Probe protected action |
| Skip backup code reuse | Account recovery hack | Single-use spec |
| Ignore step-up | Wire fraud | POST sensitive action probe |
| Real SMS every spec | Cost + flake | Stub webhook |
| waitForTimeout for OTP SMS | Slow/flaky | expect.poll inbox/webhook |

Example scenario

Situation: Logged-in user with MFA attempts a $10,000 wire transfer without completing step-up.

Expected outcome: Transfer blocked with STEP_UP_REQUIRED—no funds moved.

Why UI-only automation breaks: Transfer button disabled in UI but API POST succeeds—test never calls API.

  1. Arrange: User with MFA enrolled; valid session cookie from login 2 hours ago.
  2. Act: POST /api/transfers directly without step-up token.
  3. Assert: 403 STEP_UP_REQUIRED; after valid TOTP step-up, POST returns 200 and audit log records both events.

TestChimp workflow: Instrument mfa_challenge with mfa_method and recovery_path; compare prod step-up rate vs test.

Same Arrange/Act/Assert pattern as expired-coupon checkout.

Connect scenarios to your QA workflow

Capture business rules in markdown test plans and enforce them with seed routes and probe Assert. Link SmartTests with // @Scenario: for requirement traceability. Use /testchimp test on PRs; /testchimp explore on SmartTest paths for non-functional gaps (ExploreChimp).

Frequently asked questions

How do I generate TOTP codes in Playwright without a phone?

Enroll users with a known secret via test API or Management API, then use otplib authenticator.generate(secret) in the test. Never share one Google Authenticator entry across parallel workers.

Should I disable MFA in CI?

No. Use test tenants, known TOTP secrets, and SMS stubs. Disabling MFA in CI means step-up and enrollment regressions ship to prod—exactly what MFA protects against.

How do I test step-up MFA for sensitive actions?

With a valid post-login session, POST to a sensitive API endpoint and expect 403 STEP_UP_REQUIRED. Complete step-up via TOTP API, then retry and expect 200. Do not rely on disabled buttons alone.

How do I test backup codes without burning real ones?

Use a test mint route that generates codes tied to runId. Assert first use succeeds, second fails, and regeneration invalidates old codes. Probe server state—not only UI.

Can WebAuthn run in headless CI?

Yes, for Chromium jobs: Playwright can attach a virtual authenticator through a CDP session. For default PR pipelines, cover MFA with TOTP via otplib and run WebAuthn nightly if passkeys dominate prod; check TrueCoverage mfa_method to decide.

SMS OTP is slow and flaky in tests—what should I do?

Stub SMS webhooks to a test OTP retrieval endpoint, use Twilio test credentials, or Firebase Auth emulator phone numbers. Poll with expect.poll; never send real SMS per spec.

How does TestChimp help track MFA coverage?

TrueCoverage compares mfa_method and challenge_context in prod vs test. Use /testchimp evolve to add step-up and backup-code scenarios when admin MFA adoption rises—link SmartTests with // @Scenario: for audit traceability.

Apply these patterns in your repo

Run `/testchimp init` to connect TestChimp to your repo, then `/testchimp test` on PRs to turn these patterns into maintained SmartTests. Use `/testchimp evolve` when you want to expand coverage as your app grows.
