Missing Feature Flag Seed Breaks E2E

Short answer

CI targets a preview deploy where NEW_CHECKOUT is off while the spec written against main assumes it is on, or a percentage rollout leaves worker 2 in the control bucket. E2E tests must seed flag posture per runId in Arrange, then probe the gated APIs. Never assume staging flags match prod or last week's run.

Part of Common E2E testing gotchas.

Symptom

  • Spec passes on main, fails on feature branch (or vice versa)
  • "Button not found" after flag toggle with no test update
  • Intermittent failures—only some parallel workers
  • Preview deploy missing step the spec expects

Root cause

Tests assume implicit flag state:

  • No seed for LaunchDarkly / Unleash / env vars in CI
  • Percentage rollouts non-deterministic across workers
  • Preview URL uses different FLAG_* env than staging
  • Client reads flags at build time; server at runtime—mismatch

Flags are Arrange preconditions—like coupons and roles. Missing seeds cause flakes similar to hardcoded test data and parallel collisions.
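
To see why a percentage rollout is non-deterministic across workers, here is a minimal sketch of hash-based bucketing. The hash choice and the names `bucketFor` and `isEnabled` are illustrative assumptions, not the actual algorithm of LaunchDarkly, Unleash, or any other provider.

```typescript
// Illustrative sketch only: not any vendor's real bucketing algorithm.

// FNV-1a 32-bit hash; any stable string hash would serve for this illustration.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Map a flag name plus user key to a bucket in [0, 100).
function bucketFor(flag: string, userKey: string): number {
  return fnv1a(`${flag}:${userKey}`) % 100;
}

// An explicit override wins; otherwise the rollout percentage decides.
function isEnabled(
  flag: string,
  userKey: string,
  rolloutPercent: number,
  overrides: Record<string, boolean> = {},
): boolean {
  const forced = overrides[flag];
  if (forced !== undefined) return forced;
  return bucketFor(flag, userKey) < rolloutPercent;
}

// Each worker's test user hashes to its own bucket, so a 50% rollout can split them.
for (const worker of [0, 1, 2, 3]) {
  const user = `e2e-user-${worker}`;
  console.log(user, bucketFor("NEW_CHECKOUT", user), isEnabled("NEW_CHECKOUT", user, 50));
}
```

With an override in place the user key no longer matters, which is the property a seeded test environment needs.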

Fix: explicit flag posture in Arrange

1. Test-only flag override route

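import { test, expect } from '@playwright/test';

// Shared with the tests in this file; reassigned before each test.
let runId: string;

test.beforeEach(async ({ request }) => {
  // Unique per worker and per test so parallel runs never share flag state.
  runId = `flags-${test.info().workerIndex}-${Date.now()}`;
  const res = await request.post('/api/test/set-flags', {
    data: {
      runId,
      flags: {
        NEW_CHECKOUT: true,
        BETA_EXPORT: false,
      },
    },
  });
  // Fail fast if the seed did not apply.
  expect(res.ok()).toBeTruthy();
  test.info().annotations.push({ type: 'runId', description: runId });
});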

Gate this route to non-production environments. Backend flag evaluation should honor the runId or a test header rather than production rollout rules.
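
On the server side, one way to back such a route is an in-memory store keyed by runId. This is a sketch under assumptions: the class name `FlagOverrideStore`, the environment string check, and the fallback callback are all hypothetical, and a real app would wire them into its own framework and flag provider.

```typescript
// Hypothetical server-side store for test-only flag overrides, keyed by runId.
class FlagOverrideStore {
  private readonly byRun = new Map<string, Record<string, boolean>>();
  private readonly nodeEnv: string;

  constructor(nodeEnv: string) {
    this.nodeEnv = nodeEnv;
  }

  // Refuse overrides in production so the route is inert there.
  set(runId: string, flags: Record<string, boolean>): void {
    if (this.nodeEnv === "production") {
      throw new Error("test flag overrides are disabled in production");
    }
    this.byRun.set(runId, { ...flags });
  }

  // Prefer the value seeded for this run; otherwise defer to normal evaluation.
  resolve(runId: string | undefined, flag: string, fallback: () => boolean): boolean {
    const seeded = runId === undefined ? undefined : this.byRun.get(runId)?.[flag];
    return seeded ?? fallback();
  }
}

// Usage: the set-flags handler calls set(); flag checks call resolve()
// with the run id taken from a test header or query parameter.
const store = new FlagOverrideStore("test");
store.set("flags-0-1", { NEW_CHECKOUT: true, BETA_EXPORT: false });
console.log(store.resolve("flags-0-1", "NEW_CHECKOUT", () => false));
```

Requests without a runId fall through to the regular evaluation path, so non-test traffic in the same environment is unaffected.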

2. Pass posture into navigation

await page.goto(`/checkout?run=${runId}&testFlags=NEW_CHECKOUT:on`);

Or set cookie/localStorage your app reads in test env only.
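
If you pass posture through the query string, it helps to build and parse it in one place. The sketch below assumes a `NAME:on` / `NAME:off` format with commas between flags; the comma separator and the helper names are assumptions, not an established convention.

```typescript
// Serialize a posture as "NEW_CHECKOUT:on,BETA_EXPORT:off" (comma separator is an assumption).
function encodeTestFlags(flags: Record<string, boolean>): string {
  return Object.entries(flags)
    .map(([name, on]) => `${name}:${on ? "on" : "off"}`)
    .join(",");
}

// Build a test URL carrying the run id and flag posture as query parameters.
function buildTestUrl(path: string, runId: string, flags: Record<string, boolean>): string {
  const params = new URLSearchParams({ run: runId, testFlags: encodeTestFlags(flags) });
  return `${path}?${params.toString()}`;
}

// Parse the testFlags parameter from a query string such as "?run=x&testFlags=A:on".
function parseTestFlags(search: string): Record<string, boolean> {
  const raw = new URLSearchParams(search).get("testFlags") ?? "";
  const flags: Record<string, boolean> = {};
  for (const pair of raw.split(",")) {
    const [name, state] = pair.split(":");
    // Ignore malformed entries rather than guessing a state.
    if (name && (state === "on" || state === "off")) flags[name] = state === "on";
  }
  return flags;
}

const url = buildTestUrl("/checkout", "flags-0-1", { NEW_CHECKOUT: true });
console.log(url, parseTestFlags(url.slice(url.indexOf("?"))));
```

The app should only call a parser like `parseTestFlags` in test environments, for the same reason the override route is gated.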

3. Probe gated behavior—not just visible button

// The UI may hide the button while the API stays open, or the reverse.
// runId comes from the beforeEach seed; token is assumed to come from your auth setup.
const res = await request.post('/api/v1/export', {
  headers: { 'X-Test-Run-Id': runId, Authorization: `Bearer ${token}` },
});
expect(res.status()).toBe(403); // free tier + flag off

For entitlement depth, see plan limits and feature flags.

4. CI env alignment

| Layer | CI requirement |
| --- | --- |
| Preview job | Same set-flags seed as staging, or document divergent specs |
| Env vars | FLAG_* in GitHub Actions match test seed defaults |
| Build-time flags | Separate spec projects if client bundles flags at build |
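
A small pre-flight check can catch drift between FLAG_* environment variables and the seed defaults before any spec runs. This sketch assumes variables named `FLAG_<NAME>` that use `"true"` or `"1"` for on; both conventions and the function name are assumptions to adapt to your setup.

```typescript
// Compare FLAG_* environment variables with the seed defaults the specs assume.
// Returns a human-readable line per mismatch; an empty array means aligned.
function diffFlagEnv(
  env: Record<string, string | undefined>,
  seedDefaults: Record<string, boolean>,
): string[] {
  const mismatches: string[] = [];
  for (const [flag, expected] of Object.entries(seedDefaults)) {
    const raw = env[`FLAG_${flag}`];
    if (raw === undefined) {
      mismatches.push(`FLAG_${flag} is unset (expected ${expected})`);
      continue;
    }
    // Assumed convention: "true" or "1" means on, anything else means off.
    const actual = raw === "true" || raw === "1";
    if (actual !== expected) {
      mismatches.push(`FLAG_${flag}=${raw} (expected ${expected})`);
    }
  }
  return mismatches;
}

console.log(
  diffFlagEnv(
    { FLAG_NEW_CHECKOUT: "true", FLAG_BETA_EXPORT: "true" },
    { NEW_CHECKOUT: true, BETA_EXPORT: false },
  ),
);
```

In a CI step you could pass `process.env` as the first argument and fail the job when the returned list is non-empty.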

To wire the jobs, see E2E in CI and GitHub Actions parallel.

Anti-patterns

| Anti-pattern | Why it fails | Better approach |
| --- | --- | --- |
| Assume prod flag defaults in CI | Preview differs | Seed flags per runId |
| % rollout in test env | Non-deterministic workers | Force on/off in test provider |
| Skip spec when flag off | Coverage holes | Two scenarios: on and off |
| UI-only "button hidden" | API still exposed | Probe 403 when gated |
| Manual LD toggle before suite | Humans forget | Automated set-flags route |

TestChimp workflow

Markdown scenarios should declare flag posture ("NEW_CHECKOUT on"). /testchimp init scaffolds set-flags alongside other seed routes. /testchimp test on PRs updates SmartTests when flag-gated flows change—agents read scenario preconditions instead of guessing preview env.

Frequently asked questions

Should we test both flag on and off?

Yes for critical paths—two specs or parameterized tests with different seed postures. Rollout percentage is not a test strategy.
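
For parameterized tests, the postures can be generated rather than hand-written. The helper below is a sketch; `expandPostures` and the `Posture` shape are hypothetical names, and the full cross product grows as 2^n, so it suits only a short list of critical flags.

```typescript
type Posture = { name: string; flags: Record<string, boolean> };

// Expand flag names into every on/off combination (2^n postures).
function expandPostures(flagNames: string[]): Posture[] {
  let postures: Posture[] = [{ name: "", flags: {} }];
  for (const flag of flagNames) {
    postures = postures.flatMap((p) =>
      [true, false].map((on) => ({
        // Build a readable label such as "NEW_CHECKOUT=on BETA_EXPORT=off".
        name: [p.name, `${flag}=${on ? "on" : "off"}`].filter(Boolean).join(" "),
        flags: { ...p.flags, [flag]: on },
      })),
    );
  }
  return postures;
}

for (const posture of expandPostures(["NEW_CHECKOUT", "BETA_EXPORT"])) {
  console.log(posture.name, posture.flags);
}
```

Each posture can then drive one test title and one set-flags seed, so the on and off scenarios share the same spec body with different expectations.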

LaunchDarkly in test mode—still need seed routes?

Use LD test environments with deterministic flags, or override via your app test API. Workers must not depend on live rollout targeting.

Flags differ preview vs staging—what now?

Align CI jobs to seed the same posture, or tag specs with required flags and run subsets per deploy target. Document in markdown scenarios.
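
Tagging specs with required flags can be as simple as a metadata list filtered per deploy target. The `SpecMeta` shape, file names, and `selectSpecs` helper below are hypothetical, shown only to illustrate the subset selection.

```typescript
type SpecMeta = { file: string; requires: Record<string, boolean> };

// Keep only specs whose required flag states all match the deploy target's posture.
function selectSpecs(specs: SpecMeta[], targetFlags: Record<string, boolean>): string[] {
  return specs
    .filter((spec) =>
      Object.entries(spec.requires).every(([flag, on]) => targetFlags[flag] === on),
    )
    .map((spec) => spec.file);
}

// Hypothetical spec inventory.
const specs: SpecMeta[] = [
  { file: "checkout-new.spec.ts", requires: { NEW_CHECKOUT: true } },
  { file: "checkout-legacy.spec.ts", requires: { NEW_CHECKOUT: false } },
  { file: "login.spec.ts", requires: {} },
];

console.log(selectSpecs(specs, { NEW_CHECKOUT: false }));
```

A spec with no requirements runs on every target, while a spec whose required flag is missing from the target posture is excluded rather than left to fail.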

Client reads flags at build time—how to E2E?

Separate Playwright projects per build flavor, or inject flags via test-only runtime endpoint the client fetches in non-prod.
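
The runtime-injection option might look like the sketch below on the client: build-time values act as defaults, and overrides fetched from a test-only endpoint are merged in outside production. The function name and the `isProduction` switch are assumptions.

```typescript
// Merge build-time flag defaults with runtime overrides, but never in production.
function resolveClientFlags(
  buildTime: Record<string, boolean>,
  runtimeOverrides: Record<string, boolean> | null,
  isProduction: boolean,
): Record<string, boolean> {
  if (isProduction || runtimeOverrides === null) {
    return { ...buildTime };
  }
  // Runtime values win so a seeded test posture beats what was baked into the bundle.
  return { ...buildTime, ...runtimeOverrides };
}

console.log(resolveClientFlags({ NEW_CHECKOUT: false }, { NEW_CHECKOUT: true }, false));
```

Because production ignores the overrides entirely, a leaked test endpoint response cannot change what real users see.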

Feature off but API returns 200—is that a bug?

Often yes for entitlements—UI hide is not security. Probe gated endpoints; see entitlements flow guide.

How do flags relate to parallel CI flakes?

Percentage rollouts assign workers to different buckets. Force flag state per runId in test env.

Does /testchimp init scaffold flag seeds?

Yes—it adds set-flags patterns with runId next to user and cart seeds so agents and SmartTests declare flag posture in Arrange—not implicit env luck.

Should we mock the flag SDK in unit tests only?

Unit mocks are fine. E2E should exercise your app flag integration path with test overrides—otherwise wiring regressions slip through.

Apply these patterns in your repo

Run `/testchimp init` to connect TestChimp to your repo, then `/testchimp test` on PRs to turn these patterns into maintained SmartTests. Use `/testchimp evolve` when you want to expand coverage as your app grows.
