
How to Run Playwright E2E in GitHub Actions (Parallel, Sharding, Traces)

Short answer

Parallel Playwright in GitHub Actions stays green when Arrange is isolated per worker (runId seeds), Assert polls probes instead of sleeping, and sharding is used to split wall-clock time while every shard still seeds its own data rather than sharing staging users. Upload traces on failure and wire @testchimp/playwright for test-run history on every PR.

Part of E2E testing in CI.

Who this is for

Startups running Playwright or SmartTests on every PR—especially after enabling workers > 1 or matrix sharding and seeing new flakes. Stacks: Next.js, Vite SPA, Rails with preview deploys on pull_request.

Why CI parallelization fails without foundations

Turning on parallelism multiplies world-state collisions:

  • Worker 2 consumes the coupon Worker 1 seeded
  • Shared storageState logs everyone in as admin
  • waitForTimeout passes on fast runners, fails on GitHub-hosted runners

Fix seed routes and probes before chasing shard count.
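The collisions listed above usually trace back to two tests sharing one identifier. A minimal sketch of a per-run identifier follows. `makeRunId` and `seedPayload` are hypothetical helper names, and the payload shape is an assumption, not something the seed route is known to require.

```typescript
// Hypothetical helpers: not part of Playwright or TestChimp.

// Builds an id that is practically unique per test execution, so retries
// and parallel workers do not reuse each other's seeded rows.
function makeRunId(workerIndex: number, retry: number): string {
  const time = Date.now().toString(36);
  const rand = Math.random().toString(36).slice(2, 10);
  return `w${workerIndex}-r${retry}-${time}-${rand}`;
}

// Assumed seed payload: every seeded entity carries the runId.
interface SeedPayload {
  runId: string;
  userEmail: string;
  couponCode: string;
}

function seedPayload(runId: string): SeedPayload {
  return {
    runId,
    userEmail: `e2e+${runId}@example.test`,
    couponCode: `COUPON50-${runId}`,
  };
}
```

Inside a Playwright test, `testInfo.workerIndex` and `testInfo.retry` are natural sources for the two arguments.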

Complexity map

| Scenario | Edge case | Why tests break | Approach |
| --- | --- | --- | --- |
| Parallel workers | Same coupon/user | Intermittent 409 | Per-run runId in every seed |
| Shard imbalance | One shard has slow specs | Long pole | Split by timing or grep tags |
| Missing browsers | npx playwright install skipped | Launch error | npx playwright install --with-deps step |
| Env secrets | BASE_URL wrong | 404 on preview | PR comment URL or Bunnyshell |
| Flaky retry | Masks Arrange bug | Silent debt | Fix probes first; retry only after |
| Artifact size | Full video always | Slow uploads | trace: on-first-retry |
| Reporter noise | HTML in logs | Hard triage | Blob + GitHub summary |
| Branch deploy lag | Preview not ready | Connection refused | wait-on health check |
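For the shard-imbalance row, "split by timing" can be sketched as a greedy packing: sort specs longest first and place each one on the currently lightest shard. The file names and durations below are made up, and `balanceShards` is a hypothetical helper. Using its output would mean passing explicit file lists to each job instead of relying on `--shard`, which is an assumption about how you wire the workflow.

```typescript
// Hypothetical input: spec files with durations (seconds) from a previous run.
interface SpecTiming {
  file: string;
  seconds: number;
}

interface Shard {
  files: string[];
  totalSeconds: number;
}

// Greedy "longest first" packing: place each spec on the currently lightest shard.
function balanceShards(specs: SpecTiming[], shardCount: number): Shard[] {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new Error('shardCount must be a positive integer');
  }
  const shards: Shard[] = Array.from({ length: shardCount }, () => ({
    files: [],
    totalSeconds: 0,
  }));
  const longestFirst = [...specs].sort((a, b) => b.seconds - a.seconds);
  for (const spec of longestFirst) {
    let lightest = shards[0];
    for (const shard of shards) {
      if (shard.totalSeconds < lightest.totalSeconds) lightest = shard;
    }
    lightest.files.push(spec.file);
    lightest.totalSeconds += spec.seconds;
  }
  return shards;
}

// Illustrative timings only.
const plan = balanceShards(
  [
    { file: 'checkout.spec.ts', seconds: 300 },
    { file: 'search.spec.ts', seconds: 120 },
    { file: 'login.spec.ts', seconds: 100 },
    { file: 'profile.spec.ts', seconds: 80 },
  ],
  2,
);
```

With these sample numbers the slow checkout spec gets a shard to itself and the three shorter specs share the other, so neither job becomes the long pole.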

Baseline GitHub Actions workflow

# .github/workflows/e2e.yml
name: E2E
on:
  pull_request:
    branches: [main]

jobs:
  test:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3, 4]
    env:
      E2E_TEST_MODE: 'true'
      BASE_URL: ${{ vars.PREVIEW_URL || 'http://localhost:3000' }}
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright browsers
        run: npx playwright install --with-deps chromium

      - name: Build app
        run: npm run build

      - name: Start server
        run: npm run start &

      - name: Wait for server
        run: npx wait-on "${{ env.BASE_URL }}" -t 120000

      - name: Run Playwright (shard ${{ matrix.shard }}/4)
        run: npx playwright test --shard=${{ matrix.shard }}/4 --workers=2
        env:
          PLAYWRIGHT_TEST_BASE_URL: ${{ env.BASE_URL }}

      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: playwright-report-shard-${{ matrix.shard }}
          path: |
            playwright-report/
            test-results/
          retention-days: 7

See the Playwright CI docs for official templates and for the blob reporter plus merge-reports pattern that combines shard output.

playwright.config.ts essentials

import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 1 : 0,
  workers: process.env.CI ? 2 : undefined,
  reporter: [
    ['list'],
    ['html', { open: 'never' }],
    ['@testchimp/playwright', { projectId: process.env.TESTCHIMP_PROJECT_ID }],
  ],
  use: {
    baseURL: process.env.PLAYWRIGHT_TEST_BASE_URL,
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
  },
  projects: [{ name: 'chromium', use: { ...devices['Desktop Chrome'] } }],
});
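The config above reads baseURL straight from PLAYWRIGHT_TEST_BASE_URL, so an unset or malformed value only shows up later as navigation errors. One possible guard is sketched below. `resolveBaseUrl` is a hypothetical helper, and the fallback argument is an assumption about how you want local runs to behave.

```typescript
// Hypothetical helper for playwright.config.ts: fail fast on a bad base URL
// instead of discovering it through navigation errors mid-run.
function resolveBaseUrl(raw: string | undefined, fallback?: string): string {
  const candidate = (raw ?? '').trim() || fallback;
  if (!candidate) {
    throw new Error('PLAYWRIGHT_TEST_BASE_URL is not set and no fallback was given');
  }
  let parsed: URL;
  try {
    parsed = new URL(candidate);
  } catch {
    throw new Error(`Base URL is not a valid absolute URL: ${candidate}`);
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Base URL must use http or https: ${candidate}`);
  }
  return candidate;
}
```

In the config this could be used as `baseURL: resolveBaseUrl(process.env.PLAYWRIGHT_TEST_BASE_URL, 'http://localhost:3000')`, which turns a misconfigured preview URL into one clear error at startup.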

Product detail: run SmartTests in CI and runtime plugin.

Sharding vs workers

| Knob | What it does | When to use |
| --- | --- | --- |
| --workers=N | Parallel tests within one job | I/O-bound specs, isolated seeds |
| --shard=i/k | Split suite across jobs | Long suites (>15 min) |
| Both | Maximum throughput | After flake audit |

Rule: shards do not isolate data—seeds must still be per-run, not per-shard.

Preview URLs on PRs

Point BASE_URL at preview deploys (Bunnyshell, Vercel, Render). Health-check before tests:

- name: Wait for preview
  run: npx wait-on "${{ env.BASE_URL }}/api/health" -t 180000

Use multi-environment execution when SmartTests target branch-specific URLs.

Anti-patterns

| Anti-pattern | Why it fails | Better approach |
| --- | --- | --- |
| workers: 8 before seeds | Collision storm | runId fixtures first |
| waitForTimeout in CI | Still races | expect.poll on probes |
| No artifacts on failure | Un-debuggable | Trace + screenshot upload |
| Retry=3 always | Hides Arrange bugs | Fix data; retry=1 max |
| Single global admin user | Session races | Seed user per runId |
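To make the waitForTimeout row concrete: a fixed sleep guesses how long the backend needs, while polling re-checks a probe until it reports the expected state. Inside a Playwright test, `expect.poll` does this for you. The dependency-free sketch below is only a hand-rolled stand-in for the same idea, and `pollUntil` and `fakeOrderProbe` are hypothetical names.

```typescript
// Hand-rolled stand-in for the idea behind Playwright's expect.poll:
// re-check a probe until it reports the expected state or a deadline passes.
async function pollUntil<T>(
  probe: () => Promise<T>,
  isReady: (value: T) => boolean,
  options: { timeoutMs: number; intervalMs: number },
): Promise<T> {
  const deadline = Date.now() + options.timeoutMs;
  let last: T = await probe();
  while (!isReady(last)) {
    if (Date.now() >= deadline) {
      throw new Error(
        `Probe not ready after ${options.timeoutMs} ms: ${JSON.stringify(last)}`,
      );
    }
    await new Promise<void>((resolve) => setTimeout(resolve, options.intervalMs));
    last = await probe();
  }
  return last;
}

// Simulated probe (assumption): the order becomes "paid" on the third check.
let checks = 0;
async function fakeOrderProbe(): Promise<{ paidOrderCount: number }> {
  checks += 1;
  return { paidOrderCount: checks >= 3 ? 1 : 0 };
}
```

A call such as `pollUntil(fakeOrderProbe, (r) => r.paidOrderCount === 1, { timeoutMs: 2000, intervalMs: 10 })` returns as soon as the probe is ready, so fast runners finish early and slow runners get the full timeout.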

TestChimp workflow

Gate merges with /testchimp test so agents repair SmartTests when selectors drift—scenario markdown supplies context recorders lack. @testchimp/playwright attaches runs to your TestChimp project for history across shards. After deploy, /testchimp evolve closes gaps between plans and production behaviour.


Example scenario

Situation: PR enables 4-way sharding; checkout spec fails only on shard 3.

Expected outcome: Each shard runs isolated checkout with its own coupon and cart.

Why UI-only automation breaks: All shards reuse COUPON50; shard 3 hits 'already redeemed' intermittently.

  1. Arrange: Every spec seeds cart via /api/test/seed-cart with unique runId.
  2. Act: Complete checkout on preview URL from PLAYWRIGHT_TEST_BASE_URL.
  3. Assert: Probe returns paidOrderCount=1 for that runId only.
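The failure mode in this scenario can be reproduced without a browser. The sketch below uses an in-memory stand-in for a backend that lets each coupon code be redeemed once. `FakeCheckout` and the run ids are hypothetical, and the real seed route and probe will differ.

```typescript
// In-memory stand-in for a backend that allows each coupon code to be redeemed once.
class FakeCheckout {
  private redeemed = new Set<string>();
  private paidOrders = new Map<string, number>();

  // Returns true if the order was paid, false if the coupon was already used.
  checkout(runId: string, couponCode: string): boolean {
    if (this.redeemed.has(couponCode)) return false;
    this.redeemed.add(couponCode);
    this.paidOrders.set(runId, (this.paidOrders.get(runId) ?? 0) + 1);
    return true;
  }

  // Probe scoped to one runId, mirroring the Assert step.
  paidOrderCount(runId: string): number {
    return this.paidOrders.get(runId) ?? 0;
  }
}

const runIds = ['run-a1', 'run-b7'];

// Shared coupon: whichever run checks out second finds it already redeemed.
const shared = new FakeCheckout();
const sharedResults = runIds.map((runId) => shared.checkout(runId, 'COUPON50'));

// Per-run coupon: every run redeems its own code.
const isolated = new FakeCheckout();
const isolatedResults = runIds.map((runId) =>
  isolated.checkout(runId, `COUPON50-${runId}`),
);
```

In the shared case only the first run pays, which is the intermittent "already redeemed" failure. In the isolated case both runs succeed, and each probe reports exactly one paid order for its own runId.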

TestChimp workflow: @testchimp/playwright reporter links failing shard trace to scenario // @Scenario: checkout.

Same Arrange/Act/Assert pattern as expired-coupon checkout.

Frequently asked questions

How many shards should we use?

Start with a wall-clock target: if the suite exceeds 15 minutes at workers=2, try 4 shards. Increase only after per-run seeds eliminate shared-data flakes.
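A rough way to reason about the target is sketched below. It assumes specs split evenly across shards and that every job pays the same fixed setup cost (checkout, npm ci, browser install, build). Both are simplifications, and the numbers are illustrative only.

```typescript
// Rough estimate only: assumes an even split and a fixed per-job setup cost.
function estimateShardMinutes(
  suiteMinutes: number,
  shardCount: number,
  setupMinutes: number,
): number {
  if (shardCount < 1) throw new Error('shardCount must be at least 1');
  return setupMinutes + suiteMinutes / shardCount;
}

// Illustrative numbers: a 16-minute suite with 3 minutes of setup per job.
const oneJob = estimateShardMinutes(16, 1, 3); // 19
const fourShards = estimateShardMinutes(16, 4, 3); // 7
const eightShards = estimateShardMinutes(16, 8, 3); // 5
```

Because the setup cost is paid by every job, doubling from 4 to 8 shards saves far less than the first split did, which is one reason to fix flakes before adding more shards.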

Should we run E2E on every PR?

Yes for critical paths if preview deploy + seeds are fast. Use grep tags (@smoke) for very large suites on draft PRs.

Does TestChimp replace GitHub Actions?

No—TestChimp orchestrates tests in your repo and records runs. You keep Playwright in Actions; add @testchimp/playwright for traceability.

playwright install-deps vs install?

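On ubuntu-latest, use npx playwright install --with-deps, which installs the browsers and the system libraries they need in one step; install-deps on its own installs only the system libraries. Caching browser binaries can speed up subsequent runs, though the gain is often small because restoring the cache takes about as long as downloading.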

How do we merge shard HTML reports?

Playwright blob reporter + merge-reports CLI combines shard outputs into one HTML artifact—see official CI sharding guide.
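As a config fragment, the reporter switch for this pattern might look like the sketch below. It shows only the CI switch and is not a full config; any other reporters you rely on would need to be kept alongside it in an array.

```typescript
// playwright.config.ts (fragment): emit blob reports in CI so shards can be merged later.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: process.env.CI ? 'blob' : 'html',
});
```

Each shard then uploads its blob-report/ directory as an artifact, and a follow-up job downloads them into one folder and runs `npx playwright merge-reports --reporter html ./all-blob-reports`. The folder name here is only an example.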

Flakes started after parallel—what first?

Audit shared coupons, accounts, carts, storageState. Reproduce locally with --workers=4 before raising timeouts.

Can we run against production?

Avoid mutating prod in E2E. Read-only smoke against prod is rare—prefer preview env with E2E_TEST_MODE seeds.

How does /testchimp test fit CI?

Runs before or as part of PR workflow—agents update SmartTests linked to markdown scenarios so CI failures get repaired in Git, not siloed chat.

Apply these patterns in your repo

Run `/testchimp init` to connect TestChimp to your repo, then `/testchimp test` on PRs to turn these patterns into maintained SmartTests. Use `/testchimp evolve` when you want to expand coverage as your app grows.
