How to Run Playwright E2E in GitHub Actions (Parallel, Sharding, Traces)

Short answer

Parallel Playwright in GitHub Actions stays green when Arrange is isolated per run (every seed carries a unique runId), Assert polls probes instead of sleeping, and sharding is used to split wall-clock time, not to share staging users across jobs. Upload traces on failure and wire @testchimp/playwright for test-run history on every PR.

Part of E2E testing in CI.

Who this is for

Startups running Playwright or SmartTests on every PR—especially after enabling workers > 1 or matrix sharding and seeing new flakes. Stacks: Next.js, Vite SPA, Rails with preview deploys on pull_request.

Why CI parallelization fails without foundations

Turning on parallelism multiplies world-state collisions:

Worker 2 consumes the coupon Worker 1 seeded
Shared storageState logs everyone in as admin
waitForTimeout passes on fast local machines, fails on slower GitHub-hosted runners

Fix seed routes and probes before chasing shard count.
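
As a rough illustration of an isolated Arrange step, the sketch below seeds a cart scoped to one runId. It uses the /api/test/seed-cart route from this article's example scenario; the payload fields (`runId`, `coupon`), the `SeedClient` interface, and the `seedCart` helper are assumptions made for the example, not part of Playwright or TestChimp.

```typescript
// Minimal shape of the HTTP client this sketch needs. Playwright's `request`
// fixture (APIRequestContext) exposes a compatible `post` method.
interface SeedClient {
  post(
    url: string,
    options: { data: unknown },
  ): Promise<{ ok(): boolean; status(): number }>;
}

// Seeds an isolated cart for one test. The payload fields are hypothetical;
// adapt them to whatever your seed route actually accepts.
export async function seedCart(client: SeedClient, runId: string): Promise<string> {
  // The coupon code is derived from the runId, so no other worker can redeem it.
  const coupon = `COUPON50-${runId}`;
  const response = await client.post('/api/test/seed-cart', {
    data: { runId, coupon },
  });
  if (!response.ok()) {
    // Fail loudly in Arrange instead of letting the UI steps fail later.
    throw new Error(`seed-cart failed for ${runId}: HTTP ${response.status()}`);
  }
  return coupon;
}
```

In a spec this could be called as `await seedCart(request, runId)` at the start of each test, where `runId` is any string that is unique to that test, for example one built from `randomUUID()` in `node:crypto`.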

Complexity map

| Scenario | Edge case | Why tests break | Approach |
| --- | --- | --- | --- |
| Parallel workers | Same coupon/user | Intermittent 409 | Per-run `runId` in every seed |
| Shard imbalance | One shard has slow specs | Long pole | Split by timing or `grep` tags |
| Missing browsers | `npx playwright install` skipped | Launch error | Run `npx playwright install --with-deps` in the job |
| Env secrets | `BASE_URL` wrong | 404 on preview | PR comment URL or Bunnyshell |
| Flaky retry | Masks Arrange bug | Silent debt | Fix probes first; retry only after |
| Artifact size | Full video always | Slow uploads | `trace: on-first-retry` |
| Reporter noise | HTML in logs | Hard triage | Blob + GitHub summary |
| Branch deploy lag | Preview not ready | Connection refused | `wait-on` health check |

Baseline GitHub Actions workflow

# .github/workflows/e2e.yml
name: E2E
on:
  pull_request:
    branches: [main]

jobs:
  test:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3, 4]
    env:
      E2E_TEST_MODE: 'true'
      BASE_URL: ${{ vars.PREVIEW_URL || 'http://localhost:3000' }}
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright browsers
        run: npx playwright install --with-deps chromium

      - name: Build app
        run: npm run build

      - name: Start server
        run: npm run start &
      
      - name: Wait for server
        run: npx wait-on "${{ env.BASE_URL }}" -t 120000

      - name: Run Playwright (shard ${{ matrix.shard }}/4)
        run: npx playwright test --shard=${{ matrix.shard }}/4 --workers=2
        env:
          PLAYWRIGHT_TEST_BASE_URL: ${{ env.BASE_URL }}

      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: playwright-report-shard-${{ matrix.shard }}
          path: |
            playwright-report/
            test-results/
          retention-days: 7

See the Playwright CI docs for official templates, and the sharding guide for the blob reporter plus merge-reports pattern.

playwright.config.ts essentials

import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 1 : 0,
  workers: process.env.CI ? 2 : undefined,
  reporter: [
    ['list'],
    ['html', { open: 'never' }],
    ['@testchimp/playwright', { projectId: process.env.TESTCHIMP_PROJECT_ID }],
  ],
  use: {
    baseURL: process.env.PLAYWRIGHT_TEST_BASE_URL,
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
  },
  projects: [{ name: 'chromium', use: { ...devices['Desktop Chrome'] } }],
});

Product detail: run SmartTests in CI and runtime plugin.

Sharding vs workers

| Knob | What it does | When to use |
| --- | --- | --- |
| `--workers=N` | Parallel tests within one job | I/O-bound specs, isolated seeds |
| `--shard=i/k` | Split suite across jobs | Long suites (>15 min) |
| Both | Maximum throughput | After flake audit |

Rule: shards do not isolate data—seeds must still be per-run, not per-shard.
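
One way to keep seeds per-run regardless of shard is to build the runId from a random component and add CI metadata only for traceability. The sketch below is an assumption-laden example: `GITHUB_RUN_ID` and `GITHUB_RUN_ATTEMPT` are set by GitHub Actions, while the `shard` value would have to be exported by your own workflow, and `buildRunId` is a hypothetical helper name.

```typescript
import { randomUUID } from 'node:crypto';

interface RunIdParts {
  githubRunId?: string; // GITHUB_RUN_ID, set by GitHub Actions
  runAttempt?: string;  // GITHUB_RUN_ATTEMPT, changes when a run is re-run
  shard?: string;       // hypothetical: exported by the workflow from matrix.shard
  workerIndex: number;  // for example testInfo.workerIndex in Playwright
}

export function buildRunId(parts: RunIdParts): string {
  // The prefix only makes seeded rows traceable to a workflow run, shard,
  // and worker. It is NOT what makes the id unique.
  const prefix = [
    parts.githubRunId ?? 'local',
    parts.runAttempt ?? '1',
    `s${parts.shard ?? '0'}`,
    `w${parts.workerIndex}`,
  ].join('-');
  // The random UUID keeps two tests on the same shard and worker apart.
  return `${prefix}-${randomUUID()}`;
}
```

Calling this once per test, rather than once per shard or per worker, is what prevents two specs on the same shard from sharing a coupon or cart.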

Preview URLs on PRs

Point BASE_URL at preview deploys (Bunnyshell, Vercel, Render). Health-check before tests:

- name: Wait for preview
  run: npx wait-on "${{ env.BASE_URL }}/api/health" -t 180000
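
The workflow step above relies on wait-on. If you would rather keep the check inside the test code (for instance in a Playwright global setup file), the same idea can be sketched in TypeScript. This is an alternative under stated assumptions, not what the workflow above does: `waitForHealthy` is a hypothetical helper, and the fetcher is passed in so that Node 18+'s global `fetch` or a test double can be used.

```typescript
// Any function with the basic shape of fetch: resolves with an `ok` flag,
// rejects when the connection is refused.
type Fetcher = (url: string) => Promise<{ ok: boolean }>;

export async function waitForHealthy(
  url: string,
  fetcher: Fetcher,
  { timeoutMs = 180_000, intervalMs = 2_000 }: { timeoutMs?: number; intervalMs?: number } = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      const response = await fetcher(url);
      if (response.ok) return; // 2xx: the preview is serving traffic
    } catch {
      // Connection refused while the deploy is still starting; keep polling.
    }
    if (Date.now() >= deadline) {
      throw new Error(`${url} not healthy after ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A call such as `await waitForHealthy(baseUrl + '/api/health', fetch)` would then block until the preview answers or the timeout expires.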

Use multi-environment execution when SmartTests target branch-specific URLs.

Anti-patterns

| Anti-pattern | Why it fails | Better approach |
| --- | --- | --- |
| `workers: 8` before seeds | Collision storm | `runId` fixtures first |
| `waitForTimeout` in CI | Still races | expect.poll on probes |
| No artifacts on failure | Un-debuggable | Trace + screenshot upload |
| Retry=3 always | Hides Arrange bugs | Fix data; retry=1 max |
| Single global admin user | Session races | Seed user per `runId` |

TestChimp workflow

Gate merges with /testchimp test so agents repair SmartTests when selectors drift; scenario markdown supplies the context that recorders lack. @testchimp/playwright attaches runs to your TestChimp project for history across shards. After deploy, /testchimp evolve closes gaps between plans and production behaviour.

Example scenario

Situation: PR enables 4-way sharding; checkout spec fails only on shard 3.

Expected outcome: Each shard runs isolated checkout with its own coupon and cart.

Why UI-only automation breaks: All shards reuse COUPON50; shard 3 hits 'already redeemed' intermittently.

Arrange: Every spec seeds cart via /api/test/seed-cart with unique runId.
Act: Complete checkout on preview URL from PLAYWRIGHT_TEST_BASE_URL.
Assert: Probe returns paidOrderCount=1 for that runId only.

TestChimp workflow: @testchimp/playwright reporter links failing shard trace to scenario // @Scenario: checkout.

Same Arrange/Act/Assert pattern as expired-coupon checkout.
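
The Assert step of this scenario could look roughly like the sketch below. The probe route (`/api/test/probe/orders`), its response shape, and the `ProbeClient` interface are hypothetical; only the `paidOrderCount` value and the runId scoping come from the scenario above.

```typescript
// Minimal shape needed from an HTTP client. Playwright's `request` fixture
// (APIRequestContext) offers a compatible `get` method.
interface ProbeClient {
  get(
    url: string,
    options: { params: Record<string, string> },
  ): Promise<{ ok(): boolean; status(): number; json(): Promise<unknown> }>;
}

// Reads how many paid orders exist for one runId only.
// Assumed response body: { "paidOrderCount": <number> }.
export async function paidOrderCount(client: ProbeClient, runId: string): Promise<number> {
  const response = await client.get('/api/test/probe/orders', { params: { runId } });
  if (!response.ok()) {
    throw new Error(`probe failed for ${runId}: HTTP ${response.status()}`);
  }
  const body = (await response.json()) as { paidOrderCount?: unknown };
  if (typeof body.paidOrderCount !== 'number') {
    throw new Error(`probe returned no paidOrderCount for ${runId}`);
  }
  return body.paidOrderCount;
}

// In a spec, the Assert step could then poll instead of sleeping:
//   await expect.poll(() => paidOrderCount(request, runId), { timeout: 15_000 }).toBe(1);
```

Because the probe is filtered by runId, an order created by another shard cannot satisfy or break this assertion.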

Frequently asked questions

How many shards should we use?

Start with a wall-clock target: if the suite exceeds 15 minutes at workers=2, try 4 shards. Increase the count only after per-run seeds eliminate shared-data flakes.

Should we run E2E on every PR?

Yes for critical paths if preview deploy + seeds are fast. Use grep tags (@smoke) for very large suites on draft PRs.
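
A minimal sketch of the tag-based split, assuming tests carry `@smoke` in their titles and the workflow passes the PR's draft status (available as `github.event.pull_request.draft`) through an environment variable. The variable name `PR_IS_DRAFT` and the helper `grepFor` are made up for this example.

```typescript
// Decides which tests to run for a given PR state.
export function grepFor(isDraft: boolean): RegExp | undefined {
  // Draft PRs run only tests tagged @smoke; ready PRs run the full suite
  // (undefined means "no filter").
  return isDraft ? /@smoke/ : undefined;
}

// In playwright.config.ts this could feed the `grep` option:
//   grep: grepFor(process.env.PR_IS_DRAFT === 'true'),
```

The same filter is available ad hoc from the command line with `npx playwright test --grep @smoke`.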

Does TestChimp replace GitHub Actions?

No—TestChimp orchestrates tests in your repo and records runs. You keep Playwright in Actions; add @testchimp/playwright for traceability.

playwright install-deps vs install?

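On ubuntu-latest, use `npx playwright install --with-deps`, which installs the browsers and their system libraries in one step; `install-deps` on its own installs only the system libraries. Caching browser binaries can shorten later runs, though the Playwright docs note that restoring the cache often takes about as long as downloading the binaries.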

How do we merge shard HTML reports?

Use Playwright's blob reporter on each shard, then combine the shard outputs into one HTML report with `npx playwright merge-reports`; see the official CI sharding guide.
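
On the config side, this could be as small as switching reporters when running in CI. The helper below is a sketch; `reportersFor` and the `ReporterEntry` type are names invented here, and any extra reporters you already use would need to be appended to the returned list.

```typescript
type ReporterEntry = [string] | [string, Record<string, unknown>];

// Blob output on CI (one per shard, merged later); readable reports locally.
export function reportersFor(isCI: boolean): ReporterEntry[] {
  return isCI ? [['blob']] : [['list'], ['html', { open: 'never' }]];
}

// In playwright.config.ts:
//   reporter: reportersFor(!!process.env.CI),
//
// After all shards finish, a merge job can download the blob artifacts into
// one folder and run:
//   npx playwright merge-reports --reporter html ./all-blob-reports
```

With this in place each shard would upload its `blob-report/` directory (the blob reporter's default output) instead of `playwright-report/`, and only the merge job produces HTML.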

Flakes started after parallel—what first?

Audit shared coupons, accounts, carts, storageState. Reproduce locally with --workers=4 before raising timeouts.

Can we run against production?

Avoid mutating prod in E2E. Keep read-only smoke checks against prod rare, and prefer a preview environment with E2E_TEST_MODE seeds.

How does /testchimp test fit CI?

Runs before or as part of PR workflow—agents update SmartTests linked to markdown scenarios so CI failures get repaired in Git, not siloed chat.

Apply these patterns in your repo

Run `/testchimp init` to connect TestChimp to your repo, then `/testchimp test` on PRs to turn these patterns into maintained SmartTests. Use `/testchimp evolve` when you want to expand coverage as your app grows.

