Skip to main content

4 posts tagged with "Playwright"

Playwright-specific testing best practices

View All Tags

Your E2E tests are unreliable? Here's why

· 6 min read
Nuwan Samarasekera
Founder & CEO, TestChimp

End-to-end tests are a necessary evil: they are the last line of defense that something actually works in a real browser—but they break often enough that the suite becomes a burden instead of a trustworthy signal.

There are three main sources of variance that make E2E tests unreliable. Understanding them is the first step toward a suite you can actually rely on.

1. World-state variance

This is what happens when your tests run in a different world-state than the one they were written against. A common cause is a shared environment where manual testing and automated runs both happen. The world changes between runs; the next run fails for reasons that have nothing to do with the code under test.

This kind of variance does more than flake tests. It also slows feedback: if those environments only get updates after PRs merge to main, you get weaker root-cause isolation and more expensive triage when something breaks.

2. System variance

These are variances built into the stack itself: network latency, transient failures, UI paint timing, and so on. Mature frameworks like Playwright address a lot of this with built-in waiting, auto-waiting locators, and expect polling—so a big slice of “flakiness” is really tooling and patterns, not fate.

3. Product variance

Even in a steady state (with no product change) — modern web apps are not as simple as calculators. Behavior can be inherently non-deterministic, and that is only more true now that AI often sits in the user journey (for example, a splash or offer that appears only sometimes). Much of that variance may be irrelevant to what a given test is trying to prove.

When tests are authored with a fragile, UI-selector-heavy approach, those product-level variances show up as broken steps. The test is coupled to incidental UI, not to intent.


Solve for these three kinds of variance, and you get a suite that is finally trustworthy.

World-state variance → controlled environments

Use ephemeral environments loaded into known, predefined world-states, so each run matches the state the test was authored against as closely as possible.

System variance → solid automation primitives

This is largely where mature frameworks shine. With Playwright, you get strong primitives for timing and stability—so you are not reinventing waits on every test.

Product variance → intent where the UI is messy

This is where agentic steps in tests can help: natural-language instructions executed by an AI, instead of brittle coupling to selectors—only on the messy, flaky parts of the flow.

There is no free lunch: natural-language steps tend to be slower, costlier, and harder to debug than plain script. The goal is to use them surgically, not everywhere.

SmartTests: intent-based Playwright scripts

That tradeoff is what TestChimp SmartTests are designed around: intent-based Playwright scripts.

They are still scripts for the most part, with an extra capability when you need flexibility—the parts of the app that fight selector-based automation.

Instead of:

await page.locator('.anticon.anticon-plus-square.ant-tree-switcher-line-icon > svg').nth(1).click();

Now you can write:

await ai.act('Expand the tree displayed in the left pane');

Only where you need it—the brittle, shifting UI—not for every line of the test.

Because the tests remain Playwright-based, system variance is handled by the same patterns and tooling you already trust. Run them in ephemeral environments with controlled world-states, and you have a test suite with all 3 variances accounted for - a suite that you can trust.

And yes, we are cooking up something on the ephemeral-environment side too. Stay tuned...

References

The themes above—non-deterministic tests, UI-level flakiness, and mitigations—are well documented in both industry practice and research. These sources are a good starting point if you want to go deeper.

  1. Martin Fowler, Eradicating Non-Determinism in Tests — A widely cited overview of why tests become non-deterministic (including async and shared-state issues) and how to structure tests to get repeatable results.
    https://martinfowler.com/articles/nonDeterminism.html

  2. Google Testing Blog (George Pirocanac), Test Flakiness - One of the main challenges of automated testing (Dec 2020) — Describes categories of flakiness and why inconsistent automated tests slow development; follow-up posts in the same series expand on causes and responses.
    https://testing.googleblog.com/2020/12/test-flakiness-one-of-main-challenges.html

  3. Google Testing Blog (John Micco), Flaky Tests at Google and How We Mitigate Them (May 2016) — Early, concrete account of flaky tests at scale, including mitigation strategies and discussion of where UI tests skew flaky.
    https://testing.googleblog.com/2016/05/flaky-tests-at-google-and-how-we.html

  4. Wing Lam, Stefan Winter, Anjiang Wei, Tao Xie, Darko Marinov, Jonathan Bell, A Large-Scale Longitudinal Study of Flaky Tests, Proc. ACM Program. Lang. 4, OOPSLA, Article 202 (2020). Peer-reviewed study of when tests become flaky and how changes in code, tests, and dependencies contribute.
    https://doi.org/10.1145/3428270
    Conference entry: https://2020.splashcon.org/details/splash-2020-oopsla/78/A-Large-Scale-Longitudinal-Study-of-Flaky-Tests

  5. Microsoft Playwright, Auto-waiting (Actionability) — Official documentation for pre-action checks (visible, stable, receiving events, enabled, etc.) that reduce timing-driven failures.
    https://playwright.dev/docs/actionability

  6. Microsoft Playwright, Assertions — Describes auto-retrying assertions (expect) that wait until conditions hold, complementary to actionability for stable checks.
    https://playwright.dev/docs/test-assertions

  7. Heroku / 12factor, Dev/prod parity (The Twelve-Factor App) — Classic framing for keeping development, staging, and production sufficiently aligned so “works in my environment” mismatches show up earlier; relevant when reasoning about world-state and shared environments.
    https://12factor.net/dev-prod-parity

  8. Google Research (Diego Cavalcanti), De-Flake Your Tests: Automatically Locating Root Causes of Flaky Tests in Code at Google, ICSME 2020 — Empirical work on locating flaky-test root causes in code at Google scale; reports high accuracy for the proposed technique in their evaluation.
    https://research.google/pubs/de-flake-your-tests-automatically-locating-root-causes-of-flaky-tests-in-code-at-google/

Simplified View: No-Code Editor - Full Code Power

· 3 min read
Nuwan Samarasekera
Founder & CEO, TestChimp

TestChimp tests have always been plain Playwright under the hood — with extra capabilities like plain-English steps and lightweight scenario linking via code comments. That gives you fixtures, hooks, page objects, and the test organization you expect from a serious engineering setup.

Most QA teams are a mix of technical and non-technical teammates. Code-only authoring keeps contribution narrow. A separate no-code tool often means a second suite that drifts from the “real” tests and never gets the same CI treatment.

We added Simplified View in the web IDE so you do not have to choose.

SmartTests Simplified View in the web IDE

What Simplified View Is

Simplified View is a no-code surface for creating and editing SmartTests that still compiles to fully functional Playwright scripts. Everyone works on the same test; people just choose how they interact with the tests.

Your teammates can:

  • Add plain English steps - that are run agentically.
  • Use structured building blocks for common actions — less boilerplate and fewer syntax slips.
  • Drop in free-form code when you need it — custom waits, tricky selectors, helpers: full Playwright, no lock-in.

You pick the level of code per step and per person, not one rule for the whole team.

Why this matters

Non-technical members can contribute directly to test automation — not only by filing tickets for engineers to translate later. They build and edit steps in Simplified View; the result is still Playwright your automation folks can refine, reuse, and run in the same pipelines as everything else.

That lifts throughput for the whole team: more people can ship checks in parallel, fewer scenarios sit in a queue waiting for a coder, and engineers spend time on structure and hard cases instead of retyping flows from docs. Underneath, it stays real Playwright — deterministic runs, familiar debugging, ExploreChimp, CI, and Git workflows you already rely on.

Getting Started

Open a SmartTest in the web IDE and switch to Simplified View to author or edit steps. When you need the full script, switch to code view; both views stay aligned with the same underlying test.

For more on creating and editing SmartTests, see Creating Smart Tests.

Further Reading

If you’re interested in how no-code and low-code approaches impact QA team velocity and collaboration in general, these resources provide useful perspectives:

SmartTests Now Support The Full Playwright Ecosystem

· 4 min read
Nuwan Samarasekera
Founder & CEO, TestChimp

We’re excited to announce that SmartTests now fully support the core Playwright testing patterns and constructs you know and love. This means you can write maintainable, well-structured test suites that leverage Playwright’s powerful features while still getting all the AI-powered adaptability that makes TestChimp SmartTests special.

What Are SmartTests?

For those new to SmartTests, you can think of a SmartTest as a Playwright scripts with couple of twists:

Intent Comments:

SmartTest Steps include intent comments that describe what you’re trying to accomplish. When a test runs, it executes as a standard Playwright script for speed and determinism. But when a step fails, our AI agent steps in to fix the issue on the fly and raises a PR with the changes – giving you the best of both worlds: fast script execution and intelligent adaptability.

Screen-state annotations:

Markers that specify the screen and state the UI is at a given step in the script. These annotations are authored and used by ExploreChimp to tag the bugs to the correct screen-state in the SiteMap.

What's New: Full Playwright Compatibility

SmartTests now support all the essential Playwright patterns that help you build professional, maintainable test suites:

1. Hooks for Setup and Teardown

SmartTests now support all four Playwright hooks at both file and suite levels:

beforeAll– Run once before all tests in a suite – afterAll – Run once after all tests in a suite – beforeEach – Run before each test – afterEach – Run after each test

This means you can set up test data, initialize page objects, authenticate users, and clean up resources exactly as you would in standard Playwright tests.

2. Page Object Models (POMs)

SmartTests fully support the Page Object Model pattern, allowing you to encapsulate page interactions in reusable classes. This keeps your tests clean, maintainable, and aligned with best practices.

Example:

import { Page } from '@playwright/test';

class SignInPage {
constructor(private page: Page) {}

async navigate() {
await this.page.goto('/signin');
}

async login(email: string, password: string) {
await this.page.fill('#email', email);
await this.page.fill('#password', password);
await this.page.click('#sign-in-button');
}
}

test('user can sign in', async ({ page }) => {
const signInPage = new SignInPage(page);
await signInPage.navigate();
await signInPage.login('user@example.com', 'password123');
});

3. Fixtures for File Uploads

SmartTests support Playwright fixtures, making it easy to handle file uploads and other test artifacts. Upload your fixture files (like test data, images, or documents) under the fixtures folder in the SmartTests tab, and they will be available during test execution.

4. Playwright Configuration

SmartTests folder contains a playwright.config.js file in your project to configure the Playwright execution environment. This is essential for:

  • Browser Authentication: Set up HTTP basic auth for staging environments
  • Custom Headers: Add authorization tokens, API keys, or custom headers
  • Base URLs: Configure default URLs for your test environment
  • Viewport Settings: Set default browser viewport sizes And more: All standard Playwright configuration options

Example playwright.config.js:

const { defineConfig } = require(‘@playwright/test’);

module.exports = defineConfig({
use: {
baseURL: ‘https://staging.example.com’,
httpCredentials: {
username: ‘staging-user’,
password: ‘staging-password’
},
extraHTTPHeaders: {
‘Authorization’: ‘Bearer your-token’,
X-Environment’: ‘staging’
}
}
});

5. Test Suites with Multiple Tests

SmartTests support organizing multiple tests in a single file using Playwright’s test.describe() blocks. You can create nested suites, group related tests together, and apply suite-level hooks – just like in standard Playwright.

Why This Matters

These additions mean SmartTests are now fully compatible with Playwright’s ecosystem. You can:

✅ Write maintainable tests using industry-standard patterns like POMs and hooks

✅ Organize your test suite with proper grouping and structure

✅ Handle complex setups with configuration files and fixtures

✅ Reuse existing Playwright knowledge without learning new patterns

✅ Still get AI-powered fixes when tests fail – the best of both worlds!

Getting Started

If you’re already using SmartTests, you can start using these features immediately. Just structure your tests using standard Playwright patterns, and SmartTests will handle the rest.

For new users, SmartTests work just like Playwright tests – with the added benefit of AI-powered failure recovery & stepwise execution enabling guided exploration.

What's Next?

SmartTests continue to evolve, and we’re committed to maintaining full compatibility with Playwright’s ecosystem while adding intelligent features that make testing easier and more reliable. Stay tuned for more updates!

Got questions or feedback? We’d love to hear from you! Drop us a line at contact@testchimp.io.

ai-wright: AI Steps in Playwright Scripts

· 3 min read
Nuwan Samarasekera
Founder & CEO, TestChimp

Bring AI-native actions and verifications into your Playwright tests – open source, vision-enabled, and BYOL.

The Problem

Most “AI testing” frameworks make you throw away what already works.

They replace your entire test suite with “agentic” systems — where an LLM drives every click, assertion, and navigation step.

Sounds cool… until you hit:

  • Slow, flaky, or non-deterministic runs
  • Proprietary test formats
  • Complete vendor lock-in

For most teams, that’s a non-starter.

What if you could keep your existing Playwright scripts, and just inject AI where it’s actually needed – the ambiguous, messy, or dynamic parts of your app?

The Idea

ai-wright brings AI steps to Playwright.

You still write regular Playwright tests – deterministic, fast, inspectable – but when you hit a fuzzy point, you can drop in a step like:

await ai.act('Click on a top rated campaign', { page, test });

Or

await ai.verify('The campaign description should not contain offensive words"', { page, test });

That’s it. AI only handles that step.

Everything else stays Playwright-native.

Why It’s Different

  1. Vision-Enabled Existing libraries (like ZeroStep and auto-playwright) use sanitized HTML – which misses what’s actually on screen.

This causes many issues:

  1. HTML ≠ UI reality – static DOM can’t reveal if elements are disabled, visible, obscured, or off-screen – resulting in LLMs attempting interaction with non-interactive elements.
  2. Loss of semantics – sanitized HTML strips ARIA roles, computed text, layout cues, and shadow DOM content, which are critical for accurate reasoning.
  3. Unbounded prompt size – large DOMs can often get too verbose, requiring truncation (resulting in loss of context).
  4. Fragile selectors – HTML-based approaches force LLMs to guess selectors; ai-wright uses precise SoM IDs bound to live DOM nodes, enabling accurate one-shot execution.
  5. ai-wright is vision-enabled: it blends SOM (Set-Of-Marks) annotated screenshots + structured DOM context for grounded, visual reasoning.

The result: AI that operates just like a normal user would – based on what it sees on the screen.

  1. Better Reasoning

Instead of one-shot “guess the next click”, ai-wright uses a multi-step reasoning loop.

It plans ahead, performs coarse-grained objective handling (e.g., “fill out login form,” not just “click button”), and adapts to UI state changes – minimizing retries and random flailing.

It can identify blockers (such as Modals etc.), and execute pre-steps before actioning on the objective.

  1. BYOL (Bring Your Own License)

ai-wright is LLM-agnostic – unlike existing solutions which require either proprietary licenses or supports specific providers only.

You can use your own OpenAI, Claude, Gemini key, or your self-hosted model – avoiding vendor lock-in.

You can choose to use your TestChimp license as well – which will proxy the LLM calls, removing separate token costs for you.

  1. Fully Open Source

Unlike agentic SaaS offerings which are closed source, proprietary solutions, ai-wright is fully open source, giving you complete transparency and community support.

ai-wright lets you inject AI where it matters — the tricky, ambiguous, or dynamic parts of your app — without giving up the speed, determinism, and maintainability of Playwright.

With vision-enabled reasoning, resilient multi-step planning, LLM flexibility, and a fully open source foundation, ai-wright bridges the best of both worlds: reliable, scriptable tests and AI-powered intelligence where you need it most – without any vendor lock-in.

AI where it helps, plain Playwright everywhere else.