When to Use ai.act and ai.verify in Playwright
Short answer
ai.act and ai.verify are for volatile UI where a stable data-testid is impractical: AI chat, rotating copy, i18n marketing pages. Keep Arrange on seed routes and Assert on backend probes. Never rely on semantic steps alone for money, auth, or permissions.
Part of Common E2E testing gotchas.
Decision tree
Is the element a stable form/button with eng owner?
  YES → data-testid or getByRole
  NO → Does outcome depend on backend state?
    YES → probe Assert (+ optional ai.verify for copy)
    NO → Is copy/layout intentionally non-deterministic?
      YES → ai.act / ai.verify (sparingly)
      NO → fix the UI contract (add test id)
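The tree above can also be read as a small pure function, which is handy in review checklists or custom lint tooling. This is only a sketch: the `Strategy` and `ElementTraits` names and their fields are hypothetical labels for the three questions in the tree, not part of Playwright or TestChimp.

```typescript
// Hypothetical types for illustration; not a Playwright or TestChimp API.
type Strategy =
  | "stable-locator"   // data-testid or getByRole
  | "probe-assert"     // backend probe, optionally plus ai.verify for copy
  | "semantic-step"    // ai.act / ai.verify, used sparingly
  | "fix-ui-contract"; // add a test id

interface ElementTraits {
  stableWithEngOwner: boolean;
  outcomeDependsOnBackend: boolean;
  intentionallyNonDeterministic: boolean;
}

// Questions are asked in the same order as the decision tree,
// so earlier answers take priority over later ones.
function chooseStrategy(traits: ElementTraits): Strategy {
  if (traits.stableWithEngOwner) return "stable-locator";
  if (traits.outcomeDependsOnBackend) return "probe-assert";
  if (traits.intentionallyNonDeterministic) return "semantic-step";
  return "fix-ui-contract";
}
```

Note that a stable, owned element gets a standard locator even when its copy rotates; semantic steps are the fallback, not the default.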
Good uses
| Case | Example | Pattern |
|---|---|---|
| AI chat chrome | Bubble layout, chip labels | ai.verify('Assistant offers refund or policy denial') + probe order status |
| i18n marketing | Hero CTA text rotates | ai.act('Start free trial') + probe signup row |
| Agent-built UI churn | Vibe-coded regen changes class names | Hybrid: ai.act on nav, test id on submit |
Bad uses
| Case | Why avoid |
|---|---|
| Pay / Submit on checkout | Must be deterministic—use test id + probe |
| Admin delete | Security—probe 403, not “button gone” |
| Entire spec as ai.act | Cost, flake, untraceable |
See the conversational UI guide for AIMock + probes on chat flows.
Hybrid SmartTest shape
```typescript
// Arrange: deterministic seed route
await request.post('/api/test/seed-order', { data: { runId, orderId: '12345' } });

// Act: semantic only where layout shifts
await ai.act('Open support chat and ask to cancel order 12345');

// Assert: authoritative backend probe
await expect.poll(async () => {
  const res = await request.get('/api/test/probe-order/12345');
  return (await res.json()).status;
}).toMatch(/refunded|policy_denied/);
```
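One detail worth checking in the Assert step: the pattern `/refunded|policy_denied/` is unanchored, so it matches any status string that merely contains one of those words. If the probe can return other values, an anchored pattern is stricter. The sketch below assumes the probe returns a plain status string; `isTerminalOrderStatus` and the `partially_refunded` value are hypothetical and used only to show the difference.

```typescript
// Anchored so only the exact status values pass, not substrings.
const TERMINAL_STATUS = /^(refunded|policy_denied)$/;

// Hypothetical helper: true only for the two accepted outcomes.
function isTerminalOrderStatus(status: string): boolean {
  return TERMINAL_STATUS.test(status);
}

// The unanchored pattern accepts a status that only contains "refunded".
const looseMatch = /refunded|policy_denied/.test("partially_refunded");   // true
const strictMatch = isTerminalOrderStatus("partially_refunded");          // false
```

The same anchored regex can be passed straight to `toMatch` in the poll assertion.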
vs pure agentic testing
Pure agentic runners invoke AI on every step, which is slow and non-deterministic. SmartTests keep Playwright speed and make AI steps optional (see pure agentic vs SmartTests).
TestChimp workflow
Install the TestChimp skill. Running /testchimp test on PRs adds hybrid steps only where the scenario markdown documents volatility, so agents do not sprinkle ai.act on stable checkout flows.
Anti-patterns
| Anti-pattern | Why it fails | Better approach |
|---|---|---|
| ai.verify for refund truth | Model wording can misreport the outcome | Probe refund status |
| ai.act on login form | Slower than getByLabel | Standard locators |
| No AIMock in CI chat tests | Model variance | AIMock + probes (AI web apps) |
Frequently asked questions
Does ai.act work in CI without an API key?
TestChimp wires AI steps through your configured runtime when tests run—keep deterministic Arrange/Assert so most CI time is plain Playwright. Use AIMock for LLM variance in chat specs.
Can ai.verify replace assertions?
Not for money, auth, or data mutations; use probes for those. ai.verify suits semantic UI checks (message tone, visible action buttons) alongside structural assertions.
We use Copilot to write tests—how is this different?
Copilot suggests snippets; TestChimp orchestrates per-PR maintenance with scenario-linked SmartTests and disciplined hybrid steps—not open-ended agent wandering.
How do we prevent engineers over-using ai.act?
Use code review plus // @Scenario: notes that explain why each semantic step exists. Lint for, or tag, @hybrid-ai specs so they get extra scrutiny.
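A lint check for this can be quite small. The sketch below scans a spec's source text and reports any ai.act or ai.verify call that has no `// @Scenario:` note on the same line or the line directly above it. That placement rule, and the `findUnjustifiedAiSteps` and `Finding` names, are assumptions made for illustration; adapt them to whatever convention your team actually uses.

```typescript
// Hypothetical lint helper; not part of Playwright or TestChimp.
interface Finding {
  line: number; // 1-based line number of the offending call
  text: string; // trimmed source line
}

function findUnjustifiedAiSteps(source: string): Finding[] {
  const lines = source.split("\n");
  const aiCall = /\bai\.(act|verify)\s*\(/;
  const scenarioNote = /\/\/\s*@Scenario:/;
  const findings: Finding[] = [];

  lines.forEach((text, i) => {
    if (!aiCall.test(text)) return;
    // Assumed convention: the note sits on the same line or directly above.
    const previous = lines[i - 1] ?? "";
    if (!scenarioNote.test(previous) && !scenarioNote.test(text)) {
      findings.push({ line: i + 1, text: text.trim() });
    }
  });
  return findings;
}

// Usage: the first call is justified by a note, the second is not.
const sampleSpec = [
  "// @Scenario: chat layout shifts per release",
  "await ai.act('Open support chat');",
  "await ai.act('Click Pay');",
].join("\n");
const sampleFindings = findUnjustifiedAiSteps(sampleSpec);
```

A check like this could run in CI and fail the build when the findings list is non-empty, which keeps the justification next to the step it explains.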
Is ai.act the same as record-replay?
No—record-replay captures DOM paths. ai.act interprets intent at runtime; still pair with probes for outcomes.
What about accessibility-first apps?
Prefer getByRole, which often eliminates the need for ai.act on standard flows.
Apply these patterns in your repo
Run `/testchimp init` to connect TestChimp to your repo, then `/testchimp test` on PRs to turn these patterns into maintained SmartTests. Use `/testchimp evolve` when you want to expand coverage as your app grows.