Testing Apps Built with Cursor
Short answer
Cursor excels at writing code fast, including one-off Playwright files. TestChimp adds the QA workflow layer: deciding which scenarios matter, running `/testchimp test` on each PR, and maintaining fixtures, TrueCoverage, and traceability so that agent output compounds across merges.
Who this is for
Teams shipping features via Cursor Composer or Agent mode who merge daily and cannot afford demo-only E2E that rots after the next session.
How teams ship with Cursor
Cursor edits multi-file features, generates Playwright stubs inline, and refactors UI rapidly. Velocity is the point; verification is usually manual or session-scoped.
Common QA gaps
| Risk | What goes wrong |
|---|---|
| Happy-path tests | Pass while backend state is wrong, because no probe checks it in the Assert step |
| No scenario link | Tests drift from markdown plans; reviewers cannot see requirement impact |
| Stale suites | Break after cascading refactors when no agent repairs them in CI |
| Session-scoped QA | No gate tying each PR to scenario coverage |
Why E2E with probes is non-negotiable
Composer can generate plausible Playwright that clicks through checkout but never seeds coupons or probes orders. That is the expired-coupon class of bug—UI green, revenue wrong.
The TestChimp loop on every PR
TestChimp does not replace your builder—it orchestrates QA on what agents ship:
| Phase | Command | Outcome |
|---|---|---|
| Bootstrap | `/testchimp init` | Seed/probe routes, fixtures, Playwright CI, TrueCoverage (init) |
| Per-PR QA | `/testchimp test` | Agents read markdown plans, author/repair SmartTests, wire `// @Scenario:` links (test) |
| UX risk | `/testchimp explore` | ExploreChimp on SmartTest pathways (explore) |
| Post-deploy | `/testchimp evolve` | Close TrueCoverage and plan gaps (evolve) |
Install the TestChimp skill in your agent IDE. SmartTests remain Playwright in Git—standard traces, reporters, and CI (SmartTests).
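To make the idea of scenario links concrete, here is a minimal sketch of how planned scenarios could be checked against `// @Scenario:` comments in spec files. It is not TestChimp's implementation. It assumes that the text after `// @Scenario:` is a scenario title that also appears in a markdown plan, and the function names `scenarioLinks` and `unlinkedScenarios` and the sample titles are hypothetical.

```typescript
// Assumption: the text after `// @Scenario:` is a scenario title taken from a
// markdown plan. The comment prefix is the only part confirmed by this page.
function scenarioLinks(specSource: string): string[] {
  const links: string[] = [];
  for (const line of specSource.split("\n")) {
    const match = line.match(/^\s*\/\/\s*@Scenario:\s*(.+?)\s*$/);
    if (match && match[1]) {
      links.push(match[1]);
    }
  }
  return links;
}

// Return the planned scenario titles that no spec file links to.
function unlinkedScenarios(planned: string[], specSources: string[]): string[] {
  const linked = new Set<string>();
  for (const source of specSources) {
    for (const title of scenarioLinks(source)) {
      linked.add(title);
    }
  }
  return planned.filter((title) => !linked.has(title));
}

// Hypothetical plan titles and spec content, for illustration only.
const plannedTitles = ["Pay with saved card", "Expired coupon is rejected"];
const checkoutSpec = [
  "// @Scenario: Pay with saved card",
  "test('pays with saved card', async ({ page }) => { /* ... */ });",
].join("\n");

const missingLinks = unlinkedScenarios(plannedTitles, [checkoutSpec]);
// missingLinks is ["Expired coupon is rejected"]
```

A check of this shape is what lets a reviewer see, per PR, which planned scenarios still have no linked test.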
Three realities TestChimp aligns
| Reality | Without orchestration | With TestChimp |
|---|---|---|
| Planned | Scenarios live in chat or Notion | Markdown plans in Git (test planning) |
| Tested | Session-scoped agent tests | CI SmartTests + test runs (test runs) |
| Production | Unknown coverage holes | TrueCoverage RUM ↔ runs (TrueCoverage) |
Mismatches among these three realities drive the next `/testchimp test` cycle, not another ad hoc prompt.
Example scenario
Situation: Cursor adds a payment UI in a feature PR; Composer also writes a Playwright spec that clicks Pay.
Expected outcome: Payment is captured and an order row exists with correct amount.
Why UI-only automation breaks: The test passes on toast text while the webhook handler is stubbed, so production charges fail.
- Arrange: `/testchimp init` seed route creates cart + payment method for this run only.
- Act: Playwright completes checkout on the PR branch.
- Assert: Probe returns order ID and captured amount; `// @Scenario:` links roll up coverage.
TestChimp workflow: Compare `checkout_attempted` events in prod vs test to find untested payment methods.
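The production-versus-test comparison amounts to a set difference. The sketch below illustrates it under assumptions: only the event name `checkout_attempted` comes from the text, while the `CheckoutEvent` shape, its `paymentMethod` property, and the sample data are hypothetical.

```typescript
// Hypothetical event shape. Only the name "checkout_attempted" is from the text.
interface CheckoutEvent {
  name: string;
  paymentMethod: string; // assumed property, e.g. "card" or "paypal"
}

// Collect the distinct payment methods seen in checkout_attempted events.
function methodsOf(events: CheckoutEvent[]): Set<string> {
  const methods = new Set<string>();
  for (const event of events) {
    if (event.name === "checkout_attempted") {
      methods.add(event.paymentMethod);
    }
  }
  return methods;
}

// Payment methods used in production that no test run has exercised.
function untestedPaymentMethods(
  prodEvents: CheckoutEvent[],
  testEvents: CheckoutEvent[],
): string[] {
  const tested = methodsOf(testEvents);
  return Array.from(methodsOf(prodEvents))
    .filter((method) => !tested.has(method))
    .sort();
}

// Illustrative data only.
const sampleProd: CheckoutEvent[] = [
  { name: "checkout_attempted", paymentMethod: "card" },
  { name: "checkout_attempted", paymentMethod: "paypal" },
  { name: "page_view", paymentMethod: "bank_transfer" },
];
const sampleTest: CheckoutEvent[] = [
  { name: "checkout_attempted", paymentMethod: "card" },
];

const coverageGap = untestedPaymentMethods(sampleProd, sampleTest);
// coverageGap is ["paypal"]
```

Each method in the result is a candidate scenario for the next `/testchimp test` cycle.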
Same Arrange/Act/Assert pattern as expired-coupon checkout.
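The Arrange/Act/Assert steps above can be sketched without a browser. In this sketch, `FakeBackend` is a hypothetical in-memory stand-in for the seed and probe routes, and `checkoutScenario`, the run ID, and the amount are illustrative. A real SmartTest would drive the UI with Playwright and call the routes over HTTP.

```typescript
// Hypothetical stand-in for the seed and probe routes. Nothing here is
// TestChimp's API; it only models the pattern.
interface Order {
  id: string;
  amountCents: number;
  captured: boolean;
}

class FakeBackend {
  private carts = new Map<string, number>(); // runId -> cart total in cents
  private orders = new Map<string, Order>(); // runId -> resulting order

  // Arrange: seed a cart scoped to this run only.
  seedCart(runId: string, amountCents: number): void {
    this.carts.set(runId, amountCents);
  }

  // Act: what the Pay click ultimately triggers. `webhookStubbed` models the
  // failure described above: the toast appears but no payment is captured.
  checkout(runId: string, webhookStubbed: boolean): string {
    const amountCents = this.carts.get(runId);
    if (amountCents === undefined) {
      throw new Error("no cart seeded for " + runId);
    }
    this.orders.set(runId, {
      id: "order-" + runId,
      amountCents,
      captured: !webhookStubbed,
    });
    return "Payment successful"; // toast text is identical in both cases
  }

  // Assert: read backend state instead of trusting the toast.
  probeOrder(runId: string): Order | undefined {
    return this.orders.get(runId);
  }
}

function checkoutScenario(webhookStubbed: boolean): {
  uiPassed: boolean;
  probePassed: boolean;
} {
  const backend = new FakeBackend();
  const runId = "run-1";
  backend.seedCart(runId, 4999); // Arrange
  const toast = backend.checkout(runId, webhookStubbed); // Act
  const order = backend.probeOrder(runId); // Assert
  return {
    uiPassed: toast === "Payment successful",
    probePassed:
      order !== undefined && order.captured && order.amountCents === 4999,
  };
}
```

With `webhookStubbed` set to `true`, `uiPassed` stays `true` while `probePassed` becomes `false`, which is exactly the gap a toast-only assertion cannot see.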
Worked example
Cursor adds Stripe Elements while no seed endpoint exists, so tests click through but never verify capture. Running `/testchimp test` on that PR drives agents to add probes and `// @Scenario:` links before merge. Domain pattern: checkout flows.
Related
Windsurf · Copilot · TestChimp vs Claude · Agent workflow
Frequently asked questions
How is TestChimp different from asking Cursor to generate tests?
Cursor Composer excels at local refactors and one-off specs, but those outputs rarely track markdown scenarios, shared staging fixtures, or production-behaviour gaps. TestChimp orchestrates per-PR QA: agents read your plans folder, update seed/probe routes from `/testchimp init`, and repair SmartTests when CI or TrueCoverage flags a regression—so regen churn in the IDE does not silently drop coverage on checkout or auth flows.
Is TestChimp replacing Cursor?
No—TestChimp orchestrates QA on top of Cursor. Composer still builds features; `/testchimp test` maintains SmartTests, scenario coverage, and TrueCoverage after each PR.
We already use coding agents and have no QA team. Do we still need TestChimp?
Agents alone produce session-scoped tests. TestChimp orchestrates Cursor with markdown plans, CI history, ExploreChimp, and TrueCoverage, and runs `/testchimp test` on every PR, so developers drive QA without a separate QA org.
Agent-written tests failed overnight—how does TestChimp recover?
Because SmartTests live in Git with scenario links, the next `/testchimp test` run sees CI history and TrueCoverage gaps, then opens a fix PR—not a fresh chat thread. Deterministic Arrange/Assert steps fail fast; hybrid AI steps absorb copy or layout churn without rerunning entire agent sessions.
Apply these patterns in your repo
Run `/testchimp init` to connect TestChimp to your repo, then `/testchimp test` on PRs to turn these patterns into maintained SmartTests. Use `/testchimp evolve` when you want to expand coverage as your app grows.