Testing Apps Built with Claude Code
Short answer
Claude Code ships features fast—including one-off Playwright files. TestChimp adds orchestration: markdown plans, /testchimp test on every PR, TrueCoverage, and ExploreChimp so agent output compounds instead of rotting after the first green run.
Who this is for
Engineers who work in Claude Code terminal loops and want QA to run on every merge, not only when someone remembers to ask for tests in chat.
How teams ship with Claude Code
Teams work in terminal-native agent loops: implement a feature, run tests ad hoc, commit. Without a QA system, each session reinvents coverage and ignores gaps between what is tested and how the app behaves in production.
Common QA gaps
| Risk | What goes wrong |
|---|---|
| Session-scoped tests | Tests are not linked to markdown requirements or to the evolve loop |
| Happy-path demos | No probe-backed Assert step for auth, billing, or permissions |
| Missing seed/probe routes | Parallel CI jobs collide on shared staging users |
| No evolve loop | Changes deploy to production without TrueCoverage-driven test expansion |
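One way to avoid the shared-staging-user collision in the table above is to derive each seed identity from the CI run and worker. The following is a minimal sketch, not TestChimp's implementation: the helper `seedUserEmail`, its parameters, and the `example.test` domain are all hypothetical names chosen for illustration.

```typescript
// Hypothetical helper: builds a seed-user email that is unique per CI run and
// worker, so parallel jobs never share the same staging account.
function seedUserEmail(role: string, runId: string, workerIndex: number): string {
  // Reduce the run id to lowercase letters, digits, and single dashes.
  const safeRun = runId
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `${role}+${safeRun}-w${workerIndex}@example.test`;
}

// Two workers in the same (made-up) run get distinct admin users.
const adminForWorker0 = seedUserEmail("admin", "PR-128/run 7", 0);
const adminForWorker1 = seedUserEmail("admin", "PR-128/run 7", 1);
console.log(adminForWorker0); // admin+pr-128-run-7-w0@example.test
console.log(adminForWorker1); // admin+pr-128-run-7-w1@example.test
```

In a Playwright suite, `testInfo.workerIndex` could supply the worker number; where the run id comes from depends on your CI provider.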
Why E2E with probes is non-negotiable
Claude Code can generate plausible-looking tests that never assert server truth. Probes check backend state directly, so they catch authorization and ledger bugs that UI clicks hide.
The TestChimp loop on every PR
TestChimp does not replace your builder—it orchestrates QA on what agents ship:
| Phase | Command | Outcome |
|---|---|---|
| Bootstrap | `/testchimp init` | Seed/probe routes, fixtures, Playwright CI, TrueCoverage |
| Per-PR QA | `/testchimp test` | Agents read markdown plans, author/repair SmartTests, wire `// @Scenario:` links |
| UX risk | `/testchimp explore` | ExploreChimp on SmartTest pathways |
| Post-deploy | `/testchimp evolve` | Close TrueCoverage and plan gaps |
Install the TestChimp skill in your agent IDE. SmartTests remain Playwright tests in Git, with standard traces, reporters, and CI.
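To make the `// @Scenario:` link from the table above concrete, here is a hedged sketch of how such markers could be read back out of a test file. It assumes the marker is a line comment followed by a free-form scenario title and that it sits above the test it describes; the exact syntax and placement TestChimp expects may differ, and `linkedScenarios` is a hypothetical helper, not part of any TestChimp API.

```typescript
// Assumption: a scenario link looks like `// @Scenario: <title>` on its own line.
const SCENARIO_MARKER = /^\s*\/\/\s*@Scenario:\s*(.+?)\s*$/;

// Returns the scenario titles referenced in a test file's source text.
function linkedScenarios(source: string): string[] {
  const titles: string[] = [];
  for (const line of source.split(/\r?\n/)) {
    const title = SCENARIO_MARKER.exec(line)?.[1];
    if (title) titles.push(title);
  }
  return titles;
}

// A made-up spec file, held as text so the sketch runs without Playwright installed.
const exampleSpec = [
  'import { test } from "@playwright/test";',
  "",
  "// @Scenario: Non-admin cannot export",
  'test("standard user is denied export", async () => {',
  "  // ...",
  "});",
].join("\n");

// Logs an array holding the single title "Non-admin cannot export".
console.log(linkedScenarios(exampleSpec));
```

Because the link lives in the test source, it travels with the test through Git history and code review.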
Three realities TestChimp aligns
| Reality | Without orchestration | With TestChimp |
|---|---|---|
| Planned | Scenarios live in chat or Notion | Markdown plans in Git |
| Tested | Session-scoped agent tests | SmartTests in CI, with test-run history |
| Production | Unknown coverage holes | TrueCoverage links RUM data to test runs |
Mismatches between these three realities drive the next `/testchimp test` cycle, rather than another ad hoc prompt.
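The idea of a mismatch between planned, tested, and production can be illustrated as plain set differences. This is only a sketch under simplifying assumptions: it treats each reality as a list of names that match exactly across sources, and the types `Realities` and `Mismatches` and the function `findMismatches` are hypothetical, not how TestChimp computes its signals.

```typescript
// Hypothetical inputs: scenario or flow names gathered from each "reality".
interface Realities {
  planned: string[];    // scenarios written in markdown plans
  tested: string[];     // scenarios covered by CI tests
  production: string[]; // flows observed in real usage
}

interface Mismatches {
  plannedButUntested: string[];
  usedButUntested: string[];
}

// Anything planned or used in production that has no test is a gap.
function findMismatches(r: Realities): Mismatches {
  const tested = new Set(r.tested);
  return {
    plannedButUntested: r.planned.filter((name) => !tested.has(name)),
    usedButUntested: r.production.filter((name) => !tested.has(name)),
  };
}

const gaps = findMismatches({
  planned: ["admin export", "non-admin export denied"],
  tested: ["admin export"],
  production: ["admin export", "bulk export"],
});
console.log(gaps.plannedButUntested); // non-admin export denied
console.log(gaps.usedButUntested);    // bulk export
```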
Example scenario
Situation: Claude Code adds an admin export API and UI in one session.
Expected outcome: Non-admin users receive 403; no export file is generated.
Why UI-only automation breaks: The test logs in only as admin, so a regression that removes the role check goes unnoticed.
- Arrange: Seed a standard user and an admin via the API; the scenarios document the RBAC requirements.
- Act: Playwright attempts the export as the standard user.
- Assert: The probe confirms a 403 response and an empty export queue.
TestChimp workflow: `/testchimp evolve` adds tests when TrueCoverage shows a spike in export usage in production.
Same Arrange/Act/Assert pattern as expired-coupon checkout.
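The Arrange/Act/Assert steps above can be sketched against an in-memory stand-in, so the example runs without a server or Playwright. Everything here is hypothetical: `FakeExportBackend`, its method names, and the 202 success status are illustrative assumptions. In a real SmartTest, the Act step would be driven through Playwright and the seed and probe calls would hit your app's own routes.

```typescript
type Role = "admin" | "standard";

// In-memory stand-in for the app backend (hypothetical; replaces real seed/probe routes).
class FakeExportBackend {
  private roles = new Map<string, Role>();
  private exportQueue: string[] = [];

  // Seed route: create a user with a known role.
  seedUser(userId: string, role: Role): void {
    this.roles.set(userId, role);
  }

  // The action under test: returns an HTTP-like status code.
  requestExport(userId: string): number {
    if (this.roles.get(userId) !== "admin") return 403;
    this.exportQueue.push(userId);
    return 202;
  }

  // Probe route: exposes server state that the UI does not show.
  probeExportQueue(): readonly string[] {
    return [...this.exportQueue];
  }
}

// Arrange: seed both roles so the denied path is exercised, not just the admin happy path.
const backend = new FakeExportBackend();
backend.seedUser("alice", "admin");
backend.seedUser("bob", "standard");

// Act: the standard user attempts the export.
const status = backend.requestExport("bob");

// Assert: the request is denied and nothing was queued.
if (status !== 403) throw new Error(`expected 403, got ${status}`);
if (backend.probeExportQueue().length !== 0) throw new Error("export queue should be empty");
console.log("standard user denied, export queue empty");
```

If a regression removed the role check in `requestExport`, both assertions would fail, whereas a test that only logs in as admin would still pass.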
Worked example
Claude Code adds the admin export UI but never creates authorization tests. `/testchimp test` pulls scenario gaps from the plans and adds SmartTests whose probes assert that non-admin users are denied.
Related
TestChimp vs Claude · Cursor guide · QA on Autopilot
Frequently asked questions
We already use Claude Code for tests—what does TestChimp add?
Claude Code can author Playwright, but without orchestration you get session-scoped scripts that drift after the first green run. TestChimp connects Claude Code to markdown plans, test-run history, ExploreChimp findings, and TrueCoverage—`/testchimp test` on each PR targets scenarios that actually matter for release, not whichever flow was mentioned last in chat.
Is TestChimp replacing Claude Code?
No—it adds QA orchestration: plans, per-PR SmartTest maintenance, ExploreChimp, and TrueCoverage on top of Claude Code authoring.
We already use coding agents. Do we still need TestChimp if we have no QA team?
Agents alone produce session-scoped tests. TestChimp orchestrates Claude Code with markdown plans, CI history, ExploreChimp, and TrueCoverage, running `/testchimp test` on every PR so developers drive QA without a separate QA organization.
Agent-written tests failed overnight—how does TestChimp recover?
Because SmartTests live in Git with scenario links, the next `/testchimp test` run sees CI history and TrueCoverage gaps, then opens a fix PR—not a fresh chat thread. Deterministic Arrange/Assert steps fail fast; hybrid AI steps absorb copy or layout churn without rerunning entire agent sessions.
Apply these patterns in your repo
Run `/testchimp init` to connect TestChimp to your repo, then `/testchimp test` on PRs to turn these patterns into maintained SmartTests. Use `/testchimp evolve` when you want to expand coverage as your app grows.