QA Workflow for Agent-Built Apps

Short answer

Agent-built apps need a repeatable QA loop, not one-off test files. TestChimp connects markdown plans, SmartTests in Git, test runs, TrueCoverage, and /testchimp commands so every merge improves verified coverage.

Who this is for

Any team that ships with Cursor, Claude Code, Copilot, Windsurf, Lovable, Replit, or Codex, merges daily, and cannot rely on session-scoped agent tests alone.

The problem with agent-only QA

Coding agents optimize for local success: the tests they write pass in the current session. Without orchestration you get:

Symptom	Business impact
Demo-only tests	Revenue paths untested under real data
No scenario links	Reviewers cannot see requirement regressions
Missing Arrange layer	Parallel CI flakes on shared staging
No TrueCoverage loop	Production gaps discovered in support tickets
Stale suites	Next agent session rewrites unrelated tests

The TestChimp loop

1. Init (once per repo)

/testchimp init scaffolds seed/probe/teardown routes, fixtures, Playwright CI, and TrueCoverage instrumentation. This is the world-state layer that agent-generated tests must plug into.
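To make the seed/probe/teardown idea concrete, here is a minimal sketch of the pattern, not the code that /testchimp init generates. Every name in it (`seed`, `probe`, `teardown`, `worlds`, the `Order` type) is hypothetical, and an in-memory object stands in for the backend that real routes would talk to. It shows why keying state by run id keeps parallel CI workers from colliding on shared staging.

```typescript
// Hypothetical sketch: none of these names come from TestChimp.
// An in-memory object stands in for the backend behind seed/probe/teardown routes.

type Order = { id: string; total: number };
type World = { orders: Order[] };

const worlds: { [runId: string]: World } = {};

// Seed: create isolated state for one test run (the Arrange step).
function seed(runId: string, orders: Order[]): void {
  if (worlds[runId] !== undefined) {
    throw new Error("run " + runId + " is already seeded");
  }
  worlds[runId] = { orders: orders.slice() };
}

// Probe: read state back so a test can assert on it without going through the UI.
function probe(runId: string): World | undefined {
  return worlds[runId];
}

// Teardown: remove the run's state so later runs start clean.
function teardown(runId: string): void {
  delete worlds[runId];
}

// Two "parallel" runs seed different data under different ids and do not collide.
seed("worker-1", [{ id: "o1", total: 40 }]);
seed("worker-2", [{ id: "o2", total: 15 }, { id: "o3", total: 5 }]);

const worker1Orders = probe("worker-1")!.orders.length;
const worker2Orders = probe("worker-2")!.orders.length;

teardown("worker-1");
const worker1AfterTeardown = probe("worker-1");
```

In a real repo the three functions would likely sit behind HTTP routes and be called from test fixtures; the isolation-by-run-id idea stays the same.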

2. Test (every feature PR)

/testchimp test: agents read markdown plans, extend SmartTests for the PR diff, wire // @Scenario: links, and run scoped suites. Orchestration draws on requirement gaps, CI history, and TrueCoverage, not chat memory alone.
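The scenario links are what make a requirement gap checkable. The sketch below is an illustration only: it assumes a link is a line comment of the form `// @Scenario: <scenario title>` (the exact value format TestChimp expects is not specified here), and the helper names `extractScenarioLinks` and `unlinkedScenarios` are hypothetical. It collects the links from a test file and lists planned scenarios that no test references.

```typescript
// Assumption: a link is a line comment "// @Scenario: <scenario title>".
// Helper names are hypothetical and not part of TestChimp.

function extractScenarioLinks(source: string): string[] {
  const pattern = /^\s*\/\/\s*@Scenario:\s*(.+)$/gm;
  const found: string[] = [];
  let match = pattern.exec(source);
  while (match !== null) {
    found.push((match[1] || "").trim());
    match = pattern.exec(source);
  }
  return found;
}

// Planned scenarios that no test file links to.
function unlinkedScenarios(planned: string[], linked: string[]): string[] {
  return planned.filter((title) => linked.indexOf(title) === -1);
}

// Example data, invented for illustration.
const plannedScenarios = ["Checkout with saved card", "Refund an order"];
const testSource = [
  "// @Scenario: Checkout with saved card",
  "test('pays with the saved card', async () => {});",
].join("\n");

const scenarioLinks = extractScenarioLinks(testSource);
const missingScenarios = unlinkedScenarios(plannedScenarios, scenarioLinks);
```

Here `missingScenarios` holds the one planned scenario without a linked test, which is the kind of gap a reviewer would otherwise not see in a PR.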

3. Explore (UX risk)

/testchimp explore runs ExploreChimp analytics on SmartTest pathways; UX findings roll up via the same scenario spine.

4. Evolve (continuous)

/testchimp evolve closes gaps surfaced by TrueCoverage and test run history after deploys.
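One way to think about closing gaps is as a prioritization problem: journeys that real users exercise most, but that no test covers, come first. The sketch below assumes production usage arrives as per-journey event counts; the `JourneyUsage` shape, the `rankCoverageGaps` function, and the numbers are all hypothetical and do not describe how TrueCoverage stores or ranks data.

```typescript
// Hypothetical sketch: data shape, function name, and numbers are invented.

type JourneyUsage = { journey: string; events: number };

// Keep journeys with no covering test, most-used first.
// filter() returns a new array, so the caller's input is not reordered.
function rankCoverageGaps(usage: JourneyUsage[], tested: string[]): JourneyUsage[] {
  return usage
    .filter((u) => tested.indexOf(u.journey) === -1)
    .sort((a, b) => b.events - a.events);
}

const productionUsage: JourneyUsage[] = [
  { journey: "checkout", events: 900 },
  { journey: "gift-card", events: 120 },
  { journey: "refund", events: 340 },
];
const testedJourneys = ["checkout"];

const rankedGaps = rankCoverageGaps(productionUsage, testedJourneys);
const gapOrder = rankedGaps.map((g) => g.journey);
```

With this data, `gapOrder` lists `refund` before `gift-card`, since refunds see more production traffic and neither journey has a test.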

Three realities to align

Reality	Source	When misaligned
Planned	Markdown scenarios + `@Scenario` links	Features ship without tests
Tested	CI runs + test run telemetry	False confidence from stale suites
Production	TrueCoverage user events	Untested journeys until incidents
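The table can be read as three lists of journeys that should agree. As a rough sketch, assuming each reality can be reduced to a list of journey names (a simplification; the `Alignment` type, the `alignRealities` function, and the sample journeys are hypothetical), the mismatches fall out of a single pass:

```typescript
// Hypothetical sketch: each "reality" is reduced to a list of journey names.

type Alignment = {
  journey: string;
  planned: boolean;
  tested: boolean;
  inProduction: boolean;
};

// Order-preserving de-duplication.
function uniqueNames(items: string[]): string[] {
  const out: string[] = [];
  items.forEach((item) => {
    if (out.indexOf(item) === -1) out.push(item);
  });
  return out;
}

// One row per journey, flagged by which realities contain it.
function alignRealities(planned: string[], tested: string[], production: string[]): Alignment[] {
  return uniqueNames(planned.concat(tested, production)).map((journey) => ({
    journey: journey,
    planned: planned.indexOf(journey) !== -1,
    tested: tested.indexOf(journey) !== -1,
    inProduction: production.indexOf(journey) !== -1,
  }));
}

const alignmentRows = alignRealities(
  ["checkout", "refund"],              // planned: markdown scenarios
  ["checkout", "legacy-export"],       // tested: covered in CI runs
  ["checkout", "refund", "gift-card"]  // production: seen in user events
);

// Planned but untested: features ship without tests.
const plannedUntested = alignmentRows.filter((r) => r.planned && !r.tested).map((r) => r.journey);
// Used in production but untested: gaps found through incidents.
const usedUntested = alignmentRows.filter((r) => r.inProduction && !r.tested).map((r) => r.journey);
// Tested but never seen in production: candidates for a stale-suite review.
const testedUnused = alignmentRows.filter((r) => r.tested && !r.inProduction).map((r) => r.journey);
```

Only `checkout` appears in all three lists here; every other journey lands in at least one mismatch bucket.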

Builder-specific guides

Platform	Guide
Lovable	Testing apps built with Lovable
Cursor	Testing apps built with Cursor
Claude Code	Testing apps built with Claude Code
Replit	Testing apps built with Replit
Copilot	Testing apps built with Copilot
Windsurf	Testing apps built with Windsurf
Codex	Testing apps built with Codex
Vibe-coded	Testing vibe-coded apps

Getting started

1. Connect Git and install the TestChimp skill
2. Run /testchimp init
3. Add markdown scenarios for top revenue paths
4. Gate the next PR with /testchimp test
5. Enable TrueCoverage; schedule /testchimp evolve after deploys

Frequently asked questions

Which agent tools work with this workflow?

Cursor, Claude Code, Copilot, Windsurf, Codex, and others work through the TestChimp skill and MCP. Any agent that can edit your repo and run `/testchimp` commands fits this workflow.

Do we still need markdown test plans?

Yes. They tell agents and reviewers what must stay covered. `/testchimp test` reads scenarios from Git, so the source of truth is traceable scenario links, not chat memory.

We already use coding agents and have no separate QA team. Do we still need TestChimp?

Agents alone produce session-scoped tests. TestChimp orchestrates QA for agent-built apps using markdown plans, CI history, ExploreChimp, and TrueCoverage. Running `/testchimp test` on every PR lets developers drive QA without a separate QA org.

Agent-written tests failed overnight. How does TestChimp recover?

Because SmartTests live in Git with scenario links, the next `/testchimp test` run sees CI history and TrueCoverage gaps, then opens a fix PR instead of starting a fresh chat thread. Deterministic Arrange/Assert steps fail fast; hybrid AI steps absorb copy or layout churn without rerunning entire agent sessions.

Apply these patterns in your repo

Run `/testchimp init` to connect TestChimp to your repo, then `/testchimp test` on PRs to turn these patterns into maintained SmartTests. Use `/testchimp evolve` when you want to expand coverage as your app grows.

