/testchimp evolve — Close coverage and behaviour gaps

Use /testchimp evolve on a schedule (for example nightly or weekly) or when triggered by events you care about—after a deployment, after merge to main, or whenever you want portfolio-level QA improvement without tying everything to a single PR.
While /testchimp test optimizes for one change set, evolve optimizes for the whole product over time: requirement coverage, execution history, and—when TrueCoverage is enabled—gaps between real user behaviour and what your tests exercise (drop-offs, dwell, funnels, high-traffic paths).
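To make the "gap between real user behaviour and what your tests exercise" idea concrete, here is a minimal sketch. It assumes you already have event counts from real usage and the set of events emitted by automated tests. The event names, the numbers, and the `coverage_gaps` helper are hypothetical illustrations, not TestChimp APIs; the platform computes these insights for you.

```python
from typing import Dict, List, Set, Tuple


def coverage_gaps(
    real_usage: Dict[str, int],
    automation_events: Set[str],
    min_traffic: int = 1,
) -> List[Tuple[str, int]]:
    """Return (event, traffic) pairs seen in real usage but never emitted
    by automated tests, sorted by traffic (highest first)."""
    gaps = [
        (event, count)
        for event, count in real_usage.items()
        if event not in automation_events and count >= min_traffic
    ]
    # Sort by descending traffic, then by name for a stable order on ties.
    return sorted(gaps, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical numbers, for illustration only.
real_usage = {
    "login": 1500,
    "checkout_started": 900,
    "coupon_applied": 300,
    "profile_viewed": 40,
}
automation_events = {"login", "checkout_started"}

gaps = coverage_gaps(real_usage, automation_events)
print(gaps)
```

Raising `min_traffic` is one way to ignore rarely used paths when the gap list gets long.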
What evolve does
The agent works with the TestChimp platform, drawing on the same insights you see in the app, and turns them into a plan ordered by business impact. It then implements the tests, instrumentation, and suite hygiene work in that plan.
- Requirement coverage: Call `get_requirement_coverage` with optional scope (`tests/...` or `plans/...` platform paths, file paths, branch, environment). Turn on flags for non-covered stories and scenarios when you need an explicit gap list.
- Execution history: Call `get_execution_history` with a matching scope to see recent pass/fail, flakes, and errors, so fixes target what is actually failing in CI or scheduled runs.
- TrueCoverage (when instrumentation is on): Use the RUM analytics tools to list environments, summarize events and funnels, and drill into details, child trees, transitions, time series, and metadata keys. Compare real usage (for example production-like `baseExecutionScope`) to test-tagged traffic on a staging or automation scope (`comparisonExecutionScope` with `automationEmitsOnly: true` where appropriate) so manual noise does not inflate “covered.”
- Plan: Order work by impact, combining missing requirement coverage, failing history, and high-traffic / high-risk TrueCoverage gaps (for example drop-off and dwell insights).
- Execute: Implement SmartTests, API tests, and instrumentation updates; re-run locally or in CI.
- ExploreChimp (optional, plan-gated): When the evolve plan calls for it, and after new or changed UI tests pass and `markScreenState` / Atlas hygiene meets the same bar as `/testchimp test` Validate, run ExploreChimp on the listed UI SmartTests so TestChimp can analyze DOM, screenshots, console, network (with regex), and performance along those journeys. TrueCoverage signals (for example drop-off, high duration or demand, automation gaps) help pick which specs to explore; tests added in the same evolve cycle are valid targets once they are stable with markers. Requires `EXPLORECHIMP_ENABLED`, `TESTCHIMP_API_KEY`, `TESTCHIMP_BATCH_INVOCATION_ID`, and settings persisted under `plans/knowledge/ai-test-instructions.md` → `## ExploreChimp`. If the plan records `N/A`, skip this step. See `/testchimp explore` for the analytics model.
- Hygiene (as needed): Retire or quarantine obsolete or consistently low-signal tests (after confirming intent), reduce flake by aligning seeds/fixtures, and prune dead instrumentation.
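The Plan step orders work by impact across three signals: missing requirement coverage, failing history, and TrueCoverage gaps. One way to picture that ordering is a simple weighted score. This is only a sketch: the `WorkItem` fields, the weights, and the sample items are hypothetical, and nothing here describes how the agent actually ranks work.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkItem:
    name: str
    missing_requirement: bool  # a scenario with no linked test
    recent_failures: int       # failing or flaky runs in recent history
    traffic_share: float       # fraction of real sessions on this path (0.0 to 1.0)


def impact_score(item: WorkItem) -> float:
    """Combine the three signals into one number (hypothetical weights)."""
    score = 3.0 if item.missing_requirement else 0.0
    score += min(item.recent_failures, 5)  # cap so one noisy test cannot dominate
    score += item.traffic_share * 10.0
    return score


def order_plan(items: List[WorkItem]) -> List[WorkItem]:
    """Return items with the highest impact score first."""
    return sorted(items, key=impact_score, reverse=True)


# Hypothetical backlog, for illustration only.
backlog = [
    WorkItem("profile settings", missing_requirement=False, recent_failures=0, traffic_share=0.02),
    WorkItem("checkout coupon", missing_requirement=True, recent_failures=2, traffic_share=0.30),
    WorkItem("login flake", missing_requirement=False, recent_failures=4, traffic_share=0.90),
]

ordered_names = [item.name for item in order_plan(backlog)]
print(ordered_names)
```

Capping `recent_failures` is a design choice in this sketch, so that a single very flaky test does not outrank a high-traffic path with no coverage at all.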
Overall outcome
- Requirement-aligned coverage improves because tests stay linked to scenarios and gaps are visible and actionable.
- User behaviour–aligned coverage improves when TrueCoverage is enabled: the agent prioritizes journeys that matter in production, not only what is easy to script.
- Insights stay in the loop—TestChimp aggregates traceability, runs, and RUM so the team gets a single place to see whether quality is converging, not a one-off spreadsheet per release.
- When the plan opts in, ExploreChimp surfaces UX-oriented issues on high-signal user paths your SmartTests already reach, closing the loop between production behaviour (TrueCoverage) and checkpointed test telemetry.
That is the closed feedback loop: measure (plans + runs + TrueCoverage) → decide (evolve plan) → change (tests and app emits) → optional ExploreChimp on UI slices → measure again.
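The optional ExploreChimp hop in this loop is gated by the plan and by three environment variables. A preflight check along the following lines could decide whether to run it. This is a sketch under stated assumptions: the `explorechimp_blockers` helper and the `plan_entry` string are hypothetical, and because the accepted values of `EXPLORECHIMP_ENABLED` are not documented here, the check only tests that each variable is set to a non-empty value.

```python
import os
from typing import List, Mapping, Optional

# Variable names come from the evolve description; accepted values are not assumed.
REQUIRED_VARS = (
    "EXPLORECHIMP_ENABLED",
    "TESTCHIMP_API_KEY",
    "TESTCHIMP_BATCH_INVOCATION_ID",
)


def explorechimp_blockers(
    plan_entry: str,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return reasons to skip ExploreChimp; an empty list means it can run."""
    env = os.environ if env is None else env
    # A plan that records N/A skips the step regardless of environment.
    if plan_entry.strip().upper() == "N/A":
        return ["plan records N/A"]
    return [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]


# Hypothetical environment with one variable missing.
partial_env = {"EXPLORECHIMP_ENABLED": "true", "TESTCHIMP_API_KEY": "example-key"}
print(explorechimp_blockers("explore checkout journey", partial_env))
```

Passing `env` explicitly keeps the check easy to test; in CI you would omit it and let the function read `os.environ`.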
When to run evolve vs test
| Situation | Prefer |
|---|---|
| A specific PR is ready for QA automation | /testchimp test |
| You want portfolio-level gap closure, post-release hygiene, or periodic quality improvement | /testchimp evolve |
You can run both: test after each meaningful PR; evolve on a cadence or after major releases so strategic coverage catches up with product and real usage.
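If you wire both commands into automation, the choice above can be reduced to a lookup on what triggered the run. The trigger names below are hypothetical labels chosen for this sketch; map them to whatever your CI system actually reports.

```python
# Hypothetical trigger labels; adjust to the events your CI system emits.
PR_TRIGGERS = {"pull_request"}
PORTFOLIO_TRIGGERS = {"schedule", "deployment", "merge_to_main", "release"}


def preferred_command(trigger: str) -> str:
    """Pick the TestChimp command that fits the kind of trigger."""
    if trigger in PR_TRIGGERS:
        return "/testchimp test"
    if trigger in PORTFOLIO_TRIGGERS:
        return "/testchimp evolve"
    raise ValueError(f"no preference defined for trigger {trigger!r}")


print(preferred_command("pull_request"))
print(preferred_command("schedule"))
```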
See also
- `/testchimp explore`: ExploreChimp as the primary command; same telemetry model as the optional ExploreChimp step of evolve above.
- `/testchimp test`: PR workflow including Phase 5 ExploreChimp when the branch plan opts in.
- TrueCoverage intro: Concepts and dashboards.
- Screen-State Annotations: The `markScreenState` fixture to set up before enabling ExploreChimp.
- QA Intelligence: Broader analytics across plans, tests, and behaviour.