Every software developer knows the sinking feeling that comes with a red line in a test report. Something broke. Maybe it’s your fault; maybe it’s not. Either way, the work starts — find what changed, fix it, and get everything green again.

But what if the tests themselves could handle some of that work?

For decades, testing has been the passive part of development — it reports on the codebase, but doesn’t change or fix it. Yet as AI becomes part of the loop, that is beginning to change.

Today, the same AI that generates and runs the tests can also help repair them.

When the Ground Beneath You Moves

No test exists in a vacuum. Every one of them depends on the shape of the system it’s testing — its user interface, data structures, and logic.

In a project like Divi Booster, even a minor update to the WordPress core, or a small CSS refactor in Divi itself, can cause ripples through my acceptance tests. A renamed class, a new layout, a slightly different sequence of interactions — and suddenly the test can’t find the button it relies on.

Traditionally, I’d open the test, scan the failure, and patch the relevant step by hand. That might mean updating a CSS selector or changing the expected message. It’s not difficult, but it’s tedious — and when you have hundreds of tests, the time adds up fast.

Now, there’s another option: let the system fix itself.

Because the AI already knows how to navigate the environment and generate working steps, it can attempt to regenerate the broken parts of a test on its own.

Regeneration: Starting Over, Smarter

The simplest approach is full regeneration — when a test fails, the agent just rewrites it from scratch.

This works well when the interface has changed significantly. The agent simply re-runs the goal — “hide gallery arrows,” for example — and builds a new test from the current state of the system. It doesn’t need to know what changed; it just needs to rediscover how to accomplish the same task under the new conditions.

It’s the same process we described earlier in the book: observe the state, reason about what to do, perform an action, and validate the result.
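
As a rough sketch, assuming a hypothetical agent object that wraps both the browser and the language model (none of these names come from a real library), the loop looks something like this:

```python
# A rough sketch of full regeneration: rebuild the test from scratch by looping
# observe -> reason -> act -> validate until the goal is reached.
# The agent object and all of its methods are hypothetical placeholders.

def regenerate_test(agent, goal, max_steps=20):
    """Build a fresh test for `goal` against the current state of the system."""
    steps = []                                            # the regenerated test script
    for _ in range(max_steps):
        state = agent.observe()                           # observe: HTML snapshot, screenshot, URL
        if agent.goal_satisfied(goal, state):             # validate: are we done yet?
            return steps
        action = agent.propose_action(goal, state, steps) # reason: ask the model for the next step
        result = agent.perform(action)                    # act: run the step in the browser
        if result.ok:                                     # validate: keep only steps that worked
            steps.append(action)
    raise RuntimeError(f"Could not regenerate a test for goal: {goal!r}")
```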

The regeneration approach is brute force — but elegant in its simplicity. When the terrain shifts, rebuild the map.

Still, starting over isn’t always necessary. Often, a failure occurs halfway through a test that was otherwise working perfectly. For those cases, more nuanced strategies come into play.

Targeted Repair and Fail-Point Recovery

When a test fails midway, the agent can identify the fail point — the exact step where things stopped working — and focus its attention there.

Rather than recreating everything, it can simply rerun the test up to the last successful step, capture the current state of the page, and begin rebuilding from that point forward.

This is where the system’s ability to capture HTML snapshots and screenshots of the environment really pays off.

Armed with that state, it can begin generating a revised command sequence to achieve the same outcome under the new conditions.
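
Here's how that fail-point recovery might look as a sketch, again with the agent, the test object, and the capture calls standing in as hypothetical placeholders:

```python
# A sketch of fail-point recovery: replay the known-good prefix of a broken test,
# capture the page state where it failed, and regenerate only what comes after.
# The agent, the test object, and their methods are illustrative assumptions.

def repair_from_fail_point(agent, test, goal):
    fail_index = test.last_failed_step_index()   # where the previous run stopped working

    # Replay the steps that still pass, restoring the system to its last good state.
    for step in test.steps[:fail_index]:
        agent.perform(step)

    # Capture the environment as it actually is right now.
    snapshot = agent.capture_html()
    screenshot = agent.capture_screenshot()

    # Ask the model for a revised sequence that finishes the same goal from here
    # (it can lean on the step library for the later steps), then splice it onto
    # the prefix that still works.
    new_tail = agent.generate_steps(goal, snapshot, screenshot)
    return test.steps[:fail_index] + new_tail
```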

The beauty of the step library mechanisms described earlier is that only the specific failing steps need to be regenerated. Subsequent steps, if they still work, will likely be pulled from the library. The agent, in effect, heals only the part of the test that needs it.
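
A sketch of that reuse-first behavior, with the library and the agent as hypothetical placeholders:

```python
# A sketch of reuse during repair: before asking the model to invent a replacement
# step, check whether a step already recorded in the library still passes.
# The library and agent objects are hypothetical placeholders.

def reuse_or_regenerate(agent, library, intent, snapshot):
    """Return a working step for `intent`, preferring known steps over new ones."""
    for candidate in library.lookup(intent):        # steps recorded from earlier runs
        if agent.dry_run(candidate, snapshot).ok:   # does it still pass on the current page?
            return candidate
    return agent.generate_step(intent, snapshot)    # only now generate a fresh step
```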

Incremental Validation and Learning

What makes this process powerful is that each successful repair teaches the system something about resilience. When a broken selector is replaced with a more robust one, or when a renamed element is matched semantically instead of literally, that fix becomes part of the library of reusable skills.

Next time a similar issue arises, the agent won’t have to guess; it already knows how to adapt.
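
As an illustration only, and not the implementation behind my own tests, the library at the heart of this could be as simple as a record of which steps have proven themselves for a given intent:

```python
# An illustrative step library: a record of which steps have proven themselves
# for a given intent, newest first, so the most recently repaired (and usually
# more robust) step is tried before older, more fragile ones.

from collections import defaultdict

class StepLibrary:
    def __init__(self):
        self._steps = defaultdict(list)   # intent -> known-good steps, newest first

    def record(self, intent, step):
        """Store a step that just worked for this intent."""
        if step in self._steps[intent]:
            self._steps[intent].remove(step)
        self._steps[intent].insert(0, step)   # promote it to the front of the list

    def lookup(self, intent):
        """Return candidate steps for this intent, most recently proven first."""
        return list(self._steps[intent])
```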

Over time, the system begins to steer clear of the things that tend to cause breakage: the fragile points, the brittle dependencies, the patterns that drift. It can even start to favor test commands that are more robust by design, perhaps locating elements by XPath rather than CSS selectors, or waiting for visible changes instead of arbitrary timeouts.
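
For instance, a repaired step might swap an arbitrary delay for an explicit wait on the visible change it actually depends on. A hypothetical helper along those lines:

```python
# A sketch of one robustness upgrade: instead of sleeping for a fixed, arbitrary
# duration and hoping the UI has settled, poll for the visible change the step
# actually cares about. The agent.is_visible check is a hypothetical stand-in.

import time

def wait_for_visible(agent, locator, timeout=10.0, poll=0.25):
    """Wait until the element at `locator` is visible, or fail loudly."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if agent.is_visible(locator):   # the condition the test really depends on
            return
        time.sleep(poll)                # short poll interval, not a blind delay
    raise TimeoutError(f"Element never became visible: {locator}")

# Fragile:  time.sleep(5); agent.click(locator)
# Sturdier: wait_for_visible(agent, locator); agent.click(locator)
```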

What begins as patching slowly turns into preventative adaptation.

The Blurring Line Between Debugging and QA

As these repair mechanisms mature, the boundary between testing and debugging starts to dissolve.

If a testing system can identify its own failure, reason about the cause, and repair itself, it's no longer just a verification tool; it's a participant in maintenance. And if it can take the next step, using that same reasoning to repair the feature code rather than the test, the distinction disappears entirely.

A failed test isn’t the end of the process; it’s the start of recovery.

For a solo developer, this changes the economics of stability. Instead of spending hours chasing fragile tests after every update, I can let the system handle the small repairs and focus my attention where it’s truly needed — on the big picture, the conceptual shifts that require human judgment.

The test systems, meanwhile, continue their quiet work, healing themselves, adapting, and learning — the invisible caretakers of a living codebase.