The Feedback Loop

When an AI agent writes a step's worth of test code, that step is only a hypothesis — a theory about how the world should respond. It’s not until the code actually runs that we find out whether the hypothesis holds true. This moment — when code meets reality — is where...
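A minimal sketch of that loop, with a hypothetical runner standing in for a real browser session: each generated step is executed, and the outcome — success or the error it raised — is fed back so the agent can revise its hypothesis. The callable steps and dict-based environment are illustrative assumptions, not the agent's actual interface.

```python
def run_step(step_code, env):
    """Execute one step of generated test code and report what happened.

    `step_code` is a callable and `env` a plain dict here, as stand-ins
    for generated browser code and a live WordPress page.
    """
    try:
        result = step_code(env)
        return {"ok": True, "observation": result}
    except Exception as exc:  # the failure itself is useful feedback
        return {"ok": False, "observation": repr(exc)}

# A step whose hypothesis holds: the settings page exposes a save button.
env = {"buttons": ["Save Changes"]}
good = run_step(lambda e: "Save Changes" in e["buttons"], env)

# A step whose hypothesis fails: the element it expects does not exist.
bad = run_step(lambda e: e["buttons"].index("Publish"), env)

print(good["ok"], bad["ok"])  # True False
```

Either way the agent learns something: a passing step confirms its model of the page, and a failing one returns the exact error to reason about.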

Generating Step Code

Once the AI agent decides what it needs to do next, it still has one crucial task left: to turn that decision into code — the executable instructions that actually perform the action in the browser. This is where the test comes to life, each line converting the...
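One way to picture that translation step: the agent's decision arrives as structured data, and a small renderer turns it into an executable line. The action names and the Playwright-style output below are assumptions for illustration; the real agent's decision format and target API may differ.

```python
def decision_to_code(decision):
    """Render a structured decision into one line of browser-driving code.

    The {"action": ..., "selector": ...} shape is a hypothetical format.
    """
    action = decision["action"]
    if action == "click":
        return f'await page.click("{decision["selector"]}")'
    if action == "fill":
        return f'await page.fill("{decision["selector"]}", "{decision["value"]}")'
    raise ValueError(f"unknown action: {action}")

line = decision_to_code({"action": "fill",
                        "selector": "#site-title",
                        "value": "My Gallery"})
print(line)  # await page.fill("#site-title", "My Gallery")
```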

Setting Goals and Sub-Goals

Each acceptance test we task our AI agent with creating needs a high-level goal: a short description of what the test should achieve. At first glance, it might seem enough to say, “Write a test that checks if the gallery arrows can be hidden.” But the precision of...
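One way to make a goal like that precise enough to act on is to pair the high-level description with explicit sub-goals the agent can check off one at a time. The field names and the sub-goal wording below are illustrative assumptions, not the agent's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class TestGoal:
    description: str  # the high-level goal, in plain language
    sub_goals: list = field(default_factory=list)  # ordered, checkable steps

goal = TestGoal(
    description="Check that the gallery arrows can be hidden",
    sub_goals=[
        "Open the gallery block's settings panel",
        "Toggle the option that shows navigation arrows off",
        "Verify the arrows are no longer rendered on the page",
    ],
)

print(len(goal.sub_goals))  # 3
```

Decomposing the goal this way gives the agent smaller hypotheses to test, and a natural place to record which steps have already succeeded.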

Sensing the Environment

Every intelligent system needs a world to operate in. For Voyager, that world was Minecraft — a sandbox of forests, caves, and creatures. For our testing agent, the world is a WordPress site, and it is just as rich: a landscape of menus, buttons, fields, and pages,...
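A minimal sketch of what "sensing" that landscape could mean: reducing a page to the interactive elements the agent can act on. Real observations would come from a live browser; here Python's stdlib HTML parser stands in, and the sample markup is an assumption.

```python
from html.parser import HTMLParser

class InteractiveElements(HTMLParser):
    """Collect buttons, links, and form controls from raw HTML."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        if tag in ("a", "button", "input", "select"):
            self.elements.append((tag, dict(attrs)))

# Hypothetical fragment of a WordPress admin page.
page_html = """
<nav><a href="/wp-admin">Dashboard</a></nav>
<button id="save">Save Changes</button>
<input name="site-title" type="text">
"""

sensor = InteractiveElements()
sensor.feed(page_html)
print([tag for tag, _ in sensor.elements])  # ['a', 'button', 'input']
```

The resulting list is the agent's view of the world at that moment: everything it could click, fill, or choose from on the current page.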

Translating Voyager to Testing

What made Voyager remarkable wasn’t that it played Minecraft — it was that it learned through interaction. It explored an environment, noticed what worked, and built on its successes. That same principle applies surprisingly well to software testing. A WordPress site...