When an AI agent writes a step's worth of test code, that step is only a hypothesis: a theory about how the world should respond. Only when the code actually runs do we find out whether the hypothesis holds. This moment, when code meets reality, is where...
Once the AI agent decides what it needs to do next, it still has one crucial task left: to turn that decision into code — the executable instructions that actually perform the action in the browser. This is where the test comes to life, each line converting the...
Each acceptance test we task our AI agent with creating needs a high-level goal: a short description of what the test should achieve. At first glance, it might seem enough to say, “Write a test that checks if the gallery arrows can be hidden.” But the precision of...
Every intelligent system needs a world to operate in. For Voyager, that world was Minecraft: a sandbox of forests, caves, and creatures. For our testing agent, the world is a WordPress site, and it is just as rich: a landscape of menus, buttons, fields, and pages,...
What made Voyager remarkable wasn’t that it played Minecraft; it was that it learned through interaction. It explored an environment, noticed what worked, and built on its successes. That same principle applies surprisingly well to software testing. A WordPress site...