Once the AI agent decides what it needs to do next, it still has one crucial task left: to turn that decision into code — the executable instructions that actually perform the action in the browser. This is where the test comes to life, each line converting the agent’s reasoning into a tangible move on the screen.

The code for a single action may be quite simple. Something like our earlier example:

$I->click('Publish');

But to turn this into a full acceptance test “step”, we’ll extend it a bit. Each step in our acceptance tests is built from these three parts — the action, the wait, and the documentation.

Generating the Action

Every step begins with an action. This is the line that performs the UI interaction itself: clicking a button, typing text into a field, selecting a menu item.

To generate it, I pass a detailed bundle of context to an LLM, usually the latest GPT model from OpenAI (currently GPT-5). That context includes three key things, sketched just after this list:

  • The current state of the browser (HTML, screenshot, and relevant logs).
  • The next step in the goal sequence, as the agent has just reasoned it.
  • The existing test so far, so it knows what’s already been done and what remains.
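In practice, that context can be packaged as a single structured bundle. Here is a minimal sketch of what it might look like; the field names and the captureHtml(), captureScreenshot(), and readBrowserLogs() helpers are illustrative stand-ins, not the actual implementation:

// A hypothetical context bundle for the action-generation prompt.
$context = [
    'browser_state' => [
        'html'       => captureHtml(),        // current DOM snapshot
        'screenshot' => captureScreenshot(),  // base64-encoded PNG
        'logs'       => readBrowserLogs(),    // relevant console/network logs
    ],
    'next_step'   => $nextGoal,   // the step the agent has just reasoned out
    'test_so_far' => $testSoFar,  // the Codeception code generated so far
];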

Alongside this, I include a set of similar steps drawn from a library built up from previous tests (more on this later). When the agent sees a familiar operation, it can often adapt an existing snippet instead of inventing a new one from scratch.

The model then generates a Codeception command that it thinks will perform the required action. Each such action is the agent’s best guess at what will work in the browser’s current state.
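As a rough sketch, the generation call might look like the following, reusing the $context bundle from above. Here $similarSteps stands in for the retrieved library snippets, the prompt wording is simplified, and a real call would also attach the screenshot as an image input:

// Ask the model for a single Codeception command (a sketch, not the
// actual prompt or client code used in the real system).
$ch = curl_init('https://api.openai.com/v1/chat/completions');
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST           => true,
    CURLOPT_HTTPHEADER     => [
        'Content-Type: application/json',
        'Authorization: Bearer ' . getenv('OPENAI_API_KEY'),
    ],
    CURLOPT_POSTFIELDS => json_encode([
        'model'    => 'gpt-5',
        'messages' => [
            ['role' => 'system',
             'content' => 'Write one Codeception command that performs the requested UI action.'],
            ['role' => 'user',
             'content' => json_encode($context) . "\n\nSimilar steps:\n" . $similarSteps],
        ],
    ]),
]);
$reply  = json_decode(curl_exec($ch), true);
$action = $reply['choices'][0]['message']['content']; // e.g. "$I->click('Publish');"
curl_close($ch);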

Waiting for Completion

Actions alone aren’t enough. The agent also needs to know when they’ve finished.

In Codeception, this is done through wait commands — lines that tell the test to pause until some condition is met. After a “Publish” click, for example, the web page might show a “Page published” message. After opening a settings panel, a specific element might appear on screen.

So for each action the agent writes, it also generates a corresponding wait command, such as:

$I->waitForText('Page published');
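And for a step that opens a settings panel, the wait might target a specific element instead. Codeception’s waitForElement accepts a selector and a timeout in seconds; the selector below is hypothetical:

$I->waitForElement('.et-fb-settings-panel', 10);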

This is more than a formality. Without it, the test might charge ahead before the system is ready, clicking buttons that haven’t yet appeared or validating pages that are still loading. The wait command enforces patience, and keeps the agent and the browser in sync.

It’s also an important part of how the AI learns what success looks like. When the condition it’s waiting for appears, it’s a confirmation: the step worked. When it doesn’t, that absence becomes data — a signal for correction or retry.
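In sketch form, the loop around each step might look something like this; runStep() and regenerateStep() are hypothetical helpers standing in for the real execution and correction machinery:

// Run the generated step; a wait that times out throws, and that
// failure is fed back to the model as a signal to revise the step.
$attempts = 0;
do {
    try {
        runStep($action, $wait);  // perform the action, then the wait
        break;                    // wait condition met: step confirmed
    } catch (\Exception $e) {
        // The expected condition never appeared; regenerate the step
        // with the failure message as extra context.
        [$action, $wait] = regenerateStep($action, $wait, $e->getMessage());
    }
} while (++$attempts < 3);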

Documenting the Step

The final piece of each step is documentation.

When I used to write acceptance tests manually, I found it helpful to have my tests produce a Markdown file with a human-readable summary of what each step was doing: a title, a short paragraph, and a screenshot. It made the tests easier to review and, as we’ll see later, to use as the basis for user-facing documentation. To aid in this, I created a custom Codeception Helper: a set of commands that can be added to tests to write to the corresponding Markdown file.
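A minimal sketch of what such a helper might look like: Codeception helpers extend \Codeception\Module, and every public method becomes an $I-> command in the test. The file path and method bodies here are illustrative, not the real implementation:

namespace Helper;

// Illustrative documentation helper; the real one is more elaborate.
class Docs extends \Codeception\Module
{
    protected $docFile = 'tests/_output/docs.md';  // assumed output path

    public function addSubsubtitle(string $title): void
    {
        file_put_contents($this->docFile, "### {$title}\n\n", FILE_APPEND);
    }

    public function addParagraph(string $text): void
    {
        file_put_contents($this->docFile, "{$text}\n\n", FILE_APPEND);
    }
}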

Now, every automatically generated step includes these documentation commands. The agent writes lines like these:

$I->addSubsubtitle("Publish Your Changes");
$I->addParagraph("To apply your edits, locate the “Publish” button in the Divi Builder’s toolbar. Click it once, and Divi will save and publish your updated layout to the live site.");

After the action runs, it also adds a screenshot — sometimes of the entire page, sometimes of the element in question with a highlight box drawn around it.

$I->addScreenshotOfElement('Publish Button', '.et-fb-publish-button');
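Inside the same helper, a screenshot command along these lines could outline the target element via JavaScript, capture the page, and link the image from the Markdown file; the highlight styling and file naming are assumptions:

public function addScreenshotOfElement(string $caption, string $selector): void
{
    // Grab the underlying php-webdriver instance from the WebDriver module.
    $webDriver = $this->getModule('WebDriver')->webDriver;
    $element = $webDriver->findElement(
        \Facebook\WebDriver\WebDriverBy::cssSelector($selector)
    );

    // Draw a highlight box around the element before capturing.
    $webDriver->executeScript(
        "arguments[0].style.outline = '3px solid red';",
        [$element]
    );

    $file = 'tests/_output/' . md5($caption) . '.png';
    $webDriver->takeScreenshot($file);
    file_put_contents($this->docFile, "![{$caption}]({$file})\n\n", FILE_APPEND);
}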

This layer of documentation transforms the test into something shareable and useful beyond verification: a human-readable record of how the feature behaves.

Bringing It All Together

When you look at the complete code for a single step, you can see the pattern clearly:

// Publish the page
$I->addSubsubtitle("Publish Your Changes");
$I->addParagraph("To apply your edits, locate the “Publish” button in the Divi Builder’s toolbar. Click it once, and Divi will save and publish your updated layout to the live site.");
$I->addScreenshotOfElement('Publish Button', '.et-fb-publish-button');
$I->click('Publish');
$I->waitForText('Page published');

There’s an action that performs the task, a wait that confirms it’s done, and a set of documentation lines that describe and illustrate it.

Individually, these are small steps. But collectively, they form the rhythm of an intelligent testing process — one that doesn’t just act, but observes, confirms, and records its understanding as it goes.

Each line becomes a unit of verified knowledge. And when stitched together, these steps form something greater than the sum of their parts: a working, trustworthy acceptance test.