This book is a work in progress, and subject to change.
Part I: The Idea
- Teaching an AI to Write Acceptance Tests
How a time-consuming part of software development inspired an experiment in AI.
- An AI Learns to Play Minecraft
How an AI that learned to play Minecraft revealed a powerful general-purpose algorithm for agent design.
- Translating Voyager to Testing
How the same loop of goal-setting, acting, and feedback used by Voyager can be applied to automated software testing.
Part II: Building the System
- Sensing the Environment
How capturing browser state, screenshots, and logs lets the AI agent “see.”
- Setting Goals and Sub-Goals
How high-level testing objectives are broken into steps toward success.
- Generating Step Code
How the agent translates intent into working test actions, waits, and documentation.
- The Feedback Loop
How the agent self-corrects failing code.
- Building the Library of Reusable Skills
How the agent builds faster and smarter over time.
- Case Study: Tencent’s XUAT-Copilot
How Tencent independently developed a similar system for AI-driven testing, and what it reveals about the field’s future.
Part III: The Challenges
- Speed and Scale
How to deal with the slowness of browser-based testing.
- Stability and Context Drift
How to keep the agent working reliably as the world around it changes.
- Evaluation and Validation
How to ensure that AI-generated tests are correct and trustworthy.
Part IV: Beyond Testing
- Tests as Documentation
How an AI-generated acceptance test can become user-facing documentation and videos.
- Self-Healing Tests and Automated Bug Fixing
How to fix broken tests and bugs in the feature being tested.
- AI-Assisted Feature Development
How the agent can verify and repair AI-generated feature code.
- Synthetic Users and Interface Design
How simulated user personas can turn automated tests into usability studies that improve feature design.
- Systems That Learn
How code, tests, and documentation begin to co-evolve, and what it means for developers.
Optional Appendices
- Appendix A: Technical Deep Dive
Example prompts, system architecture, and test logs.