The Enduring Value of TDD: Verification in the Age of AI
Explore the paradox of TDD practitioners and AI code generation. Learn why verification is more critical than ever, empowering coding agents to build robust software and ensuring quality.
The question, "how will we test this?" echoes the foundational principles of agile development, and it is as relevant today as ever. An interesting paradox has emerged in the software industry around AI code generation:
- Developers proficient in agile engineering practices, such as test-driven development (TDD), often express skepticism about AI code generation, fearing a decline in software quality.
- Conversely, these same TDD-experienced developers are frequently the most successful at leveraging coding agents to build exceptional software, precisely because they employ creative techniques that empower agents to verify their own work.
During the late 2000s, a hallmark of a proficient programmer was that, presented with a complex task, they immediately asked, "how will we test this?" Agile practitioners understood then that nearly everything else depended on establishing rapid, reliable, and automated ways to verify that code worked. Without robust tests, aggressive refactoring, frequent deployments, and safe deletions become impossible. While the 2010s saw many adopt patterns and heuristics that sometimes traded testing rigor for pragmatism, the core skill of devising code verification strategies remained indispensable.
Fast forward to today: with coding agents such as Claude Code and Codex CLI, the only truly critical factor is ensuring they possess the necessary tools to independently verify the correctness of their output.
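What "tools to independently verify their output" means in practice can be as simple as a single command whose exit code the agent can trust. The sketch below is a hypothetical illustration, not from any particular agent's implementation: the `verify` helper runs a command (in a real project, the test suite) and reduces the result to an unambiguous pass/fail signal.

```python
import subprocess
import sys

def verify(command: list[str]) -> bool:
    """Run a verification command; the exit code is the one signal to trust."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0

# In a real project the command might be the test suite, e.g.:
#   verify([sys.executable, "-m", "pytest", "-q"])
# Here we stand in with trivial one-liners:
print(verify([sys.executable, "-c", "assert 1 + 1 == 2"]))  # True
print(verify([sys.executable, "-c", "assert 1 + 1 == 3"]))  # False
```

The point is not the helper itself but the contract it creates: after every change, the agent has one cheap, deterministic check to run instead of guessing whether its work is correct.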
Why is independent verification so important? Without it, an agent tasked with an unverifiable action is, much like a human developer, left to guess. The speed at which agents operate means that an initial guess rapidly leads to a cascade of increasingly speculative assumptions. Often, when reviewing an agent's work after a short period, a perceived "mess" isn't due to the AI becoming unintelligent; rather, it's frequently the result of an environmental failure—like an application server crash or an unresponsive web browser—forcing the agent into speculative and defensive coding.
Many essential tools for agents require access to resources humans often take for granted: a sandboxed computing environment free of arbitrary restrictions, an MCP server for iOS Simulator interaction, or a built-in browser providing real-time visual feedback. Fortunately, for seasoned TDD practitioners ("graybeards"), the ability to discern a lack of sufficient verification is a deeply ingrained instinct. Furthermore, we possess the expertise to know when to apply techniques like image regression testing, mutation testing, or characterization testing, and the skill to configure suitable test harnesses for each. Out-of-the-box, most coding agents—similar to many human developers—lack fundamental understanding of these advanced testing methodologies, let alone how to implement them effectively.
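Of the techniques named above, characterization testing is perhaps the easiest to sketch. The idea: before touching untested legacy code, record what it currently does for representative inputs, then freeze those observations as the expected values so any refactoring can be checked against them. The `legacy_price` function and its recorded outputs below are hypothetical stand-ins, not from the article.

```python
def legacy_price(quantity: int) -> float:
    # Imagine this logic is inherited, untested, and poorly understood.
    price = quantity * 9.99
    if quantity > 10:
        price *= 0.9  # an undocumented discount rule
    return round(price, 2)

# Step 1: run the code on representative inputs and record the outputs.
# Step 2: freeze those observed outputs as the expected values.
CHARACTERIZED = {1: 9.99, 10: 99.9, 11: 98.9, 50: 449.55}

def test_characterization():
    for quantity, observed in CHARACTERIZED.items():
        assert legacy_price(quantity) == observed, quantity

if __name__ == "__main__":
    test_characterization()
    print("behavior unchanged")
```

Note that a characterization test asserts what the code *does*, not what it *should* do; its value is that an agent (or human) can now refactor aggressively and get an immediate signal if observable behavior drifts.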
It was therefore gratifying to see Simon Willison emphasize verifiability, signaling a potential readiness within the industry to address this long-standing fundamental. Achieving true "super-intelligence" in coding necessitates genuine reinforcement learning, which in turn depends on our collective ability to significantly improve how we integrate verifiability into our software development workflows.
Therefore, if you are a veteran of the agile era, the craftsmanship movement, or simply passionate about producing high-quality code, and are concerned about AI's impact on software quality, you hold more influence than you might realize. The principles you value are now more critical than ever. Moreover, developers who previously overlooked these practices for years now face an urgent imperative to engage with the very topics you champion. Take heart.