
The Undervalued QAE and AI
In the last 10 years, I’ve been lucky enough to work with teams that had quality assurance engineers (QAEs) and engineers focused on testing. Good QAEs are detail-oriented, not afraid of repetitive work, and obsessed with ensuring things work “as designed.” That’s important. Engineers often take a “close enough” mindset and don’t even notice when something doesn’t look quite right, doesn’t use the right words, or “interprets” a workflow differently than intended. Not so for a QAE.
I continue to explore AI coding and am getting ever more into vibe coding. There’s real excitement in seeing just how much an AI coder can build in such a short time, but in almost all cases small misses gradually turn into cracks, and sometimes those cracks become breaking issues quite quickly. These range from UI elements not working as expected to a newly introduced library that, at the first sign of trouble, gets replaced with a completely different and often hand-rolled approach, leaving a mess of inconsistent patterns throughout the codebase.
I started looking into custom modes in RooCode/KiloCode, and in a recent experiment my solution was to create a QAE custom mode. I first configured the QA mode (via an initialization workflow in the mode description) to read all project documentation to understand what we were going to build. Then it would create a testing strategy markdown document if one didn’t already exist, capturing how we plan to test, what we plan to test, and our testing approaches. I also specified that the QA mode should always reflect on this document before doing anything else: one more way to avoid introducing conflicting ideas with every single task.
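To make that concrete, here’s a minimal sketch of what such a project-level mode definition might look like in a `.roomodes` file. The exact schema varies across Roo Code/Kilo Code versions (newer builds accept YAML, older ones an equivalent JSON structure), and the role text and the `docs/testing-strategy.md` path are paraphrases of my setup rather than verbatim copies:

```yaml
# Project-level .roomodes file. Treat this as a sketch; field names
# may differ slightly depending on your Roo Code/Kilo Code version.
customModes:
  - slug: qa-engineer
    name: QA Engineer
    # Paraphrase of the role I gave it, not the verbatim text.
    roleDefinition: >-
      You are a detail-obsessed quality assurance engineer. You verify that
      delivered work matches the documented design, not just that it
      "roughly works".
    # The initialization workflow lives in the instructions; the
    # docs/testing-strategy.md path is my own convention.
    customInstructions: >-
      On first activation, read all project documentation to understand what
      we are building. If docs/testing-strategy.md does not exist, create it,
      capturing how we plan to test, what we plan to test, and our testing
      approaches. Before starting any task, re-read docs/testing-strategy.md
      and stay consistent with it.
    groups: [read, edit, command]
```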
I then configured the coding mode to complete its work by summarizing what was delivered in a markdown document and handing it over to the QA mode. For anyone who has developed software in the real world, this might sound obvious.
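The handoff itself was just a markdown file the coding mode had to fill in before declaring a task done. The file name and headings below are my own convention, not anything built into the tools:

```markdown
<!-- docs/handoffs/<feature-name>.md: written by the code mode, read by QA -->
# Delivery Summary: <feature name>

## What was delivered
- Endpoints, screens, or modules added or changed

## How it was implemented
- Key design decisions and any new libraries introduced

## What QA should focus on
- Known risk areas and anything intentionally out of scope
```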
I then began working with the orchestrator on a task. The coder developed the feature, generated a summary for review, and the QA mode took over as verifier. Not only did it write unit tests, it also generated manual testing procedures with curl commands for the API. When it came to UI testing, it walked me through manual tests of the interface: it set up the workflow, defined the expected values, and asked me to confirm each step.
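The API procedures it produced looked roughly like this; the endpoint, port, and payload here are hypothetical stand-ins, since the real ones are specific to my project:

```bash
# Manual API check in the style the QA mode generated (hypothetical endpoint).
# Step 1: create a task and note the returned id.
curl -s -X POST http://localhost:3000/api/tasks \
  -H "Content-Type: application/json" \
  -d '{"title": "demo"}'
# Expected: 201 Created with a JSON body containing "id" and "title": "demo".

# Step 2: fetch it back and confirm the values round-trip before continuing.
curl -s http://localhost:3000/api/tasks/1
# Expected: 200 OK with "title": "demo".
```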
For someone who wants to YOLO vibe code, this might sound terrible, but I immediately noticed that while the UI was present, it had CSS issues. Those would most likely have been missed by unit tests and even by automated UI testing.
I will always lean into the idea that automated testing is the better default. What this exercise taught me is that it might be worth building specialized roles that mirror many of the learnings from human software development. I also really like using key documents to ground each type of development; the testing strategy document is what kept the QA mode consistent from task to task instead of reinventing its approach every time.