
Vibe Coding: Time Sinks
Most of what I am now focused on with vibe coding is communication, precision, and expectation management. How can we be very clear about the objective at all levels and deliver work that satisfies both requirements and expectations? I have found a few time sinks in my vibe coding efforts that illustrate these risks.
Case 1: CSS Frameworks Like Tailwind
Tailwind is such an interesting case. The framework has evolved a lot over the years, with breaking changes, but it has also been so popular that most web frameworks seem to have developed strategies for integrating it. This is where I have found a challenge. The AI coders know to use a popular CSS framework, and in my more modern environments they even document that decision, but there is a high risk that they do not understand how it is being introduced or which frameworks it is paired with. Worse is the mix of stale and current approaches being applied, often as a reflex when something is not working.
Mitigation:
- A preferred stack: Having a preferred stack allows you to start refining instructions. I am leaning towards frameworks that provide scaffolding scripts, and then writing prompt instructions (e.g., Cline/RooCode rules) that state how the framework should be managed (a sketch follows this list).
- A specialist mode: RooCode/KiloCode both allow for custom modes, which are collections of system prompts/context/rules that are used in that mode. Most examples have roles like Architect and Coder, but when you start to see the general coder struggle, it’s possible to build a specialist. In the real world, this could be a Frontend Engineer, but even more specifically, you can create a framework specialist with all the rules needed to use the framework.
- Tech stack documentation: Memory-bank strategies capture a lot of information in markdown form so that it can be added to the context. I think it is worth considering a “tech-stack” document that can be used in the early stages of planning, like the product requirements doc (PRD) and high-level design (HLD). Having a tech stack document as you plan tasks increases the chance of consistent use of technology during implementation.
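To make this concrete, here is the kind of rule content I mean. The filename, stack, and wording are illustrative rather than prescriptive: Cline reads project rules from a `.clinerules` file or folder, RooCode has an equivalent mechanism, and the Tailwind v4 + Vite stack here is just an example. The same content could equally live in the tech-stack document and be pulled into context during planning.

```markdown
<!-- .clinerules (excerpt) — illustrative project rules for the CSS stack -->

## CSS framework rules
- This project uses Tailwind CSS v4 with Vite. Do not introduce any other CSS framework or utility library.
- Add or upgrade Tailwind only via the official installation steps for this stack; do not hand-edit the build config as a reflex when styling breaks.
- Do not mix older Tailwind patterns (e.g. a `tailwind.config.js`-first setup) with v4's CSS-first configuration. If you are unsure which version is in use, check `package.json` and ask.
- Any change to the CSS tooling must be proposed to the user before it is made.
```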
Case 2: Testing
I’ve been building software for over two decades, and testing is a core part of my mindset. Tapping into vibe-coded testing is amazing; the output is often more comprehensive than I would have considered myself. What I am finding, though, is that testing is consistently a challenge. The setup is often more problematic than it should be, just getting a framework running. I have found Anthropic to be slightly more confident at setting up and managing tests than Gemini. Both carry another risk, though: when I say “make the tests pass,” the success criteria can drift towards “at all costs.” I have had to rapidly hit abort while watching the AI repeatedly say, “I will update the test to the current implementation.” I later got burned when a test that had been made incorrect this way kept failing and left the AI stumped over and over, convinced the code was right even as the test failed. At no point did it consider the test’s intention. I could see from the test title and the inconsistency that the test itself was wrong.
Mitigation:
- A specialist mode: I am now consistently leveraging a QA Engineer/Verifier custom mode as a way of putting more emphasis on test quality. It is also a chance to encode more knowledge about how to manage test infrastructure.
- A preferred stack/tech-stack doc: As above, if we have a standard tech stack, it is easier to be specific in the rules about how to implement and manage testing (a rules sketch follows this list). Explicitly, this includes stating that:
- When making tests pass, the intention of the test should be considered and its validity assessed.
- Making tests pass should never resort to just stubbing the test.
- Making a test pass could involve updating the test, but a proposal should be made to the user and explicit confirmation gained.
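Those three points translate almost directly into rule text. Below is an illustrative excerpt of the kind of instructions I give a QA Engineer/Verifier custom mode (or put in the shared rules file); the wording is mine, not a required format.

```markdown
## Testing rules
- Before changing a failing test, read its name and assertions and state what behaviour it is meant to verify.
- If the test's intention looks correct, fix the implementation, not the test.
- Never make a test pass by stubbing it, skipping it, or weakening its assertions.
- If you believe the test itself is wrong, propose the change and wait for explicit confirmation from the user before editing it.
```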
Case 3: Refactoring
AI is actually pretty great at refactoring. I noticed in one project that a file was getting too large. The Architect custom mode planned a refactoring strategy and produced a much better folder structure with modular files. It looked great. When I hit go, the code was refactored, but I discovered shortly afterwards that the plan had not considered the tests. They all immediately failed. Asking for the tests to be refactored afterwards went badly, with iterations of failure and damaging changes to the tests. Breaking tests in this way really reduces confidence in the code.
Mitigation:
- All the mitigations above
- Rule enforcers: There is an opportunity to state explicitly in the Cline/RooCode rules that refactors must consider tests and be incremental wherever possible (a sketch follows below). By documenting the plan before executing it, there is a greater chance that, if something goes wrong, the plan can be referred to when working out how to recover.
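A refactoring rule in the same spirit might look like the following; again, this is a sketch of the intent rather than any required syntax.

```markdown
## Refactoring rules
- Treat tests as part of the refactor: every refactoring plan must state how existing tests will be moved or updated, and the suite must pass after each step.
- Refactor incrementally, one module or folder at a time, rather than in a single sweep.
- Write the plan to a markdown file and get user approval before changing code, so it can be referred back to if recovery is needed.
```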
Final Thoughts
I mentioned at the beginning the urge to improve communication, precision, and expectations. In each mitigation, that has been the focus. Documentation increases awareness and maintains context. Modifying custom modes is about balancing different definitions of success, setting priorities, and aligning on expectations. For example, when I ask a QA specialist to ensure testing is set up, I have a much greater expectation that test quality will be treated as a priority than if I asked a documentation custom mode to do the same thing. It is really interesting to think about how this is closer to building a team and a development methodology than the prompt engineering it used to be.