
Returning to Waterfall in the Age of AI

In an era of vibe coding, one developer returned to waterfall at a startup. Plan Mode's true identity was a design process completed decades ago. A case for 'hammer your design.'

Column

kkm (kkm-horikawa)

Backend Engineer / AWS / Django

2026.04.11 · 7 min read

A Kitchen Without Recipes

Imagine a chef handling pufferfish for the first time. No license. No knowledge of where the poison sits. No tasting. No health inspection. The plate goes straight to the customer.

Ask why, and the answer is "speed matters."

It sounds like a joke, but something very similar is happening in software development right now. AI has dramatically increased the speed of writing code. As a result, "we can build fast so who needs design," "testing slows us down so let's skip it," and "review? AI wrote it, so it's fine" are becoming increasingly common sentiments.

I work as a freelance engineer, developing systems for several clients, and I've intentionally returned to waterfall development: the methodology that was synonymous with "outdated" just a few years ago.

"Skipping steps because speed matters" — how is that different from skipping the poison check on a pufferfish? This is a story that started with that question.

You're Giving Instructions. But the Phases Are Gone.

There's a term called vibe coding. You give the AI two or three lines of instruction — "add a login feature here," "use one-time passwords," "set up this endpoint as a POST API" — and use whatever code comes out. For prototypes or hackathons, it can be useful. The problem is when this approach gets carried into production.

Here's what happens. AI is an excellent coder — give it instructions and it writes code. But with instructions limited to functional requirements, the design, testing, and review phases disappear entirely. AI can serve as designer, tester, and reviewer, yet it only gets used as a coder. The other phases aren't even acknowledged.

"Leave room for interpretation and AI will use knowledge you don't even have to produce something impressive." You see this in countless first-impression posts. And sure, sometimes it looks that way. But systems almost never have a single correct direction.

Say you ask AI to "make a bento box." Out comes a beautiful sashimi platter. The quality looks high. But that bento is for someone heading to a baseball game in the middle of summer. The sashimi will spoil. The quality demanded for a bento carried outdoors in summer heat and the quality demanded for a dish eaten on the spot are completely different things. Both are "high quality," but the direction is different. AI couldn't see that this food would be carried around outdoors for hours.

A professional chef asks "when," "where," and "who is eating" before cooking. Development is the same. What a professional developer should hammer out in the design phase isn't correctness of code — it's who will use it, under what conditions, and what must not go wrong. Edge cases, operational constraints, security, performance requirements. None of these emerge from functional requirements alone. They require a human eye, time spent imagining scenarios, consensus among stakeholders, and only then do they land in a design document.

Many voices advocating for vibe coding appear to have never experienced a standard development cycle. What should you consider before writing code? What should you do before merging to the main branch? Without that experience, it's impossible to appreciate the value of those phases. That's why they can say "design docs? unnecessary" and "review? just slows things down."

Here's where it gets worse. We've long talked about the problem of "only one person understands this spec" — the bus factor issue. With vibe coding, that person never existed in the first place. The AI writes code within a session, and when that session ends, no human knows why the code was written that way. It's not even a bus factor problem. Code whose author never existed ends up in production.

Even in agile, this level of breakdown was rare before AI. Individual developers wrote their own code, so at minimum they understood the spec. Sprint reviews aggregated information to the scrum master, and some form of documentation was produced. That was the last line of defense. Vibe coding eliminates even that.

A kitchen with no tasting, no recipes left behind. The next day, no one can reproduce the dish.

Plan Mode Was a Reinvention of Recipes

Interestingly, the industry has already started noticing this problem. It's just reinventing the solutions as "something new."

Claude Code has a Plan Mode. Press Shift+Tab twice and it activates, investigating and creating a plan before touching any code. The official recommended workflow is Explore → Plan → Implement → Commit.

SDD (Spec-Driven Development) — writing specifications first and having AI implement them — is also gaining traction. AWS rebuilt an entire IDE around this methodology. Thoughtworks recognized SDD as a key engineering practice of 2025.

Let's line these up.

| Waterfall Phase | AI Tool's "New Concept" |
| --- | --- |
| Requirements Document | Prompt / Spec |
| High-Level Design | CLAUDE.md / AGENTS.md |
| Detailed Design | Task Split Files |
| Design Review | Plan Mode |
| Implementation | AI Code Generation |
| Testing & Code Review | AI Code Review / Automated Testing |

Look familiar? Requirements → high-level design → detailed design → implementation → testing. This is waterfall. The name changed, the tools changed, it's written in Markdown now, but the process is the same.

What the industry has been calling a "new recipe management system" is really just the age-old practice of writing the recipe before you cook.

Do Recipes Go Stale?

"Design documents go stale, so they don't belong in the repository." That's a common argument. I've read articles advocating against design docs, arguing that "explanatory documentation becomes stale context that degrades AI performance" and "tests resist rot better than documents."

The point isn't lost on me. Auditing design documents at one workplace revealed that most files didn't match the actual codebase, left untouched since their creation date. This isn't unusual.

But think about it. What actually caused those documents to go stale?

The update step wasn't built into the process. Spec changes weren't reflected. Reviews didn't check alignment with the design. Every single one of these is a breakdown of the development process, not a flaw of design documents as a concept.

Think back to how development teams operated before AI. Design documents are updated during the design phase. Code follows the design. Tests and reviews verify compliance with the design. When this straightforward cycle runs properly, design documents stay current. Staleness isn't a flaw in the methodology — it's a failure in the operation of skipping the update step.

What becomes critical here is the separation of context and session. Accept that conversations with AI (sessions) are ephemeral. Meanwhile, persist design documents (context) in the repository. AI can read git history far faster than any human. That's exactly why having decision records in the repository is worth even more in the age of AI.

What went stale wasn't the recipe — it was the kitchen's operations.

Build the Team, Then Place AI

So what should you do? The answer is straightforward.

Think back to a pre-AI development team. There was a designer, an implementer, a tester, and a reviewer. Each had a clear role, handing off deliverables between phases.

AI can be the designer, the coder, the tester, and the reviewer. But having it play "all roles simultaneously in a single prompt" is where things go wrong.

The right approach: first, draw the team structure as if it were all humans. Which phases need which roles. Then decide "this role goes to AI" and "this phase stays with humans." In other words, build the human-era development cycle first, then decide which seats AI fills.
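As a toy sketch of that idea, the seat assignment can be written down explicitly before any prompt is sent. The phase names and the human/AI split below are my illustrative assumptions, not a prescription:

```python
# Hypothetical seat map: draw the classic team first, then decide
# which seats AI fills. The split shown here is illustrative only.
PHASES = [
    ("requirements", "human"),       # stakeholders decide what to build
    ("high_level_design", "human"),
    ("detailed_design", "human"),    # the article argues design stays human
    ("implementation", "ai"),
    ("testing", "ai"),
    ("review", "human+ai"),          # AI pre-review, human sign-off
]

def seats_for(owner: str) -> list[str]:
    """Return the phases assigned to a given owner."""
    return [phase for phase, who in PHASES if owner in who]

print(seats_for("human"))  # ['requirements', 'high_level_design', 'detailed_design', 'review']
print(seats_for("ai"))     # ['implementation', 'testing', 'review']
```

The point of making the map explicit is that "all roles simultaneously in a single prompt" is no longer possible by accident: every phase has a named owner before the first line of code is generated.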

A German university study produced interesting data on this. They measured generative AI's effectiveness at each waterfall phase, and the results look like this.

| Phase | AI Effectiveness | Note |
| --- | --- | --- |
| Preparation | 58% | |
| Analysis | 34% | |
| Design | 26% | Lowest |
| Coding & Testing | 87% | Highest |

Design scored the lowest at 26%. Meanwhile, Coding & Testing dominated at 87%. AI was judged "insufficient in detail" and "shallow in trade-off analysis," confirming that human judgment is indispensable in the design phase.

In short, design is the phase humans should own, and implementation plus testing is where AI excels. Waterfall's structure — humans fixing the design upstream, AI running implementation and testing downstream — aligns perfectly with this data.

The head chef writes the recipe. AI handles the cooking and plating. So far, this arrangement works best.

What You Learned in Culinary School Still Works

You sometimes hear that "pre-AI development knowledge is irrelevant for AI development."

I think that's a stretch. Requirements definition, high-level design, detailed design, test planning. These phases, refined over decades, continue to serve as the foundation even when AI writes the code. The entity writing the code changed from human to AI; the steps of "decide what to build," "design it," and "verify it matches the design" haven't disappeared. And they mustn't.

Waterfall's design documents are "Plan files readable by both humans and AI," polished over decades. The format has changed. Design documents used to live on separate servers from the repository. Now they sit in docs/ within the repo, written in Markdown. CLAUDE.md and AGENTS.md are just variations of design documents. The names are different, but the purpose — "consolidate the project's big picture and decision rationale in one place" — is the same.

"Writing design docs slows you down," I've been told. But what's the benchmark for "slow"? A system built fast but incomprehensible three months later is, in the long run, much slower.

Even when the knife becomes AI, the procedures for preventing food poisoning don't change.

What the Numbers Show

"Skipping design leads to failure" — that's not just my opinion. There's data.

A joint study by MIT Sloan and BCG found that more than 80% of the $684 billion invested in AI in 2025 failed to generate value. The study's primary explanation: "lack of structured planning." Development that skips the design phase on the assumption that AI will handle everything fails at a striking rate — the scale of the losses alone makes the point.

Meanwhile, the market is moving in the opposite direction. Tessl, a startup supporting SDD (Spec-Driven Development) workflows, raised $125 million. SpecKit, which implements SDD workflows, has surpassed 77,000 GitHub stars. On April 8, 2026, an article titled "Spec-Driven Development Is Waterfall in Markdown" made the point explicit: what the industry is reinventing as a new practice is waterfall from 40 years ago.

Field measurements back this up. DeNA restructured their AI development workflow around CLAUDE.md and AGENTS.md. The results in numbers:

| Metric | Before | After | Change |
| --- | --- | --- | --- |
| PR review round-trips | 7.2 | 2.7 | ▼62% |
| Review comment count | 6.0 | 1.9 | ▼68% |

Giving AI properly structured design context improved the precision of AI-driven development and sharply reduced human review overhead. The numbers run against the assumption that "thorough design slows you down."

The better your design information, the more accurately and with fewer corrections AI operates. That's not intuition — it's measured.
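The percentage changes quoted for those metrics are simple arithmetic. A few lines of Python reproduce them from the before/after figures (nothing is assumed beyond the numbers themselves):

```python
# Recompute the reductions from the DeNA before/after figures.
before_after = {
    "PR review round-trips": (7.2, 2.7),
    "Review comment count": (6.0, 1.9),
}

for metric, (before, after) in before_after.items():
    reduction = (before - after) / before * 100
    # Prints 62% and 68%, matching the changes reported above.
    print(f"{metric}: {before} -> {after} ({reduction:.0f}% fewer)")
```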

Hammer Your Design

Back to the pufferfish.

Preparing pufferfish requires a license. You learn where the poison is, process it following the correct procedure, verify the result, and pass the inspection. Nobody calls these steps "too slow." What goes on the customer's plate matters.

Software rarely puts lives at stake. But halting a user's business, losing their data, eroding trust — that happens somewhere every day. And more often than not, the root cause is absent design, skipped testing, or missing review.

Write your design. Have AI implement against it. Run tests and reviews. Waterfall? Maybe. But this isn't "going back to an old method" — it's returning to the fundamentals of building things.

Humans and AI, all hands on deck — hammer your design. That's my answer for the age of AI.

Update note: This article is based on the author's observations and experience as of April 11, 2026. AI development tools and methodologies evolve rapidly; updates and revisions will follow as the landscape changes.