
Vibe Coding For Realsies: Spec Driven Development

Note: for the actual implementation details on my experiments around SDD, check my demo walkthrough doc.

If you’ve spent any time with AI coding agents, you know the thrill. You ask for an expense tracker, it generates one. You ask for a menu feature, it adds it. You point out a bug, it proposes a fix. It feels like magic—until it doesn’t.

Inevitably, the model starts hallucinating features, forgetting earlier decisions, or creating code paths you never asked for. You try to rein it in with a custom instructions file, but it quickly falls out of date. Before long, you’re wrestling with the very agent that was supposed to save you time.

Spec-Driven Development (SDD) is the antidote to this chaos.

Rather than letting code (and an over-eager LLM) dictate direction, SDD gives us structure, guardrails, and a development rhythm that creates clarity instead of drift.

What Is Spec-Driven Development (SDD)?

Specifications can make or break any project. For example, a few years ago I was in charge of a development team tasked with producing a product catalog Web API. Right on schedule, two weeks before we were due to roll out, the resident silverback software architect dropped a bomb on us: a late-breaking requirement that would instantly have increased the project’s complexity by a factor of ten and made us months late in delivery. Having a strict written functional requirement I could point to (in this case a response time under 0.2 seconds), along with a near-term delivery date, acted as a magic shield to keep us on track.

Specs can be both a shield and a trap. As I wrote in my book, it’s the nonfunctional requirements – which often are implicit or aren’t well understood by the project team – that can delay or sink a software project. Database standards, authorization requirements and security standards, lengthy and late-breaking deployment policies: the fog of the unknown has us in its grip.

Spec-Driven Development (SDD) is meant to address this gap. Instead of code leading the way while the understood project goals and context fall hopelessly behind, the spec drives everything. Implementation, checklists, and task breakdowns are no longer vague.

The promise is that this uses AI capabilities and agentic development the right way. It amplifies a developer’s effectiveness by automating or handing off repetitive work that an agent can often do much more quickly and effectively – leaving us to do the creative work humans do best: refactoring, directing and steering code development, and thinking critically about the best path for each feature.

I like the writeup on the GitHub blog by one of the Spec Kit coauthors, Den Delimarsky:
Instead of coding first and writing docs later, in spec-driven development, you start with a (you guessed it) spec. This is a contract for how your code should behave and becomes the source of truth your tools and AI agents use to generate, test, and validate code. The result is less guesswork, fewer surprises, and higher-quality code.

Unlike the vibe coding videos I’ve seen, which are mostly greenfield and very POC / MVP level in complexity – I think SDD has the potential to be ubiquitous. It could fit almost anywhere, even with very complex and monolithic app structures. It could help with large existing legacy applications. And it enforces development standards that can prevent a lot of wasted time and effort.

Let’s start with a quick overview of the process.  

Specify, Plan, Tasks, Implement: A Four Step Dance

Instead of jumping into coding (or vibe-coding your way into a corner), software development with SDD follows a four-step loop:

  • 1. /specify — Describe what you want: High-level aims, user value, definitions of success. No tech stack. No architecture.
  • 2. /plan — Decide how to build it. Architecture, constraints, dependencies, standards, security rules, and any nonfunctional requirements.
  • 3. /tasks — Break it down. Small, testable, atomic units of work that the agent can execute safely.
  • 4. /implement — Generate and validate the code. Here TDD is mandated. Tests first, then implementation, then refinement.
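To make the tests-first rule in step 4 concrete, here is a minimal sketch of what that ordering can look like. The expense-totalling function, its name, and the sample data are all hypothetical illustrations, not something Spec Kit generates; the point is only that the tests exist before the code they exercise.

```python
from collections import defaultdict


# Step 1 of TDD: the tests are written first, straight from an acceptance
# criterion such as "expenses are totalled per category" (hypothetical).
def test_totals_are_grouped_by_category():
    expenses = [("food", 12.50), ("travel", 40.00), ("food", 7.50)]
    assert total_by_category(expenses) == {"food": 20.0, "travel": 40.0}


def test_no_expenses_gives_empty_totals():
    assert total_by_category([]) == {}


# Step 2: only now is the implementation written, with the single goal of
# making the tests above pass. Refinement comes after they are green.
def total_by_category(expenses):
    """Sum (category, amount) pairs into a {category: total} dict."""
    totals = defaultdict(float)
    for category, amount in expenses:
        totals[category] += amount
    return dict(totals)


if __name__ == "__main__":
    test_totals_are_grouped_by_category()
    test_no_expenses_gives_empty_totals()
    print("all tests pass")
```

If the implementation is deleted, both tests fail with a NameError, which is exactly the "red" state TDD expects before any code is written.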

Starting with a constitution is a game changer because, as the original specs state, its principles are immutable. Our implementation path might change, and we can even change the LLM of choice, but these core principles remain constant. Adding new features shouldn’t render our older system design work invalid.

Step 1: /specify

You start with a constitution: the unchanging principles and standards for your project. It’s the backbone of everything that comes after.

GitHub’s Spec Kit can generate a detailed spec file from even a one-line input like “I need a photo album site that allows me to drop and share photos with friends.” It even marks uncertainties with [NEEDS CLARIFICATION], essentially flagging the traps LLMs usually fall into. And though this is optional – I would highly recommend running /clarify to address each ambiguity, refining your spec until it reflects exactly what you want the system to do.
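One way to keep yourself honest about those markers is to scan the spec for any that are still open before moving on. The sketch below is a small helper I am assuming for illustration, not part of Spec Kit. Only the bare `[NEEDS CLARIFICATION]` text comes from the tool as described above; the optional `: question` form and the sample spec are assumptions.

```python
import re

# Assumption: markers look like "[NEEDS CLARIFICATION]" or
# "[NEEDS CLARIFICATION: some question]". The ": question" form is hypothetical.
MARKER = re.compile(r"\[NEEDS CLARIFICATION(?::\s*(?P<question>[^\]]*))?\]")


def find_open_questions(spec_text):
    """Return (line_number, question) pairs for every unresolved marker."""
    found = []
    for line_number, line in enumerate(spec_text.splitlines(), start=1):
        for match in MARKER.finditer(line):
            question = (match.group("question") or "").strip()
            found.append((line_number, question))
    return found


# Hypothetical spec excerpt, loosely based on the photo album example.
SAMPLE_SPEC = """\
# Photo album site
Users can drop photos into an album and share it with friends.
Sharing works via [NEEDS CLARIFICATION: link, email invite, or both?].
Maximum upload size is [NEEDS CLARIFICATION].
"""

if __name__ == "__main__":
    for line_number, question in find_open_questions(SAMPLE_SPEC):
        print(f"line {line_number}: {question or '(no question given)'}")
```

An empty result is a reasonable signal that the spec is ready for the planning step.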

By the end, you’ve got:

  • A shared understanding of what success looks like
  • A great first draft of the project’s requirements:
    • Clear user stories
    • Acceptance criteria
    • Detailed feature requirements
  • Edge cases you probably wouldn’t have thought of

Step 2: /plan

We just finished our first stab at “why” – this is where the “how” comes in. /plan is where we feed the LLM our tech choices, constraints, standards, and organizational rules.

Backend in Node? React front end? Performance budgets? Security controls? Deployment quirks? Legacy interactions? All of it goes here. As Den notes in the GitHub blog:

Specs and plans become the home for security requirements, design system rules, compliance constraints, and integration details that are usually scattered across wikis, Slack threads, or someone’s brain.

Spec Kit turns all of this into:

  • Architecture breakdowns
  • Data models
  • Test scenarios
  • Quickstart documentation
  • A clean folder structure
  • Research notes
  • Multiple plan options if you request them

Look at that beautiful list of functionality… Including a very nifty app structure tree. OMG!

Step 3: /tasks

The third phase is where you ask the LLM to slice the plan we just created into bite-sized tasks—small enough to be safe, testable, and implementable without hallucination. It also flags “priority” tasks and provides a checklist view for the entire project.

This creates something rare: truly atomic, reviewable, deterministic units of work.

The /analyze command is especially powerful—it has the agent audit its own plan to surface hidden risks or missing pieces.
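To picture what a task list with priority flags and a checklist view might look like as data, here is a rough sketch. The `Task` fields, the `T001`-style IDs, and the sample tasks are hypothetical; Spec Kit’s actual output format may differ.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str            # hypothetical ID scheme, e.g. "T001"
    description: str
    priority: bool = False  # flagged as important by the agent
    done: bool = False


def render_checklist(tasks):
    """Return checklist lines: priority tasks first, original order otherwise."""
    # sorted() is stable, so tasks keep their relative order within each group.
    ordered = sorted(tasks, key=lambda task: not task.priority)
    lines = []
    for task in ordered:
        box = "[x]" if task.done else "[ ]"
        flag = " (priority)" if task.priority else ""
        lines.append(f"{box} {task.task_id} {task.description}{flag}")
    return lines


# Hypothetical tasks for the photo album example.
SAMPLE_TASKS = [
    Task("T001", "Create project skeleton", done=True),
    Task("T002", "Write failing test for photo upload", priority=True),
    Task("T003", "Implement photo upload", priority=True),
    Task("T004", "Write quickstart notes"),
]

if __name__ == "__main__":
    print("\n".join(render_checklist(SAMPLE_TASKS)))
```

Keeping each task this small is what makes it easy to review one at a time.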

At first glance, the generated task list is a nearly overwhelming amount of work. This is a lot to go through!! Where to start? Thankfully, it tells me which tasks are important.

Step 4: /implement

Now the LLM finally writes code. But instead of working from guesswork and half-remembered context, the agent is now writing code from:

  • A clarified spec
  • A vetted plan
  • A task list
  • Test requirements
  • An architectural contract
  • Immutable principles

You can implement by phase or by task range. Smaller ranges work better for large projects (context windows get spicy otherwise).
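As a rough illustration of working in smaller ranges, the sketch below splits an ordered task list into consecutive batches that could each be implemented in its own pass. The `T001`-style IDs and the batch size are assumptions for the example, not Spec Kit defaults.

```python
def task_ranges(task_ids, batch_size):
    """Split an ordered task list into consecutive (first_id, last_id) ranges."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ranges = []
    for start in range(0, len(task_ids), batch_size):
        batch = task_ids[start:start + batch_size]
        ranges.append((batch[0], batch[-1]))
    return ranges


if __name__ == "__main__":
    # Hypothetical IDs T001..T010, implemented four tasks at a time.
    task_ids = [f"T{n:03d}" for n in range(1, 11)]
    for first, last in task_ranges(task_ids, batch_size=4):
        print(f"implement {first} through {last}")
```

The last range is simply shorter when the task count does not divide evenly.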

The best part? Updating a feature is now simple: change the spec, regenerate the plan, regenerate tasks, re-implement. All of the heavy lifting that used to discourage change is gone.
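The update loop can be sketched as a simple rule: when one artifact changes, that phase and everything after it gets rerun. The phase order comes from the four steps above; the idea that a change to the plan only requires rerunning tasks and implementation is my own reading, and the function name is hypothetical.

```python
# Phase order taken from the four-step loop described in this post.
PHASES = ["specify", "plan", "tasks", "implement"]


def phases_to_rerun(changed_phase):
    """Return the changed phase plus every phase downstream of it."""
    if changed_phase not in PHASES:
        raise ValueError(f"unknown phase: {changed_phase!r}")
    return PHASES[PHASES.index(changed_phase):]


if __name__ == "__main__":
    # Changing the spec means walking the whole loop again.
    print(phases_to_rerun("specify"))
    # Assumption: changing only the plan leaves the spec untouched.
    print(phases_to_rerun("plan"))
```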

Summing Things Up

The real magic of SDD isn’t the commands—it’s the mindset:

  • Specs are living, executable artifacts
  • Requirements and architecture stay fresh
  • Tests are generated before code
  • LLMs stop improvising
  • Creativity shifts from plumbing to design
  • Consistency is enforced, not hoped for
  • Documentation emerges automatically
  • Adding features becomes a natural loop
  • Legacy modernization becomes sane again

As AWS and GitHub both point out, vibe coding is intoxicating but fragile. It struggles with large codebases, complex tasks, missing context, and unspoken decisions. SDD fixes the brittleness without killing the creativity.

It keeps the fun of vibe coding, but adds discipline, traceability, and clarity—like pairing with a brilliant junior dev who follows instructions with perfect literalness.

I do think Spec Driven Development will be changing very rapidly over the next few years. But it definitely is here to stay! It’s in line with how AI coding agents are meant to work, and it allows us to focus on the creative / business implications of what we’re writing – a force multiplier for the innovative developer.

For Future Research

I already mentioned more work to come on having the coding agent generate different approaches for comparison; also what the implementation might look like with different models besides Claude Sonnet.

Den makes some interesting statements in that GitHub blog article. One concerns feature work on existing systems, where he calls out that advanced context engineering practices might be needed. What are these exactly?

A second point follows right after:

“Legacy modernization: When you need to rebuild a legacy system, the original intent is often lost to time. With the spec-driven development process offered in Spec Kit, you can capture the essential business logic in a modern spec, design a fresh architecture in the plan, and then let the AI rebuild the system from the ground up, without carrying forward inherited technical debt.”

I’d like to see this! We need more videos demonstrating how to splice new features onto a large existing codebase.

References

  • The source repo and where you should start – SDD with Spec Kit (https://github.com/github/spec-kit)
  • Video Goodness: the 40-minute overview video from Den Delimarsky – The ONLY guide you’ll need for GitHub Spec Kit
  • See the detailed process walkthrough here.
  • Background founding principles, a must read… even if lengthy. It all comes from this. For example, the Development Philosophy section at the end clarifies why testing and SDD are PB&J, and how these guiding principles help move us away from monolithic big balls of mud.
  • I liked this video very much… because it walked through the lifecycle below very clearly. Good sample prompts as well.
  • The Uncommon Engineer spends a half day experimenting with SDD. He found it a frustrating experiment. Some of his conclusions: specifications can actually lead to procrastination (user feedback is the only thing that matters), and start embarrassingly simple with your specs. Testing specification compliance is NOT the same thing as testing user value…
  • Den Delimarsky talks vibe coding and SDD on the GitHub blog. “We treat coding agents like search engines when we should be treating them more like literal-minded pair programmers. They excel at pattern recognition but still need unambiguous instructions.”
  • Dr Werner Vogels, AWS re:Invent 2025 keynote. About 40 minutes in, the talk turns to SDD; at 46 minutes in, Kiro and function driven development: “The best way to learn is to fail and be gently corrected. You can study grammar all you like – but really learning is stumbling into a conversation and somebody helps you to get it right. Software works the same way. You can read documentation endlessly – but it is the failed builds and the broken assumptions that really teaches you how a system behaves.”
  • Tomas Vesely from GH explores writing a Go app using SDD. Interestingly, his compilation was slowing over time.

Culture and Agile – GM and its failed attempt to mimic Toyota

I had a good friend of mine – Mark Taylor – recommend some listening material recently on GM. I’ve been fascinated with Toyota since I first started learning about Agile development practices, and this podcast definitely was worth the time to listen. It’s a fascinating story. Why was Toyota so willing to be so open and revealing with one of its biggest competitors – GM – on its higher quality production processes? Turns out there’s a lot more to making cars than just an assembly line.

This isn’t just history. All successful companies hit a moment of complacency. For people who are interested in improving the quality of their working life – whatever the field – there are some real lessons here. (And, if you’re still not convinced, think of all the billions of your taxpayer dollars that had to go into bailing out American car companies after they went bankrupt!)

Some thoughts I had – in outline form – from this:

  • Culture Matters (are your teams top down or horizontal?)
    • “Back home in Fremont, GM supervisors ordered around large groups of workers. At the Takaoka plant, people were divided into teams of just four or five, switched jobs every few hours to relieve the monotony, and a team leader would step in to help whenever anything went wrong.”
  • Stopping The Line With Defects (how do you handle bugs?)
    • “I can’t remember any time in my working life where anybody asked for my ideas to solve the problem. And they literally want to know. And when I tell them, they listen, and then suddenly they disappear, and somebody comes back with the tool that I just described. It’s built, and they say try this. Under the Toyota system, everyone’s expected to be looking for ways to improve the production process all the time, to make the workers’ job easier and more efficient, to shave extra steps and extra seconds off each worker’s job. To spot defects in the cars and the causes of those defects. This is the Japanese concept of kaizen, continuous improvement. When a worker makes a suggestion that saves money, he gets a bonus of a few hundred dollars or so…. And if you look around the Toyota plant, you can see the result of all those improvements. Hanging shelves that travel along with the car and the worker, carrying the parts and bolts they need within easy reach. Special cushions they throw into the car frames when they have to kneel inside. Workers’ tasks have been streamlined to the fewest possible steps, each step timed down to the second.”
    • In contrast, in GM plants, workers could never stop the line – because they’re lazy, you know? “So now we tell the plant floor, don’t you worry about the production volume. You worry about quality. The last thing we want is to have a lot of defects flowing down the line that we have to repair later.”
  • It Takes Brains – You Can’t Just Mimic
    • (after a failed transplant) “For this workforce, there were no trips to Japan, no tearful sushi parties. And from the start, workers were skeptical…. This was one of the biggest differences between Fremont and Van Nuys. Van Nuys hadn’t been shut down. Turns out it’s a lot easier to get workers to change if they’ve lost their jobs, and then you offer them back. Without that, many union members just saw the Toyota system as a threat.”
    • “…much of the Japanese system happened off the factory floor, it answered something that had never quite made sense to {one of the managers}. Why had Toyota been so open with GM in showing its operations? We didn’t understand this bigger picture thing. All of our questions were focused on the floor, you know? The assembly plant. What’s happening on the line. That’s not the real issue. The issue is, how do you support that system with all the other functions that have to take place in the organization?”
    • “I remember one of the GM managers was ordered from a very senior level– it came from a vice president– to make a GM plant look like NUMMI. And he said, I want you to go there with cameras and take a picture of every square inch. And whatever you take a picture of, I want it to look like that in our plant. There should be no excuse for why we’re different than NUMMI, why our quality is lower, why our productivity isn’t as high, because you’re going to copy everything you see. Immediately, this guy knew that was crazy. We can’t copy employee motivation. We can’t copy good relationships between the union and management. That’s not something you can copy, and you can’t even take a photograph of it.”
  • It’s Not Just The Assembly Line
    • “The team concept stressed continuous improvement. If a team got a shipment of parts that didn’t fit, they’d alert their bosses, who’d then go to the suppliers to fix the problem. Sometimes they’d realize the problem was in the part’s design, and Toyota engineers would go back to the drawing board and remake the part to address the problem workers were having on the assembly line. All the departments in the company worked together. …. But Ernie’s suppliers had never operated in a system like that. If he asked for fixes, they blew him off. And if he called Detroit and asked them to redesign a part that wasn’t working, they’d ask him, why was he so special? They didn’t have to change it for any other plant. Why should they change it for him?”
  • The High Cost of Complacency
    • “One of the ironies of GM was that in the moment it went bankrupt, it was probably a better company than it had ever been. In the factories, they had really dramatically closed the productivity gap that they had had for many, many years. And on the new products, they have much better quality. So the company that failed was actually doing better than it had ever done. But it was too late, and that’s really sort of hard to forgive– that if you take 30 years to figure it out, chances are you’re going to get run over. And they got run over.”
    • “They sold junk for a while. Just any kind of piece of crap they could roll out there, they did. And they paid a tremendous price for it. And even when they turned the corner in quality, people didn’t trust them. They’d say, well, gee, they’re building a good car now. Why aren’t they buying them?”

Give Yourself Nine Months to Fail.

(Note – this is a Greatest Hits posting from my previous blog. Enjoy!)

Babies aren’t born in one month.

Implementing Scrum Means Making Mistakes. Lots and Lots of Mistakes.

When I started at my current employer – even after nine months as a team lead – I had very little to boast about by way of making change. I remember hearing a presentation from another manager that had the title “Keeping The Lights On” – WOW! – and honestly that was how I felt about my job: keeping the lights on, reacting to events instead of getting ahead of them, and unable to control them. I was very disconnected from the work my team was doing. This changed as we moved out developers who were not contributing to the team and not being transparent about their work; and, as we got new projects coming in, I could cherry-pick the fun ones and start participating in writing specifications and deploying solutions. Beyond taking on new work, though, Agile is the biggest reason why I’m still around. Without it, I’d be like the manager at my previous company – completely isolated from the daily work my team is doing, trying to defend our existence without the facts I need to prove that we’re delivering value.

I started thinking about my company – which seems to love mountains – and how every company’s definition of Agile is a little different. At the keynote I met an old compatriot – we had worked together on a failed Agile project. Everyone hated the DSUs, which were 15+ minutes long; there was no target in sight since releases were pushed out to “never”; and we went through constant rewrites as the technical team kept refactoring working code to get it “perfect”… it was a case study in how to do Agile wrong. After 18 months of development, they had to scrap the entire project and outsource it to an offshore team – not one line of code ever saw the light of day. I believe a big reason why we failed was that we tried to change everything at once – and the team never gelled or considered itself invested in the outcome. In contrast, by doing things step by step – and rolling back when things weren’t working – we were successful in my current assignment. The path below took almost two years to implement – but it was done with the team setting the pace, and almost by accident we reached our goals.

I started out by talking about the fears I felt after a few months on the job. Overwhelmed, disconnected. I said, “I feel at times like I wasn’t as much in control as I need to be. I wasn’t in command of all the facts I need to support my case. I didn’t have enough visibility of what’s going on across the organization. I wasn’t giving my team all the tools and resources they need to thrive. And I wasn’t providing enough proof of delivering value aligned with what my company’s priorities are.”