Specification Is the New Literacy

When building is free, the bottleneck is knowing what to build

[Image: A hand drawing an intricate arrangement of interlocking gears and a golden spiral on aged parchment by candlelight — the act of specifying mechanism before building it]

Something has shifted.

For most of the history of software, the bottleneck was production. You had an idea, you knew what you wanted, and the hard part was making the computer do it. Writing the code, debugging the code, shipping the code. The scarce resource was engineering labor. The expensive skill was implementation.

That constraint is collapsing. AI agents can write code fast, write a lot of it, and, increasingly, write it correctly. The cost of turning a specification into working software is dropping toward zero.

The bottleneck has moved. The scarce resource is no longer “can we build this?” It’s “do we know what to build?” And more precisely: can we specify what we want precisely enough that it gets built correctly?

That’s a different skill. I didn’t have it when I started, and building it changed everything about how I work.


The Production Cost Collapse

When building was expensive, cost itself was a forcing function for specification. You thought carefully about what you wanted because mistakes were costly. Changing direction mid-build meant throwing away weeks of work. The economic pressure created a natural brake: if you couldn’t justify the cost, you didn’t build it. If you were uncertain, you resolved the uncertainty before committing resources.

That natural brake is vanishing. When an AI agent can produce a working implementation in minutes, the cost of building the wrong thing drops to near zero for the first build. But the wrong thing compounds. Architecture decisions made on bad assumptions persist. Wrong data models create downstream problems. Wrong user flows get deployed and create expectations. The build is cheap. The consequences of building wrong are just as expensive as they ever were.

This is the trap. The feeling of speed masks the absence of specification. You can iterate so fast it feels like you’re converging on the right thing, when what you’re actually doing is building confidently in the wrong direction. Five iterations of the wrong idea, each produced in minutes, isn’t progress. It’s expensive in a way that doesn’t show up on a time sheet.


Testability as Specification

If you cannot describe how to verify something, you have not specified it.

The test is not an artifact that follows specification: it is the act that reveals whether specification has occurred. Attempting to define verification for something undefined exposes the gaps that prose descriptions hide.

[Diagram — Spec Pipeline: Intent → Assertions → Gaps → Spec → Build]

The sequence works like this:

1. Describe intent. What are you trying to build? This can be informal, conversational, even vague. The intent is the starting point, not the finish line.

2. Write testable assertions. This is the specification act. “The system should be fast” is not testable. “The API responds in under 200ms at the 95th percentile” is. “The user experience should be intuitive” is not testable. “A new user can complete the onboarding flow without documentation” is.

3. Discover which assertions are untestable. This is where specification happens. You try to write the test and discover you can’t. The requirement is ambiguous, contradictory, or depends on something you haven’t decided yet. The untestable assertion is a specification gap made visible.

4. Refine intent. Now you know what you don’t know. The gaps exposed by failed test assertions become questions to answer. Once answered, the tests become writable. And now you have both a spec and verification criteria, derived from the same process.

5. Build. Only now do you have something precise enough to build. The specification and the verification exist simultaneously, because they were created by the same act.
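The move from step 1 to step 2 can be made concrete in a few lines. A minimal sketch: the helper `p95` and the sample data are hypothetical, but they show how “the API should be fast” becomes a check that either passes or fails.

```python
# Sketch of turning vague intent into a testable assertion.
# p95() and the observed latencies are illustrative, not a real benchmark.

def p95(samples_ms):
    """95th-percentile latency from observed response times (ms)."""
    ordered = sorted(samples_ms)
    idx = min(len(ordered) - 1, int(0.95 * len(ordered)))
    return ordered[idx]

# Intent: "the API should be fast"        -- not verifiable as stated.
# Assertion: "p95 response time < 200 ms" -- verifiable.
observed = [120, 140, 95, 180, 150, 130, 190, 160, 110, 100]
assert p95(observed) < 200
```

The moment you try to write this assertion, the gaps surface: fast for whom, measured where, at what percentile? Answering those questions is the specification work.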

Writing test assertions is concrete work. “What does the user see?” “What does the API return?” “What breaks if this input is malformed?” It’s closer to building than to bureaucracy. It feels like making progress, not writing documents. Specification that feels like overhead will be skipped, and skipped specification is how you build the wrong thing fast.


Code as Opaque Weights

A framing that makes this concrete:

When a human writes code, they can explain every line. You can review the code, understand the logic, trace the execution path. The code is transparent: it’s both the implementation and the specification of how the implementation works.

When an AI agent writes code, that transparency evaporates. The code works (or doesn’t), but the internal structure is largely arbitrary. The agent didn’t choose that variable name because it means something. It chose it because the token probability was high. The architecture isn’t the result of deliberate design decisions; it’s the result of pattern matching against training data.

This is exactly the situation in machine learning. You don’t evaluate a neural network by reading the weights. You evaluate it by measuring its behavior against known inputs. The weights are opaque. The evaluation is behavioral.

Agent-written code is the same. The code is opaque weights. Correctness is inferred from externally observable behavior: does the system do what the specification says it should do? Internal structure is irrelevant. Review the behavior, not the implementation.

This changes the operator’s role. You’re not a code reviewer anymore. You’re a scenario designer. Your job isn’t to read the code and verify it’s correct — your job is to design scenarios that would expose incorrect behavior, and then verify the system passes them.
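Scenario design over code review can be sketched directly. In this toy example, `slugify` stands in for agent-written code: we never inspect its body, only check observable behavior against scenarios the operator designed.

```python
# Behavior-first verification: treat the implementation as opaque weights.
# slugify() is a hypothetical stand-in for agent-generated code.

import re

def slugify(title):
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

# Scenario table designed by the operator: (input, expected observable output)
scenarios = [
    ("Hello World", "hello-world"),
    ("  Spaces  everywhere ", "spaces-everywhere"),
    ("Already-slugged", "already-slugged"),
    ("", ""),  # edge case: empty input
]

for given, expected in scenarios:
    assert slugify(given) == expected, (given, slugify(given))
```

Whether the regex inside is elegant or bizarre is beside the point; the scenario table is the artifact worth reviewing.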

That’s specification work.

[Image: A Victorian engineer in a top hat standing before an elaborate mechanical apparatus, blueprints pinned to the walls around him — the gap between the plan on paper and the machine that was built from it]

Context as Specification

[Image: A watercolor illustration of an ornate keyhole — through it, a vivid detailed landscape is visible, surrounded by rough sketches and scattered notes on the page around it]

The industry is converging on a useful distinction: prompt engineering versus context engineering. Prompt engineering is optimizing the phrasing of a query. Context engineering is building the complete informational environment the agent operates within: the system prompt, retrieved documents, tool definitions, conversation history, injected metadata, prior corrections. (Andrew Ng articulated the concept when defining the foundational agentic patterns; Antonio Gulli expands on it in his Springer book Agentic Design Patterns.) The two are not the same thing, and conflating them obscures where specification actually happens.

Context engineering is specification work. Which documents you inject decides what the agent knows about your codebase. Which tools you expose defines the scope of what it can do. Surface the relevant conversation history and architectural decisions and you’re shaping the constraints the agent reasons inside. None of this is infrastructure plumbing. It’s specification that rarely gets written down or treated as such.

Narrow the context to just the current file and you get an agent that optimizes locally, ignores system-wide constraints, and reinvents patterns that already exist elsewhere in the codebase. Broaden it to include the full repository, conversation history, architectural decision records, and prior corrections and you get an agent that reasons about the whole system: one that knows why the data model looks the way it does, what mistakes have already been tried, what the intended direction is.
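One way to see context as specification is to write the context assembly down explicitly. Every name here is hypothetical; the point is that each field is a deliberate decision about what the agent knows and can do.

```python
# Context-as-specification sketch. All names and file paths are illustrative.

def build_context(task, repo_docs, decisions, corrections, tools):
    """Assemble the full informational environment for an agent call."""
    return {
        "system": "You are working inside an existing codebase. "
                  "Follow its established patterns and decisions.",
        "task": task,
        "retrieved_docs": repo_docs,         # what the agent knows
        "architecture_decisions": decisions, # why things look the way they do
        "prior_corrections": corrections,    # mistakes already tried
        "tools": tools,                      # scope of what it can do
    }

ctx = build_context(
    task="Add pagination to the /entries endpoint",
    repo_docs=["docs/api-conventions.md", "src/entries/routes.py"],
    decisions=["ADR-007: cursor-based pagination, not offset"],
    corrections=["Earlier attempt used offset pagination; rejected"],
    tools=["read_file", "write_file", "run_tests"],
)
```

Omit the ADR and the prior correction, and the agent will plausibly rebuild the offset pagination you already rejected. The gap in context becomes the gap in behavior.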

The context was the specification.

The connection to testability is direct. If you can’t describe how to verify something, you haven’t specified it. And if you haven’t specified the context — if you haven’t decided deliberately what the agent knows and doesn’t know — then you haven’t specified the agent. You’ve just hoped the implicit environment is sufficient. It usually isn’t. The gaps in context become gaps in behavior, and those gaps show up as outputs that are technically correct given what the agent knew and deeply wrong given what you intended.

Context engineering is where the specification conversation has to go next.


The Operator’s New Role

If building is cheap and code is opaque, what does the human actually do?

Design scenarios, not review code. A scenario is an end-to-end user story: “user logs in, navigates to dashboard, creates an entry, entry persists across refresh.” It describes observable behavior, not implementation details. It’s what “correct” looks like from the outside.

Own the verification criteria. The scenarios must live outside the thing being tested. An agent that writes both the code and the tests can game the tests; the simplest implementation that passes a narrow test isn’t necessarily correct. The validation criteria have to be independent of the implementation. Separation of concerns, applied to the specification process itself.

Manage the intent-to-verification loop. Start with fuzzy intent. Sharpen it through test assertions. Discover gaps. Resolve them. Feed the refined spec to the builder. Validate the output against the scenarios. Iterate. This loop is the operator’s core workflow, and it’s entirely a specification activity.

Calibrate confidence, not inspect code. Instead of line-by-line code review, the operator asks: do my scenarios cover the important behaviors? Are there edge cases I haven’t specified? What would it look like if this were wrong in a way my current scenarios don’t detect?
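The intent-to-verification loop above can be sketched in code. Every function name here is hypothetical; the shape of the loop, not the helpers, is the point.

```python
# Operator loop sketch: sharpen intent, build, validate, iterate.
# build() and validate() are illustrative stand-ins.

def operator_loop(intent, build, validate, max_rounds=5):
    """Refine a spec until the built artifact passes the operator's scenarios."""
    spec = intent
    for _ in range(max_rounds):
        artifact = build(spec)         # agent turns the spec into software
        failures = validate(artifact)  # scenarios owned by the operator
        if not failures:
            return artifact            # observable behavior matches the spec
        # Refine the spec with what the failures revealed, then go again.
        spec = spec + " | must also: " + "; ".join(failures)
    raise RuntimeError("spec still unsatisfied after max_rounds")

# Toy run: the "builder" echoes the spec; the validator demands persistence,
# forcing one refinement round before the loop converges.
result = operator_loop(
    "user can create an entry",
    build=lambda spec: spec,
    validate=lambda a: [] if "persist" in a else ["entry persists across refresh"],
)
```

Note that `validate` lives outside `build`: the loop only converges on the right thing because the verification criteria are independent of the implementation.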

The hard thinking was always specification — we just used to do it implicitly while writing code. Now the code writes itself and the thinking has to be explicit.


What This Means in Practice

The shift is already happening. People who are effective with AI coding tools aren’t the best prompters. They’re the best specifiers. They know what they want before they ask for it, can describe the expected behavior in concrete terms, and know what “done” looks like before they start.

People who struggle aren’t struggling with the AI. They’re struggling with the specification. They know they want something but can’t articulate what, precisely. So they iterate in vague directions, hoping to recognize the right answer when they see it. Sometimes that works. Usually it produces something that looks right but is subtly wrong in ways that won’t surface until much later.

The fix isn’t better prompts. It’s better specification habits:

Before asking an AI to build something, describe how you’d test it. If you can’t describe the test, you haven’t specified the thing. That’s not a failure — it’s information. It means you need to think more before you build.

Treat “I’ll know it when I see it” as a specification gap. You might be right: maybe you will recognize the correct output when it appears. But relying on recognition instead of specification means every iteration is a coin flip. Specification turns the coin flip into a convergent process.

Write the acceptance criteria before the implementation request. Three bullet points describing what “done” looks like will produce better AI output than three paragraphs describing what you want. Concrete beats comprehensive.

When the output is wrong, ask whether the specification was right. Often the AI built exactly what you asked for. The problem was that what you asked for wasn’t what you wanted. The specification was the failure point, not the implementation.
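“Acceptance criteria before the implementation request” can be practiced literally: write the checks first, then ask for the build. Here `export_csv` is a hypothetical stand-in for whatever feature was requested; the three assertions are the three bullet points of “done.”

```python
# Acceptance criteria written before the implementation request.
# export_csv() is an illustrative stand-in used to exercise the criteria.

import csv
import io

def export_csv(rows):
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["id", "name"])
    writer.writerows(rows)
    return buf.getvalue()

out = export_csv([(1, "a"), (2, "b")])
assert out.splitlines()[0] == "id,name"             # 1. header row present
assert len(out.splitlines()) == 3                   # 2. one line per data row
assert export_csv([]).splitlines() == ["id,name"]   # 3. empty input still valid
```

Three concrete checks like these constrain the build more than paragraphs of description would.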


The Literacy Metaphor

There’s a direct parallel.

Before widespread literacy, the ability to write wasn’t a general skill — it was a specialized profession. Scribes wrote. Everyone else dictated. The bottleneck was production: turning thought into written text.

Then printing collapsed the production cost. Word processors collapsed it further. The internet made distribution essentially free. And the bottleneck shifted from “can I produce text?” to “can I write clearly?” The scarce skill moved from production to composition.

We’re watching the same shift happen with software. The production cost is collapsing. The scarce skill is shifting from implementation to specification. Literacy became a universal requirement not because everyone became a professional writer, but because clear written communication became essential for participating in modern life. Specification literacy is heading the same direction.

Not everyone will be a professional specifier. But everyone who works with AI tools will need to be literate enough to describe what they want in terms that are precise, testable, and verifiable.

What you can’t specify, you can’t build correctly. What you can’t verify, you can’t trust.