
The #1 Problem with AI Coding Agents Isn't the Code — It's the Spec

Two of today's top Hacker News posts independently arrived at the same conclusion: AI agents write code faster than teams can write specs. Here's the practical fix — a spec validator agent that guards your pipeline.

Published by GetClawCloud · May 7, 2026

Simon Willison published "Vibe coding and agentic engineering are getting closer than I'd like" today — and it's #3 on Hacker News with 399 points and 432 comments. Meanwhile, another post sat at #1 for hours with over 500 points: "The bottleneck was never the code."

Both hit the same nerve from different angles.

Willison's concern: as coding agents get more reliable, even experienced engineers stop reviewing every line. They trust the agent for routine work — JSON endpoints, SQL queries, boilerplate. The guilt creeps in when they realize they haven't read the code they're shipping to production.

The second post goes deeper: "The bottleneck was never the code. For fifty years the residue was expensive enough to keep our attention on it. With coding agents the cost has fallen far enough that we can see what's underneath: people trying to agree."

"What slows down a team where agents do the implementation is the production of specifications precise enough for an agent to pick up and run. Engineers are not waiting on other engineers anymore. They are waiting on the next well-formed spec."

This is the hidden tax of vibe coding: a feature that takes 10 minutes to code might take 2 hours to spec properly. And if the spec is wrong, the agent happily writes 300 lines of the wrong thing — at full speed.

Jevons Paradox Has Entered the Chat

One of the most insightful points from the HN thread: Jevons Paradox. When code gets 10x cheaper to write, teams don't write 10% of the code for the same result. They write more code. Internal tools for problems nobody quite had. Prototypes that would've been "not worth the time" three months ago.

Steve Jobs (1997): "Focus is saying no." The discipline of saying no gets harder when every "yes" costs one prompt instead of three days of engineering.

The fix isn't to stop using coding agents. It's to build a validation layer before the agent touches any code.

How an AI Spec Validator Changes the Game

Instead of sending a vague prompt directly to Claude Code, Cursor, or Codex, route it through a spec validator agent first. The validator:

  1. Scans for ambiguous terms ("handle", "optimize", "support") and proposes concrete replacements
  2. Checks completeness: input format, output format, error states, edge cases, success criteria, dependencies, and non-goals
  3. Estimates the real scope, from a trivial one-file change to a multi-system integration
  4. Generates the test scenarios a coding agent should handle
  5. Returns a verdict: PASS, CONDITIONAL, or REJECT

One Telegram message, one agent, one validation pass — and you never waste an agent's time on a broken spec again.
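In code, the routing is just a gate in front of your coding agent. Here is a minimal Python sketch; `call_agent`, `VALIDATOR_PROMPT`, and `CODER_PROMPT` are hypothetical placeholders for your own agent client and the prompt further down this page:

```python
# Minimal sketch of the validate-then-code gate. `call_agent` is a
# hypothetical helper: wire it to your own OpenClaw/LLM client.

VALIDATOR_PROMPT = "You are an AI Specification Validator..."  # full prompt below
CODER_PROMPT = "You are a coding agent. Implement exactly what the spec says."

def call_agent(system_prompt: str, message: str) -> str:
    """Hypothetical: send `message` to an agent primed with `system_prompt`."""
    raise NotImplementedError("connect this to your agent runtime")

def submit_feature(spec: str) -> str:
    report = call_agent(VALIDATOR_PROMPT, spec)
    if "Verdict: PASS" in report:
        return call_agent(CODER_PROMPT, spec)  # spec is clean: go build
    return report  # CONDITIONAL/REJECT: fix the spec instead of burning agent turns
```

The design choice is the point: the coding agent never sees a spec that has not cleared validation, so a bad spec costs you one cheap validation pass instead of 300 lines of the wrong implementation.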

The Prompt: AI Spec Validator Agent

Copy-paste this into your OpenClaw-powered Telegram bot, then send it any feature request, ticket, or prompt you'd give to a coding agent.

How to use:

  1. Deploy OpenClaw on GetClawCloud
  2. Paste the prompt as your first message
  3. Send any spec, ticket, or prompt — the agent validates it

You are an AI Specification Validator and Scope Auditor. Your job is to analyze feature requests, tickets, and prompts before they go to a coding agent.

## Input

User will provide one of:
- A feature description ("Build a user dashboard that shows...")
- A Jira/Linear ticket or PRD excerpt
- A direct prompt intended for Claude Code, Cursor, or Codex
- A natural language request ("Can the AI add a search bar?")
- A complete spec they want reviewed

## Validation Workflow

### Phase 1: Ambiguity Scan

Scan the input for vague or underspecified terms. Flag words like:
- "handle", "process", "manage" (what precisely does this mean?)
- "optimize", "improve", "better" (against what metric?)
- "support", "allow", "enable" (with what interface?)
- "sometime", "eventually", "when needed" (what triggers it?)
- "properly", "nicely", "well" (not testable)

For each flagged term, propose a concrete replacement.

### Phase 2: Completeness Check

Check if the spec defines:
- [ ] Input format (what data comes in, what shape is it?)
- [ ] Output format (what gets produced, how is it consumed?)
- [ ] Error states (what happens when the DB is down? Input is malformed?)
- [ ] Edge cases (empty state, max size, concurrency, rate limits)
- [ ] Success criteria (how do you know it works?)
- [ ] Dependencies (does this touch the auth module? payment system?)
- [ ] Non-goals (what is explicitly NOT in scope?)

For each missing item, flag it with a severity: 🔴 Critical (will produce wrong output) / 🟡 Warning (will require rework) / 🔵 Info (nice to have).

### Phase 3: Scope Estimation

Estimate the actual scope:
- 🔵 Trivial — 1 file, no new dependencies, <20 lines
- 🟢 Small — 1-2 files, existing patterns, <100 lines
- 🟡 Medium — 3-5 files, new API calls, some state management
- 🟠 Large — 5+ files, new integrations, data migrations
- 🔴 Complex — multiple systems, external APIs, auth/security implications

If the scope level doesn't match the request (e.g. "just add a search bar" → 🟠 Large because it needs indexing, auth filtering, and pagination), explain the gap.

### Phase 4: Test Scenario Generation

Generate 3-5 test scenarios the coding agent should handle:
1. Happy path (the obvious case)
2. Sad path (error case)
3. Edge case (unusual but possible)
4. Performance boundary (if applicable)
5. Security consideration (if applicable)

For each scenario, describe the input, expected behavior, and how you'd verify it.

## Output Format

## Spec Validation Report

### 🚩 Ambiguities Found
- "handle errors" → Replace with: "On HTTP 500 response, log the error to /var/log/app.log and return a 502 status with JSON body { 'error': 'upstream_failure' }"

### 📋 Completeness (X/Y items defined)
🔴 Missing: input format, error states
🟡 Missing: edge cases
🔵 Nice to have: non-goals

### 📏 Scope: [Level]
[Explanation of scope gap if any]

### 🧪 Test Scenarios
...

### ✅ Verdict: [PASS / CONDITIONAL / REJECT]
- PASS: Spec is clear, send to coding agent
- CONDITIONAL: Fix the 🔴 items first, then proceed
- REJECT: The spec needs significant work before a coding agent can be productive

## Rules

- Be ruthless but constructive. The goal is to prevent wasted agent time.
- Always provide concrete replacements for vague terms, not just "this is vague."
- Cite real scenarios where ambiguous specs have caused bugs.
- If the user provides a direct coding agent prompt, show them the revised version.
- Output in plain text with clear section headers.

💡 Works in any OpenClaw agent. Paste, send any spec, and the agent validates it before it reaches your coding agent.
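As a cheap pre-filter, Phase 1's vague-term scan can even run deterministically before you spend an agent turn. A sketch, with the term list lifted straight from the prompt above:

```python
import re

# Vague terms flagged in Phase 1 of the validator prompt above.
VAGUE_TERMS = [
    "handle", "process", "manage", "optimize", "improve", "better",
    "support", "allow", "enable", "sometime", "eventually", "when needed",
    "properly", "nicely", "well",
]

def ambiguity_scan(spec: str) -> list[str]:
    """Return the vague terms present in `spec`, as whole-word matches."""
    hits = []
    for term in VAGUE_TERMS:
        if re.search(rf"\b{re.escape(term)}\b", spec, flags=re.IGNORECASE):
            hits.append(term)
    return hits

print(ambiguity_scan("Handle errors properly and optimize the query."))
# ['handle', 'optimize', 'properly']
```

A regex can only tell you a vague word is present; the agent's job is the harder half, proposing the concrete replacement.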

Real Scenarios This Agent Handles

📝 "Validate this ticket before I assign it"
Paste a Linear ticket. The agent flags missing acceptance criteria, vague terms, and hidden complexity. Use this before every sprint planning session.

🤖 "Check my prompt before I send it to Claude Code"
Share the prompt you're about to send. The agent reviews it for ambiguity and suggests concrete revisions that save you a feedback cycle.

📐 "Estimate this feature"
Give a one-line feature request ("add dark mode"). The agent surfaces the actual scope: CSS variables, theme switching, persistence, system preference detection, accessibility contrast checks.

🧪 "Generate test scenarios for this spec"
Already have a spec but want to make sure it's testable? The agent generates happy-path, error, and edge case scenarios that your coding agent should handle.

🔄 "Review my PRD"
Paste an entire product requirements document. The agent scans every section and produces a structured validation report with concrete improvement suggestions.

Why Two Hacker News Posts Agree (And What to Do About It)

Today's top HN stories share a thesis:

| Post | Score | Key Insight |
| --- | --- | --- |
| "Vibe coding and agentic engineering are getting closer" | 399 | Trusting agents for production code means we review less. The safety net moves upstream. |
| "The bottleneck was never the code" | 511 | Writing code is now the cheapest part. Spec-writing, negotiating, and agreeing are the bottleneck. |

The practical response isn't to stop vibe coding — it's to add a spec validation step between the idea and the implementation. One agent that checks your spec, another that writes the code. Both feed back into each other.
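That feedback loop is the gate from the earlier sketch run in rounds: when the verdict comes back CONDITIONAL or REJECT, the report flows into a spec revision before the coder ever runs. A sketch, reusing the same hypothetical `call_agent`, `VALIDATOR_PROMPT`, and `CODER_PROMPT` stubs from above:

```python
# Sketch of the validator/coder feedback loop (same hypothetical stubs
# as the earlier gate example).

MAX_ROUNDS = 3

def validate_then_code(spec: str) -> str:
    for _ in range(MAX_ROUNDS):
        report = call_agent(VALIDATOR_PROMPT, spec)
        if "Verdict: PASS" in report:
            return call_agent(CODER_PROMPT, spec)
        # Feed the report back: revise the spec, then re-validate.
        spec = call_agent(
            "Revise this spec to resolve every 🔴 item in the validation report.",
            f"SPEC:\n{spec}\n\nREPORT:\n{report}",
        )
    raise RuntimeError("spec still failing validation after 3 rounds")
```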

This is the pattern that's working for teams shipping at speed without accumulating the technical debt of a dozen half-baked features.

⚠️ A spec validator doesn't replace code review. Simon Willison's point still stands: when you ship AI-generated code to production, you're responsible for every line. The spec validator prevents you from building the wrong thing — but you still need to verify that the right thing was built correctly.

Spec Validation as a Daily Habit

The best teams are adding spec validation to their daily workflow:

  1. Run every new ticket through the validator before sprint planning
  2. Gate every coding agent prompt behind a quick validation pass
  3. Re-validate the PRD whenever requirements change

With OpenClaw's cron scheduling, you can automate recurring checks delivered directly to Telegram — no dashboards to watch. If you're curious what that plumbing looks like by hand, see the sketch below.
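OpenClaw's cron handles the scheduling natively; this hand-rolled sketch just shows the shape of the plumbing. A plain crontab runs a script each weekday morning that pushes anything needing validation into Telegram via the Bot API. `fetch_new_tickets` and the environment variables are hypothetical placeholders:

```python
# Hand-rolled sketch of a daily check (OpenClaw's cron can do this for you).
# Example crontab entry:  0 9 * * 1-5  python validate_backlog.py
import os

import requests

BOT_TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]  # from @BotFather
CHAT_ID = os.environ["TELEGRAM_CHAT_ID"]

def send_to_telegram(text: str) -> None:
    # Telegram Bot API sendMessage: delivers straight to your chat.
    requests.post(
        f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage",
        json={"chat_id": CHAT_ID, "text": text},
        timeout=10,
    )

def fetch_new_tickets() -> list[str]:
    """Hypothetical: pull tickets created since the last run from your tracker."""
    return []

if __name__ == "__main__":
    for ticket in fetch_new_tickets():
        send_to_telegram(f"Spec needs validation before it hits a coding agent:\n\n{ticket}")
```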

How to Use It

  1. Deploy an OpenClaw agent on GetClawCloud — no VPS, no Docker, no config, 1-minute setup
  2. Paste the spec validator prompt above into your Telegram bot
  3. Send any feature request, ticket, or coding prompt for validation

One agent, one prompt, one message — and your coding agents never waste a turn on a broken spec again.

The hottest discussion on Hacker News today isn't about which coding agent is fastest. It's about what happens after speed stops being the bottleneck. The answer: build a spec validation layer before your agents touch any code. Paste the prompt above and start validating today.

Deploy Your Spec Validator in 1 Minute

Launch OpenClaw on the cloud, connect Telegram, and paste the validation prompt. No server setup, no complex pipeline config.

Start with GetClawCloud →