AI Vulnerability Code Audit Agent: Find Kernel-Level Bugs Like Claude Did

The top story on Hacker News today is one we've seen before, but it never stops mattering: CVE-2026-28952, an integer overflow in Apple's macOS Tahoe 26.5 kernel, found by Claude in collaboration with Calif.io and Anthropic Research. Meanwhile, Nolan Lawson's post "Using AI to write better code more slowly" (145 points) argues the opposite of the slop-cannon narrative: AI is at its best when it carefully audits your code, not when it generates as much as possible.

Published by GetClawCloud · May 26, 2026

💥 This week's Hacker News signal:

Two stories. One unified truth: AI is an exceptional vulnerability finder — not because of speed, but because of thoroughness.

Nolan's post captures this perfectly. He runs a Claude sub-agent, Codex, and Bugbot on every PR, has each one rank the bugs it finds by severity, then cross-references the results to weed out false positives. The result? "Always finds tons of bugs, false positive rate near zero." On the kernel side, the CVE-2026-28952 work pointed Claude at macOS kernel source and turned up an integer overflow that could cause unexpected system termination across five major OS versions.

This isn't about using AI to write more code. It's about using AI to find the vulnerabilities in the code you already have — before attackers do.

Why Code Auditing Needs AI (Not Just More Developers)

The best security researchers are thorough. They don't skim; they check every path, every edge case, every integer boundary. But human thoroughness doesn't scale.

The difference between a good team and a great one isn't how fast they ship — it's how thoroughly they validate what they shipped.

The Problem: Most AI Code Review Is Surface Level

"Run this through GPT for a code review" produces one thing: generic advice. "Use proper error handling." "Add input validation." "Consider edge cases." It's noise — the kind of review that makes developers ignore AI code review entirely.

The approach that actually works, as shown by Apple's CVE disclosure and Nolan's multi-model workflow, is structured, deep, multi-perspective auditing.

The Prompt: AI Vulnerability Code Audit Agent

This prompt turns any OpenClaw-powered Telegram bot into a dedicated code vulnerability auditor. It follows the same multi-model, deep-audit approach that found CVE-2026-28952 — but tuned for your specific codebase.

⚠️ Prerequisites: Your OpenClaw agent needs access to your codebase. Connect via GitHub, local files, or paste code snippets directly into Telegram. For private repos, use a personal access token.

How to Use It

  1. Deploy OpenClaw on GetClawCloud — one click, no server setup
  2. Paste this prompt to your bot
  3. Point it at a file, a commit diff, or a full repo and say "audit this for vulnerabilities"

You are an AI Vulnerability Code Audit Agent. Your job is to audit code for security vulnerabilities the same way security researchers found CVE-2026-28952 in the macOS kernel: thorough, multi-perspective, ranked by exploitability, with near-zero false positives.

## Workflow

### Phase 1: Scope & Inventory

Before auditing, confirm the scope:

1. What codebase or file is being audited? (language, framework, purpose)
2. What trust boundary does this code live at? (user-facing API, kernel module, internal tool, frontend JS)
3. What kind of audit depth? (quick surface scan / deep structural audit / full multi-model cross-reference)

### Phase 2: Structured Vulnerability Scan

For the provided code, check each vulnerability class in order:

**A. Memory & Integer Safety (Kernel-Level)**
- Check every integer operation for overflow/underflow (like CVE-2026-28952: integer overflow due to lack of input validation)
- Check buffer allocations for off-by-one errors
- Check unsafe pointer arithmetic, null dereferences, use-after-free
- Check for missing bounds checks before array/vector access

**B. Input Validation & Injection**
- Check all user-controlled inputs for injection (SQL, XSS, command injection, path traversal)
- Check for missing or insufficient input sanitization
- Check for type confusion: accepting unexpected types that bypass validation
- Check for prototype pollution in JavaScript/Node.js

**C. Authentication & Authorization**
- Check for missing auth checks on sensitive endpoints
- Check for privilege escalation paths
- Check for hardcoded credentials, tokens, API keys
- Check for insecure session handling

**D. Cryptographic & Data Handling**
- Check for weak or custom cryptography
- Check for hardcoded secrets, insecure random number generation
- Check for data exposure through error messages, debug endpoints, logs
- Check for missing encryption at rest or in transit

**E. Business Logic Flaws**
- Check for race conditions (TOCTOU, async state mutations)
- Check for insecure direct object references
- Check for lack of rate limiting on critical operations
- Check for logic gaps that could bypass security controls

### Phase 3: Multi-Model Cross-Reference

For critical and high findings:

1. Re-examine the finding from a different angle (e.g., if model A flags an integer overflow, verify the exact execution path that triggers it)
2. Only report findings you are confident are real vulnerabilities
3. Mark any finding where the exploitation path is unclear as "requires manual verification"; do not include it in the critical/high list

### Phase 4: Rank & Deliver Briefing

Format your delivery as:

🚨 **Vulnerability Audit Report**
**Code:** [file/repo name]
**Depth:** [surface / deep / cross-reference]
**Model:** [your model name]

🔴 **Critical** (0-3 items, proven exploit path, fix immediately)
- [Vulnerability] in [location]
  - Impact: [what an attacker can do]
  - Exploitation: [how to trigger, step by step]
  - Fix: [specific code change recommendation]

🟡 **High** (proven but harder to exploit)
- Findings with enough detail to reproduce
- Impact and recommended fix

🟠 **Medium** (likely vulnerable, needs manual verification)
- Findings with partial evidence
- What to test manually

⚪ **Low / Informational** (best practices, hardening)
- Notes for long-term improvement

✅ **All Clear** (if nothing was found)
- "No vulnerabilities detected. [Top 3 strengths of this code]."

## Rules

- If a finding lacks a clear exploitation path, mark it as "requires manual verification"; do not include in critical
- Always explain why a vulnerability is exploitable, not just that it exists
- For each critical finding, provide the exact code change to fix it
- Cross-reference with a different mental model before reporting anything critical
- Reference CVE-2026-28952 style: integer overflows from missing bounds validation are the most common deep bugs
- "Nothing found" is a valid result; do not fabricate findings
- Output in plain text with clear headers, suitable for Telegram delivery

💡 For best results, paste actual code files or commit diffs. The agent performs better with more context — include the full function or file, not just snippets.
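
Phase 3 and Phase 4 of the prompt amount to a small merge-and-rank step. The sketch below is illustrative only: the `Finding` record, the reviewer labels, and the rule that a critical or high finding needs two independent reviewers are assumptions made for this example, not part of OpenClaw or of Nolan's actual setup.

```python
from collections import defaultdict
from dataclasses import dataclass

# Severity buckets from the report format, most urgent first.
SEVERITY_ORDER = ["critical", "high", "medium", "low"]


@dataclass(frozen=True)
class Finding:
    reviewer: str    # hypothetical reviewer label, e.g. "model-a"
    location: str    # e.g. "parser.c:parse_header"
    vuln_class: str  # e.g. "integer-overflow"
    severity: str    # one of SEVERITY_ORDER


def cross_reference(findings, min_reviewers=2):
    """Group findings by (location, class) and rank them.

    A critical/high finding seen by fewer than `min_reviewers` distinct
    reviewers is downgraded to "medium", the bucket the report format
    reserves for items that need manual verification.
    """
    grouped = defaultdict(list)
    for f in findings:
        grouped[(f.location, f.vuln_class)].append(f)

    report = []
    for (location, vuln_class), group in grouped.items():
        reviewers = sorted({f.reviewer for f in group})
        # Start from the most severe rating any reviewer assigned.
        severity = min((f.severity for f in group), key=SEVERITY_ORDER.index)
        if severity in ("critical", "high") and len(reviewers) < min_reviewers:
            severity = "medium"
        report.append({
            "location": location,
            "class": vuln_class,
            "severity": severity,
            "reviewers": reviewers,
        })

    # Most urgent first; ties broken by location for stable output.
    report.sort(key=lambda r: (SEVERITY_ORDER.index(r["severity"]), r["location"]))
    return report
```

Requiring agreement before anything lands in the top buckets is one simple way to trade a little recall for a lower false-positive rate; the threshold is a tunable choice, not a fixed rule.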

Real Audit Examples You Can Run Right Now

🔍 Kernel / Systems Code Audit
"Audit this C function for integer overflows and buffer overruns. Check every arithmetic operation for missing bounds validation — like CVE-2026-28952."
Best for: embedded code, kernel modules, file parsers, protocol implementations.
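
To make the bug class concrete, here is a minimal sketch that emulates 32-bit unsigned arithmetic in Python. It is a generic illustration of an unchecked size calculation, not a reconstruction of CVE-2026-28952; the function names and the 32-bit width are assumptions chosen for the example.

```python
UINT32_MAX = 0xFFFFFFFF


def alloc_size_unchecked(count: int, elem_size: int) -> int:
    """Emulate `count * elem_size` stored in a 32-bit unsigned integer.

    The product silently wraps modulo 2**32, as unsigned 32-bit
    arithmetic does in C on typical platforms.
    """
    return (count * elem_size) & UINT32_MAX


def alloc_size_checked(count: int, elem_size: int) -> int:
    """Same calculation, but reject inputs whose product would not fit."""
    if count < 0 or elem_size < 0:
        raise ValueError("sizes must be non-negative")
    # Divide instead of multiplying so the check itself cannot overflow.
    if elem_size != 0 and count > UINT32_MAX // elem_size:
        raise OverflowError("count * elem_size exceeds 32 bits")
    return count * elem_size


# An attacker-controlled count: 0x10000001 elements of 16 bytes each.
wrapped = alloc_size_unchecked(0x10000001, 16)
print(wrapped)  # 16: the buffer would be far smaller than the caller expects
```

The checked variant shows the shape of fix an audit should recommend: validate the operands before the multiplication, not the result after it.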

🏗️ Pull Request Security Review
"Review this PR diff. I need a vulnerability-focused audit — ignore style issues and best practices, focus purely on security: injection, auth bypass, data exposure, race conditions."
Best for: pre-merge security gates on production code.

📦 npm Package Audit
"Here's the source of a third-party npm package I want to use. Audit it for malicious code, data exfiltration, prototype pollution, supply chain risks."
Best for: vetting dependencies before adding them to your project.

Why This Beats Traditional SAST Tools

✅ Exploit-path reasoning: not just "this line is vulnerable", but "here's exactly how an attacker triggers it"

✅ Business logic awareness: catches flaws SAST tools miss because they don't understand intent

✅ Multi-model cross-reference: cuts false positives by checking each finding from different analytical perspectives

✅ No false urgency: "nothing found" is a valid report, so you aren't left chasing ghosts

✅ Delivered to Telegram: no dashboard, no pipeline config, no false-positive floods

Apple's security team spent months auditing macOS 26.5. An AI agent can do a deep audit of your codebase in minutes — and flag the same class of bugs that earned a CVE.

Deploy Your Vulnerability Code Audit Agent in 60 Seconds

OpenClaw on GetClawCloud gives you a Telegram AI agent with code reading, web search, and customizable prompts — no server setup, no Docker, no config files. Paste the prompt above and start auditing your codebase for the same class of vulnerabilities that Claude found in the macOS kernel.

Deploy on GetClawCloud →