build-validator — CI Validation Subagent
Open-source Claude Code subagent that validates the full build pipeline (typecheck, lint, test, build) and reports failures. Inspired by Boris Cherny.
What build-validator Solves Before You git push
Direct answer: build-validator is an open-source Claude Code subagent that runs your project's full local build pipeline (typecheck, lint, unit tests, and build) in a fixed staged order and reports a stage-by-stage pass/fail before you push to the remote. The point is to catch CI failures several minutes earlier, while the failing diff is still fresh in your head, instead of finding out from a red GitHub Actions badge after you have already switched context.
The skill is a community-written equivalent of the build-check subagent that Boris Cherny describes as part of his pre-push routine on howborisusesclaudecode.com. It auto-detects Node, Python, Go, and Rust toolchains, lives at .claude/agents/build-validator.md, and takes under a minute to set up. The four stages run in order (typecheck, lint, unit tests, build), and the run stops at the first failure unless you explicitly ask for --keep-going.
How the build-validator Subagent Works End-to-End
The subagent file lives at .claude/agents/build-validator.md and is loaded by Claude Code via the Task tool with subagent_type resolution. After saving the file and running /agents reload, you trigger it conversationally: "Run build-validator before I push." That phrase is enough — the system prompt encodes the full workflow.
Each invocation performs four deterministic operations on the current working tree:
- Detect the toolchain by inspecting manifest files at the repo root.
- Run the staged pipeline in fixed order, stopping at the first failure.
- Capture stderr — specifically the first 20 lines per failed stage, so the report stays scannable.
- Emit a structured report that lists every stage with a pass/fail glyph and a verdict line.
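The four operations above can be sketched as a small staged runner. This is a minimal illustration and not the subagent's actual implementation, which is a prompt rather than a script; the names run_pipeline, StageResult, and the keep_going parameter are hypothetical, and the demo uses stand-in commands instead of a real toolchain.

```python
import subprocess
import sys
from dataclasses import dataclass

STDERR_LINE_CAP = 20  # the article's cap on captured stderr per failed stage


@dataclass
class StageResult:
    name: str
    status: str            # "pass", "fail", or "skipped"
    stderr_head: list[str]  # at most STDERR_LINE_CAP lines


def run_pipeline(stages, keep_going=False):
    """Run (name, argv) stages in order; stop at the first failure unless keep_going."""
    results = []
    failed = False
    for name, argv in stages:
        if failed and not keep_going:
            # Later stages are recorded as skipped rather than silently omitted.
            results.append(StageResult(name, "skipped", []))
            continue
        proc = subprocess.run(argv, capture_output=True, text=True)
        if proc.returncode == 0:
            results.append(StageResult(name, "pass", []))
        else:
            failed = True
            head = proc.stderr.splitlines()[:STDERR_LINE_CAP]
            results.append(StageResult(name, "fail", head))
    return results


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere; a real run would use the
    # project's own typecheck, lint, test, and build commands.
    ok = [sys.executable, "-c", "pass"]
    bad = [sys.executable, "-c", "import sys; sys.stderr.write('boom\\n'); sys.exit(1)"]
    demo = [("typecheck", ok), ("lint", ok), ("unit tests", bad), ("build", ok)]
    for result in run_pipeline(demo):
        print(result.name, result.status, result.stderr_head)
```

Recording skipped stages explicitly, instead of dropping them, is what lets the final report show that the build never ran.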
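The four-tool whitelist (tools: Bash, Read, Grep, Glob) is declared in the YAML frontmatter, so the subagent has no Edit or Write tool and cannot modify files through Claude Code's file-editing tools. Bash can still run arbitrary shell commands, including ones that write files or reach the network, so the guarantee also depends on the rules in the prompt itself. This is a core design principle: build-validator is read-only by contract.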
The Toolchain Auto-Detection Table
The prompt template ships with this exact mapping, copied verbatim from the source skill:
| Toolchain | Detection file | typecheck | lint | unit tests | build |
|---|---|---|---|---|---|
| Node | package.json | npm run scripts (e.g. tsc --noEmit) | eslint | jest / vitest | next build / vite build |
| Python | pyproject.toml / setup.cfg | mypy | ruff | pytest | (project script) |
| Go | go.mod | go vet | golangci-lint | go test | go build |
| Rust | Cargo.toml | cargo check | cargo clippy | cargo test | cargo build |
If none of the manifests is present, the subagent does not guess; it escalates with the message "Add a validate.sh script and re-run." This boundary matters because guessing a Makefile target or blindly invoking make all is exactly how an innocent-looking command ends up launching a cluster on a CI runner.
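The detection rule can be sketched as a lookup over manifest files at the repo root. This is an illustrative sketch only: the function names detect_toolchains and describe are hypothetical, and returning "mixed" when more than one manifest is found is an assumption based on the Toolchain line of the report format.

```python
from pathlib import Path

# Manifest file -> toolchain, following the detection table above.
MANIFESTS = {
    "package.json": "Node",
    "pyproject.toml": "Python",
    "setup.cfg": "Python",
    "go.mod": "Go",
    "Cargo.toml": "Rust",
}

ESCALATION = "Add a validate.sh script and re-run."


def detect_toolchains(repo_root):
    """Return the sorted toolchains whose manifest exists at the repo root."""
    root = Path(repo_root)
    return sorted({tc for name, tc in MANIFESTS.items() if (root / name).is_file()})


def describe(repo_root):
    """Return the toolchain label, "mixed", or the escalation message."""
    found = detect_toolchains(repo_root)
    if not found:
        return ESCALATION  # never guess a build command
    return found[0] if len(found) == 1 else "mixed"
```

Only the repo root is inspected, so a nested package.json in a subdirectory does not count as a detection in this sketch.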
Why Four Stages, In That Order
The order is not aesthetic. It is chosen so that cheap checks fail fast and the developer never waits on a 90-second vite build only to be told a single semicolon is missing.
- Typecheck runs first because it is the cheapest stage that catches the largest class of obvious regressions. tsc --noEmit finishes in seconds on most repos and cargo check skips the linker; both are an order of magnitude faster than the build stage they precede.
- Lint runs second because rule violations are deterministic: eslint and cargo clippy operate on the AST, not behaviour, so they do not need a working binary.
- Unit tests run third because they require a working compile but exercise behaviour, which is the noisiest place for failures to surface.
- Build runs last because it is the slowest and least-informative stage if anything earlier broke. If types are wrong, the build will also be wrong; reporting a build failure before a typecheck failure inverts the signal-to-noise ratio.
The single-line verdict, for example Verdict: FAIL at unit tests, gives the human the answer at a glance, which is the GEO-style "direct answer in fold-above content" the skill optimises for.
Step-by-Step: Install build-validator in Under a Minute
- Create the agents directory if it does not exist: mkdir -p .claude/agents.
- Save the prompt template (with its YAML frontmatter name, description, tools) to .claude/agents/build-validator.md. The frontmatter is mandatory; Claude Code's agent loader uses it for the /agents registry.
- In an active session run /agents reload (or restart the CLI) so the new subagent appears.
- Say "Run build-validator before I push." Claude Code will route the request to the build-validator subagent automatically.
- Read the verdict line. If it is PASS, push. If it is FAIL at <stage>, fix the failure and re-invoke. Do not re-invoke from the parent agent; let build-validator finish each cycle so the report stays clean.
The whole loop — install, reload, first run — takes under 60 seconds on a project that already has a working npm test or cargo test.
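The first two install steps can also be scripted. The sketch below assumes the frontmatter fields named above (name, description, tools); the description text and the install_agent helper are hypothetical, and the prompt body is whatever template you are installing, passed in as a string.

```python
from pathlib import Path

# Frontmatter fields follow the article; the description wording is a placeholder.
FRONTMATTER = """\
---
name: build-validator
description: Runs typecheck, lint, unit tests, and build in order and reports pass/fail.
tools: Bash, Read, Grep, Glob
---
"""


def install_agent(repo_root, prompt_body):
    """Write the subagent file under .claude/agents/ and return its path."""
    agents_dir = Path(repo_root) / ".claude" / "agents"
    agents_dir.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    target = agents_dir / "build-validator.md"
    target.write_text(FRONTMATTER + "\n" + prompt_body, encoding="utf-8")
    return target
```

Calling install_agent(".", prompt_text) from the repository root leaves the file where the remaining steps expect it.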
A Real Example Session
The prompt's example block illustrates the steady state:
You: "Run build-validator before I push."
Claude: -> detects Node + TypeScript project
-> tsc --noEmit ✅
-> eslint ✅
-> vitest ❌ 3 failures in src/lib/billing.test.ts
-> stops, reports
You: "Fix the failures."
Claude: ... (fixes, re-runs build-validator until ✅)
The output format itself is fixed by the prompt:
build-validator
===============
Toolchain: <Node | Python | Go | Rust | mixed>
Duration: <seconds>
Stage results:
✅ typecheck
✅ lint
❌ unit tests — 3 failures
src/lib/billing.test.ts:42 — Expected 100, got 99
⏸️ build (skipped — earlier stage failed)
Verdict: FAIL at unit tests
Suggested fix: review the 3 test failures above; pricing math regression.
Notice three properties of the report. First, the toolchain line is always present, so a human reviewing the output later can tell which language detection branch fired. Second, the duration is reported in seconds, giving you a soft signal when the local pipeline drifts (a 12-second baseline becoming a 90-second baseline is information). Third, skipped stages are explicitly marked with ⏸️, not omitted — which prevents a future reader from mistakenly believing the build passed when it never ran.
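To make the report layout concrete, here is a minimal renderer for the format shown above. It is a sketch under assumptions: render_report and its tuple-based input are hypothetical, the Suggested fix line and the per-failure detail lines are left out, and the verdict is derived from the first failing stage.

```python
SEP = " \u2014 "  # the dash separator used in the report format above
GLYPHS = {"pass": "✅", "fail": "❌", "skipped": "⏸️"}


def render_report(toolchain, duration_s, stages):
    """Render stages, given as (name, status, detail) tuples in pipeline order."""
    lines = [
        "build-validator",
        "===============",
        f"Toolchain: {toolchain}",
        f"Duration: {duration_s}",
        "Stage results:",
    ]
    first_failure = None
    for name, status, detail in stages:
        line = f"{GLYPHS[status]} {name}"
        if status == "fail":
            line += f"{SEP}{detail}"
            first_failure = first_failure or name
        elif status == "skipped":
            # Skipped stages are printed, never omitted.
            line += f" (skipped{SEP}earlier stage failed)"
        lines.append(line)
    verdict = "PASS" if first_failure is None else f"FAIL at {first_failure}"
    lines.append(f"Verdict: {verdict}")
    return "\n".join(lines)
```

Because every stage produces a line regardless of status, a reader of an old report can always distinguish a build that passed from a build that never ran.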
When NOT to Use build-validator
The skill is opinionated and there are situations where it is the wrong tool:
- During active feature development. If you are 30 minutes into refactoring, you already know the pipeline is broken. Running build-validator just produces noise. Wait until you think you are done.
- Without a stable toolchain. If npm test itself is broken or your Cargo.toml is half-renamed, fix the project setup before installing the subagent. build-validator is a reporter, not a fixer.
- For E2E test coverage. End-to-end tests are explicitly out of scope; that is verify-app's job. The prompt enumerates this boundary verbatim: "Do not run E2E tests (use verify-app)."
- For autonomous repair. The prompt forbids auto-fixing: "Do not auto-fix anything." If you want repair, pair build-validator with a separate code-fixer subagent and let the human approve each fix.
Hard Boundaries Encoded in the Prompt
The four boundaries below are encoded in the source prompt and are non-negotiable:
| Boundary | Rationale |
|---|---|
| Do not auto-fix anything | Fail-and-report keeps the human in the loop and prevents silent regressions |
| Do not run E2E tests | Scope limit; long E2E runs belong to verify-app, not the pre-push gate |
| Do not deploy | Subagent is read-only; deploy belongs to a separate, audited tool |
| Escalate on unknown toolchain | Guessing a build command corrupts CI logs and wastes runtime |
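These boundaries are why the YAML frontmatter declares only tools: Bash, Read, Grep, Glob. Without Edit or Write, the subagent has no file-editing tool, so that part of the boundary is enforced by Claude Code's tool-permission model and not just by prose in the prompt. Bash remains a general-purpose shell, however, so the rules against auto-fixing and deploying still rely on the prompt being followed.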
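A quick way to check the tool boundary in your own copy of the file is to read the tools line out of the frontmatter. The sketch below is a hand-rolled parser, not a YAML library, and it assumes the comma-separated tools format shown in this article; declared_tools, grants_write_tools, and the WRITE_TOOLS set are hypothetical names.

```python
WRITE_TOOLS = {"Edit", "Write"}  # file-editing tool names mentioned in this article


def declared_tools(agent_markdown):
    """Return the tool names from the `tools:` line of the YAML frontmatter."""
    lines = agent_markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()  # no frontmatter block at the top of the file
    for line in lines[1:]:
        if line.strip() == "---":  # end of frontmatter; ignore the prompt body
            break
        key, _, value = line.partition(":")
        if key.strip() == "tools":
            return {t.strip() for t in value.split(",") if t.strip()}
    return set()


def grants_write_tools(agent_markdown):
    """True if the frontmatter grants any file-editing tool."""
    return bool(declared_tools(agent_markdown) & WRITE_TOOLS)
```

A False result only means no file-editing tool is granted; it says nothing about what a Bash command run by the subagent might do.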
Why a Local Pre-Push Check Beats Waiting for CI
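The economics are simple. A typical GitHub Actions run for a Node web app takes 3 to 8 minutes from push to red badge. By contrast, tsc --noEmit && eslint . && vitest run && next build on a developer laptop typically completes in 30 to 90 seconds, roughly a 5-10x latency reduction for typical runs. (Use vitest run in a chain like this: bare vitest starts in watch mode in an interactive terminal, so the build step would never be reached.)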
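The latency claim can be checked with a few lines of arithmetic. This sketch only combines the ranges quoted above (3 to 8 minutes for CI, 30 to 90 seconds locally); the speedup_range helper is a hypothetical name, and the inputs are the article's estimates, not measurements.

```python
def speedup_range(ci_seconds, local_seconds):
    """Return (worst, best) speedup for (low, high) duration ranges in seconds."""
    ci_low, ci_high = ci_seconds
    local_low, local_high = local_seconds
    worst = ci_low / local_high  # fastest CI run vs slowest local run
    best = ci_high / local_low   # slowest CI run vs fastest local run
    return worst, best


worst, best = speedup_range((3 * 60, 8 * 60), (30, 90))
print(f"{worst:.1f}x to {best:.1f}x")  # prints "2.0x to 16.0x"
```

The quoted 5-10x figure sits inside this 2x to 16x envelope, so it is best read as a typical case and not as a guaranteed bound.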
According to the GitHub Actions billing documentation cited below, self-hosted runners are free but jobs on GitHub-hosted runners in private repositories draw from a per-account minute budget. That makes local pre-push validation a budget argument as well as an attention argument: every avoided red build is a real saving on private repos. Avoiding even 5% of red builds by pre-validating locally compounds quickly. On a large company-wide Node monorepo the saving can reach tens of thousands of CI-minutes per month, on top of the much larger saving in human attention.
Comparison With Adjacent Skills in the TokRepo Catalog
| Skill | Trigger | Scope | Duration |
|---|---|---|---|
| build-validator | Before git push | typecheck + lint + unit + build | 30-90 s |
| verify-app | After build-validator passes | E2E browser tests | 2-10 min |
| /go-verify-simplify-pr | Existing PR | Verify + simplify single pass | One-shot |
| /commit-push-pr | Local diff ready | Create PR once | One-shot |
| /loop | Any prompt | Arbitrary cron-like recurrence | User-defined |
| /ralph-wiggum | Long autonomous task | Multi-hour build loop | Autonomous |
build-validator complements /commit-push-pr: run build-validator first, then ship the commit. It also complements /babysit — pair the two so that every fix Claude Code pushes in response to a review comment has been gated through your real CI stages locally.
Production Tips From Early Adopters
- Add custom stages by editing the Workflow. The prompt explicitly invites additions like prisma generate, openapi-typescript, or security scanners. Edit the markdown; no plugin system is required.
- Scope it in monorepos. Pass --filter <package> (or the tool's equivalent project selector) to your monorepo tool (Turborepo, Nx, pnpm) inside the subagent's Bash invocations. The skill author flags this as the most common request.
- Run it after /loop or /ralph-wiggum finishes. Long autonomous loops drift, so confirm the loop did not regress fundamentals before you push.
- Use it as a CI substitute when GitHub Actions is degraded. The same pipeline runs on your laptop; a green local report is the closest signal to a green remote build.
Verification: This Page Is Grounded in the Source Prompt
Every numeric claim and behavioural rule in this article maps to a line in the original prompt_template shipped with the workflow: the four-stage order, the four-toolchain detection table, the 20-line stderr cap, the --keep-going opt-out, the Verdict: FAIL at <stage> format, the Bash, Read, Grep, Glob tool whitelist, the no-auto-fix rule, the no-E2E rule, and the no-deploy rule. Nothing has been invented; the subagent behaves exactly as the prompt instructs Claude Code to behave.
Frequently Asked Questions
Does build-validator run end-to-end tests?
No. End-to-end coverage is verify-app's job. build-validator is scoped to typecheck, lint, unit tests, and build only, the four stages where a fast, deterministic local pass/fail is most useful before pushing.
Does build-validator fix the failures it finds?
No. The prompt explicitly forbids auto-fixing; build-validator is a fail-and-report tool. Pair it with a separate code-fixer subagent if you want repair, and let the human approve each fix between runs.
Does it work in a monorepo?
It works, but you should scope it. Edit the subagent prompt to pass --filter <package> (or the tool's equivalent project selector) to your monorepo tool (Turborepo, Nx, pnpm). The skill author flags this as the single most common customisation.
Can I add my own stages?
Yes. Edit the .claude/agents/build-validator.md file directly and insert your stage between the four defaults. Common additions are prisma generate, openapi-typescript, and security scanners like Semgrep.
Is this Boris Cherny's own subagent?
No. It is a community-written equivalent inspired by his public pre-push validation routine on howborisusesclaudecode.com. The behaviour matches his description but the prompt itself is open source.
What happens when no supported toolchain is detected?
The subagent escalates instead of guessing. It prints "Add a validate.sh script and re-run." Do that, and re-invoke. Guessing a build command is exactly how CI logs get corrupted on unfamiliar repositories.
Citations (5)
- Anthropic, Claude Code Subagents Documentation: Claude Code subagents are defined as Markdown files with YAML frontmatter declar…
- GitHub, About billing for GitHub Actions: GitHub Actions billing for GitHub-hosted runners draws minutes from a per-accoun…
- TypeScript Handbook, Compiler Options (--noEmit): tsc --noEmit performs TypeScript type checking without emitting JavaScript outpu…
- Go Documentation, Command vet: go vet examines Go source code and reports suspicious constructs, serving as a l…
- Rust, Clippy Book (rust-lang/rust-clippy): cargo clippy is the official Rust linter that catches common mistakes and improv…
Source & Thanks
Inspired by Boris Cherny's pre-push validation routine on howborisusesclaudecode.com.
Citations:
- howborisusesclaudecode.com
- Pragmatic Engineer: https://newsletter.pragmaticengineer.com/p/building-claude-code-with-boris-cherny
- Get Push To Prod: https://getpushtoprod.substack.com/p/how-the-creator-of-claude-code-actually