TOKREPO · ARSENAL

Refactor a Heavy Codebase

Ten picks for the engineer staring down a god class, a monolith split, or a move-to-typed migration. Coverage gate first, then codemods, then an AI agent, then dead-code cleanup, then diff review, so the change lands without breaking prod.

10 assets

About this pack

What this pack solves

You inherited the file. It's 4,300 lines, named OrderService.ts, and nobody on the team will touch it. You need to split it, type it, or kill it — and the existing test suite covers 18%. Letting an LLM rewrite the whole thing from a prompt is how prod goes down at 2am. Letting yourself rewrite it by hand is how the quarter disappears.

This pack is the pipeline working engineers use instead: a coverage gate first, structural codemods for the 80% of changes that are mechanical, an AI agent for the ambiguous surgery, a dead-code cleanup pass, and a structural diff review before merge. Every tool here belongs to a specific stage. Install in order, use in order.

The five-stage pipeline

Stage 1 — Lock the behavior (coverage gate)

Tidy First (Kent Beck's discipline, packaged as a skill) — separates structural changes (rename, extract, move) from behavioral changes (logic). The discipline: never do both in the same commit. The skill enforces it. Read this before you touch anything.
Technical Debt Manager (Claude Code agent) — scans the target file/module and produces a ranked list: what to fix first, what the blast radius looks like, where coverage gaps will hide regressions. Use the output to write characterization tests for the current behavior before you change anything. If your coverage on the touched paths isn't ≥80%, stop and write tests first. This is the gate.
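
A characterization test pins whatever the code does today, quirks included. The sketch below is a minimal illustration, assuming a hypothetical legacyOrderTotal function standing in for a method buried in OrderService.ts; the function, its arguments, and the pinned values are invented for the example and do not come from any tool in this pack.

```typescript
// Hypothetical legacy function (an assumption for illustration only).
// Quirk worth pinning: the discount is subtracted after tax is applied.
function legacyOrderTotal(
  subtotalCents: number,
  taxRate: number,
  discountCents: number
): number {
  const withTax = Math.round(subtotalCents * (1 + taxRate));
  return Math.max(0, withTax - discountCents);
}

// Each case records what the code returns today, not what it "should" return.
const pinnedCases: Array<{ args: [number, number, number]; expected: number }> = [
  { args: [1000, 0.1, 0], expected: 1100 },
  { args: [1000, 0.1, 200], expected: 900 },
  { args: [100, 0.1, 500], expected: 0 }, // total is clamped at zero
];

// Returns a list of mismatches; an empty list means behavior is unchanged.
function runCharacterization(): string[] {
  const failures: string[] = [];
  for (const { args, expected } of pinnedCases) {
    const actual = legacyOrderTotal(...args);
    if (actual !== expected) {
      failures.push(`${JSON.stringify(args)}: expected ${expected}, got ${actual}`);
    }
  }
  return failures;
}

console.log(runCharacterization().length === 0 ? "behavior locked" : "behavior drifted");
```

Run the same cases after every structural commit. If a case starts failing, the commit changed behavior and does not belong in a tidy-only change.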

Stage 2 — Mechanical changes (codemods, not LLMs)

ast-grep — structural search-and-replace using tree-sitter. Find every callsite of getUserById(string) and rewrite it to getUserById({ id: string }) in one command, across 2,000 files, with zero false positives from string matching. This is the right tool for the 80% of a refactor that is mechanical.
GritQL — declarative pattern rewrites, slightly higher-level than ast-grep. Better when the transformation involves moving code between blocks (extract method, inline variable). Pick GritQL when the rule reads like "if you see X near Y, rewrite as Z".
Codemod — AI-powered migration CLI. Use for migrations the open-source community has already encoded (React class→hooks, Mocha→Vitest, Node http→fetch). Check the registry before writing your own codemod.
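
To see why string matching produces false positives, here is a small sketch of the naive alternative: a regex rewrite of the getUserById call shape. The sample file contents and helper names are made up for the illustration. A syntax-tree matcher would only consider the call expression on the first line, because comments and string literals are different node kinds.

```typescript
// Naive text-based rewrite: a regex cannot tell code from comments or strings.
function naiveTextRewrite(code: string): string {
  return code.replace(/getUserById\(([^)]*)\)/g, "getUserById({ id: $1 })");
}

// Count the lines that differ between two versions of the same file.
function countChangedLines(before: string, after: string): number {
  const a = before.split("\n");
  const b = after.split("\n");
  return b.filter((line, i) => line !== a[i]).length;
}

// Hypothetical file contents: one real call site, one comment, one log string.
const sampleFile = [
  "const user = getUserById(userId);",
  "// TODO: getUserById(id) is slow, cache it",
  'logger.info("calling getUserById(x)");',
].join("\n");

const rewrittenFile = naiveTextRewrite(sampleFile);
const changedLineCount = countChangedLines(sampleFile, rewrittenFile);

// Only the first line is a real call site, yet the regex edits all three.
console.log(`${changedLineCount} lines changed, 1 real call site`);
```

Multiply those two spurious edits by 2,000 files and the review burden of a text-based rewrite becomes clear.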

Stage 3 — The ambiguous surgery (AI agent)

Refactoring Specialist (Claude Code agent) — install path development-tools/refactoring-specialist. Hand it the file, the characterization tests from Stage 1, and clear instructions: "extract these three methods into a PricingPolicy class, keep the public interface of OrderService unchanged, do not touch behavior". The agent will produce a diff. You review it.
code-simplifier — Anthropic's official cleanup subagent. After the agent does the heavy lift, run code-simplifier across the diff to collapse trivial verbosity (nested ternaries, repeated guards). Surgical, scoped.
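
As a rough picture of the target shape for that instruction, the sketch below shows an OrderService that delegates to an extracted PricingPolicy while keeping its public method intact. The method names (calculateTotal, subtotal, bulkDiscount), the LineItem type, and the discount rule are all hypothetical; a real extraction would carry over whatever the legacy methods actually do.

```typescript
interface LineItem {
  unitPriceCents: number;
  quantity: number;
}

// Extracted class: pricing rules that previously lived as private methods
// on OrderService. The rules themselves are invented for this example.
class PricingPolicy {
  subtotal(items: LineItem[]): number {
    return items.reduce((sum, item) => sum + item.unitPriceCents * item.quantity, 0);
  }

  bulkDiscount(subtotalCents: number): number {
    return subtotalCents >= 10000 ? Math.round(subtotalCents * 0.05) : 0;
  }

  total(items: LineItem[]): number {
    const subtotalCents = this.subtotal(items);
    return subtotalCents - this.bulkDiscount(subtotalCents);
  }
}

class OrderService {
  private readonly pricing: PricingPolicy;

  constructor(pricing: PricingPolicy = new PricingPolicy()) {
    this.pricing = pricing;
  }

  // Public interface unchanged: same name, same parameters, same return type.
  calculateTotal(items: LineItem[]): number {
    return this.pricing.total(items);
  }
}

const service = new OrderService();
console.log(service.calculateTotal([{ unitPriceCents: 2500, quantity: 4 }]));
```

Callers of OrderService see no difference, which is what lets the Stage 1 characterization tests run unmodified against the new structure.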

Stage 4 — Remove the bodies

Unused Code Cleaner (Claude Code agent) — after the split, dead imports, unused exports, and orphaned helpers will litter the diff. This agent finds them with cross-reference, not regex. Run it last, before the diff review — never before, because mid-refactor the "unused" stuff is often about to be wired up again.
Legacy Modernizer (Claude Code agent) — for the move-to-typed leg specifically: untyped JS → TS, Python → typed Python, callback → async. Use after Stage 2 codemods if the refactor includes a language/idiom upgrade.
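
The cross-reference idea can be sketched in a few lines: an export is unused when no module imports it. This is a simplified model under stated assumptions, not how the Unused Code Cleaner agent is implemented; the ModuleInfo shape and the sample module map are hypothetical, and a real tool would build this data from parsed source files.

```typescript
interface ModuleInfo {
  exports: string[];
  imports: Array<{ from: string; names: string[] }>;
}

// Report every export that no module in the map imports.
function findUnusedExports(modules: Record<string, ModuleInfo>): string[] {
  const used = new Set<string>();
  for (const info of Object.values(modules)) {
    for (const imp of info.imports) {
      for (const importedName of imp.names) {
        used.add(`${imp.from}#${importedName}`);
      }
    }
  }

  const unused: string[] = [];
  for (const [path, info] of Object.entries(modules)) {
    for (const exportedName of info.exports) {
      const key = `${path}#${exportedName}`;
      if (!used.has(key)) unused.push(key);
    }
  }
  return unused.sort();
}

// Hypothetical state of the repo after the split.
const sampleModules: Record<string, ModuleInfo> = {
  "OrderService.ts": {
    exports: ["OrderService", "formatLegacyInvoice"],
    imports: [{ from: "PricingPolicy.ts", names: ["PricingPolicy"] }],
  },
  "PricingPolicy.ts": { exports: ["PricingPolicy", "roundCents"], imports: [] },
  "main.ts": {
    exports: [],
    imports: [{ from: "OrderService.ts", names: ["OrderService"] }],
  },
};

console.log(findUnusedExports(sampleModules));
```

The same model shows why timing matters: a helper like roundCents looks unused right up until the commit that wires it in, so a mid-refactor run would flag it for deletion.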

Stage 5 — The merge gate (structural diff review)

code-review-graph MCP — gives your code-review agent (or you) a graph-aware view of the diff: which callers are affected, what changed in the public interface, where the test coverage gaps now sit. This is the last gate before merge. If the graph shows untested paths in the diff, send it back to Stage 1.
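
The "which callers are affected" question is a reverse reachability walk over a call graph. The sketch below illustrates the idea with a hand-written graph; it does not use or represent the code-review-graph MCP API, and the function names and graph contents are assumptions made up for the example.

```typescript
// Maps each function to the functions it calls.
type CallGraph = Record<string, string[]>;

// Walk the graph backwards from the changed functions to find every
// direct and transitive caller.
function affectedCallers(graph: CallGraph, changed: string[]): string[] {
  const callersOf = new Map<string, string[]>();
  for (const [caller, callees] of Object.entries(graph)) {
    for (const callee of callees) {
      const list = callersOf.get(callee) ?? [];
      list.push(caller);
      callersOf.set(callee, list);
    }
  }

  const seen = new Set<string>(changed);
  const queue: string[] = changed.slice();
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const caller of callersOf.get(current) ?? []) {
      if (!seen.has(caller)) {
        seen.add(caller);
        queue.push(caller);
      }
    }
  }
  return Array.from(seen).filter((fn) => !changed.includes(fn)).sort();
}

// Affected functions that no test exercises: these go back to Stage 1.
function untestedInBlastRadius(affected: string[], tested: Set<string>): string[] {
  return affected.filter((fn) => !tested.has(fn));
}

// Hypothetical call graph around a changed calculateTotal.
const sampleGraph: CallGraph = {
  checkoutHandler: ["placeOrder"],
  placeOrder: ["calculateTotal"],
  adminReport: ["calculateTotal"],
  calculateTotal: [],
};

const blastRadius = affectedCallers(sampleGraph, ["calculateTotal"]);
const reviewGaps = untestedInBlastRadius(blastRadius, new Set(["placeOrder", "checkoutHandler"]));
console.log({ blastRadius, reviewGaps });
```

Here adminReport reaches the changed function but has no test, so under the rule above the diff would go back to Stage 1 before merge.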

How they fit together

[Stage 1] Tidy First ──> Technical Debt Manager ──> characterization tests
                                                          │
                                                          ▼ (coverage ≥80%)
[Stage 2] ast-grep / GritQL / Codemod  ◀─── 80% mechanical changes
                                                          │
                                                          ▼
[Stage 3] Refactoring Specialist ──> code-simplifier  ◀─ ambiguous surgery
                                                          │
                                                          ▼
[Stage 4] Unused Code Cleaner ──> Legacy Modernizer (if typed migration)
                                                          │
                                                          ▼
[Stage 5] code-review-graph MCP  ─────────────────> merge

The critical rule: don't skip Stage 1. Every horror story about an AI agent "refactoring" a codebase into rubble starts with a missing characterization test. The agent moved code that looked dead and was actually load-bearing. Tests catch that. Nothing else does.

Tradeoffs you'll hit

ast-grep vs GritQL — ast-grep is faster, simpler, ships everywhere. GritQL is more expressive when patterns need control flow. Default to ast-grep; reach for GritQL when the rule needs to say "after the assignment, before the return".
Refactoring Specialist vs writing the diff yourself — for splits under 200 lines, the agent is overkill. For splits over 1,000, hand-writing is overkill. Sweet spot: 200-1,500 line surgical extractions where you know the target shape.
Unused Code Cleaner timing — if you run it during Stage 2, it will delete code you're about to wire back up in Stage 3. Run last. Always.
Codemod registry coverage — for popular migrations (React, Mocha, Node), the registry is gold. For your internal API rename, you'll write your own ast-grep rule. Don't fight it.

Common pitfalls

Coverage-as-vanity-metric — the gate is 80% line coverage on the touched files, not on the repo overall. Only the blast radius matters to the refactor. Repo-wide coverage is a manager's number.
Letting an LLM rewrite a whole file — paste the file into Claude, get back "cleaner" code, ship it. This is the route to a 2am page. The pipeline above is the alternative: small, reviewed, gated diffs.
Forgetting to commit between stages — every stage produces a reviewable diff. Commit at every stage boundary. Bisecting later requires it.
Running Unused Code Cleaner on partial refactor — it will delete the helpers Stage 3 is about to call. Last step, every time.
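
The difference between touched-file coverage and repo-wide coverage is easy to make concrete. This is a sketch only: the FileCoverage shape, file paths, and line counts are hypothetical (the 18% figure mirrors the OrderService.ts scenario described earlier), and a real gate would read these numbers from your coverage tool's report.

```typescript
interface FileCoverage {
  coveredLines: number;
  totalLines: number;
}

// Line coverage, as a percentage, over just the listed files.
function lineCoverage(files: string[], report: Record<string, FileCoverage>): number {
  let covered = 0;
  let total = 0;
  for (const file of files) {
    const entry = report[file];
    if (entry === undefined) throw new Error(`no coverage data for ${file}`);
    covered += entry.coveredLines;
    total += entry.totalLines;
  }
  return total === 0 ? 0 : (covered / total) * 100;
}

// The gate looks only at the files the refactor touches.
function passesGate(
  touched: string[],
  report: Record<string, FileCoverage>,
  thresholdPct = 80
): boolean {
  return lineCoverage(touched, report) >= thresholdPct;
}

// Hypothetical report: one poorly covered target file, one well covered neighbor.
const coverageReport: Record<string, FileCoverage> = {
  "src/OrderService.ts": { coveredLines: 774, totalLines: 4300 },
  "src/billing.ts": { coveredLines: 9000, totalLines: 10000 },
};

const repoWidePct = lineCoverage(Object.keys(coverageReport), coverageReport);
const touchedPct = lineCoverage(["src/OrderService.ts"], coverageReport);

// Repo-wide is roughly 68%, the touched file roughly 18%: the gate fails.
console.log(repoWidePct.toFixed(1), touchedPct.toFixed(1), passesGate(["src/OrderService.ts"], coverageReport));
```

A respectable repo-wide number says nothing about the file you are about to cut open, which is why the gate ignores it.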

INSTALL · ONE COMMAND

$ tokrepo install pack/refactor-heavy-codebase

hand it to your agent — or paste it in your terminal

What's inside

10 assets in this pack

Skill#01

Tidy First — AI Code Refactoring Skill for Agents

Skill teaching AI agents Kent Beck's Tidy First methodology. Make small structural improvements before behavior changes to keep codebases clean and maintainable over time.

by Skill Factory·136 views

$ tokrepo install tidy-first-ai-code-refactoring-skill-agents-905bfdbf

Skill#02

Claude Code Agent: Technical Debt Manager

Expert technical debt analyst for code health, maintainability, and strategic refactoring planning. Use PROACTIVELY when codebase shows complexity growth, when planning...

by TokRepo Picks·25 views

$ tokrepo install claude-code-agent-technical-debt-manager-6285fca6

Skill#03

ast-grep — Structural Code Search and Rewrite Tool

A fast CLI tool for searching and transforming code using abstract syntax tree patterns instead of regex, supporting JavaScript, TypeScript, Python, Rust, Go, and more.

by AI Open Source·99 views

$ tokrepo install ast-grep-structural-code-search-rewrite-tool-0a991711

Script#04

Codemod — AI-Powered Code Migration CLI

Scaffold, share, and run large-scale code migrations with AI. First-class ast-grep support, multi-step YAML workflows, community codemod registry. Apache-2.0, 970+ stars.

by Script Depot·103 views

$ tokrepo install codemod-ai-powered-code-migration-cli-a414acda

Script#05

GritQL — Declarative Code Rewrite CLI

GritQL is a declarative language + CLI for searching and rewriting codebases with snippet-like patterns, making large refactors repeatable and reviewable.

by Script Depot·21 views

$ tokrepo install gritql-declarative-code-rewrite-cli

Skill#06

Claude Code Agent: Refactoring Specialist

Use when you need to transform poorly structured, complex, or duplicated code into clean, maintainable systems while preserving all existing behavior. Specifically: ...

by TokRepo Picks·24 views

$ tokrepo install claude-code-agent-refactoring-specialist-8ffed3e3

Skill#07

code-simplifier — Anthropic Official Cleanup Subagent

Anthropic's open-source post-task cleanup agent that Boris Cherny runs after every Claude Code session. Refactors for clarity without changing behavior.

by Skill Factory·196 views

$ tokrepo install code-simplifier-anthropic-official-cleanup-subagent-1304ff4c

Skill#08

Claude Code Agent: Unused Code Cleaner

Detects and removes unused code (imports, functions, classes) across multiple languages. Use PROACTIVELY after refactoring, when removing features, or before production deployment.

by TokRepo Picks·22 views

$ tokrepo install claude-code-agent-unused-code-cleaner-fb99be5d

Skill#09

Claude Code Agent: Legacy Modernizer

Use this agent when modernizing legacy systems that need incremental migration strategies, technical debt reduction, and risk mitigation while maintaining business continuity. Specifically, example context: a development team has a 15-year-old mono...

by TokRepo Picks·23 views

$ tokrepo install claude-code-agent-legacy-modernizer-dccb0175

MCP#10

code-review-graph — MCP Context for Smarter Reviews

code-review-graph builds a Tree-sitter code graph and exposes minimal review context via MCP; verified 16,364★ and claims ~8.2× token reduction on 6 repos.

by MCP Hub·68 views

$ tokrepo install code-review-graph-mcp-context-for-smarter-reviews

FAQ

Frequently asked questions

How big a refactor is this pipeline for?

Sweet spot is 500-5,000 lines of changed code across 5-50 files. Below that, manual editing with one AI agent is fine. Above that, split the refactor into multiple passes through this pipeline. Don't try to land a 50k-line diff in one go; no review process catches issues at that scale. The pipeline scales by repetition, not by enlargement.

Why ast-grep instead of just asking Claude to rewrite the file?

Because ast-grep is deterministic and won't introduce a behavior change unless you ask for one: the same rule on the same code produces the same diff every time. An LLM rewriting a file can silently change a === to ==, drop a try/catch, or rename a variable in the import but not the usage. ast-grep can't; it only does what its pattern says. Use the LLM for the 20% that needs judgment, not the 80% that's mechanical.

What if my codebase has near-zero test coverage?

Then Stage 1 is the whole project for the first sprint. Write characterization tests against the current behavior (whatever it is — bugs included). Once you have tests around the blast radius, the refactor itself is the easy part. Skipping this is the single most common cause of refactor disasters.

Do I need all ten tools or can I pick three?

Most refactors only touch six: Tidy First (mental model), Technical Debt Manager (scoping), ast-grep (mechanical), Refactoring Specialist (surgery), Unused Code Cleaner (cleanup), code-review-graph MCP (gate). The other four are for specific cases: GritQL for control-flow rewrites, Codemod for known migrations, code-simplifier for verbosity, Legacy Modernizer for typed migrations. Pick the stage tools, skip the optional ones.

How do I review a refactor diff that touches 80 files?

Don't review file-by-file — review by transformation. Group the diff: "these 40 files are the rename from getUserById to findUser, these 25 are the extract of PricingPolicy, these 15 are the dead-code removal". The code-review-graph MCP helps generate that grouping. If you're scrolling through 80 files in PR review, the diff wasn't gated through this pipeline — send it back.
