TOKREPO · AI Arsenal

Stable

AI Test Generation + E2E Pack

Ten picks for the dev who wants AI to write the missing unit tests, generate property tests from a specification, drive a real browser for E2E, and triage snapshots in CI. Test Engineer agent + Vitest + Jest + pytest + Hypothesis + MSW + Playwright + Playwright MCP + Playwright Tester agent + verify-app. In install order.

10 resources

About this pack

What's in this pack

This is the pack for the engineer who finally accepted that AI is faster than they are at writing the boring tests: the ones for the validator that takes 6 string permutations, the integration where the response body has 14 fields, the E2E that clicks through onboarding. You don't want to author those by hand anymore. You want a system: an agent that reads the file and proposes the test plan, a runner that doesn't make you wait, a property-based generator for the cases your brain misses, a mock so tests don't hit prod, a real browser an agent can drive, and a CI subagent that triages the red lights.

The ten picks below are the install order for that system. JavaScript/TypeScript is the spine (most of the modern web), Python is the second track (most of the AI/data side), and the bridge between them is the Test Engineer agent that picks the right runner per language. Every pick is open-source and lives on TokRepo so an AI coding agent can install it from inside a session.

Who this is for: a dev with a real codebase that has <40% coverage, who has tried to write tests during sprint time and given up, and who now wants Claude / Codex / Cursor to do it under supervision. By the end of step 10 you have unit + integration + E2E + snapshot tests running on every commit, with a subagent that reads failed runs and explains what broke.

Install in this order

1. Claude Code Agent: Test Engineer - Start here. This is the meta-agent that reads your codebase, picks the runners, drafts the test plan, and delegates the rest. Without it you'll install the other nine tools and never wire them together. Invoke it as @test-engineer and let it propose the strategy before you install any specific framework.
2. Vitest - The fast unit runner for anything Vite-flavored (Nuxt, modern React, Svelte, Solid). Native ESM, TypeScript out of the box, a Jest-compatible API, and an HMR-style watch mode that can rerun affected tests in ~50 ms. Install this first among the runners because it's the lowest-friction win: a Vitest suite is a one-file vitest.config.ts away.
3. Jest - The fallback. Pre-Vite codebases (Create React App, older Node, anything CommonJS) still ride Jest. It has the same expect API as Vitest, so tests written for one mostly port to the other. Install only if Vitest's Vite assumption doesn't fit; otherwise skip.
4. pytest - The Python side. Fixtures, parameterization, plugins, the works. AI agents love pytest because its assertion failure messages are unusually readable: when Claude reads a failed pytest run, it can usually see what to fix.
5. Hypothesis - Property-based testing for Python. Instead of writing 20 example inputs, you write the property ("reversing a list twice gives the original") and Hypothesis generates the inputs, then shrinks any failure to a minimal reproducer. This is where AI test generation goes from "plausible" to "actually finds bugs." Pairs with pytest natively.
6. MSW (Mock Service Worker) - Network-layer mocking. Intercepts fetch and XHR requests (in a Service Worker in the browser, through request interception in Node), so your integration tests don't hit production. The AI angle: when an agent writes a test that should hit /api/users, MSW gives it a deterministic response without secrets, rate limits, or flakes. This is the boundary between unit-style and E2E-style tests.
7. Playwright - The E2E framework. Cross-browser (Chromium, Firefox, WebKit), auto-waiting that removes most timing-related flakes, and video plus trace capture on failure for debugging. If you only install one E2E tool, install this. Its codegen recorder generates tests that an agent can then refine.
8. Playwright MCP - The MCP server that exposes Playwright to AI agents. With it, Claude Code / Cursor / Codex can navigate pages, fill forms, click buttons, and take snapshots while driving a real browser instead of guessing about the DOM. This is what makes "AI runs my E2E" actually work, rather than the agent hallucinating selectors.
9. Claude Code Agent: Playwright Tester - A specialist subagent that writes Playwright specs. Feed it a user flow ("signup → onboard → first project") and it produces a .spec.ts file with proper locators, auto-waits, and assertions. Without it, you're typing page.click(...) by hand; with it, you're code-reviewing tests an agent drafted.
10. verify-app (E2E Test Subagent for Claude Code) - The CI layer. After a Claude Code session that touched the codebase, verify-app runs the relevant E2E tests on the changed surface, triages failures, and reports back in plain English ("the signup test failed because the button's attribute changed from data-cta to data-test-id"). This closes the loop: tests run, tests fail, agent explains, you fix, agent reruns.
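
The Hypothesis entry above quotes the property "reversing a list twice gives the original". A minimal sketch of that property as a pytest-collectable test, assuming the `hypothesis` package is installed; the test names and the second property are illustrative and not part of the pack:

```python
from hypothesis import given, strategies as st


@given(st.lists(st.integers()))
def test_reversing_twice_gives_the_original(xs):
    # Property quoted in the pack description: reverse(reverse(xs)) == xs.
    assert list(reversed(list(reversed(xs)))) == xs


@given(st.lists(st.integers()))
def test_sorting_is_idempotent(xs):
    # Illustrative second property: sorting an already sorted list changes nothing.
    once = sorted(xs)
    assert sorted(once) == once
```

Running `pytest` on a file like this lets Hypothesis generate the lists; no example inputs are written by hand, and a failing property would be reported with a shrunk input.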

How they fit together

Test Engineer (#1)
   │
   └─ reads codebase, picks runners, drafts plan
         │
Unit layer:
   Vitest (#2)  ←  primary (Vite-flavored)
   Jest (#3)    ←  fallback (legacy / non-Vite)
   pytest (#4)  ←  Python side
         │
Property layer:
   Hypothesis (#5) generates inputs from spec
         │
Integration boundary:
   MSW (#6) mocks the network so tests stay hermetic
         │
E2E layer:
   Playwright (#7) — framework
   Playwright MCP (#8) — agent drives the real browser
   Playwright Tester agent (#9) — writes the specs
         │
CI layer:
   verify-app subagent (#10) — runs E2E on diff, triages failures

The Test Engineer + Hypothesis + verify-app trio is the agentic backbone. Take those three away and the rest is just a normal test stack. Keep them and the loop closes: agent plans, generators surface edge cases, runner reports, subagent triages, you make decisions instead of typing assertions.

Tradeoffs you'll hit

Vitest vs Jest - Vitest is faster, with native ESM and no transform config, but it assumes Vite. Jest is older and slower, but runs in nearly any JavaScript environment. Rule of thumb: new project = Vitest; codebase you inherited = whatever's already there.
Hypothesis vs example-based tests - Property tests catch bugs your examples never would, but they're harder to write and read. Use Hypothesis for pure functions (parsers, validators, math) and stick with examples for I/O-heavy code, where the only property you can state is often "it doesn't crash," which isn't useful.
MSW vs real test server - MSW is faster and deterministic, but it can lie: your code passes against a mock that doesn't match prod schemas. Combat this by generating MSW handlers from your OpenAPI spec, so the mock and prod can't drift silently.
Playwright vs Playwright MCP - The framework runs your scripted tests. The MCP server lets an agent improvise. Use both, not either: scripted Playwright runs in CI for regression; Playwright MCP is for ad-hoc requests like "agent, click through the new onboarding and tell me what's broken."
Test Engineer agent vs writing your own plan - Expect the agent to be right most of the time (roughly 80%) on strategy and often wrong about your specific business invariants. Treat its plan as a draft; edit before executing.
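
The MSW tradeoff above comes down to keeping mock payloads in step with the API schema. MSW itself is a JavaScript library, so the following is only a Python sketch of the drift check, not MSW usage. The schema fragment, the mock payload, and the `schema_drift` helper are all hypothetical stand-ins for a real OpenAPI document and a generated handler:

```python
# Hypothetical schema fragment for one /api/users item: field name -> expected type.
USER_SCHEMA = {"id": int, "email": str, "active": bool}

# Hypothetical canned payload that a mock handler would return.
MOCK_USER = {"id": 1, "email": "ada@example.com", "active": True}


def schema_drift(payload: dict, schema: dict) -> list[str]:
    """Return human-readable differences between a mock payload and the schema."""
    problems = []
    for field, expected_type in schema.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif type(payload[field]) is not expected_type:
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    for field in payload:
        if field not in schema:
            problems.append(f"unexpected field: {field}")
    return problems


def test_mock_matches_schema():
    # Fails as soon as the schema changes and the mock is not regenerated.
    assert schema_drift(MOCK_USER, USER_SCHEMA) == []
```

A check like this runs in the same suite as the tests that consume the mock, so a stale payload turns the pipeline red instead of passing silently.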

Common pitfalls

Installing all 10 at once - Don't. Pick your stack and install the core steps first (1-2-6-7-10 for JS/TypeScript, 1-4-5-7-10 for Python), ship a green pipeline, then add the rest. A pack is a menu, not a mandate.
Letting the agent write 200 tests on day one - Quality matters more than count. Have the Test Engineer agent draft 10 critical-path tests, code-review each, then expand. 200 mediocre tests are technical debt with a friendly wrapper.
MSW handlers that never refresh - Treat MSW handlers like type definitions: regenerate them when the API changes. Stale mocks are how "all tests green" still ships broken code.
Playwright MCP in CI - Don't. MCP is for interactive sessions where an agent explores. CI should run scripted Playwright specs (faster, reproducible, no LLM cost per run). Use Playwright Tester (#9) to write the spec; let CI run it deterministically.
Skipping verify-app because "my CI already runs tests" - Your CI runs tests. verify-app explains failures. The first time a Claude Code session breaks a test and verify-app tells you which selector changed, you'll see the difference.
Hypothesis on impure code - Property-testing a function that touches the database is a recipe for flakes. Refactor the pure logic out first, property-test that, and leave the I/O for example-based pytest cases.
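
The last pitfall, pulling the pure logic out before property-testing it, can be sketched as follows. All names here (`normalize_email`, `save_user`, `FakeStore`) are hypothetical. The property check uses a stdlib `random` loop as a stand-in so the sketch has no dependencies; with Hypothesis the loop would become a `@given(st.text())` test with shrinking.

```python
import random
import string


def normalize_email(raw: str) -> str:
    """Pure logic: no database, no network, safe to hammer with generated inputs."""
    return raw.strip().lower()


class FakeStore:
    """In-memory stand-in for a database table (hypothetical)."""

    def __init__(self):
        self.rows = []

    def insert(self, email: str) -> None:
        self.rows.append(email)


def save_user(store, raw_email: str) -> None:
    """Impure wrapper: the only part that touches storage."""
    store.insert(normalize_email(raw_email))


def test_normalize_is_idempotent():
    # Property-style check: normalizing twice equals normalizing once.
    rng = random.Random(0)
    alphabet = string.ascii_letters + string.digits + " @.\t"
    for _ in range(200):
        raw = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 20)))
        once = normalize_email(raw)
        assert normalize_email(once) == once


def test_save_user_stores_normalized_email():
    # One example-based case covers the I/O path.
    store = FakeStore()
    save_user(store, "  Ada@Example.COM ")
    assert store.rows == ["ada@example.com"]
```

The split keeps generated inputs away from storage: only `normalize_email` sees hundreds of cases, while `save_user` gets a single deterministic example.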

INSTALL · ONE COMMAND

$ tokrepo install pack/ai-test-generation-e2e

pass it to your agent, or paste it into your terminal

What's included

10 resources ready to install

Skill#01

Claude Code Agent: Test Engineer

Test automation and quality assurance specialist. Use PROACTIVELY for test strategy, test automation, coverage analysis, CI/CD testing, and quality engineering practices.

by TokRepo精选·86 views

$ tokrepo install claude-code-agent-test-engineer-f3c765fa

Skill#02

Vitest — Next Generation Testing Framework Powered by Vite

Vitest is a blazing-fast unit testing framework powered by Vite, with native ESM, TypeScript, and JSX support. Jest-compatible API, instant HMR for tests, and in-source testing make it the go-to test runner for Vite projects.

by AI Open Source·361 views

$ tokrepo install vitest-next-generation-testing-framework-powered-vite-267275ed

Skill#03

Jest — Delightful JavaScript Testing Framework

Jest is a delightful JavaScript testing framework with a focus on simplicity. Zero-config for most JS/TS projects, snapshot testing, mocking, code coverage, and parallel test execution. Created by Facebook and used to test React, Instagram, and many large codebases.

by AI Open Source·323 views

$ tokrepo install jest-delightful-javascript-testing-framework-4240433c

Skill#04

pytest — The Python Testing Framework That Scales

pytest makes it easy to write small tests, yet scales to support complex functional testing. Fixtures, parameterization, plugins, markers, and a rich assertion introspection system. The de facto testing standard for the Python ecosystem.

by AI Open Source·289 views

$ tokrepo install pytest-python-testing-framework-scales-42405aa1

Skill#05

Hypothesis — Property-Based Testing for Python

Hypothesis is a Python testing library that generates test cases automatically based on specifications you define, finding edge cases that hand-written tests miss.

by AI Open Source·303 views

$ tokrepo install hypothesis-property-based-testing-python-a0e81556

Skill#06

MSW — API Mocking of the Next Generation

Mock Service Worker intercepts network requests at the service worker layer, letting you mock REST and GraphQL APIs for tests and development without stubbing fetch. The same mocks work in Node, jsdom, browsers, and React Native.

by Script Depot·303 views

$ tokrepo install msw-api-mocking-next-generation-dc65506c

Skill#07

Playwright — Cross-Browser End-to-End Testing Framework

Reliable end-to-end testing for modern web apps across Chromium, Firefox, and WebKit with a single API.

by Microsoft AI·308 views

$ tokrepo install playwright-cross-browser-end-end-testing-framework-af656d8e

MCP#08

Playwright MCP — Browser Automation for Agents

Playwright MCP exposes browser automation over MCP, with device emulation. The listing reports 5,510★ (verified), 143 documented device profiles, and setup via `playwright install`.

by MCP Hub·274 views

$ tokrepo install playwright-mcp-browser-automation-for-agents

Skill#09

Claude Code Agent: Playwright Tester

Testing mode for Playwright tests

by TokRepo精选·111 views

$ tokrepo install claude-code-agent-playwright-tester-cf9884c5

Skill#10

verify-app — E2E Test Subagent for Claude Code

Open-source Claude Code subagent that runs end-to-end tests on recent changes and triages failures. Inspired by Boris Cherny's verify-app setup.

by Skill Factory·352 views

$ tokrepo install verify-app-e2e-test-subagent-for-claude-code-203ea157

Frequently asked questions

Do I really need both Vitest AND Jest in the same project?

No. Pick one. The pack lists both because different projects ride different stacks — Vite-flavored codebases install Vitest, legacy CRA / Node CommonJS codebases stay on Jest. If you're starting fresh, pick Vitest and skip step 3 entirely. The Test Engineer agent (#1) can inspect your repo and tell you which one applies in 30 seconds.

What's the difference between Playwright (#7), Playwright MCP (#8), and the Playwright Tester agent (#9)?

Playwright is the framework — runs .spec.ts files in CI. Playwright MCP is a server that lets a coding agent drive a real browser interactively (great for exploration, terrible for CI cost). Playwright Tester agent is a specialist that writes the .spec.ts files for you. The pipeline: Tester agent writes specs → CI runs Playwright on them → MCP is for ad-hoc when something weird needs a real browser session right now.

Why include Hypothesis when AI can generate examples directly?

AI generates examples it can imagine. Hypothesis generates examples from the input strategies attached to the property: strings with Unicode edge cases, integers near boundaries, empty and duplicate-heavy lists, the cases a human (or an LLM) wouldn't think to try. The two pair well: have Claude propose the property statement, let Hypothesis hunt for counter-examples, ship the test.
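
A sketch of that division of labor, assuming the `hypothesis` package is installed. The property statement, which is the part an agent might propose, is the two assertions; Hypothesis supplies the integers, including boundary values. The `clamp` function is a hypothetical example and not something from the pack:

```python
from hypothesis import given, strategies as st


def clamp(value: int, low: int, high: int) -> int:
    """Hypothetical function under test: restrict value to the range [low, high]."""
    return max(low, min(value, high))


@given(st.integers(), st.integers(), st.integers())
def test_clamp_stays_in_range_and_is_idempotent(value, a, b):
    low, high = min(a, b), max(a, b)  # build a valid range from any two integers
    result = clamp(value, low, high)
    assert low <= result <= high
    assert clamp(result, low, high) == result
```

Building the range from two arbitrary integers avoids filtering out generated inputs, so every generated case exercises the property.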

Which three would you install if I only have an afternoon?

Test Engineer agent (#1), Playwright (#7), and verify-app (#10). The agent plans, Playwright runs the highest-leverage tests (E2E catches bugs unit tests never will), and verify-app explains the failures. Add Vitest (#2) or pytest (#4) on day two depending on language. The middle picks (MSW, Hypothesis, Playwright MCP) are upgrades that pay off in week two.

Does this pack assume Claude Code specifically?

The three agent picks (#1 Test Engineer, #9 Playwright Tester, #10 verify-app) are Claude Code-native. Everything else (Vitest, Jest, pytest, Hypothesis, MSW, Playwright, Playwright MCP) is agent-agnostic: it works under Cursor, Codex CLI, Cline, Roo, and plain CLI runs. If you're not on Claude Code, swap the three agent picks for the equivalent in your toolchain and the rest of the install order still holds.
