TOKREPO · ARSENAL

Stable

AI HR + Recruiting Stack

Ten picks for the recruiter or HR lead putting AI into the funnel: source candidates, parse and screen resumes, prep interview questions, capture and summarise calls, draft offer letters, and onboard, with a bias-audit pass before any recommendation reaches a human. ATS connectors via MCP, not chatbots.

10 assets

About this pack

What's in this pack

This is the stack a recruiter or HR lead would actually wire up to handle a hiring round end-to-end, not a 50-vendor demo day. Every pick here does one job in the funnel a real opening goes through: find the candidates, parse the resumes, screen them against the JD, prep the interview, capture the call, draft the offer, onboard the new hire. And one tool gates the result: the bias-audit pass that runs before a human gets a recommendation.

The stack is agent-driven on purpose. The recruiter spends the morning wiring it up; from then on the agents do the grunt work — boolean searches, resume reformatting, screening summaries, transcript notes — and the recruiter only steps in where judgment is required (the call itself, the negotiation, the close). Critically: no auto-reject. Every screening step produces a ranked list with reasons, never a hidden filter that drops a candidate before a human sees them.

Install in this order

Tavily Search — the search engine your sourcing agent calls. Pull "senior React engineers who blogged about Server Components in 2026," "compensation benchmarks for staff PM in Berlin," or "why Acme just laid off their growth team." Free tier covers 1,000 queries/month — enough for a small team.
Apify MCP Server — 8,000+ pre-built scrapers (LinkedIn-style profiles, job boards, GitHub, Stack Overflow Careers) exposed as MCP tools. Use this once Tavily's text snippets aren't enough and you need structured candidate rows you can dedupe and rank.
Jina Reader — https://r.jina.ai/<url> returns clean markdown of any page. The unglamorous workhorse: paste a candidate's personal site, a competitor's careers page, or a 40-page benefits PDF, get back text the LLM can actually reason over.
Reactive Resume — open-source resume builder that doubles as a normaliser. Candidates who build their resume in Reactive hand you a consistent JSON export before any LLM screen; anything that arrives as a raw file goes to Docling instead. Same fields every time = comparable screening signals.
Docling — IBM's open-source document parser for PDFs, DOCX, scanned resumes. Handles the messy real-world cases Reactive can't (1990s scanned CVs, two-column EU formats, image-only PDFs). Output is structured markdown your screening agent can chew on.
Phoenix Evals — LLM-as-judge library with built-in templates. This is where the actual screening happens: define your scorecard (years of relevant experience, domain match, communication clarity), Phoenix runs the same prompt against every candidate, returns numeric rubric scores with rationale. Auditable, reproducible.
Anarlog — open-source local AI meeting notes. Records and transcribes screening calls on the recruiter's machine — candidate audio never leaves the laptop. Output is a summary + action items you can drop into the ATS without uploading to a third-party SaaS.
Faster Whisper — up to 4x faster than OpenAI Whisper, runs locally. The transcription engine Anarlog and your batch interview pipeline use under the hood. Switch to this when you have 20 phone screens a week and need turnaround in minutes, not hours.
Prompt Perfect — system prompt engineering templates. Use it to keep your offer-letter prompt, your rejection prompt, your reference-check prompt under version control. "Generic friendly tone, no comp numbers, mention next step" should produce the same letter on Monday as on Friday.
Claude Code Agent: AI Ethics Advisor — the gate. Before any shortlist goes to a hiring manager, the Ethics Advisor reviews the screening rubric and the resulting ranking for protected-class proxies (zip code, school name, graduation year, photo). Flags get sent back to the recruiter, never auto-applied. This is the only step that's allowed to block a downstream action.
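For the Jina Reader step, the prefix pattern described in the list can be sketched in a few lines of standard-library Python. The helper names `reader_url` and `fetch_markdown` are illustrative and not part of any Jina SDK; the sketch assumes the public `https://r.jina.ai/` prefix accepts the target URL appended as-is, and it ignores rate limits and API keys.

```python
from urllib.request import Request, urlopen

READER_PREFIX = "https://r.jina.ai/"  # prefix quoted in the pack description


def reader_url(target: str) -> str:
    """Build the Reader URL by prepending the prefix to the target URL."""
    target = target.strip()
    if not target.startswith(("http://", "https://")):
        raise ValueError(f"expected an absolute http(s) URL, got {target!r}")
    return READER_PREFIX + target


def fetch_markdown(target: str, timeout: float = 30.0) -> str:
    """Fetch the page through the Reader and return the response body as text.

    Makes a network call; rate limits and API keys are not handled here.
    """
    request = Request(reader_url(target), headers={"User-Agent": "hr-stack-sketch"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

A sourcing agent would call `fetch_markdown` on each candidate site or careers page and pass the returned text to the screening step.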

How they fit together

            ┌─ Tavily ─── Apify MCP ───┐
            │ (search)   (scrape)       │   SOURCE
            └─────────┬─────────────────┘
                      ▼
              Jina Reader (URL → text)
                      │
                      ▼
         Reactive Resume ── Docling           SCREEN
         (JSON schema)    (messy PDFs)
                      │
                      ▼
              Phoenix Evals
           (LLM rubric, scored)
                      │
                      ▼ ────────────────────────┐
              AI Ethics Advisor                 │
             (bias audit, gate)                 │
                      │                         │
                      ▼                         │
         Anarlog + Faster Whisper       INTERVIEW
        (record + transcribe call)              │
                      │                         │
                      ▼                         │
              Prompt Perfect                OFFER
         (offer letter, rejection,        + ONBOARD
          reference check templates)

The non-obvious join is Phoenix Evals → Ethics Advisor: Phoenix gives you a defensible, repeatable scorecard; the Ethics Advisor inspects that scorecard for proxy variables before the ranking is shown to anyone. Without the gate, an LLM-as-judge pipeline can silently re-encode every bias in the training data. With it, you have a paper trail.
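A rough sense of what the gate looks for, as a minimal Python sketch. The proxy list mirrors the examples this pack names (zip code, school name, graduation year, photo); the function name `find_proxy_fields`, the rubric shape, and the keyword matching are assumptions for illustration. The actual Ethics Advisor is an LLM agent, not a keyword filter, so treat this as a cheap pre-check rather than a substitute.

```python
# Proxy variables named in this pack, mapped to search keywords (illustrative).
PROXY_KEYWORDS = {
    "zip code": ("zip", "postal", "postcode"),
    "school name": ("school", "university", "college", "alma mater"),
    "graduation year": ("graduation", "grad year", "class of"),
    "photo": ("photo", "headshot", "picture"),
}


def find_proxy_fields(rubric: dict[str, str]) -> list[tuple[str, str]]:
    """Return (criterion, proxy) pairs where a criterion mentions a known proxy.

    `rubric` maps a criterion name to its written description.
    Nothing is removed or rescored; the result is a list of flags for a human.
    """
    flags = []
    for criterion, description in rubric.items():
        text = f"{criterion} {description}".lower()
        for proxy, keywords in PROXY_KEYWORDS.items():
            if any(keyword in text for keyword in keywords):
                flags.append((criterion, proxy))
    return flags


# Hypothetical rubric with two criteria that lean on proxies.
rubric = {
    "domain_match": "Direct experience with payments infrastructure",
    "pedigree": "Attended a top-tier university",
    "recency": "Graduation year within the last five years",
}
flags = find_proxy_fields(rubric)
for criterion, proxy in flags:
    print(f"FLAG for recruiter review: '{criterion}' may encode {proxy}")
```

The function only reports; deciding whether a flagged criterion stays in the rubric remains the recruiter's call.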

Tradeoffs you'll hit

Reactive Resume vs Docling — Reactive is opt-in (candidate uses the builder); Docling is mandatory (you parse whatever comes in). Run both: Reactive for clean schema when the candidate cooperates, Docling for the 40% of inbound that arrives as a scanned PDF from 2014.
Anarlog (local) vs cloud meeting bots — Anarlog keeps the audio on the recruiter's laptop. Cloud bots (Fireflies, Otter) are faster to set up but store candidate audio with a US-based vendor that may or may not be GDPR-cleared for your region. For EU candidates specifically, default to local.
Phoenix Evals vs hand-graded screens — Phoenix is reproducible and fast; a recruiter reading resumes is irreplaceable for the signal a rubric can't capture. The right mix is Phoenix for the first pass (rank 200, surface the top 30 with reasons), human for the second (30 → 8). The rest of the ranked list stays visible to the recruiter, in keeping with the no-auto-reject rule.
Auto-applying Ethics Advisor flags — don't. The Ethics Advisor is a reviewer, not an enforcer. Auto-rejecting candidates because the model flagged a proxy is exactly the failure mode you're trying to avoid. Flags go to the recruiter; the recruiter decides.
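The no-auto-reject rule and a written rubric can be sketched together. This assumes the 3/2/1/0 domain-experience scale this pack describes under common pitfalls; the `Candidate` type, field names, and sample data are hypothetical, and in the real stack Phoenix Evals would produce the level and rationale rather than a hand-filled list.

```python
from dataclasses import dataclass

# Written rubric: point values for the domain-experience criterion.
DOMAIN_POINTS = {"direct": 3, "adjacent": 2, "transferable": 1, "unrelated": 0}


@dataclass
class Candidate:
    name: str
    domain_level: str  # one of the DOMAIN_POINTS keys
    rationale: str     # the judge's stated reason, kept for the audit trail


def rank(candidates: list[Candidate]) -> list[tuple[int, Candidate]]:
    """Return every candidate with a score, highest first. Nobody is dropped."""
    scored = [(DOMAIN_POINTS[c.domain_level], c) for c in candidates]
    # sorted() is stable, so ties keep their original submission order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Invented sample pool for illustration.
pool = [
    Candidate("A", "transferable", "Built internal tooling, no payments work"),
    Candidate("B", "direct", "Four years on a card-processing team"),
    Candidate("C", "unrelated", "Background is in retail operations"),
    Candidate("D", "adjacent", "Worked on billing at a SaaS company"),
]
ranking = rank(pool)
for score, candidate in ranking:
    print(f"{score}  {candidate.name}  {candidate.rationale}")
```

The output has the same length as the input: the ranking orders the pool and attaches reasons, and any cut-off is applied by a person reading it.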

Common pitfalls

Letting the screening rubric live in someone's head — Phoenix wants a written rubric. If you can't articulate "3 points for direct domain experience, 2 for adjacent, 1 for transferable, 0 for unrelated," the screen isn't reproducible and the bias audit can't catch anything. Write the rubric before you wire up the agent.
Sending candidate PII through a third-party LLM — hosted Claude/OpenAI endpoints may retain prompts, depending on your plan and data-retention settings. For resume content that includes name, email, address, school, default to a local model (Ollama + a small open-weight model such as Llama 3.1 8B) for the screening step, and only send the rubric score upstream. Reserve cloud calls for the offer letter, not the screen.
ATS "AI integration" claims — most ATS vendors are reselling GPT calls with a UI. The point of this pack is that you own the prompt, the rubric, and the audit trail — not that you outsource them to a vendor's locked surface. Use the ATS's MCP / webhook layer; skip the bundled "AI screening."
No human-in-the-loop on rejections — even with a perfect rubric, sending automated rejection emails without review is the single fastest way to a discrimination complaint. Every rejection passes a human reviewer before it leaves the building.
Forgetting to delete candidate data on schedule — most jurisdictions cap how long you can hold an applicant's data. Wire a cron into your pipeline that purges resumes + transcripts at the retention cliff. Don't rely on "we'll do it manually."
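A minimal sketch of the purge job, assuming candidate files live under one directory and that file modification time is an acceptable stand-in for the date the data was received. The function name `purge_expired`, the directory layout, and the 180-day figure in the usage note are placeholders; the real retention period comes from your counsel, not from this example.

```python
import time
from pathlib import Path


def purge_expired(root: Path, max_age_days: int, dry_run: bool = True) -> list[Path]:
    """Delete files under `root` older than `max_age_days`; return what matched.

    With dry_run=True (the default) nothing is deleted, so the list can be
    reviewed before the job is trusted to run unattended.
    """
    cutoff = time.time() - max_age_days * 86_400  # seconds per day
    expired = []
    # Materialise the listing first so deletion does not disturb the walk.
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            expired.append(path)
            if not dry_run:
                path.unlink()
    return expired
```

Scheduled from cron, a call such as `purge_expired(Path("/srv/candidates"), 180, dry_run=False)` would run nightly; the path is hypothetical. Recording how many files each run removed gives you evidence that the schedule was actually followed.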

INSTALL · ONE COMMAND

$ tokrepo install pack/ai-hr-recruiting-stack

hand it to your agent — or paste it in your terminal

What's inside

10 assets in this pack

Agent#01

Tavily Search — Search API Built for AI Agents

Tavily Search returns LLM-ready answers from the web — not link lists. One call gets snippets, citations, optional generated answer. Free tier 1K/mo.

by Tavily·309 views

$ tokrepo install tavily-search-search-api-built-for-ai-agents

MCP#02

Apify MCP Server — 8,000+ Web Scrapers for Agents

Apify MCP Server connects agents to Apify Actors via a hosted endpoint (mcp.apify.com) or local run, turning thousands of web scrapers into callable tools.

by MCP Hub·289 views

$ tokrepo install apify-mcp-server-8-000-web-scrapers-for-agents

Skill#03

Jina Reader — Convert Any URL to LLM-Ready Text

Convert any URL to clean, LLM-friendly markdown with a simple prefix. Just prepend r.jina.ai/ to any URL. Handles JS-rendered pages, PDFs, and images. 10K+ stars.

by Script Depot·7368 views

$ tokrepo install jina-reader-convert-any-url-llm-ready-text-a9cbbc61

Skill#04

Reactive Resume — AI-Powered Open-Source Resume Builder

Free open-source resume builder with AI integration. Supports Claude, GPT, Gemini for content generation. Drag-and-drop, PDF export, self-hostable, privacy-first. MIT, 36,000+ stars.

by AI Open Source·494 views

$ tokrepo install reactive-resume-ai-powered-open-source-resume-builder-0d39058c

Script#05

Docling — Document Parsing for AI

IBM document parsing library. Converts PDFs, DOCX, PPTX, images, and HTML into structured markdown or JSON. Built for RAG pipelines and LLM ingestion.

by Script Depot·316 views

$ tokrepo install docling-document-parsing-ai-443e86c2

Skill#06

Phoenix Evals — LLM-as-Judge Library with Built-in Templates

Phoenix Evals runs LLM-as-judge on traces or datasets. Pre-built templates: hallucination, relevance, toxicity, QA. Outputs scored DataFrames.

by Arize AI·249 views

$ tokrepo install phoenix-evals-llm-as-judge-library-with-built-in-templates

Skill#07

Anarlog — Open-Source AI Meeting Notes That Stay on Your Machine

A privacy-first, local-first meeting note application built with Rust and Tauri that transcribes, summarizes, and organizes your meetings without sending data to the cloud.

by AI Open Source·285 views

$ tokrepo install anarlog-open-source-ai-meeting-notes-stay-your-machine-0e1eab94

Skill#08

Faster Whisper — 4x Faster Speech-to-Text

Faster Whisper is a reimplementation of OpenAI Whisper using CTranslate2, up to 4x faster with less memory. 21.8K+ GitHub stars. GPU/CPU, 8-bit quantization, word timestamps, VAD. MIT licensed.

by Script Depot·375 views

$ tokrepo install faster-whisper-4x-faster-speech-text-24576b2c

Prompt#09

Prompt Perfect — System Prompt Engineering Templates

Battle-tested system prompt templates for building LLM personas, agents, and workflows. Structured formats for role definition, constraints, and output control. 4,000+ GitHub stars.

by Prompt Lab·290 views

$ tokrepo install prompt-perfect-system-prompt-engineering-templates-11680977

Skill#10

Claude Code Agent: AI Ethics Advisor

AI ethics and responsible AI development specialist. Use when reviewing an AI system for bias, fairness violations, or regulatory compliance gaps; when generating a model card,...

by TokRepo Picks·139 views

$ tokrepo install claude-code-agent-ai-ethics-advisor-9b79e28a

Frequently asked questions

Is this stack legal to use for hiring decisions in the EU, NYC, or California?

The stack itself is neutral; what makes it compliant or not is how you use it. The EU AI Act treats hiring algorithms as high-risk: you owe documentation, human oversight, and the ability to explain individual decisions. NYC Local Law 144 requires an annual bias audit and candidate notification when an automated employment decision tool is used. California's rules on automated decision systems in employment point in the same direction. The Ethics Advisor + Phoenix Evals combination produces the audit trail those laws want, but only if you run them and keep the logs. Talk to your employment counsel before going live.

How does this connect to my actual ATS (Greenhouse, Lever, Workable)?

Through MCP or webhooks, not through a chatbot. Most modern ATS platforms expose a REST API with candidate / application / interview endpoints; you wrap that as an MCP server (or use a community one) and your agents call it the same way they call Tavily or Apify. Avoid the temptation to install the ATS vendor's bundled "AI assistant": you lose control of the prompt and the data trail. Keep the ATS as the system of record and the agents as a layer that reads from it, ranks, and writes back structured notes.
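What a "structured note" written back to the ATS might look like, as a hedged Python sketch. The `ScreeningNote` fields and the `to_payload` helper are hypothetical; real ATS APIs (Greenhouse, Lever, Workable) each define their own note and custom-field formats, so the field names here would need mapping to whatever your vendor expects.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ScreeningNote:
    """Structured note an agent writes back to the ATS (hypothetical shape)."""
    candidate_id: str
    rubric_version: str
    score: int
    rationale: str
    flags: list[str] = field(default_factory=list)
    reviewed_by_human: bool = False  # flipped by the recruiter, never the agent


def to_payload(note: ScreeningNote) -> str:
    """Serialise the note as JSON for a webhook or MCP tool call."""
    return json.dumps(asdict(note), sort_keys=True)


# Invented example values.
note = ScreeningNote("cand-123", "backend-v2", 3, "Four years on a card-processing team")
payload = to_payload(note)
print(payload)
```

Carrying the rubric version and rationale in every note is what lets the ATS remain the system of record for the audit trail.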

What does the bias audit actually check?

The AI Ethics Advisor inspects the screening rubric for variables that correlate with protected classes without measuring the actual job requirement. Common proxies are zip code (race), school prestige (class), graduation year (age), employment gaps (caregiving), photo presence (everything). It also compares the resulting ranking against the input pool to flag disparate impact: if 40% of applicants are women but only 10% of the top 20 are, that is a flag, and a reason for a human to re-read the rubric. It does not, and should not, make the decision itself.
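The disparate-impact flag can be made concrete with selection rates. The sketch below uses the impact-ratio idea behind the four-fifths rule of thumb (each group's selection rate divided by the highest group's rate, flagged below 0.8). Whether the Ethics Advisor applies this exact test is an assumption, and the pool of 200 applicants is invented to match the 40% / 10% example above.

```python
def impact_ratios(applicants: dict[str, int], selected: dict[str, int]) -> dict[str, float]:
    """Selection rate of each group divided by the highest group's rate."""
    rates = {group: selected.get(group, 0) / n for group, n in applicants.items() if n > 0}
    best = max(rates.values())
    if best == 0:
        # Nobody was selected, so there is no disparity to measure.
        return {group: 1.0 for group in rates}
    return {group: rate / best for group, rate in rates.items()}


# 200 applicants, 40% women; top-20 shortlist, 10% women.
applicants = {"women": 80, "men": 120}
shortlist = {"women": 2, "men": 18}

ratios = impact_ratios(applicants, shortlist)
flagged = [group for group, ratio in ratios.items() if ratio < 0.8]
print(ratios)
print(flagged)
```

Here women are selected at 2/80 and men at 18/120, a ratio of about 0.17, well under the 0.8 threshold. The flag goes to the recruiter for a look at the rubric; it does not change the ranking.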

What's the smallest possible version of this pack I can run this week?

Three picks: Docling (parse whatever PDF lands in your inbox), Phoenix Evals (one written rubric, one LLM call per resume), and AI Ethics Advisor (review the rubric and the output ranking before anyone sees it). That's a defensible AI-assisted screen in roughly a day of setup. Add Anarlog + Faster Whisper the week you have more than 10 phone screens. Add Tavily + Apify only when sourcing volume justifies it — most small recruiting teams don't need a sourcing agent.

How much does this whole stack cost to run for a recruiting team?

Realistic baseline: $30-100/month for a small in-house team. Tavily free tier covers 1K queries; Apify pay-as-you-go runs $5-30/mo for typical scraping volume; Jina Reader has a generous free tier; Anarlog, Faster Whisper, Reactive Resume, and Docling are self-hosted and free; Phoenix Evals is open source but the LLM calls it issues are billed to your Claude/OpenAI account (budget $20-50/mo at a few hundred resumes/week). The Ethics Advisor is a Claude Code subagent — included if you already use Claude Code. The hidden cost is the time to write and version your rubric and prompt templates; budget half a day per role family.
