# Claude Code Agent: Model Evaluator

> AI model evaluation and benchmarking specialist. Use when selecting the right model for a specific task, designing evaluation benchmarks from scratch, or running post-deployment re

## Install

Save the content below to `.claude/skills/` or append to your `CLAUDE.md`:

# Claude Code Agent: Model Evaluator

## Quick Use

```bash
npx claude-code-templates@latest --agent ai-specialists/model-evaluator --yes
```

This installs the upstream agent definition into Claude Code. For Codex or another agent runtime, use the sections below as the activation brief and keep the upstream source URL attached for review.

## What This Agent Is For

AI model evaluation and benchmarking specialist. Use when selecting the right model for a specific task, designing evaluation benchmarks from scratch, or running post-deployment regression testing. Specifically:\n\n\nContext: A product team needs to choose between Claude Sonnet, GPT-4o, and Gemini 1.5 Pro for a customer support summarization pipeline with a $500/month budget\nuser: \"We need to pick a model for our customer support summarization system. We process 50k tickets/month and need under 2s latency.\"\nass

Category: AI Specialists. Expected tool surface: Read, Write, Edit, Bash, Glob, Grep, WebSearch.

## Agent Activation Brief

Use this asset when a task needs a focused specialist for ai specialists work. Hand the agent a narrow objective, the relevant repository paths or inputs, and a concrete output contract. Ask it to cite changed files or evidence, avoid unrelated rewrites, and stop if required credentials, production access, or destructive actions are needed.

## Operating Boundaries

- Treat this as a specialist agent, not a general chat prompt.
- Keep write scope explicit before using it in a coding session.
- Run normal project tests or verification after accepting its output.
- Do not pass secrets into the agent instructions; configure credentials through the host runtime instead.

## Clean Source

- Source repository: https://github.com/davila7/claude-code-templates
- Source file: https://github.com/davila7/claude-code-templates/blob/main/cli-tool/components/agents/ai-specialists/model-evaluator.md
- Source file SHA: `ebbd69988991f34e4bc8bb48cda4bdda8963cbef`
- Upstream body hash: `914473c5ddb50b8aaa272e7b89d598914b316f7be54088ca5eb68cc5eb08b8c1`
- License: MIT
- Repository stars at publication check: 27403

## Source & Thanks

Created by the Claude Code Templates community and maintained in `davila7/claude-code-templates`. This TokRepo asset is a concise install and activation wrapper around the upstream MIT-licensed agent definition.

---
Source: https://tokrepo.com/en/workflows/claude-code-agent-model-evaluator-580e7db0
Author: TokRepo精选