Skills · May 19, 2026 · 1 min read

Claude Code Agent: Model Evaluator

AI model evaluation and benchmarking specialist. Use when selecting the right model for a specific task, designing evaluation benchmarks from scratch, or running post-deployment regression testing.

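Agent Ready

This asset can be read and installed directly by an agent

TokRepo provides a universal CLI command, an install contract, metadata JSON, adapter-specific install plans, and raw content links, so an agent can judge fit, risk, and the next step.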

Stage only · 35/100
Agent entry point: Any MCP/CLI agent
Type: Agent
Install: Single
Trust level: Established
Entry: ai-specialists/model-evaluator
Universal CLI install command:
npx tokrepo install 580e7db0-bfaa-4879-ac81-b8b5e58394aa

What This Agent Is For

AI model evaluation and benchmarking specialist. Use when selecting the right model for a specific task, designing evaluation benchmarks from scratch, or running post-deployment regression testing. Specifically:

Context: A product team needs to choose between Claude Sonnet, GPT-4o, and Gemini 1.5 Pro for a customer support summarization pipeline with a $500/month budget
user: "We need to pick a model for our customer support summarization system. We process 50k tickets/month and need under 2s latency."
ass

Category: AI Specialists. Expected tool surface: Read, Write, Edit, Bash, Glob, Grep, WebSearch.
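
The scenario above implies a simple screening step: $500/month across 50k tickets works out to $0.01 per ticket, and any candidate must also come in under 2s latency. A minimal sketch of that arithmetic follows. The `Candidate` class, the placeholder model names, and every cost and latency figure are hypothetical values for illustration, not measurements of any real model.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record of one model's measured cost and latency."""
    name: str
    cost_per_ticket_usd: float  # assumed to come from your own test run
    p95_latency_s: float        # assumed to come from your own test run


def per_ticket_budget(monthly_budget_usd: float, tickets_per_month: int) -> float:
    """Spread the monthly budget evenly across the expected ticket volume."""
    return monthly_budget_usd / tickets_per_month


def meets_constraints(c: Candidate, budget_per_ticket: float, max_latency_s: float) -> bool:
    """A candidate passes only if it fits both the cost and the latency limit."""
    return c.cost_per_ticket_usd <= budget_per_ticket and c.p95_latency_s < max_latency_s


# Figures from the scenario: $500/month, 50k tickets/month, under 2s latency.
budget = per_ticket_budget(500, 50_000)

# Placeholder candidates with made-up numbers.
candidates = [
    Candidate("model-a", cost_per_ticket_usd=0.004, p95_latency_s=1.2),
    Candidate("model-b", cost_per_ticket_usd=0.015, p95_latency_s=0.9),
    Candidate("model-c", cost_per_ticket_usd=0.008, p95_latency_s=2.5),
]

shortlist = [c.name for c in candidates if meets_constraints(c, budget, 2.0)]
print(f"budget per ticket: ${budget:.4f}")
print("shortlist:", shortlist)
```

Screening on hard constraints first keeps the more expensive quality evaluation limited to models that could actually ship.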

Agent Activation Brief

Use this asset when a task needs a focused specialist for AI specialist work. Hand the agent a narrow objective, the relevant repository paths or inputs, and a concrete output contract. Ask it to cite changed files or evidence, avoid unrelated rewrites, and stop if required credentials, production access, or destructive actions are needed.
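
One way to keep a hand-off this disciplined is to assemble the brief from named fields rather than free text. The sketch below is an illustration only: `ActivationBrief`, its field names, and the example paths are hypothetical and are not part of TokRepo or the upstream agent definition.

```python
from dataclasses import dataclass, field


@dataclass
class ActivationBrief:
    """Hypothetical container for the pieces of a hand-off brief."""
    objective: str
    paths: list[str]
    output_contract: str
    stop_conditions: list[str] = field(default_factory=lambda: [
        "required credentials are missing",
        "production access is needed",
        "a destructive action is needed",
    ])

    def render(self) -> str:
        """Produce the plain-text brief to paste into the agent session."""
        lines = [
            f"Objective: {self.objective}",
            "Inputs: " + ", ".join(self.paths),
            f"Output contract: {self.output_contract}",
            "Cite changed files or evidence; avoid unrelated rewrites.",
            "Stop if: " + "; ".join(self.stop_conditions) + ".",
        ]
        return "\n".join(lines)


# Example values are invented for illustration.
brief = ActivationBrief(
    objective="Compare three candidate models for ticket summarization",
    paths=["evals/golden_set.jsonl", "evals/README.md"],
    output_contract="A markdown table with cost, latency, and quality per model",
)
print(brief.render())
```

Making the objective, inputs, and output contract required fields means a brief cannot be built without them, while the stop conditions default to the three cases named above.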

Operating Boundaries

  • Treat this as a specialist agent, not a general chat prompt.
  • Keep write scope explicit before using it in a coding session.
  • Run normal project tests or verification after accepting its output.
  • Do not pass secrets into the agent instructions; configure credentials through the host runtime instead.
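
The last boundary can be partly checked in code before a brief is handed over. The following is a rough sketch, not a real secret scanner: the two regular expressions, the helper names, and the `EXAMPLE_API_KEY` variable are assumptions made for illustration, and a real setup would rely on whatever credential mechanism the host runtime provides.

```python
import os
import re

# Illustrative patterns only; a dedicated scanner would cover far more cases.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{16,}"),
    re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[:=]\s*\S+"),
]


def contains_secret(text: str) -> bool:
    """Return True if the text matches any of the illustrative patterns."""
    return any(p.search(text) for p in SECRET_PATTERNS)


def get_credential(name: str, env=os.environ) -> str:
    """Read a credential from the host environment instead of the prompt."""
    value = env.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set in the host environment")
    return value


instructions = "Compare the candidate models on the golden set in evals/."
if contains_secret(instructions):
    raise ValueError("instructions appear to contain a secret; remove it first")

# The key is looked up at call time and never written into the instructions.
fake_env = {"EXAMPLE_API_KEY": "value-from-host"}  # stand-in for os.environ
api_key = get_credential("EXAMPLE_API_KEY", env=fake_env)
```

A pattern check like this catches only obvious mistakes, so it complements rather than replaces keeping credentials out of the instructions in the first place.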

Clean Source

Source and Acknowledgements

Created by the Claude Code Templates community and maintained in davila7/claude-code-templates. This TokRepo asset is a concise install and activation wrapper around the upstream MIT-licensed agent definition.
