# AgentEval — .NET Toolkit for Agent Evaluation

> AgentEval is a .NET evaluation toolkit for AI agents that validates tool usage, scores RAG quality, compares models, and exports regression-ready reports.

## Install

Add the NuGet package to your test project:

```bash
dotnet add package AgentEval --prerelease
```

## Quick Use

1. Install / run:

   ```bash
   dotnet add package AgentEval --prerelease
   ```

2. Start / smoke test:

   ```bash
   dotnet test
   ```

3. Verify:
   - Run the Getting Started guide; confirm at least one evaluation asserts tool usage and produces a report artifact in CI.

## Intro

AgentEval is a .NET evaluation toolkit for AI agents that validates tool usage, scores RAG quality, compares models, and exports regression-ready reports.

- **Best for:** .NET teams building tool-using agents who want evaluation code that lives next to unit tests
- **Works with:** .NET 8+ apps; integrates with agent frameworks and CI pipelines; ships as a NuGet package
- **Setup time:** 15 minutes

## Practical Notes

- Setup time is about 15 minutes (add the NuGet package and run one starter eval).
- Runs alongside tests: the fastest regression check is `dotnet test` with evaluation assertions enabled.
- GitHub stars and forks (verified): see Source & Thanks.

AgentEval is most useful when you treat tool usage as a contract. Instead of only judging final text, assert that:

- The agent called the expected tools (and did not call forbidden ones).
- The tool inputs are well-formed and minimally scoped.
- Retrieval answers are grounded (your RAG checks pass consistently).

Minimal sketches of these assertions appear in Example Sketches at the end of this page.

Because this repo is explicitly labeled as preview/experimental, pin versions in CI (for example, set an explicit `Version` on the `PackageReference` rather than floating to the latest prerelease) and keep an upgrade checklist (baseline scores plus golden traces) before bumping.

### FAQ

**Q: Is this production-ready?**
A: The repo warns it is preview/experimental. Use it in CI with pinned versions and your own validation before shipping.

**Q: Can I evaluate tool calls, not just text?**
A: Yes — tool usage validation is a first-class goal in the project description.

**Q: How do I start fast?**
A: Add the NuGet package, follow the Getting Started guide, and turn one high-risk workflow into an eval test.

## Source & Thanks

> Source: https://github.com/AgentEvalHQ/AgentEval
> License: MIT
> GitHub stars: 89 · forks: 8
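---

## Example Sketches

Below is a minimal sketch of the tool-usage contract described in Practical Notes. AgentEval's actual assertion API is not documented on this page, so the agent interface (`IAgent`), the trace types (`AgentRun`, `ToolCall`), the fake agent, and the tool names are illustrative placeholders rather than the library's real surface; only the xUnit calls are standard. The pattern is what matters: capture which tools ran, then assert on the trace like any other unit test.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Xunit;

// Hypothetical trace types; substitute your framework's equivalents.
public sealed record ToolCall(string Name, IReadOnlyDictionary<string, string> Arguments);
public sealed record AgentRun(string FinalAnswer, IReadOnlyList<ToolCall> ToolCalls);

public interface IAgent
{
    Task<AgentRun> RunAsync(string prompt);
}

// Stand-in agent so the sketch compiles; replace with your real agent under test.
public sealed class FakeRefundAgent : IAgent
{
    public Task<AgentRun> RunAsync(string prompt) =>
        Task.FromResult(new AgentRun(
            FinalAnswer: "Refund issued for order 1234.",
            ToolCalls: new[]
            {
                new ToolCall("lookup_order",
                    new Dictionary<string, string> { ["order_id"] = "1234" }),
            }));
}

public class ToolContractTests
{
    [Fact]
    public async Task RefundWorkflow_CallsExpectedTools_AndNothingForbidden()
    {
        IAgent agent = new FakeRefundAgent();
        AgentRun run = await agent.RunAsync("Refund order 1234");

        var toolNames = run.ToolCalls.Select(c => c.Name).ToList();

        // Contract 1: the expected tool was called.
        Assert.Contains("lookup_order", toolNames);

        // Contract 2: forbidden tools were never called.
        Assert.DoesNotContain("delete_order", toolNames);

        // Contract 3: tool inputs are well-formed and minimally scoped.
        var lookup = run.ToolCalls.Single(c => c.Name == "lookup_order");
        Assert.True(lookup.Arguments.ContainsKey("order_id"));
        Assert.Single(lookup.Arguments); // no extra, over-scoped parameters
    }
}
```

Because these assertions run under plain `dotnet test`, they join the same CI gate as your unit tests, which is exactly the workflow the Quick Use steps describe.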
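The grounding bullet can be sketched the same way. The `IsGrounded` helper below is a deliberately naive word-overlap placeholder, not AgentEval's scorer; a real groundedness check (embedding similarity, NLI, or an LLM judge) would replace it, but the test shape stays the same.

```csharp
using System;
using System.Linq;
using Xunit;

public static class Grounding
{
    // Naive placeholder: treat an answer as grounded if some retrieved
    // passage shares a long-enough word overlap with it. Swap in a real
    // groundedness scorer here; only the boolean contract matters.
    public static bool IsGrounded(string answer, string[] passages, int minSharedWords = 4)
    {
        var answerWords = answer
            .Split(' ', StringSplitOptions.RemoveEmptyEntries)
            .Select(w => w.Trim('.', ',').ToLowerInvariant())
            .ToHashSet();

        return passages.Any(p =>
            p.Split(' ', StringSplitOptions.RemoveEmptyEntries)
             .Select(w => w.Trim('.', ',').ToLowerInvariant())
             .Count(answerWords.Contains) >= minSharedWords);
    }
}

public class GroundingTests
{
    [Fact]
    public void Answer_IsSupported_ByRetrievedPassages()
    {
        string[] retrieved =
        {
            "Orders can be refunded within 30 days of delivery.",
            "Refunds are issued to the original payment method.",
        };
        string answer = "Your order can be refunded within 30 days of delivery.";

        Assert.True(Grounding.IsGrounded(answer, retrieved));
    }
}
```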