May 11, 2026 · 2 min read

AgentEval — .NET Toolkit for Agent Evaluation

AgentEval is a .NET evaluation toolkit for AI agents that validates tool usage, scores RAG quality, compares models, and exports regression-ready reports.

Ready-to-run agent install

To install this asset, the agent chooses its runtime, checks the install plan, and then runs the matching command.

Native · 98/100 · Policy: allow

  • Agent surface: Any MCP/CLI agent
  • Kind: Skill
  • Install: Single
  • Trust: Established
  • Entrypoint: Asset

Direct install command:

npx -y tokrepo@latest install 19beb569-331b-4aa8-a6f4-fe45cb89b6f3 --target codex

Run this command only after a dry run confirms the install plan.

Intro

  • Best for: .NET teams building tool-using agents who want evaluation code that lives next to unit tests
  • Works with: .NET 8+ apps; integrates with agent frameworks and CI pipelines; ships as a NuGet package
  • Setup time: about 15 minutes

Practical Notes

  • Setup takes about 15 minutes: add the NuGet package and run one starter eval
  • Runs alongside tests: the fastest check is dotnet test with evaluation assertions enabled
  • GitHub stars and forks (verified): see Source & Thanks

AgentEval is most useful when you treat tool usage as a contract. Instead of only judging final text, assert that:

  • The agent called the expected tools (and did not call forbidden ones).
  • The tool inputs are well-formed and minimally scoped.
  • Retrieval answers are grounded (your RAG checks pass consistently).
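
The first two checks above can be sketched in plain C#. This is only an illustration of the contract idea, not AgentEval's API: the ToolCall record, the ToolContract.Check helper, and the tool names (search_orders, delete_order) are all hypothetical, and the toolkit's real types and assertions may look quite different.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one recorded tool call (not an AgentEval type).
public sealed record ToolCall(string Name, IReadOnlyDictionary<string, string> Args);

public static class ToolContract
{
    // Returns human-readable violations; an empty list means the contract held.
    public static List<string> Check(
        IReadOnlyList<ToolCall> calls,
        IReadOnlyCollection<string> expected,
        IReadOnlyCollection<string> forbidden,
        IReadOnlyDictionary<string, string[]> allowedArgs)
    {
        var violations = new List<string>();
        var called = calls.Select(c => c.Name).ToHashSet();

        // Every expected tool must appear at least once in the trace.
        foreach (var name in expected.Where(n => !called.Contains(n)))
            violations.Add($"missing expected tool call: {name}");

        // No forbidden tool may appear at all.
        foreach (var name in forbidden.Where(n => called.Contains(n)))
            violations.Add($"forbidden tool was called: {name}");

        // Inputs stay minimally scoped: only allow-listed argument names pass.
        foreach (var call in calls)
        {
            if (!allowedArgs.TryGetValue(call.Name, out var allowed)) continue;
            foreach (var key in call.Args.Keys.Where(k => !allowed.Contains(k)))
                violations.Add($"{call.Name}: unexpected argument '{key}'");
        }
        return violations;
    }
}

public static class Program
{
    public static int Main()
    {
        // A made-up trace: one over-scoped call and one forbidden call.
        var trace = new List<ToolCall>
        {
            new ToolCall("search_orders", new Dictionary<string, string>
                { ["customerId"] = "42", ["includeArchived"] = "true" }),
            new ToolCall("delete_order", new Dictionary<string, string> { ["orderId"] = "7" }),
        };

        var violations = ToolContract.Check(
            trace,
            expected: new[] { "search_orders" },
            forbidden: new[] { "delete_order" },
            allowedArgs: new Dictionary<string, string[]>
                { ["search_orders"] = new[] { "customerId" } });

        foreach (var v in violations) Console.WriteLine(v);
        return violations.Count == 0 ? 0 : 1; // non-zero exit fails a CI step
    }
}
```

In a test project, the same check would more naturally sit inside a test method that asserts the violation list is empty, so that it runs under dotnet test with the rest of the suite.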

Because the repo is explicitly labeled preview/experimental, pin the package version in CI and keep an upgrade checklist (baseline scores and golden traces) to run through before each version bump.
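
One way to make the baseline-score item on that checklist mechanical is a small gate that compares the current run against stored baseline numbers. The sketch below is an assumption-laden illustration: BaselineGate, the metric names (tool_accuracy, groundedness), the scores, and the 0.02 tolerance are all hypothetical and are not taken from AgentEval.

```csharp
using System;
using System.Collections.Generic;

public static class BaselineGate
{
    // Lists metrics that dropped more than `tolerance` below the baseline,
    // plus any baseline metric missing from the current run.
    public static List<string> FindRegressions(
        IReadOnlyDictionary<string, double> baseline,
        IReadOnlyDictionary<string, double> current,
        double tolerance)
    {
        var regressions = new List<string>();
        foreach (var (metric, expected) in baseline)
        {
            if (!current.TryGetValue(metric, out var actual))
                regressions.Add($"{metric}: missing from current run");
            else if (actual < expected - tolerance)
                regressions.Add($"{metric}: {actual} is below baseline {expected}");
        }
        return regressions;
    }
}

public static class Program
{
    public static int Main()
    {
        // Hypothetical scores; in practice these would be loaded from files
        // saved before and after the version bump.
        var baseline = new Dictionary<string, double>
            { ["tool_accuracy"] = 0.90, ["groundedness"] = 0.85 };
        var current = new Dictionary<string, double>
            { ["tool_accuracy"] = 0.91, ["groundedness"] = 0.78 };

        var regressions = BaselineGate.FindRegressions(baseline, current, tolerance: 0.02);
        foreach (var r in regressions) Console.WriteLine(r);
        return regressions.Count == 0 ? 0 : 1; // non-zero exit blocks the upgrade
    }
}
```

A small tolerance keeps ordinary run-to-run noise from blocking an upgrade; the right value depends on how stable your own evals are.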

FAQ

Q: Is this production-ready?
A: The repo warns it is preview/experimental. Use it in CI with pinned versions and your own validation before shipping.

Q: Can I evaluate tool calls, not just text?
A: Yes. Tool usage validation is a first-class goal in the project description.

Q: How do I start fast?
A: Add the NuGet package, follow the Getting Started guide, and turn one high-risk workflow into an eval test.

Source & Thanks

  • Source: https://github.com/AgentEvalHQ/AgentEval
  • License: MIT
  • GitHub stars: 89 · forks: 8
