Scripts · May 11, 2026 · 2 min read

AgentEval — .NET Toolkit for Agent Evaluation

Intro

AgentEval is a .NET evaluation toolkit for AI agents that validates tool usage, scores RAG quality, compares models, and exports regression-ready reports.

  • Best for: .NET teams building tool-using agents who want evaluation code that lives next to unit tests
  • Works with: .NET 8+ apps; integrates with agent frameworks and CI pipelines; ships as a NuGet package
  • Setup time: 15 minutes

Practical Notes

  • Setup time: ~15 minutes (add the NuGet package and run one starter eval)
  • Runs alongside your tests: the fastest check is dotnet test with evaluation assertions enabled (see the command below)
  • GitHub stars and forks (verified) are listed under Source & Thanks
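
For example, if your eval tests carry a hypothetical xUnit trait such as [Trait("Category", "AgentEval")] (the trait name is ours, not the library's), you can run just those checks:

```bash
# Run only the evaluation tests; "Category=AgentEval" is an illustrative
# trait you would assign yourself, not an AgentEval built-in.
dotnet test --filter "Category=AgentEval"
```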

AgentEval is most useful when you treat tool usage as a contract. Instead of only judging the final text, assert the following (a test sketch follows the list):

  • The agent called the expected tools (and did not call forbidden ones).
  • The tool inputs are well-formed and minimally scoped.
  • Retrieval answers are grounded (your RAG checks pass consistently).
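
Here is a minimal sketch of the first two checks as a plain xUnit test, assuming you can capture the agent's tool calls as a trace. AgentEval's actual trace types and assertion helpers will differ, and every name below (the tools, the record shape, the bounds) is illustrative:

```csharp
using System.Collections.Generic;
using System.Linq;
using Xunit;

// Hypothetical shape for one captured tool invocation. AgentEval's own
// trace types will differ; this record only illustrates the contract idea.
public record ToolCall(string Name, IReadOnlyDictionary<string, string> Arguments);

public class RefundWorkflowContractTests
{
    [Fact]
    public void Agent_calls_only_allowed_tools_with_scoped_inputs()
    {
        // In practice this trace would be captured from an agent run
        // against a golden conversation, not written by hand.
        var trace = new List<ToolCall>
        {
            new("lookup_order", new Dictionary<string, string> { ["order_id"] = "A-1001" }),
            new("issue_refund", new Dictionary<string, string> { ["order_id"] = "A-1001", ["amount"] = "19.99" }),
        };

        // The expected tools were called, in order.
        Assert.Equal(new[] { "lookup_order", "issue_refund" }, trace.Select(c => c.Name));

        // No forbidden tool was called.
        Assert.DoesNotContain(trace, c => c.Name == "delete_account");

        // Tool inputs are well-formed and minimally scoped.
        var refund = trace.Single(c => c.Name == "issue_refund");
        Assert.True(decimal.TryParse(refund.Arguments["amount"], out var amount));
        Assert.InRange(amount, 0.01m, 500m);
    }
}
```

Because the trace is just data, the same assertions rerun in CI whenever the model or prompt changes, which is what makes the contract regression-ready.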

Because this repo is explicitly labeled as preview/experimental, pin versions in CI and keep an upgrade checklist (baseline scores + golden traces) before bumping.
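
For the pinning itself, NuGet lock files are the standard .NET mechanism. A minimal sketch, assuming the package ID matches the repo name (both the ID and the version below are unverified placeholders):

```xml
<!-- Test project .csproj: generate packages.lock.json and pin the version.
     "AgentEval" and "0.1.0" are placeholders; confirm the real ID on nuget.org. -->
<PropertyGroup>
  <RestorePackagesWithLockFile>true</RestorePackagesWithLockFile>
</PropertyGroup>
<ItemGroup>
  <PackageReference Include="AgentEval" Version="0.1.0" />
</ItemGroup>
```

In CI, run dotnet restore --locked-mode so the build fails whenever the resolved package graph drifts from packages.lock.json, turning every version bump into a deliberate, reviewable change.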

FAQ

Q: Is this production-ready? A: The repo warns it is preview/experimental. Use it in CI with pinned versions and your own validation before shipping.

Q: Can I evaluate tool calls, not just text? A: Yes — tool usage validation is a first-class goal in the project description.

Q: How do I start fast? A: Add the NuGet package, follow the Getting Started guide, and turn one high-risk workflow into an eval test.
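
In command form, assuming the package ID matches the repo name (verify on nuget.org first):

```bash
dotnet add package AgentEval   # package ID assumed from the repo name
dotnet test                    # eval assertions run with the rest of the suite
```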

Source & Thanks

Source: https://github.com/AgentEvalHQ/AgentEval · License: MIT · GitHub stars: 89 · forks: 8
