Practical Notes
- Setup time ~15 minutes (add NuGet + run one starter eval)
- Runs alongside tests: the fastest check is `dotnet test` with evaluation assertions enabled
- GitHub stars + forks (verified): see Source & Thanks
AgentEval is most useful when you treat tool usage as a contract. Instead of only judging final text, assert that:
- The agent called the expected tools (and did not call forbidden ones).
- The tool inputs are well-formed and minimally scoped.
- Retrieval answers are grounded (your RAG checks pass consistently).
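A contract-style eval test for those three assertions might look like the following sketch. This is C#-flavored pseudocode: `EvalRunner`, `AssertToolCalled`, and the other names are illustrative placeholders, not AgentEval's actual API — check the repo's docs for the real types.

```csharp
// Hypothetical names throughout -- adapt to the library's real API.
[Fact]
public async Task RefundAgent_LooksUpOrderBeforeRefunding()
{
    var run = await EvalRunner.ExecuteAsync(agent, "Refund order 123");

    // 1. Expected tools were called; forbidden ones were not.
    run.AssertToolCalled("lookup_order");
    run.AssertToolNotCalled("delete_order");

    // 2. Tool inputs are well-formed and minimally scoped.
    var call = run.ToolCall("lookup_order");
    Assert.Equal("123", call.Arguments["order_id"]);

    // 3. The final answer is grounded in what the tool returned.
    Assert.Contains(call.Result.OrderTotal, run.FinalText);
}
```

Treating each assertion as a separate failure (rather than one pass/fail score) makes regressions much easier to diagnose when a model or prompt changes.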
Because this repo is explicitly labeled as preview/experimental, pin versions in CI and keep an upgrade checklist (baseline scores + golden traces) before bumping.
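In a .csproj, pinning means an exact version with no floating wildcards. The package id and version below are placeholders, not confirmed from the repo:

```xml
<ItemGroup>
  <!-- Placeholder id/version: pin the exact preview build you validated,
       never a wildcard like 0.1.* -- preview releases can break between builds. -->
  <PackageReference Include="AgentEval" Version="0.1.0-preview.1" />
</ItemGroup>
```

Bump the version only after re-running your baseline scores and golden traces against the new build.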
FAQ
Q: Is this production-ready? A: The repo warns it is preview/experimental. Use it in CI with pinned versions and your own validation before shipping.
Q: Can I evaluate tool calls, not just text? A: Yes — tool usage validation is a first-class goal in the project description.
Q: How do I start fast? A: Add the NuGet package, follow the Getting Started guide, and turn one high-risk workflow into an eval test.
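As a command sketch, the fast path could look like this. The package id and version are assumptions; take the real ones from the repo's Getting Started guide:

```shell
# Placeholder package id/version -- use the actual values from the repo.
dotnet add package AgentEval --version 0.1.0-preview.1

# Run eval-tagged tests separately so they can gate CI on their own.
dotnet test --filter "Category=Eval"
```

Filtering by a test category keeps slower, model-dependent evals out of your fast unit-test loop while still running them on every CI build.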