Scripts · Mar 31, 2026 · 2 min read

DeepEval — LLM Testing Framework with 30+ Metrics

DeepEval is a pytest-like testing framework for LLM apps with 30+ metrics and 14.4K+ GitHub stars. It covers RAG, agent, and multimodal evaluation, runs locally, and is MIT licensed.

Introduction

DeepEval is an open-source testing framework for LLM applications. It works like pytest but is specialized for AI evaluation. With 14,400+ GitHub stars and an MIT license, it provides 30+ evaluation metrics, including G-Eval, RAG metrics (answer relevancy, faithfulness, contextual precision), agentic metrics (task completion, tool correctness), and multimodal evaluations. DeepEval supports component-level testing via the @observe decorator, integrates with OpenAI, LangChain, LlamaIndex, CrewAI, and Anthropic, and runs all evaluations locally on your machine.

Best for: Teams who want pytest-style testing for their LLM applications with comprehensive metrics
Works with: Claude Code, OpenAI Codex, Cursor, Gemini CLI, Windsurf
Integrations: OpenAI, LangChain, LlamaIndex, CrewAI, Anthropic


Key Features

  • 30+ metrics: G-Eval, RAG, agentic, multimodal, custom metrics
  • pytest-compatible: deepeval test run works like pytest
  • Component tracing: @observe decorator for per-component evaluation
  • Benchmark suite: MMLU, HellaSwag, DROP, and more in minimal code
  • Local execution: All metrics run on your machine
  • Framework support: OpenAI, LangChain, LlamaIndex, CrewAI, Anthropic
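To make the RAG metrics more concrete, here is a minimal pure-Python sketch of the arithmetic behind a rank-aware score such as contextual precision: precision is taken at each position that holds a relevant chunk, then averaged over the relevant chunks (standard average precision). This is an illustration under assumptions, not code from DeepEval. The helper name contextual_precision is hypothetical, and the relevance flags are hard-coded here, whereas an evaluation framework would normally obtain them from a judge model.

```python
from typing import Sequence


def contextual_precision(relevance: Sequence[bool]) -> float:
    """Rank-weighted precision over retrieved chunks (average precision).

    relevance[i] is True when the chunk at rank i + 1 is relevant.
    Hypothetical helper for illustration; not part of the DeepEval API.
    """
    relevant_so_far = 0
    weighted_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            relevant_so_far += 1
            # Precision at this rank, counted only at relevant positions.
            weighted_sum += relevant_so_far / rank
    if relevant_so_far == 0:
        return 0.0  # No relevant chunk was retrieved at all.
    return weighted_sum / relevant_so_far


# Same retrieved chunks, different ordering.
good_ranking = contextual_precision([True, True, False])  # relevant chunks first
poor_ranking = contextual_precision([False, True, True])  # irrelevant chunk first
```

With these inputs, good_ranking is 1.0 and poor_ranking is 7/12 (about 0.58), so a pass threshold of 0.7 would accept the first retriever ordering and reject the second.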

FAQ

Q: What is DeepEval?
A: DeepEval is a pytest-like LLM testing framework with 14.4K+ GitHub stars. It offers 30+ metrics for RAG, agents, and multimodal apps, runs locally, and is MIT licensed.

Q: How do I install DeepEval?
A: Run pip install -U deepeval. Then write test cases with LLMTestCase and run them with deepeval test run.



Source and Acknowledgments

Created by Confident AI. Licensed under MIT. confident-ai/deepeval — 14,400+ GitHub stars
