Key Features
- RAG metrics: faithfulness, answer relevancy, and context precision/recall
- Test data generation: Auto-create evaluation datasets from documents
- LLM + traditional metrics: Both AI-judged and deterministic scoring
- Production feedback loops: Use real data to improve quality
- LangChain integration: Evaluate chains and retrievers directly
- Async scoring: Fast parallel evaluation with any LLM provider
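As a simplified illustration of the deterministic side of "LLM + traditional metrics", the sketch below computes a toy faithfulness-style score: the fraction of an answer's content words that also appear in the retrieved context. This is not Ragas's actual algorithm (Ragas's faithfulness metric uses an LLM judge to verify individual claims); the function name and scoring rule here are hypothetical.

```python
def toy_faithfulness(answer: str, context: str) -> float:
    """Fraction of content words in the answer that also appear in the
    context. A crude, deterministic stand-in; Ragas's real faithfulness
    metric asks an LLM to check each claim against the context."""
    stopwords = {"the", "a", "an", "is", "are", "of", "in", "to", "and"}
    answer_words = {w.strip(".,").lower() for w in answer.split()} - stopwords
    context_words = {w.strip(".,").lower() for w in context.split()} - stopwords
    if not answer_words:
        return 0.0
    supported = answer_words & context_words
    return len(supported) / len(answer_words)

context = "Ragas is an evaluation toolkit for RAG pipelines."
print(toy_faithfulness("Ragas is an evaluation toolkit.", context))  # 1.0
print(toy_faithfulness("Ragas was written in Haskell.", context))    # 0.25
```

A deterministic score like this is cheap and reproducible, which is why toolkits pair such metrics with (slower, costlier) LLM-judged ones.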
FAQ
Q: What is Ragas?
A: Ragas is an open-source LLM evaluation toolkit (13.2K+ GitHub stars, Apache 2.0 licensed). It provides objective metrics for RAG pipelines such as faithfulness and answer relevancy, automatic test-set generation from documents, and LangChain integration.
Q: How do I install Ragas?
A: Install with `pip install ragas`, then scaffold a starter project with `ragas quickstart rag_eval -o ./my-project`.
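The async parallel scoring mentioned in the feature list can be sketched with Python's asyncio. The metric below is a stub standing in for a real LLM-provider call, and the function names (`score_sample`, `evaluate_all`) are hypothetical, not Ragas APIs.

```python
import asyncio

async def score_sample(sample: dict) -> float:
    """Stub metric: stands in for an async request to an LLM judge."""
    await asyncio.sleep(0)  # placeholder for network latency
    # Hypothetical rule: score 1.0 when the answer is non-empty.
    return 1.0 if sample["answer"] else 0.0

async def evaluate_all(samples: list[dict]) -> list[float]:
    # Fan out all samples concurrently; gather preserves input order.
    return await asyncio.gather(*(score_sample(s) for s in samples))

samples = [{"answer": "Paris"}, {"answer": ""}, {"answer": "42"}]
scores = asyncio.run(evaluate_all(samples))
print(scores)  # [1.0, 0.0, 1.0]
```

Because each score is awaited concurrently rather than sequentially, wall-clock time is dominated by the slowest single call instead of the sum of all calls.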