# Evidently — ML & LLM Monitoring with 100+ Metrics

> Evaluate, test, and monitor AI systems with 100+ built-in metrics for data drift, model quality, and LLM output. 7.3K+ stars.

## Quick Use

```bash
pip install evidently
```

```python
from evidently.report import Report
from evidently.metric_preset import TextEvals
import pandas as pd

# Evaluate LLM outputs
data = pd.DataFrame({
    "question": ["What is RAG?", "Explain fine-tuning"],
    "answer": ["RAG is Retrieval Augmented Generation...", "Fine-tuning adjusts..."],
    "context": ["RAG combines retrieval with generation...", "Fine-tuning is a process..."],
})

report = Report(metrics=[TextEvals()])
report.run(current_data=data, reference_data=None)
report.save_html("llm_eval_report.html")
```

Launch the monitoring dashboard:

```bash
evidently ui --workspace ./my_workspace
```

---

## Intro

Evidently is an open-source ML and LLM observability framework with 7,300+ GitHub stars, providing 100+ built-in metrics for evaluating, testing, and monitoring any AI-powered system. It covers the full lifecycle, from offline evaluation (testing LLM outputs before deployment) to production monitoring (detecting data drift and quality degradation in real time). Evidently generates rich HTML reports and dashboards, integrates with CI/CD for automated testing, and works with both traditional ML models and LLM applications.

Works with: any ML model, LLM applications (OpenAI, Claude, etc.), Pandas DataFrames, MLflow, Airflow, Grafana.

Best for ML/AI teams who need comprehensive model and data monitoring. Setup time: under 3 minutes.
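The `TextEvals` preset above reduces each LLM answer to row-level statistics such as length and pattern matches. As a rough, library-free illustration of what such text descriptors measure (a hand-rolled sketch in plain Python, not Evidently's implementation; the field names are made up for this example):

```python
import re

def text_descriptors(text: str) -> dict:
    """Per-row text statistics similar in spirit to Evidently's built-in
    descriptors (TextLength, NonLetterCharacterPercentage, RegExp).

    Illustration only; keys and formulas are this sketch's own, not
    the library's API."""
    letters = sum(ch.isalpha() for ch in text)
    length = len(text)
    return {
        "length": length,
        "non_letter_pct": 100.0 * (length - letters) / max(length, 1),
        "declines_to_answer": bool(re.search(r"I don't know", text)),
    }

stats = text_descriptors("I don't know.")
# stats["length"] == 13, stats["declines_to_answer"] is True
```

Computed over a whole column of answers, statistics like these feed quality and drift checks, e.g. flagging a rising share of "I don't know" responses in production.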
---

## Evidently Capabilities

### LLM Evaluation

```python
from evidently.report import Report
from evidently.metric_preset import TextEvals
from evidently.descriptors import (
    TextLength, Sentiment, NonLetterCharacterPercentage,
    OOVWordsPercentage, RegExp,
)

# Descriptors are computed row by row for the chosen text column
report = Report(metrics=[
    TextEvals(column_name="answer", descriptors=[
        TextLength(),
        Sentiment(),
        NonLetterCharacterPercentage(),
        OOVWordsPercentage(),
        RegExp(reg_exp=r"I don't know"),
    ]),
])
report.run(current_data=production_data, reference_data=None)
```

### Data Drift Detection

```python
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=training_data, current_data=production_data)
# Detects: feature distribution shifts, new categories, outliers
```

### Test Suites for CI/CD

```python
from evidently.test_suite import TestSuite
from evidently.tests import (
    TestColumnDrift, TestShareOfMissingValues, TestMeanInNSigmas
)

suite = TestSuite(tests=[
    TestColumnDrift(column_name="prediction"),
    TestShareOfMissingValues(column_name="input", lt=0.05),
    TestMeanInNSigmas(column_name="score", n_sigmas=2),
])
suite.run(reference_data=ref, current_data=curr)

# Fail the pipeline if any test fails
assert suite.as_dict()["summary"]["all_passed"]
```

### 100+ Built-in Metrics

| Category | Metrics |
|----------|---------|
| **Data Quality** | Missing values, duplicates, outliers, data types |
| **Data Drift** | Distribution shift, feature importance drift |
| **Classification** | Accuracy, precision, recall, F1, AUC, confusion matrix |
| **Regression** | MAE, RMSE, MAPE, residuals |
| **Text/LLM** | Length, sentiment, toxicity, regex patterns, embedding drift |
| **Ranking** | NDCG, MAP, MRR |

### Monitoring Dashboard

```python
from evidently.report import Report
from evidently.metric_preset import TextEvals
from evidently.ui.workspace import Workspace

ws = Workspace.create("my_workspace")
project = ws.create_project("LLM App")
# Dashboard panels (e.g. text quality over time) are configured via
# project.dashboard; see the Evidently docs for the available panel types

# Add snapshots over time
for batch in daily_batches:
    report = Report(metrics=[TextEvals()])
    report.run(current_data=batch, reference_data=None)
    ws.add_report(project.id, report)
```

---

## FAQ

**Q:
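Under the hood, drift detection like `DataDriftPreset` compares the reference and current distribution of each column and flags the column when a score crosses a threshold. As a library-free sketch of one common drift score, the Population Stability Index (hand-rolled for numeric samples; Evidently itself selects statistical tests such as K-S, chi-squared, or PSI per column type):

```python
import math

def psi(reference, current, bins=10):
    """Population Stability Index between two numeric samples.

    Rule of thumb: < 0.1 is stable, > 0.25 is a significant shift.
    Illustration only, not Evidently's implementation."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # guard against a constant column

    def bin_shares(sample):
        counts = [0] * bins
        for x in sample:
            # Clamp out-of-range current values into the edge bins
            idx = min(max(int((x - lo) / width), 0), bins - 1)
            counts[idx] += 1
        total = len(sample)
        # Smooth empty bins so log() stays defined
        return [max(c / total, 1e-6) for c in counts]

    ref_pct, cur_pct = bin_shares(reference), bin_shares(current)
    return sum((c - r) * math.log(c / r)
               for r, c in zip(ref_pct, cur_pct))
```

A PSI near 0 means the two distributions match; large values indicate the kind of feature shift that the drift preset surfaces per column in its report.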
What is Evidently?**
A: Evidently is an open-source ML/LLM monitoring framework with 7,300+ GitHub stars, providing 100+ metrics for evaluation, testing, and production monitoring of AI systems.

**Q: How is Evidently different from MLflow?**
A: MLflow tracks experiments and models. Evidently monitors model and data quality in production, detecting drift, degradation, and LLM output issues. They complement each other: MLflow for experiment tracking, Evidently for production monitoring.

**Q: Is Evidently free?**
A: Yes, it is open-source under Apache-2.0. Evidently also offers a managed cloud product.

---

## Source & Thanks

> Created by [Evidently AI](https://github.com/evidentlyai). Licensed under Apache-2.0.
>
> [evidently](https://github.com/evidentlyai/evidently) — ⭐ 7,300+

---

Source: https://tokrepo.com/en/workflows/1aa244dc-3770-4626-b1f7-26ad63e0ee0b
Author: AI Open Source