Workflows · Apr 3, 2026 · 2 min read

Evidently — ML & LLM Monitoring with 100+ Metrics

Evaluate, test, and monitor AI systems with 100+ built-in metrics for data drift, model quality, and LLM output. 7.3K+ stars.

Quick Use

Use it first, then decide how deep to go

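Install the package (shell):

pip install evidently

Then evaluate a small batch of LLM outputs (Python):

# Assumes the legacy module layout (evidently.report, evidently.metric_preset);
# releases from 0.7 onward reorganize these imports.
import pandas as pd
from evidently.report import Report
from evidently.metric_preset import TextEvals

# Evaluate LLM outputs
data = pd.DataFrame({
    "question": ["What is RAG?", "Explain fine-tuning"],
    "answer": ["RAG is Retrieval Augmented Generation...", "Fine-tuning adjusts..."],
    "context": ["RAG combines retrieval with generation...", "Fine-tuning is a process..."]
})

# TextEvals profiles one text column; here, the model's answers
report = Report(metrics=[TextEvals(column_name="answer")])
report.run(reference_data=None, current_data=data)
report.save_html("llm_eval_report.html")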

Launch the monitoring dashboard:

evidently ui --workspace ./my_workspace

Intro

Evidently is an open-source ML and LLM observability framework with 7,300+ GitHub stars and 100+ built-in metrics for evaluating, testing, and monitoring AI-powered systems. It covers the full lifecycle, from offline evaluation (testing LLM outputs before deployment) to production monitoring (detecting data drift and quality degradation in real time). Evidently generates HTML reports and dashboards, integrates with CI/CD for automated testing, and works with both traditional ML models and LLM applications.

Works with: Any ML model, LLM applications (OpenAI, Claude, etc.), Pandas DataFrames, MLflow, Airflow, Grafana. Best for ML/AI teams who need comprehensive model and data monitoring. Setup time: under 3 minutes.


Evidently Capabilities

LLM Evaluation

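from evidently.report import Report
from evidently.metric_preset import TextEvals
from evidently.descriptors import TextLength, Sentiment, RegExp

# Descriptors are row-level text checks. In the legacy API they are passed
# to TextEvals rather than listed directly as report metrics.
report = Report(metrics=[
    TextEvals(column_name="answer", descriptors=[
        TextLength(),
        Sentiment(),
        RegExp(reg_exp=r"I don't know"),
    ]),
])
# production_data: a pandas DataFrame with an "answer" text column
report.run(reference_data=None, current_data=production_data)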
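
To make the descriptor idea concrete, here is a minimal pure-Python sketch of the kind of row-level checks named above: text length, share of non-letter characters, and a regular-expression match. It does not use or reproduce Evidently's implementation; the function names `text_length`, `non_letter_share`, `matches_pattern`, and `describe` are hypothetical and exist only for illustration.

```python
import re


def text_length(text: str) -> int:
    """Number of characters in the text."""
    return len(text)


def non_letter_share(text: str) -> float:
    """Percentage of characters that are neither letters nor whitespace."""
    if not text:
        return 0.0
    non_letters = sum(1 for ch in text if not (ch.isalpha() or ch.isspace()))
    return 100.0 * non_letters / len(text)


def matches_pattern(text: str, pattern: str) -> bool:
    """True if the regular expression is found anywhere in the text."""
    return re.search(pattern, text) is not None


def describe(answers: list[str], pattern: str) -> list[dict]:
    """Compute all three checks for every answer."""
    return [
        {
            "length": text_length(a),
            "non_letter_pct": non_letter_share(a),
            "pattern_hit": matches_pattern(a, pattern),
        }
        for a in answers
    ]
```

Calling `describe(answers, r"I don't know")` on a list of model answers yields one dictionary per row, which is roughly the shape of data a text-quality report then summarizes.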

Data Drift Detection

from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=training_data, current_data=production_data)
# Reports per-column drift by comparing reference and current distributions
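
For intuition about what a distribution-shift score measures, the sketch below computes the Population Stability Index (PSI) for one numeric feature in plain Python. This is an illustrative stand-in, not necessarily the test Evidently applies by default; the function name `psi`, the equal-width binning, and the epsilon floor are all assumptions made for the example.

```python
import math


def psi(reference: list[float], current: list[float], bins: int = 10) -> float:
    """Population Stability Index between two numeric samples.

    Bin edges come from the reference sample's min and max. A small epsilon
    keeps empty bins from producing log(0) or division by zero.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # guard against a constant reference

    def shares(sample: list[float]) -> list[float]:
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        eps = 1e-6
        return [max(c / len(sample), eps) for c in counts]

    ref_s, cur_s = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_s, cur_s))
```

Identical samples score 0, and the score grows as the current sample moves away from the reference. A common rule of thumb treats values above roughly 0.25 as a substantial shift, though the right threshold depends on the feature and the use case.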

Test Suites for CI/CD

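from evidently.test_suite import TestSuite
from evidently.tests import (
    TestColumnDrift, TestColumnShareOfMissingValues,
    TestMeanInNSigmas
)

# ref and curr are pandas DataFrames with the same columns
suite = TestSuite(tests=[
    TestColumnDrift(column_name="prediction"),
    # Column-level missing-value check: fail at 5% or more missing
    TestColumnShareOfMissingValues(column_name="input", lt=0.05),
    # Current mean must stay within 2 sigmas of the reference mean
    TestMeanInNSigmas(column_name="score", n_sigmas=2),
])
suite.run(reference_data=ref, current_data=curr)

# Use in CI/CD: a failed assertion gives a non-zero exit code
assert suite.as_dict()["summary"]["all_passed"], "Evidently test suite failed"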
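
A bare assertion stops the pipeline but says little about what failed. The sketch below shows one way a CI step might turn the result dictionary into an exit code plus a list of failing tests. Only the `summary.all_passed` key is taken from the snippet above; the `tests` list with `name` and `status` fields is an assumed layout, and `failed_tests` and `exit_code` are hypothetical helper names.

```python
def failed_tests(result: dict) -> list[str]:
    """Names of tests whose status is not SUCCESS (assumed result layout)."""
    return [
        t.get("name", "<unnamed>")
        for t in result.get("tests", [])
        if t.get("status") != "SUCCESS"
    ]


def exit_code(result: dict) -> int:
    """0 when the summary reports that all tests passed, 1 otherwise."""
    return 0 if result.get("summary", {}).get("all_passed") is True else 1
```

In a CI script, the last line could be `sys.exit(exit_code(suite.as_dict()))`, with the names from `failed_tests` printed to stderr beforehand. A missing or malformed summary is treated as a failure, so the gate fails closed.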

100+ Built-in Metrics

Category | Metrics
--- | ---
Data Quality | Missing values, duplicates, outliers, data types
Data Drift | Distribution shift, feature importance drift
Classification | Accuracy, precision, recall, F1, AUC, confusion matrix
Regression | MAE, RMSE, MAPE, residuals
Text/LLM | Length, sentiment, toxicity, regex patterns, embedding drift
Ranking | NDCG, MAP, MRR
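
As a reference point for the regression row, the sketch below spells out the standard definitions of MAE, RMSE, and MAPE in plain Python. It is a worked illustration of the formulas, not Evidently's code; the function names are hypothetical, and MAPE here assumes no true value is zero.

```python
import math


def mae(y_true: list[float], y_pred: list[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: list[float], y_pred: list[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true: list[float], y_pred: list[float]) -> float:
    """Mean absolute percentage error, in percent (true values must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)
```

RMSE penalizes large errors more heavily than MAE, while MAPE expresses error relative to the size of the true value, which is why the three are usually reported together.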

Monitoring Dashboard

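from evidently.report import Report
from evidently.metric_preset import TextEvals
from evidently.ui.workspace import Workspace
from evidently.ui.dashboards import CounterAgg, DashboardPanelCounter, ReportFilter

ws = Workspace.create("my_workspace")
project = ws.create_project("LLM App")

# add_panel expects a panel object, not a string. This is a title-only panel;
# plot panels additionally name the metric values to chart.
project.dashboard.add_panel(DashboardPanelCounter(
    title="Text Quality Over Time",
    filter=ReportFilter(metadata_values={}, tag_values=[]),
    agg=CounterAgg.NONE,
))
project.save()

# Add one snapshot per batch; daily_batches is an iterable of DataFrames
for batch in daily_batches:
    report = Report(metrics=[TextEvals(column_name="answer")])
    report.run(reference_data=None, current_data=batch)
    ws.add_report(project.id, report)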
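
The loop above assumes `daily_batches` already exists. One way to build it from a flat log of records is sketched below using only the standard library. The `timestamp` field name and the `split_by_day` helper are hypothetical; each resulting batch would still need converting to a DataFrame before being passed to a report.

```python
from collections import defaultdict
from datetime import date, datetime


def split_by_day(records: list[dict], ts_key: str = "timestamp") -> dict[date, list[dict]]:
    """Group records by calendar date, returned in chronological order."""
    batches: dict[date, list[dict]] = defaultdict(list)
    for rec in records:
        ts: datetime = rec[ts_key]
        batches[ts.date()].append(rec)
    return dict(sorted(batches.items()))
```

Iterating over `split_by_day(log).values()` gives one batch per day in date order, so snapshots are added to the workspace oldest first.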

FAQ

Q: What is Evidently? A: Evidently is an open-source ML/LLM monitoring framework with 7,300+ GitHub stars providing 100+ metrics for evaluation, testing, and production monitoring of AI systems.

Q: How is Evidently different from MLflow? A: MLflow tracks experiments and models. Evidently monitors model and data quality in production — detecting drift, degradation, and LLM output issues. They complement each other: MLflow for experiment tracking, Evidently for production monitoring.

Q: Is Evidently free? A: Yes, open-source under Apache-2.0. Evidently also offers a managed cloud product.


Source & Thanks

Created by Evidently AI. Licensed under Apache-2.0.

evidently — ⭐ 7,300+
