Evidently AI
Evidently is an open-source Python library (currently v0.7.21) for evaluating, testing, and monitoring machine learning and LLM systems in production. It offers 100+ built-in metrics covering data drift, model performance, data quality, and LLM-specific evaluations. The library is actively developed with frequent releases and provides both an open-source framework for offline evaluations and a UI for continuous monitoring; additional features are available through Evidently Cloud.
Warnings
- breaking Evidently v0.7 introduced a major API overhaul and made the new API the default. Reports are now imported from the top-level package (`from evidently import Report`), presets come from `evidently.presets`, and explicit `Dataset` and `DataDefinition` objects replace the older `column_mapping` approach. The previous API moved under the `evidently.legacy` namespace. Code written for versions prior to 0.7 (especially 0.6.7 and older) will likely break.
- breaking Support for Python 3.9 was dropped in Evidently v0.7.21. Users on Python 3.9 will encounter errors or compatibility issues.
- gotcha When using Evidently alongside DVC (Data Version Control), there can be a dependency conflict with `pathspec`. Evidently v0.7.20 explicitly locked `pathspec` to `<1` for DVC compatibility. Ensure your `pathspec` version respects this constraint if you encounter issues with DVC.
- gotcha Evidently v0.7.21 added explicit support for Pandas 3. Older Evidently versions may show compatibility issues or unexpected behavior with Pandas 3, so upgrade Evidently before moving to Pandas 3, or keep Pandas below 3 if you must stay on an older Evidently release.
- deprecated Evidently Cloud v1 entered read-only mode after May 31, 2025, for free users. Users of Evidently Cloud must migrate to Cloud v2 and use Evidently library version `0.7.0` or newer to continue sending data and accessing features.
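The version-dependent warnings above can be turned into a quick pre-flight check. The following is a minimal sketch using only the standard library; the helper names (`parse_version`, `check_environment`, `installed_evidently_version`) are hypothetical and not part of Evidently, and the thresholds simply restate the warnings listed here.

```python
import sys
from importlib import metadata
from typing import List, Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.7.21' into (0, 7, 21); non-numeric suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_environment(evidently_version: Optional[str],
                      python_version: Tuple[int, int]) -> List[str]:
    """Return human-readable notes based on the warnings in this section (hypothetical helper)."""
    if evidently_version is None:
        return ["evidently is not installed"]
    notes = []
    if parse_version(evidently_version) < (0, 7):
        notes.append("pre-0.7 release: code uses the old API and must be migrated for 0.7+")
    if python_version < (3, 10):
        notes.append("Python older than 3.10: Evidently 0.7.21 and later do not support it")
    return notes


def installed_evidently_version() -> Optional[str]:
    """Look up the installed distribution version, or None if it is absent."""
    try:
        return metadata.version("evidently")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for note in check_environment(installed_evidently_version(), sys.version_info[:2]):
        print("warning:", note)
```

Running the script prints one line per applicable warning and nothing when the environment matches the current release.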
Install
pip install evidently
Imports
- Report
from evidently import Report
- DataDriftPreset
from evidently.presets import DataDriftPreset
- Dataset
from evidently import Dataset
- DataDefinition
from evidently import DataDefinition
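Because import paths changed in v0.7, shared code sometimes needs to know which layout is installed before importing anything. Below is a small sketch, assuming that `evidently.presets` is present in 0.7+ releases and `evidently.metric_preset` in earlier ones; the function names are hypothetical and only the standard library is used.

```python
import importlib.util
from typing import Callable, Optional


def module_exists(name: str) -> bool:
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when the parent package is missing or the name is malformed.
        return False


def detect_evidently_layout(exists: Callable[[str], bool] = module_exists) -> Optional[str]:
    """Classify the installed package layout (hypothetical helper, not an Evidently API)."""
    if not exists("evidently"):
        return None
    if exists("evidently.presets"):
        return "0.7+"
    if exists("evidently.metric_preset"):
        return "pre-0.7"
    return "unknown"


if __name__ == "__main__":
    print("Detected Evidently layout:", detect_evidently_layout())
```

The `exists` parameter is injectable so the classification logic can be exercised without installing any particular Evidently version.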
Quickstart
from sklearn import datasets
from evidently import Report, Dataset, DataDefinition
from evidently.presets import DataDriftPreset
# Prepare a toy dataset (Adult dataset from OpenML; its label column is named "class")
reference_data_frame = datasets.fetch_openml(name="adult", version=2, as_frame="auto").frame
current_data_frame = reference_data_frame.sample(n=5000, random_state=0)
# Describe the column types with DataDefinition (replaces column_mapping from v0.7)
data_definition = DataDefinition(
    numerical_columns=[
        'age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', 'hours-per-week'
    ],
    categorical_columns=[
        'workclass', 'education', 'marital-status', 'occupation',
        'relationship', 'race', 'sex', 'native-country', 'class'
    ],
)
# Create Evidently Dataset objects
reference_dataset = Dataset.from_pandas(reference_data_frame, data_definition=data_definition)
current_dataset = Dataset.from_pandas(current_data_frame, data_definition=data_definition)
# Create and run a Data Drift Report
data_drift_report = Report(metrics=[
    DataDriftPreset(
        # Optional settings, e.g. the drift method or the share of drifted columns:
        # method="psi", drift_share=0.7
    )
])
# In the v0.7 API, run() returns the computed result object
data_drift_result = data_drift_report.run(reference_data=reference_dataset, current_data=current_dataset)
# To display in a Jupyter notebook or save to HTML
# data_drift_result
# data_drift_result.save_html("data_drift_report.html")
print("Data drift report generated successfully!")
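Note that the quickstart draws the current data as a random sample of the reference data, so the report will usually show little or no drift. To see drift flagged, one option is to split the frame on a categorical column so the two sets differ systematically. The sketch below uses pandas only; `split_by_category` and the toy frame are hypothetical illustrations, not part of Evidently.

```python
from typing import Iterable, Tuple

import pandas as pd


def split_by_category(frame: pd.DataFrame, column: str,
                      current_values: Iterable[str]) -> Tuple[pd.DataFrame, pd.DataFrame]:
    """Rows whose `column` value is in `current_values` become the current set;
    all remaining rows become the reference set."""
    mask = frame[column].isin(list(current_values))
    reference = frame[~mask].reset_index(drop=True)
    current = frame[mask].reset_index(drop=True)
    return reference, current


# Toy frame standing in for the Adult dataset (values are made up for illustration).
toy = pd.DataFrame({
    "education": ["HS-grad", "Bachelors", "Masters", "HS-grad", "Doctorate", "Bachelors"],
    "age": [25, 32, 47, 51, 38, 29],
})

reference, current = split_by_category(toy, "education", ["HS-grad", "Bachelors"])
print(len(reference), "reference rows;", len(current), "current rows")
```

The two resulting frames can then be wrapped in `Dataset` objects in place of the sampled frames used in the quickstart.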