Evidently AI

0.7.21 · active · verified Sat Apr 11

Evidently is an open-source Python library (currently v0.7.21) for evaluating, testing, and monitoring machine learning and LLM systems in production. It offers 100+ built-in metrics covering data drift, model performance, data quality, and LLM-specific evaluations. The library is actively developed with frequent releases. It provides an open-source framework for offline evaluations and a UI for continuous monitoring, and additional features are available through Evidently Cloud.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to generate a Data Drift Report using Evidently. It fetches a sample dataset, creates Evidently `Dataset` and `DataDefinition` objects (essential since v0.7), and then runs a `DataDriftPreset` report. The report can be displayed interactively in a notebook or saved as an HTML file.

from sklearn import datasets

# In v0.7 the core classes are importable from the top-level package;
# the older `evidently.report` and `evidently.metric_preset` modules are legacy.
from evidently import Report, Dataset, DataDefinition
from evidently.presets import DataDriftPreset

# Prepare a toy dataset (Adult dataset from OpenML)
reference_data_frame = datasets.fetch_openml(name="adult", version=2, as_frame="auto").frame
# The current data is a random sample of the reference, so little drift is expected
current_data_frame = reference_data_frame.sample(n=5000, random_state=0)

# Define data schema using DataDefinition (required from v0.7)
data_definition = DataDefinition(
    numerical_columns=[
        'age', 'fnlwgt', 'education-num', 'capital-gain',
        'capital-loss', 'hours-per-week'
    ],
    categorical_columns=[
        'workclass', 'education', 'marital-status', 'occupation',
        'relationship', 'race', 'sex', 'native-country',
        # Target column; assumed to be named "class" in this OpenML version.
        # Check reference_data_frame.columns if the name differs.
        'class'
    ]
)

# Create Evidently Dataset objects
reference_dataset = Dataset.from_pandas(reference_data_frame, data_definition=data_definition)
current_dataset = Dataset.from_pandas(current_data_frame, data_definition=data_definition)

# Create and run a Data Drift Report.
# Preset parameters (for example the drift detection method) can be passed to
# DataDriftPreset(); see the preset reference for your installed version.
data_drift_report = Report([DataDriftPreset()])
result = data_drift_report.run(current_data=current_dataset, reference_data=reference_dataset)

# To display in a Jupyter notebook, evaluate `result` in a cell; to save to HTML:
# result.save_html("data_drift_report.html")

print("Data drift report generated successfully!")
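
To build intuition for what a drift check on a categorical column measures, here is a minimal standalone sketch of the Population Stability Index (PSI). It is an illustration only and not Evidently's implementation: Evidently chooses its own statistical tests per column. The function name `categorical_psi`, the `epsilon` floor, and the sample values are all hypothetical.

```python
import math
from collections import Counter


def categorical_psi(reference, current, epsilon=1e-6):
    """Population Stability Index between two samples of a categorical column."""
    ref_counts = Counter(reference)
    cur_counts = Counter(current)
    categories = set(ref_counts) | set(cur_counts)
    psi = 0.0
    for category in categories:
        # Floor each share at epsilon so a category missing from one sample
        # does not cause division by zero or log(0).
        ref_share = max(ref_counts[category] / len(reference), epsilon)
        cur_share = max(cur_counts[category] / len(current), epsilon)
        psi += (cur_share - ref_share) * math.log(cur_share / ref_share)
    return psi


# Hypothetical samples: one keeps the reference mix, the other shifts it.
reference_sample = ["Private"] * 70 + ["Self-emp"] * 20 + ["Gov"] * 10
same_mix = ["Private"] * 7 + ["Self-emp"] * 2 + ["Gov"] * 1
shifted_mix = ["Private"] * 3 + ["Self-emp"] * 2 + ["Gov"] * 5

psi_same = categorical_psi(reference_sample, same_mix)
psi_shifted = categorical_psi(reference_sample, shifted_mix)
print(f"same mix: {psi_same:.3f}, shifted mix: {psi_shifted:.3f}")
```

A sample with the same category proportions as the reference scores at or near zero, while a shifted mix scores noticeably higher. This mirrors the quickstart: because its current data is sampled from the reference, the report should flag little or no drift.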
