PyOD

2.1.0 · active · verified Sat Apr 11

PyOD is a comprehensive and scalable Python library for outlier (anomaly) detection. It offers more than 50 detection models behind a unified API, which makes it easy to swap and compare algorithms. The current version is 2.1.0. Minor releases are frequent and typically address compatibility or add features, most recently multi-modal anomaly detection based on foundation model embeddings.

Warnings

Install

Imports

Quickstart

This quickstart generates synthetic data and uses PyOD's k-Nearest Neighbors (KNN) detector to find outliers in it. It covers the basic steps: initializing the detector, fitting it, and retrieving the binary outlier labels and raw anomaly scores for the training data.

from pyod.models.knn import KNN
from pyod.utils.data import generate_data
import numpy as np

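# Generate 200 training samples, 10% of them outliers (20 points).
# train_only=True returns just (X_train, y_train) instead of a train/test split.
X_train, y_train = generate_data(n_train=200, n_features=2, contamination=0.1, train_only=True, random_state=42)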

# Initialize and train a kNN detector
clf = KNN(contamination=0.1) # Set contamination based on expected outlier ratio
clf.fit(X_train)

# Get the prediction labels (0: inliers, 1: outliers)
y_train_pred = clf.labels_

# Get the raw outlier scores
y_train_scores = clf.decision_scores_

print(f"Number of training samples: {len(X_train)}")
print(f"Number of predicted outliers: {np.count_nonzero(y_train_pred)}")
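
To make the two outputs above more concrete, here is a minimal dependency-free sketch of the idea behind a kNN detector: each point is scored by its distance to its k-th nearest neighbor, and the highest-scoring fraction (the contamination) is labeled as outliers. This is an illustration under simplifying assumptions, not PyOD's implementation. The function names knn_scores and label_top_fraction, the brute-force neighbor search, and the rank-based cutoff are all hypothetical choices made for this example.

```python
import math
import random


def knn_scores(points, k=5):
    """Score each point by the Euclidean distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        # Brute-force distances to every other point (fine for small data).
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def label_top_fraction(scores, contamination=0.1):
    """Label the highest-scoring fraction of points 1 (outlier), the rest 0."""
    n_flag = round(contamination * len(scores))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:n_flag])
    return [1 if i in flagged else 0 for i in range(len(scores))]


rng = random.Random(42)

# 180 inliers clustered around the origin.
inliers = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(180)]

# 20 outliers spread evenly on a distant circle of radius 50.
outliers = [
    (50 * math.cos(2 * math.pi * t / 20), 50 * math.sin(2 * math.pi * t / 20))
    for t in range(20)
]

points = inliers + outliers
scores = knn_scores(points, k=5)                      # analogous to decision_scores_
labels = label_top_fraction(scores, contamination=0.1)  # analogous to labels_

print(f"Flagged {sum(labels)} of {len(points)} points as outliers")
```

In this sketch the contamination value only decides how many of the highest scores are flagged; it does not change the scores themselves. That is why choosing it close to the expected outlier ratio matters when reading the binary labels.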
