{"id":3763,"library":"pyod","title":"PyOD","description":"PyOD is a comprehensive and scalable Python library for outlier detection (anomaly detection), offering over 50 detection models. It provides a unified API, making it easy to use and compare various algorithms. The library is currently at version 2.1.0, with frequent minor releases addressing compatibility and adding new features, including recent advancements in multi-modal anomaly detection using foundation model embeddings.","status":"active","version":"2.1.0","language":"en","source_language":"en","source_url":"https://github.com/yzhao062/pyod","tags":["anomaly-detection","outlier-detection","machine-learning","deep-learning","pytorch","embeddings","nlp","computer-vision"],"install":[{"cmd":"pip install pyod","lang":"bash","label":"Basic Installation"},{"cmd":"pip install pyod[text,image]","lang":"bash","label":"For EmbeddingOD (v2.1.0+)"}],"dependencies":[{"reason":"Core machine learning utilities and base estimators.","package":"scikit-learn"},{"reason":"Required for deep learning-based models, replacing TensorFlow since v2.0.2.","package":"torch"},{"reason":"Optional, required for text embeddings in EmbeddingOD (v2.1.0+).","package":"sentence-transformers","optional":true},{"reason":"Optional, for HuggingFace model embeddings in EmbeddingOD (v2.1.0+).","package":"transformers","optional":true},{"reason":"Optional, for OpenAI model embeddings in EmbeddingOD (v2.1.0+).","package":"openai","optional":true}],"imports":[{"symbol":"KNN","correct":"from pyod.models.knn import KNN"},{"symbol":"generate_data","correct":"from pyod.utils.data import generate_data"},{"note":"New in v2.1.0 for multi-modal anomaly detection, requires additional dependencies like `sentence-transformers`.","symbol":"EmbeddingOD","correct":"from pyod.models.embedding_od import EmbeddingOD"}],"quickstart":{"code":"from pyod.models.knn import KNN\nfrom pyod.utils.data import generate_data\nimport numpy as np\n\n# Generate random data 
with 10% outliers\nX_train, y_train = generate_data(n_train=200, n_features=2, contamination=0.1, train_only=True, random_state=42)\n\n# Initialize and train a kNN detector\nclf = KNN(contamination=0.1)  # Set contamination to the expected outlier ratio\nclf.fit(X_train)\n\n# Get the prediction labels (0: inliers, 1: outliers)\ny_train_pred = clf.labels_\n\n# Get the raw outlier scores (higher means more anomalous)\ny_train_scores = clf.decision_scores_\n\nprint(f\"Number of training samples: {len(X_train)}\")\nprint(f\"Number of predicted outliers: {np.count_nonzero(y_train_pred)}\")","lang":"python","description":"This quickstart generates synthetic data and uses PyOD's k-Nearest Neighbors (KNN) detector to find outliers. It covers the basic steps: initializing the detector, fitting it, and retrieving binary outlier labels and raw anomaly scores."},"warnings":[{"fix":"If you relied on TensorFlow-based models, either pin your PyOD version to <2.0.2 or refactor your code to use the PyTorch-based models available in PyOD v2.0.2 and later.","message":"PyOD removed all TensorFlow and Keras code, migrating deep learning models entirely to PyTorch.","severity":"breaking","affected_versions":"<2.0.2"},{"fix":"Existing VAE models trained with older PyOD versions may produce different results. Consider re-training models or explicitly setting parameters like `output_activation` to maintain backward compatibility if migrating.","message":"Default parameters for some models, notably VAE, have changed (e.g., output activation for VAE to `identity`).","severity":"breaking","affected_versions":"<2.0.7"},{"fix":"`contamination` is the expected proportion of outliers in the data. It sets the threshold used to produce binary labels (`predict` and `labels_`). If the true proportion is unknown, validate the chosen value carefully, or work with `decision_scores_` or `predict_proba` output instead of hard labels. 
Setting it too high or too low can lead to misclassifications.","message":"The `contamination` parameter is crucial and can significantly impact detection results and thresholds, especially when not set accurately.","severity":"gotcha","affected_versions":"All versions"},{"fix":"It's generally recommended to keep PyOD updated, or ensure compatibility between your `pyod` and `scikit-learn` versions to avoid unexpected behavior, especially after major `scikit-learn` releases.","message":"PyOD frequently updates its internal dependencies and sometimes makes adjustments for `scikit-learn` breaking changes.","severity":"gotcha","affected_versions":"All versions, particularly pre-2.0.6 with newer scikit-learn"},{"fix":"Install the necessary optional dependencies using `pip install pyod[text]` or `pip install pyod[image]` or individual packages as required by your chosen embedding model (e.g., `pip install sentence-transformers`).","message":"The new `EmbeddingOD` framework (v2.1.0+) requires additional, potentially large, third-party libraries (e.g., `sentence-transformers`, `openai`, `transformers`) that are not installed by default.","severity":"gotcha","affected_versions":"2.1.0+"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}