scikit-multilearn

0.2.0 verified Mon Apr 27 auth: no python maintenance

A BSD-licensed library for multi-label classification built on top of scikit-learn. The current version is 0.2.0, released on PyPI in 2018. The project appears to be in maintenance mode, with no releases since then.

pip install scikit-multilearn
error ModuleNotFoundError: No module named 'skmultilearn'
cause The package is not installed in the active environment, or the import uses the wrong module name.
fix
Run 'pip install scikit-multilearn' then use 'import skmultilearn'.
error TypeError: __init__() got an unexpected keyword argument 'require_dense'
cause Using a version of scikit-multilearn older than 0.2.0 that predates the require_dense parameter.
fix
Upgrade to 0.2.0: 'pip install scikit-multilearn==0.2.0'.
error ValueError: Input data is not a dense numpy array or a sparse scipy array
cause Passing a Python list or another unsupported container where a dense NumPy array or SciPy sparse matrix is expected.
fix
Convert the input to a NumPy array first: 'X = np.array(X)'.
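The conversion in the last fix can be wrapped in a small helper so that both X and Y are normalised the same way before fitting. This is only a sketch: `to_supported_matrix` is a hypothetical name, not part of scikit-multilearn, and it assumes the library accepts either a dense 2-D NumPy array or a SciPy sparse matrix, as the error message suggests.

```python
import numpy as np
from scipy import sparse


def to_supported_matrix(data):
    """Return data as a dense NumPy array or a SciPy CSR matrix (hypothetical helper)."""
    if sparse.issparse(data):
        # Already sparse: normalise to CSR, which supports efficient row slicing.
        return sparse.csr_matrix(data)
    # Lists, tuples, and other array-likes become a dense array.
    array = np.asarray(data)
    if array.ndim != 2:
        raise ValueError("expected a 2-D samples-by-features (or labels) matrix")
    return array


# Plain Python lists are converted before they reach the classifier.
X = to_supported_matrix([[0.1, 2.0], [1.5, 0.3], [0.7, 0.9]])
Y = to_supported_matrix([[1, 0], [0, 1], [1, 1]])
print(type(X).__name__, X.shape, Y.shape)
```

Using `np.asarray` rather than `np.array` avoids a copy when the input is already an array.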
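deprecated The library has had no release since 2018. No 'v2' release is confirmed here, and 'skmultilearn' is this package's import name rather than a separate project, so evaluate maintained forks or alternative multi-label libraries before starting new work.
fix Check for actively maintained forks; otherwise pin the last release with 'pip install scikit-multilearn==0.2.0', accepting that it is old.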
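gotcha Adapters such as BinaryRelevance take a `require_dense` parameter, a pair of booleans: the first states whether the base classifier needs a dense feature matrix, the second whether it needs dense labels. If a base classifier that expects dense arrays receives sparse input, the resulting errors can be confusing.
fix Pass `require_dense=[True, True]` explicitly to adapters like BinaryRelevance when the base classifier is a scikit-learn estimator.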
gotcha The module name in imports is 'skmultilearn', not 'scikit_multilearn' or 'sklearn_multilearn'.
fix Use 'import skmultilearn' or 'from skmultilearn import ...'.
breaking Version 0.2.0 dropped Python 2 support; Python 3.5+ required.
fix Upgrade to Python 3.5 or later.
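Several of the entries above (wrong import name, outdated version, unsupported Python) can be checked in one pass before debugging further. The sketch below uses only the standard library; `check_environment` is a hypothetical helper name, and the version lookup assumes Python 3.8 or later, where `importlib.metadata` is available.

```python
import importlib.util
import sys


def check_environment():
    """Collect basic install facts for scikit-multilearn (hypothetical helper)."""
    report = {
        # The Python floor stated in the notes above.
        "python_ok": sys.version_info >= (3, 5),
        # The import name is 'skmultilearn', not the PyPI name.
        "importable": importlib.util.find_spec("skmultilearn") is not None,
        "version": None,
    }
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # importlib.metadata is missing on Python < 3.8; leave version unset.
        return report
    try:
        # Installed versions are looked up by the PyPI distribution name.
        report["version"] = version("scikit-multilearn")
    except PackageNotFoundError:
        pass
    return report


print(check_environment())
```

If `importable` is False or `version` is None, the package is not installed in the interpreter that ran the check, which is the usual cause of the ModuleNotFoundError listed earlier.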

Quick example using BinaryRelevance with a RandomForest base classifier.

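from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from skmultilearn.problem_transform import BinaryRelevance

# Synthetic data: 100 samples, 20 features, 5 labels; Y is a 0/1 indicator matrix.
X, Y = make_multilabel_classification(n_samples=100, n_features=20, n_classes=5, random_state=42)

# One RandomForest is trained per label; require_dense=[True, True] hands it dense X and y.
classifier = BinaryRelevance(classifier=RandomForestClassifier(random_state=42), require_dense=[True, True])
classifier.fit(X, Y)

# Predict on the training data; the result has one column per label.
predictions = classifier.predict(X)
print(predictions.shape)  # (100, 5)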
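A common follow-up step is comparing `predictions` with the true labels. The sketch below assumes that `predict` returns a SciPy sparse matrix rather than a NumPy array, which is the behaviour to expect from 0.2.0 but is worth confirming in your install. To stay runnable without the library, it uses small hand-built stand-ins for `Y` and `predictions`; the values are illustrative only.

```python
import numpy as np
from scipy import sparse

# Stand-in for the true labels Y (3 samples, 4 labels).
Y_true = np.array([[1, 0, 1, 0],
                   [0, 1, 0, 0],
                   [1, 1, 0, 1]])

# Stand-in for classifier.predict(X): a sparse 0/1 indicator matrix.
predictions = sparse.csr_matrix([[1, 0, 0, 0],
                                 [0, 1, 0, 0],
                                 [1, 1, 0, 1]])

# Densify only when needed, so the same code also accepts a plain array.
if sparse.issparse(predictions):
    Y_pred = predictions.toarray()
else:
    Y_pred = np.asarray(predictions)

# Hamming loss: the fraction of individual label slots predicted incorrectly.
hamming = float(np.mean(Y_pred != Y_true))
print(Y_pred.shape, hamming)
```

Densifying is fine for small label matrices like this one. For very large label spaces, keeping the result sparse and using metrics that accept sparse input may be preferable.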