Machine Learning Library Extensions (mlxtend)
mlxtend is a Python library of tools for everyday machine learning tasks, including feature selection, frequent pattern mining, model stacking, and plotting utilities. It builds on scikit-learn, NumPy, and pandas. The current version is 0.24.0, and releases are frequent, mostly bug fixes and compatibility updates for these core dependencies.
Warnings
- breaking mlxtend releases regularly track API changes in `scikit-learn` and `pandas`. Older mlxtend versions may not work correctly with the latest releases of these core dependencies, leading to API errors or unexpected behavior. For example, `scikit-learn`'s `set_output` API and the removal of the `normalize` parameter from `LinearRegression` both required updates in mlxtend.
- breaking NumPy has removed several deprecated type aliases: `np.float`, `np.int`, and `np.bool` were removed in NumPy 1.24, and `np.float_` in NumPy 2.0. mlxtend versions prior to v0.23.4 might use such aliases, causing `AttributeError` in newer NumPy environments.
- breaking Python 3.12 removed the `distutils` package from the standard library. Older mlxtend versions that depend on `distutils` might fail to install or run on Python 3.12 and above.
- breaking The `meta_features` handling in `StackingCVClassifier` and `StackingCVRegressor` was modified to ensure compatibility with `scikit-learn` versions 1.4 and above.
- gotcha `association_rules` received fixes and behavior changes in recent versions. Code that relies on specific older behavior, or that runs into problems generating rules, should be re-tested after upgrading.
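Most of the warnings above depend on which versions are installed side by side, so it can help to print them before debugging. The following is a minimal sketch using only the standard library. The helper name `installed_versions` is hypothetical, and the package list is an assumption based on the dependencies named in this document.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Assumed list of distributions relevant to the warnings above.
versions = installed_versions(["mlxtend", "scikit-learn", "numpy", "pandas"])

print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
for name, ver in versions.items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```

If an error matches one of the warnings, comparing this output against the mlxtend release notes is usually quicker than reading the traceback alone.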
Install
pip install mlxtend
Imports
- StackingClassifier
from mlxtend.classifier import StackingClassifier
- SequentialFeatureSelector
from mlxtend.feature_selection import SequentialFeatureSelector
- plot_decision_regions
from mlxtend.plotting import plot_decision_regions
- TransactionEncoder
from mlxtend.preprocessing import TransactionEncoder
- apriori
from mlxtend.frequent_patterns import apriori
- association_rules
from mlxtend.frequent_patterns import association_rules
Quickstart
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from mlxtend.classifier import StackingClassifier
# Generate a synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10, n_redundant=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Initialize base classifiers
clf1 = DecisionTreeClassifier(random_state=42)
clf2 = LogisticRegression(random_state=42, solver='liblinear')
# Initialize meta-classifier
lr = LogisticRegression(random_state=42, solver='liblinear')
# Initialize StackingClassifier
sclf = StackingClassifier(classifiers=[clf1, clf2], meta_classifier=lr, use_probas=True, verbose=0)
# Train and evaluate
sclf.fit(X_train, y_train)
score = sclf.score(X_test, y_test)
print(f"Stacking Classifier Test Accuracy: {score:.4f}")