Automatic Piecewise Linear Regression

10.22.0 · active · verified Thu Apr 16

APLR (Automatic Piecewise Linear Regression) is a Python library for building predictive, interpretable machine learning models for regression and classification. It often achieves predictive accuracy comparable to tree-based methods while producing smoother, more interpretable predictions. The library is actively maintained with frequent releases; the current version is 10.22.0, and it supports Python 3.8 and above.
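
To make "piecewise linear" concrete, here is a minimal sketch of a one-dimensional piecewise linear function written as an intercept plus hinge terms of the form max(x - knot, 0). It illustrates the general idea only: it does not use the `aplr` package and should not be read as APLR's actual fitting algorithm. The names `hinge` and `piecewise_linear`, and the knots and coefficients, are hypothetical and chosen purely for illustration.

```python
def hinge(x, knot):
    """Return max(x - knot, 0): zero left of the knot, linear to the right."""
    return max(x - knot, 0.0)


def piecewise_linear(x, intercept, terms):
    """Evaluate intercept + sum(coef * hinge(x, knot)) over (knot, coef) pairs."""
    return intercept + sum(coef * hinge(x, knot) for knot, coef in terms)


# Hypothetical model (assumption, not taken from APLR):
# slope becomes 2 at x = 0, then changes by -3 at x = 1.
terms = [(0.0, 2.0), (1.0, -3.0)]
values = [piecewise_linear(x, 1.0, terms) for x in (-1.0, 0.0, 1.0, 2.0)]
print(values)
```

Each knot marks a point where the slope changes, which is why a model of this shape stays continuous, unlike the step-shaped predictions of a single decision tree.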

Common errors

Warnings

Install

pip install aplr

Imports

from aplr import APLRRegressor

Quickstart

This quickstart shows how to initialize, train, and make predictions with an `APLRRegressor` model on synthetic data. It includes a categorical feature to highlight APLR's automatic preprocessing of `pandas.DataFrame` inputs. The `validation_ratio` parameter, which supersedes `cv_folds`, is used for faster internal hyperparameter tuning.

import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from aplr import APLRRegressor

# Generate synthetic data
X, y = make_regression(n_samples=1000, n_features=10, n_informative=5, random_state=42)
X_df = pd.DataFrame(X, columns=[f'feature_{i}' for i in range(X.shape[1])])
y_series = pd.Series(y)

# Add a categorical feature to exercise APLR's auto-preprocessing
# (seeded generator so the example is reproducible)
rng = np.random.default_rng(42)
X_df['categorical_feature'] = rng.choice(['A', 'B', 'C'], size=len(X_df))

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_df, y_series, test_size=0.2, random_state=42)

# Initialize and train the APLRRegressor model
# preprocess=True (default) enables automatic handling of categorical features and missing values
model = APLRRegressor(random_state=42, m=2000, n_jobs=-1, validation_ratio=0.1)
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model (e.g., using R-squared from scikit-learn)
from sklearn.metrics import r2_score
r2 = r2_score(y_test, y_pred)
print(f"R-squared: {r2:.3f}")
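
The R-squared score printed by the quickstart is one minus the ratio of the residual sum of squares to the total sum of squares. The following pure-Python sketch shows that arithmetic on made-up toy values; the function name `r_squared` is hypothetical, and for a single target with default settings scikit-learn's `r2_score` computes the same quantity.

```python
def r_squared(y_true, y_pred):
    """Compute 1 - SS_res / SS_tot for two equal-length sequences."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared errors of the predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviations from the mean of y_true
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values (illustrative only, not output from any APLR model)
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 2.0]
score = r_squared(y_true, y_pred)
print(f"R-squared: {score:.3f}")
```

A score of 1 means the predictions match the targets exactly, 0 means they do no better than always predicting the mean, and negative values mean they do worse than that baseline.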
