CTBoost

0.1.50 verified Sat May 09 auth: no python

CTBoost is a GPU-accelerated gradient boosting library that uses Conditional Inference Trees (CIT) as base learners, providing a statistically principled alternative to standard regression trees. The current version is 0.1.50, and new releases are frequent.

pip install ctboost
error ImportError: cannot import name 'CTBoostClassifier' from 'ctboost'
cause Attempting to import from a wrong path or an older version where the class had a different name.
fix Import the class with `from ctboost import CTBoostClassifier`. In versions <0.1.0, the class was named `CTBClassifier`. Upgrade to the latest release: `pip install --upgrade ctboost`
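Before changing imports, it can help to confirm which version is actually installed in the active environment. The following is a minimal sketch that uses only the standard library; `installed_version` is a hypothetical helper name, not part of CTBoost.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed ctboost version, or None when it is not installed.
print(installed_version("ctboost"))
```

If this prints `None`, the package is missing from the interpreter you are running, which usually points to a virtual environment mismatch rather than a renamed class.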
error ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
cause CTBoost does not handle missing or infinite values.
fix Clean your data: drop or impute NaN and infinite values before calling `.fit()`.
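One way to apply the "drop" option is to filter out every row that contains a non-finite value. This is a sketch using NumPy only; `drop_non_finite_rows` is a hypothetical helper, and dropping rows is just one possible strategy.

```python
import numpy as np


def drop_non_finite_rows(X, y):
    """Return copies of X and y without the rows of X that contain NaN or +/-inf."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    keep = np.isfinite(X).all(axis=1)  # True only for rows with no NaN/inf
    return X[keep], y[keep]


X = np.array([[1.0, 2.0], [np.nan, 3.0], [4.0, np.inf], [5.0, 6.0]])
y = np.array([0, 1, 0, 1])

X_clean, y_clean = drop_non_finite_rows(X, y)
print(X_clean.shape)  # (2, 2)
```

Dropping rows discards data, so imputation may be preferable when many rows are affected.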
error RuntimeError: CUDA is not available. Please install CuPy or set use_gpu=False.
cause `use_gpu` was set to `True` (or left at its auto-detecting default), but no CUDA-enabled GPU was found or CuPy is not installed.
fix Install CuPy with `pip install cupy-cuda12x` (choose the package that matches your CUDA version), or set `use_gpu=False` explicitly.
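To avoid this error on machines without a working GPU stack, the flag can be decided at runtime. The sketch below checks whether CuPy imports and reports at least one CUDA device; `cupy_cuda_available` is a hypothetical helper, and passing its result to `use_gpu` assumes the parameter behaves as described above.

```python
import importlib.util


def cupy_cuda_available() -> bool:
    """Best-effort check: True only if CuPy imports and sees at least one CUDA device."""
    if importlib.util.find_spec("cupy") is None:
        return False
    try:
        import cupy

        return cupy.cuda.runtime.getDeviceCount() > 0
    except Exception:
        # Covers a missing driver, missing CUDA libraries, or no visible device.
        return False


use_gpu = cupy_cuda_available()
print(f"use_gpu={use_gpu}")
# Assumed usage, based on the parameter named above:
# model = CTBoostClassifier(use_gpu=use_gpu)
```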
deprecated The parameter `verbose` was renamed to `verbosity` in version 0.1.40. `verbose` still works but raises a deprecation warning, and it will be removed in a future release.
fix Replace `verbose=True` with `verbosity=1` or higher.
breaking In version 0.1.35, the default value of `use_gpu` changed from `False` to `None`, where `None` means the GPU is auto-detected. Code that relied on `use_gpu=False` being the default may now unexpectedly attempt GPU execution on machines with NVIDIA GPUs.
fix Explicitly set `use_gpu=False` if CPU execution is required.
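When older code passes constructor arguments around as a dictionary, both of the changes above can be handled in one place. This is only an illustrative sketch: `migrate_ctboost_kwargs` is a hypothetical helper, and mapping `verbose=False` to `verbosity=0` is an assumption (the notes above only state that `verbose=True` corresponds to `verbosity=1` or higher).

```python
def migrate_ctboost_kwargs(kwargs, force_cpu=True):
    """Return a copy of kwargs that uses `verbosity` and an explicit `use_gpu`."""
    out = dict(kwargs)
    if "verbose" in out:
        old = out.pop("verbose")
        # Keep an explicit `verbosity` if the caller already set one.
        out.setdefault("verbosity", int(old))  # True -> 1, False -> 0 (assumed)
    if force_cpu:
        # Pin CPU execution unless the caller chose a value themselves.
        out.setdefault("use_gpu", False)
    return out


params = migrate_ctboost_kwargs({"n_estimators": 100, "verbose": True})
print(params)  # {'n_estimators': 100, 'verbosity': 1, 'use_gpu': False}
```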
gotcha CTBoost does not currently support missing values (NaN). Passing a DataFrame that contains NaN raises an error, so impute missing values before fitting.
fix Use `sklearn.impute.SimpleImputer` or a similar imputation strategy.
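A minimal sketch of the imputation step with scikit-learn's `SimpleImputer` follows. The median strategy and the toy arrays are illustrative choices, not a CTBoost recommendation. The imputer is fitted on the training split only, so no information from the test split leaks into it.

```python
import numpy as np
from sklearn.impute import SimpleImputer

X_train = np.array([[1.0, np.nan], [3.0, 4.0], [5.0, 8.0]])
X_test = np.array([[np.nan, 2.0]])

imputer = SimpleImputer(strategy="median")
X_train_imp = imputer.fit_transform(X_train)  # learn column medians from training data
X_test_imp = imputer.transform(X_test)        # reuse those medians on the test split

print(X_train_imp[0, 1])  # 6.0 (median of 4.0 and 8.0)
print(X_test_imp[0, 0])   # 3.0 (median of 1.0, 3.0 and 5.0)
```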
pip install ctboost[cuda]

Quickstart example with CTBoostClassifier on synthetic data, CPU mode.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from ctboost import CTBoostClassifier

# Build a synthetic binary classification problem and hold out 20% for testing.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# use_gpu=False pins CPU execution instead of relying on auto-detection.
model = CTBoostClassifier(n_estimators=100, learning_rate=0.1, use_gpu=False)
model.fit(X_train, y_train)
print(f"Accuracy: {model.score(X_test, y_test):.3f}")