LightGBM

4.6.0 · active · verified Sun Apr 05

LightGBM (Light Gradient Boosting Machine) is an open-source gradient boosting framework developed by Microsoft. It uses tree-based learning algorithms and is designed for efficiency, scalability, and accuracy, particularly on large datasets. Two techniques account for much of its speed and low memory footprint: Gradient-based One-Side Sampling (GOSS), which keeps the training instances with large gradients and samples from the rest, and Exclusive Feature Bundling (EFB), which merges mutually exclusive sparse features to reduce the effective feature count. The library is actively maintained; the version documented here is 4.6.0.

Warnings

Install

Imports

Quickstart

This quickstart trains a binary classification model using LightGBM's scikit-learn compatible API (`LGBMClassifier`). It covers data preparation, model initialization, training with early stopping, prediction, and evaluation. LightGBM's native (non-scikit-learn) API uses `lgb.Dataset` and `lgb.train` instead.

import numpy as np
import lightgbm as lgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

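# Generate synthetic data with a fixed seed; the label depends on the first
# two features so the model has signal to learn (purely random labels give ~50% accuracy)
rng = np.random.default_rng(42)
X = rng.random((1000, 10))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)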

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train the LGBMClassifier via the scikit-learn API
model = lgb.LGBMClassifier(objective='binary', random_state=42)
model.fit(X_train, y_train,
          eval_set=[(X_test, y_test)],
          # Stop when the eval metric has not improved for 10 consecutive rounds.
          # Note: for an unbiased test score, use a separate validation split here.
          callbacks=[lgb.early_stopping(stopping_rounds=10)])

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy:.4f}")
