Lifetimes
Lifetimes is a Python library for measuring customer lifetime value (CLV) using probabilistic models. It provides implementations of various statistical models like BG/NBD, Pareto/NBD, and Gamma-Gamma, along with utilities for data preparation, fitting, and prediction. The current version is 0.11.3, and it has a moderate release cadence with several updates per year.
Common errors
-
TypeError: fit() got an unexpected keyword argument 'n_custs'
cause: The `n_custs` parameter of `BetaGeoBetaBinomFitter.fit()` was renamed to `weights` in `lifetimes` version 0.10.0.
fix: Pass `weights` instead of `n_custs` in the call to `fit()`: `model.fit(..., weights=your_weights)`.
-
ValueError: q must be > 1.0
cause: Fitting `GammaGammaFitter` produced a `q` parameter less than or equal to 1, for which the model's mean is not finite. This typically happens with certain dataset distributions.
fix: Pass `q_constraint=True` when fitting. It is an argument of `fit()`, not of the constructor: `gmf = GammaGammaFitter()` followed by `gmf.fit(frequency, monetary_value, q_constraint=True)`. This enforces `q > 1` during the fitting process.
-
AttributeError: 'Series' object has no attribute 'keys'
cause: Code is using `OrderedDict`-specific methods (like `keys()`) on the `params_` attribute of a fitted model. Since version 0.11.0, `params_` is a `pandas.Series`.
fix: Access parameters through the `Series` interface: `model.params_.index` for the parameter names and `model.params_.values` for the values. Individual parameters can be read as `model.params_['r']` or `model.params_.r`.
-
KeyError: 'date'
cause: `summary_data_from_transaction_data` looks up the columns named by its `datetime_col` and `customer_id_col` arguments (here `'date'`) in the input DataFrame, and one of them was missing or misspelled.
fix: Make sure the input DataFrame has a transaction-date column (e.g., 'date') and a customer-identifier column (e.g., 'customer_id'), and pass their exact names as `datetime_col` and `customer_id_col` respectively. Also make sure the date column is of `datetime` type.
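The `KeyError` above can be caught before the library is ever called. The following is a minimal sketch that assumes nothing about `lifetimes` itself: `missing_columns` is a hypothetical helper (not part of the library), and the column names are illustrative. It accepts any iterable of names, so passing `df.columns` should work as well.

```python
def missing_columns(available, required):
    """Return the required column names that are absent, in the order given.

    `available` can be any iterable of names, e.g. `df.columns`.
    Hypothetical helper for illustration; not part of lifetimes.
    """
    present = set(available)
    return [name for name in required if name not in present]


# Column names as they might appear in a transaction log (illustrative).
columns = ["customer_id", "transaction_id", "order_date", "price"]

# Names we intend to pass as customer_id_col and datetime_col.
missing = missing_columns(columns, ["customer_id", "date"])
if missing:
    print(f"Missing columns: {missing}")
```

Here the date column is called `order_date`, so either rename it or pass `datetime_col='order_date'`.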
Warnings
- breaking Since version 0.11.0, the `params_` attribute on fitted models (e.g., `BetaGeoFitter.params_`) is a `pandas.Series` rather than an `OrderedDict`. Code that relies on `OrderedDict`-specific methods or behavior will break. For example, `values` is an attribute on a `Series`, so calling `params_.values()` fails.
- breaking In `BetaGeoBetaBinomFitter.fit()`, the parameter `n_custs` was renamed to `weights` in version 0.10.0 to align with other statistical libraries. Using `n_custs` will result in a `TypeError`.
- gotcha The `GammaGammaFitter` can produce an infinite mean for certain datasets if its `q` parameter converges to a value less than or equal to 1. This can lead to nonsensical CLV predictions.
- breaking The minimum required version for Pandas increased to `pandas>=0.24.0` as of `lifetimes` version 0.11.1. Older Pandas versions may cause unexpected errors or compatibility issues.
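To see why `q` matters in the `GammaGammaFitter` gotcha above: the Gamma-Gamma population mean spend is `p * v / (q - 1)`, which is finite and positive only when `q > 1`. The sketch below evaluates that expression for a few made-up parameter values (not fitted results). `population_mean` is a hypothetical helper written for this illustration, not a `lifetimes` function.

```python
import math


def population_mean(p, q, v):
    """Gamma-Gamma population mean spend, p * v / (q - 1).

    For q <= 1 the mean does not exist; plugging such a q into the raw
    formula would divide by zero or return a negative number, so this
    sketch reports infinity instead.
    """
    if q <= 1:
        return math.inf
    return p * v / (q - 1)


# Illustrative parameter values only; p and v are held fixed while q shrinks.
for q in (3.0, 1.5, 1.01, 0.9):
    print(f"q={q}: mean spend = {population_mean(6.0, q, 15.0):.1f}")
```

The mean grows without bound as `q` approaches 1 from above, which is why a fit that lands near or below 1 yields unusable CLV figures.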
Install
-
pip install lifetimes
Imports
- BetaGeoFitter
from lifetimes import BetaGeoFitter
- GammaGammaFitter
from lifetimes import GammaGammaFitter
- ParetoNBDFitter
from lifetimes import ParetoNBDFitter
- summary_data_from_transaction_data
from lifetimes.utils import summary_data_from_transaction_data
- plot_period_transactions
from lifetimes.plotting import plot_period_transactions
Quickstart
import pandas as pd
from lifetimes import BetaGeoFitter
from lifetimes.utils import summary_data_from_transaction_data
# Sample transaction data (replace with your actual data)
transactions = pd.DataFrame({
    'customer_id': ['A', 'A', 'B', 'B', 'C', 'D'],
    'transaction_id': [1, 2, 3, 4, 5, 6],
    'date': pd.to_datetime(['2023-01-01', '2023-01-15', '2023-02-01', '2023-02-10', '2023-03-01', '2023-01-05']),
    'price': [10.0, 20.0, 15.0, 25.0, 30.0, 5.0]
})
# Convert transaction data to RFM (Recency, Frequency, Monetary) format
# The observation_period_end can be adjusted to your data's last date
rfm_data = summary_data_from_transaction_data(
    transactions,
    customer_id_col='customer_id',
    datetime_col='date',
    observation_period_end=pd.to_datetime('2023-03-31')
)
print("RFM Data:\n", rfm_data.head())
# Initialize and fit the BetaGeoFitter model
bgf = BetaGeoFitter(penalizer_coef=0.1) # Add a penalizer for stability
bgf.fit(rfm_data['frequency'], rfm_data['recency'], rfm_data['T'])
print("\nModel Parameters:\n", bgf.params_)
# Predict future purchases for the next 7 periods
# (e.g., if T is in days, this is 7 days)
prediction_days = 7
predicted_purchases = bgf.predict(
    prediction_days, rfm_data['frequency'], rfm_data['recency'], rfm_data['T']
)
print(f"\nPredicted purchases in the next {prediction_days} days:\n", predicted_purchases.head())
# Calculate customer probability of being 'alive'
# This is useful for understanding customer churn/retention
alive_prob = bgf.conditional_probability_of_being_alive(
    rfm_data['frequency'], rfm_data['recency'], rfm_data['T']
)
print("\nProbability of being alive:\n", alive_prob.head())
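To make the `frequency`, `recency`, and `T` columns used above less opaque, here is a plain-Python sketch of how they can be derived by hand for the same sample transactions. It assumes daily granularity and the usual definitions: `frequency` is the number of distinct purchase days after the first one, `recency` is the number of days between the first and last purchase, and `T` is the number of days between the first purchase and the end of the observation period. `rfm_summary` is a hypothetical helper for illustration, not the library's implementation.

```python
from datetime import date

# Same sample transactions as in the quickstart: (customer_id, purchase date).
transactions = [
    ("A", date(2023, 1, 1)), ("A", date(2023, 1, 15)),
    ("B", date(2023, 2, 1)), ("B", date(2023, 2, 10)),
    ("C", date(2023, 3, 1)), ("D", date(2023, 1, 5)),
]
observation_period_end = date(2023, 3, 31)


def rfm_summary(transactions, period_end):
    """Compute frequency, recency and T per customer, measured in days."""
    # Collect the distinct purchase days of each customer inside the window.
    purchase_days = {}
    for customer, day in transactions:
        if day <= period_end:
            purchase_days.setdefault(customer, set()).add(day)

    summary = {}
    for customer, days in purchase_days.items():
        first, last = min(days), max(days)
        summary[customer] = {
            "frequency": len(days) - 1,      # repeat purchase days only
            "recency": (last - first).days,  # age at the last purchase
            "T": (period_end - first).days,  # age at the end of observation
        }
    return summary


summary = rfm_summary(transactions, observation_period_end)
for customer in sorted(summary):
    print(customer, summary[customer])
```

Customers C and D each bought once, so their `frequency` and `recency` are both 0. With only two repeat buyers in the sample, the fitted parameters in the quickstart should be read as a smoke test rather than a meaningful estimate.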