Implicit Collaborative Filtering
Implicit is a Python library that provides fast implementations of popular collaborative filtering recommendation algorithms for implicit-feedback datasets. It includes Alternating Least Squares (ALS), Bayesian Personalized Ranking (BPR), and several nearest-neighbour models. The library uses Cython, NumPy, and SciPy for performance, with optional GPU acceleration through CUDA. The current version is 0.7.2; new versions are released periodically, often every few months, and focus on performance, new features, and bug fixes.
Warnings
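- breaking The API for `implicit` underwent substantial breaking changes in v0.5.0. Notably, `fit()` now takes a users x items matrix (earlier releases took items x users), and `recommend()` returns a tuple of ID and score arrays rather than a list of pairs. Code written for versions prior to 0.5.0 will need to be rewritten.
- gotcha Model training methods (e.g., `model.fit()`) expect input matrices in `scipy.sparse.csr_matrix` format. Convert other sparse formats with `.tocsr()` before training to avoid conversion overhead and indexing surprises.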
- gotcha Using GPU acceleration requires specific setup, including installing the `implicit[gpu]` extra and having a compatible CUDA Toolkit installed and configured on your system.
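- gotcha On multi-core CPUs, the internal threading of BLAS/LAPACK libraries (like OpenBLAS or MKL) can compete with the library's own thread pool, leading to oversubscription and degraded performance. Limiting BLAS to a single thread (e.g., `OPENBLAS_NUM_THREADS=1` or `MKL_NUM_THREADS=1`) usually avoids this.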
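The CSR and threading notes above can be sketched as follows. This is a minimal illustration, not the library's prescribed setup: the toy arrays are made up, and it assumes the environment variables are set before NumPy or SciPy are first imported in the process, since BLAS typically reads them at load time.

```python
import os

# Assumption: set these before the first import of NumPy/SciPy so the BLAS
# library picks them up. setdefault keeps any value already chosen by the user.
os.environ.setdefault("OPENBLAS_NUM_THREADS", "1")
os.environ.setdefault("MKL_NUM_THREADS", "1")

import numpy as np
from scipy.sparse import coo_matrix

# Toy interactions (illustrative values only): 3 users x 3 items
rows = np.array([0, 0, 1, 2])   # user indices
cols = np.array([0, 2, 1, 2])   # item indices
vals = np.ones(4, dtype=np.float32)

# Data often arrives in COO (triplet) form; convert once, up front
interactions = coo_matrix((vals, (rows, cols)), shape=(3, 3))
user_items = interactions.tocsr()

print(user_items.format)  # "csr"
```

Converting once before calling `fit()` keeps the conversion cost out of the training path and makes row slicing such as `user_items[user_id]` cheap.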
Install
- pip install implicit
- pip install implicit[gpu]
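Because the API changed in v0.5.0, it can help to confirm which version actually got installed. The sketch below uses only the standard library; the helper names `parse_major_minor` and `uses_post_0_5_api` are hypothetical, and the parsing assumes a plain numeric version string such as `0.7.2`.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_major_minor(version_string):
    """Return (major, minor) as ints from a string such as '0.7.2'."""
    major, minor = version_string.split(".")[:2]
    return int(major), int(minor)


def uses_post_0_5_api(package="implicit"):
    """True or False for an installed package, None if it is not installed."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return None
    return parse_major_minor(installed) >= (0, 5)


print(uses_post_0_5_api())
```

A `None` result means the package is not visible to the current interpreter, which commonly points to a virtual environment mismatch.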
Imports
- AlternatingLeastSquares
from implicit.als import AlternatingLeastSquares
- CosineRecommender
from implicit.nearest_neighbours import CosineRecommender
- BM25Recommender
from implicit.nearest_neighbours import BM25Recommender
- FactorizationMachines
from implicit.fm import FactorizationMachines
from implicit.factorization_machines import FactorizationMachines
Quickstart
import numpy as np
from scipy.sparse import csr_matrix
from implicit.als import AlternatingLeastSquares
# Sample data: user-item interactions (user_id, item_id, strength)
data = np.array([1, 1, 1, 1, 1, 1])
rows = np.array([0, 0, 1, 1, 2, 2]) # User IDs
cols = np.array([0, 1, 1, 2, 0, 2]) # Item IDs
# Create a sparse user-item matrix (users x items)
# This is typically a CSR matrix for performance and compatibility.
user_items = csr_matrix((data, (rows, cols)), dtype=np.float32)
# Initialize and train the AlternatingLeastSquares model
model = AlternatingLeastSquares(factors=64, regularization=0.01, iterations=20, random_state=42)
model.fit(user_items) # Model expects user_items (users x items) matrix
# Recommend items for a specific user (e.g., user 0)
user_id = 0
# recommend() takes the user's row index and that user's row of the user-items matrix,
# which is used to filter out items the user has already interacted with.
recommended_items, scores = model.recommend(user_id, user_items[user_id])
print(f"Recommended items for user {user_id}: {recommended_items}")
print(f"Scores: {scores}")
# Get similar items for a specific item (e.g., item 0)
item_id = 0
similar_items, scores = model.similar_items(item_id)
print(f"Items similar to item {item_id}: {similar_items}")
print(f"Scores: {scores}")
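The quickstart uses integer IDs that already double as matrix indices. Real interaction logs usually carry arbitrary identifiers, so one possible approach is to map them to contiguous indices first and keep the label arrays for translating results back. The event data and the `to_item_labels` helper below are hypothetical, and the sketch relies only on NumPy and SciPy.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical raw log: (user name, item name, interaction strength)
events = [
    ("alice", "book", 3.0),
    ("alice", "lamp", 1.0),
    ("bob", "lamp", 2.0),
    ("carol", "book", 1.0),
    ("carol", "desk", 4.0),
    ("alice", "book", 2.0),  # repeated pair; csr_matrix sums duplicate entries
]

users, items, strength = zip(*events)

# np.unique returns the sorted unique labels and, for each event,
# the index of its label in that sorted array
user_labels, user_idx = np.unique(users, return_inverse=True)
item_labels, item_idx = np.unique(items, return_inverse=True)

user_items = csr_matrix(
    (np.asarray(strength, dtype=np.float32), (user_idx, item_idx)),
    shape=(len(user_labels), len(item_labels)),
)


def to_item_labels(ids):
    """Translate item indices (e.g. those returned by a model) back to labels."""
    return item_labels[np.asarray(ids)]


print(user_items.toarray())
print(to_item_labels([2, 0]))
```

The resulting `user_items` matrix can be passed to `model.fit()` in place of the hand-built one, and the index arrays returned by `recommend()` or `similar_items()` can be passed through `to_item_labels` for display.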