Rusket
Rusket is a Python library for building recommender engines (collaborative filtering) and performing market basket analysis (association rules). Its core computational logic is implemented in Rust, which is intended to give it a performance advantage over pure-Python implementations, especially on large datasets. The current version is 0.1.90, and the library is under active development.
Common errors
-
ModuleNotFoundError: No module named 'rusket'
cause The `rusket` library is not installed in your current Python environment.
fix Ensure you have activated the correct virtual environment (if any) and run `pip install rusket`.
-
ValueError: transaction_col 'wrong_transaction_id' not found in DataFrame.
cause The column name specified for `transaction_col` (or `item_col`, `user_col`) does not exist in the input pandas DataFrame.
fix Verify that the column names in your DataFrame exactly match the strings passed to the `transaction_col`, `item_col`, or `user_col` arguments in the `fit` method. Use `df.columns` to inspect available columns.
-
No association rules found. Try lowering min_support or min_confidence.
cause This is not an error but a common message when `MarketBasketAnalyzer.get_rules()` returns an empty DataFrame. It indicates that no itemsets met the specified `min_support` and `min_confidence` thresholds in your dataset, likely due to a sparse dataset or too-high thresholds.
fix Reduce the values for `min_support` and/or `min_confidence` when initializing `MarketBasketAnalyzer`. For very small datasets, these values might need to be very low (e.g., 0.01 for both).
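To see why thresholds can filter out every rule, it can help to compute support and confidence by hand. The sketch below uses the standard definitions (support = share of transactions containing both items, confidence = pair count divided by antecedent count) on the baskets from the Quickstart sample data. It is plain Python, covers item pairs only, and does not reflect how Rusket computes rules internally; `pair_rules` is a hypothetical helper, not part of the library.

```python
from collections import Counter
from itertools import combinations

# Baskets taken from the Quickstart sample data (one set per transaction).
baskets = [
    {"Apple", "Banana", "Orange"},
    {"Apple", "Grape"},
    {"Banana", "Orange"},
    {"Apple", "Banana", "Grape"},
    {"Milk", "Bread", "Eggs"},
    {"Milk", "Cheese", "Butter"},
]


def pair_rules(baskets, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    Hypothetical helper for illustration only; not Rusket's algorithm.
    """
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


loose = pair_rules(baskets, min_support=0.01, min_confidence=0.01)
strict = pair_rules(baskets, min_support=0.5, min_confidence=0.8)
print(len(loose), "pair rules with loose thresholds")
print(len(strict), "pair rules with strict thresholds")
```

In this sample no pair appears in more than 2 of the 6 baskets, so any `min_support` above roughly 0.33 leaves nothing to report, regardless of `min_confidence`.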
Warnings
- breaking As Rusket is currently in early development (version 0.1.x), its API is subject to change without strict backward compatibility guarantees. Expect potential breaking changes between minor versions (e.g., 0.1.x to 0.2.x) until a stable 1.0 release.
- gotcha Rusket requires input data as a pandas DataFrame with explicitly named columns for transaction/user and item IDs. Incorrect column names or data types will lead to errors.
- gotcha While Rusket provides Pythonic interfaces, its core logic is implemented in Rust. If you encounter unexpected performance issues or unhandled exceptions that aren't typical Python errors, it might be related to the underlying Rust implementation. Error messages might sometimes be less verbose than pure Python errors.
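Because wrong column names are a common source of errors, a quick pre-flight check before calling `fit` can produce a clearer message than the library might. The following is a minimal sketch: `find_missing_columns` is a hypothetical helper (not part of Rusket) that works on any iterable of column names, so it is shown here with a plain list.

```python
def find_missing_columns(available, required):
    """Return the required column names that are absent from `available`.

    Hypothetical helper for illustration; not part of Rusket.
    """
    available = set(available)
    return [name for name in required if name not in available]


# Column names as they would appear in df.columns (sample values).
columns = ["transaction_id", "item_id"]

missing = find_missing_columns(columns, ["wrong_transaction_id", "item_id"])
if missing:
    print(f"Missing columns: {missing}; available: {columns}")
```

With a pandas DataFrame, `df.columns` can be passed directly as `available`. The comparison is exact, so differences in case or stray whitespace count as a mismatch.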
Install
-
pip install rusket
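If the import still fails after installing, the package may have been installed into a different interpreter than the one running your code. The sketch below uses only the standard library to report whether a package is visible to the current interpreter; `describe_install` is a hypothetical helper name, and it assumes the distribution name matches the import name (`rusket`).

```python
import importlib.util
import sys
from importlib import metadata


def describe_install(package):
    """Report whether `package` is importable by the running interpreter."""
    spec = importlib.util.find_spec(package)
    if spec is None:
        return f"{package} is not importable from {sys.executable}"
    try:
        # Looks up installed distribution metadata by name.
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = "unknown"
    return f"{package} {version} is importable from {sys.executable}"


print(describe_install("rusket"))
```

If the reported interpreter path is not the one inside your virtual environment, running `python -m pip install rusket` with that same interpreter installs the package where it will be found.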
Imports
- Recommender
from rusket import Recommender
- MarketBasketAnalyzer
from rusket import MarketBasketAnalyzer
Quickstart
import pandas as pd
from rusket import MarketBasketAnalyzer
# Sample transactional data for Market Basket Analysis
data = {
'transaction_id': [1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6],
'item_id': ['Apple', 'Banana', 'Orange', 'Apple', 'Grape', 'Banana', 'Orange', 'Apple', 'Banana', 'Grape', 'Milk', 'Bread', 'Eggs', 'Milk', 'Cheese', 'Butter']
}
df = pd.DataFrame(data)
# Initialize and fit the Market Basket Analyzer
# Lower min_support/min_confidence for small sample data to ensure rules are found
mba = MarketBasketAnalyzer(min_support=0.01, min_confidence=0.01)
mba.fit(df, transaction_col='transaction_id', item_col='item_id')
# Get association rules
rules = mba.get_rules()
print("Generated Association Rules (first 5):")
if not rules.empty:
    print(rules.head())
else:
    print("No rules found. Try adjusting min_support or min_confidence.")