TensorFlow Recommenders
TensorFlow Recommenders (TFRS) is a library for building recommender system models using TensorFlow. It helps with the full workflow of building a recommender system: data preparation, model formulation, training, evaluation, and deployment. It's built on Keras and aims to have a gentle learning curve while still giving you the flexibility to build complex models.
Common errors
- ImportError: DLL load failed while importing _pywrap_tensorflow_internal: A dynamic link library (DLL) initialization routine failed.
  cause: This error typically indicates a problem with the TensorFlow installation itself, such as missing Visual C++ Redistributable packages on Windows, an incompatible Python version, or a corrupted environment.
  fix: Ensure TensorFlow is correctly installed and compatible with your system. Update the Visual C++ Redistributables (on Windows), or reinstall TensorFlow in a clean virtual environment, possibly using the CPU-only build if GPU issues are suspected. Verify your Python version meets TensorFlow's requirements.
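One way to apply the fix above is to recreate a clean virtual environment and reinstall from scratch. A minimal sketch (the environment name `tfrs-env` is just an example):

```shell
# Create a fresh virtual environment so a corrupted install can't interfere.
python3 -m venv tfrs-env
# Activate it: tfrs-env\Scripts\activate on Windows, or on macOS/Linux:
. tfrs-env/bin/activate
# Reinstall TensorFlow (and TFRS) inside the clean environment.
python -m pip install --upgrade pip
python -m pip install tensorflow tensorflow-recommenders
# Sanity-check that TensorFlow imports before going further.
python -c "import tensorflow as tf; print(tf.__version__)"
```

If the import still fails on Windows, installing the latest Microsoft Visual C++ Redistributable is the usual next step.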
- ValueError: The candidates dataset must produce (id, embedding) pairs or just embeddings for indexing.
  cause: This usually happens when calling `index_from_dataset` on a `tfrs.layers.factorized_top_k.TopK` layer (e.g., `BruteForce`) with a dataset that does not yield the expected format (e.g., raw movie titles without their embeddings, or embeddings without their corresponding IDs).
  fix: Map the `candidates` dataset passed to `index_from_dataset` so that each element is an `(identifier, embedding)` tuple. For example, `dataset.map(lambda title: (title, movie_model(title)))`.
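The mapping the fix describes can be sketched in isolation. Here the candidate tower is a hypothetical stand-in (any callable mapping an id string to a fixed-size embedding), not a trained model:

```python
import tensorflow as tf

# Hypothetical stand-in for a trained candidate tower: string id -> embedding.
embed = tf.keras.Sequential([
    tf.keras.layers.StringLookup(vocabulary=["a", "b", "c"], mask_token=None),
    tf.keras.layers.Embedding(5, 4),
])

titles = tf.data.Dataset.from_tensor_slices(["a", "b", "c"])

# Wrong: titles.batch(2) alone yields raw strings, which triggers the error.
# Right: yield (identifier, embedding) pairs per batch.
candidates = titles.batch(2).map(lambda t: (t, embed(t)))

for ids, embs in candidates.take(1):
    print(ids.shape, embs.shape)  # ids: (2,), embeddings: (2, 4)
```

A dataset in this shape can then be passed to `index_from_dataset` on the top-k layer.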
- TypeError: 'AutoTrackable' object is not callable
  cause: This error often occurs when calling a TensorFlow 2.x Keras model or layer that was not properly built or initialized, or when loading a TF1 Hub format model with TF2 `hub.load()`.
  fix: Ensure your Keras model or layer is properly defined and has had its `build()` method called (implicitly, by calling it on data, or explicitly). If loading from TF Hub, verify it is a TF2 SavedModel, or handle TF1 Hub models according to the migration guides.
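The "build by calling on data" part of the fix can be shown with a plain Keras layer; a layer's weights are created the first time it sees an input:

```python
import tensorflow as tf

layer = tf.keras.layers.Dense(8)
# Before this call, layer.built is False and the layer has no weights.
# Calling it on data builds it (creates weights sized to the input).
out = layer(tf.zeros([1, 4]))
print(layer.built, out.shape)  # True (1, 8)
```

Objects restored via `tf.saved_model.load` behave differently: they expose concrete functions (e.g. `loaded.signatures["serving_default"]`) rather than being directly callable like a Keras model, which is the usual source of this `AutoTrackable` error.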
Warnings
- breaking The `tfrs.layers.factorized_top_k.TopK` layer's indexing API changed in v0.6.0. Direct indexing with datasets is no longer supported in the same way.
- breaking In v0.7.0, the `tfrs.metrics.FactorizedTopK` constructor's `k` parameter was replaced with `ks` (a list of k values), and the `metrics` parameter was removed, as it only makes sense with top-k metrics.
- breaking The `tfrs.tasks.Retrieval` task was updated in v0.7.3 to accept a *list* of factorized metrics, instead of a single optional metric.
- deprecated The `batch_size` argument for `tfrs.layers.embedding.TPUEmbedding` is deprecated and no longer required since v0.7.0.
- gotcha TensorFlow Recommenders, while built on Keras, can have a steep learning curve due to its advanced concepts in recommendation systems and deep integration with TensorFlow.
Install
-
pip install tensorflow-recommenders
Imports
- tfrs
import tensorflow_recommenders as tfrs
- tf
import tensorflow as tf
- tfds
import tensorflow_datasets as tfds
Quickstart
import tensorflow as tf
import tensorflow_datasets as tfds
import tensorflow_recommenders as tfrs
# Load the MovieLens 100K dataset
ratings = tfds.load('movielens/100k-ratings', split="train")
movies = tfds.load('movielens/100k-movies', split="train")
# Prepare data by selecting relevant features
ratings = ratings.map(lambda x: {"movie_title": x["movie_title"], "user_id": x["user_id"]})
movies = movies.map(lambda x: x["movie_title"])
# Build vocabularies for user IDs and movie titles
user_ids_vocabulary = tf.keras.layers.StringLookup(mask_token=None)
user_ids_vocabulary.adapt(ratings.map(lambda x: x["user_id"]))
movie_titles_vocabulary = tf.keras.layers.StringLookup(mask_token=None)
movie_titles_vocabulary.adapt(movies)
# Define user and movie models using Keras Sequential
user_model = tf.keras.Sequential([
    user_ids_vocabulary,
    tf.keras.layers.Embedding(user_ids_vocabulary.vocabulary_size(), 32)
])
movie_model = tf.keras.Sequential([
    movie_titles_vocabulary,
    tf.keras.layers.Embedding(movie_titles_vocabulary.vocabulary_size(), 32)
])
# Define the retrieval task with FactorizedTopK metric
task = tfrs.tasks.Retrieval(
    metrics=tfrs.metrics.FactorizedTopK(
        candidates=movies.batch(128).map(movie_model)
    )
)
# Create a TFRS model
class MovieLensModel(tfrs.Model):
    def __init__(self, user_model, movie_model):
        super().__init__()
        self.movie_model: tf.keras.Model = movie_model
        self.user_model: tf.keras.Model = user_model
        self.task: tf.keras.layers.Layer = task

    def compute_loss(self, features: dict, training=False) -> tf.Tensor:
        user_embeddings = self.user_model(features["user_id"])
        positive_movie_embeddings = self.movie_model(features["movie_title"])
        return self.task(user_embeddings, positive_movie_embeddings)
# Compile and train the model
model = MovieLensModel(user_model, movie_model)
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.1))
model.fit(ratings.batch(4096), epochs=3)
# Generate recommendations (example for a specific user)
index = tfrs.layers.factorized_top_k.BruteForce(model.user_model)
index.index_from_dataset(
    movies.batch(100).map(lambda title: (title, model.movie_model(title)))
)
# Example: get recommendations for user with ID '42'
_, titles = index(tf.constant(["42"]))
print(f"Top 3 recommendations for user '42': {titles[0, :3].numpy().astype(str)}")