pyLDAvis

3.4.1 verified Mon Apr 27 auth: no python

Interactive topic model visualization for Python, a port of the R package LDAvis. Current version: 3.4.1. Supports LDA models from gensim, scikit-learn, and other sources. Release cadence: irregular; last update in 2023.

pip install pyldavis
error ModuleNotFoundError: No module named 'pyLDAvis.sklearn'
cause In pyLDAvis 3.4.0, the sklearn module was renamed to lda_model.
fix
Use 'import pyLDAvis.lda_model' (or 'from pyLDAvis import lda_model') instead.
error ModuleNotFoundError: No module named 'pyLDAvis.gensim'
cause In pyLDAvis 3.3.0, the gensim module was renamed to gensim_models.
fix
Use 'from pyLDAvis import gensim_models' instead.
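error ValueError: The parameter init='pca' cannot be used with metric='precomputed'.
cause The message is raised by scikit-learn's TSNE, which pyLDAvis runs on a precomputed topic distance matrix when prepare is called with mds='tsne'. Recent scikit-learn releases default TSNE to init='pca', which is incompatible with precomputed distances.
fix
Call prepare with the default mds='pcoa' (or mds='mmds') instead of mds='tsne'. The prepare call itself has no init parameter.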
error AttributeError: 'LdaModel' object has no attribute 'get_topics'
cause The installed pyLDAvis and gensim versions do not match: the calling code expects a method that the installed gensim LdaModel does not provide.
fix
Upgrade pyLDAvis to the latest version (>=3.3.0), which uses the updated gensim API, and make sure gensim itself is current.
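Code that has to run against both old and new pyLDAvis releases can try the renamed module first and fall back to the old name. The following is a minimal sketch of that idea; `import_first` is a hypothetical helper, not part of pyLDAvis, and only the gensim adapter names listed above are used.

```python
import importlib
from types import ModuleType
from typing import Iterable, Optional


def import_first(candidates: Iterable[str]) -> Optional[ModuleType]:
    """Return the first importable module from `candidates`, or None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:  # ModuleNotFoundError is a subclass of ImportError
            continue
    return None


# Try the name used since 3.3.0 first, then the pre-3.3.0 name.
gensimvis = import_first(["pyLDAvis.gensim_models", "pyLDAvis.gensim"])
if gensimvis is None:
    print("No pyLDAvis gensim adapter could be imported; is pyLDAvis installed?")
```

Pinning a single pyLDAvis version and using one import is simpler when that is an option; the fallback is mainly useful in shared notebooks or libraries that cannot control the installed version.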
breaking Module renamed in v3.4.0: 'pyLDAvis.sklearn' no longer exists. Use 'pyLDAvis.lda_model'.
fix Change import from 'pyLDAvis.sklearn' to 'pyLDAvis.lda_model'.
breaking Module renamed in v3.3.0: 'pyLDAvis.gensim' no longer exists. Use 'pyLDAvis.gensim_models'.
fix Change import from 'pyLDAvis.gensim' to 'pyLDAvis.gensim_models'.
gotcha Pandas 2.x no longer accepts axis as a positional argument to DataFrame.drop; pyLDAvis 3.4.1 fixed this by passing axis=1 as a keyword. If using older pyLDAvis with pandas 2.x, you'll get a TypeError from drop().
fix Upgrade pyLDAvis to 3.4.1 or later, or downgrade pandas to 1.x.
deprecated sklearn's get_feature_names was deprecated in scikit-learn 1.0 and removed in 1.2 in favor of get_feature_names_out; pyLDAvis 3.4.0+ handles this. Older versions may raise AttributeError.
fix Upgrade pyLDAvis to 3.4.0+ or use an older scikit-learn that still provides get_feature_names.
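The version constraints above can be checked before running a notebook. This is a rough sketch, not an official compatibility matrix: `release_tuple`, `compatibility_warnings`, and `installed` are hypothetical helpers, and the scikit-learn threshold assumes `get_feature_names` was removed in 1.2 (it was deprecated in 1.0).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import List, Optional, Tuple


def release_tuple(ver: str) -> Tuple[int, ...]:
    """Turn '3.4.1' or '2.1.0rc1' into a tuple of its leading integers."""
    parts = []
    for piece in ver.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def compatibility_warnings(pyldavis: str,
                           pandas: Optional[str] = None,
                           sklearn: Optional[str] = None) -> List[str]:
    """Return messages for the version combinations described above."""
    found = []
    current = release_tuple(pyldavis)
    if pandas and release_tuple(pandas) >= (2,) and current < (3, 4, 1):
        found.append("pandas 2.x needs pyLDAvis >= 3.4.1")
    # Assumption: get_feature_names is gone from scikit-learn 1.2 onward.
    if sklearn and release_tuple(sklearn) >= (1, 2) and current < (3, 4, 0):
        found.append("this scikit-learn needs pyLDAvis >= 3.4.0")
    return found


def installed(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


pyldavis_version = installed("pyLDAvis")
if pyldavis_version is not None:
    for message in compatibility_warnings(
        pyldavis_version, installed("pandas"), installed("scikit-learn")
    ):
        print(message)
```

The parsing is deliberately simple and ignores pre-release ordering; `packaging.version.Version` is a more robust choice if that third-party package is available.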

Train a small gensim LDA model, prepare the interactive visualization, and save it as HTML.

import pyLDAvis
import pyLDAvis.gensim_models as gensimvis

# Example with dummy data (no API key or network access is needed)
from gensim.corpora import Dictionary
from gensim.models import LdaModel

# Create a simple corpus
docs = [['apple', 'orange', 'banana'], ['car', 'truck', 'bus']]
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]
lda_model = LdaModel(corpus, num_topics=2, id2word=dictionary, passes=5)

# Prepare visualization
vis_data = gensimvis.prepare(lda_model, corpus, dictionary)
# Save to HTML file
pyLDAvis.save_html(vis_data, 'vis.html')
print('Visualization saved as vis.html')
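If a t-SNE layout is wanted but the installed scikit-learn rejects it with the ValueError listed among the errors, one option is to retry with the default layout. The wrapper below is a hedged sketch: `prepare_with_fallback` is a hypothetical helper, and it assumes only that the prepare function accepts an `mds` keyword with the values 'tsne' and 'pcoa'.

```python
from typing import Any, Callable


def prepare_with_fallback(prepare: Callable[..., Any], *args: Any,
                          mds: str = "tsne", fallback: str = "pcoa",
                          **kwargs: Any) -> Any:
    """Call `prepare` with `mds`; retry once with `fallback` on ValueError."""
    try:
        return prepare(*args, mds=mds, **kwargs)
    except ValueError as exc:
        print(f"mds={mds!r} failed ({exc}); retrying with mds={fallback!r}")
        return prepare(*args, mds=fallback, **kwargs)


# Usage with the objects from the example above (requires pyLDAvis and gensim):
# vis_data = prepare_with_fallback(
#     gensimvis.prepare, lda_model, corpus, dictionary, mds="tsne"
# )
```

The wrapper catches every ValueError, not only the t-SNE one, so a genuine data problem would trigger a second attempt before surfacing; narrowing the check to the message text is a reasonable refinement.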