PyTextRank

Version 3.3.0, verified Fri May 01

Python implementation of TextRank as a spaCy pipeline extension for graph-based natural language processing, including phrase extraction, keyword extraction, and knowledge graph construction. Current version 3.3.0; requires Python >= 3.7; follows spaCy's pipeline-extension pattern. Development is active but releases are infrequent.

pip install pytextrank
error ModuleNotFoundError: No module named 'pytextrank'
cause PyTextRank is not installed, or it is installed in a different Python environment from the one running your script.
fix
Run 'python -m pip install pytextrank' using the same interpreter that runs spaCy.
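A quick, standard-library-only check of which interpreter is running and whether pytextrank is importable from it (a sketch; no third-party packages required):

```python
import importlib.util
import sys

# The interpreter actually running this script -- install into this one.
print(sys.executable)

# find_spec returns None when the module is not importable from this
# interpreter, i.e. it was installed elsewhere (or not at all).
def has_module(name: str) -> bool:
    return importlib.util.find_spec(name) is not None

print("pytextrank importable:", has_module("pytextrank"))
```

If this prints False, run 'python -m pip install pytextrank' with the same python shown by sys.executable.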
error KeyError: "Cannot add pipeline component 'pytextrank' - not found."
cause The component factory is not registered: either 'import pytextrank' is missing, or the wrong factory name was used (PyTextRank 3.x registers the component as 'textrank', not 'pytextrank').
fix
Add 'import pytextrank' before calling nlp.add_pipe('textrank').
error AttributeError: 'spacy.tokens.doc.Doc' object has no attribute '_'
cause The PyTextRank Doc extensions are not registered, usually because the component was never added to the pipeline or 'import pytextrank' never ran.
fix
Verify that nlp.add_pipe('textrank') is called after loading the spaCy model and before processing text.
breaking PyTextRank 3.x requires spaCy 3.x. It will not work with spaCy 2.x. If you have spaCy 2.x, install an older PyTextRank version (<=2.x) or upgrade spaCy.
fix Upgrade spaCy: pip install -U spacy
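A sketch of checking the installed spaCy major version before upgrading, using only the standard library (PyTextRank itself need not be installed; prints a notice if spaCy is absent):

```python
import importlib.metadata

# Read the installed spaCy version from package metadata, if present.
try:
    spacy_version = importlib.metadata.version("spacy")
except importlib.metadata.PackageNotFoundError:
    spacy_version = None

if spacy_version is None:
    print("spaCy is not installed")
else:
    major = int(spacy_version.split(".")[0])
    print("spaCy", spacy_version)
    if major < 3:
        print("PyTextRank 3.x needs spaCy 3.x -- run: pip install -U spacy")
```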
deprecated Accessing keyphrases via doc._.textrank is deprecated in favor of doc._.phrases. The old attribute will be removed in a future version.
fix Use doc._.phrases instead of doc._.textrank.
gotcha The spaCy model must be loaded before the PyTextRank component is added to it; calling add_pipe without a loaded model raises an error.
fix Ensure nlp = spacy.load('en_core_web_sm') before nlp.add_pipe('pytextrank').

Loads spaCy model, adds PyTextRank pipeline, processes text, and prints extracted keyphrases with rank and count.

import spacy
import pytextrank

# Load the spaCy model first, then add the PyTextRank component
nlp = spacy.load('en_core_web_sm')
nlp.add_pipe('textrank')

text = "Natural language processing enables computers to understand human language. It is used in chatbots and translation."
doc = nlp(text)

# Phrases are ordered by descending TextRank score; count is occurrences
for phrase in doc._.phrases:
    print(phrase.text, phrase.rank, phrase.count)