PyStemmer

3.0.0 · active · verified Sun Apr 12

PyStemmer provides efficient access to the stemming algorithms from the Snowball project by wrapping the `libstemmer_c` library in a Python module. It is primarily used in information retrieval and search engines to reduce inflected words to a common stem, so that variants such as "connected" and "connection" match the same index term. The stem is not necessarily a dictionary word. The current version is 3.0.0. Releases are active but irregular, typically driven by updates to the underlying Snowball library or by Python compatibility changes.

Warnings

Install

Imports

Quickstart

Initialize a stemmer for a specific language, then use it to stem single words or lists of words. Reuse the stemmer object rather than creating a new one for each call: each instance caches its results, so reuse improves performance.

import Stemmer

# Get a list of available algorithms
algorithms = Stemmer.algorithms()
# print(algorithms) # Uncomment to see the list

# Get an instance of the English stemmer
stemmer = Stemmer.Stemmer('english')

# Stem a single word
word = 'cycling'
stemmed_word = stemmer.stemWord(word)
print(f"'{word}' stemmed to: '{stemmed_word}'")

# Stem a list of words
words = ['connection', 'connections', 'connective', 'connected', 'connecting']
stemmed_words = stemmer.stemWords(words)
print(f"Words {words} stemmed to: {stemmed_words}")
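Because the library targets search and information retrieval, the sketch below shows one way a single reused stemmer could feed a small inverted index. The `build_index` helper, its regex tokenizer, and the sample documents are hypothetical illustrations, not part of PyStemmer; the helper only assumes it is given an object with a `stemWords` method, such as the `Stemmer.Stemmer` instance created above.

```python
import re
from collections import defaultdict


def build_index(docs, stemmer):
    """Map each stem to the set of document ids that contain it.

    `docs` is a dict of {doc_id: text}. `stemmer` is any object with a
    stemWords(list_of_str) method, e.g. a Stemmer.Stemmer instance that
    the caller creates once and reuses for every document.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive tokenizer (an assumption of this sketch): lowercase ASCII words only.
        tokens = re.findall(r"[a-z]+", text.lower())
        # Stem the whole token list in one call rather than word by word.
        for stem in stemmer.stemWords(tokens):
            index[stem].add(doc_id)
    return index


# Example usage (requires PyStemmer to be installed):
#   import Stemmer
#   docs = {1: "Connected devices", 2: "A lost connection"}
#   index = build_index(docs, Stemmer.Stemmer("english"))
#   # Documents 1 and 2 should both appear under the shared stem of
#   # "connected" and "connection".
```

Passing the stemmer in as an argument keeps one instance alive across all documents, which is what lets its cache pay off, and makes the helper easy to exercise with a stand-in object.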
