Rust Stemmers for Python

0.1.5 · active · verified Fri Apr 10

py-rust-stemmers (version 0.1.5) is a high-performance Python wrapper around the Rust `rust-stemmers` library. It implements the Snowball family of stemming algorithms, providing efficient word stemming for multiple languages with optional parallel processing. The library is actively maintained, with its latest version uploaded to PyPI in February 2025 and continued development activity on GitHub through late 2025.

Warnings

Install
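The package is published on PyPI under the name `py-rust-stemmers`, so a standard pip install should work (assuming a prebuilt wheel is available for your platform; otherwise a Rust toolchain may be required to build from source):

```shell
pip install py-rust-stemmers
```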

Imports
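The library exposes a single public class, used throughout the quickstart below:

```python
from py_rust_stemmers import SnowballStemmer
```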

Quickstart

Initialize a `SnowballStemmer` for a specific language, then call `stem_word`, `stem_words`, or `stem_words_parallel` for single-word, batch, or parallel stemming, respectively.

from py_rust_stemmers import SnowballStemmer

# Initialize the stemmer for the English language
s = SnowballStemmer('english')

text = """This stem form is often a word itself, but this is not always the case as this is not a requirement for text search systems, which are the intended field of use. We also aim to conflate words with the same meaning, rather than all words with a common linguistic root (so awe and awful don't have the same stem), and over-stemming is more problematic than under-stemming so we tend not to stem in cases that are hard to resolve. If you want to always reduce words to a root form and/or get a root form which is itself a word then Snowball's stemming algorithms likely aren't the right answer."""
words = text.split()

# Stem a single word
stemmed_word = s.stem_word(words[0])
print(f"Stemmed word: {stemmed_word}")

# Stem a list of words
stemmed_words = s.stem_words(words)
print(f"Stemmed words: {stemmed_words}")

# Stem words in parallel (for larger text sequences)
stemmed_words_parallel = s.stem_words_parallel(words)
print(f"Stemmed words (parallel): {stemmed_words_parallel}")
