MARISA Trie for Python

1.4.1 · active · verified Thu Apr 09

marisa-trie provides static, memory-efficient, and fast trie-like data structures for Python by wrapping the C++ MARISA library. "Static" means a trie is built once from a fixed collection of keys and cannot be modified afterwards. This suits autocomplete, spell checking, and storing large sets of strings or byte sequences for fast lookup and prefix matching. The library is actively maintained, with releases that add support for new Python versions and minor features.
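As a rough illustration of the autocomplete and prefix-matching use cases, the sketch below uses `Trie.keys()` and `Trie.prefixes()`. The vocabulary and variable names are made up for this example, and results are sorted because the order in which the trie returns keys is not assumed here.

```python
import marisa_trie

# Hypothetical vocabulary, chosen only for illustration.
vocabulary = ["car", "card", "care", "cat", "dog"]
trie = marisa_trie.Trie(vocabulary)

# Autocomplete: every stored key that starts with the typed prefix.
suggestions = sorted(trie.keys("car"))

# Prefix matching: every stored key that is itself a prefix of the query.
matched = sorted(trie.prefixes("cards"))

print(suggestions)
print(matched)
```

`keys()` answers "which stored words extend this input?", while `prefixes()` answers "which stored words does this input begin with?".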

Warnings

Install

Imports

Quickstart

Demonstrates creating a BinaryTrie (bytes keys) and a Trie (str keys), then performing basic lookups and prefix searches. BinaryTrie expects bytes for all operations. BytesTrie is a different class: it maps str keys to bytes payloads and is built from (key, value) pairs rather than a plain list of keys.

from marisa_trie import BinaryTrie, Trie

# BinaryTrie stores bytes keys, so lookups and prefixes must also be bytes
words_bytes = [b'apple', b'apricot', b'banana', b'bandana', b'cherry']
b_trie = BinaryTrie(words_bytes)

print(f"b'apple' in BinaryTrie: {b'apple' in b_trie}")
print(f"Keys starting with b'ap': {b_trie.keys(b'ap')}")

# Trie stores str (Unicode) keys
words_str = ['hello', 'world', 'helium', 'wonder']
s_trie = Trie(words_str)

print(f"'hello' in Trie: {'hello' in s_trie}")
print(f"Keys starting with 'h': {s_trie.keys('h')}")
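For storing a payload alongside each key, a minimal sketch with `BytesTrie` might look like the following. The pairs and variable names are hypothetical, and the results are sorted and deduplicated so the example does not rely on any particular return order.

```python
from marisa_trie import BytesTrie

# Hypothetical (key, payload) pairs; a key may appear more than once.
pairs = [
    ("apple", b"fruit"),
    ("apple", b"company"),
    ("apricot", b"fruit"),
]
payload_trie = BytesTrie(pairs)

# Indexing returns a list with every payload stored under the key.
apple_payloads = sorted(payload_trie["apple"])

# Keys are str, so prefix searches take str as well.
ap_keys = sorted(set(payload_trie.keys("ap")))

print(apple_payloads)
print(ap_keys)
```

Because keys are str and only the values are bytes, membership tests such as `"apple" in payload_trie` take a str as well.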
