MARISA Trie for Python
marisa-trie provides static, memory-efficient, and fast Trie-like data structures for Python, wrapping the C++ MARISA library. It's ideal for use cases like autocomplete, spell checkers, and storing large sets of strings or byte sequences for fast prefix matching and lookup. The library is actively maintained with regular updates to support new Python versions and minor feature enhancements.
Warnings
- breaking Python 3.7 and 3.8 support was dropped. Users on these versions must upgrade Python or stick to older marisa-trie versions.
- gotcha MARISA tries are immutable after creation. You cannot add or remove keys once the trie is built. To modify, you must build a new trie.
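- gotcha Key types depend on the class. `Trie`, `BytesTrie`, and `RecordTrie` take `str` keys, while `BinaryTrie` takes `bytes` keys. `BytesTrie` is named for its values, not its keys: it maps `str` keys to `bytes` payloads. Match the query type to the key type, e.g. `trie.keys('prefix')` on a `Trie` and `trie.keys(b'prefix')` on a `BinaryTrie`.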
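Because a trie cannot be changed in place, "adding" a key means constructing a new trie. The following is a minimal sketch, assuming the full key set fits in memory during the rebuild. `with_added_keys` is a hypothetical helper written for illustration, not part of marisa-trie.

```python
from marisa_trie import Trie

original = Trie(["apple", "apricot", "banana"])


def with_added_keys(trie, new_keys):
    """Hypothetical helper: build a new Trie from the old keys plus new_keys."""
    merged = set(trie.keys()) | set(new_keys)
    return Trie(sorted(merged))


updated = with_added_keys(original, ["cherry"])

print("cherry" in original)  # the original trie is left untouched
print("cherry" in updated)   # the rebuilt trie contains the new key
print(len(updated))
```

Key IDs returned by `trie[key]` should not be assumed to carry over from the old trie to the rebuilt one.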
Install
pip install marisa-trie
Imports
- Trie
from marisa_trie import Trie
- BytesTrie
from marisa_trie import BytesTrie
- RecordTrie
from marisa_trie import RecordTrie
- BinaryTrie
from marisa_trie import BinaryTrie
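The value-carrying classes are easier to tell apart with a small example. This sketch assumes the constructor forms documented upstream: `BytesTrie` takes `(str key, bytes value)` pairs, and `RecordTrie` takes a `struct` format string followed by `(str key, tuple)` pairs. The keys and values here are made up for illustration.

```python
from marisa_trie import BytesTrie, RecordTrie

# BytesTrie: str keys mapped to bytes payloads; a key may appear more than once.
bytes_trie = BytesTrie([
    ("apple", b"fruit"),
    ("apple", b"company"),
    ("banana", b"fruit"),
])
apple_payloads = bytes_trie["apple"]  # a list of every payload stored under "apple"

# RecordTrie: str keys mapped to fixed-format tuples, packed with the struct module.
# "<HH" means two little-endian unsigned 16-bit integers per record.
record_trie = RecordTrie("<HH", [
    ("apple", (1, 2)),
    ("banana", (3, 4)),
])
apple_records = record_trie["apple"]  # a list of tuples

print(sorted(apple_payloads))
print(apple_records)
```

Lookups return lists because both classes allow several values per key.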
Quickstart
from marisa_trie import BinaryTrie, Trie
# BinaryTrie stores bytes keys
words_bytes = [b'apple', b'apricot', b'banana', b'bandana', b'cherry']
b_trie = BinaryTrie(words_bytes)
print(f"b'apple' in BinaryTrie: {b'apple' in b_trie}")
print(f"Keys starting with b'ap': {b_trie.keys(b'ap')}")
# Trie stores str keys
words_str = ['hello', 'world', 'helium', 'wonder']
s_trie = Trie(words_str)
print(f"'hello' in Trie: {'hello' in s_trie}")
print(f"Keys starting with 'h': {s_trie.keys('h')}")
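For the autocomplete use case mentioned at the top, prefix queries can be wrapped in a small function. This is a sketch rather than a recommended design: `autocomplete` and its `limit` parameter are hypothetical names, and sorting the matches is a choice made here for predictable output.

```python
from marisa_trie import Trie

trie = Trie(["car", "card", "care", "cart", "dog"])


def autocomplete(trie, typed, limit=5):
    """Hypothetical helper: return up to `limit` keys that start with `typed`."""
    return sorted(trie.keys(typed))[:limit]


suggestions = autocomplete(trie, "car")

# The reverse query: which stored keys are prefixes of a longer string?
matching_prefixes = sorted(trie.prefixes("cards"))

print(suggestions)
print(matching_prefixes)
```

`keys(prefix)` answers "which keys start with this?", while `prefixes(text)` answers "which keys does this text start with?".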