TrieRegex

1.0.0 verified Fri May 01 auth: no python

TrieRegex builds trie-based regular expressions from large word lists, so that a single pattern can match any of many words quickly. The current version, 1.0.0, requires Python >=3.6 and is available on PyPI. The library has a stable API with no recent breaking changes.

pip install trieregex
error AttributeError: 'Trie' object has no attribute 'get_regex'
cause The code calls get_regex(), which was deprecated in favor of regex() in version 0.9.3 and is not available in the installed version.
fix
Use trie.regex() instead of trie.get_regex().
error TypeError: add() missing 1 required positional argument: 'word'
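cause add() was called without its word argument, for example trie.add() with nothing passed, or Trie.add('foo') called on the class instead of on an instance.
fix
Call add() on a Trie instance with a single word. To add several words, use trie.add_all(list), or iterate over the list and call add() for each word.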
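gotcha Trie.regex() returns a pattern wrapped in a non-capturing group, (?:...), rather than in capturing parentheses. result.group() still returns the whole match, but no numbered group is available. If you need the matched word as a capture group, use Trie.capturing_groups() or wrap the pattern yourself.
fix Use trie.capturing_groups(), or wrap the pattern manually: '(' + pattern + ')'.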
gotcha Trie does not automatically escape special regex characters in input words. If words contain characters such as '.' or '*', those characters act as regex metacharacters in the resulting pattern and can match more than the literal word.
fix Escape each word with re.escape() before adding it, for example trie.add(re.escape(word)).
deprecated Trie.get_regex() has been deprecated since version 0.9.3 in favor of Trie.regex(). Where it is still present it continues to work, but new code should not rely on it.
fix Use Trie.regex() instead.
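
The capture-group and escaping gotchas above can be reproduced with the standard re module alone. This is a minimal sketch, not TrieRegex's own code: the pattern string is hard-coded in the shape shown in this page's quickstart rather than generated by the library, and the names pattern, capturing, and escaped are illustrative.

```python
import re

# Assumption: a pattern string in the shape Trie.regex() is described as
# returning on this page (hard-coded here, not produced by the library).
pattern = r'(?:bar|foo(?:bar)?)'

# A non-capturing group matches, but exposes no numbered groups.
plain_match = re.search(pattern, 'foobar is here')
plain_groups = plain_match.groups()

# Wrapping the pattern in parentheses makes the whole word group 1.
capturing = '(' + pattern + ')'
captured = re.search(capturing, 'foobar is here').group(1)

# An unescaped '.' is a metacharacter: it matches any character.
unescaped_hit = re.fullmatch('a.b', 'axb') is not None

# re.escape() turns the word into a literal pattern.
escaped = re.escape('a.b')
escaped_hit = re.fullmatch(escaped, 'axb') is not None
literal_hit = re.fullmatch(escaped, 'a.b') is not None

print(plain_groups, captured, unescaped_hit, escaped_hit, literal_hit)
```

Escaping has to happen before the words reach the trie, because once the pattern is assembled the literal characters can no longer be told apart from the grouping syntax.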

Create a trie from a list of words, generate a regex pattern string from it, and use that string with the re module.

import re

from trieregex import Trie

words = ['foo', 'bar', 'foobar']

# Build the trie from the word list
trie = Trie()
trie.add_all(words)

# Generate the regex as a plain string
pattern = trie.regex()
print(pattern)  # Output: (?:bar|foo(?:bar)?)

# The pattern string can be passed straight to the re module
result = re.search(pattern, 'foobar is here')
if result:
    print(result.group())  # Output: foobar
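
When the pattern is applied to running text, it often helps to compile it once and anchor it with word boundaries so that it does not match inside longer words. The sketch below uses only the standard library and hard-codes the pattern string from the example above instead of calling the library; word_re and the sample text are illustrative.

```python
import re

# Assumption: the pattern string printed in the example above.
pattern = r'(?:bar|foo(?:bar)?)'

text = 'foobar and bar, but not foobarbaz or rebar'

# Without anchors the pattern also matches inside longer words.
loose = re.findall(pattern, text)

# \b on both sides restricts matches to whole words; compiling once
# avoids re-parsing a large pattern on every call.
word_re = re.compile(r'\b' + pattern + r'\b')
whole_words = word_re.findall(text)

print(loose)
print(whole_words)
```

Because the pattern contains only non-capturing groups, findall() returns the full matched words rather than tuples of groups.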