Rebulk

3.3.0 verified Mon Apr 27 auth: no python

Rebulk is a Python library for defining simple search patterns in bulk and performing advanced matching on any string. It provides a clean API for building complex matchers from regular expressions, functional patterns, and chain filtering. The current version is 3.3.0, released in December 2023; releases follow no fixed schedule.

pip install rebulk
error AttributeError: module 'rebulk' has no attribute 'Match'
cause Match was accessed on the top-level rebulk package (e.g., import rebulk; rebulk.Match), but it is defined in the rebulk.match submodule.
fix
Use: from rebulk.match import Match
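error Regex pattern '...' not matching even though regex module is installed.
cause The third-party regex module is not used by default in Rebulk versions >=3.0.0. Rebulk falls back to the standard re module, so patterns that rely on regex-only syntax do not match.
fix
Set os.environ['REBULK_REGEX_ENABLED'] = '1' before importing rebulk.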
error rebulk.exceptions.PatternError: Unsupported pattern type
cause Using an unsupported pattern type (e.g., tuple without proper indicator).
fix
Use a string, a compiled regex, or a callable for patterns.
breaking Support for the third-party regex module is disabled by default since v3.0.0, and the standard re module is used instead. To enable the regex module, set the REBULK_REGEX_ENABLED=1 environment variable before importing rebulk.
fix Set environment variable: import os; os.environ['REBULK_REGEX_ENABLED'] = '1'
breaking Python 2.7 and 3.4 support dropped in v3.0.0; Python 3.5 dropped in v3.1.0; Python 3.6 dropped in v3.2.0.
fix Use Python 3.7 or later (Python 3.12 supported in v3.3.0).
gotcha When two patterns match overlapping positions, Rebulk's default conflict solver keeps the longer match and drops the shorter one. Pass a `conflict_solver` function to a pattern to control this behavior.
fix Do not rely on the order in which patterns were added to resolve conflicts; supply a `conflict_solver` when the default longest-match behavior is not what you want.
deprecated The REGEX_DISABLED environment variable was replaced by REBULK_REGEX_ENABLED in v3.0.0. The old variable is ignored.
fix Use REBULK_REGEX_ENABLED instead of REGEX_DISABLED.
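
The REBULK_REGEX_ENABLED notes above can be sketched as follows. This is a minimal sketch, assuming the flag is read once when rebulk is first imported. The `importlib.util.find_spec` checks are not part of Rebulk; they are only there so the snippet also runs when `rebulk` or `regex` is not installed.

```python
import importlib.util
import os

# Set the flag before the first `import rebulk`; setting it later is assumed
# to have no effect because the flag is read at import time.
os.environ["REBULK_REGEX_ENABLED"] = "1"

# The flag only matters when the third-party `regex` package is available.
regex_installed = importlib.util.find_spec("regex") is not None
rebulk_installed = importlib.util.find_spec("rebulk") is not None

if rebulk_installed:
    from rebulk import Rebulk

    matches = Rebulk().regex(r"w\w+").matches("hello world")
    print([m.value for m in matches])
else:
    print("rebulk is not installed; run `pip install rebulk` first")

if not regex_installed:
    print("regex is not installed; rebulk will keep using the standard re module")
```

Setting the variable in the shell (for example `REBULK_REGEX_ENABLED=1 python app.py`, where `app.py` is a placeholder script name) avoids any dependence on import order inside the program.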

Create a Rebulk matcher, add string and regex patterns, and match against input.

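from rebulk import Rebulk

# Build a matcher with one literal string pattern and one regex pattern.
# The variable is named `matcher` to avoid shadowing the rebulk module.
matcher = Rebulk()
matcher.string('hello')
matcher.regex(r'world')

# matches() returns every match found in the input string.
matches = matcher.matches('hello world')
for m in matches:
    # value is the matched text; span is its (start, end) position.
    print(m.value, m.span)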
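
The quickstart covers string and regex patterns; functional patterns, mentioned in the summary, might look like the sketch below. It assumes a functional pattern is a callable that receives the input string and returns a `(start, end)` tuple, or `None` when nothing matches. `find_year` and the `"year"` name are hypothetical, chosen only for illustration, and the `find_spec` guard is there so the snippet runs without rebulk installed.

```python
import importlib.util


def find_year(input_string):
    """Hypothetical functional pattern: span of the first 4-digit run, or None."""
    for start in range(len(input_string) - 3):
        if input_string[start:start + 4].isdigit():
            return start, start + 4
    return None


if importlib.util.find_spec("rebulk") is not None:
    from rebulk import Rebulk

    # Register the callable as a named pattern, then run it like any other.
    matcher = Rebulk().functional(find_year, name="year")
    for m in matcher.matches("released in 2023"):
        print(m.name, m.value, m.span)
else:
    # Without rebulk, the callable can still be exercised on its own.
    print(find_year("released in 2023"))
```

Keeping the span logic in a plain function makes it testable on its own, independent of the matcher that later wraps it.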