Google RE2 Python Bindings
google-re2 provides Python bindings for RE2, Google's high-performance C++ regular expression library. It offers a fast and secure alternative to Python's built-in `re` module: RE2 is designed to match in time linear in the size of the input, at the cost of a more restricted feature set. The library maintains a frequent release cadence, often monthly, reflecting updates to the underlying C++ library.
Warnings
- gotcha `re2` is NOT a full drop-in replacement for Python's `re` module. It intentionally omits advanced features like backreferences, lookarounds, and some complex Unicode properties for performance and security reasons. Patterns using these features raise `re2.error` when they are compiled; the exact message depends on the offending construct.
- gotcha Installation may require a C++ compiler (e.g., GCC, Clang, MSVC) and RE2 C++ library headers if pre-built wheels are not available for your specific Python version and operating system. This can complicate deployment in environments without build tools.
- gotcha While `re2` supports UTF-8, its interpretation of Perl-style escapes such as `\w` (word characters) and `\b` (word boundaries) can differ from Python's `re` module. RE2 defines both in ASCII terms, whereas `re` treats them as Unicode-aware for `str` patterns by default. This can lead to unexpected matching behavior on non-ASCII or complex Unicode text.
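Because unsupported constructs are rejected when a pattern is compiled, one way to cope is to try `re2` first and fall back to the standard `re` module. The following is a minimal sketch, not part of the library: `compile_with_fallback` is a hypothetical helper, and it assumes `re2.compile` raises `re2.error` for rejected patterns, as described in the first warning. The import guard is there only to keep the sketch runnable where the bindings are not installed.

```python
import re

try:
    import re2  # provided by the google-re2 package
except ImportError:  # keep the sketch runnable without the bindings
    re2 = None


def compile_with_fallback(pattern):
    """Hypothetical helper: prefer re2, fall back to the standard re module.

    Returns a (compiled_pattern, engine_name) tuple.
    """
    if re2 is not None:
        try:
            return re2.compile(pattern), "re2"
        except re2.error:
            # Assumption: re2 rejects this pattern (e.g. a backreference
            # or lookaround), so let the standard library handle it.
            pass
    return re.compile(pattern), "re"


# A backreference (\1) is one of the features re2 does not support.
regex, engine = compile_with_fallback(r"(\w+) \1")
print(engine, bool(regex.search("hello hello")))
```

Falling back to `re` gives up the linear-time guarantee for that pattern, so this approach is best reserved for patterns that come from a trusted source.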
Install
pip install google-re2
Imports
- re2
import re2
Quickstart
import re2
# Compile a regex pattern
pattern = re2.compile(r'hello (\w+)')
# Search for a match
match = pattern.search('hello world')
if match:
    print(f"Found: {match.group(0)}")
    print(f"Captured: {match.group(1)}")
# Find all occurrences
text = 'hello python, hello universe'
all_matches = re2.findall(r'hello (\w+)', text)
print(f"All matches: {all_matches}")
# Replace text
replaced_text = re2.sub(r'hello', 'hi', text)
print(f"Replaced text: {replaced_text}")
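The quickstart relies on `\w+`, which is where the word-character caveat from the warnings becomes visible on non-ASCII input. The sketch below uses the standard `re` module with `re.ASCII` as a stand-in for an ASCII-only `\w`, so it runs without the bindings. The final branch is an assumption to verify locally: it supposes that `re2` accepts the Unicode classes `\p{L}` and `\p{N}` as a way to match non-ASCII letters and digits.

```python
import re

try:
    import re2  # provided by the google-re2 package
except ImportError:  # keep the sketch runnable without the bindings
    re2 = None

text = "hello café"

# Python's re treats \w as Unicode-aware for str patterns by default.
unicode_words = re.findall(r"\w+", text)

# re.ASCII restricts \w to [a-zA-Z0-9_], used here as a stand-in
# for an ASCII-only definition of \w.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

print(unicode_words)  # ['hello', 'café']
print(ascii_words)    # ['hello', 'caf']

if re2 is not None:
    # Compare re2's own \w with explicit Unicode classes.
    # Assumption: \p{L} (letters) and \p{N} (numbers) are accepted by re2.
    print(re2.findall(r"\w+", text))
    print(re2.findall(r"[\p{L}\p{N}_]+", text))
```

If results differ between the two engines on your data, spelling out the intended character classes is usually safer than relying on `\w` or `\b`.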