TextSearch

0.0.24 · active · verified Wed Apr 15

TextSearch is a Python library for searching and replacing many strings in a text at once. It builds on a C implementation of the Aho-Corasick algorithm, which makes multi-keyword search considerably faster than the equivalent regex operations. The library is aimed at Natural Language Processing (NLP) and text search tasks, and it generally defaults to whole-word matches rather than sub-matches. The current version is 0.0.24; new releases are published as needed.
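
The difference between whole-word matches and sub-matches can be sketched with the standard library alone. The example below is an illustration only: it uses Python's `re` module as a stand-in and does not reflect how TextSearch is implemented, and the helper name `find_keywords` is hypothetical.

```python
import re


def find_keywords(text, keywords, whole_words=True):
    """Return keyword occurrences in text, ignoring case.

    Illustration only: built on the standard-library re module, not TextSearch.
    """
    # Try longer keywords first so they win over shorter ones at the same position.
    ordered = sorted(keywords, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in ordered)
    # \b anchors restrict matches to whole words; without them, substrings match too.
    pattern = rf"\b(?:{alternation})\b" if whole_words else alternation
    return re.findall(pattern, text, flags=re.IGNORECASE)


text = "Hi there, this is high praise."
print(find_keywords(text, ["hi"], whole_words=False))  # ['Hi', 'hi', 'hi']
print(find_keywords(text, ["hi"], whole_words=True))   # ['Hi']
```

With sub-matching, "hi" is also found inside "this" and "high"; with whole-word matching only the standalone "Hi" is returned, which is usually what keyword search over natural language needs.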

Quickstart

This quickstart creates a TextSearch instance, adds several keywords, finds every occurrence of them in a text, and then replaces keywords with substitute strings. The `case` parameter controls case sensitivity, and the `returns` parameter controls what each match yields.

from textsearch import TextSearch

# case="ignore" matches regardless of case; returns="match" yields the matched text.
ts = TextSearch(case="ignore", returns="match")
words_to_find = ["hi", "bye", "hello"]
ts.add(words_to_find)

text = "Hello, hi Pascal, bye, how are you?"
found_matches = ts.findall(text)

print(f"Original text: {text}")
print(f"Words added for search: {words_to_find}")
print(f"Found matches: {found_matches}")

# Replacement: add(keyword, value) maps each keyword to its substitute.
# returns="norm" yields the mapped value, as in the project README's replace example.
ts_replace = TextSearch(case="ignore", returns="norm")
ts_replace.add("hi", "GREETING")
ts_replace.add("bye", "FAREWELL")

replaced_text = ts_replace.replace(text)
print(f"Text after replacement: {replaced_text}")
