{"id":6471,"library":"textsearch","title":"TextSearch","description":"TextSearch is a Python library for efficiently finding and replacing multiple strings in text. It reaches C-level speed through an Aho-Corasick implementation, which makes it significantly faster than equivalent regex operations when many strings are matched at once. The library aims at convenience for Natural Language Processing (NLP) and text search tasks, and by default matches full words rather than substrings. The current version is 0.0.24, with releases appearing on an as-needed basis.","status":"active","version":"0.0.24","language":"en","source_language":"en","source_url":"https://github.com/kootenpv/textsearch","tags":["text search","NLP","Aho-Corasick","performance","string matching","text processing"],"install":[{"cmd":"pip install textsearch","lang":"bash","label":"Install with pip"}],"dependencies":[{"reason":"Core functionality relies on a C-module implementation of the Aho-Corasick algorithm for speed.","package":"pyahocorasick","optional":false}],"imports":[{"symbol":"TextSearch","correct":"from textsearch import TextSearch"}],"quickstart":{"code":"from textsearch import TextSearch\n\n# case=\"ignore\" matches regardless of case; returns=\"match\" makes\n# findall() return the matched strings themselves.\nts = TextSearch(case=\"ignore\", returns=\"match\")\nwords_to_find = [\"hi\", \"bye\", \"hello\"]\nts.add(words_to_find)\n\ntext = \"Hello, hi Pascal, bye, how are you?\"\nfound_matches = ts.findall(text)\n\nprint(f\"Original text: {text}\")\nprint(f\"Words added for search: {words_to_find}\")\nprint(f\"Found matches: {found_matches}\")\n\n# Example of replacement: returns=\"norm\" uses the normalized value\n# passed as the second argument to add() for each match.\nts_replace = TextSearch(case=\"ignore\", returns=\"norm\")\nts_replace.add(\"hi\", \"GREETING\")\nts_replace.add(\"bye\", \"FAREWELL\")\n\nreplaced_text = ts_replace.replace(text)\nprint(f\"Text after replacement: {replaced_text}\")","lang":"python","description":"This quickstart shows how to initialize TextSearch, add multiple keywords, and then find all occurrences or replace them in a given text. It highlights the `case` and `returns` parameters, which control case handling and what each match returns."},"warnings":[{"fix":"Review the `TextSearch` constructor parameters, in particular the `left_bound_chars` and `right_bound_chars` boundary settings that decide what counts as a full-word match. If you need substring matching, configure these accordingly, or consider an alternative library if full-word matching is not what you want.","message":"By default, TextSearch matches full words (tokens) only. Users accustomed to regex may expect substring matches. The behavior is configurable, but it is a key difference from typical regex patterns.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure your system has the necessary build tools (e.g., `build-essential` on Debian/Ubuntu, Xcode command-line tools on macOS, or Visual C++ build tools on Windows) installed before attempting `pip install textsearch`.","message":"The library relies on a C module (`pyahocorasick`) for its performance. Installation may therefore require a C compiler and development headers on some systems, and can fail with build errors if they are missing.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Evaluate your use case. If you are searching for one or very few fixed strings, simpler built-in Python methods may be more appropriate. TextSearch pays off when you have a large dictionary of terms to find or replace.","message":"TextSearch is significantly faster than regex for *multiple* string searches, but for a *single* simple string, Python's built-in `str.find()` or the `in` operator is usually sufficient and carries less overhead.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z"}