{"id":3831,"library":"textparser","title":"Textparser","description":"Textparser is a Python library for fast text parsing. Users define token specifications with regular expressions and build a grammar that parses text into a parse tree. The project prioritizes parsing speed, as highlighted in its benchmarks. The current version is 0.24.0, released on April 16, 2022; releases are infrequent.","status":"active","version":"0.24.0","language":"en","source_language":"en","source_url":"https://github.com/eerimoq/textparser","tags":["text parsing","parser","grammar","tokenization","regex","lexer"],"install":[{"cmd":"pip install textparser","lang":"bash","label":"Install stable version"}],"dependencies":[],"imports":[{"note":"Primary class for defining a parser.","symbol":"Parser","correct":"from textparser import Parser"},{"note":"Commonly used grammar element for sequential patterns.","symbol":"Sequence","correct":"from textparser import Sequence"}],"quickstart":{"code":"import textparser\nfrom textparser import Sequence\n\nclass MyParser(textparser.Parser):\n    def token_specs(self):\n        return [\n            ('SKIP', r'[ \\r\\n\\t]+'),\n            ('WORD', r'\\w+'),\n            ('EMARK', '!', r'!'),\n            ('COMMA', ',', r','),\n            ('MISMATCH', r'.')\n        ]\n\n    def grammar(self):\n        return Sequence('WORD', ',', 'WORD', '!')\n\ntree = MyParser().parse('Hello, World!')\nprint('Tree:', tree)\n# Expected output: Tree: ['Hello', ',', 'World', '!']","lang":"python","description":"This 'Hello World' example defines a custom parser by subclassing `textparser.Parser`. `token_specs` declares the token types as regular expressions, and `grammar` returns a `textparser.Sequence` that parses the string 'Hello, World!' into a parse tree."},"warnings":[{"fix":"If `token_specs` contains `('TYPE', 'friendly_name', r'regex')`, reference `'friendly_name'` in `grammar`, not `'TYPE'`.","message":"When a token specification uses the three-element form `(kind, name, re)`, the `grammar` must refer to the token by its `name`, not its `kind`. Referring to `kind` when a `name` is provided leads to parsing errors.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Always inspect the output parse tree for your specific grammar and implement any necessary transformations or validations on the returned structure.","message":"The structure of the parse tree returned by `textparser` depends on the grammar, and post-processing may be required to fit an application's needs. The library's primary goal is speed, not a uniform parse tree format across grammars.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Monitor the GitHub repository for updates. Test thoroughly with new Python versions or complex scenarios before deploying in production environments. Consider contributing if specific features or fixes are needed.","message":"The last release was in April 2022, which indicates a slow development pace. The library is generally stable, but newer Python versions may introduce compatibility problems, and fixes or new features may not arrive.","severity":"gotcha","affected_versions":"0.24.0 and older"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}