{"id":2107,"library":"luqum","title":"Luqum: Lucene Query Parser","description":"Luqum (LUcene QUery Manipolator) is a Python library that parses Lucene Query DSL strings, building an abstract syntax tree (AST) for inspection, analysis, and manipulation. It enables transforming Lucene DSL queries into native Elasticsearch JSON DSL. The library is currently at version 1.0.0 and sees releases as new features and maintenance updates are introduced, typically every few months. [1, 7, 11]","status":"active","version":"1.0.0","language":"en","source_language":"en","source_url":"https://github.com/jurismarches/luqum","tags":["lucene","query parser","elasticsearch","dsl","ast","query manipulation"],"install":[{"cmd":"pip install luqum","lang":"bash","label":"Install luqum"}],"dependencies":[{"reason":"Used for parsing Lucene Query DSL.","package":"PLY","optional":false}],"imports":[{"symbol":"parser","correct":"from luqum.parser import parser"},{"symbol":"ElasticsearchQueryBuilder","correct":"from luqum.elasticsearch import ElasticsearchQueryBuilder"},{"note":"Used to resolve implicit operators (e.g., 'foo bar') in parsed queries.","symbol":"UnknownOperationResolver","correct":"from luqum.utils import UnknownOperationResolver"},{"note":"Use this for thread-safe parsing instead of luqum.parser.parser.parse().","symbol":"parse","correct":"from luqum.thread import parse as thread_safe_parse"},{"note":"Common AST node types for programmatic tree construction/manipulation.","symbol":"AndOperation, Term, SearchField, Word","correct":"from luqum.tree import AndOperation, Term, SearchField, Word"}],"quickstart":{"code":"from luqum.parser import parser\nfrom luqum.elasticsearch import ElasticsearchQueryBuilder\nfrom luqum.tree import AndOperation\nfrom luqum.utils import UnknownOperationResolver\n\n# 1. Parse a Lucene query string\nquery_string = '(title:\"foo bar\" AND body:\"quick fox\") OR title:fox'\ntree = parser.parse(query_string)\nprint(f\"Parsed AST: {repr(tree)}\")\nprint(f\"String representation: {str(tree)}\\n\")\n\n# 2. Resolve unknown operations (e.g., implicit AND/OR)\n# A query like 'foo bar' is parsed as UnknownOperation(Word('foo'), Word('bar')).\n# A resolver makes the operator explicit, e.g., 'foo AND bar'.\nresolver = UnknownOperationResolver(resolve_to=AndOperation)\nresolved_tree = resolver(parser.parse('foo bar'))\nprint(f\"Resolved 'foo bar' to: {str(resolved_tree)}\\n\")\n\n# 3. Transform to Elasticsearch Query DSL\n# For complex schemas, pass nested_fields and object_fields arguments\nes_builder = ElasticsearchQueryBuilder()\nes_query = es_builder(tree)\nprint(f\"Elasticsearch DSL:\\n{es_query}\")","lang":"python","description":"This quickstart demonstrates parsing a Lucene query string into an Abstract Syntax Tree (AST), resolving implicit operators, and converting the AST into an Elasticsearch Query DSL dictionary. It highlights core functionalities of `luqum` for query manipulation and transformation. [2, 3, 8]"},"warnings":[{"fix":"Import `UnknownOperationResolver` from `luqum.utils` and apply it to your parsed AST, optionally passing `resolve_to` (e.g., `AndOperation` from `luqum.tree`) to choose the operator. [3, 8]","message":"Lucene queries with implicit operators (e.g., 'foo bar' instead of 'foo AND bar') are parsed as `UnknownOperation`. Users need to apply a transformer like `UnknownOperationResolver` to explicitly define the operator (e.g., AND, OR) for correct interpretation.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Manually set `node.head` and `node.tail` attributes on AST nodes when building trees programmatically to retain formatting information. The `auto_head_tail` utility can assist. 
[3, 8, 7]","message":"When constructing or modifying ASTs programmatically (rather than parsing a string), the `head` and `tail` properties (representing non-meaningful text like spaces around elements) must be set manually if preserving the original query's formatting or position information is critical. These properties are computed automatically during parsing.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Replace direct calls to `luqum.parser.parser.parse()` with `from luqum.thread import parse as thread_safe_parse` and then use `thread_safe_parse(query_string)` for thread-safe parsing. [6, 8]","message":"The underlying PLY library, used by luqum for parsing, is not inherently thread-safe. For concurrent parsing operations in a multi-threaded environment, `luqum.thread.parse()` should be used instead of `luqum.parser.parser.parse()` to ensure thread-safe execution by cloning the lexer state.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Upgrade Python to 3.10+ or pin `luqum` to a version less than 0.14.0. [7]","message":"Version 0.14.0 removed official support for Python 3.6, 3.7, 3.8, and 3.9. Users on these older Python versions should use `luqum < 0.14.0` or upgrade their Python environment.","severity":"breaking","affected_versions":">=0.14.0"},{"fix":"Review and adapt code interacting with `luqum.naming` and `auto_name` according to the 0.11.0 release notes and updated documentation. [7]","message":"In version 0.11.0, the `naming` module and its `auto_name` function were completely modified, leading to API incompatibility for any code using these features.","severity":"breaking","affected_versions":">=0.11.0"},{"fix":"Audit existing Elasticsearch queries generated by `luqum` for single-word matches and adjust expectations or logic if `match_phrase` behavior is strictly required (e.g., by explicitly modifying the AST before transformation). 
[7]","message":"Prior to version 0.7.0, the `ElasticsearchQueryBuilder` transformed single-word matches into `match_phrase` queries. From 0.7.0 onwards, it uses a `match` query when the field is analyzed, which aligns more closely with Elasticsearch's `query_string` behavior. This can change the Elasticsearch query generated for some inputs.","severity":"breaking","affected_versions":">=0.7.0"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}