{"id":8071,"library":"dawg-python","title":"DAWG-Python","description":"Pure-Python reader for DAWGs (Directed Acyclic Word Graphs, also known as Deterministic Acyclic Finite State Automata). It provides read-only access to DAWG files created by the dawgdic C++ library or the DAWG Python C extension; it does not build or save DAWGs itself. The current version is 0.7.2, and releases are infrequent.","status":"active","version":"0.7.2","language":"en","source_language":"en","source_url":"https://github.com/kmike/DAWG-Python/","tags":["DAWG","DAFSA","data structure","dictionary","trie","immutable","nlp"],"install":[{"cmd":"pip install dawg-python","lang":"bash","label":"Install stable version"}],"dependencies":[],"imports":[{"note":"The PyPI package is 'dawg-python' and its importable module is 'dawg_python'. The 'dawg' module belongs to the separate DAWG C extension package.","wrong":"from dawg import DAWG","symbol":"DAWG","correct":"from dawg_python import DAWG"},{"note":"The PyPI package is 'dawg-python' and its importable module is 'dawg_python'. The 'dawg' module belongs to the separate DAWG C extension package.","wrong":"from dawg import IntDAWG","symbol":"IntDAWG","correct":"from dawg_python import IntDAWG"}],"quickstart":{"code":"from dawg_python import DAWG, CompletionDAWG, IntDAWG\n\n# DAWG-Python is read-only: the files below must already exist, built with\n# dawgdic or the DAWG C extension (e.g. dawg.DAWG(words).save('words.dawg')).\n# All file names here are placeholders.\n\n# 1. Load a DAWG from a file (primary use case)\nloaded_dawg = DAWG().load('words.dawg')\n\n# 2. Query the loaded DAWG\nprint(f\"Is 'apple' in DAWG? {'apple' in loaded_dawg}\")\nprint(f\"Keys that are prefixes of 'apricot': {loaded_dawg.prefixes('apricot')}\")\n\n# 3. Prefix completion needs a file saved from dawg.CompletionDAWG\ncompletion_dawg = CompletionDAWG().load('words_completion.dawg')\nprint(f\"Words starting with 'a': {completion_dawg.keys('a')}\")\n\n# 4. IntDAWG for words with integer payloads (file saved from dawg.IntDAWG)\nint_dawg_obj = IntDAWG().load('int_data.dawg')\nprint(f\"Value for 'world': {int_dawg_obj['world']}\")","lang":"python","description":"This quickstart loads existing DAWG files and queries them. It covers membership tests and prefix lookups on `DAWG`, key completion on `CompletionDAWG`, and payload lookup on `IntDAWG`. The files themselves must be created beforehand with another tool, such as dawgdic or the DAWG C extension, because DAWG-Python cannot build or save them."},"warnings":[{"fix":"Build the DAWG with `dawgdic` or the DAWG C extension (the `dawg` module), save it to a file, and load that file with `dawg_python`.","message":"The `dawg-python` library is a read-only *reader* for DAWG files. It cannot build or save them: constructor arguments and the `save()` method are intentionally unsupported. Use the C++ `dawgdic` library or the DAWG C extension for construction.","severity":"gotcha","affected_versions":"All versions"},{"fix":"To change the contents, rebuild the file from the desired set of words and payloads with a builder such as the DAWG C extension, then load it again.","message":"DAWG objects (both `DAWG` and `IntDAWG`) are immutable once loaded from a file. You cannot add, remove, or modify words or payloads in place.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Pre-sort the words or tuples before passing them to the builder, for example `dawg.DAWG(sorted(my_words_list))` with the DAWG C extension.","message":"This applies to the build step, which happens outside DAWG-Python. When building with the DAWG C extension via `dawg.DAWG(iterable_of_words)` or `dawg.IntDAWG(iterable_of_tuples)`, the input iterable should be *sorted alphabetically* for optimal performance. If it is not sorted, the builder sorts it internally, which can be slow for large inputs.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure that all payloads are integers when the `IntDAWG` file is built.","message":"`IntDAWG` maps string keys to *integer* payloads. Payloads are fixed when the file is built; passing non-integer payloads to the builder results in a `TypeError`.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Double-check the file path. Ensure the `.dawg` file exists and is readable from the directory where your script runs. Use an absolute path if unsure.","cause":"The DAWG file you are trying to load does not exist at the specified path, or the path is incorrect.","error":"FileNotFoundError: [Errno 2] No such file or directory: 'my_dictionary.dawg'"},{"fix":"Convert all payloads to integers before building the `IntDAWG` file with the DAWG C extension. For example: `dawg.IntDAWG([(word, int(value)) for word, value in data])`.","cause":"The `IntDAWG` is being built with payloads that are not integers (e.g., strings or floats).","error":"TypeError: 'str' object cannot be interpreted as an integer"},{"fix":"To 'update' a DAWG, rebuild the file from the desired set of words with a builder such as the DAWG C extension, then load the new file. There is no in-place modification.","cause":"You are attempting to modify a `DAWG` object after it has been loaded. DAWG structures are immutable.","error":"AttributeError: 'DAWG' object has no attribute 'add_word'"}]}