DAWG2 Python Library

0.9.0 · active · verified Thu Apr 16

DAWG2-Python is a pure-Python library that provides read-only access to Directed Acyclic Word Graphs (DAWGs), also known as Deterministic Acyclic Finite State Automata (DAFSAs). It reads `.dawg` files created by the `dawgdic` C++ library or the original `DAWG` Python extension. The current version is 0.9.0. Releases are infrequent and mostly cover minor updates or dependency adjustments; the project aims to stay API- and binary-compatible with the original `DAWG` library where possible.

Quickstart

This quickstart loads and queries a `.dawg` file with `dawg2-python`. Because `dawg2-python` is a pure-Python *reader*, it cannot create or save `.dawg` files, so the example builds its input file with the C-extension-based `dawg` library. In practice you would usually load an existing `.dawg` file generated by `dawg` or `dawgdic`.

import os
import tempfile
import dawg # This is the *original* DAWG library, for creating the .dawg file
import dawg_python # This is dawg2-python, for reading the .dawg file

# Create a dummy .dawg file using the original 'dawg' library
# (dawg2-python is for reading, not creating/saving)
def create_dummy_dawg(filepath):
    words = ['apple', 'apricot', 'banana', 'bandana', 'cat']
    d = dawg.DAWG(words)
    d.save(filepath)

# Use a temporary file for demonstration
with tempfile.NamedTemporaryFile(suffix='.dawg', delete=False) as tmp_file:
    dawg_filepath = tmp_file.name

try:
    # 1. Create a DAWG file (requires the 'dawg' library installed)
    print(f"Creating a temporary DAWG file at {dawg_filepath}...")
    create_dummy_dawg(dawg_filepath)
    print("DAWG file created.")

    # 2. Load and query the DAWG file using dawg2-python
    print("Loading DAWG using dawg2-python...")
    reader = dawg_python.DAWG().load(dawg_filepath)
    print("DAWG loaded successfully.")

    # Querying words
    print(f"'apple' in DAWG: {'apple' in reader}")
    print(f"'banana' in DAWG: {'banana' in reader}")
    print(f"'orange' in DAWG: {'orange' in reader}")

    # Getting prefixes
    print(f"Prefixes for 'bandana': {list(reader.prefixes('bandana'))}")

    # Iterating through all words
    print("First 5 words:")
    for i, word in enumerate(reader.iterkeys()):
        if i >= 5:
            break
        print(f"- {word}")

finally:
    # Clean up the temporary file
    if os.path.exists(dawg_filepath):
        os.remove(dawg_filepath)
        print(f"Cleaned up temporary file: {dawg_filepath}")
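
When a `.dawg` file already exists, the `dawg` C extension is not needed at all. The following is a minimal sketch of a read-only workflow built on the same calls the quickstart uses (`load`, `prefixes`, and the `in` operator). The helper names `load_reader`, `longest_known_prefix`, and `known_words`, and the `PrefixReader` protocol, are hypothetical and not part of `dawg2-python`; the import is deferred so the helpers can also be exercised with any object that offers the same two operations.

```python
from typing import Iterable, List, Optional, Protocol


class PrefixReader(Protocol):
    """Hypothetical protocol: the two operations the helpers below rely on."""

    def prefixes(self, key: str) -> Iterable[str]: ...

    def __contains__(self, key: str) -> bool: ...


def load_reader(path: str):
    """Load an existing .dawg file using dawg2-python only."""
    # Imported lazily so the other helpers work without the package installed.
    import dawg_python

    return dawg_python.DAWG().load(path)


def longest_known_prefix(reader: PrefixReader, text: str) -> Optional[str]:
    """Return the longest stored word that is a prefix of `text`, or None."""
    candidates = list(reader.prefixes(text))
    return max(candidates, key=len) if candidates else None


def known_words(reader: PrefixReader, words: Iterable[str]) -> List[str]:
    """Keep only the words present in the reader, preserving input order."""
    return [word for word in words if word in reader]
```

A typical call would be `reader = load_reader("words.dawg")` (the path is a placeholder) followed by `longest_known_prefix(reader, "bandanas")` or `known_words(reader, ["cat", "dog"])`.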
