UDAPI
UDAPI is a Python framework for processing Universal Dependencies (UD) data. It provides an API for reading, writing, and transforming CoNLL-U formatted treebanks, and supports tasks such as visualization, format conversion, querying, and dependency tree transformation. The library is actively maintained; the current version is 0.5.2.
Common errors
- command not found: udapy
cause: The `udapy` command-line script, installed by `pip`, is not in your system's `PATH` environment variable. This often happens with user-specific installations (`pip install --user`) if the local bin directory isn't added to `PATH`.
fix: Add the pip-installed scripts directory (usually `~/.local/bin`) to your `PATH` environment variable: `export PATH="$HOME/.local/bin/:$PATH"`. You may also want to add this line to your shell's configuration file (e.g., `.bashrc` or `.zshrc`).
- ModuleNotFoundError: No module named 'ufal.udpipe'
cause: You are trying to use a `udapi` block that relies on the `ufal.udpipe` library, but `ufal.udpipe` has not been installed. It is an optional dependency.
fix: Install the `ufal.udpipe` library: `pip install --upgrade ufal.udpipe`.
- SyntaxError: invalid syntax (or similar errors like IndentationError) when running `udapi` code
cause: The Python version being used is older than the required Python 3.9. The error message looks generic, but it often points to an underlying incompatibility with the required Python version.
fix: Upgrade your Python environment to version 3.9 or newer. Use a tool like `pyenv` or create a virtual environment with the correct Python version.
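The `PATH` and Python-version problems listed above can be checked up front. The following is a minimal sketch using only the standard library; the helper name `diagnose_environment` is hypothetical and not part of udapi, and the 3.9 minimum is the one stated on this page.

```python
import shutil
import sys

MIN_PYTHON = (3, 9)  # minimum version stated on this page


def diagnose_environment(command="udapy", version=None):
    """Return a list of human-readable problems (empty if none are found).

    Hypothetical helper for illustration; it is not part of udapi.
    """
    # Compare only (major, minor); default to the running interpreter.
    version = tuple(version or sys.version_info)[:2]
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            "Python %d.%d is older than the required %d.%d"
            % (version + MIN_PYTHON)
        )
    # shutil.which searches the directories listed in PATH for an executable.
    if shutil.which(command) is None:
        problems.append("'%s' was not found on PATH" % command)
    return problems
```

Calling `diagnose_environment()` with no arguments checks the running interpreter and the `udapy` command; an empty list means neither problem was detected.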
Warnings
- breaking Starting with version 0.5.2, the `udapy` CLI script is invoked through `console_scripts`. This is largely backward compatible, but custom installations or environments that relied on specific `PATH` configurations should verify that `udapy` still resolves.
- gotcha Many advanced processing blocks (e.g., for parsing or dependency tree transformations) rely on the `ufal.udpipe` library. This is an optional dependency and is not installed by default with `pip install udapi`.
- gotcha UDAPI requires Python 3.9 or higher. Attempting to use it with older Python versions will result in runtime errors due to incompatible syntax or missing features.
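Because `ufal.udpipe` is optional, code that may run without it can guard the import instead of failing with `ModuleNotFoundError`. This is a sketch of one common pattern, not something udapi provides; the helper name `optional_import` is hypothetical.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed.

    Hypothetical helper for illustration; it is not part of udapi.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package (or a missing parent package) ends up here.
        return None


udpipe = optional_import("ufal.udpipe")
if udpipe is None:
    print("ufal.udpipe is not installed; run: pip install --upgrade ufal.udpipe")
```

Returning `None` lets the caller decide whether to skip the blocks that depend on `ufal.udpipe` or to stop with a clearer message.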
Install
- pip install udapi
Imports
- Document
from udapi.core.document import Document
- Block
from udapi.core.block import Block
- Conllu
from udapi.block.read.conllu import Conllu
(a bare `import Conllu` does not work; import the class from its module as shown)
- Text
from udapi.block.read.text import Text
Quickstart
import io
from contextlib import redirect_stdout
from udapi.core.document import Document
from udapi.block.read.text import Text
from udapi.block.tokenize.simple import Simple
from udapi.block.write.conllu import Conllu
# Create a new document
doc = Document()
# Read raw text into the document
read_text_block = Text(string="This is a test sentence.")
read_text_block.apply_on_document(doc)
# Tokenize the sentence using a simple whitespace tokenizer
tokenize_block = Simple()
tokenize_block.apply_on_document(doc)
# Write the document in CoNLL-U format, capturing stdout in a string
f = io.StringIO()
with redirect_stdout(f):
    write_conllu_block = Conllu()
    write_conllu_block.apply_on_document(doc)
conllu_output = f.getvalue()
print(conllu_output)
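CoNLL-U text consists of comment lines starting with `#` and token lines with ten tab-separated columns. To make that layout concrete, here is a small standard-library sketch that splits token lines into named fields. The sample string is hand-written for illustration and is not actual udapi output, and `parse_token_lines` is a hypothetical helper; for real work, udapi's own document and node API is the better route.

```python
# The ten CoNLL-U columns, in order.
CONLLU_FIELDS = (
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
)

# Hand-written sample, not real udapi output; "_" marks an unspecified value.
SAMPLE = (
    "# text = This is a test.\n"
    "1\tThis\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "2\tis\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "3\ta\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "4\ttest\t_\t_\t_\t_\t_\t_\t_\tSpaceAfter=No\n"
    "5\t.\t_\t_\t_\t_\t_\t_\t_\t_\n"
    "\n"
)


def parse_token_lines(conllu_text):
    """Split CoNLL-U token lines into dicts keyed by column name."""
    tokens = []
    for line in conllu_text.splitlines():
        # Skip sentence separators (blank lines) and comment lines.
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLLU_FIELDS):
            raise ValueError(
                "expected 10 columns, got %d: %r" % (len(columns), line)
            )
        tokens.append(dict(zip(CONLLU_FIELDS, columns)))
    return tokens


tokens = parse_token_lines(SAMPLE)
print([token["form"] for token in tokens])
```

The same function could be applied to `conllu_output` from the quickstart, assuming the writer emits standard ten-column token lines.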