{"id":9382,"library":"udtools","title":"Python tools for Universal Dependencies","description":"udtools (version 0.2.7) provides a suite of Python tools for working with Universal Dependencies (UD) data. It offers functionalities for reading, writing, querying, and transforming CoNLL-U files, as well as integrating with UDPipe. The library is actively maintained with an irregular release cadence, focusing on facilitating linguistic research and processing of dependency parsed text.","status":"active","version":"0.2.7","language":"en","source_language":"en","source_url":"https://github.com/udtools/udtools","tags":["nlp","natural-language-processing","universal-dependencies","linguistics","conllu","dependency-parsing"],"install":[{"cmd":"pip install udtools","lang":"bash","label":"Install stable version"}],"dependencies":[],"imports":[{"note":"Classes like CoNLLUDocument are nested within submodules.","wrong":"import udtools.CoNLLUDocument","symbol":"CoNLLUDocument","correct":"from udtools.conllu import CoNLLUDocument"},{"note":"Transformation functions are located in the 'transform' submodule.","wrong":"from udtools import collapse_compounds","symbol":"collapse_compounds","correct":"from udtools.transform import collapse_compounds"},{"note":"Core data structures are found in the 'conllu' submodule.","wrong":"from udtools import Sentence","symbol":"Sentence","correct":"from udtools.conllu import Sentence"}],"quickstart":{"code":"from udtools.conllu import CoNLLUDocument, Sentence, Token\nfrom udtools.transform import collapse_compounds\n\n# Create a sample CoNLL-U document from a string\nconllu_string = \"\"\"\n# sent_id = 1\n# text = This is an 
example.\n1\tThis\tthis\tPRON\tDT\tNumber=Sing|PronType=Dem\t4\tnsubj\t_\t_\n2\tis\tbe\tAUX\tVBZ\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin\t4\tcop\t_\t_\n3\tan\ta\tDET\tDT\tDefinite=Ind|PronType=Art\t4\tdet\t_\t_\n4\texample\texample\tNOUN\tNN\tNumber=Sing\t0\troot\t_\tSpaceAfter=No\n5\t.\t.\tPUNCT\t.\t_\t4\tpunct\t_\t_\n\n\"\"\"\ndoc = CoNLLUDocument.from_string(conllu_string)\nprint(\"Original document:\")\nprint(doc.to_string())\n\n# Example transformation (collapse_compounds might not change this simple example)\ncollapsed_doc = collapse_compounds(doc)\n\nprint(\"\\nDocument after collapse_compounds (no change for this simple example):\")\nprint(collapsed_doc.to_string())\n\n# Demonstrate adding a new sentence\nnew_sentence = Sentence()\nnew_sentence.tokens.append(Token(id=\"1\", form=\"Hello\", lemma=\"hello\", upos=\"INTJ\"))\nnew_sentence.tokens.append(Token(id=\"2\", form=\".\", lemma=\".\", upos=\"PUNCT\"))\ndoc.sentences.append(new_sentence)\n\nprint(\"\\nDocument with a new sentence added:\")\nprint(doc.to_string())\n","lang":"python","description":"This quickstart demonstrates how to create a CoNLLUDocument from a string, print its contents, apply a transformation, and add new sentences programmatically, showcasing basic data manipulation without file dependencies."},"warnings":[{"fix":"Always pin `udtools` to an exact version in production environments (e.g., `udtools==0.2.7`) and thoroughly test updates before deploying to ensure compatibility.","message":"As a pre-1.0 library (version 0.x.x), the `udtools` API may not strictly adhere to semantic versioning. Minor releases could introduce breaking changes or significant modifications to existing functionality.","severity":"gotcha","affected_versions":"<1.0.0"},{"fix":"Ensure input CoNLL-U files are valid. 
Use online validators or implement pre-parsing checks before feeding data to `udtools`.","message":"Processing malformed CoNLL-U files can lead to `udtools.conllu.CoNLLUError` exceptions or silent data corruption. The library expects strict adherence to the CoNLL-U format.","severity":"gotcha","affected_versions":"all"},{"fix":"For extremely large datasets, split large files into smaller chunks and process them one at a time, or use sentence-by-sentence (generator-based) parsing if available. The current API encourages loading entire documents, so external chunking is the primary workaround.","message":"Loading very large CoNLL-U documents entirely into memory using `CoNLLUDocument.from_file()` or `CoNLLUDocument.from_string()` can consume significant system RAM, potentially leading to `MemoryError`.","severity":"gotcha","affected_versions":"all"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Specify the full submodule path for the import, e.g., `from udtools.conllu import CoNLLUDocument`.","cause":"Attempting to import a class directly from the top-level `udtools` package when it resides in a submodule (e.g., `udtools.conllu`).","error":"ModuleNotFoundError: No module named 'udtools.CoNLLUDocument'"},{"fix":"Verify that the file path is correct, the file exists, and your application has the necessary read permissions for the file and its directory.","cause":"The path provided to `CoNLLUDocument.from_file()` does not point to an existing or accessible CoNLL-U file.","error":"FileNotFoundError: [Errno 2] No such file or directory: 'path/to/my_file.conllu'"},{"fix":"Examine line X (and surrounding lines) in the problematic CoNLL-U file. 
Correct any format errors, such as an incorrect number of tab-separated fields, invalid character encoding, or malformed ID/column values.","cause":"The CoNLL-U file being parsed contains syntax errors, missing fields, or incorrect formatting on a specific line, violating the CoNLL-U standard.","error":"udtools.conllu.CoNLLUError: Invalid CoNLL-U format: Line X does not conform to the specification."}]}