{"id":7985,"library":"bioc","title":"bioc - Processing BioC, Brat, and PubTator with Python","description":"bioc is a Python library for processing and manipulating data in BioC XML/JSON, Brat standoff, and PubTator formats. It provides an API for reading, writing, and working with these common bioinformatics text-mining annotation formats. The current version is 2.1; releases focus on supporting recent Python versions and format specifications.","status":"active","version":"2.1","language":"en","source_language":"en","source_url":"https://github.com/bionlplab/bioc","tags":["bioc","brat","pubtator","bioinformatics","nlp","xml","json","text-mining"],"install":[{"cmd":"pip install bioc","lang":"bash","label":"Install latest stable version"}],"dependencies":[{"reason":"Requires Python 3.6 or higher.","package":"python","optional":false}],"imports":[{"note":"In versions 2.x and later, the BioC XML functions have moved to the `biocxml` submodule. 
Direct `import bioc` no longer exposes XML dump/load functions.","wrong":"import bioc","symbol":"biocxml","correct":"from bioc import biocxml"},{"symbol":"brat","correct":"from bioc import brat"},{"symbol":"pubtator","correct":"from bioc import pubtator"},{"note":"Core BioC data structures like `BioCCollection`, `BioCDocument`, etc., are typically accessed via the `bioc.bioc` submodule.","wrong":"from bioc import BioCCollection","symbol":"BioCCollection","correct":"from bioc.bioc import BioCCollection"}],"quickstart":{"code":"from bioc import biocxml, bioc\n\n# Create a simple BioC Collection\ncollection = bioc.BioCCollection()\ncollection.date = '2023-01-01'\ncollection.source = 'Example'\n\ndocument = bioc.BioCDocument()\ndocument.id = '123'\n\npassage = bioc.BioCPassage()\npassage.offset = 0\npassage.text = 'This is a test sentence.'\n\n# Annotate 'test sentence', which starts at character offset 10 of the passage\nannotation = bioc.BioCAnnotation()\nannotation.id = 'T1'\nannotation.text = 'test sentence'\nannotation.add_location(bioc.BioCLocation(offset=10, length=13))\npassage.add_annotation(annotation)\n\ndocument.add_passage(passage)\ncollection.add_document(document)\n\n# Serialize to a BioC XML string\nxml_string = biocxml.dumps(collection, pretty_print=True)\nprint('--- BioC XML ---')\nprint(xml_string)\n\n# Deserialize from a BioC XML string\nloaded_collection = biocxml.loads(xml_string)\nprint('\\n--- Loaded Document IDs ---')\nfor doc in loaded_collection.documents:\n    print(doc.id)\n","lang":"python","description":"This quickstart demonstrates how to create a basic BioC collection programmatically, add a document with a passage and an annotation, and then serialize it to a BioC XML string using `biocxml.dumps`. 
It also shows how to deserialize an XML string back into a BioC collection using `biocxml.loads`."},"warnings":[{"fix":"Update imports from `import bioc` to `from bioc import biocxml` for XML operations, and then use `biocxml.dump()` or `biocxml.load()`.","message":"Direct top-level import of BioC XML functions (e.g., `bioc.dump`, `bioc.load`, `bioc.dumps`, `bioc.loads`) was removed in version 2.0. These functions are now part of the `biocxml` submodule.","severity":"breaking","affected_versions":">=2.0"},{"fix":"Ensure your project runs on Python 3.6 or newer. Upgrade your Python environment if necessary.","message":"Python 2.x is no longer supported. Version 1.2.1 removed support for Python 2, and subsequent versions are Python 3.6+ only.","severity":"breaking","affected_versions":">=1.2.1"},{"fix":"Disregard the 'Development Status' classifier on PyPI; the library is stable and actively developed.","message":"The PyPI project metadata still lists the 'Development Status' as '1 - Planning' (as of v2.1). This is misleading as the library has undergone multiple releases and is actively maintained for production use.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"For XML operations, use `from bioc import biocxml` and then call `biocxml.dump()` or `biocxml.load()`.","cause":"Attempting to use `bioc.dump` or `bioc.load` directly in `bioc` versions 2.x or later.","error":"AttributeError: module 'bioc' has no attribute 'dump'"},{"fix":"The `biocxml` module is part of the `bioc` package. Use `from bioc import biocxml`.","cause":"Trying to import `biocxml` directly as a top-level package or without `from bioc`.","error":"ModuleNotFoundError: No module named 'biocxml'"},{"fix":"Upgrade your Python environment to Python 3.6 or higher. 
`bioc` no longer supports Python 2.","cause":"Running `bioc` code (especially versions 1.2.1+) with a Python 2 interpreter.","error":"SyntaxError: invalid syntax (when running on Python 2.x)"}]}