Bioregistry
The Bioregistry is an integrative, open, community-driven meta-registry of databases, ontologies, and other nomenclature resources in the life sciences. It provides a Python package for common tasks such as metadata lookup, CURIE expansion, and URI contraction. The library is actively maintained and released frequently; the current version is 0.13.40.
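To make "CURIE expansion" and "URI contraction" concrete, here is a minimal, library-free sketch of the two operations. The `PREFIX_MAP` dictionary and the `expand` and `contract` helpers are hypothetical names used for illustration only; they are not part of the Bioregistry API, which works from a far larger, curated prefix map.

```python
from typing import Dict, Optional

# Hypothetical two-entry prefix map, for illustration only.
PREFIX_MAP: Dict[str, str] = {
    "chebi": "http://purl.obolibrary.org/obo/CHEBI_",
    "go": "http://purl.obolibrary.org/obo/GO_",
}


def expand(curie: str) -> Optional[str]:
    """Expand 'prefix:identifier' to a URI, or return None if the prefix is unknown."""
    prefix, sep, identifier = curie.partition(":")
    base = PREFIX_MAP.get(prefix)
    if not sep or base is None:
        return None
    return base + identifier


def contract(uri: str) -> Optional[str]:
    """Contract a URI back to 'prefix:identifier', or return None if no base matches."""
    for prefix, base in PREFIX_MAP.items():
        if uri.startswith(base):
            return f"{prefix}:{uri[len(base):]}"
    return None


uri = expand("chebi:138488")
print(uri)
print(contract(uri))
```

Both helpers return `None` instead of raising, so callers are expected to check the result before using it.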
Common errors
- AttributeError: 'NoneType' object has no attribute 'name' (or similar for other attributes)
  cause: Accessing an attribute (e.g., `name`, `homepage`) on the `None` returned by `bioregistry.get_resource()` or `bioregistry.normalize_prefix()` when the requested prefix is not recognized by the Bioregistry.
  fix: Check whether the result of `br.get_resource()` or `br.normalize_prefix()` is `None` before accessing its attributes. For example, assign `entry = br.get_resource('unknown_prefix')` and read `entry.name` only when `entry is not None`.
- Failed to resolve CURIE 'prefix:invalid_id' / No URI found for 'prefix:invalid_id'
  cause: The identifier part of the CURIE ('invalid_id') does not match the regular expression pattern defined for the 'prefix' in the Bioregistry, or no providers are configured to generate a URI for that prefix.
  fix: Verify the identifier's format against the expected pattern, which can sometimes be found in the resource's metadata via `br.get_resource(prefix).pattern`. Confirm that the prefix itself is valid and has registered providers for resolution.
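The `None` check described above can be wrapped in a small helper. This is a minimal sketch: `describe_prefix` is a hypothetical function name, and `fake_get_resource` is a stand-in used so the example runs without the package installed. In real use you would pass `bioregistry.get_resource` as the lookup.

```python
from types import SimpleNamespace
from typing import Any, Callable, Optional


def describe_prefix(prefix: str, get_resource: Callable[[str], Optional[Any]]) -> str:
    """Return a one-line description, or a fallback message for unknown prefixes."""
    entry = get_resource(prefix)
    if entry is None:
        # Unknown prefix: do not touch .name or .homepage.
        return f"{prefix}: not found"
    return f"{prefix}: {entry.name} ({entry.homepage})"


# Hypothetical stand-in for bioregistry.get_resource (assumption: returns None when unknown).
_FAKE_ENTRIES = {
    "chebi": SimpleNamespace(name="ChEBI", homepage="https://www.ebi.ac.uk/chebi"),
}


def fake_get_resource(prefix: str) -> Optional[SimpleNamespace]:
    return _FAKE_ENTRIES.get(prefix)


print(describe_prefix("chebi", fake_get_resource))
print(describe_prefix("unknown_prefix", fake_get_resource))
```

Passing the lookup in as a parameter keeps the guard testable offline and makes the failure path explicit instead of relying on an `AttributeError`.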
Warnings
- breaking Changes in Bioregistry's internal canonicalization for certain non-biological prefixes (e.g., 'rdf') can lead to unexpected case changes (e.g., 'RDF' instead of 'rdf') in merged prefix maps, affecting downstream tools that consume Bioregistry's output and assume lowercase canonicalization for such prefixes.
- gotcha The Bioregistry does not fully cover or align with all external registries due to varying quality standards, lack of metadata, or resources being decommissioned. This means some prefixes from external sources may not be present or have complete metadata.
- gotcha CURIEs can fail to resolve or validate for three main reasons: the prefix is not registered, the identifier does not match the validation pattern, or no providers are available for that prefix. In these cases, resolution functions such as `get_iri()` typically return `None` rather than a URI, so check the result before using it.
- gotcha Some ontology identifiers embed redundant prefixes (e.g., `GO:GO:0006915`). Bioregistry handles these cases, but this can be a source of confusion as different registries might manage these 'banana-style' identifiers differently.
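The three failure reasons and the redundant-prefix case from the list above can be told apart with a few explicit checks. The sketch below runs against a hypothetical in-memory `MINI_REGISTRY` with `example.org` URLs; `strip_banana` and `diagnose` are illustrative names, not Bioregistry functions, and the real library's handling may differ in detail.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical miniature registry; "$1" marks where the identifier is substituted.
MINI_REGISTRY: Dict[str, dict] = {
    "chebi": {"pattern": r"^\d+$", "uri_format": "https://example.org/chebi/$1"},
    "go": {"pattern": r"^\d{7}$", "uri_format": "https://example.org/go/$1"},
    "noprovider": {"pattern": r"^\w+$", "uri_format": None},
}


def strip_banana(prefix: str, identifier: str) -> str:
    """Drop a redundant embedded prefix, e.g. 'GO:0006915' becomes '0006915'."""
    banana = prefix + ":"
    if identifier.lower().startswith(banana):
        return identifier[len(banana):]
    return identifier


def diagnose(curie: str) -> Tuple[Optional[str], str]:
    """Return (uri, reason); uri is None when the CURIE cannot be resolved."""
    prefix, _, identifier = curie.partition(":")
    prefix = prefix.lower()
    entry = MINI_REGISTRY.get(prefix)
    if entry is None:
        return None, "prefix not registered"
    identifier = strip_banana(prefix, identifier)
    if not re.match(entry["pattern"], identifier):
        return None, "identifier does not match pattern"
    if entry["uri_format"] is None:
        return None, "no provider available"
    return entry["uri_format"].replace("$1", identifier), "ok"


for curie in ["GO:GO:0006915", "chebi:abc", "foo:1", "noprovider:x1"]:
    print(curie, diagnose(curie))
```

Returning the reason alongside the result makes it easier to decide whether to fix the input, report a missing prefix, or fall back to another resolver.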
Install
- pip install bioregistry
Imports
- bioregistry
import bioregistry
- br
import bioregistry as br
Quickstart
import bioregistry as br
# Get metadata for a resource
taxonomy_entry = br.get_resource('taxonomy')
print(f"Taxonomy Name: {taxonomy_entry.name}")
print(f"Taxonomy Homepage: {taxonomy_entry.homepage}")
# Normalize a prefix
normalized_ec = br.normalize_prefix('ec-code')
print(f"'ec-code' normalized to: {normalized_ec}")
normalized_pubchem = br.normalize_prefix('pubchem')
print(f"'pubchem' normalized to: {normalized_pubchem}")
# Resolve a CURIE to a URI (get_iri returns None if no URI can be built)
curie = 'chebi:138488'
prefix, identifier = curie.split(':', 1)
uri = br.get_iri(prefix, identifier)
print(f"Resolved {curie} to: {uri}")