Python utilities for PDBx/mmCIF storage model
mmcif-pdbx provides a simple, pure Python interface for working with PDBx/mmCIF (macromolecular Crystallographic Information File) data. It parses mmCIF files into Python objects and serializes those objects back to mmCIF text. The library is derived from Python examples provided by the wwPDB and is currently at version 2.0.1; releases are published periodically to add features and fix issues.
Warnings
- breaking: Version 1.0.0 introduced significant API changes that break compatibility with versions 0.*. These changes include PEP8-compliant class and function naming and a simplified module structure.
- gotcha: This `mmcif-pdbx` package provides a pure Python interface. For higher performance or more comprehensive features, especially those involving C/C++ acceleration for I/O, consider `rcsb/py-mmcif` (PyPI: `mmcif`), which is described as the 'canonical mmCIF Python package'.
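Because the 1.0.0 release changed names across the API, code that has to run against more than one installation may want to check which generation is present. The following is a minimal sketch using only the standard library. The helper names (`major_version`, `uses_pep8_api`, `installed_mmcif_pdbx_version`) are hypothetical and not part of mmcif-pdbx, and the check assumes the version string starts with an integer major component.

```python
from importlib import metadata


def major_version(version_string):
    """Return the leading integer of a version string such as '2.0.1'."""
    return int(version_string.split(".")[0])


def uses_pep8_api(version_string):
    """True for releases at or after 1.0.0, where PEP8-style names apply."""
    return major_version(version_string) >= 1


def installed_mmcif_pdbx_version():
    """Return the installed mmcif-pdbx version string, or None if absent."""
    try:
        return metadata.version("mmcif-pdbx")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_mmcif_pdbx_version()
    if version is None:
        print("mmcif-pdbx is not installed")
    else:
        print(version, "PEP8 API" if uses_pep8_api(version) else "pre-1.0 API")
```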
Install
pip install mmcif-pdbx
Imports
- PdbxReader
from pdbx.reader import PdbxReader
- PdbxWriter
from pdbx.writer import PdbxWriter
- DataContainer
from pdbx.containers import DataContainer
- load
from pdbx import load
- loads
from pdbx import loads
- dump
from pdbx import dump
- dumps
from pdbx import dumps
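The `PdbxReader` and `PdbxWriter` classes listed above work on file-like objects, while `load`/`loads` and `dump`/`dumps` are convenience functions for the same task. As a rough sketch, assuming `PdbxReader(fp).read(list)` appends parsed containers to the list it is given and `PdbxWriter(fp).write(list)` serializes them, a round trip through in-memory buffers might look like this. The `round_trip` helper and the `SAMPLE` text are illustrative only.

```python
import io

# Illustrative mmCIF text: one data block with a single key-value item.
SAMPLE = """data_demo
_entry.id 1ABC
"""


def round_trip(text):
    """Parse mmCIF text with PdbxReader, then write it back with PdbxWriter."""
    # Imported inside the function so this sketch can be loaded even where
    # mmcif-pdbx is not installed.
    from pdbx.reader import PdbxReader
    from pdbx.writer import PdbxWriter

    containers = []  # assumption: read() appends DataContainer objects here
    PdbxReader(io.StringIO(text)).read(containers)

    buffer = io.StringIO()
    PdbxWriter(buffer).write(containers)
    return containers, buffer.getvalue()


if __name__ == "__main__":
    parsed, written = round_trip(SAMPLE)
    print(f"Parsed {len(parsed)} data block(s)")
    print(written)
```

For real files, the same calls apply with an object returned by `open(...)` in place of `io.StringIO`.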
Quickstart
from pdbx import loads, dumps

# Example mmCIF data as a string
mmcif_data = '''
data_testblock
_entry.id test
loop_
_atom_site.id
_atom_site.type_symbol
_atom_site.label_atom_id
1 C CA
2 O O
'''

# Parse the mmCIF string into a list of data containers
data_containers = loads(mmcif_data)

# Access data (assuming one data block)
if data_containers:
    data_block = data_containers[0]
    print(f"Data block ID: {data_block.name}")

    # Access a category (PEP8-style names, as used since version 1.0.0)
    atom_site_category = data_block.get_object('atom_site')
    if atom_site_category is not None:
        print("\nAtom Site Category:")
        for i in range(atom_site_category.row_count):
            atom_id = atom_site_category.get_value('id', i)
            atom_type = atom_site_category.get_value('type_symbol', i)
            print(f"  ID: {atom_id}, Type: {atom_type}")
    else:
        print("Atom_site category not found.")

    # Modify data (example: add a new item to the entry category)
    entry_category = data_block.get_object('entry')
    if entry_category is not None:
        # The attribute must exist before it can be set; the value is the first argument
        entry_category.append_attribute('new_item')
        entry_category.set_value('new_value', 'new_item', 0)

    # Serialize the modified data back to a string
    modified_mmcif_data = dumps(data_containers)
    print("\nModified mmCIF data:\n", modified_mmcif_data)
else:
    print("No data containers found.")
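Data does not have to come from a parsed file. The sketch below builds a data block from scratch and serializes it. It assumes that `DataCategory` lives in `pdbx.containers` next to `DataContainer`, and that `append_attribute`, `append`, and the container's `append` behave as in the wwPDB examples this library is derived from. The `build_demo_block` helper is hypothetical.

```python
def build_demo_block():
    """Create a data block with one looped category and return it as mmCIF text."""
    # Imported inside the function so this sketch can be loaded even where
    # mmcif-pdbx is not installed.
    from pdbx import dumps
    from pdbx.containers import DataCategory, DataContainer

    block = DataContainer("demo")

    # Declare the category's attributes first, then append one row per atom.
    atoms = DataCategory("atom_site")
    for attribute in ("id", "type_symbol", "label_atom_id"):
        atoms.append_attribute(attribute)
    atoms.append(["1", "C", "CA"])
    atoms.append(["2", "O", "O"])

    block.append(atoms)
    return dumps([block])


if __name__ == "__main__":
    print(build_demo_block())
```

Each row is a list whose order matches the order in which the attributes were declared.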