mmCIF Core Access Library

1.1.0 · active · verified Sat Apr 11

The `mmcif` Python library (version 1.1.0) provides an API for reading, manipulating, and exporting macromolecular Crystallographic Information File (mmCIF) and BinaryCIF data. Developed by the RCSB PDB, it includes native Python functionality and uses pybind11 wrappers around a C++ core library for accelerated I/O. The project is actively maintained.

Quickstart

This quickstart reads an mmCIF file from a URL, takes its first data block, and prints two attributes from each row of the `entity` category. It uses the `MarshalUtil` class, which is imported from the separate `rcsb.utils.io` package rather than from `mmcif` itself.

from rcsb.utils.io.MarshalUtil import MarshalUtil

# Create a MarshalUtil instance for I/O operations
mU = MarshalUtil()

# Define a public mmCIF file URL (e.g., from RCSB PDB)
mmcif_url = "https://files.rcsb.org/download/1ema.cif"

# Load data from the URL; doImport() accepts local paths as well as URLs.
# fmt="mmcif" asks for a list of data containers, one per data block.
dataContainerList = mU.doImport(mmcif_url, fmt="mmcif")

if dataContainerList:
    # An mmCIF file can contain multiple data blocks; typically, we access the first one
    dataContainer = dataContainerList[0]

    print(f"Data block ID: {dataContainer.getName()}")

    # Access a specific data category, e.g., '_entity'
    entity_category = dataContainer.getObj("entity")

    if entity_category:
        print(f"\nFound {entity_category.getRowCount()} entities:")
        for i in range(entity_category.getRowCount()):
            pdbx_description = entity_category.getValue("pdbx_description", i)
            type_val = entity_category.getValue("type", i)
            print(f"  - Entity {i+1}: Type='{type_val}', Description='{pdbx_description}'")
    else:
        print("'_entity' category not found in the mmCIF file.")
else:
    print(f"Failed to load data from {mmcif_url}. Please check the URL and network connection.")
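
The quickstart walks a data block, then a category, then rows of attribute values. To make that layout concrete without a network connection or any installed package, here is a minimal standard-library sketch that parses one `loop_` category from a hand-written snippet. It is not the `mmcif` parser and does not use its API: the `SAMPLE` text and the `parse_single_loop` helper are hypothetical, and the helper assumes one data block, one loop, one row per line, and no multi-line (semicolon-delimited) values.

```python
import shlex

# Hypothetical hand-written excerpt; not copied from any real PDB entry.
SAMPLE = """\
data_DEMO
loop_
_entity.id
_entity.type
_entity.pdbx_description
1 polymer 'EXAMPLE PROTEIN'
2 water water
"""


def parse_single_loop(text):
    """Return (block_name, category_name, attribute_names, rows).

    Simplified illustration only: handles a single data block containing a
    single loop_ with one row per line and optionally quoted values.
    """
    block_name = None
    category_name = None
    attributes = []
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "loop_":
            continue
        if line.startswith("data_"):
            # "data_DEMO" names the data block "DEMO".
            block_name = line[len("data_"):]
        elif line.startswith("_"):
            # "_entity.type" is attribute "type" of category "entity".
            category_name, attribute = line[1:].split(".", 1)
            attributes.append(attribute)
        else:
            # shlex.split keeps quoted values with spaces as one token.
            values = shlex.split(line)
            rows.append(dict(zip(attributes, values)))
    return block_name, category_name, attributes, rows


block, category, attrs, rows = parse_single_loop(SAMPLE)
print(f"Data block ID: {block}, category: {category}")
for row in rows:
    print(f"  - Entity {row['id']}: Type='{row['type']}', "
          f"Description='{row['pdbx_description']}'")
```

In the quickstart, the library's container and category objects play the roles of `block` and `rows` here, with `getValue(attribute, rowIndex)` standing in for the dictionary lookup.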
