Pymatgen: Python Materials Genomics
Pymatgen (Python Materials Genomics) is a Python library for materials analysis that defines core object representations for crystal structures, molecules, and electronic structure data. It powers the Materials Project and offers extensive tools for materials design, data analysis, and high-throughput computations. The current version is 2026.3.23; releases are frequent, often monthly or bi-monthly.
Warnings
- gotcha In v2026.3.23, `pymatgen` underwent a major internal reorganization: core functionality moved to a separate `pymatgen-core` package, which `pymatgen` now depends on. `pip install pymatgen` should remain fully backwards compatible and provide the same functionality, but keep this split in mind when managing complex dependencies or introspecting installed packages.
- gotcha The `orjson` library became a required dependency as of v2025.5.28 for faster JSON handling. Ensure `orjson` is installed in your environment if you encounter import errors or unexpected behavior related to JSON serialization.
- gotcha The use of `lxml` for `Vasprun` parsing changed around v2025.5.1 and v2025.5.2: it was introduced for speed, then removed shortly afterwards for specific situations. If `Vasprun` parsing is critical to your workflow, parsing performance and dependency requirements may differ depending on the exact `pymatgen` version installed.
- gotcha The behavior of `MPRester.get_entries` regarding `property_data` and `summary_data` was clarified and refined in v2025.4.24. `property_data` is now consistent with the returned entry, while `summary_data` (obtained via a kwarg) is more comprehensive but not always consistent. Users relying on specific data consistency may need to adjust their calls.
- deprecated The internal dependency for BibTeX parsing shifted from `pybtex` to `bibtexparser` in v2025.4.19. While `pymatgen` handles this change internally, users who were directly interacting with `pybtex`-related utilities within `pymatgen` might need to update their code.
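Several of the warnings above concern which distributions are present in an environment. A minimal sketch for checking this, assuming only the standard library's `importlib.metadata`, is given here; the helper name `installed_versions` is hypothetical (it is not part of `pymatgen`), and the distribution names are simply those mentioned in the warnings.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the warnings above.
PACKAGES = ["pymatgen", "pymatgen-core", "orjson", "lxml", "bibtexparser", "pybtex"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration only; not part of pymatgen.
    """
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name:15s} {ver or 'not installed'}")
```

A `None` entry for `orjson` or `pymatgen-core` alongside a recent `pymatgen` would suggest an incomplete or inconsistent installation worth reinstalling.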
Install
pip install pymatgen
Imports
- Structure
from pymatgen.core import Structure
- Lattice
from pymatgen.core import Lattice
- Molecule
from pymatgen.core import Molecule
- MPRester
from pymatgen.ext.matproj import MPRester
- Vasprun
from pymatgen.io.vasp import Vasprun
- Kpoints
from pymatgen.io.vasp.inputs import Kpoints
Quickstart
import os

from pymatgen.core import Structure, Lattice, Species
from pymatgen.ext.matproj import MPRester

# 1. Create a simple crystal structure (e.g., BCC iron)
# The conventional BCC cell holds two atoms: one at the corner and one at the body center.
lattice = Lattice.cubic(2.86)
species = [Species("Fe"), Species("Fe")]
coords = [[0, 0, 0], [0.5, 0.5, 0.5]]  # fractional coordinates
structure = Structure(lattice, species, coords)
print(f"Created structure: {structure.formula} with {structure.num_sites} sites.")

# 2. Use MPRester to fetch data (requires API key)
# Get your API key from materialsproject.org after logging in
# Set it as an environment variable 'MP_API_KEY'
api_key = os.environ.get("MP_API_KEY", "")
if api_key:
    try:
        with MPRester(api_key) as mpr:
            # Fetch entries for a chemical system, e.g., Li-Fe-O
            entries = mpr.get_entries("Li-Fe-O", inc_structure=True, property_data=["band_gap"])
            print(f"Found {len(entries)} entries for Li-Fe-O.")
            if entries:
                first_entry = entries[0]
                # Entries expose their formula through the composition object
                print(f"First entry formula: {first_entry.composition.reduced_formula}")
                if "band_gap" in first_entry.data:
                    print(f"Band gap: {first_entry.data['band_gap']} eV")
    except Exception as e:
        print(f"Error fetching data from Materials Project: {e}")
        print("Ensure MP_API_KEY is valid and that you have network access.")
else:
    print("MP_API_KEY environment variable not found. Skipping MPRester example.")