DScribe: Machine Learning Descriptors for Atomistic Systems

2.1.2 · active · verified Thu Apr 16

DScribe is a Python package (current version 2.1.2) that turns atomic structures into fixed-size numerical fingerprints, known as descriptors. Descriptors serve as input for machine learning, visualization, and similarity analysis in materials science. The library is actively developed, and releases continue to add new descriptors and derivative functionality. [1, 2, 5]
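To make "fixed-size" concrete, the following is a minimal NumPy sketch of a zero-padded Coulomb matrix, one of the descriptors used in the quickstart. It is not DScribe's implementation: the helper `coulomb_matrix` and the coordinates are hypothetical, distances are used in whatever units are supplied, and the row/column sorting that a `permutation` setting would apply is left out. The point is only that structures with different atom counts map to vectors of the same length.

```python
import numpy as np


def coulomb_matrix(numbers, positions, n_atoms_max):
    """Zero-padded, flattened Coulomb matrix (illustrative helper, not DScribe code)."""
    numbers = np.asarray(numbers, dtype=float)
    positions = np.asarray(positions, dtype=float)
    n = len(numbers)
    if n > n_atoms_max:
        raise ValueError("structure has more atoms than n_atoms_max")

    # Pairwise distances |R_i - R_j|
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder so the division below is defined

    # Off-diagonal: Z_i * Z_j / |R_i - R_j|; diagonal: 0.5 * Z_i ** 2.4
    matrix = np.outer(numbers, numbers) / dist
    np.fill_diagonal(matrix, 0.5 * numbers**2.4)

    # Zero-pad to n_atoms_max x n_atoms_max, then flatten to a fixed-length vector
    padded = np.zeros((n_atoms_max, n_atoms_max))
    padded[:n, :n] = matrix
    return padded.ravel()


# Hypothetical coordinates, for illustration only
water = ([8, 1, 1], [[0.0, 0.0, 0.119], [0.0, 0.763, -0.477], [0.0, -0.763, -0.477]])
h2 = ([1, 1], [[0.0, 0.0, 0.0], [0.0, 0.0, 0.74]])

fp_water = coulomb_matrix(*water, n_atoms_max=3)
fp_h2 = coulomb_matrix(*h2, n_atoms_max=3)
print(fp_water.shape, fp_h2.shape)  # both vectors have length 9
```

The two-atom structure fills only the upper-left 2x2 block; the remaining entries stay zero, which is what keeps the output length independent of the number of atoms.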

Common errors

Warnings

Install

Imports

Quickstart

This quickstart initializes the CoulombMatrix and SOAP descriptors and applies them to single and multiple atomic structures, each represented by an ASE `Atoms` object. It also computes SOAP derivatives. The SOAP setup uses the current parameter names `r_cut`, `n_max`, and `l_max`, along with the `compression` parameter. [2, 3, 6]

import numpy as np
from ase.build import molecule
from dscribe.descriptors import SOAP, CoulombMatrix

# Define atomic structures
samples = [molecule("H2O"), molecule("NO2"), molecule("CO2")]

# Setup CoulombMatrix descriptor
cm_desc = CoulombMatrix(n_atoms_max=3, permutation="sorted_l2")

# Setup SOAP descriptor (modern parameter names; compression is given as a dict with a "mode" key)
soap_desc = SOAP(
    species=["C", "H", "O", "N"],
    r_cut=5,
    n_max=8,
    l_max=6,
    compression={"mode": "crossover", "species_weighting": None},
)

# Create descriptors for a single system
water = samples[0]
coulomb_matrix_h2o = cm_desc.create(water)
soap_h2o = soap_desc.create(water, centers=[0])

print("Coulomb Matrix for H2O:\n", coulomb_matrix_h2o)
print("SOAP for Oxygen in H2O:\n", soap_h2o)

# Create descriptors for multiple systems (can be parallelized)
coulomb_matrices_all = cm_desc.create(samples, n_jobs=2)
oxygen_indices = [np.where(x.get_atomic_numbers() == 8)[0] for x in samples]
oxygen_soap_all = soap_desc.create(samples, oxygen_indices, n_jobs=2)

print("Coulomb Matrices for all samples shape:", coulomb_matrices_all.shape)
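# The samples contain 1, 2, and 2 oxygen atoms, so the result may be a list of
# arrays rather than a single array; report the shape of each entry instead
print("SOAP for Oxygen, shape per sample:", [x.shape for x in oxygen_soap_all])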

# Descriptors can also return derivatives with respect to atomic positions;
# return_descriptor=True returns the descriptor alongside them
der, des = soap_desc.derivatives(samples[0], return_descriptor=True)
print("SOAP derivatives shape:", der.shape)
print("SOAP descriptor shape (returned with derivatives):", des.shape)
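The quickstart relies on the `r_cut`, `n_max`, and `l_max` spellings. Code written for older DScribe releases typically passes `rcut`, `nmax`, and `lmax` instead, which the current constructor rejects as unexpected keyword arguments. A small translation step can ease migration. The helper below is a hypothetical sketch, not part of DScribe, and it covers only these three renames; other options, such as the compression settings, should be checked against the release notes by hand.

```python
# Legacy SOAP keyword names and their current equivalents
# (only the three renames discussed above; not an exhaustive list)
LEGACY_SOAP_NAMES = {"rcut": "r_cut", "nmax": "n_max", "lmax": "l_max"}


def modernize_soap_kwargs(kwargs):
    """Return a copy of kwargs with legacy names translated (hypothetical helper)."""
    updated = {}
    for name, value in kwargs.items():
        new_name = LEGACY_SOAP_NAMES.get(name, name)
        if new_name in updated:
            # Both spellings were supplied; refuse to guess which one wins
            raise ValueError(f"parameter {new_name!r} given more than once")
        updated[new_name] = value
    return updated


legacy = {"species": ["H", "O"], "rcut": 5.0, "nmax": 8, "lmax": 6}
modern = modernize_soap_kwargs(legacy)
print(modern)
```

The translated dictionary can then be unpacked into the constructor, for example `SOAP(**modern)`, once DScribe is installed.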
