DeepChem

2.8.0 verified Sat May 09 auth: no python

DeepChem is a Python library for deep learning in drug discovery, quantum chemistry, and the life sciences. It provides molecular featurization, model building, and dataset handling, and supports both TensorFlow and PyTorch backends. The current stable version is 2.8.0; releases are approximately bi-annual.

pip install deepchem
error ModuleNotFoundError: No module named 'deepchem.feat.graph_features'
cause The 'graph_features' submodule was removed in DeepChem 2.7.0, with graph featurizers moved to 'deepchem.feat.graph_data' or directly to 'deepchem.feat'.
fix
Update import to 'from deepchem.feat import MolGraphConvFeaturizer' or use 'from deepchem.feat.graph_data import GraphData'.
error AttributeError: 'GraphData' object has no attribute 'num_node_features'
cause In newer versions of DeepChem, the attribute was renamed or restructured. Old code expecting a direct attribute fails.
fix
Access features via 'graph_data.node_features.shape[1]' instead of 'graph_data.num_node_features'.
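A version-tolerant accessor can hide the difference between the two spellings. This is a minimal sketch, not part of DeepChem: the helper name 'num_node_features' (as a free function) is hypothetical, and it assumes 'node_features' is a 2-D array shaped (num_nodes, num_features). The SimpleNamespace objects are stand-ins so the snippet runs without DeepChem installed.

```python
from types import SimpleNamespace


def num_node_features(graph) -> int:
    """Return the per-node feature width of a GraphData-like object."""
    # Prefer the direct attribute when the installed version provides it.
    direct = getattr(graph, "num_node_features", None)
    if direct is not None:
        return int(direct)
    # Otherwise fall back to the shape of the (num_nodes, num_features) array.
    return int(graph.node_features.shape[1])


# Stand-in objects for illustration only; real code would pass a GraphData instance.
with_attribute = SimpleNamespace(num_node_features=30)
without_attribute = SimpleNamespace(node_features=SimpleNamespace(shape=(5, 30)))
print(num_node_features(with_attribute), num_node_features(without_attribute))
```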
error ImportError: cannot import name 'load_delaney' from 'deepchem.molnet'
cause 'load_delaney' is still exported from 'deepchem.molnet' in 2.8.0, so this error usually points to a mismatched, outdated, or broken installation rather than a removed function.
fix
Confirm the installed version with 'python -c "import deepchem; print(deepchem.__version__)"' and upgrade or reinstall if it is not the release you expect. Then import with 'from deepchem.molnet import load_delaney' or call 'dc.molnet.load_delaney()'.
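Because this error is usually a version mismatch, it can help to read the installed version without importing the package at all. The sketch below uses the standard-library 'importlib.metadata' (Python 3.8+); the helper name 'installed_version' is an illustrative choice, not a DeepChem API.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("deepchem")
if version is None:
    print("deepchem is not installed in this environment")
else:
    print(f"deepchem {version} is installed")
```

If the reported version differs from the one the code was written for, reinstalling in a clean virtual environment is usually simpler than patching imports.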
error ValueError: Graph convolution requires RDKit to be installed.
cause RDKit is a required dependency for molecular featurization, but not automatically installed with pip. Missing RDKit causes this error.
fix
Install RDKit via conda: 'conda install -c conda-forge rdkit' (recommended) or via pip: 'pip install rdkit' (the older 'rdkit-pypi' package name is deprecated).
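A quick pre-flight check can report a missing RDKit before featurization fails. This is a minimal sketch using only the standard library; 'module_available' is a hypothetical helper name, and it only handles top-level module names such as 'rdkit'.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


if module_available("rdkit"):
    print("RDKit found")
else:
    print("RDKit missing: install it with conda or pip before featurizing molecules")
```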
breaking DeepChem 2.4.0 dropped TensorGraph and moved to Keras-based models. Models built with older versions will not work without migration.
fix TensorGraph classes have been removed, so models built on them must be rewritten against the Keras-based model classes.
breaking DeepChem 2.8.0 requires Python <3.12. Installation on Python 3.12 will fail.
fix Use Python 3.7–3.11. Downgrade Python or use a virtual environment with a supported version.
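The supported range can be checked up front, for example in a setup script. The sketch below encodes the 3.7–3.11 bounds stated in this entry; the constant and function names are illustrative.

```python
import sys

MIN_SUPPORTED = (3, 7)       # lowest version listed in this entry
FIRST_UNSUPPORTED = (3, 12)  # installation is reported to fail from here on


def python_supported(version=None) -> bool:
    """Return True if the (major, minor) version falls in the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_SUPPORTED <= major_minor < FIRST_UNSUPPORTED


if not python_supported():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is outside the supported range")
```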
gotcha It is not always clear which backend (TensorFlow or PyTorch) a given model uses. Some models (e.g., GraphConvModel) are TensorFlow only, while others (e.g., GCNModel) exist in both. Check the documentation for backend compatibility.
fix Use the specific model import (e.g., 'from deepchem.models import GCNModel') and ensure the required backend is installed. For PyTorch models, install PyTorch separately.
deprecated The legacy featurizer classes like 'CircularFingerprint' are deprecated in favor of graph-based featurizers (e.g., 'MolGraphConvFeaturizer').
fix Use 'from deepchem.feat import MolGraphConvFeaturizer' instead.
pip install deepchem[gpu]

Minimal example: load Delaney solubility dataset, train a GraphConv model, and evaluate with Pearson R².

import deepchem as dc

# Load a dataset from MoleculeNet
tasks, datasets, transformers = dc.molnet.load_delaney(featurizer='GraphConv')
train_dataset, valid_dataset, test_dataset = datasets

# Define a graph convolutional model
model = dc.models.GraphConvModel(n_tasks=len(tasks), mode='regression', dropout=0.2)

# Fit the model on training data
model.fit(train_dataset, nb_epoch=10)

# Evaluate on test set
metric = dc.metrics.Metric(dc.metrics.pearson_r2_score)
scores = model.evaluate(test_dataset, [metric])
print(scores)