{"id":1694,"library":"rdkit","title":"RDKit","description":"RDKit is a free, open-source toolkit for chemoinformatics, combining chemistry with computer science. It provides functionalities for analyzing molecules, predicting chemical properties, visualizing structures, and preparing data for drug discovery. The library is primarily written in C++ with extensive Python bindings. It maintains a regular release cadence with two major releases per year (typically March/April and September/October) and monthly patch releases, indicating active development and maintenance.","status":"active","version":"2026.3.1","language":"en","source_language":"en","source_url":"https://github.com/kuelumbus/rdkit-pypi","tags":["chemoinformatics","cheminformatics","chemistry","molecular modeling","drug discovery","molecules","smiles"],"install":[{"cmd":"pip install rdkit","lang":"bash","label":"Using pip"},{"cmd":"conda create -n my_rdkit_env -c conda-forge rdkit\nconda activate my_rdkit_env","lang":"bash","label":"Using Conda (recommended for environments)"}],"dependencies":[{"reason":"Commonly required for numerical operations and data handling, often implicitly pulled by RDKit functionalities.","package":"numpy","optional":true},{"reason":"Required for image generation and visualization features like `rdkit.Chem.Draw.MolToImage()`.","package":"Pillow","optional":true}],"imports":[{"note":"The primary module for molecular manipulation, reading/writing, and properties.","symbol":"Chem","correct":"from rdkit import Chem"},{"note":"Contains more advanced or less frequently used functionality (e.g., 2D->3D generation, force fields). 
Importing from `rdkit.Chem` is correct; a direct `from rdkit import AllChem` is incorrect.","wrong":"from rdkit import AllChem","symbol":"AllChem","correct":"from rdkit.Chem import AllChem"},{"note":"Module for molecular visualization and drawing.","symbol":"Draw","correct":"from rdkit.Chem import Draw"},{"note":"Used for working with molecular fingerprints and similarity calculations.","symbol":"DataStructs","correct":"from rdkit import DataStructs"}],"quickstart":{"code":"from rdkit import Chem\nfrom rdkit.Chem import Draw\n\n# Create a molecule from a SMILES string\nsmiles_string = \"CCO\"\nmolecule = Chem.MolFromSmiles(smiles_string)\n\nif molecule is not None:\n    print(f\"Successfully created molecule from SMILES: {smiles_string}\")\n    print(f\"Number of heavy atoms: {molecule.GetNumHeavyAtoms()}\")\n    \n    # Optionally add hydrogens for better geometry or calculations\n    mol_with_hs = Chem.AddHs(molecule)\n    print(f\"Total number of atoms (including Hs): {mol_with_hs.GetNumAtoms()}\")\n\n    # Visualize the molecule (requires Pillow installed)\n    # img = Draw.MolToImage(molecule)\n    # img.show() # Uncomment to display image\nelse:\n    print(f\"Failed to create molecule from SMILES: {smiles_string}\")\n    print(\"This might happen for invalid SMILES strings or if sanitization fails.\")","lang":"python","description":"This quickstart demonstrates how to create a molecule object from a SMILES string, retrieve basic properties like atom counts, and highlights the importance of adding explicit hydrogens for certain applications. It also shows how to visualize the molecule using `rdkit.Chem.Draw`."},"warnings":[{"fix":"Update your `pip install` commands and `requirements.txt` to use `pip install rdkit`.","message":"The PyPI package name for RDKit changed from `rdkit-pypi` to `rdkit`. 
Older installations or `requirements.txt` files might still refer to `rdkit-pypi`.","severity":"breaking","affected_versions":"<=2022.09.5 (for `rdkit-pypi`), all current versions (for new `rdkit` package)"},{"fix":"Always check whether the returned molecule is `None` before proceeding: assign `mol = Chem.MolFromSmiles(smi)`, then guard later calls with `if mol is not None:`.","message":"`Chem.MolFromSmiles()` and similar parsing functions return `None` on failure (e.g., for invalid SMILES strings or sanitization errors) instead of raising an exception. Calling methods on the resulting `None` raises `AttributeError`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Use `mol = Chem.AddHs(mol)` to add explicit hydrogens after creating a molecule; remove them later with `mol = Chem.RemoveHs(mol)` if needed. Both functions return a new molecule rather than modifying the input.","message":"RDKit molecules store hydrogens implicitly by default, so `GetNumAtoms()` typically counts only heavy atoms. For 3D conformer generation, force-field calculations, or full atom counts, explicit hydrogens usually need to be added.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Review the detailed release notes for specific versions when upgrading, especially for applications sensitive to molecular representations or calculated properties. Pin RDKit versions for reproducibility in production or research.","message":"Major releases may introduce backwards-incompatible changes, particularly in stereochemistry perception, MCS (Maximum Common Substructure) algorithms, canonicalization, ring finding, and default conformer generation parameters (e.g., ETKDG). These changes, while improving accuracy, can lead to different results or require code adjustments.","severity":"breaking","affected_versions":">=2023.03, >=2023.09, >=2025.09.1"},{"fix":"Encapsulate molecule creation and sanitization in `try-except` blocks, or implement robust filtering. Understand common issues like pyrrolic vs. 
pyridinic nitrogens which often cause sanitization failures.","message":"Molecule sanitization is a critical step, and `Chem.SanitizeMol()` can raise `KekulizeException` or `AtomValenceException` for chemically invalid structures, particularly aromatic nitrogens with a missing hydrogen or atoms with impossible valences. An unhandled exception halts processing, whereas parsers that sanitize by default (such as `Chem.MolFromSmiles()`) return `None` for the same structures.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}