SELFIES

2.2.0 verified Fri May 01 auth: no python

SELFIES (SELF-referencIng Embedded Strings) is a general-purpose, sequence-based, robust representation of semantically constrained graphs, used primarily for molecular structures in cheminformatics and machine learning. The current version is 2.2.0; new releases appear every few months.

pip install selfies
error AttributeError: module 'selfies' has no attribute 'selfies'
cause Wrong import style: trying to access selfies.selfies.
fix Import the functions directly: from selfies import encoder, decoder
error RuntimeError: The input SMILES could not be parsed due to semantic constraints.
cause The SMILES violates the bond constraints (e.g., hypervalent atoms), and strict=True is the default.
fix Either correct the SMILES or pass strict=False to encoder.
error TypeError: encoder() got an unexpected keyword argument 'strict'
cause Using a version older than 2.0.0, where the strict parameter did not exist.
fix Upgrade selfies: pip install --upgrade selfies
breaking As of v2.0.0, the encoder defaults to strict=True and raises an error for SMILES that violate bond constraints. Previously, such SMILES were silently accepted.
fix Use encoder(smiles, strict=False) to mimic old behavior, or ensure input SMILES are valid.
deprecated Support for Python 3.5-3.6 was dropped in v2.1.0.
fix Upgrade to Python >=3.7.

Basic encoding and decoding of a SMILES string:

from selfies import encoder, decoder

smiles = "CCO"  # ethanol
selfies_str = encoder(smiles)
print(selfies_str)  # [C][C][O]

smiles_back = decoder(selfies_str)
print(smiles_back)  # CCO