tqchem.chem

Functions

is_smiles_string(→ bool)

Check whether a string could be a SMILES string

is_multiline_string(→ bool)

is_existing_file(→ bool)

Try to check whether a string could be an existing file

ase_from_rdkit(→ ase.Atoms)

Convert rdkit.Mol object to ase.Atoms object

ase_to_rdkit(→ rdkit.Chem.Mol)

Convert ase.Atoms object to rdkit.Mol object

ase_from_path(→ ase.Atoms)

ase_from_molecule_file_content(→ ase.Atoms)

Convert a string containing a molecule file's content to ase.Atoms

rdkit_from_smiles(→ rdkit.Chem.Mol)

Convert a smiles string to a rdkit Molecule

ase_from_smiles(→ ase.Atoms)

Convert a smiles string to ase.Atoms object using rdkit

ase_to_xyz_content(→ str)

Return string containing the contents of an xyz file

ase_molecule(→ ase.Atoms)

Smart factory for ase.Atoms object

xyz_contents(→ str)

Given a molecule specification return the contents of the corresponding xyz file

align_molecules(→ list[float])

Aligns molecules to a reference molecule.

align_xyz_strings(→ tuple[list[str], list[float]])

Aligns molecules to a reference molecule.

rdkit_molecules_to_xyz(→ None)

Write list of rdkit molecules to xyz file

adjacencyMatrix(→ numpy.ndarray)

Determine adjacency matrix based on atomic distances

bondMatrix(mol[, radius_cutoff])

Determine bond matrix based on atomic distances

determineBonds(→ numpy.ndarray)

Determine the bonds based on the adjacency matrix

element_color(→ tuple[float, float, float])

Return element color for a given atomic number

wiberg_bond_orders(→ numpy.ndarray)

atoms_collided(→ bool)

True if two atoms are closer than a set cutoff.

adjacency_differs(→ bool)

Return whether two adjacency matrices differ

generate_rdkit_conformers(→ list[int])

Generate conformers with rdkit and return the ids of the generated ones

Module Contents

tqchem.chem.is_smiles_string(string: str) → bool

Check whether a string could be a SMILES string

SMILES strings may contain:

  • Letters of the element symbols

  • b, c, n, o, p, s for aromatic atoms

  • Digits for ring closures

  • . - = # $ : / for bonds

  • [] + - for atom and charge specification

  • () for branching

  • @ for stereocenters
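The character list above can be turned into a simple whitelist test. The following is a minimal sketch, not the tqchem implementation: the helper name `could_be_smiles` and the regular expression are assumptions made for illustration.

```python
import re

# Hypothetical whitelist built from the character list above; the actual
# tqchem check may use a different set of characters or extra rules.
_SMILES_CHARS = re.compile(r"[A-Za-z0-9.\-=#$:/\[\]+()@]+")


def could_be_smiles(string: str) -> bool:
    """Return True if the string is non-empty and uses only whitelisted characters."""
    return bool(string) and _SMILES_CHARS.fullmatch(string) is not None


print(could_be_smiles("c1ccccc1"))            # benzene
print(could_be_smiles("C[C@H](N)C(=O)O"))     # alanine with a stereocenter
print(could_be_smiles("3\nwater\nO 0 0 0"))   # xyz-like content with whitespace
```

A whitelist like this is permissive: a filename such as water.xyz also consists only of allowed characters, which fits the "could be" wording of the description.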

tqchem.chem.is_multiline_string(string: str) → bool

tqchem.chem.is_existing_file(string: str) → bool

Try to check whether a string could be an existing file

An OSError is raised, for example, when the filename is too long. Because the error codes are OS-dependent, any OSError is taken to mean that the string is not a valid file.
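The error handling described above might look roughly like the following sketch. The helper name `looks_like_existing_file` is hypothetical, and catching ValueError (raised for strings with embedded null bytes) is an extra assumption beyond what the description states.

```python
from pathlib import Path


def looks_like_existing_file(string: str) -> bool:
    """Return True if `string` names an existing regular file, False otherwise."""
    try:
        return Path(string).is_file()
    except (OSError, ValueError):
        # OSError: e.g. filename too long; the exact error code is OS dependent,
        # so any OSError is read as "not a valid file".
        # ValueError: assumption of this sketch, covers embedded null bytes.
        return False


print(looks_like_existing_file("a" * 5000))         # far too long for a filename
print(looks_like_existing_file("no/such/file.xyz"))
```

Depending on the Python version, Path.is_file may already swallow some of these errors itself; catching them explicitly keeps the behaviour uniform.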

tqchem.chem.ase_from_rdkit(molecule: rdkit.Chem.Mol, conformer: rdkit.Chem.Conformer | int = -1) → ase.Atoms

Convert rdkit.Mol object to ase.Atoms object

Note: the molecule needs to be “embedded” or otherwise have 3D coordinates.

tqchem.chem.ase_to_rdkit(molecule: ase.Atoms, charge: int = 0) → rdkit.Chem.Mol

Convert ase.Atoms object to rdkit.Mol object

tqchem.chem.ase_from_path(path: pathlib.Path, format_: str = None) → ase.Atoms

tqchem.chem.ase_from_molecule_file_content(string: str, format_: str = None) → ase.Atoms

Convert a string containing a molecule file’s content to ase.Atoms

The format can be .xyz, .mol, or .sdf; if none is specified, .xyz is assumed.

tqchem.chem.rdkit_from_smiles(smiles: str) → rdkit.Chem.Mol

Convert a smiles string to a rdkit Molecule

tqchem.chem.ase_from_smiles(smiles: str) → ase.Atoms

Convert a smiles string to ase.Atoms object using rdkit

tqchem.chem.ase_to_xyz_content(molecule: ase.Atoms, comment: str = '') → str

Return string containing the contents of an xyz file

tqchem.chem.ase_molecule(input_: pathlib.Path | str | ase.Atoms | rdkit.Chem.Mol, format_: str = None) → ase.Atoms

Smart factory for ase.Atoms object

Parameters:
  • input_ (Path | str | ase.Atoms | Chem.Mol) – Molecule provided as a Path object, an ase.Atoms object, an rdkit.Chem.Mol object, or a string containing the contents of an xyz file, a SMILES string, or a filename

  • format_ (str, optional) – Format string used when a filename, a Path, or the content of a molecule file is provided

Returns:

molecule

Return type:

ase.Atoms
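To illustrate how a factory of this kind can tell the accepted inputs apart, here is a minimal sketch of one plausible dispatch order. It is an assumption, not the documented behaviour of ase_molecule: the helper `classify_molecule_input`, its labels, and the order of the checks are all hypothetical.

```python
from pathlib import Path


def classify_molecule_input(input_) -> str:
    """Label the kind of molecule specification (hypothetical helper)."""
    if isinstance(input_, Path):
        return "path"
    if not isinstance(input_, str):
        # e.g. an ase.Atoms or rdkit.Chem.Mol object handed in directly
        return "object"
    if "\n" in input_.strip():
        # molecule file contents span several lines
        return "file content"
    try:
        if Path(input_).is_file():
            return "filename"
    except (OSError, ValueError):
        pass
    # assumption of this sketch: anything else is tried as a SMILES string
    return "smiles"


print(classify_molecule_input(Path("water.xyz")))
print(classify_molecule_input("3\ncomment\nO 0 0 0\nH 0 0 1\nH 0 1 0\n"))
print(classify_molecule_input("c1ccccc1"))
```

Checking for multi-line content before touching the file system avoids asking the OS about strings that cannot be filenames in the first place.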

tqchem.chem.xyz_contents(input_: pathlib.Path | str | ase.Atoms | rdkit.Chem.Mol, comment: str = '') → str

Given a molecule specification return the contents of the corresponding xyz file

tqchem.chem.align_molecules(molecules: list[rdkit.Chem.Mol], reference: rdkit.Chem.Mol, to_previous: bool = False) → list[float]

Aligns molecules to a reference molecule.

Parameters:
  • molecules (list[rdkit.Chem.Mol]) – list of molecules to align

  • reference (rdkit.Chem.Mol) – Molecule to align to

  • to_previous (bool, optional) – Align to previous molecule

Returns:

rmsds – list of root mean square deviations from the reference

Return type:

list[float]

tqchem.chem.align_xyz_strings(xyzs: list[str], reference: str, to_previous: bool = False) → tuple[list[str], list[float]]

Aligns molecules to a reference molecule.

Parameters:
  • xyzs (list[str]) – list of strings containing xyz data

  • reference (str) – String containing xyz data for the reference to align against

  • to_previous (bool, optional) – Align to previous molecule

Returns:

  • aligned_xyzs (list[str]) – aligned xyz strings

  • rmsds (list[float]) – list of root mean square deviations from the reference
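Both alignment functions report root mean square deviations. As a reminder of what that number measures, the sketch below computes the RMSD between two coordinate sets that are assumed to be already aligned and to list their atoms in the same order. The function `rmsd` is illustrative only; the fitting step (rotation and translation) performed by the alignment itself is not shown.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    # average the squared displacement per atom, then take the square root
    return math.sqrt(squared / len(coords_a))


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # every atom moved by 1 along z
print(rmsd(reference, reference))
print(rmsd(reference, shifted))
```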

tqchem.chem.rdkit_molecules_to_xyz(molecules: list[rdkit.Chem.Mol], path: str | pathlib.Path) → None

Write list of rdkit molecules to xyz file

Parameters:
  • molecules (list[rdkit.Chem.Mol]) – list of molecules to write to file

  • path (str | Path) – Path to the xyz file

tqchem.chem.adjacencyMatrix(mol: ase.Atoms) → numpy.ndarray

Determine adjacency matrix based on atomic distances

Parameters:

mol (ase.atoms.Atoms) – Object representing the molecule

Returns:

mat – Adjacency matrix for each pair of atoms

Return type:

numpy.ndarray

tqchem.chem.bondMatrix(mol: ase.Atoms, radius_cutoff: float = ...)

Determine bond matrix based on atomic distances

Parameters:
  • mol (ase.atoms.Atoms) – Object representing the molecule

  • radius_cutoff (float) – Cutoff for the bond length

Returns:

mat – matrix which is 1 for bond between atoms and 0 otherwise

Return type:

numpy.ndarray
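A distance-based bond test, as used conceptually by adjacencyMatrix and bondMatrix, can be sketched as follows. The criterion used here (distance below a scaled sum of covalent radii) is a common choice but an assumption; the source does not state the exact rule or how radius_cutoff enters it. The function `bond_matrix`, the `scale` parameter, and the small radii table are hypothetical, and nested lists stand in for the numpy array.

```python
import math

# Approximate covalent radii in Angstrom (illustrative values for this sketch).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def bond_matrix(symbols, positions, scale=1.2):
    """Return a nested list with 1 where two atoms are close enough to bond."""
    n = len(symbols)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # assumed criterion: distance below the scaled sum of covalent radii
            limit = scale * (COVALENT_RADII[symbols[i]] + COVALENT_RADII[symbols[j]])
            if math.dist(positions[i], positions[j]) < limit:
                matrix[i][j] = matrix[j][i] = 1
    return matrix


# Water: O is bonded to both H atoms, the H atoms are not bonded to each other.
water = bond_matrix(
    ["O", "H", "H"],
    [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)],
)
print(water)
```

The resulting matrix is symmetric with a zero diagonal, which matches the description of a matrix that is 1 for a bond between atoms and 0 otherwise.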

tqchem.chem.determineBonds(mol: ase.Atoms) → numpy.ndarray

Determine the bonds based on the adjacency matrix

tqchem.chem.element_color(atomic_number: int) → tuple[float, float, float]

Return element color for a given atomic number

tqchem.chem.wiberg_bond_orders(molecule: ase.Atoms) → numpy.ndarray

tqchem.chem.atoms_collided(molecule: ase.Atoms, cutoff: float = 0.5) → bool

True if two atoms are closer than a set cutoff.
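The collision test amounts to asking whether any pair of atoms is closer than the cutoff. A minimal sketch on plain coordinate tuples, assuming positions in Angstrom as used by ase; the helper name `any_atoms_closer_than` is hypothetical:

```python
import math
from itertools import combinations


def any_atoms_closer_than(positions, cutoff=0.5):
    """Return True if any pair of positions is closer than `cutoff`."""
    return any(math.dist(p, q) < cutoff for p, q in combinations(positions, 2))


print(any_atoms_closer_than([(0.0, 0.0, 0.0), (0.3, 0.0, 0.0)]))  # 0.3 apart
print(any_atoms_closer_than([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))  # 1.0 apart
```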

tqchem.chem.adjacency_differs(adjacency1: numpy.ndarray, adjacency2: numpy.ndarray) → bool

Return whether two adjacency matrices differ

tqchem.chem.generate_rdkit_conformers(molecule: rdkit.Chem.Mol, n_conformers: int, threshold: float = 0.125, keep_current: bool = False, threads: int = 4, seed: int = 42) → list[int]

Generate conformers with rdkit and return the ids of the generated ones

The conformers are stored inside the molecule. The function returns a list of the ids of the conformers generated during this call.

Parameters:
  • molecule (Chem.Mol) – rdkit molecule object

  • n_conformers (int) – Number of conformers to generate. If threshold > 0, 10 times as many are generated in order to obtain at least the specified number

  • threshold (float, default=0.125) – RMSD threshold below which we consider conformers identical

  • keep_current (bool, default=False) – Keep conformers currently stored in molecule

  • threads (int, default=4) – Number of threads to use in the conformer generation

  • seed (int, default=42) – Random seed

Returns:

list of conformer ids for the conformers generated

Return type:

list[int]
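The RMSD threshold explains why more conformers are generated than requested: candidates that lie too close to an already accepted conformer are treated as identical and dropped. The sketch below shows a greedy version of such pruning on toy data. It is an illustration under assumptions, not the routine used by tqchem or rdkit; the function `prune_similar` and the stand-in distance function are hypothetical.

```python
def prune_similar(items, distance, threshold=0.125):
    """Greedily keep items that are at least `threshold` away from all kept ones."""
    kept = []
    for item in items:
        # an item closer than `threshold` to a kept one counts as identical
        if all(distance(item, other) >= threshold for other in kept):
            kept.append(item)
    return kept


# Toy example: numbers stand in for conformers, absolute difference for RMSD.
candidates = [0.00, 0.05, 0.20, 0.21, 0.50]
unique = prune_similar(candidates, lambda a, b: abs(a - b))
print(unique)
```

In this toy run five candidates shrink to three, which shows why oversampling helps to end up with at least n_conformers distinct structures.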