Merge pull request #736 from padix-key/rdkit
Add interface to RDKit
Showing 19 changed files with 941 additions and 36 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
:sd_hide_title: true | ||
|
||
.. include:: /tutorial/preamble.rst | ||
|
||
########################## | ||
``interface`` subpackage | ||
########################## | ||
|
||
Connecting the ecosystem - The ``interface`` subpackage | ||
======================================================= | ||
|
||
.. currentmodule:: biotite.interface | ||
|
||
In the last section, we learned that :mod:`biotite.application` encapsulates entire | ||
external application runs with subsequent calls of ``start()`` and ``join()``. | ||
In contrast, :mod:`biotite.interface` provides flexible interfaces to other Python | ||
packages in the bioinformatics ecosystem. | ||
Its purpose is to convert between native Biotite objects, such as :class:`.AtomArray` | ||
and :class:`.Sequence`, and the corresponding objects in the respective interfaced | ||
package. | ||
Each interface is located in a separate subpackage with the same name as the | ||
interfaced package. | ||
For example, the interface to ``rdkit`` is placed in the subpackage | ||
:mod:`biotite.interface.rdkit`. | ||
|
||
.. note:: | ||
|
||
As with :mod:`biotite.application`, the interfaced Python packages are not | ||
dependencies of the ``biotite`` package. | ||
Hence, they need to be installed separately. | ||
|
||
The following chapters will give you an overview of the different interfaced packages. | ||
|
||
.. toctree:: | ||
:maxdepth: 1 | ||
:hidden: | ||
|
||
rdkit |
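
The note in the tutorial page above says that interfaced packages are not dependencies of ``biotite`` and must be installed separately. A minimal sketch of how a script might check for such an optional package before importing the interface is shown here. The helper name `is_package_available` is hypothetical and not part of Biotite, and the `pip install rdkit` hint assumes the package is installed from PyPI.

```python
import importlib.util


def is_package_available(name):
    """Return True if the top-level package `name` can be found for import.

    Hypothetical helper for illustration; only top-level names are supported.
    """
    return importlib.util.find_spec(name) is not None


if is_package_available("rdkit"):
    print("rdkit found: biotite.interface.rdkit can be imported")
else:
    print("rdkit missing: install it separately, e.g. 'pip install rdkit'")
```

Checking with `find_spec()` avoids actually importing the package, so the check stays cheap even when the package is large.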
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
.. include:: /tutorial/preamble.rst | ||
|
||
Interface to RDKit | ||
================== | ||
|
||
.. currentmodule:: biotite.interface.rdkit | ||
|
||
`RDKit <https://www.rdkit.org/>`_ is a popular cheminformatics package | ||
and thus can be used to supplement *Biotite* with a variety of functionalities focused | ||
on small molecules, such as conversion from/to textual representations | ||
(e.g. *SMILES* and *InChI*) and visualization as structural formulas. | ||
In essence, the :mod:`biotite.interface.rdkit` subpackage provides only two functions: | ||
:func:`to_mol()` to obtain a :class:`rdkit.Chem.rdchem.Mol` from an :class:`.AtomArray` | ||
and :func:`from_mol()` for the reverse direction. | ||
The rest happens within the realm of *RDKit*. | ||
This tutorial gives only a brief glimpse of how the interface can be used. | ||
For comprehensive documentation, refer to the | ||
`RDKit documentation <https://www.rdkit.org/docs/>`_. | ||
|
||
First example: Depiction as structural formula | ||
---------------------------------------------- | ||
*RDKit* allows rendering structural formulas using | ||
`pillow <https://pillow.readthedocs.io/en/stable/>`_. | ||
For a proper structural formula, we need to compute 2D coordinates first. | ||
|
||
.. jupyter-execute:: | ||
|
||
import biotite.interface.rdkit as rdkit_interface | ||
import biotite.structure.info as struc | ||
from rdkit.Chem.Draw import MolToImage | ||
from rdkit.Chem.rdDepictor import Compute2DCoords | ||
from rdkit.Chem.rdmolops import RemoveHs | ||
|
||
penicillin = struc.residue("PNN") | ||
mol = rdkit_interface.to_mol(penicillin) | ||
# We do not want to include explicit hydrogen atoms in the structural formula | ||
mol = RemoveHs(mol) | ||
Compute2DCoords(mol) | ||
image = MolToImage(mol, size=(600, 400)) | ||
display(image) | ||
|
||
Second example: Creating a molecule from SMILES | ||
----------------------------------------------- | ||
Although the *Chemical Component Dictionary* accessible from | ||
:mod:`biotite.structure.info` already provides all compounds found in the PDB, | ||
there is a myriad of compounds out there that are not part of it. | ||
One way to obtain them as :class:`.AtomArray` is to pass a *SMILES* string to | ||
*RDKit* to obtain the topology of the molecule and then compute the coordinates. | ||
|
||
.. jupyter-execute:: | ||
|
||
from rdkit.Chem import MolFromSmiles | ||
from rdkit.Chem.rdDistGeom import EmbedMolecule | ||
from rdkit.Chem.rdForceFieldHelpers import UFFOptimizeMolecule | ||
from rdkit.Chem.rdmolops import AddHs | ||
|
||
ERTAPENEM_SMILES = "C[C@@H]1[C@@H]2[C@H](C(=O)N2C(=C1S[C@H]3C[C@H](NC3)C(=O)NC4=CC=CC(=C4)C(=O)O)C(=O)O)[C@@H](C)O" | ||
|
||
mol = MolFromSmiles(ERTAPENEM_SMILES) | ||
# RDKit uses implicit hydrogen atoms by default, but Biotite requires explicit ones | ||
mol = AddHs(mol) | ||
# Create a 3D conformer | ||
conformer_id = EmbedMolecule(mol) | ||
UFFOptimizeMolecule(mol) | ||
ertapenem = rdkit_interface.from_mol(mol, conformer_id) | ||
print(ertapenem) |
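
The second example above does not check whether parsing or embedding succeeded. In RDKit, `MolFromSmiles()` returns `None` for an invalid string and `EmbedMolecule()` returns `-1` when no conformer could be generated. A minimal sketch of a defensive variant of the RDKit steps follows. The helper name `smiles_to_3d_mol` and its `seed` parameter are assumptions made for illustration and are not part of this commit.

```python
from rdkit.Chem import MolFromSmiles
from rdkit.Chem.rdDistGeom import EmbedMolecule
from rdkit.Chem.rdmolops import AddHs


def smiles_to_3d_mol(smiles, seed=42):
    """Parse a SMILES string and embed one 3D conformer (hypothetical helper).

    Returns the molecule with explicit hydrogen atoms and the conformer ID.
    """
    mol = MolFromSmiles(smiles)
    if mol is None:
        # RDKit signals a parsing failure by returning None
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    # Add explicit hydrogen atoms before generating coordinates
    mol = AddHs(mol)
    # A fixed seed makes the embedding reproducible
    conformer_id = EmbedMolecule(mol, randomSeed=seed)
    if conformer_id == -1:
        # RDKit signals an embedding failure by returning -1
        raise RuntimeError(f"Could not embed a conformer for: {smiles!r}")
    return mol, conformer_id


mol, conformer_id = smiles_to_3d_mol("CCO")  # ethanol
print(mol.GetNumAtoms(), conformer_id)
```

The returned molecule and conformer ID can then be handed to `rdkit_interface.from_mol(mol, conformer_id)` as in the tutorial, optionally after a force field optimization.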
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
# This source code is part of the Biotite package and is distributed | ||
# under the 3-Clause BSD License. Please see 'LICENSE.rst' for further | ||
# information. | ||
|
||
""" | ||
This subpackage provides interfaces to other Python packages in the bioinformatics | ||
ecosystem. | ||
Its purpose is to convert between native Biotite objects, such as :class:`.AtomArray` | ||
and :class:`.Sequence`, and the corresponding objects in the respective interfaced | ||
package. | ||
In contrast to :mod:`biotite.application`, where an entire application run is handled | ||
under the hood, :mod:`biotite.interface` only covers the object conversion, allowing | ||
for more flexibility. | ||
""" | ||
|
||
__name__ = "biotite.interface" | ||
__author__ = "Patrick Kunzmann" | ||
|
||
from .warning import * |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
# This source code is part of the Biotite package and is distributed | ||
# under the 3-Clause BSD License. Please see 'LICENSE.rst' for further | ||
# information. | ||
|
||
""" | ||
This subpackage provides an interface to the `RDKit <https://www.rdkit.org/>`_ | ||
cheminformatics package. | ||
It allows conversion between :class:`.AtomArray` and :class:`rdkit.Chem.rdchem.Mol` | ||
objects. | ||
""" | ||
|
||
__name__ = "biotite.interface.rdkit" | ||
__author__ = "Patrick Kunzmann" | ||
|
||
from .mol import * |