3 changes: 2 additions & 1 deletion .gitignore
@@ -1,3 +1,4 @@
.idea
.venv
tests/test-results/
docs/_build/
28 changes: 28 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,28 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the OS, Python version and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.11"

# Build documentation in the docs/ directory with Sphinx
sphinx:
  configuration: docs/conf.py

# Optionally build your docs in additional formats such as PDF and ePub
formats:
  - pdf
  - epub

# Optionally declare the Python requirements required to build your docs
python:
  install:
    - method: pip
      path: .
    - requirements: docs/requirements.txt
35 changes: 23 additions & 12 deletions README.md
@@ -1,7 +1,7 @@
# ChemGED

ChemGED is a Python package for computing the approximate graph edit distance (GED) between
chemicals. Computing the exact GED is an NP-hard problem, so ChemGED uses heuristics to approximate it in
a reasonable time.

## Installation
@@ -12,8 +12,8 @@ pip install chemged
```

## Usage
To use ChemGED, create an ``ApproximateChemicalGED`` object and call its
``compute_ged`` method with two chemical structures. Each can be a SMILES string *or* an
[RDKit](https://www.rdkit.org/docs/index.html) Mol object.

```python
@@ -33,25 +33,36 @@ approx_ged = ged_calc.compute_ged(MolFromSmiles(chemical1), MolFromSmiles(chemic
ChemGED also implements ``pdist`` and ``cdist`` functions to compute pairwise distances between
sets of chemicals. Both return NumPy arrays.

> [!NOTE]
> ``pdist`` returns a condensed (vector-form) distance vector, while ``cdist`` returns a
> square-form distance matrix. ``scipy.spatial.distance.squareform`` can convert
> the condensed vector into a square-form distance matrix.

```python
from chemged import pdist, cdist

# Create a list of chemicals
chemicals = ["CCO", "CCN", "CC", "C"]

# Compute pairwise distances
distances_vector = pdist(chemicals)
print(distances_vector) # Vector form

# Compute all-vs-all distances
distances_matrix = cdist(chemicals, chemicals)
print(distances_matrix) # Square form
```
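
To make the relationship between the two output shapes concrete, here is a minimal pure-Python sketch of what expanding a condensed vector into a square matrix involves. It assumes the vector uses the same pair ordering as SciPy's condensed form, (0,1), (0,2), ..., (n-2,n-1), which is not stated explicitly for ChemGED's ``pdist``. The helper name ``condensed_to_square`` and the example values are hypothetical and are not ChemGED output. In practice, ``scipy.spatial.distance.squareform`` performs this conversion.

```python
import math


def condensed_to_square(vector):
    """Expand a condensed distance vector into a symmetric square matrix.

    Assumes SciPy's condensed ordering: (0,1), (0,2), ..., (0,n-1), (1,2), ...
    """
    # A condensed vector for n items has n*(n-1)/2 entries; solve for n.
    n = (1 + math.isqrt(1 + 8 * len(vector))) // 2
    if n * (n - 1) // 2 != len(vector):
        raise ValueError("length is not a valid condensed-vector size")

    square = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Distances are symmetric, so fill both triangles at once.
            square[i][j] = square[j][i] = vector[k]
            k += 1
    return square


# Made-up distances for 4 items (6 unique pairs); not real ChemGED output.
example_vector = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
example_square = condensed_to_square(example_vector)
for row in example_square:
    print(row)
```

The diagonal stays at zero because each item is at distance zero from itself, which is why the condensed form can omit it.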

## Documentation
For more detailed documentation, please visit [ReadTheDocs](https://chemged.readthedocs.io/).

## Implementation

The approach used here is based on bipartite graph matching[[1]](#1), and most of the Python implementation
is based on scripts from https://github.com/priba/aproximated_ged/tree/master.
ChemGED uses [RDKit](https://www.rdkit.org/docs/index.html) to handle chemicals in Python.
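
As a rough illustration of the bipartite-matching idea, the sketch below builds the square assignment cost matrix described in [[1]](#1) for two lists of node labels and finds the cheapest assignment. It is not ChemGED's implementation. It makes several simplifying assumptions: only node labels are compared (bond and neighborhood costs are ignored), all edit costs are uniform, and the assignment is solved by brute force over permutations rather than with an assignment solver such as the Hungarian algorithm, so it is only usable for very small inputs. The names ``build_cost_matrix`` and ``min_assignment_cost`` are hypothetical.

```python
from itertools import permutations

INF = float("inf")


def build_cost_matrix(labels_a, labels_b, sub_cost=1.0, indel_cost=1.0):
    """Build the (n+m) x (n+m) assignment cost matrix for two node-label lists."""
    n, m = len(labels_a), len(labels_b)
    size = n + m
    cost = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i < n and j < m:
                # Substitution block: free when the labels match.
                cost[i][j] = 0.0 if labels_a[i] == labels_b[j] else sub_cost
            elif i < n:
                # Deletion block: node i may only use its own deletion column.
                cost[i][j] = indel_cost if j - m == i else INF
            elif j < m:
                # Insertion block: node j may only use its own insertion row.
                cost[i][j] = indel_cost if i - n == j else INF
            # Otherwise: dummy-to-dummy block, which stays at zero cost.
    return cost


def min_assignment_cost(cost):
    """Brute-force the cheapest row-to-column assignment (tiny inputs only)."""
    size = len(cost)
    return min(
        sum(cost[row][col] for row, col in enumerate(perm))
        for perm in permutations(range(size))
    )


# Heavy-atom labels only, loosely modeled on "CCO" and "CCN".
atoms_a = ["C", "C", "O"]
atoms_b = ["C", "C", "N"]
approx_node_ged = min_assignment_cost(build_cost_matrix(atoms_a, atoms_b))
print(approx_node_ged)
```

The deletion and insertion blocks are diagonal so that each node has exactly one way to be deleted or inserted; the infinite off-diagonal entries keep the solver from pairing a node with another node's dummy slot.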


## References
<a id="1">[1]</a>
Riesen, Kaspar, and Horst Bunke.
"Approximate graph edit distance computation by means of bipartite graph matching."
Image and Vision Computing 27.7 (2009): 950-959.
45 changes: 45 additions & 0 deletions docs/api.rst
@@ -0,0 +1,45 @@
=============
API Reference
=============

This page provides detailed documentation for all the classes and functions in the ChemGED package.

ApproximateChemicalGED
----------------------

.. autoclass:: src.chemged.ApproximateChemicalGED
   :members:
   :undoc-members:
   :show-inheritance:

ChemicalGEDCostMatrix
---------------------

.. autoclass:: src.chemged.ChemicalGEDCostMatrix
   :members:
   :undoc-members:
   :show-inheritance:

UniformElementCostMatrix
------------------------

.. autoclass:: src.chemged.UniformElementCostMatrix
   :members:
   :undoc-members:
   :show-inheritance:

Chemical Utilities
==================

.. automodule:: src.chemged.chem_utils
   :members:
   :undoc-members:
   :show-inheritance:

Plotting Utilities
==================

.. automodule:: src.chemged.plotting
   :members:
   :undoc-members:
   :show-inheritance:
26 changes: 26 additions & 0 deletions docs/chemical_graph_attributes.rst
@@ -0,0 +1,26 @@
.. _chem-graph-attributes:

=========================
Chemical Graph Attributes
=========================

ChemGED uses the NetworkX library to represent chemical graphs. Each node in the graph represents an atom,
and each edge represents a bond between atoms. The attributes of these nodes and edges are crucial for
calculating the graph edit distance (GED) accurately.

By default, the ``mol_to_nx`` function in ChemGED converts a chemical into a NetworkX graph with the following
atom attributes stored on each node:

- **atomic_number**: The atomic number of the atom (e.g., 6 for Carbon, 8 for Oxygen).
- **degree**: The degree of the atom in the graph (number of bonds).
- **formal_charge**: The formal charge of the atom.
- **hybridization**: The hybridization state of the atom (e.g., 'sp3', 'sp2').
- **is_aromatic**: A boolean indicating if the atom is part of an aromatic ring.
- **is_in_ring**: A boolean indicating if the atom is part of a ring structure.
- **hydrogen**: The number of hydrogen atoms attached to the atom.

Chemical bond attributes are stored on the edges of the graph. The edge attributes include:

- **bond_type**: The type of bond (e.g., 'single', 'double', 'triple', 'aromatic').

While these are the current attributes stored in the graph, you can extend or modify these attributes as
needed for your specific use case.
63 changes: 63 additions & 0 deletions docs/conf.py
@@ -0,0 +1,63 @@
"""Configuration file for the Sphinx documentation builder."""

# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

import os
import sys


sys.path.insert(0, os.path.abspath(".."))
sys.path.insert(0, os.path.abspath("../src"))

from chemged import __version__ as version


# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

source_suffix = ".rst"
master_doc = "index"

project = "ChemGED"
copyright = "2025, James Wellnitz"
author = "James Wellnitz"
release = version # Added for ReadTheDocs compatibility

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.viewcode",
    "sphinx.ext.napoleon",
    "sphinx.ext.intersphinx",
    "sphinx.ext.mathjax",
]

templates_path = ["_templates"]
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]

# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = "sphinx_rtd_theme"
html_static_path = ["_static"]

# -- Extension configuration -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#configuration

autodoc_member_order = "bysource"
autodoc_typehints = "description"

# -- Intersphinx configuration -----------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/extensions/intersphinx.html#configuration

intersphinx_mapping = {
    "python": ("https://docs.python.org/3", None),
    "numpy": ("https://numpy.org/doc/stable/", None),
    "networkx": ("https://networkx.org/documentation/stable/", None),
}

# GitHub issue pages live under /issues/<number>
issues_uri = "https://github.com/molecularmodelinglab/ChemGED/issues/{issue}"
issues_pr_uri = "https://github.com/molecularmodelinglab/ChemGED/pull/{pr}"