
Open reimplementation of the Ertl algorithm for functional group identification based on the Chemistry Development Kit (CDK)


JonasSchaub/ErtlFunctionalGroupsFinder

 
 


ErtlFunctionalGroupsFinder

An open implementation of the Ertl algorithm for functional group identification in organic molecules.

Description

The algorithm for the automated detection and extraction of functional groups in organic molecules, developed by Dr Peter Ertl, is implemented here on the basis of the Chemistry Development Kit (CDK).
This open reimplementation named ErtlFunctionalGroupsFinder is described in a scientific article.

ErtlFunctionalGroupsFinder is also available in the open Java rich client application MORTAR ('MOlecule fRagmenTation fRamework') where in silico molecule fragmentation can be easily conducted on a given data set and the results visualised (MORTAR GitHub repository, MORTAR article).

Contents of this repository

Sources

The "src" subfolder contains all source code packages including JUnit tests.

Tests

The test class ErtlFunctionalGroupsFinderTest tests the functionalities of ErtlFunctionalGroupsFinder. Among other things, it tests whether the correct functional groups are detected in example molecules.

Test resources

The test "resources" subfolder contains an SD file with a small subset of small molecules taken from the Chemical Entities of Biological Interest (ChEBI) database, provided as example input. The database is licensed under the Creative Commons Attribution 4.0 license (CC BY 4.0), which allows distribution and modification.

Performance Snapshot CMD Application

The folder "Performance_Snapshot_App_jar" contains the executable Java archive ErtlFunctionalGroupsFinder-PerformanceSnapshotApp.jar. It can be executed from the command line (command: java -jar) to take a performance snapshot of the ErtlFunctionalGroupsFinder.find() method when it is run in parallel on multiple threads. For more details, see the file "Performance usage instructions.txt".
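The idea of timing a method under parallelization can be illustrated with a minimal, self-contained sketch that uses only the Java standard library. This is not the code of the Performance Snapshot App. The class `PerformanceSnapshotSketch`, its `measure` method, and the `Snapshot` type are hypothetical names chosen for illustration, and a trivial stand-in task (string length) takes the place of ErtlFunctionalGroupsFinder.find() so that the example runs without the CDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToIntFunction;

/** Hypothetical sketch: times a per-input task on a fixed-size thread pool. */
public class PerformanceSnapshotSketch {

    /** One timed run: thread count, sum of all task results, elapsed wall-clock time. */
    public static final class Snapshot {
        public final int threads;
        public final long resultSum;
        public final long elapsedNanos;

        Snapshot(int threads, long resultSum, long elapsedNanos) {
            this.threads = threads;
            this.resultSum = resultSum;
            this.elapsedNanos = elapsedNanos;
        }
    }

    /** Applies the task to every input using the given number of threads and measures the time. */
    public static <T> Snapshot measure(List<T> inputs, ToIntFunction<T> task, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> jobs = new ArrayList<>();
            for (T input : inputs) {
                jobs.add(() -> task.applyAsInt(input));
            }
            long start = System.nanoTime();
            // invokeAll blocks until every job has completed
            List<Future<Integer>> futures = pool.invokeAll(jobs);
            long sum = 0;
            for (Future<Integer> future : futures) {
                sum += future.get();
            }
            long elapsed = System.nanoTime() - start;
            return new Snapshot(threads, sum, elapsed);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Timed run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Assumption: SMILES strings as stand-in inputs; a real run would use parsed molecules
        List<String> inputs = Arrays.asList("CCO", "c1ccccc1", "CC(=O)O");
        // Assumption: String::length is a placeholder for the method under test
        ToIntFunction<String> standInTask = String::length;
        for (int threads = 1; threads <= 4; threads *= 2) {
            Snapshot snapshot = measure(inputs, standInTask, threads);
            System.out.printf("threads=%d, resultSum=%d, elapsed=%d ns%n",
                    snapshot.threads, snapshot.resultSum, snapshot.elapsedNanos);
        }
    }
}
```

Summing the task results gives a simple check that every thread count processed the same inputs. For a meaningful measurement, the workload would need to be much larger than in this sketch and preceded by warm-up runs.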

Example initialization and usage of ErtlFunctionalGroupsFinder

See the wiki of this repository.

Installation

ErtlFunctionalGroupsFinder is hosted as a package/artifact on the Sonatype Maven Central repository. See the artifact page for installation guidelines using build tools like Maven or Gradle.
To install ErtlFunctionalGroupsFinder via its JAR archive, download it from the releases. Note that its dependencies then need to be installed via their JAR archives as well.
To open the project locally, e.g. to extend it, download or clone the repository, open it in a Gradle-supporting IDE (e.g. IntelliJ) as a Gradle project, and execute the build.gradle file. Gradle will then take care of installing all dependencies. A Java Development Kit (JDK) of version 11 or higher must also be pre-installed.

Dependencies

Needs to be pre-installed:

Managed by Gradle:

Acknowledgments

Project team: Sebastian Fritsch, Stefan Neumann, Jonas Schaub, Christoph Steinbeck, and Achim Zielesny.

Logo: Kohulan Rajan

The authors thank Peter Ertl for describing his algorithm in a way that allowed easy re-implementation. This is not always the case. We also thank him for valuable discussions.
We appreciate help from Egon Willighagen and John Mayfield with the CDK integration and from Felix Bänsch for unbiased release testing.

References

ErtlFunctionalGroupsFinder

Ertl algorithm

Chemistry Development Kit (CDK)
