
V1.3.0.0

@JonasSchaub JonasSchaub released this 02 Feb 16:54
7513224

Lifted the strict input restrictions for the ErtlFunctionalGroupsFinder.find() method that identifies functional groups in input molecules and performed a general clean-up and refactoring of the code base.

Details:

Until now, the ErtlFunctionalGroupsFinder.find() method did not accept input molecules that contain metal, metalloid, or pseudo (R) atoms, consist of multiple disconnected structures (e.g. ion and counter-ion), or carry formal charges. These restrictions were implemented based on Ertl's description of the standardisation steps he performed before testing his algorithm on a molecular data set ("Organometallic structures were discarded and all molecules were standardized by removing counterions and neutralizing atomic charges."). While applying such an algorithm only to organic molecules and standardising an input structure set before such an analysis is still considered best practice, these input restrictions made the ErtlFunctionalGroupsFinder (EFGF) complicated to use: complex filtering and preprocessing was required before a new data set could be analysed.

Therefore, these input restrictions are now lifted by default, but they can still be applied via a new variant of the .find() method (set the third parameter to "true"). Processing molecules that would previously have been rejected as input was tested on the complete ChEBI database and caused no issues.

Note:

  • Metal and metalloid atoms are now treated like hetero atoms and are marked as part of a functional group
  • Pseudo (R) atoms are ignored
  • Formal charges are treated like other atomic properties: they are conserved in marked atoms at FG extraction. Note: a charged carbon atom is not marked simply because of its charge, because this was not included in the Ertl algorithm.
  • In disconnected structures, all disconnected parts are treated equally
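
To make the three optional input restrictions more concrete, here is a minimal, self-contained Java sketch of what such a check could look like. It deliberately does not use the EFGF or CDK API: the class InputRestrictionSketch, its Molecule type, all method names, and the short metal/metalloid symbol list are hypothetical and only illustrate the idea (reject metal, metalloid, or pseudo atoms, formal charges, and disconnected structures).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Toy model of the three input restrictions; NOT the EFGF or CDK API. */
class InputRestrictionSketch {

    /** Hypothetical minimal molecule: element symbols, formal charges, bonds as index pairs. */
    static final class Molecule {
        final String[] symbols; // "R" stands for a pseudo atom in this sketch
        final int[] charges;
        final int[][] bonds;

        Molecule(String[] symbols, int[] charges, int[][] bonds) {
            this.symbols = symbols;
            this.charges = charges;
            this.bonds = bonds;
        }
    }

    // Illustrative, deliberately incomplete list of metal and metalloid symbols.
    static final Set<String> METALS_AND_METALLOIDS = new HashSet<>(
            Arrays.asList("Li", "Na", "K", "Mg", "Ca", "Fe", "Cu", "Zn", "B", "Si", "As"));

    static boolean containsMetalMetalloidOrPseudoAtom(Molecule mol) {
        for (String symbol : mol.symbols) {
            if (symbol.equals("R") || METALS_AND_METALLOIDS.contains(symbol)) {
                return true;
            }
        }
        return false;
    }

    static boolean containsChargedAtom(Molecule mol) {
        for (int charge : mol.charges) {
            if (charge != 0) {
                return true;
            }
        }
        return false;
    }

    /** Breadth-first search from atom 0; any unreached atom means several fragments. */
    static boolean isDisconnected(Molecule mol) {
        int atomCount = mol.symbols.length;
        if (atomCount == 0) {
            return false;
        }
        boolean[] seen = new boolean[atomCount];
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(0);
        seen[0] = true;
        int reached = 1;
        while (!queue.isEmpty()) {
            int current = queue.poll();
            for (int[] bond : mol.bonds) {
                int neighbour = bond[0] == current ? bond[1] : (bond[1] == current ? bond[0] : -1);
                if (neighbour >= 0 && !seen[neighbour]) {
                    seen[neighbour] = true;
                    reached++;
                    queue.add(neighbour);
                }
            }
        }
        return reached < atomCount;
    }

    /** True if the molecule violates none of the three restrictions. */
    static boolean passesStrictRestrictions(Molecule mol) {
        return !containsMetalMetalloidOrPseudoAtom(mol)
                && !containsChargedAtom(mol)
                && !isDisconnected(mol);
    }

    public static void main(String[] args) {
        // Ethanol heavy atoms: C-C-O, neutral, one fragment.
        Molecule ethanol = new Molecule(
                new String[] {"C", "C", "O"}, new int[] {0, 0, 0}, new int[][] {{0, 1}, {1, 2}});
        // Sodium acetate: acetate anion plus a separate Na+ counter-ion.
        Molecule sodiumAcetate = new Molecule(
                new String[] {"C", "C", "O", "O", "Na"},
                new int[] {0, 0, 0, -1, 1},
                new int[][] {{0, 1}, {1, 2}, {1, 3}});
        System.out.println(passesStrictRestrictions(ethanol));       // prints "true"
        System.out.println(passesStrictRestrictions(sodiumAcetate)); // prints "false"
    }
}
```

In this toy model, sodium acetate fails all three checks at once (metal atom, formal charges, two fragments). With the restrictions lifted by default, such a structure would now be processed as described in the list above.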


Additional new functionalities:

  • The EFGF class now has additional utility methods for filtering (to be used only if the input restrictions are explicitly applied by the user)
  • New factory methods for creating EFGF instances with a specified functional group environment mode
  • New functional group environment mode that only extracts the marked atoms without their environment
  • The functional group environment mode is now configurable on an existing instance
  • Bug fix concerning explicit, unconnected hydrogen atoms

What's Changed

Full Changelog: V1.2.1...V1.3.0.0