InChI String to Structure (really) #76

Open
BobHanson opened this issue Jan 27, 2025 · 1 comment

Question, not issue.

I may be missing something here. The "structures" InChI is delivering as MOL files have no coordinates and no stereochemistry. Is that correct? I think that is what I am reading in the inchi C code I checked out from this project.

The reason I ask is that I've had "true" InChI String to Structure -- meaning full stereochemistry in a representative 2D or 3D structure -- for several years using JNI-InChI in Jmol and have now just also added it for inchi_web.wasm for JavaScript and also integrated it into OpenChemLib-SwingJS for both Java and JavaScript. (Thank you, Frank Lange.)

This is being discussed on the inchi-discuss list that probably everyone here has already seen. I'm just posting to InChI-issues as well, hoping someone can clarify some of the coding aspects of this for me.

I guess my questions here are these:

  1. Is it widely known already that one can do this? Or is this a closely guarded secret that one can reliably create fully stereochemical SMILES, 2D structures, and 3D structures from standard or fixedH InChI strings, given the right tools? -- Or, I guess, am I REALLY missing something here?

  2. If it is possible, could this be integrated into INCHI_BASE instead of living in wrappers that are JavaScript- and Java-specific? (It's a very small amount of coding, actually; see inchi-web.c in InChI-SwingJS.) And could it perhaps be mentioned somewhere on the InChI site?

I'm aware that the historical statement has been "InChI is not a representation - it's an identifier" -- but I argue that if the string can be returned to a structure with full 3D stereochemistry, it's just as much a representation as a MOL or CDXML representation. Right?

It just seems to me that if this group could check over how this is being done and agrees, we might make it more generally available than just wrappers.

Note that I'm not suggesting for InChI to incorporate any coordinate handling. Just expose the inchi_Output model so that it can be used in additional environments besides OpenChemLib and Jmol that can take 0D atoms, bonds, and stereochemical parities to 3D MOL files. It's just a few simple methods.

Thank you.

Bob Hanson


BobHanson commented Jan 29, 2025

Partially answering my own question here. No, of course I am not the first. And it is not a secret, but it is pretty well hidden. I read:

https://www.inchi-trust.org/technical-faq/#16.9

If you have just an InChI. First, as the InChI contains no atomic coordinates, the best result will be only the coordinate-less “0D” structure. Among other things, this means problems with restoring stereochemistry. Second, if the InChI has been created with some layers omitted, the corresponding structural details may not be restored, evidently. For example, as Standard InChI lacks reconnected metal and fixed-hydrogen layers, neither bonds to metals nor precise positions of mobile H atoms may be regenerated from a Standard InChI.

I suggest that the statement "this means problems with restoring stereochemistry" is misleading. It doesn't mean that stereochemistry cannot be reconstructed. There is no "problem" restoring stereochemistry, given the right tools. The only problem is that it is a bit tricky to get this right.

and

The InChI API library has a dedicated function GetStructFromINCHI() intended for restoring the structure from InChI (but not from InChIKey, as it is a hashed form of InChI which could not be directly decrypted). Software vendors have already complemented the InChI library with their own procedures for generating atomic coordinates and built the functionality ‘Generate structure from InChI string’ into their products. Examples (there are probably more) are Accelrys Draw, Perkin-Elmer (formerly CambridgeSoft) ChemDraw, and ACD/Labs ChemSketch.

One thing I would like a comment from this group on: from what I can see, GetStructFromINCHI() is not particularly useful on its own. As far as I can tell, this method does not report stereochemical parities. (The coordinates will all be 0.0 for input of an InChI string.)

The statement about software vendors having complemented the InChI library could be expanded to describe how exactly this can be done using open-source extensions on InChI C, namely inchi-web.wasm and JNI-InChI. Specifically, any InChI string can be used as an input to inchi C to recreate inchi C's internal model, which DOES faithfully reproduce all covalent bonding and stereochemical parities. Open source libraries such as OpenChemLib and programs such as Jmol can get access to this internal model and use it to generate fully stereochemical SMILES (Jmol) or feed that model directly into their coordinate production algorithms (OCL-SwingJS) for production of 2D or 3D models.
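The "tricky" part is largely bookkeeping. A tetrahedral parity only has meaning relative to a specific ordering of the four neighbors, so any consumer that lists the neighbors in a different order (for example, the order in which they appear in a SMILES string) has to flip the parity whenever the reordering is an odd permutation. Here is a minimal sketch of that step in C. The names `Parity`, `PARITY_ODD`, `PARITY_EVEN`, and `reorder_parity` are hypothetical and are not taken from inchi_api.h or from any of the wrappers mentioned here.

```c
#include <assert.h>

/* Hypothetical parity codes for this sketch; NOT the constants from inchi_api.h. */
typedef enum { PARITY_ODD = 1, PARITY_EVEN = 2 } Parity;

/* Returns 1 if the neighbor order `to` is an odd permutation of `from`.
   Both arrays must list the same four distinct atom indices. */
static int permutation_is_odd(const int from[4], const int to[4])
{
    int pos[4];
    for (int i = 0; i < 4; i++) {
        pos[i] = -1;
        for (int j = 0; j < 4; j++)
            if (to[i] == from[j])
                pos[i] = j;
        assert(pos[i] >= 0); /* every atom in `to` must occur in `from` */
    }
    /* The number of inversions has the same parity as the permutation. */
    int inversions = 0;
    for (int i = 0; i < 4; i++)
        for (int j = i + 1; j < 4; j++)
            if (pos[i] > pos[j])
                inversions++;
    return inversions % 2;
}

/* Re-express a parity given for neighbor order `from` in terms of order `to`. */
Parity reorder_parity(Parity p, const int from[4], const int to[4])
{
    if (!permutation_is_odd(from, to))
        return p;
    return p == PARITY_ODD ? PARITY_EVEN : PARITY_ODD;
}
```

Swapping two neighbors flips the parity; a cyclic shift of three neighbors (two swaps) leaves it unchanged.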

My suggestion to this group is that inchi C more clearly deliver its model in a simple generally usable format such as JSON, rather than having that be an extension only found in wrappers like JniInchiWrapper.c and inchi-web.c.
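To make the suggestion concrete, here is a rough sketch in C of what such a JSON export could look like. Everything in it is an assumption made for illustration: the `Sketch*` structs are simplified stand-ins, not the actual InChI structures, and the JSON field names (`atoms`, `bonds`, `stereo`, `el`, `charge`, `center`, `neighbors`, `parity`) are invented here rather than taken from JniInchiWrapper.c or inchi-web.c.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Simplified, hypothetical 0D model; NOT the real InChI structs. */
typedef struct { const char *element; int charge; } SketchAtom;
typedef struct { int a1, a2, order; } SketchBond;
typedef struct { int center; int neighbors[4]; char parity; } SketchStereo;

typedef struct {
    const SketchAtom *atoms;    int num_atoms;
    const SketchBond *bonds;    int num_bonds;
    const SketchStereo *stereo; int num_stereo;
} SketchModel;

typedef struct { char *buf; size_t cap; size_t len; int ok; } Out;

/* Append formatted text to the buffer; mark failure if it would not fit. */
static void put(Out *o, const char *fmt, ...)
{
    if (!o->ok)
        return;
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(o->buf + o->len, o->cap - o->len, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= o->cap - o->len)
        o->ok = 0;
    else
        o->len += (size_t)n;
}

/* Write the model as JSON into buf. Returns the length written, or -1 if
   buf is too small. Assumes element symbols need no JSON escaping. */
int model_to_json(const SketchModel *m, char *buf, size_t cap)
{
    assert(m && buf && cap > 0);
    Out o = { buf, cap, 0, 1 };
    put(&o, "{\"atoms\":[");
    for (int i = 0; i < m->num_atoms; i++)
        put(&o, "%s{\"el\":\"%s\",\"charge\":%d}", i ? "," : "",
            m->atoms[i].element, m->atoms[i].charge);
    put(&o, "],\"bonds\":[");
    for (int i = 0; i < m->num_bonds; i++)
        put(&o, "%s[%d,%d,%d]", i ? "," : "",
            m->bonds[i].a1, m->bonds[i].a2, m->bonds[i].order);
    put(&o, "],\"stereo\":[");
    for (int i = 0; i < m->num_stereo; i++) {
        const SketchStereo *s = &m->stereo[i];
        put(&o, "%s{\"center\":%d,\"neighbors\":[%d,%d,%d,%d],\"parity\":\"%c\"}",
            i ? "," : "", s->center, s->neighbors[0], s->neighbors[1],
            s->neighbors[2], s->neighbors[3], s->parity);
    }
    put(&o, "]}");
    return o.ok ? (int)o.len : -1;
}
```

The point of the sketch is only that the export itself is small: a flat list of atoms, a flat list of bonds, and a list of stereo descriptors that each carry their own neighbor order alongside the parity.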

And then have a clear description available for how to handle the stereochemical parities.
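As one ingredient of such a description, a consumer that generates 3D coordinates needs a way to check that the handedness of a generated center matches the requested parity. A common generic technique is the sign of the signed tetrahedron volume spanned by the four neighbors, taken in the listed order. The sketch below is in C with hypothetical names (`Vec3`, `signed_volume6`, `handedness`). It deliberately does not say which sign corresponds to which InChI parity value; that mapping is exactly what the description would need to pin down, and it should be taken from the InChI documentation, not from this sketch.

```c
#include <assert.h>

typedef struct { double x, y, z; } Vec3;

/* Six times the signed volume of the tetrahedron (a, b, c, d):
   the triple product (b - a) . ((c - a) x (d - a)). */
double signed_volume6(Vec3 a, Vec3 b, Vec3 c, Vec3 d)
{
    Vec3 u = { b.x - a.x, b.y - a.y, b.z - a.z };
    Vec3 v = { c.x - a.x, c.y - a.y, c.z - a.z };
    Vec3 w = { d.x - a.x, d.y - a.y, d.z - a.z };
    return u.x * (v.y * w.z - v.z * w.y)
         - u.y * (v.x * w.z - v.z * w.x)
         + u.z * (v.x * w.y - v.y * w.x);
}

/* Returns +1 or -1 for the two possible handedness senses of the four
   neighbor positions in the given order, or 0 if they are (nearly) coplanar
   and the geometry therefore does not define a handedness. */
int handedness(const Vec3 nb[4], double eps)
{
    assert(eps >= 0.0);
    double vol = signed_volume6(nb[0], nb[1], nb[2], nb[3]);
    if (vol > eps)
        return 1;
    if (vol < -eps)
        return -1;
    return 0;
}
```

Swapping any two neighbors in the list reverses the sign, which is the geometric counterpart of a parity flipping under an odd reordering of neighbors.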

I think that is my proposal. Maybe that's a proposal for a publication. I don't know.

JanCBrammer added the labels "question" (Further information is requested) and "feature" on Jan 30, 2025.