WARNING about input molecules #6

CesareWang opened this issue Aug 1, 2024 · 3 comments

@CesareWang

Thank you for your outstanding work and sharing!
This warning occurred while I was predicting mass spectra for some SMILES strings with your pre-trained model:
[image: screenshot of the warning, not preserved]
After the warning, the model stops predicting.
This suggests a problem with the input SMILES strings, but I checked the 1840th SMILES string in my input: ‘CC1=CC2Cc3nc4cc(Cl)cccc4c(NCCCNCCCNc4c5c(nc6ccccc46)CCCC5)c3C(C1)C2 ‘. Locally, RDKit parses this SMILES string without the warning. What could cause this, and how can I change the code, or check in advance, whether an input SMILES string can be used for prediction?

Thanks again for your outstanding work. I appreciate all your help!

@adamoyoung
Contributor

Hi CesareWang,

Thanks for your interest and kind words!

I tried parsing the compound that you provided with the version of RDKit used for the project (2021.03.3), and RDKit was unable to parse it. Can you confirm that you are using the correct RDKit version?

In any case, you could simply modify the code (or your input file) to skip this compound.
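
A minimal sketch of how the input could be pre-filtered before prediction, assuming the SMILES strings are available as a plain Python list. The helper names `rdkit_version` and `split_parseable` are hypothetical and not part of this project's code; the sketch relies only on `Chem.MolFromSmiles` returning `None` for a string it cannot parse. The `parse` argument is an assumption added so that the same parser the project uses can be swapped in.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def rdkit_version() -> str:
    """Return the installed RDKit version, e.g. to compare against 2021.03.3."""
    import rdkit  # imported lazily so split_parseable also works with a custom parser
    return rdkit.__version__


def split_parseable(
    smiles_list: Iterable[str],
    parse: Optional[Callable[[str], object]] = None,
) -> Tuple[List[str], List[Tuple[int, str]]]:
    """Split SMILES strings into those that parse and those that do not.

    Returns (kept, skipped), where skipped holds (0-based index, smiles) pairs
    so the offending input lines can be inspected afterwards.
    """
    if parse is None:
        # Default parser: RDKit returns None when a SMILES string is invalid.
        from rdkit import Chem
        parse = Chem.MolFromSmiles

    kept: List[str] = []
    skipped: List[Tuple[int, str]] = []
    for idx, smi in enumerate(smiles_list):
        smi = smi.strip()  # drop stray whitespace around the string
        try:
            mol = parse(smi)
        except Exception:
            mol = None  # treat a parser exception like a failed parse
        if mol is None:
            skipped.append((idx, smi))
        else:
            kept.append(smi)
    return kept, skipped
```

Calling `split_parseable(my_smiles)` in the same environment that runs the model (so the same RDKit version does the parsing) gives the list to predict on and the list of skipped entries; `rdkit_version()` shows which version that environment actually uses.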

@CesareWang
Author

Thank you very much for your reply and help!

Following your suggestion, I filtered out the SMILES strings in the input data that cannot be processed properly. The remaining SMILES strings pass through the series of operations from "mol_from_smiles" to "init_from_smiles" normally. However, a new error appears when executing "inference". What could be the reason for this error, and how can I fix it?
[image: screenshot of the error, not preserved]

Sincerely thank you for your help!

@adamoyoung
Contributor

It seems like the preprocessing failed, since there is an instance of a molecule that is not a SMILES string or an RDKit mol object. Maybe you should take a look at that one?
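
One way to locate such an entry is sketched below, assuming the preprocessed molecules can be collected into a Python list (for example from a DataFrame column). The function name `find_invalid_entries` and the `mol_type` parameter are hypothetical, not part of this project's code; by default the check uses RDKit's `Chem.Mol` class.

```python
from typing import Iterable, List, Optional, Tuple


def find_invalid_entries(
    entries: Iterable[object],
    mol_type: Optional[type] = None,
) -> List[Tuple[int, str]]:
    """Return (0-based index, type name) for entries that are neither str nor mol."""
    if mol_type is None:
        # Default: accept RDKit molecule objects alongside SMILES strings.
        from rdkit import Chem
        mol_type = Chem.Mol

    allowed = (str, mol_type)
    return [
        (idx, type(entry).__name__)
        for idx, entry in enumerate(entries)
        if not isinstance(entry, allowed)
    ]
```

Entries such as `None` or a float `NaN` (which can appear when a row has a missing value) would be reported with their index, which makes it possible to look up the corresponding input row.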
