This repository has been archived by the owner on Jun 20, 2023. It is now read-only.

SweetNet for other ML application #1

Open
quarksome opened this issue Jan 6, 2022 · 2 comments
Comments

@quarksome

Hi

I'm interested in using SweetNet to represent glycan structures which I need as input for some other ML model training. If I understood the paper correctly, using the SweetNet model, I can give it a glycan structure and it will output a vector representation of that structure, is that correct? And that output representation can then be used to train models for other various prediction tasks?

Do I need to re-train the SweetNet for my glycans, or is it appropriately trained already if my glycans are mostly just human N-glycans? From what I read, SweetNet was trained on all the structures from various glycan databases so it should be applicable to my case.

Thank you!

@Bribak
Contributor

Bribak commented Jan 6, 2022

Hi,

thanks for your interest in our work!

In principle, you can use our trained SweetNet model for your purpose, but it may not give satisfactory results because we only have supervised models stored (each trained for a specific purpose). So I'd suggest trying it out.

The easiest way is to use the infrastructure in our glycowork package, for instance adapting the snippet on https://bojarlab.github.io/glycowork/examples.html#example4

by exchanging the rank for "Species", adding "trained=True" to prep_model, and then directly using glycowork.ml.inference.glycans_to_emb (https://github.com/BojarLab/glycowork/blob/master/glycowork/ml/inference.py#L23) to convert your glycans into learned embeddings. Otherwise, you can also try retraining.
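The steps above could look roughly like the sketch below. It is not taken from the glycowork documentation. The wrapper name `embed_glycans` is hypothetical, and the argument order of `prep_model` and `glycans_to_emb`, as well as the value needed for `num_classes`, are assumptions that may differ between glycowork versions, so check them against the version you have installed.

```python
from typing import Sequence


def embed_glycans(glycans: Sequence[str], num_classes: int):
    """Convert IUPAC-condensed glycan strings into learned SweetNet embeddings.

    Assumption: num_classes has to match the output size of the stored
    species-level model shipped with the installed glycowork version.
    """
    # Imported lazily so this function can be defined without glycowork installed.
    from glycowork.ml.inference import glycans_to_emb
    from glycowork.ml.models import prep_model

    # trained=True requests the stored, supervised SweetNet weights
    # instead of a freshly initialised network.
    model = prep_model("SweetNet", num_classes, trained=True)

    # Assumed default behaviour: one row of learned features per input glycan.
    return glycans_to_emb(list(glycans), model)
```

The returned embeddings can then be fed as features into a separate downstream model. Because the stored weights were trained for species prediction, it is worth comparing against a retrained model before relying on them.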

Hope that's helpful,
Daniel

@quarksome
Author

I see. So the GCN model with 3 GraphConv layers, together with the 3 FC layers, is trained for a specific prediction task. And for each task, the glycan representations will be different, because the GCN embedding is based on glycan features that contribute differently to different prediction tasks.
