SeSiMe

Prototype name and prototype code.

It is used here to calculate similarities (or distances) between mass spectra or between biosynthetic gene clusters (BGCs).

Method categories

Method	Sensitive to word context	Sensitive to word order
PCA on 1-hot document vector	No	No
Autoencoder on 1-hot document vector	No	No
Autoencoder on full sequence	(Yes)	(Yes)
Word2Vec + document centroid	Yes	No
GloVe	Yes	No
ELMo, bi-LSTM etc.	Yes	Yes
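
The "Word2Vec + document centroid" row can be illustrated with a minimal sketch. It is not the code from this repository: the word names, the toy 3-dimensional vectors, and the helper functions `centroid` and `cosine_similarity` are all hypothetical and chosen only for illustration. In practice the word vectors would come from a trained Word2Vec model. The sketch shows why the centroid approach is insensitive to word order: averaging the word vectors discards their positions.

```python
import math


def centroid(doc, vectors):
    """Average the vectors of all words in `doc` that have an embedding."""
    known = [vectors[word] for word in doc if word in vectors]
    if not known:
        raise ValueError("document contains no known words")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings (assumption: a real model would supply these).
toy_vectors = {
    "word_a": [1.0, 0.0, 0.5],
    "word_b": [0.8, 0.2, 0.4],
    "word_c": [0.0, 1.0, 0.1],
}

doc_1 = ["word_a", "word_b", "word_c"]
doc_2 = ["word_c", "word_a", "word_b"]  # same words as doc_1, different order
doc_3 = ["word_c", "word_c"]            # different word content

sim_same_words = cosine_similarity(
    centroid(doc_1, toy_vectors), centroid(doc_2, toy_vectors)
)
sim_other_words = cosine_similarity(
    centroid(doc_1, toy_vectors), centroid(doc_3, toy_vectors)
)

print(f"doc_1 vs doc_2 (reordered): {sim_same_words:.3f}")
print(f"doc_1 vs doc_3 (different): {sim_other_words:.3f}")
```

Reordering the words leaves the centroid unchanged, so `doc_1` and `doc_2` are scored as identical, while `doc_3` scores lower. Any context sensitivity in this approach comes from how the word vectors were trained, not from the averaging step.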
