Request for update: latest sourmash_plugin_branchwater features and performance improvements in yacht #124
Comments
Thanks for the suggestion, @tnmquann! We (@mahmudhera and @chunyuma) have recently been working on this exact issue, but from a different direction: the reference database formation step in For the "only supporting one sample at a time", since running yacht on one sample is independent of running it on any other sample, doesn't something like
Hi @dkoslicki, thank you for letting me know about the upcoming release, and it’s exciting to hear about the algorithmic improvements to reduce training time. I’ll be looking forward to seeing that in action when it’s ready. For running multiple samples, I’m currently using Thanks again, and really excited for the next release.
We have experimented with loading the database once and using multithreading to process multiple samples, and found that the gains were negligible (on the order of seconds). This might be helpful when you have a massive reference database, which typically occurs with a very high ANI value (e.g. 0.99995), but in such cases a more targeted approach seems better (focusing on a specific clade or clades).
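To make the point about independent per-sample runs concrete, here is a minimal Python sketch that launches one command per sample concurrently. It is not taken from yacht or from this thread: `run_one`, `run_all`, and `build_command` are hypothetical names, and the command is a placeholder that only echoes the sample path, so you would substitute your own yacht invocation.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_one(sample, build_command):
    """Run the command built for one sample; return (sample, exit code)."""
    result = subprocess.run(build_command(sample), capture_output=True, text=True)
    return sample, result.returncode


def run_all(samples, build_command, max_workers=4):
    """Run independent per-sample commands concurrently.

    Threads are enough here because the work happens in child processes.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(lambda s: run_one(s, build_command), samples))


if __name__ == "__main__":
    # Placeholder command (assumption): replace with your real per-sample
    # invocation. This one just prints the sample path.
    def build_command(sample):
        return [sys.executable, "-c", f"print({str(sample)!r})"]

    # Sample files are passed on the command line.
    samples = [Path(p) for p in sys.argv[1:]]
    for sample, code in run_all(samples, build_command).items():
        print(sample, "exit code", code)
```

Because each run reloads the reference database, this trades some repeated loading for simplicity, which matches the observation above that sharing the loaded database saved only seconds.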
Hi,
I've been using yacht as a way to reduce false positives in sourmash, and I wanted to ask if it's possible to update the tool to incorporate the latest features from sourmash_plugin_branchwater? This would be helpful for a couple of reasons:
I believe incorporating improvements such as support for the new RocksDB data format and the use of manysketch and/or fastmultigather could help reduce processing times and allow multiple samples to be handled simultaneously.
Thanks for the great tool, and I'm looking forward to potential improvements in future releases!