[Question] Documentation - Gecco use cases for 'annotation', downstream 'antismash' #4
Comments
Hi @tamuanand

"I do not think that the -vvv is working."
Yes, this is an old option and it doesn't work anymore; I just forgot to remove the old prompt. There are just three verbosity levels now (nothing,

"When do you use the gecco annotate command and what is the purpose of it"
I added this command to make it easier to create training data; it creates the feature tables that are then to be used with

"In what scenarios does one use gecco for downstream post-processing with antismash"
Well, none really. You'd probably want to use them in complement with one another, as they will give you different putative clusters (AntiSMASH being very good at finding clusters close to known things, GECCO being better at identifying novel architectures).

If you are confused about the

"I am assuming you would have done a downstream BiG-SLiCE process with your datasets"
We actually didn't, as we didn't find BiG-SLiCE scalable enough for our dataset: it doesn't support heavily-distributed computations and requires annotating the entirety of the BGCs with

"I do also note that you mention here to write our own scripts to make it compatible for BiG-SLiCE"
I am currently writing a dedicated command to help get results into BiG-SLiCE, but everything is already there in the GenBank "structured comments" of the output.
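For readers who want to pull those structured comments out of a GenBank record themselves, here is a minimal sketch using only the Python standard library. It relies on the general GenBank structured-comment layout (a `##<prefix>-START##` line, `key :: value` lines, then `##<prefix>-END##`). The sample record, the `GECCO-Data` prefix, and the field names are illustrative assumptions and are not copied from real GECCO output, so check an actual output file before relying on them.

```python
import re

# Hypothetical excerpt: the "GECCO-Data" prefix and both field names are
# assumptions made for illustration, not taken from a real GECCO record.
SAMPLE = """\
LOCUS       contig_1_cluster_1     12345 bp    DNA     linear   UNK
COMMENT     ##GECCO-Data-START##
            version           :: GECCO vX.Y.Z
            cluster_type      :: Polyketide
            ##GECCO-Data-END##
FEATURES             Location/Qualifiers
"""

START = re.compile(r"##(?P<prefix>.+?)-START##")
END = re.compile(r"##(?P<prefix>.+?)-END##")


def parse_structured_comments(text):
    """Return {prefix: {key: value}} for each structured comment block."""
    blocks = {}
    current = None  # dict of the block being read, or None outside a block
    for raw in text.splitlines():
        line = raw.strip()
        # The first comment line carries the COMMENT keyword; drop it.
        if line.startswith("COMMENT"):
            line = line[len("COMMENT"):].strip()
        start = START.fullmatch(line)
        if start:
            current = blocks.setdefault(start.group("prefix"), {})
        elif END.fullmatch(line):
            current = None
        elif current is not None and "::" in line:
            key, value = line.split("::", 1)
            current[key.strip()] = value.strip()
    return blocks


if __name__ == "__main__":
    for prefix, fields in parse_structured_comments(SAMPLE).items():
        print(prefix, fields)
```

The resulting dictionaries could then be reshaped into whatever a downstream tool expects; that mapping is left out here because it depends on the target format.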
Hi @althonos, I am not able to get the datasets.tsv file and the taxonomy folders. Are those supposed to be generated via the convert command?
BiG-SLiCE requires these files because of its expected input structure; GECCO cannot generate them for you.
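Since these files have to be written by hand or by script, a rough sketch of generating them in Python might look like the following. The column layout used here (four tab-separated columns in datasets.tsv, and a taxonomy table with a genome folder followed by eight ranks) is an assumption based on my recollection of the BiG-SLiCE input folder template; the function name `write_bigslice_inputs` and the example values are hypothetical. Please verify the layout against the BiG-SLiCE documentation before using it.

```python
import csv
from pathlib import Path

# Assumed taxonomy columns; verify against the BiG-SLiCE input template.
RANKS = ["Kingdom", "Phylum", "Class", "Order",
         "Family", "Genus", "Species", "Organism"]


def write_bigslice_inputs(root, dataset, genomes, description=""):
    """Write datasets.tsv and taxonomy/<dataset>_taxonomy.tsv under `root`.

    `genomes` maps a genome folder name to a list of len(RANKS) strings.
    Returns the paths of the two files that were written.
    """
    root = Path(root)
    tax_dir = root / "taxonomy"
    tax_dir.mkdir(parents=True, exist_ok=True)
    tax_file = tax_dir / f"{dataset}_taxonomy.tsv"

    # One row per genome folder, followed by its taxonomic ranks.
    with open(tax_file, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["# Genome folder", *RANKS])
        for folder, ranks in genomes.items():
            if len(ranks) != len(RANKS):
                raise ValueError(f"{folder}: expected {len(RANKS)} ranks")
            writer.writerow([f"{folder}/", *ranks])

    # One row per dataset; paths are written relative to datasets.tsv.
    datasets_file = root / "datasets.tsv"
    with open(datasets_file, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["# Dataset name", "Path to folder",
                         "Path to taxonomy", "Description"])
        writer.writerow([dataset, f"{dataset}/",
                         f"taxonomy/{tax_file.name}", description])
    return datasets_file, tax_file
```

Calling `write_bigslice_inputs("input", "dataset_1", {"genome_1": [...]})` would create both files next to a `dataset_1/` folder, which you would still need to populate with the cluster GenBank files yourself.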
Hi @althonos, thanks for responding to my queries. I have a follow-up query: you suggest using gecco as a complement to antiSMASH
My question: I am assuming
Were the above done with
The reason I ask is that the preprint at one place talks about antiSMASH 4.2. Is there any specific reason why 4.2 was used when 5.1 or 5.2 was already available?
Hi @althonos, I was wondering if you could elaborate on the above. Thanks
@tamuanand: Figure 3.a was done with antiSMASH 5.2. We used antiSMASH 4.2 to mask the biosynthetic regions from our training data, because we prepared the sequences at a time when antiSMASH 5 was not available. We are in the process of improving our training set, which includes rebuilding our set of contigs, and for this we will use antiSMASH 5.2 as well.
Hi @althonos, antiSMASH 6 is now available. If you are planning to use antiSMASH, I would recommend using antiSMASH 6.0.
Hi @althonos

I have some questions pertaining to documentation. I know you mention here some documentation and also have a disclaimer.

Before I ask my questions, I think there is a bug or something wrong in the help text for -vvv (verbose debugging). I do not think that the -vvv is working. Does it stand for very very verbose?

gecco -vvv run --genome GENOME.fasta -o gecco_GENOME >& verbose_GENOME_gecco.txt &
change vvv to vv

Here is the relevant gecco --help text - it states vvv shows debug information

I have some questions/feature requests:
- When do you use the gecco annotate command and what is the purpose of it
- In what scenarios does one use gecco for downstream post-processing with antismash. I could not understand the use case for it from the preprint
- As a feature request or enhancement, it would be nice to have gecco outputs (or scripts) in a compatible way for BiG-SLiCE.