[Question] CI for Variant Calling #271

Open
Krannich479 opened this issue May 21, 2024 · 5 comments

@Krannich479

Is your feature request related to a problem? Please describe.
No problem here, just a question and an offer.

Describe the solution you'd like
I am currently developing a framework for continuous integration and evaluation of small variant calling. I read the GH Actions of poreCov and saw that the current tests are dry runs that check basic functionality. I am offering to write a GH Action that benchmarks the variant calling (precision, recall, and F1 score on synthetic data). Would that be of interest or benefit to you? Are you already running such tests locally or otherwise outside of GH Actions?
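
For reference, a minimal sketch of the three metrics named above, assuming the benchmark has already counted true positives (calls that match the synthetic truth set), false positives (calls not in the truth set), and false negatives (truth variants that were missed). The function name `calling_metrics`, the `CallingMetrics` container, and the example counts are hypothetical and not part of poreCov or the framework mentioned here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallingMetrics:
    precision: float
    recall: float
    f1: float


def calling_metrics(tp: int, fp: int, fn: int) -> CallingMetrics:
    """Compute precision, recall and F1 from TP/FP/FN counts.

    Each ratio falls back to 0.0 when its denominator is zero,
    e.g. when the caller reports no variants at all.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return CallingMetrics(precision, recall, f1)


# Hypothetical counts: 8 correct calls, 2 spurious calls, 2 missed variants.
metrics = calling_metrics(tp=8, fp=2, fn=2)
print(metrics)
```

With these made-up counts, precision and recall are both 8/10, so F1 is also about 0.8.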

@Krannich479 added the "enhancement" label May 21, 2024
@replikation
Owner

I would prefer to run complete tests automatically via GitHub Actions, but the question is who would finance this ;)? We added test profiles so that we can run tests outside of GH Actions. Important containers are tested directly with real data prior to deployment. So the workflow logic is tested, and so are some of the important containers.

Also, since the workflow uses containerization, it would need to download and run multiple containers, which would require more disk space for the GitHub Actions runs, I think.

But maybe I misunderstood what you mean by benchmark. We could always add such tests to our test profile runs (listed in the --help

poreCov/poreCov.nf

Lines 559 to 563 in ff6cb14

${c_yellow}Input test data${c_reset} (choose one):
test_fasta
test_fastq
test_fast5
)

@Krannich479
Author

Valid points. When I came up with this idea I totally forgot that poreCov is deploying a number of containers that in total might exceed the GH Actions runner's resources. Also, the most relevant aspect to me is the variant calling which (I just found out) has a quite elaborate test suite within the ARTIC pipeline. I'll close this issue for now.

@hoelzer
Collaborator

hoelzer commented May 21, 2024

Agree, but maybe the "test profile" option mentioned by @replikation could be an interesting use case, @Krannich479. "test_vcf" could download simulated example data and a "gold standard" VCF and then let poreCov produce a new VCF, which can then be compared against the gold standard.

Might be neat to test new medaka versions and models.

The test run would then happen locally, outside any GitHub Action.
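
The comparison step could be sketched roughly as follows, assuming both VCFs are uncompressed text and that variants are matched exactly on CHROM, POS, REF, and ALT. The helper names `read_variants` and `compare_calls` and all records below are made up for illustration. This naive exact match does not normalise equivalent representations of the same indel, so a real benchmark would likely need a dedicated comparison tool.

```python
from typing import Iterable, Set, Tuple

Variant = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT)


def read_variants(vcf_lines: Iterable[str]) -> Set[Variant]:
    """Collect (CHROM, POS, REF, ALT) keys from uncompressed VCF text lines."""
    variants: Set[Variant] = set()
    for line in vcf_lines:
        # Skip blank lines, meta lines ("##...") and the column header ("#CHROM...").
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        # Multi-allelic records list several ALT alleles separated by commas.
        for allele in alt.split(","):
            variants.add((chrom, int(pos), ref, allele))
    return variants


def compare_calls(truth: Set[Variant], called: Set[Variant]) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives)."""
    tp = len(truth & called)
    fp = len(called - truth)
    fn = len(truth - called)
    return tp, fp, fn


# Made-up records standing in for the "gold standard" and the newly produced VCF.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
gold_lines = [
    "##fileformat=VCFv4.2\n",
    header,
    "chr_demo\t241\t.\tC\tT\t.\tPASS\t.\n",
    "chr_demo\t3037\t.\tC\tT\t.\tPASS\t.\n",
    "chr_demo\t23403\t.\tA\tG\t.\tPASS\t.\n",
]
new_lines = [
    "##fileformat=VCFv4.2\n",
    header,
    "chr_demo\t241\t.\tC\tT\t.\tPASS\t.\n",
    "chr_demo\t23403\t.\tA\tG\t.\tPASS\t.\n",
    "chr_demo\t100\t.\tG\tA\t.\tPASS\t.\n",
]

tp, fp, fn = compare_calls(read_variants(gold_lines), read_variants(new_lines))
print(tp, fp, fn)  # prints "2 1 1"
```

In a test run, the two line lists would come from opening the downloaded gold-standard file and the VCF written by the workflow; the resulting counts could then be checked against a threshold.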

@MarieLataretu
Collaborator

Yes, I agree with @hoelzer; it'd still be useful and interesting outside of GH!

We have GH Actions using containers for some Nextflow pipelines under a (free (?)) organization account. So far, I have not encountered problems with GH Actions (e.g. clean and CoVpipe2). But I have to say that I have no overview of the limits, especially for non-organization accounts.

@Krannich479
Author

Okay okay @MarieLataretu @hoelzer @replikation, I'll have to look into how this works with profiles + nxf but I'll give it an attempt.

@Krannich479 reopened this May 24, 2024