[Question] CI for Variant Calling #271

Krannich479 · 2024-05-21T12:47:06Z

Is your feature request related to a problem? Please describe.
No problem here, just a question and an offer.

Describe the solution you'd like
I am currently developing a framework for continuous integration and evaluation of small variant calling. I read the GH Actions of poreCov and saw that the current tests are dry runs that check the basic functionality. What I offer is to write a GH Action that benchmarks the variant calling (precision/recall/F1-score on synthetic data). Would that be of interest or benefit to you? Are you already running such tests locally or otherwise outside of GH Actions?
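For concreteness, the metrics mentioned above could be computed from true-positive, false-positive and false-negative counts roughly as follows. This is only a minimal sketch and not part of poreCov; the function name `benchmark_metrics` and the choice to return 0.0 when a denominator is empty are assumptions made for illustration.

```python
def benchmark_metrics(tp: int, fp: int, fn: int) -> dict:
    """Return precision, recall and F1 for variant-call counts.

    tp: called variants that are in the truth set
    fp: called variants that are not in the truth set
    fn: truth variants that were not called
    """
    # Assumption: an empty denominator yields 0.0 instead of raising.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}
```

A CI job could then fail the run if, say, `benchmark_metrics(tp, fp, fn)["f1"]` drops below a threshold agreed on for the synthetic data set.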

replikation · 2024-05-21T13:34:08Z

I would prefer to run complete automatic tests via GitHub Actions, but the question is who would finance this ;)? We added test profiles so that we can run tests outside of GH Actions. Important containers are tested directly with real data prior to deployment. So the workflow logic is tested, and so are some of the important containers.

Also, since the workflow uses containerization, it would need to download and run multiple containers, which would require more space for the GH Actions runs, I think.

But maybe I misunderstood what you mean by benchmark. We could always add such tests to our test profile runs (listed in the --help):

poreCov/poreCov.nf

Lines 559 to 563 in ff6cb14

    ${c_yellow}Input test data${c_reset} (choose one):
      test_fasta
      test_fastq
      test_fast5

Krannich479 · 2024-05-21T14:09:56Z

Valid points. When I came up with this idea I totally forgot that poreCov is deploying a number of containers that in total might exceed the GH Actions runner's resources. Also, the most relevant aspect to me is the variant calling which (I just found out) has a quite elaborate test suite within the ARTIC pipeline. I'll close this issue for now.

hoelzer · 2024-05-21T17:31:55Z

Agreed, but maybe the "test profile" option mentioned by @replikation could be an interesting use case, @Krannich479. A "test_vcf" profile could download simulated example data and a "gold standard" VCF, and then let poreCov produce a new VCF that can be compared against it.

Might be neat to test new medaka versions and models.

The test run would then happen locally, outside of any GitHub Action.
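To illustrate the comparison step of such a "test_vcf" run, here is a rough sketch that counts matching and non-matching calls between a gold-standard VCF and a newly produced one. It is not poreCov code: the names `read_variants` and `compare_calls` are hypothetical, and it assumes variants can be matched exactly on CHROM, POS, REF and ALT, without normalizing indel representation and without looking at FILTER or genotype fields.

```python
def read_variants(vcf_text: str) -> set:
    """Collect (CHROM, POS, REF, ALT) keys from VCF-formatted text."""
    variants = set()
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta lines and the column header
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        # Multi-allelic records list their ALT alleles comma-separated.
        for allele in alt.split(","):
            variants.add((chrom, int(pos), ref, allele))
    return variants


def compare_calls(truth_vcf: str, called_vcf: str) -> dict:
    """Count exact-match TP, FP and FN between truth and called variants."""
    truth = read_variants(truth_vcf)
    called = read_variants(called_vcf)
    return {
        "tp": len(truth & called),   # in both sets
        "fp": len(called - truth),   # called but not in the gold standard
        "fn": len(truth - called),   # in the gold standard but not called
    }
```

Both functions take the VCF content as a string, so a local test run would read the two files first (for example with `pathlib.Path(...).read_text()`). A dedicated comparison tool would be the more robust choice once indels need to be normalized.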

MarieLataretu · 2024-05-22T11:20:48Z

Yes, I agree with @hoelzer; it'd still be useful and interesting outside of GH!

We have GH Actions using containers for some Nextflow pipelines under a (free (?)) organization account. So far, I have not encountered problems with GH Actions (e.g. clean and CoVpipe2). But I have to say that I have no overview of the limits, especially for non-organization accounts.

Krannich479 · 2024-05-24T11:33:12Z

Okay okay @MarieLataretu @hoelzer @replikation, I'll have to look into how this works with profiles + nxf but I'll give it an attempt.

Krannich479 added the enhancement New feature or request label May 21, 2024

Krannich479 assigned replikation May 21, 2024

Krannich479 closed this as completed May 21, 2024

Krannich479 reopened this May 24, 2024
