-
Notifications
You must be signed in to change notification settings - Fork 591
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Code Review for #8752 #8871
Code Review for #8752 #8871
Conversation
… events (broadinstitute#8717) * M2 bad haplotype filter does not filter variants that share a haplotype with a germline event * two ECNT annotations -- haplotype and region -- and clustered events filter looks at both
…37/hg19 conversion (broadinstitute#8758) Don't print the very long and misleading "The following contigs are present in b37 and missing in the input VCF sequence dictionary" log message when we're not even doing b37/hg19 conversion.
Co-authored-by: Dror Kessler <[email protected]>
Set GGVCFs --all-sites GQ0 hom-refs to no-calls Set regular GGVCFs GQ0 hom-refs to no-calls (any DP, PL) for better AF/AN annotations Remove PLs in "no data" case where DP=0 for more accurate QUAL score
Several files tracked by git lfs were accidentally reimported as normal files. This makes them stubs again.
…elimited) (broadinstitute#8771) * Enable ReblockGVCF to subset AS annotations that aren't "raw" (i.e. pipe-delimited) * Fix tests by removing AssemblyComplexity from default annotations
* Add MappingQualityReadFilter * Added additional warnings for mmq * Fixed doc typo
…adinstitute#8802) * Add malaria spanning deletion exception regression test with fix * Disabling codecov. --------- Co-authored-by: Jonn Smith <[email protected]>
* Fixed a bug that prevented filtering by SOR in many cases
…f in GenomicsDB (broadinstitute#8759) * Allow for GT to be a nocall if GQ and PL[0] are zero instead of homref in GenomicsDB * Move to 1.5.3 release from snapshot --------- Co-authored-by: Nalini Ganapati <[email protected]> Co-authored-by: Nalini Ganapati <[email protected]>
* Reduced total layers in the GATK docker image from 44 down to 16. * Reduced GATK base image layers from 20 to 3. * This might be a better solution than a full squash down to a single layer, because: If we are hosting this in a premium ACR, the limit is 10,000 readOps per minute. So with 16 layers, you get around 625 pulls per minute. Also, this will be able to still take advantage of parallel pulls (default is 3, but at most 16 threads in this case, I believe) as opposed to one big layer which will not download in parallel. There's the potential of that being a lot slower and subsequent jobs falling into the same "minute" because others are not done, making it easier to hit that 10k readOps limit. Lastly, people using GATK outside data pipelines will not be able to take advantage of layer caching too. Resolves broadinstitute#8684
…in VCF header (broadinstitute#8831) Added a --mask-description argument to VariantFiltration to write a custom description for the mask filter in the VCF header
… with truth VCF (broadinstitute#8836) * added 20 more Permutect read features * Permutect test data can, like training data, be annotated with a truth VCF
* [BIOIN-1570] Fixed edge case in variant annotation when the variant is close to the edge of the reference
…rve existing behavior when storeAllVcfSources=false
@lbergelson or @droazen: this is related to your comments on #8752 about preserving the default behavior of just writing the first source. This PR overloads simpleMerge(), meaning code needs to opt-in to the feature of preserving all source names. I cannot kick off tests - would one of you be able to to that? |
@jamesemery, thanks for kicking off tests. the compile failure is probably due to a bad merge. I pushed a fix - would you be able to restart tests? |
…CNN WDL test failures (broadinstitute#8891) Replace usage of a deprecated Genomes-in-the-Cloud docker image that was causing the CNN WDL tests to fail with the GATK base image.
* Update no-calls in Gnarly haploid test * Put back turned off tests
…r customized filtering of output variants to intervals. (broadinstitute#6388) Co-authored-by: James <[email protected]>
… better override behavior and documentation (broadinstitute#8986)
…omicsDBImport (broadinstitute#8987) * Switch to logging a warning instead of an exception for intervals in a query that were not part of GenomicsDBImport * upgrade GenomicsDB 1.5.3 -> 1.5.4 * fixes broadinstitute#8415
* updates changes for gq0 comparison more changes some vcfs have no gqs clean up change for NPE clean up whitespace * addressing comments
This commit changes the default --mitochondria-mode parameters to what they are supposed to be in the source code based on the raised question in the forum https://gatk.broadinstitute.org/hc/en-us/community/posts/29969305493147-Incorrect-documentation-for-mitochondria-mode-in-Mutect2 This commit also fixes the mitochondria mode flag to its correct naming in the documentation.
…nstitute#9002) * Adds a new github actions workflow to submit our dependencies for analysis
…nstitute#9001) Our README was missing some important details on how to properly setup the GATK conda environment locally.
…nstitute#8004) * Port of the NVIDIA-authored nvscorevariants tool into GATK, with a basic tool frontend * This is a direct replacement for the legacy tool CNNScoreVariants. It produces results that are almost identical to that tool, but is implemented on top of a more modern ML library, Pytorch. * The Python code is taken from https://github.com/NVIDIA-Genomics-Research/nvscorevariants, with a few minor modifications necessary to get the tool working on newer versions of the Python libraries. * Added pytorch-lightning to the GATK conda environment, as it's required by this tool * Disabled jacoco in build.gradle, as it was causing strange errors related to jacoco trying to parse the new Pytorch model files in the resources directory --------- Co-authored-by: Louis Bergelson <[email protected]>
…tools (broadinstitute#9008) * Adding small warning messages to not to feed any GVCF files to these tools. * Update src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/VariantRecalibrator.java Co-authored-by: Louis Bergelson <[email protected]> * Update src/main/java/org/broadinstitute/hellbender/tools/walkers/annotator/VariantAnnotator.java Co-authored-by: Louis Bergelson <[email protected]> * Update src/main/java/org/broadinstitute/hellbender/tools/walkers/annotator/VariantAnnotator.java Co-authored-by: Louis Bergelson <[email protected]> --------- Co-authored-by: Louis Bergelson <[email protected]>
* Rebuild base image to incorporate recent patches * gatkbase-3.3.0 -> 3.3.1
…e configured tmp-dir (broadinstitute#8951)
* Update gradle and build.gradle * update gradle wrapper 8.2.1 -> 8.10.2 * remove 'versions' plugin because we don't use it * update gradle plugins to new versions * shadow plugin changed maintainers and coordinates com.github.johnrengelman.shadow:8.1.1 -> com.gradleup.shadow:8.3.3 * git-version 0.5.1 -> 3.1.0 * sonatype scan 2.6.1 -> 2.8.3 * download 5.4.0 -> 5.6.0 * use tasks.register() which is the newer style * add gradle.properties to configure daemon memory * update several vulnerable dependencies to safer versions * update htsjdk and picard * replace deprecated override of ThresholdStream.getStream() -> getOutputStream()
…roadinstitute#9009) Removed the code for these tools, now that NVScoreVariants is merged and we've moved to a Python environment that is incompatible with these legacy tools. Also added the tools to the DeprecatedToolsRegistry.
* Updating dependency management and vulnerable dependencies * Update dependencies to fix vulnerabilities as reported in broadinstitute#8950 * Update our dependency management to make use of some newish gradle features * Add dependency constraints to update transitive dependencies, this allows us to specify versions without making them direct dependencies * Remove most force expressions and replace them where necessary with version strict requirements * Make use of several published bom's to configure consistent dependency versions for platforms like netty and log4j2 * Remove exclude statements that are now handled by variant dependency resolution (like guava android vs jdk) * Exclude the org.bouncycastle:bcprov-jdk15on dependency and replace it with bcprov-jdk18onA This adds an unnecessary testUtilImplementation level dependency on what is really a transitive, but I couldn't get gradle's explicit version replacement to work. replacement logic to work so this is a workaround
…gged Gencode transcripts where possible (broadinstitute#9012) * Added a '--prefer-mane-transcripts' mode that enforces MANE_Select tagged Gencode transcripts where possible
* fix dependabot alert for jetty-xml * bom version -> 9.4.56.v2024082
…of + (broadinstitute#9018) * maven central doesn't like + in version ranges and gradle doesn't translate it
In GATK3, when merging variants, the IDs of all the source VCFs were retained. This code path seems like it intended that, since the variantSources set is generated, but it doesnt get used for anything. This PR will use that set to set the source of the resulting merged VC.
…rve existing behavior when storeAllVcfSources=false
…_merge2 # Conflicts: # src/main/java/org/broadinstitute/hellbender/utils/variant/GATKVariantContextUtils.java
Closed in favor of: #9032 |
This is responding to code review on #8752