Remove extreme multiallelic in exomes VDS #580

Merged on Mar 18, 2024 (3 commits)

KoalaQin
Contributor

This adds an option to remove an excessively multi-allelic locus on chr19, as we did in:

logger.info("Dropping excessively multi-allelic site at chr19:5787204...")

@@ -279,6 +283,11 @@ def _remove_ukb_dup_by_index(
     if split:
         logger.info("Splitting multiallelics...")
         vmt = vds.variant_data
         if remove_extreme_multi_allelic:
Contributor

Let's just remove it no matter what, even if split is not True. Maybe switch to using filter_intervals with keep=False.
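
A minimal sketch of what this suggestion could look like, assuming Hail's hl.vds.filter_intervals is called with keep=False on the whole VDS, independent of split. The constant and helper names (site_to_interval_str, remove_site) are hypothetical, and this is not necessarily the code that was merged. The one-base interval matches the one used in the test script in this conversation.

```python
# Hypothetical constants and helper names, for illustration only.
EXTREME_MULTIALLELIC_CONTIG = "chr19"
EXTREME_MULTIALLELIC_POS = 5787204


def site_to_interval_str(contig, position):
    """Build a 'contig:start-end' string covering exactly one position.

    Hail parses this form as left-inclusive, right-exclusive by default,
    so the end coordinate is position + 1.
    """
    return f"{contig}:{position}-{position + 1}"


def remove_site(
    vds,
    contig=EXTREME_MULTIALLELIC_CONTIG,
    position=EXTREME_MULTIALLELIC_POS,
    reference_genome="GRCh38",
):
    """Return the VDS with the given site filtered out."""
    import hail as hl  # imported lazily so the string helper works without Hail

    interval = hl.parse_locus_interval(
        site_to_interval_str(contig, position),
        reference_genome=reference_genome,
    )
    # keep=False removes the interval instead of restricting the VDS to it.
    return hl.vds.filter_intervals(vds, [interval], keep=False)
```

Because the filter is applied to the VDS itself rather than inside the `if split:` branch, the site would be dropped whether or not multiallelics are split afterwards.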

KoalaQin changed the title from "Add option to remove extreme multiallelic in VDS" to "Remove extreme multiallelic in exomes VDS" on Mar 16, 2024
@KoalaQin
Contributor Author

Code to test if this was done:

"""Test script for gnomAD v4 data"""
import hail as hl
from gnomad_qc.v4.resources.basics import get_gnomad_v4_vds

vds = get_gnomad_v4_vds()
vds = hl.vds.filter_intervals(
    vds,
    [hl.parse_locus_interval("chr19:5787204-5787205", reference_genome="GRCh38")],
)
print(vds.variant_data.count())

ec223c63fde5413a93e1dfc7ca33e2c2 output:
(0, 940583)
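
For readers unfamiliar with this output: MatrixTable.count() returns a (rows, columns) pair, so (0, 940583) means no variant rows remain in the filtered interval while 940583 columns (samples) are still present. A small sketch of turning that into an explicit check, where site_is_removed is a hypothetical helper name:

```python
def site_is_removed(count):
    """Return True when a (n_rows, n_cols) count shows no variant rows left."""
    n_rows, _n_cols = count
    return n_rows == 0


print(site_is_removed((0, 940583)))  # True
```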

Contributor

jkgoodrich left a comment

LGTM!

KoalaQin merged commit acf9da7 into main on Mar 18, 2024
2 checks passed
KoalaQin deleted the qh/remove_vds_multiallelic branch on April 2, 2024 at 16:58