Suggestions to calculate variants per sample in exomes and genomes #577

KoalaQin · 2024-03-09T02:13:26Z

I simplified the two functions and renamed a few things, and I added an option to get the counts of missense and synonymous variants. I also prefer to write the aggregated stats to the HT globals; I think it's better structured that way. We can discuss whether to output just one final HT while still doing it in two steps.
I got the same number of all_variants as you for 1 sample, but the number of variants passing filters again doesn't match. I think the new exome release may have changed; we need to confirm that with Julia.
My test run is here: bcd45e3f472b4033bfac9532a022dafe
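
For illustration only, the two-step structure described above (per-sample counts first, then aggregated stats stored alongside the table) can be sketched in plain Python. This is not the Hail code in the PR: `count_per_sample`, `summarize`, and the record layout are hypothetical stand-ins, and a plain dict plays the role of the HT globals.

```python
from collections import Counter, defaultdict
from statistics import mean


def count_per_sample(calls):
    """Step 1: count variants per sample, overall and by consequence.

    `calls` is an iterable of (sample_id, consequence) pairs, one per
    non-reference call (hypothetical layout, not the PR's schema).
    """
    counts = defaultdict(Counter)
    for sample, csq in calls:
        counts[sample]["all_variants"] += 1
        if csq in ("missense_variant", "synonymous_variant"):
            counts[sample][csq] += 1
    return {s: dict(c) for s, c in counts.items()}


def summarize(per_sample):
    """Step 2: aggregate across samples; the result stands in for the globals."""
    keys = sorted({k for c in per_sample.values() for k in c})
    return {
        k: {
            "mean": mean(c.get(k, 0) for c in per_sample.values()),
            "min": min(c.get(k, 0) for c in per_sample.values()),
            "max": max(c.get(k, 0) for c in per_sample.values()),
        }
        for k in keys
    }


calls = [
    ("s1", "missense_variant"),
    ("s1", "synonymous_variant"),
    ("s1", "intron_variant"),
    ("s2", "missense_variant"),
]
per_sample = count_per_sample(calls)
# One object carrying both the per-sample rows and the aggregated stats.
table = {"rows": per_sample, "globals": summarize(per_sample)}
```

Keeping the two steps as separate functions while returning a single object mirrors the "one final HT, two steps" idea without committing to either output layout.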

KoalaQin · 2024-03-09T02:35:30Z

gnomad_qc/v4/create_release/calculate_variant_statistics.py

-    :param agg_rare_variants: Stratify by variants which have adj AF <0.1%.
-    :param suffix: String of arguments to append to name.
+    :param pass_filters: Filter to variants which pass all variant qc filters.
+    :param ukb_regions: Filter to variants in UKB regions.


I'm wondering whether all of these should go under pass_filters, like:

if pass_filters:
    arg_dict.update({"pass_filters": hl.len(vmt.filters) == 0})
if ukb_regions:
    ....
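
To make the suggestion concrete, here is a self-contained sketch of the pattern: each enabled flag adds one named filter to `arg_dict`. In the PR the values would be Hail expressions such as `hl.len(vmt.filters) == 0`; here plain Python predicates over dict records stand in for them. The `in_ukb_region` field is a hypothetical placeholder, since the comment above leaves the `ukb_regions` branch elided.

```python
def build_filters(pass_filters=False, ukb_regions=False):
    """Return a dict of named predicates, one per enabled flag."""
    arg_dict = {}
    if pass_filters:
        # Stand-in for the Hail expression hl.len(vmt.filters) == 0.
        arg_dict["pass_filters"] = lambda v: len(v["filters"]) == 0
    if ukb_regions:
        # Hypothetical placeholder: the thread does not spell out this filter.
        arg_dict["ukb_regions"] = lambda v: v["in_ukb_region"]
    return arg_dict


def count_by_filter(variants, arg_dict):
    """Count the variants that satisfy each named filter."""
    return {
        name: sum(1 for v in variants if keep(v))
        for name, keep in arg_dict.items()
    }


variants = [
    {"filters": set(), "in_ukb_region": True},
    {"filters": {"AC0"}, "in_ukb_region": True},
    {"filters": set(), "in_ukb_region": False},
]
counts = count_by_filter(
    variants, build_filters(pass_filters=True, ukb_regions=True)
)
```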

KoalaQin · 2024-03-10T16:15:53Z

gnomad_qc/v4/create_release/calculate_variant_statistics.py

-        vmt_vep = vep.filter_vep_transcript_csqs(
-            vmt,
+    if by_csqs:
+        ht = filter_vep_transcript_csqs(


Did you and Julia agree on filtering to only MANE Select transcripts? I'm leaving it as is for now.

No, that was just the default. I'll parametrize this in the coming review, though.
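
One way to read "parametrize this" is to expose the MANE Select restriction as a keyword argument instead of relying on a default. The sketch below is plain Python: `filter_transcript_csqs`, `mane_select_only`, and the record fields are hypothetical, and this is not the signature of the `filter_vep_transcript_csqs` helper used in the diff.

```python
def filter_transcript_csqs(transcript_csqs, mane_select_only=True):
    """Keep all transcript consequences, or only those on MANE Select transcripts.

    Each record is a dict whose "mane_select" value is None when the
    transcript is not MANE Select (hypothetical layout).
    """
    if not mane_select_only:
        return list(transcript_csqs)
    return [c for c in transcript_csqs if c.get("mane_select") is not None]


csqs = [
    {"transcript_id": "T1", "mane_select": "NM_1", "csq": "missense_variant"},
    {"transcript_id": "T2", "mane_select": None, "csq": "synonymous_variant"},
]
mane_only = filter_transcript_csqs(csqs)
all_csqs = filter_transcript_csqs(csqs, mane_select_only=False)
```

Making the flag explicit at the call site also documents the choice in any output name or log that records the arguments.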

KoalaQin · 2024-03-11T13:03:54Z

@matren395 Back to you now!

matren395

I have some more work to do this evening, let me know how this looks

gnomad_qc/v4/create_release/calculate_variant_statistics.py

KoalaQin · 2024-03-12T18:20:53Z

@matren395 Back to you again. I deleted two unnecessary arguments and implemented your suggestions (thanks a lot!). New test run: 6776707a23c948d79e9df883d5c4d42f

KoalaQin added 3 commits March 8, 2024 14:36

clean up the first function

7a5d23d

minor edits

6e3d62e

Add docstring to temp hail script

0d6049b

KoalaQin commented Mar 9, 2024

KoalaQin added 3 commits March 10, 2024 11:43

Restructure aggregate function

2aff019

minor edits to fix failing checks

7f2e91f

small change on output suffix

4a8de59

KoalaQin changed the title ~~Per sample counts 4 1 suggestions~~ Suggestions to calculate variants per sample in exomes and genomes Mar 10, 2024

KoalaQin commented Mar 10, 2024

KoalaQin requested a review from matren395 March 10, 2024 16:16

KoalaQin assigned KoalaQin and matren395 Mar 10, 2024

KoalaQin added v4.1 Release Stats labels Mar 10, 2024

Add note about remomving

832e8b1

matren395 reviewed Mar 11, 2024

Address review suggestions

32d0b0f

KoalaQin requested a review from matren395 March 12, 2024 18:20

matren395 approved these changes Mar 12, 2024

KoalaQin merged commit fdbc7b8 into dm/per_sample_counts_4_1 Mar 12, 2024
1 check passed

KoalaQin deleted the qh/per_sample_counts_4_1_suggestions branch April 2, 2024 16:59
