ClinGenAllele - MT not in GRCh37 #1018

Open
davmlaw opened this issue Apr 2, 2024 · 14 comments

@davmlaw
Contributor

davmlaw commented Apr 2, 2024

CA658659094 is on MT (which is contig NC_012920.1 - same on both builds)

However, it only lists it as being for GRCh38 - so 37 fails with:

CA658659094/GRCh37 not in ClinGenAllele genomicAlleles response 

I should contact ClinGen about returning a GRCh37 response as well.

We could also handle this case ourselves, as the contig is the same in both builds.

@davmlaw
Contributor Author

davmlaw commented Apr 10, 2024

Made it work via contigs (so that 37 will look for the shared contig in GRCh38)

This should only take a few extra string comparisons per lookup so not too bad
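A minimal sketch of the contig fallback described above, not the actual VariantGrid code. The `genomicAlleles` key comes from the error message earlier in this issue; the `referenceGenome` and `hgvs` fields, the `SHARED_CONTIGS` set and the function name are assumptions made for illustration.

```python
from typing import Optional

# Assumption: contigs whose sequence is identical in GRCh37 and GRCh38.
# NC_012920.1 (MT) is the one named in this issue.
SHARED_CONTIGS = {"NC_012920.1"}


def find_genomic_allele(genomic_alleles: list, genome_build: str) -> Optional[dict]:
    """Pick the genomicAlleles entry for a build, falling back to a shared contig."""
    # First pass: an entry explicitly listed for the requested build.
    for allele in genomic_alleles:
        if allele.get("referenceGenome") == genome_build:
            return allele

    # Fallback: an entry from another build on a contig shared by both builds.
    # This costs only a few extra string comparisons per lookup.
    for allele in genomic_alleles:
        for hgvs in allele.get("hgvs", []):
            contig = hgvs.split(":", 1)[0]
            if contig in SHARED_CONTIGS:
                return allele
    return None


# Hypothetical response fragment: MT allele listed for GRCh38 only.
mt_only_38 = [{"referenceGenome": "GRCh38",
               "hgvs": ["NC_012920.1:m.11037_11038insC"]}]
mt_match = find_genomic_allele(mt_only_38, "GRCh37")
```

With this fallback, a GRCh37 lookup for an MT allele returns the GRCh38 entry instead of failing, while alleles on build-specific contigs still return nothing for a build that is not listed.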

@EmmaTudini
Contributor

Testing:

  • Logged in as non-admin assigned to VCGS ([email protected])
  • As admin, imported ENST00000361381.2(MT-ND4):c.278_279insC (mitochondrial variant without m. – as m. turned off in Shariant currently)
  • As non-admin searched for
  1. Matching CAID CA2830850672
  2. NC_012920.1:m.11037_11038insC (alternative nomenclature for variant)
  3. ENST00000361381.2(MT-ND4):c.278_279insC
  4. ENST00000361381.2(MT-ND4):c.278_279insT
  5. NC_012920.1:m.11037_11038insT
  6. MT:11037 A>AC (correct)
  7. MT:11037 A>AT (incorrect)
    Expected output: 1,2,3,6 Should go to the appropriate allele page
    4,5,7 – should show result found with alternative alt and link to allele page. Example below

image.png

Actual output: Failed
4,5 both went to the allele page when genome build set to GRCh38, without any warning that the alt was different
BUT correct warning was shown when preferred build was GRCh37 (even though variant was imported as GRCh38)

7 – showed no results (in both genome builds)

  • Also checked with
    a. Deletion – searching for incorrect reference base - ENST00000361381.2(MT-ND4):c.279delG (instead of A)
    b. SNV - ENST00000361381.2(MT-ND4):c.1A>C (instead of G) and
    ENST00000361381.2(MT-ND4):c.1T>G (instead of reference base of A)
    MT:10760 A>C (instead of G)
    MT:10760 T>G (instead of reference base of A)
    Expected output: Should show results with warnings
    Actual output: Failed
  • ENST searches go to the allele page when the preferred build is GRCh38, but to the warning page when it is set to GRCh37
  • MT genomic locations show no results (instead of showing warning with alternative)
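The expected behaviour in the tests above (exact match goes to the allele, same locus with a different alt or reference base shows a warning) can be sketched as follows. This is only an illustration under assumed names (`Variant`, `classify_search`); it is not how the search is implemented.

```python
from typing import List, NamedTuple, Optional, Tuple


class Variant(NamedTuple):
    contig: str
    position: int
    ref: str
    alt: str


def classify_search(query: Variant,
                    known: List[Variant]) -> Tuple[Optional[Variant], Optional[str]]:
    """Return (match, warning); warning is None for an exact match."""
    # Exact match: go straight to the allele, no warning.
    for variant in known:
        if variant == query:
            return variant, None

    # Same locus but different bases: return the stored variant with a warning.
    for variant in known:
        if (variant.contig, variant.position) == (query.contig, query.position):
            if variant.ref != query.ref:
                return variant, (f"reference base differs: searched {query.ref}, "
                                 f"found {variant.ref}")
            return variant, f"alt differs: searched {query.alt}, found {variant.alt}"

    # Nothing at this locus.
    return None, None


known_variants = [Variant("MT", 11037, "A", "AC")]
exact = classify_search(Variant("MT", 11037, "A", "AC"), known_variants)
wrong_alt = classify_search(Variant("MT", 11037, "A", "AT"), known_variants)
```

Under this sketch, search 6 (MT:11037 A>AC) is an exact match and search 7 (MT:11037 A>AT) returns the stored variant with an "alt differs" warning rather than no results.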

@davmlaw – not sure whether this is an MT contig issue or an ENST search issue

  • As non-admin, searched for CA337095804 and NC_012920.1:m.263A>G (same variant)
    Expected output: Should return no results (as not in Shariant)
    Actual output: Passed

@davmlaw Slightly separate issues:

  1. On the allele page, the MT variants are shown as a g. rather than an m. (as per standards). Can this be changed as a separate issue? See doc with standards from the UK here - https://www.acgs.uk.com/media/11935/bpg-for-the-molecular-diagnosis-of-mitochondrial-disease_ratified-november-2020.pdf
  2. Clicking on the g.HGVS takes you to the variant in the wrong genome build. I think it might default to your internal genome build setting. Screenshots below:

Screenshot 2024-06-27 at 2.05.20 pm.png

Screenshot 2024-06-27 at 2.05.27 pm.png

Also a note that we don’t allow for import of these variants in Shariant at the moment (as far as I can tell) – have raised https://github.com/SACGF/variantgrid_shariant/issues/169 to test

@EmmaTudini
Contributor

@davmlaw I'm moving this to another release, when we turn on MT variants. @TheMadBug Can you confirm that we don't currently allow MT variants, please?

@TheMadBug
Member

Right now the changes for ImportedAlleleInfo restrict users to importing only to c.HGVS and g.HGVS (not variant coordinate, even if the rest of the system is technically capable of importing via that).

Attempting to put MT:11037 A>AC into c_hgvs just results in a c.HGVS parsing error.

In future we will want to be able to import via variant_coordinate again (if just for communicating between systems) but can confirm right now we don't have to worry about users inserting new variants using it - and we control the importers for all but 1 of the systems at this point now anyway too.

@EmmaTudini
Contributor

@TheMadBug But you can import a chgvs that resolves to an MT - ENST00000361381.2(MT-ND4):c.1A>C.

@TheMadBug
Member

Yes. Currently, whether we accept a c.HGVS or not is based only on the transcript prefix, which we limit to
NM, NR, ENST and XR,
and on it being a c. or g.
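As a rough sketch of that check, taken literally from the description above (allowed prefix plus c. or g.): the regular expression, constant and function name below are hypothetical, not the importer's actual code.

```python
import re

# Prefixes listed in the comment above.
ALLOWED_PREFIXES = ("NM", "NR", "ENST", "XR")

# Hypothetical pattern: accession, optional version, optional (GENE), then kind.
_HGVS_PATTERN = re.compile(
    r"^(?P<transcript>[A-Z]+_?\d+(?:\.\d+)?)"  # e.g. ENST00000361381.2
    r"(?:\([^)]+\))?"                          # optional gene symbol, e.g. (MT-ND4)
    r":(?P<kind>[a-z])\."                      # c., g., m., ...
)


def is_importable(hgvs: str) -> bool:
    """True if the prefix is allowed and the HGVS kind is c. or g."""
    match = _HGVS_PATTERN.match(hgvs.strip())
    if match is None:
        return False
    return (match["transcript"].startswith(ALLOWED_PREFIXES)
            and match["kind"] in ("c", "g"))


mt_chgvs_ok = is_importable("ENST00000361381.2(MT-ND4):c.1A>C")
coordinate_ok = is_importable("MT:11037 A>AC")
```

Nothing in a prefix-based check looks at which contig the transcript resolves to, which is why an ENST c.HGVS on MT passes while a variant coordinate such as `MT:11037 A>AC` does not.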

From your comment 2 weeks ago it sounds like you've already imported one, but here's the example you just provided
https://test.shariant.org.au/classification/imported_allele_info/35539

As for our importers, there's nothing that would cause them to reject such a value, since from a programming point of view it looks like a pretty standard c.HGVS. The main thing is that the last time we even got an Ensembl transcript was in 2022, from CHW, for a record that was made in 2020.

@EmmaTudini
Contributor

Could this be prioritised for the next deploy? I might start tagging issues. There were a few searches that failed/returned false positives

@davmlaw
Contributor Author

davmlaw commented Jul 9, 2024

7 ... should show result found with alternative alt and link to allele page... without any warning that the alt was different

This was reported in "outstanding search issues" over a year ago but not done. I have split it into its own issue: #1106

@EmmaTudini
Contributor

@davmlaw There are other issues from this issue that need to be addressed outside of the one above. See my testing comment from June 27th

@davmlaw
Contributor Author

davmlaw commented Jul 9, 2024

Added not reporting Reference base to "not reporting alt" issue - #1106

Separate issues:

  1. Mito g. instead of m. - raised as Mitochondria generated with g. instead of m. #1107
  2. MT build specific variant links - raised as Allele page - links to MT variants should specify genome build #1108

@davmlaw
Contributor Author

davmlaw commented Jul 9, 2024

Sorry, I went through and looked for bold fails the first time. I think I got them all. If I missed anything else, can you please either raise a new issue or explicitly tell me what I missed. Thanks

@EmmaTudini
Contributor

@davmlaw Would searching for the below, and getting different results depending on the preferred genome build, fit into #1106?
ENST00000361381.2(MT-ND4):c.278_279insT
NC_012920.1:m.11037_11038insT

Both went straight to the allele page when genome build set to GRCh38, without any warning that the alt was different
BUT correct warning was shown when preferred build was GRCh37 (even though variant was imported as GRCh38)

@davmlaw
Contributor Author

davmlaw commented Jul 11, 2024

I think that's a new issue: it's HGVS (the other ones are for non-HGVS searches), and it looks to be MT and build specific.

@davmlaw
Contributor Author

davmlaw commented Jul 12, 2024

Split off into own issue - #1115 - Search for MT HGVS behaves differently across genome builds

This issue should now hopefully only be about being able to link ClinGenAlleles w/MT variants in GRCh37 (used to only work in 38)
