This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Differences in mitochondrially-encoded gene expression between RNA-seq samples sequenced at BGI@CHOP vs. NantOmics #1601

Open
jaclyn-taroni opened this issue Aug 12, 2022 · 1 comment

@jaclyn-taroni
Member

This issue is reported by @ginamawla. I am including Gina’s writing and visualizations here with some modifications for the issue.

What data file(s) does this issue pertain to?

Transcriptomic data, specifically TPM:

pbta-gene-expression-rsem-tpm.stranded.rds
pbta-gene-expression-rsem-tpm.polya.rds

But I believe this is most relevant to the stranded data.

What release are you using?

Unknown

Put a link to the relevant section of the OpenPBTA-manuscript here.

https://github.com/AlexsLemonade/OpenPBTA-manuscript/blob/ee2e934a9be7cdc77d883c031cf763e10ed49035/content/06.methods.md#data-generation

Put your question or report your issue here.

Summary

There are differences in mitochondrially-encoded gene expression between samples sequenced at BGI@CHOP and samples sequenced at NantOmics, even though the manuscript reports no differences in processing. These differences are very likely artifactual, and they have precluded classifying pediatric high-grade gliomas based on the RNA-seq data.

From @ginamawla

  1. All of the pHGG tumor samples sequenced at BGI@CHOP that I am working with in my studies (n=12) show ~200-300-fold higher levels of the mitochondrial 12S and 16S rRNAs (encoded by the genes MT-RNR1 and MT-RNR2). This result initially led us to believe that there was a novel subgroup of pHGG tumors, which we called the "mito" class.
  2. We were surprised to find that two biospecimens, BS_6VPKXXMR and BS_M85CXHDV, originated from the same sample but were sequenced at different centers (BGI@CHOP and NantOmics, respectively). Interestingly, BS_6VPKXXMR fell into the "mito" subgroup, as defined by high expression of MT-RNR1 and MT-RNR2, whereas BS_M85CXHDV did not.
  3. Upon closer inspection, this time taking particular notice of the sequencing center, we found that all 12 of the "mito" samples were sequenced by BGI@CHOP, and all 12 BGI@CHOP-sequenced samples were in the "mito" group.
  4. This raises the concern that high MT-RNR1 and MT-RNR2 expression levels are an artifactual, sequencing center-specific signature rather than evidence of a novel mitochondria-related pHGG subgroup.
  5. Of note, every other mitochondrially-encoded gene for which transcriptomic data are available is downregulated in the BGI@CHOP-sequenced samples.
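
For anyone who wants to repeat this kind of check on their own subset, here is a minimal sketch of the two comparisons described above: a per-gene fold change of median TPM between centers, and a test of whether a class label is perfectly confounded with sequencing center. The sample IDs, TPM values, the pseudocount, and the helper names `median_fold_change` and `is_confounded` are all hypothetical and for illustration only; this is not the code used to produce the figures in this issue, and the numbers are not OpenPBTA data.

```python
from statistics import median

# Hypothetical toy TPM values (NOT real OpenPBTA data): gene -> sample ID -> TPM.
tpm = {
    "MT-RNR1": {"S1": 50000.0, "S2": 60000.0, "S3": 200.0, "S4": 250.0},
    "MT-RNR2": {"S1": 90000.0, "S2": 80000.0, "S3": 300.0, "S4": 400.0},
    "MT-CO1": {"S1": 100.0, "S2": 120.0, "S3": 900.0, "S4": 1100.0},
}

# Hypothetical sample -> sequencing center annotation.
center = {"S1": "BGI@CHOP", "S2": "BGI@CHOP", "S3": "NantOmics", "S4": "NantOmics"}

# Hypothetical class labels, e.g. assigned from high MT-RNR1/MT-RNR2 expression.
labels = {"S1": "mito", "S2": "mito", "S3": "other", "S4": "other"}


def median_fold_change(gene_tpm, center, numerator, denominator, pseudocount=1.0):
    """Ratio of median TPM (plus a pseudocount) between two sequencing centers."""
    num = [v for s, v in gene_tpm.items() if center[s] == numerator]
    den = [v for s, v in gene_tpm.items() if center[s] == denominator]
    return (median(num) + pseudocount) / (median(den) + pseudocount)


def is_confounded(labels, center):
    """True if every class maps to exactly one center and every center to one class."""
    pairs = {(labels[s], center[s]) for s in labels}
    n_classes = len({cls for cls, _ in pairs})
    n_centers = len({ctr for _, ctr in pairs})
    return len(pairs) == n_classes == n_centers


fold_changes = {
    gene: median_fold_change(values, center, "BGI@CHOP", "NantOmics")
    for gene, values in tpm.items()
}

for gene, fc in fold_changes.items():
    print(f"{gene}: BGI@CHOP / NantOmics median TPM ratio = {fc:.2f}")

print("class label confounded with center:", is_confounded(labels, center))
```

If `is_confounded` returns `True` for a real cohort, the class label and the sequencing center cannot be separated with these samples alone, which is the situation described in point 3.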

[Figure: BGIvsCHOP_TPMs]

[Figure: BGIvsNant_DESeq2]

[Figure: BGIvsNant-1]

@jaclyn-taroni
Member Author

We've received an update from @mkoptyra that I'll summarize here. We have been unable to track down much additional information from the BGI@CHOP and Genomic Clinical Core at Sidra Medical and Research Center sites that might explain these differences. From a former employee of BGI@CHOP, we know that the center frequently used the TruSeq RNA Sample Prep Kit (Illumina, #FC-122-1001). @mkoptyra is also aware that both sites relied heavily on Illumina kits. We can't be certain which kits were used for these samples.

This prompted a change in the language of the manuscript here: AlexsLemonade/OpenPBTA-manuscript#345
