Natalia edited this page Aug 15, 2019 · 23 revisions

The tranSMART HDD miRNAQPCR data loading option can be very useful for a project focused on collecting this type of data. When miRNAQPCR data is loaded, the stored procedure multiplies the values by -1 and labels the result log2. This makes perfect sense as long as you expect it and do not apply any additional transformations to the data. Therefore, miRNAQPCR data should be loaded as dCt values. A dCt value represents the negative log2 of transcript abundance, so the negative of a negative gives you actual log2 values that can be used in Advanced Workflows. However, there is no RNAQPCR table for loading RNA qPCR data, which is more commonly generated in research. Having a specific procedure just for miRNAQPCR and not for RNAQPCR might be confusing.
Regardless of how we feel about the issue, tMDataLoader supports miRNAQPCR data. To avoid confusing the codes used for miRNAQPCR data loading with those used for miRNASeq data loading, we have added an additional data type, 'C' (e.g. Test Study_TEST005_MIRNA_Data_C). miRNASeq is still either type L (log) or R (raw).
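
The sign convention described above is easy to get backwards, so here is a minimal Python sketch of it. It is not the actual stored procedure: the function name `loader_transform` and the sample values are hypothetical, and the only behaviour assumed is the one stated above, namely that loaded miRNAQPCR values are multiplied by -1.

```python
def loader_transform(loaded_value: float) -> float:
    """Hypothetical stand-in for the sign flip applied during miRNAQPCR loading."""
    return -loaded_value


# Hypothetical dCt values for one miRNA across three samples.
dct_values = [-2.2, 3.2, 4.2]

# Loading dCt as recommended: the stored values are -dCt, i.e. log2 abundance.
stored = [loader_transform(v) for v in dct_values]

# Pitfall: negating dCt yourself before loading flips the sign twice,
# so the stored values end up being plain dCt again.
pre_negated = [-v for v in dct_values]
stored_after_double_flip = [loader_transform(v) for v in pre_negated]

print(stored)
print(stored_after_double_flip)
```

In the first case a higher stored value means higher abundance; in the second case the direction is inverted, which is the situation the advice to load plain dCt values is meant to prevent.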

miRNAQPCR Data Loading Instruction

miRNAQPCR Data is loaded from the MIRNA_QPCRDataToUpload Directory.

miRNAQPCR Data File

ID_REF GSM918938 GSM918939 GSM918940
1 -2.2 3.2 4.2
2 3 -4.2 4
3 6 5.2 5

miRNAQPCR Platform File

#PLATFORM_ID: MIRNAQPCRHS
#PLATFORM_TITLE: MIRNAqpcr Platform HS Test
#SPECIES: Homo Sapiens

ID_REF MIRNA_ID SN_ID PLT_NAME ORGANISM
1 hsa-miR-935 MIRNAQPCRHS Homo Sapiens
2 hsa-miR-127-5p MIRNAQPCRHS Homo Sapiens

miRNAQPCR Mapping File

The mapping file includes 10 columns: STUDY_ID, SITE_ID, SUBJECT_ID, SAMPLE_CD, PLATFORM, TISSUETYPE, ATTR1, ATTR2, CATEGORY_CD, SOURCE_CD. For more information, see Expression Data.

STUDY_ID SITE_ID SUBJECT_ID SAMPLE_CD PLATFORM TISSUETYPE ATTR1 ATTR2 CATEGORY_CD SOURCE_CD
MIRNAQPCRTST 942 GSM918942 MIRNAQPCRHS Synovium Biomarker_Data+PLATFORM+ATTR1 STD

QPCR Data Loading Approach we actually use

qPCR data is confusing on a good day: high values mean low transcript abundance, low values mean high transcript abundance, and the values come in several formats such as Ct, dCt, ddCt, etc. Keeping in mind that miRNA qPCR values get multiplied by -1 during loading, while there is no such option for RNA qPCR data, can be too much to keep straight. High dimensional qPCR data such as TLDA arrays can be loaded perfectly well into the RNAseq table as "L" (log transformed) when properly normalized. A reasonable approach is to process qPCR data in the same spirit as RNAseq so that results from the two methods can be compared. The values to load are negative dCt values: transcripts with Ct higher than an agreed upon cutoff in more than XX% of samples are removed (similar to RNAseq data normalized for a typical analysis, where transcripts with 0 counts in more than XX% of samples are also usually removed), then dCt is calculated and converted into negative dCt. Some scientists also scale the negative dCt to make this data type look even more similar to other gene expression data on graphs and box plots.
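
The processing steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline actually used: the function name `negative_dct`, the Ct cutoff of 35.0, the 0.5 fraction standing in for the unspecified XX%, and the use of a single reference assay for dCt (dCt = Ct of target minus Ct of reference) are all placeholders to be replaced with whatever your project agrees on. Optional scaling is not shown.

```python
def negative_dct(ct_by_transcript, reference_id, ct_cutoff=35.0, max_fraction_above=0.5):
    """Filter weakly detected transcripts, then return -dCt per sample.

    ct_cutoff and max_fraction_above are hypothetical placeholders for the
    "agreed upon cutoff" and "XX%" mentioned in the text.
    """
    reference = ct_by_transcript[reference_id]
    result = {}
    for transcript, cts in ct_by_transcript.items():
        if transcript == reference_id:
            continue
        # Count samples whose Ct exceeds the cutoff (weak or no detection).
        above = sum(1 for ct in cts if ct > ct_cutoff)
        if above / len(cts) > max_fraction_above:
            continue  # drop transcripts undetected in too many samples
        # dCt = Ct(target) - Ct(reference); negate so higher means more abundant.
        result[transcript] = [-(ct - ref) for ct, ref in zip(cts, reference)]
    return result


# Hypothetical Ct values for a reference assay and two targets in four samples.
ct_values = {
    "REF": [20.0, 21.0, 20.0, 22.0],
    "miR-A": [25.0, 24.0, 26.0, 25.0],
    "miR-B": [38.0, 39.0, 30.0, 40.0],
}

neg_dct = negative_dct(ct_values, reference_id="REF")
print(neg_dct)
```

Here miR-B is dropped because three of its four Ct values exceed the cutoff. The resulting negative dCt values are what would go into the RNAseq table as "L"; for the miRNAQPCR loader, which applies the sign flip itself, plain dCt would be loaded instead.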

Note: there are qPCR methods that quantify the absolute amount of gene transcripts per sample in femtograms. These methods are not usually high-throughput, and their results can be loaded as subject level "clinical" data. For HDD loading purposes this data would be Raw 'R'.
