Classification of Breast Cancer Subtypes by Combining Gene Expression and DNA Methylation Data

mlist/IB2014

These are the R scripts used in:

List et al. Classification of Breast Cancer Subtypes by Combining Gene Expression and DNA Methylation Data, Journal of Integrative Bioinformatics (in press)

This work was presented at the 10th International Symposium on Integrative Bioinformatics on 12 May 2014 in Newcastle, UK.

The following files are included:

  • download_tcga_data.R: script for downloading the necessary files from the TCGA website.
  • process_tcga_data.R: script for processing the downloaded TCGA breast cancer data sets.
  • remove.false.normal.labels.R: script for removing samples designated as normal by PAM50 from the processed data. Useful if these samples should be excluded from the analysis.
  • merge.normal.samples.R: script for adding non-tumor "normal" samples.
  • varSelRF.R: script for performing random forest classification using the varSelRF package.
  • helper_methods.R: helper methods for analysing the data.
  • pam50.txt: The 50 genes that are part of the PAM50 classifier (gold standard).
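
To illustrate the kind of filtering that remove.false.normal.labels.R is described as doing, here is a minimal base R sketch. It is not taken from the repository: the data frame `samples`, its columns `sample_id` and `pam50`, and the label value "Normal" are all assumptions made for illustration, and the actual script may represent the data differently.

```r
# Toy stand-in for the processed data: one row per sample.
# Column names and label values are hypothetical, not taken from the repository.
samples <- data.frame(
  sample_id = c("S1", "S2", "S3", "S4", "S5"),
  pam50     = c("LumA", "Normal", "Basal", "Her2", "Normal"),
  stringsAsFactors = FALSE
)

# Flag samples whose PAM50 call is "Normal" (assumed label).
is_normal_call <- samples$pam50 == "Normal"

# Keep only the remaining samples; drop = FALSE preserves the data frame shape.
tumor_samples <- samples[!is_normal_call, , drop = FALSE]

# Report how many samples were dropped and what remains per subtype.
cat("Removed", sum(is_normal_call), "of", nrow(samples), "samples\n")
print(table(tumor_samples$pam50))
```

Filtering with a logical mask rather than row indices keeps the step easy to reverse or audit, since `is_normal_call` records exactly which samples were excluded.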
