Thomas Cokelaer edited this page Apr 18, 2016 · 12 revisions

Roadmap

Version 1.0

Features

Implemented

  • Same analysis as in the R version
  • Regression analysis using OLS
  • FDR correction using the BH or q-value method
  • HTML output as in R version
  • All figures as in R version
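
As an illustration of the BH (Benjamini-Hochberg) correction listed above, here is a minimal pure-Python sketch. It is not the gdsctools implementation; the function name `benjamini_hochberg` and the example p-values are made up for this page.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input.

    Hypothetical helper for illustration only (not the gdsctools API).
    """
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest, scaling each
    # p-value by m / rank and enforcing monotonicity with a running minimum
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values, e.g. one per drug/feature association
pvals = [0.01, 0.04, 0.03, 0.005]
adjusted_pvals = benjamini_hochberg(pvals)
print(adjusted_pvals)
```

An association would then be reported as significant when its adjusted p-value falls below the chosen FDR threshold.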

What's new:

  • Open the COSMIC browser in the HTML report
  • Settings used are available in the HTML report
  • HTML table with COSMIC ids provided
  • 2-3 times faster
  • Descriptive analysis and visualisation of the IC50, genomic features
  • JavaScript on the volcano plot: we decided to use mpld3 for experimental purposes. Note, however, that the GDSC data is exposed on a dedicated website with embedded JavaScript, so we won't officially provide JS support.
  • Include genomic data set.
  • Standalone application
  • Python notebooks
  • Estimation of FDR using about 10 different methods

What could be included:

  • Elastic Net: Implementation + test + documentation : 5 days
  • Estimation of FDR with empirical p-values: 6-7 days? Not clear how to do that for now
  • Training and support for developers to run gdsctools and generate data packages: 1-2 days
  • Fully documented software: 5 days
  • Include COSMIC
  • Fetch the IC50 data file automatically from the web interface, if possible: 1-2 days

Speed

  • 12Nov2015: on v17 data set, analysis takes about 16 minutes on a Mac or a Dell 6520 (265 drugs, 677 features).
  • Using multicore analysis (4 cores) takes about 9-10 minutes (10 Nov). Surprisingly, this is only a factor 2 improvement. We know that 4 cores does not mean 4x faster. However, note that a fraction of this is due to the use of the low_memory option (set to False), which makes the analysis 20% slower.
  • 24Oct2015: on v17 data set, analysis takes about 18-20 minutes on 265 drugs, 677 features
  • 12Oct2015: on v17 data set, analysis takes about 22-25 minutes on 265 drugs, 677 features
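
The multicore figures above can be put into rough numbers. The sketch below computes the observed speedup and, assuming Amdahl's law applies, the fraction of the run that would have to be parallelisable to explain it. The 9.5 minute value is an assumed midpoint of the reported 9-10 minutes, the single-core and multicore timings were taken on different days, and the low_memory overhead is ignored, so treat the result as indicative only.

```python
def speedup(serial_minutes, parallel_minutes):
    """Observed speedup of the parallel run over the serial run."""
    return serial_minutes / parallel_minutes


def amdahl_parallel_fraction(observed_speedup, cores):
    """Invert Amdahl's law S = 1 / ((1 - p) + p / n) to solve for p."""
    return (1 - 1 / observed_speedup) / (1 - 1 / cores)


serial = 16.0    # minutes, single-core run reported on 12Nov2015
parallel = 9.5   # minutes, assumed midpoint of the reported 9-10 minutes
cores = 4

s = speedup(serial, parallel)
p = amdahl_parallel_fraction(s, cores)
print(f"speedup: {s:.2f}x, implied parallel fraction: {p:.0%}")
```

With these inputs the speedup is roughly 1.7x, which under Amdahl's law corresponds to a little over half of the runtime being parallelised.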
