DSS-v2.0 is a computational pipeline for improved quantification of selective drug responses in cells from cancer patients and healthy controls. The standard drug metrics, such as DSS, AUC and IC50 can be computed in the Breeze application (https://github.com/potdarswapnil/Breeze). Based on the healthy control responses, the selective DSS (or sDSS) is calculated as difference between the patient DSS and the average of the healthy control responses. Given enough healthy controls, one can estimate the standard deviation of compound-specific reference distribution over the controls using the z-score statistic. Compared with sDSS, calculation of z-score (zDSS) or robust z-score (rDSS) implements not only location normalization for each compound (like sDSS), but also normalizes the patient responses in terms of variability over the control responses (i.e., scale normalization). When combining drug response profiles from different sources, the ComBat batch effect correction method should be used when there appear visible batch effects.
Citation: If you use this work, please cite the paper Chen et al. Robust scoring of selective drug responses for patient-tailored therapy selection Nature Protocols (2023).
R version 3.5.1 or newer is required.
The way to install packages 'sva' and 'pcaMethods' in Bioconductor differs from other packages in R:
if (!require("BiocManager", quietly = TRUE))
install.packages("BiocManager")
if (!requireNamespace(c("sva", "pcaMethods"), quietly = TRUE))
BiocManager::install(c("sva", "pcaMethods"))
# download the example data and R scripts
git clone https://github.com/yingjchen/DSS-v2.0.git
# change the directory
cd ./DSS-v2.0
# start the R program
R
lapply(c("matrixStats","dplyr","reshape","reshape2", "scales", "drc", "caTools","ggplot2", "data.table", "stringr","MESS", "BiocManager","svMisc", "egg", "pheatmap", "sva", "pcaMethods"), library, character.only = T)
source('./DSS.R')
source('./HelperFunctions.R')
# load the ex vivo dose-response profiles (cell viability at five drug concentrations)
df_dose.responses <- read.csv('./exampleData_procedure1.csv', header = T,sep = ',',check.names = F)
head(df_dose.responses)
## DRUG_NAME CONCENTRATION_nM SCREEN_NAME CELL_VIABILITY
## 1 Nelarabine 10000 AML_013_01 0.3012
## 2 Nelarabine 1000 AML_013_01 0.5565
## 3 Nelarabine 100 AML_013_01 0.7578
## 4 Nelarabine 10 AML_013_01 0.9133
## 5 Nelarabine 1 AML_013_01 0.8610
## 6 Decitabine 10000 AML_013_01 0.4829
# calculate the percentage of growth inhibition
df_dose.responses.list <- DOSE_RESPONSE_PROCESS(df_dose.responses, viability = TRUE)
# calculate DSS metrics (DSS1, DSS2, DSS3), AUC and relative IC50
df.metrics <- CALC_METRICS(df_dose.responses.list[[1]], df_dose.responses.list[[2]], graph = FALSE)
controls.dss <- read.csv('./controls/File_1_Drugname_response_DSS_10Healthy.txt', header = T, sep = '\t', row.names = 1,stringsAsFactors = F, check.names = F)
# compute the descriptive statistics of DSSs for each drug over 10 controls
controls.summary <- as.data.frame(rbind(colMeans(as.matrix(controls.dss)),colSds(as.matrix(controls.dss)),colMedians(as.matrix(controls.dss)), colMads(as.matrix(controls.dss))))
# define names of statistics
rownames(controls.summary ) <- c('mean', 'sd', 'median', 'mad')
# let's set DSS2 as the drug response metrics of patient samples (as an example)
patients.dss <- as.data.frame(acast(df.metrics,df.metrics$Patient.num ~ df.metrics$drug , value.var = 'DSS2'))
patients.dss[, 1:3]
## 1-methyl-D-tryptophan 4-hydroxytamoxifen 8-amino-adenosine
## AML_003_01 0 9.2 25.4
## AML_004_01 0 2.3 22.6
## AML_013_01 0 4.3 23.9
# normalize and scale patient-specific responses to drugs with control DSS profiles
# patient sDSS
patients.sdss <- patients.dss - slice(controls.summary['mean', colnames(patients.dss)],rep(1:n(), each = nrow(patients.dss)))
# patient zDSS
patients.zdss <- (patients.dss - slice(controls.summary['mean', colnames(patients.dss)],rep(1:n(), each = nrow(patients.dss))))/(slice(controls.summary['sd', colnames(patients.dss)],rep(1:n(), each = nrow(patients.dss))) + 1)
# patient rDSS
patients.rdss <- (patients.dss - slice(controls.summary['median', colnames(patients.dss)],rep(1:n(), each = nrow(patients.dss))))/(slice(controls.summary['mad', colnames(patients.dss)],rep(1:n(), each = nrow(patients.dss))) + 1)
To combine drug response profiles from multiple sources, we apply the ComBat algorithm to adjust for the known batch effects.
# load the example data: DSS2 proiles for 147 BeatAML samples, 125 FIMM-AML samples and 10 healthy controls
df.dss <- read.csv('./exampledata_procedure2.csv', header = T,sep = ',', row.names = 1, check.names = F)
One can visualize a PCA or Probabilistic PCA (PPCA) plot showing the batch effects due to the presence of missing data.
# The example DSS2 profiles contain missing data, use PPCA to check potential batch effects.
df.dss.1 <- df.dss[, 1 : (ncol(df.dss) - 3)]
score_pca <- PCA_FUNC(df.dss.1)
score_pca$group <- paste(df.dss$cohort, df.dss$status, sep = ' ')
ggplot(score_pca, aes(x = PC1, y = PC2, color = group)) +
geom_point() + labs(title = "DSS", x = "PC1", y = "PC2") +
theme_classic()
Given the differences between two cohorts, let's use the ComBat function from SVA package to correct the batch effects.
df.dss.combat <- ComBat(dat = t(df.dss.1), batch = as.factor(df.dss$cohort), mod = NULL, par.prior = F, prior.plots = F)
PPCA of ComBat-corrected DSS profiles
score_pca_combat <- PCA_FUNC(t(df.dss.combat))
score_pca_combat$group <- paste(df.dss$cohort, df.dss$status, sep = ' ')
ggplot(score_pca_combat, aes(x = PC1, y = PC2, color = group)) +
geom_point() + labs(title = "ComBat DSS", x = "PC1", y = "PC2") +
theme_classic()
For any questions please contact Yingjia Chen ([email protected])
Code copyright Robust scoring of selective drug responses for patient-tailored therapy selection
License https://github.com/yingjchen/DSS-v2.0/blob/main/LICENSE