---
title: "doomsayer_diagnostics"
output:
  html_document:
    toc: true
    toc_float: true
    keep_md: yes
    theme: united
  pdf_document:
date: '`r format(Sys.Date())`'
params:
  dev: no
  yaml_cfg: x
---
<style>
body {
  position: absolute;
  left: 0px;
}
.main-container {
  max-width: 1800px !important;
}
</style>
```{r child="R/setup.Rmd"}
```
## Error profile heatmap
### Plot description
This plot shows the fold-difference in contribution per subtype for each outlier. Fold-differences are calculated relative to the mean contribution of that subtype across all samples. The color scale is truncated to a +/-5-fold difference. Both rows and columns are clustered using Ward's algorithm (`ward.D2` in R). Each outlier is assigned to one of `r I(rank)` groups according to the similarity of its profile. Outlier groups are indicated by the color of the branches (in interactive mode) or by a colorbar under the dendrogram (in static mode).
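As a rough illustration of the fold-difference calculation described above, the following Python sketch computes a truncated, signed fold-difference for each outlier. The sign convention is an assumption (ratios below 1 are reported as negative reciprocals, so half the mean becomes -2), and the function names and input layout are hypothetical; the report's own R code may differ.

```python
def signed_fold_difference(value, reference, cap=5.0):
    """Signed fold-difference of value relative to reference, clipped to +/-cap.

    Assumed convention (not confirmed by the report code): ratios >= 1 are
    kept as-is (2.0 = twice the mean); ratios < 1 become the negative
    reciprocal (-2.0 = half the mean).
    """
    if reference <= 0:
        raise ValueError("reference must be positive")
    if value <= 0:
        return -cap  # infinitely depleted, so clip to the lower bound
    ratio = value / reference
    fold = ratio if ratio >= 1.0 else -1.0 / ratio
    return max(-cap, min(cap, fold))


def fold_difference_profile(contributions, outlier_ids, cap=5.0):
    """Return {outlier_id: [fold-difference per subtype]}.

    contributions: dict mapping sample id -> list of per-subtype fractions
    (hypothetical layout, one list entry per mutation subtype).
    """
    n_samples = len(contributions)
    n_subtypes = len(next(iter(contributions.values())))
    # Mean contribution of each subtype across ALL samples, not only outliers.
    means = [
        sum(row[j] for row in contributions.values()) / n_samples
        for j in range(n_subtypes)
    ]
    return {
        sid: [
            signed_fold_difference(contributions[sid][j], means[j], cap)
            for j in range(n_subtypes)
        ]
        for sid in outlier_ids
    }


if __name__ == "__main__":
    samples = {"s1": [0.5, 0.5], "s2": [0.5, 0.5], "s3": [0.8, 0.2]}
    print(fold_difference_profile(samples, ["s3"]))
```

Clipping keeps a single extreme subtype from dominating the color scale, which is the purpose of the +/-5-fold truncation in the heatmap.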
### Interpreting this plot
This heatmap shows patterns of similarity in the mutation spectra of flagged outliers. If there are batch effects that have affected the mutation spectra of certain samples, those rows should cluster together.
Clustering of the columns indicates that, among the flagged outliers, multiple subtypes tend to be enriched or depleted together.
Note that a lack of obvious clustering patterns does not necessarily mean that the flagged outliers are free from error, or that they were flagged only because of random variation in their mutation spectra.
```{r child="R/heatmap.Rmd"}
```
```{r child="R/stat_boxplots1.Rmd"}
```
```{r child="R/stat_boxplots2.Rmd"}
```
## Compare component distributions
### Plot description
- The lower panels of this plot show pairwise scatterplots comparing each of the `r I(rank)` principal components (if using PCA) or NMF components (if using NMF) across the `r I(nsamples)` samples.
- The diagonal panels show the density of each component.
- The upper panels show the correlation between component variables for 1) all `r I(nsamples)` samples, 2) the `r I(ndrop)` outlier samples, and 3) the `r I(nkeep)` non-outlier samples.
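The three correlations reported in the upper panels can be sketched in Python as follows. This assumes Pearson correlation (the text above says only "correlation"), and the function and argument names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if undefined)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # a constant component has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def grouped_correlations(comp_a, comp_b, is_outlier):
    """Correlate two components over all, outlier, and non-outlier samples."""
    def subset(flag):
        pairs = [(a, b) for a, b, o in zip(comp_a, comp_b, is_outlier) if o == flag]
        return [p[0] for p in pairs], [p[1] for p in pairs]

    out_a, out_b = subset(True)
    keep_a, keep_b = subset(False)
    return {
        "all": pearson(comp_a, comp_b),
        "outliers": pearson(out_a, out_b),
        "non_outliers": pearson(keep_a, keep_b),
    }


if __name__ == "__main__":
    a = [1, 2, 3, 4, 10, 12]
    b = [2, 4, 6, 8, 1, 0]
    flags = [False, False, False, False, True, True]
    print(grouped_correlations(a, b, flags))
```

Reporting the three values side by side shows whether an apparent correlation between components is driven by the outliers alone.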
### Interpreting this plot
The scatterplots (lower panels) show where outliers fall relative to the "good" samples in a `r I(rank)`-dimensional space. Increasing the outlier sensitivity threshold (with the `--threshold` option in `doomsayer.py`) will result in more flagged outliers (red points) and greater similarity among the non-outliers (black points). Conversely, decreasing the sensitivity threshold will result in fewer flagged outliers, but greater variability among the retained samples.
If any component shows clear evidence of multimodality (see diagonal panels), consider switching the outlier detection method from the default `--filtermode ee` (elliptic envelope) to either `--filtermode lof` (local outlier factor) or `--filtermode if` (isolation forest), because the elliptic envelope method assumes the components follow a unimodal distribution.
If you are using NMF decomposition and two or more components are perfectly correlated (see upper panels), consider running Doomsayer with a lower rank.
```{r child="R/component_scatter.Rmd"}
```
## Mutation spectra plots
### Plot description
Each bar represents the fraction of observations that fall within that K-mer subtype across all samples of a particular group. Solid-colored bars indicate that the outliers in that group have a significantly higher or lower proportion of those SNVs than the inlier group (FDR-adjusted Q-value < 0.05), whereas greyed-out bars indicate no significant difference.
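To illustrate how an FDR-adjusted cutoff determines which bars are drawn in solid color, the sketch below applies the Benjamini-Hochberg procedure to a list of per-subtype p-values. The choice of Benjamini-Hochberg is an assumption, since the description above says only "FDR-adjusted", and the p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each q-value
    # is no larger than the q-value of any larger p-value (monotonicity).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def significant_subtypes(pvalues, alpha=0.05):
    """Flag subtypes whose adjusted q-value falls below alpha."""
    return [q < alpha for q in benjamini_hochberg(pvalues)]


if __name__ == "__main__":
    # Illustrative p-values only, one per subtype.
    pvals = [0.001, 0.2, 0.03, 0.5]
    print(benjamini_hochberg(pvals))
    print(significant_subtypes(pvals))
```

In this example the subtype with a raw p-value of 0.03 is not flagged after adjustment, which is why a bar can appear greyed out even when its unadjusted test looks significant.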
```{r child="R/spectra.Rmd"}
```
## Signature loadings (H matrix)
### Plot description
This plot shows how each mutation subtype loads onto each of the `r I(rank)` signatures.
```{r child="R/sigloads_bar.Rmd"}
```