documents/MPTevol.Rmd
+16-2
@@ -612,7 +612,9 @@ The samples trees reflecting the overall genetic similarity are often suffered f
### 6.1 Inferring clonal structures
-
This step is to infer the clonal structures. The `sciClone`[4](#refer) and `PyClone`[5](#refer) could infer the clonal structures.
+
This step infers the clonal structure. Many published tools can do this, including `sciClone`[4](#refer) and `PyClone`[5](#refer).
+
+
### 6.1.1 Suggestions for inferring clonal structures
Two prominent approaches in clonal evolution studies are:
@@ -626,7 +628,14 @@ should be performed using copy-number aware tools such as `PyClone`, and copy nu
corrected VAFs can be obtained by dividing the CCFs estimated by such tools by two.
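
The division by two can be illustrated with a minimal sketch. It assumes CCFs expressed as fractions between 0 and 1; the values are made up for illustration and are not taken from any tool's output.

```r
# Made-up CCF values (fraction of tumor cells carrying each mutation).
ccf <- c(1.00, 0.60, 0.25)

# Copy-number-corrected VAF, following the rule stated above: CCF / 2.
vaf_corrected <- ccf / 2

print(vaf_corrected)
```

If the tool reports CCFs as percentages instead, the same division applies and the result is a percentage.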
-
In MPTevol, the format of `variants` is used.
+
### 6.1.2 Prepare the variants structure
+
+
MPTevol uses the `variants` format for downstream analysis.
+
+
`variants` is a data frame in which each row is a variant. Its columns hold the cellular prevalence of the variant in each sample, plus one column of cluster assignments. Cellular prevalence measures the fraction of tumor cells that carry the mutation; either the VAF or the CCF can be used. **Clusters must be named with contiguous integers**, starting from 1. Keep the names of the cellular prevalence columns short for better visualization.
+
+
We suggest that users build this data frame themselves, because variant clustering results need manual evaluation. `maf2variants` can convert the MAF format into `variants` if the cluster information is included in the CCF data (**TO DO**).
In this data frame, the columns from `Chromosome` to `mutid` give basic information about each variant, and the columns from `BRCA_1` to `UterusM_7` give the cellular prevalence in each sample. `sciClone` holds the mutation clusters inferred by sciClone, `kmeans` holds the mutation clusters inferred by the k-means method, and `cluster` is the final cluster assignment used for downstream analysis.
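
As a rough sketch of the layout described above, the base R snippet below builds a tiny `variants`-style data frame and checks the contiguous-integer rule for `cluster`. The prevalence values are invented, only two of the sample columns are shown, and `clusters_are_contiguous` and `relabel_clusters` are hypothetical helpers written for this illustration; they are not MPTevol functions.

```r
# Made-up example: one row per variant, one prevalence column per sample,
# and a final `cluster` column. Values are illustrative only.
variants <- data.frame(
  mutid     = c("m1", "m2", "m3", "m4"),
  BRCA_1    = c(48.2, 45.1, 20.3, 0.0),
  UterusM_7 = c(50.0, 47.5, 0.0, 22.8),
  cluster   = c(1, 1, 2, 3)
)

# Hypothetical helper: TRUE when the labels are exactly 1..k with no gaps.
clusters_are_contiguous <- function(cluster) {
  if (anyNA(cluster)) return(FALSE)
  ids <- sort(unique(cluster))
  length(ids) > 0 && all(ids == seq_along(ids))
}

# Hypothetical helper: relabel arbitrary labels to 1..k,
# numbered in order of first appearance.
relabel_clusters <- function(cluster) {
  match(cluster, unique(cluster))
}

# Stop early if the final cluster column breaks the naming rule.
stopifnot(clusters_are_contiguous(variants$cluster))
```

If a clustering tool returns labels with gaps (for example after a cluster is dropped during manual review), something like `relabel_clusters()` could restore the 1..k numbering before downstream analysis.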
+
+
### 6.2 Check clonal prevalence across samples.
+
Since each cluster represents a clone, a missing or incorrectly inferred cluster can prevent successful construction of the evolution models. A good clustering result is therefore essential. `MPTevol` provides a convenient visualization of variant clusters across multiple samples to help evaluate clustering results, particularly when no tree is inferred.