Commit 37440e8

update documents
1 parent d8392ca commit 37440e8

1 file changed

+16
-2
lines changed

documents/MPTevol.Rmd

@@ -612,7 +612,9 @@ The samples trees reflecting the overall genetic similarity are often suffered f
612612

613613
### 6.1 Inferring clonal structures
614614

615-
This step is to infer the clonal structures. The `sciClone` [4](#refer) and `PyClone` [5](#refer) could infer the clonal structures.
615+
This step infers the clonal structure. Many published tools can do this, including `sciClone` [4](#refer) and `PyClone` [5](#refer).
616+
617+
### 6.1.1 Suggestions for inferring clonal structures
616618

617619
Two prominent approaches in clonal evolution studies are:
618620

@@ -626,7 +628,14 @@ should be performed using copy-number aware tools such as `PyClone`, and copy nu
626628
corrected VAFs can be obtained by dividing the CCFs estimated by such tools by two.
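
The conversion described above is a single division. The sketch below assumes heterozygous mutations in diploid regions, where a mutation present in every tumor cell (CCF = 1) has an expected VAF of 0.5. The sample names and CCF values are hypothetical and are not taken from the package data.

```r
# Hypothetical CCFs (as fractions) for one mutation in two samples,
# as a copy-number aware tool might report them.
ccf <- c(S1 = 1.0, S2 = 0.5)

# Copy number corrected VAF: half of the CCF, because a heterozygous
# mutation sits on one of the two copies in each mutated cell.
vaf_corrected <- ccf / 2

vaf_corrected
```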
627629

628630

629-
In MPTevol, the format of `variants` is used.
631+
### 6.1.2 Prepare the variants structure
632+
633+
MPTevol uses the `variants` format for downstream analysis.
634+
635+
`variants` is a data frame in which each row is a variant. Its columns hold the cellular prevalence of the variant in each sample, plus one column of cluster information. Cellular prevalence measures the proportion of tumor cells that carry the mutation; either the VAF or the CCF can be used. **Clusters must be named with contiguous integers**, starting from 1. The names of the cellular prevalence columns should be short for better visualization.
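
As a minimal sketch of this layout, the toy data frame below has one row per variant, one prevalence column per sample, and a `cluster` column. The sample names `S1` and `S2`, the mutation IDs, and all values (taken here to be percentages, which is an assumption) are hypothetical. The helper `relabel_clusters` is also hypothetical and is not part of MPTevol; it only illustrates one way to make cluster labels contiguous.

```r
# Toy data: every name and value here is made up for illustration.
variants_toy <- data.frame(
  mutid   = c("m1", "m2", "m3", "m4"),
  S1      = c(48, 45, 20, 0),   # cellular prevalence in sample S1
  S2      = c(50, 47, 0, 22),   # cellular prevalence in sample S2
  cluster = c(1, 1, 2, 3)       # cluster label for each variant
)

# Check that cluster labels are contiguous integers starting from 1.
ids <- sort(unique(variants_toy$cluster))
clusters_ok <- identical(as.numeric(ids), as.numeric(seq_along(ids)))

# Hypothetical helper: map arbitrary labels (e.g. 2, 5, 9) onto 1, 2, 3.
relabel_clusters <- function(x) as.integer(factor(x))

clusters_ok
relabel_clusters(c(2, 5, 5, 9))
```

Relabeling through `factor()` keeps the original ordering of the labels, so the smallest original label becomes cluster 1.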
636+
637+
We suggest that users build this data frame themselves, because variant clustering results need manual evaluation. `maf2variants` can convert the MAF format into `variants` if the cluster information is included in the CCF data (**TO DO**).
638+
630639

631640
```{r}
632641
# load data
@@ -636,8 +645,13 @@ data("variants.ref", package = "MPTevol")
636645
head(variants,3)
637646
```
638647

648+
In this data frame, the columns from `Chromosome` to `mutid` give basic information for each variant, and the columns from `BRCA_1` to `UterusM_7` give the cellular prevalence in each sample. `sciClone` holds the mutation clusters inferred by sciClone, `kmeans` holds the mutation clusters inferred by the k-means method, and `cluster` is the final cluster assignment used for downstream analysis.
649+
650+
639651
### 6.2 Check clonal prevalence across samples.
640652

653+
Since each cluster represents a clone, a missing or incorrectly inferred cluster can prevent successful construction of the evolution models. It is therefore essential to obtain a good clustering result. `MPTevol` provides a convenient visualization of variant clusters across multiple samples to help evaluate clustering results, particularly when no tree is inferred.
654+
641655

642656
```{r warning=FALSE}
643657
library(clonevol)
