About output #13

lancy-liang · 2023-07-20T08:55:29Z

Hello, I have some confusion about the output of SCAPE, and I am not quite sure what the "count" in the pasite.csv.gz file refers to:

1:40004:36:+,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,2,0,0,1,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,

May I know what these values represent?

zhou-ran · 2023-07-20T09:05:34Z

Hi,

The first column represents the name of the predicted pA site, while the other columns correspond to individual cells, with each value indicating the abundance of the respective pA site in the respective cell. Additionally, the name "1:40004:36:+" of the pA site indicates that it is located on the forward strand of chromosome 1 at position 40004 with a disperse value of 36.

I hope this helps.

Ran
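
To make the layout described above concrete, here is a minimal sketch that splits a row into the site name and the per-cell values. The field names (`chrom`, `position`, `dispersion`, `strand`) and both helper functions are hypothetical and only follow the description in this reply. The row is a shortened stand-in for the real one, and treating the third field as an integer is an assumption based on the example name.

```python
from typing import NamedTuple


class PASite(NamedTuple):
    chrom: str
    position: int
    dispersion: int  # assumed to be an integer, as in "1:40004:36:+"
    strand: str


def parse_pa_site(name: str) -> PASite:
    """Split a site name such as '1:40004:36:+' into its four fields."""
    # rsplit keeps any ':' inside the chromosome name in the first field.
    chrom, position, dispersion, strand = name.rsplit(":", 3)
    return PASite(chrom, int(position), int(dispersion), strand)


def parse_row(line: str) -> tuple[PASite, list[int]]:
    """Parse one CSV row: the site name first, then one value per cell."""
    # The row quoted in this thread ends with a trailing comma, so strip it.
    name, *values = line.rstrip(",\n").split(",")
    return parse_pa_site(name), [int(v) for v in values]


# Shortened example row (the real row has one value per cell).
site, counts = parse_row("1:40004:36:+,0,0,2,0,1,")
print(site)
print(counts)
```

Each entry of `counts` then lines up with one cell column of the matrix.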

lancy-liang · 2023-07-20T09:14:37Z

Thank you very much for your response!

However, I still have some confusion:

"each value indicating the abundance of the respective pA site in the respective cell"

0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,2,0,0,1,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0

During the analysis, I constructed a Seurat object from this matrix. What do nCount_RNA and nFeature_RNA represent in this case?

thanks

zhou-ran · 2023-07-20T09:21:00Z

Hi,

Please refer to the Seurat manual for information on how to calculate this value, as I am not familiar with how Seurat performs the calculation.

Ran

lancy-liang · 2023-07-20T09:27:31Z

Hello, I used the [loadData.R](https://github.com/LuChenLab/SCAPE/blob/main/SCAPE.R/R/loadData.R) script to directly construct a Seurat object, but I am not sure about the meaning of the count 0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,2,0,0,1,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0. I have read your paper where you mention the description of count:

"SCAPE is used to assign the reads to different APA isoforms. As a result, we get the number of reads for each pA site, which we term as pA counts."

"We use SCAPE to quantify the weights of pA sites of a gene in each cell."

"In summary, we demonstrate that SCAPE is able to accurately identify and quantify APA isoforms from the theoretical perspective."

"In terms of isoform weight quantification, SCAPE showed the highest correlation with the ground truth (R2 = 0.97), while the best performance of other methods was R2 = 0.53 (Supplementary Figure S1G)."

However, I am still not entirely clear on what this count specifically represents.

zhou-ran · 2023-07-20T09:47:15Z

Hi,

The count is the number of times a pA site was detected in an individual cell after PCR duplicates have been removed. Also, the variables "nCount_RNA" and "nFeature_RNA" are created when the Seurat object is initialized; they are not produced by SCAPE.

Ran
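
For readers with the same question about `nCount_RNA` and `nFeature_RNA`: in Seurat these are per-cell summaries of the count matrix, namely the total count in each cell and the number of features with a non-zero count in each cell. For a pA-site matrix that would mean total pA counts per cell and the number of pA sites detected per cell. The sketch below reproduces that arithmetic in plain Python on a made-up toy matrix. The site names, the values, and the `per_cell_totals` helper are hypothetical and are not SCAPE or Seurat code.

```python
# Toy matrix (made-up values): rows are pA sites, columns are cells.
toy_counts = {
    "1:40004:36:+": [0, 2, 1, 0],
    "1:52000:20:+": [3, 0, 1, 0],
    "2:9100:45:-": [0, 0, 4, 0],
}


def per_cell_totals(matrix):
    """Return (total count per cell, number of non-zero sites per cell)."""
    # zip(*rows) turns the site-by-cell rows into one tuple per cell.
    columns = list(zip(*matrix.values()))
    n_count = [sum(col) for col in columns]
    n_feature = [sum(1 for v in col if v > 0) for col in columns]
    return n_count, n_feature


n_count, n_feature = per_cell_totals(toy_counts)
print(n_count)    # [3, 2, 6, 0]
print(n_feature)  # [1, 1, 3, 0]
```

The last cell has no detected pA site, so both of its values are 0.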

lancy-liang · 2023-07-20T10:05:33Z

thank you very much!!

lancy-liang · 2023-08-29T11:44:53Z

Hi, Ran.
I used psiCategory.R to calculate the psi values for pA sites, but I thought it would calculate the psi values for genes. In your article, "Category of pA usage" mentions, "To better understand the heterogeneity of APA patterns among cell populations, we first calculated the usage of pA sites for each gene at single cell level." Is this categorization for genes or for pA sites?
Regarding the section on "Expected pA length," is the calculated value referring to the usage of a single pA site or the usage of pA sites within a gene?
