[help wanted with double-check] matrix in .X, expect log1p normalized expression to 10000 counts per cell #144

Open
KunHHE opened this issue Dec 31, 2024 · 3 comments

Comments

@KunHHE

KunHHE commented Dec 31, 2024

Hi @ChuanXu1, very cool tool. I am trying it for annotation, but I am confused about the requirement "matrix in .X, expect log1p normalized expression to 10000 counts per cell" and want to double-check. When I import my spatial AnnData, I only need to run sc.pp.normalize_total(adata, target_sum=1e4) followed by sc.pp.log1p(adata), with no PCA/UMAP processing, and can then go to CellTypist directly, right?
Also, if I use an annotated reference.h5ad, can I use it as it is since it is already annotated, or do I need to do any other processing? This is linked to a similar issue, #83. Thank you!

@ChuanXu1
Collaborator

@KunHHE, yes, if you start from raw count data with all genes, you can normalize it with the two commands you have shown and go directly to CellTypist. CellTypist prediction does not rely on your own annotation, but you can visualize the correspondence between the two using celltypist.dotplot.
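
For readers who want to see what "log1p normalized expression to 10000 counts per cell" means numerically, here is a minimal dependency-free sketch. It mirrors, on a small dense list of lists, what sc.pp.normalize_total(adata, target_sum=1e4) followed by sc.pp.log1p(adata) compute. The function names normalize_log1p and looks_log1p_normalized are hypothetical helpers written for this illustration, not part of scanpy or CellTypist, and the tolerance of 1 count is an assumption.

```python
import math


def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell (row) to `target_sum` total counts, then apply log(1 + x)."""
    normalized = []
    for cell in counts:
        total = sum(cell)
        # Cells with zero counts stay all zeros to avoid dividing by zero.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([math.log1p(value * scale) for value in cell])
    return normalized


def looks_log1p_normalized(matrix, target_sum=1e4, tol=1.0):
    """Undo the log1p and check that every cell sums back to about `target_sum`."""
    return all(
        abs(sum(math.expm1(value) for value in cell) - target_sum) <= tol
        for cell in matrix
    )


raw_counts = [[3, 0, 7], [10, 5, 5]]  # toy data: 2 cells x 3 genes
x = normalize_log1p(raw_counts)
print(looks_log1p_normalized(x))           # True
print(looks_log1p_normalized(raw_counts))  # False
```

Raw counts fail the check, as does data that was normalized to a different target sum, which is why the target_sum=1e4 argument matters in the first command.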

@KunHHE
Author

KunHHE commented Dec 31, 2024

Hi @ChuanXu1, thanks so much for your quick response! May I ask how to process the data once the CellTypist annotation is finished, given that the raw data has already been processed with sc.pp.normalize_total(adata, target_sum=1e4) and sc.pp.log1p(adata)?

We usually do it as below, right?

sc.pp.normalize_total(comb_adata, inplace=True)
sc.pp.log1p(comb_adata)
sc.pp.pca(comb_adata)
sc.pp.neighbors(comb_adata)
sc.tl.umap(comb_adata)
sc.tl.leiden(comb_adata)

Then UMAP....

@ChuanXu1
Collaborator

@KunHHE, for your adata, CellTypist only adds additional columns to .obs, so you can process your data without considering how CellTypist affects it. In your code, you can perform HVG detection and scaling between sc.pp.log1p and sc.pp.pca.
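
The ordering suggested above could look roughly like the following sketch. It assumes that adata.X already holds the log1p-normalized, all-gene matrix prepared for CellTypist, so the normalization step is not repeated. The function name cluster_after_annotation and the values n_top_genes=2000 and max_value=10 are illustrative assumptions, not recommendations from the maintainers.

```python
def cluster_after_annotation(adata, n_top_genes=2000):
    """Run HVG detection and scaling before PCA, then neighbors, UMAP and Leiden.

    Assumes `adata.X` is already log1p-normalized to 1e4 counts per cell.
    """
    # Imported inside the function so this sketch can be loaded without scanpy.
    import scanpy as sc

    # Flag highly variable genes on the log-normalized data.
    sc.pp.highly_variable_genes(adata, n_top_genes=n_top_genes)

    # Work on a copy restricted to the HVGs; columns in .obs (including any
    # added by CellTypist) are carried over, and the input .X stays unscaled.
    hvg = adata[:, adata.var["highly_variable"]].copy()

    # Scale each gene to unit variance, clipping extreme values (assumed cap).
    sc.pp.scale(hvg, max_value=10)

    sc.pp.pca(hvg)
    sc.pp.neighbors(hvg)
    sc.tl.umap(hvg)
    sc.tl.leiden(hvg)
    return hvg
```

Because scaling overwrites .X, one option is to run CellTypist on the log1p-normalized object first and only then call a function like this one, which returns a separate HVG-restricted object for clustering and plotting.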
