SCTransform workflow steps on multi-sample datasets #9700

MarcElosua · 2025-02-20T18:14:26Z

Hello! Thank you for developing this package and for all the documentation available.

TL;DR: What is the recommended workflow for multi-sample datasets, from SCT normalization through merging to PCA?

I have a question about using SCTransform on multi-sample datasets where we don't want to integrate directly. From reading these issues (issue1, issue2, and issue3), I understand that with a multi-sample dataset I should run SCTransform separately on each 10X experiment to correct for experiment-specific technical noise. This also helps reduce memory requirements for very large datasets (500K+ cells). Once the samples are merged, we can obtain the highly variable genes (HVGs) for the dataset like this:

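pancreas.list <- lapply(X = pancreas.list, FUN = SCTransform)  # SCTransform each sample separately
features <- SelectIntegrationFeatures(object.list = pancreas.list)
merged_object <- merge(x = pancreas.list[[1]], y = pancreas.list[-1])  # merge the normalized samples
VariableFeatures(merged_object) <- features  # use the shared features as the HVGs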

Next, I would like to compute PCA on the merged samples to assess whether integration is even necessary. Should I run ScaleData on the HVGs before PCA, or not?
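
To make the question concrete, here is a minimal sketch of the two options I am choosing between. It assumes `merged_object` is the merged Seurat object with the SCT assay as its default assay and `features` is the vector returned by `SelectIntegrationFeatures`. The helper name `run_pca_on_merged`, the `rescale` switch, and `npcs = 50` are my own placeholders for illustration, not part of Seurat and not a recommendation.

```r
library(Seurat)

# Hypothetical helper (not a Seurat function): run PCA on the merged object,
# optionally calling ScaleData on the selected features first.
run_pca_on_merged <- function(merged_object, features, rescale = FALSE, npcs = 50) {
  if (rescale) {
    # Option A: re-scale the selected features before PCA
    merged_object <- ScaleData(merged_object, features = features, verbose = FALSE)
  }
  # Option B (rescale = FALSE): use whatever scaled values the merge kept
  RunPCA(merged_object, features = features, npcs = npcs, verbose = FALSE)
}

# Intended usage, given the objects from the snippet above:
# pca_rescaled <- run_pca_on_merged(merged_object, features, rescale = TRUE)
# pca_as_is    <- run_pca_on_merged(merged_object, features, rescale = FALSE)
```

My question is essentially which value of `rescale` is appropriate after per-sample SCTransform.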

Thank you very much for any help!
