SCTransform generates memory error with a large dataset #9702

Open
TiphaineCMartin opened this issue Feb 21, 2025 · 0 comments

I am trying to run SCTransform on a large dataset (~250k cells) so that I can integrate our spatial transcriptomics data with single-cell data. It crashes during the second step (getting residuals) once I raise the accepted memory limit (`future.globals.maxSize`) to 10 GB.
Any advice?

First, I tried to run FindTransferAnchors, but I got an error because my reference has no SCTModel.list:

```
anchors <- FindTransferAnchors(reference = adipose.raw, query = HFD8,
                               normalization.method = "SCT")
Error in slot(object = reference[[reference.assay]], name = "SCTModel.list") :
  no slot of name "SCTModel.list" for this object of class "Assay5"
```

So I tried to run SCTransform, but the size limit I had set for future globals was too small:

```
> options(future.globals.maxSize = 3000 * 1024^2)
> adipose.raw <- SCTransform(adipose.raw, ncells = 5000, verbose = TRUE)
Running SCTransform on assay: RNA
Running SCTransform on layer: counts
vst.flavor='v2' set. Using model with fixed slope and excluding poisson genes.
Variance stabilizing transformation of count matrix of size 26950 by 238465
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 5000 cells
Error in getGlobalsAndPackages(expr, envir = envir, globals = globals) : 
  The total size of the 19 globals exported for future expression (‘FUN()’) is 9.24 GiB.. This exceeds the maximum allowed size of 2.93 GiB (option 'future.globals.maxSize'). The three largest globals are ‘FUN’ (9.22 GiB of class ‘function’), ‘umi_bin’ (19.30 MiB of class ‘numeric’) and ‘data_step1’ (1.91 MiB of class ‘list’)
```
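For reference, the limits in the error message can be reproduced with a small unit conversion. This is only a sketch in base R: `mib_to_gib` and the variable names are hypothetical helpers introduced for illustration, and the 9.24 GiB figure is copied from the error above, not recomputed.

```r
# Convert a limit written as `N * 1024^2` bytes (N in MiB) to GiB.
# `mib_to_gib` is a hypothetical helper, not part of future or Seurat.
mib_to_gib <- function(mib) mib * 1024^2 / 1024^3

limit_3000  <- mib_to_gib(3000)    # limit used in the failing call
limit_10000 <- mib_to_gib(10000)   # a larger candidate limit
needed_gib  <- 9.24                # size of the globals, taken from the error message

cat(sprintf("3000 MiB  = %.2f GiB, large enough: %s\n",
            limit_3000, limit_3000 >= needed_gib))
cat(sprintf("10000 MiB = %.2f GiB, large enough: %s\n",
            limit_10000, limit_10000 >= needed_gib))
```

Under these numbers, a limit of `3000 * 1024^2` bytes corresponds to the 2.93 GiB quoted in the message, and any limit above roughly 9.24 GiB should get past this particular check.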

So I raised the limit to 10 GB, since my computer has 32 GB of RAM, but then I hit a different memory error, and I don't know what to do now.

```
options(future.globals.maxSize = 10000 * 1024^2)
adipose.raw <- SCTransform(adipose.raw, ncells = 5000, verbose = TRUE)
Running SCTransform on assay: RNA
Running SCTransform on layer: counts
vst.flavor='v2' set. Using model with fixed slope and excluding poisson genes.
Variance stabilizing transformation of count matrix of size 26950 by 238465
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 5000 cells
Found 218 outliers - those will be ignored in fitting/regularization step

Second step: Get residuals using fitted parameters for 26950 genes
Error: vector memory limit of 100.0 Gb reached, see mem.maxVSize()
```
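For scale, here is a back-of-the-envelope estimate of what a dense residual matrix of this shape would occupy. It is a sketch in base R under one labeled assumption: that the second step holds residuals for all 26950 genes and 238465 cells as 8-byte doubles in a dense matrix. The dimensions come from the log above; the variable names are made up for illustration.

```r
# Dimensions reported in the SCTransform log.
n_genes <- 26950
n_cells <- 238465

# Assumption: residuals are stored as a dense matrix of doubles (8 bytes each).
bytes_per_double <- 8

# Use doubles throughout; the product does not fit in a 32-bit integer.
dense_bytes <- n_genes * n_cells * bytes_per_double
dense_gib   <- dense_bytes / 1024^3

cat(sprintf("One dense %.0f x %.0f double matrix: about %.1f GiB\n",
            n_genes, n_cells, dense_gib))
```

If that assumption holds, a single copy is roughly 48 GiB, already more than the 32 GB of RAM on this machine, and a few temporary copies would be enough to reach the 100 Gb vector limit. The Seurat documentation describes a `conserve.memory` argument to SCTransform as avoiding creation of the full residual matrix, which may be worth checking for a dataset of this size.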


Any advice?

sessionInfo()
R version 4.4.2 (2024-10-31)
Platform: x86_64-apple-darwin20
Running under: macOS Sequoia 15.3.1

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats4 grid stats graphics grDevices utils datasets methods base

other attached packages:
[1] glmGamPoi_1.16.0 reshape2_1.4.4 future_1.34.0
[4] shiny_1.10.0 monocle3_1.3.7 SingleCellExperiment_1.26.0
[7] SummarizedExperiment_1.34.0 GenomicRanges_1.56.2 GenomeInfoDb_1.40.1
[10] IRanges_2.38.1 S4Vectors_0.42.1 MatrixGenerics_1.16.0
[13] matrixStats_1.5.0 Biobase_2.64.0 BiocGenerics_0.50.0
[16] lubridate_1.9.4 forcats_1.0.0 purrr_1.0.4
[19] readr_2.1.5 tidyr_1.3.1 tibble_3.2.1
[22] tidyverse_2.0.0 RColorBrewer_1.1-3 stringr_1.5.1
[25] ggthemes_5.1.0 ggbeeswarm_0.7.2 plotly_4.10.4
[28] destiny_3.18.0 SeuratWrappers_0.3.2 wesanderson_0.3.7
[31] ggsci_3.2.0 patchwork_1.3.0 harmony_1.2.3
[34] Rcpp_1.0.14 Matrix_1.7-2 Seurat_5.2.1
[37] SeuratObject_5.0.2 sp_2.2-0 ggplot2_3.5.1
[40] ggridges_0.5.6 viridis_0.6.5 viridisLite_0.4.2
[43] data.table_1.16.4 dplyr_1.1.4

loaded via a namespace (and not attached):
[1] spatstat.sparse_3.1-0 httr_1.4.7 tools_4.4.2
[4] sctransform_0.4.1 R6_2.6.1 lazyeval_0.2.2
[7] uwot_0.2.2 withr_3.0.2 gridExtra_2.3
[10] progressr_0.15.1 textshaping_1.0.0 cli_3.6.4
[13] spatstat.explore_3.3-4 fastDummies_1.7.5 isoband_0.2.7
[16] sass_0.4.9 labeling_0.4.3 robustbase_0.99-4-1
[19] spatstat.data_3.1-4 proxy_0.4-27 pbapply_1.7-2
[22] systemfonts_1.2.1 R.utils_2.12.3 parallelly_1.42.0
[25] Rfast2_0.1.5.2 limma_3.60.6 TTR_0.24.4
[28] rstudioapi_0.17.1 generics_0.1.3 crosstalk_1.2.1
[31] ica_1.0-3 spatstat.random_3.3-2 car_3.1-3
[34] abind_1.4-8 R.methodsS3_1.8.2 lifecycle_1.0.4
[37] yaml_2.3.10 scatterplot3d_0.3-44 carData_3.0-5
[40] SparseArray_1.4.8 Rtsne_0.17 promises_1.3.2
[43] crayon_1.5.3 miniUI_0.1.1.1 lattice_0.22-6
[46] cowplot_1.1.3 knitr_1.49 pillar_1.10.1
[49] boot_1.3-31 future.apply_1.11.3 codetools_0.2-20
[52] Rnanoflann_0.0.3 glue_1.8.0 leidenbase_0.1.32
[55] spatstat.univar_3.1-1 pcaMethods_1.96.0 remotes_2.5.0
[58] vcd_1.4-13 Rdpack_2.6.2 vctrs_0.6.5
[61] png_0.1-8 spam_2.11-1 gtable_0.3.6
[64] assertthat_0.2.1 cachem_1.1.0 xfun_0.50
[67] rbibutils_2.3 S4Arrays_1.4.1 mime_0.12
[70] Rfast_2.1.4 RcppEigen_0.3.4.0.2 reformulas_0.4.0
[73] survival_3.8-3 statmod_1.5.0 fitdistrplus_1.2-2
[76] ROCR_1.0-11 nlme_3.1-167 xts_0.14.1
[79] bit64_4.6.0-1 RcppAnnoy_0.0.22 bslib_0.9.0
[82] irlba_2.3.5.1 vipor_0.4.7 KernSmooth_2.23-26
[85] colorspace_2.1-1 nnet_7.3-20 ggrastr_1.0.2
[88] tidyselect_1.2.1 smoother_1.3 processx_3.8.5
[91] bit_4.5.0.1 compiler_4.4.2 curl_6.2.0
[94] hdf5r_1.3.12 desc_1.4.3 DelayedArray_0.30.1
[97] scales_1.3.0 DEoptimR_1.1-3-1 lmtest_0.9-40
[100] hexbin_1.28.5 callr_3.7.6 digest_0.6.37
[103] goftest_1.2-3 presto_1.0.0 spatstat.utils_3.1-2
[106] minqa_1.2.8 rmarkdown_2.29 XVector_0.44.0
[109] htmltools_0.5.8.1 pkgconfig_2.0.3 lme4_1.1-36
[112] sparseMatrixStats_1.16.0 fastmap_1.2.0 rlang_1.1.5
[115] htmlwidgets_1.6.4 UCSC.utils_1.0.0 DelayedMatrixStats_1.26.0
[118] jquerylib_0.1.4 farver_2.1.2 zoo_1.8-12
[121] jsonlite_1.8.9 R.oo_1.27.0 magrittr_2.0.3
[124] Formula_1.2-5 GenomeInfoDbData_1.2.12 dotCall64_1.2
[127] munsell_0.5.1 reticulate_1.40.0 RcppZiggurat_0.1.6
[130] stringi_1.8.4 zlibbioc_1.50.0 MASS_7.3-64
[133] plyr_1.8.9 pkgbuild_1.4.6 parallel_4.4.2
[136] listenv_0.9.1 ggrepel_0.9.6 deldir_2.0-4
[139] splines_4.4.2 tensor_1.5 hms_1.1.3
[142] ps_1.8.1 igraph_2.1.4 ranger_0.17.0
[145] spatstat.geom_3.3-5 RcppHNSW_0.6.0 evaluate_1.0.3
[148] RcppParallel_5.1.10 BiocManager_1.30.25 laeken_0.5.3
[151] nloptr_2.1.1 tzdb_0.4.0 httpuv_1.6.15
[154] VIM_6.2.2 RANN_2.6.2 polyclip_1.10-7
[157] knn.covertree_1.0 scattermore_1.2 rsvd_1.0.5
[160] xtable_1.8-4 e1071_1.7-16 RSpectra_0.16-2
[163] later_1.4.1 ragg_1.3.3 class_7.3-23
[166] memoise_2.0.1 beeswarm_0.4.0 cluster_2.1.8
[169] ggplot.multistats_1.0.1 timechange_0.3.0 globals_0.16.3
