[QST] GPU out-of-memory errors when applying UMAP to extremely large SAE feature matrices #6167
Hi @lc82111, that is quite a large dataset for a 3090! I saw the discussion in RE-N-Y/sae#1 (Apply UMAP to the SAEs features); I will be adding the details here and we will be looking into things.
@lc82111 We just recently released a new algorithm for scaling to massive datasets that are larger than the memory available on the GPU. The algorithm works by breaking the dataset apart into some number of partitions (using kmeans as a clustering algorithm) so that each partition CAN fit on the GPU. The ideal setting for the number of partitions depends on how much data can fit into GPU memory at once.

Are the out-of-memory errors you are getting happening on the GPU or in RAM? You have 512GB of RAM available, but the nn-descent partitioning algorithm will still require the data to be available in RAM. It's possible you could try to use memory mapping for this, but I do caution that we have not tried this.

That being said, I would maybe try 25-30 partitions so that each partition contains ~13-16GB of data. If that still doesn't work, you can try increasing the number of partitions further. Please let us know if we can help further!
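For reference, a minimal sketch of how this partitioned build might be configured, based on the nnd_n_clusters and data_on_host parameters mentioned in this thread. The file name, matrix shape, and the memory-mapped input are illustrative assumptions only (memory mapping is the untested idea cautioned about above):

```python
import numpy as np
from cuml.manifold import UMAP

# Hypothetical file layout: a ~400 GB fp32 matrix stored row-major on disk.
# np.memmap keeps the data on host (mostly on disk) instead of loading it all into RAM.
n_rows, n_cols = 400_000_000, 256  # illustrative shape only (~410 GB in fp32)
X = np.memmap("sae_features.fp32", dtype=np.float32, mode="r", shape=(n_rows, n_cols))

# Partitioned nn-descent build: with 25-30 partitions, each partition holds
# roughly 13-16 GB, which is the sizing suggested above for a 24 GB 3090.
umap = UMAP(
    n_neighbors=15,
    n_components=2,
    build_algo="nn_descent",
    build_kwds={"nnd_n_clusters": 25},
)

# data_on_host=True keeps the full dataset in host memory and streams
# partitions to the GPU during graph construction.
embedding = umap.fit_transform(X, data_on_host=True)
```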
@lc82111 In case you haven't seen it yet, I also wanted to bring your attention to a recent blog that explains the scaling algorithm in more detail: https://developer.nvidia.com/blog/even-faster-and-more-scalable-umap-on-the-gpu-with-rapids-cuml/. Prior to this feature, it was expected that the algorithm would scale by training a UMAP embedding on a smaller subsample of the data and then using transform() to project the remaining data onto that embedding.
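As a rough illustration of that earlier subsample-and-transform workflow (the synthetic data, sample size, and chunk size below are placeholders, not recommendations):

```python
import numpy as np
from cuml.manifold import UMAP

# Illustrative stand-in for the real feature matrix (small enough to generate
# in memory here; the real data would live on disk or in a memory map).
rng = np.random.default_rng(0)
X = rng.standard_normal((2_000_000, 64), dtype=np.float32)

# Fit on a random subsample that comfortably fits on the GPU...
sample_idx = rng.choice(X.shape[0], size=200_000, replace=False)
umap = UMAP(n_neighbors=15, n_components=2)
umap.fit(X[sample_idx])

# ...then project the remaining rows in chunks with transform().
chunk_size = 500_000
parts = [umap.transform(X[i:i + chunk_size]) for i in range(0, X.shape[0], chunk_size)]
embedding = np.concatenate(parts, axis=0)
```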
Hi everyone, I am also experiencing an out-of-memory issue when running UMAP, albeit with a smaller dense matrix (10e6 rows, 16 cols, float32). I think maybe the batching or data_on_host is not working for me. Any suggestions on what might help?

GPU: A16-8Q
I am using Docker.

Fails (nnd_n_clusters: 2):
MemoryError: std::bad_alloc: CUDA error at : /opt/conda/include/rmm/mr/device/managed_memory_resource.hpp

Works (nnd_n_clusters: 1)

Thank you for any suggestions.
Kind regards,
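For clarity, here is a minimal sketch of the two configurations being compared above. The matrix shape is as described in the comment; the exact estimator arguments and call site in the real script are assumptions:

```python
import numpy as np
from cuml.manifold import UMAP

# Dense matrix as described above: 10e6 rows, 16 columns, float32 (~640 MB).
rng = np.random.default_rng(0)
X = rng.standard_normal((10_000_000, 16), dtype=np.float32)

# Reported to fail on the A16-8Q with MemoryError: std::bad_alloc
# (managed_memory_resource):
umap_fails = UMAP(build_algo="nn_descent", build_kwds={"nnd_n_clusters": 2})
# embedding = umap_fails.fit_transform(X, data_on_host=True)

# Reported to work on the same GPU:
umap_works = UMAP(build_algo="nn_descent", build_kwds={"nnd_n_clusters": 1})
embedding = umap_works.fit_transform(X, data_on_host=True)
```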
Hi beckernick,
I'm encountering GPU out-of-memory errors when applying UMAP to extremely large SAE feature matrices (~400GB in fp32). My environment details are as follows:
OS: Ubuntu 20.04
RAPIDS: 24.12
GPU: 3090
CPU Memory: 512 GB
Python: 3.12.8
I suspect that adjusting nnd_n_clusters might help with the memory usage. Could you provide guidance or share any implementation strategies that could handle such large-scale feature matrices efficiently?
Here is a list of installed packages for reference:
Thank you for your assistance!
Best regards,