
Excessive memory usage during compilation with pip #115

Closed
sirmarcel opened this issue Apr 29, 2024 · 6 comments
sirmarcel commented Apr 29, 2024

Currently, attempting to build the sphericart-torch wheel with pip requires a large amount of RAM if many CPU cores are present. I think this is due to this line, which invokes cmake without specifying the number of jobs; cmake presumably then defaults to the total number of cores. On an HPC system that can be 40 or 80 cores, and so compilation tends to get killed by the host OS.

While this is not catastrophic, it is inconvenient, and in many cases a waste of resources (compilation is not much faster in parallel mode). I would suggest capping the job count at some reasonable default, or disabling parallel builds entirely. Alternatively, the installation docs should at least mention this behaviour (see #116).
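The suggested cap could be sketched like this (a hypothetical illustration, not sphericart's actual setup.py; the function name and the cap of 8 are assumptions):

```python
import os

def n_build_jobs(max_jobs=8):
    # Never exceed max_jobs, even on 40-80 core HPC nodes where
    # os.cpu_count() would otherwise drive cmake to spawn one job per core.
    return min(os.cpu_count() or 1, max_jobs)

# The resulting value would then be passed to cmake explicitly, e.g.
#   cmake --build <build_dir> --parallel <n_build_jobs()>
```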

@nickjbrowning (Collaborator)

Thanks for the find, this is a very good point. I'll address this in a PR tomorrow.


sirmarcel commented Apr 29, 2024

Thanks @nickjbrowning !

@nickjbrowning nickjbrowning self-assigned this Apr 29, 2024
Luthaf (Contributor) commented Apr 30, 2024

One thing I don't understand here is that we don't have that many files to compile, so `make -j` and `make -j8` should behave the same (both launch ~8 compilation jobs).

@sirmarcel (Author)

It's a bit suspicious. My observations are: (a) compilation is killed under the default allocation on izar (4 GB, I believe); (b) if you remove --parallel from the setup.py of sphericart-torch, it builds without problems; (c) requesting a node with 32 GB also works, without modification.

Luthaf (Contributor) commented Apr 30, 2024

Oh, right. I can see the compiler requiring a couple of GiB per file (there are a lot of torch headers to parse and templates to instantiate), so parallel compilation would fail with only 4 GiB of available RAM. But then the change by @nickjbrowning would not fix it here, since compilation would also fail with only 8 jobs.
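A back-of-envelope check of this point (the ~2 GiB per-job figure is an assumption for illustration, not a measurement):

```python
GIB_PER_JOB = 2  # assumed peak memory per compiler process, not measured

def fits(jobs, available_gib):
    # True if `jobs` parallel compile jobs fit in the given memory budget.
    return jobs * GIB_PER_JOB <= available_gib

# 8 jobs need ~16 GiB, so even an -j8 build exceeds the 4 GiB default
# allocation, while a 32 GiB node fits comfortably.
```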


nickjbrowning commented Jul 12, 2024

I've added these two environment variables to the build process:

```
SPHERICART_PARALLEL_BUILD=ON
SPHERICART_JOBS=NJOBS
```

So you can now control the number of build jobs via:

```shell
SPHERICART_PARALLEL_BUILD=OFF pip install .[torch]  # disables parallel builds
SPHERICART_JOBS=4 pip install .[torch]              # uses 4 jobs for compilation
```
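A minimal sketch of how such variables could be turned into cmake arguments (an illustration of the idea, not the actual setup.py logic; the defaults chosen here are assumptions):

```python
import os

def cmake_parallel_args(env=os.environ):
    # Honour SPHERICART_PARALLEL_BUILD / SPHERICART_JOBS as described above.
    if env.get("SPHERICART_PARALLEL_BUILD", "ON").upper() == "OFF":
        return []  # serial build: no --parallel flag at all
    args = ["--parallel"]
    jobs = env.get("SPHERICART_JOBS")
    if jobs:
        args.append(jobs)  # e.g. ["--parallel", "4"]
    return args
```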

frostedoyster added a commit that referenced this issue Jul 17, 2024