
INFO:numexpr.utils:Note #17

Open

avnikonenko opened this issue Dec 11, 2024 · 7 comments

Comments

@avnikonenko commented Dec 11, 2024

Hello!
I have a question about the NUMEXPR_MAX_THREADS logging info.
During an a3fe run I got these messages:

INFO:numexpr.utils:Note: detected 128 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
INFO:numexpr.utils:Note: NumExpr detected 128 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 16.
INFO:numexpr.utils:NumExpr defaulting to 16 threads.

I found that to control this behavior I need to set two environment variables, per Stack Overflow:

import os

# Both must be set before numexpr is first imported
# (a3fe pulls it in indirectly), or they have no effect.
os.environ['NUMEXPR_MAX_THREADS'] = '128'
os.environ['NUMEXPR_NUM_THREADS'] = '128'

import a3fe as a3
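As a sanity check, the values numexpr actually picked up can be inspected; a quick sketch, assuming numexpr's documented detect_number_of_cores helper and nthreads attribute:

import os
os.environ['NUMEXPR_MAX_THREADS'] = '128'
os.environ['NUMEXPR_NUM_THREADS'] = '128'

import numexpr
from numexpr.utils import detect_number_of_cores

print(detect_number_of_cores())  # virtual cores numexpr can see, e.g. 128
print(numexpr.nthreads)          # threads numexpr will actually use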

Setting these works fine. But which calculations does this apply to, what are the most effective values, and does it actually have any effect? I couldn't find any information about this in the documentation.
Could you clarify this for me please?
Thank you!

@fjclark (Collaborator) commented Dec 12, 2024

Hello!

As mentioned in the Stack Overflow answer, it's not really something to worry about. I think the main place numexpr is indirectly used by a3fe is in PyMBAR (here). You may be able to accelerate PyMBAR slightly by increasing the number of threads, e.g. to equal your number of cores, but I haven't looked into it, and I never set the numexpr thread count myself (I just ignore the message). Even if you did accelerate the analysis slightly, the MD will still be the bulk of the cost of the calculation, and it is unaffected by this setting.
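To get a feel for whether the thread count matters for your workload, a standalone numexpr timing check along these lines can help (a rough sketch; the array size and expression are arbitrary):

import time

import numpy as np
import numexpr as ne

# An arbitrary large elementwise expression, just to exercise numexpr.
a = np.random.rand(50_000_000)
b = np.random.rand(50_000_000)

for n in (1, 4, 16):
    ne.set_num_threads(n)  # documented numexpr API; capped by NUMEXPR_MAX_THREADS
    start = time.perf_counter()
    ne.evaluate('2*a + 3*b - a*b')
    print(f'{n:>2} threads: {time.perf_counter() - start:.3f} s')

If the timings barely change, there is little to gain from tuning the variables at all.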

To make things clearer, I should either add something to the docs, or figure out why this warning is appearing. This seems to be due to BioSimSpace - if I run

import BioSimSpace

I see the message, but if I run

import numexpr
import BioSimSpace

I don't see the message.
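In the meantime, the records can also be silenced without touching any thread settings by raising the relevant logger's level before the import (a small sketch, assuming only that the messages come from the numexpr.utils logger, as the INFO:numexpr.utils prefix suggests):

import logging

# Raise the numexpr logger above INFO before anything imports numexpr.
logging.getLogger('numexpr').setLevel(logging.WARNING)

import BioSimSpace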

I'm super busy at the minute so will leave figuring this out until after Christmas as it's not critical!

Thanks.

@avnikonenko (Author)

It is not critical at all; I was just wondering whether it affects any login-node calculations, like calc.analyse(). As it seems it does not, your answer here is enough.
Thank you for the clarification!

@fjclark (Collaborator) commented Dec 13, 2024

No problem!

Also, if you want to avoid running the analysis on the headnode (which can be fairly intensive) you can submit the analysis as a slurm job with calc.analyse(slurm=True). I should probably set this as the default.

@avnikonenko (Author) commented Dec 13, 2024

That's what I need! But does calc.analyse(slurm=True) use GPUs for the calculation? We have different types of nodes with limited usage time, and if GPUs are not used I would prefer to run it on a CPU node. In that case, is it possible to set up two templates, one for GPU and one for CPU jobs, or to start a new calculation only for the analysis while reusing all the previous results?

@fjclark (Collaborator) commented Dec 16, 2024

Currently we use PyMBAR 3 rather than 4 because of convergence issues (see choderalab/pymbar#544). PyMBAR 3 only runs on CPUs (whereas PyMBAR 4 can be accelerated with GPUs), so it's not ideal that a3fe submits the analysis to the same GPU queue as the MD.

  • A quick hack for now (I'm afraid I'm super busy before Christmas trying to smash out my thesis) is to wait until the calculation has finished, then update run_somd.sh in the input directory with the new queue. Then load the calculation, update the submission scripts, and run the analysis through slurm:

import a3fe as a3

# Load the existing calculation from the current directory.
calc = a3.Calculation()
# Propagate the edited run_somd.sh to the submission scripts.
calc.update_run_somd()
calc.analyse(slurm=True)

  • I'll open an issue and sort out a proper solution after Christmas. We're planning to overhaul the code (Too Much Coupling to SOMD #11), including replacing run_somd.sh with a cleaner pydantic slurm settings class, and I'll make sure we allow submission to separate queues for the analysis and MD! A rough sketch of what that could look like is below.
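For the curious, a very rough sketch of the kind of pydantic settings class we have in mind (all names here are hypothetical; none of this exists in a3fe yet):

# Hypothetical sketch only: none of these names exist in a3fe yet.
from typing import Optional

from pydantic import BaseModel


class SlurmSettings(BaseModel):
    """Per-job-type slurm options, rendered into the submission script."""

    partition: str
    gres: Optional[str] = None  # e.g. 'gpu:1' for MD; None for CPU-only analysis
    time: str = '24:00:00'
    cpus_per_task: int = 1


# Separate queues for the MD and the (CPU-only, PyMBAR 3) analysis.
md_settings = SlurmSettings(partition='gpu', gres='gpu:1')
analysis_settings = SlurmSettings(partition='cpu', cpus_per_task=8)

This would let the analysis and MD queues be configured independently instead of both being hard-wired into run_somd.sh.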

@avnikonenko (Author)

The solution works fine. Thank you!
And I wish you luck with the thesis!

@fjclark (Collaborator) commented Dec 19, 2024

Great! Thank you!
