Enable usched and initret and create ConverseExit #143
@lvkale I’d like to loop you in to discuss this PR. Ritvik noticed that the current Charm++ code calls ConverseExit in two different ways:
Because of these two usage patterns, ConverseExit has to be implemented as a real exit function, as done in this PR. In this implementation, all worker threads with rank ≥ 1 enter an infinite loop at the end of ConverseExit, waiting to be terminated, while the rank 0 thread eventually calls exit, allowing the OS to kill all threads. I’m not sure this is the most elegant way to handle program termination. What are your thoughts?
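For reviewers who want the termination pattern in front of them, here is a minimal sketch of the idea described above. It is not the code in this PR: `exit_sketch`, `g_arrived`, and the `terminate` parameter are hypothetical names. Having rank 0 wait for the other ranks before terminating is also an assumption made here to keep the sketch deterministic.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <functional>
#include <thread>

// Hypothetical counter of ranks that have entered the exit routine.
std::atomic<int> g_arrived{0};

// Hypothetical stand-in for ConverseExit. `terminate` defaults to std::exit;
// it is a parameter only so the sketch can be exercised without ending the process.
void exit_sketch(int rank, int num_ranks, int code,
                 std::function<void(int)> terminate = [](int c) { std::exit(c); }) {
    assert(rank >= 0 && rank < num_ranks);
    g_arrived.fetch_add(1);

    if (rank != 0) {
        // Worker ranks never return: they park until the process is torn down.
        for (;;) {
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
        }
    }

    // Assumption: rank 0 waits until every rank has reached the exit routine.
    while (g_arrived.load() < num_ranks) {
        std::this_thread::yield();
    }

    // Comm backend cleanup would go here, before the process ends.
    terminate(code);  // with the default, the OS reclaims the parked threads
}
```

Sleeping in the worker loop rather than busy-spinning is a choice made for the sketch; the trade-off under discussion (parking workers versus joining them) is the same either way.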
This is a major-ish change. It would be good to get some old-timers to review. Will you please tag Sam and Eric Bohm (and maybe Evan)?
For @ericjbohm and others, I was hoping you would try these changes on NAMD and see if this works for you. I also would appreciate feedback on the design of the exit procedure (which I have to make compatible with existing Charm++). |
To support NAMD, we need the ability for reconverse users to control the scheduler and do extra setup (aside from Cmi_startfn) on their own, if they choose. These changes rework the thread launches to work more like old Converse, where rank 0 uses the current thread instead of launching a new thread. All cleanup occurs in ConverseExit, including comm backend cleanup. The scheduler's stop flag can now also be reset, which allows repeated scheduler calls.
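To make the two mechanisms above concrete, here is a rough sketch: rank 0 running on the calling thread, and a scheduler whose stop flag is cleared on return so it can be entered again. All names (`RankState`, `run_scheduler`, `launch`) are hypothetical and not the reconverse API. The sketch also returns when the queue is empty and joins the workers in `launch`, both simplifications assumed here so that the example terminates.

```cpp
#include <atomic>
#include <cassert>
#include <deque>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical per-rank state; not the actual reconverse data structures.
struct RankState {
    std::deque<std::function<void()>> queue;  // pending handlers
    std::atomic<bool> stop{false};            // scheduler stop flag
};

// Runs handlers until one sets the stop flag (or the queue drains), then
// clears the flag so a later call can run the scheduler again.
int run_scheduler(RankState& st) {
    int handled = 0;
    while (!st.stop && !st.queue.empty()) {
        auto handler = std::move(st.queue.front());
        st.queue.pop_front();
        handler();
        ++handled;
    }
    st.stop = false;  // reset: this is what permits repeated scheduler calls
    return handled;
}

// Launches ranks 1..num_ranks-1 on new threads; rank 0 reuses the caller's thread.
void launch(int num_ranks, const std::function<void(int)>& start_fn) {
    assert(num_ranks >= 1);
    std::vector<std::thread> workers;
    for (int r = 1; r < num_ranks; ++r) {
        workers.emplace_back(start_fn, r);
    }
    start_fn(0);  // rank 0 runs here, on the current thread
    for (auto& w : workers) {
        w.join();  // simplification: in the PR, cleanup is left to ConverseExit
    }
}
```

With this shape, a user-driven program could do its own setup on rank 0 between scheduler calls, since control returns to the caller each time the stop flag is raised.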