WeeklyTelcon_20220927
Geoffrey Paulsen edited this page Oct 4, 2022
·
1 revision
- Dialup Info: (Do not post to public mailing list or public wiki)
- Akshay Venkatesh (NVIDIA)
- Austen Lauria (IBM)
- Brendan Cunningham (Cornelis Networks)
- Brian Barrett (AWS)
- Christoph Niethammer (HLRS)
- David Bernhold (ORNL)
- Edgar Gabriel (UoH)
- Geoffrey Paulsen (IBM)
- Josh Fisher (Cornelis Networks)
- Thomas Naughton (ORNL)
- Todd Kordenbrock (Sandia)
- Tommy Janjusic (NVIDIA)
- William Zhang (AWS)
- Artem Polyakov (NVIDIA)
- Aurelien Bouteiller (UTK)
- Brandon Yates (Intel)
- Charles Shereda (LLNL)
- Erik Zeiske
- George Bosilca (UTK)
- Harumi Kuno (HPE)
- Hessam Mirsadeghi (UCX/NVIDIA)
- Howard Pritchard (LANL)
- Jan (Sandia - ULT support in Open MPI)
- Jeff Squyres (Cisco)
- Jingyin Tang
- Joseph Schuchart
- Josh Hursey (IBM)
- Marisa Roman (Cornelis)
- Mark Allen (IBM)
- Matias Cabral (Intel)
- Matthew Dosanjh (Sandia)
- Michael Heinz (Cornelis Networks)
- Nathan Hjelm (Google)
- Noah Evans (Sandia)
- Raghu Raja (AWS)
- Ralph Castain (Intel)
- Sam Gutierrez (LLNL)
- Scott Breyer (Sandia?)
- Shintaro Iwasaki
- Xin Zhao (NVIDIA)
- Multiple weeks of discussion on the CVE report from NVIDIA.
- libevent CVE checkers - If we patch out the CVE issues (in libevent code that we're not using):
- One of the two warnings goes away.
- The other checker just looks at libevent version numbers, so its CVE warning does NOT go away.
- Updating libevent in v4.x is painful, not painful in v5.x
- Not updating libevent version in v4.1.x
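
A minimal sketch of why the second checker keeps warning, assuming it compares only the reported libevent version against a "fixed in" version. The function names and any version strings passed to them are hypothetical and chosen for illustration only; this does not represent the logic or data of any real scanner.

```c
#include <stdio.h>

/* Hypothetical helper: parse "major.minor.patch" into three ints.
 * Returns 1 on success, 0 on a malformed string. */
static int parse_version(const char *s, int v[3])
{
    return sscanf(s, "%d.%d.%d", &v[0], &v[1], &v[2]) == 3;
}

/* Hypothetical version-only check: flags the library whenever its
 * reported version is older than the first fixed version.
 * It never inspects the code, so patching out the affected functions
 * without changing the version string does not clear the warning. */
int version_only_check_flags(const char *reported, const char *fixed_in)
{
    int r[3], f[3];

    if (!parse_version(reported, r) || !parse_version(fixed_in, f)) {
        return 1; /* conservative: an unparseable version is flagged */
    }
    for (int i = 0; i < 3; i++) {
        if (r[i] != f[i]) {
            return r[i] < f[i]; /* first differing component decides */
        }
    }
    return 0; /* same version as the fix: not flagged */
}
```

Under this assumption, the only way to silence such a checker is to change the reported version, which matches the note that the warning persists after patching.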
- v4.1.5
- Schedule: targeting ~6 months (targeting November, so first RC in the next week or two)
- Austen posted a PR to update PRRTE+PMIx 10858
- Git commit checker is hung. Merged new updates he posted.
- In prep for new RC tomorrow.
- Going to meet with Ralph at 3pm Eastern today to discuss regarding
- Thomas had a to-do to look into mpirun passing the DVM piece.
- Still trying to figure out if we could do everything we want from `prun`.
- Right thing might be to pass DVM_URI to mpirun.
- Happy to join in; it would simplify all the stuff you get with the OMPI personality.
- Discuss `mca_base_env_list` https://github.com/open-mpi/ompi/pull/10788
- Did google around, and this is documented at https://oar.imag.fr/wiki:passing_environment_variables_to_openmpi_nodes
- Mentions that `-x` is deprecated?
- Easy to fix Mellanox CI, but SHOULD we?
- Let's remove the test, and add it to Issue 10698.
- Discuss Remaining PRRTE CLI issues (https://github.com/open-mpi/ompi/issues/10698)
- `-N`: document an error if it conflicts with `--map-by`.
- `--show-progress`: used to do the little `...` on the terminal to display progress; now it doesn't do anything.
- DOE may set this by default in MCA parameters (makes some users feel happy)
- `--display-topo`: Generally we've tried to be backwards compatible.
- `-v` version
- `-V` verbose
- `-s|--preload-binary` <- functionally it works, but with `-n` gets messed up
- rankfile <- NOT deprecating
- --mca is Open MPI's framework
- No gprtemca. Created by PRRTE, but do we continue to support --gpmixmca?
- --test-suicide and others are all for the PRRTE daemon, not exposed to the users.
- passed to prrte launcher
- Posted Issue Open-MPI #10698 with about 13 issues, that will need
- No longer trust the verbiage here, based on Ralph's comment.
- Not recognized by mpirun, but cited in --help.
- Some of these aren't possible??? and mpirun -> prterun (one shot thing)
- Should mpirun be able to talk to an existing DVM???
- Or is it always a one-shot thing?
- If we have it talk to an existing DVM, use prte to start up the prteds, and prun against that.
- If you're using MPI front-end, and want to interact with DVM, how should we tell users to do that?
- What should they do?
- Go through mpirun, or go through prun (with ompi personality?)
- Thomas can look and see if you can get everything you need.
- There were some common things that were difficult when switching between the two.
- Was there an option for this in v4.1?
- Yes, but perhaps wasn't working much.
- Are there legacy command line options that we should support or alias?
- Are we dropping DVM support for v5?
- How did this work in v4?
- Howard thought you fired up an orte something, and that would provide a command line
- Couldn't do all of this with mpirun, it was a two stage process.
- Had to start DVM manually, and got back a URI
- But thought if you sourced this schizo and gave it a URI, it would do all of the right things.
- Could add support if the user fired up using PRTE the DVM, and got URI back.
- Don't have ompi-dvm executable in v5, so this is already a deviation.
- What do we do?
- Support the same CLI options (and executables, etc.) as documented for v4.x
- Don't support at all in v5, and if you want to do DVM things
- Maybe something in the middle.
- Does anyone care about DVM?
- Can we run the ompi_schizo / personality with vanilla prun?
- Some people on call DO care about DVM.
- Early days of Sessions needed a DVM run (no longer needed in main/v5)
- Usually if customers are interested in doing this, they're willing to do a bit more work.
- But if we want to get v5.0.0 out in near future, it'd be more likely if we
- Thomas gets a lot of use with mini-tasks, some of which are MPI parallel.
- This is where DVM is useful, because they are slamming lots of serial and parallel jobs in a short time.
- If they can do this via prun and get the ompi_schizo, the path doesn't matter.
- Thomas will investigate proper options.
- Could do a CLI interface for mpirun in a future version to have mpirun not call prterun
- Don't want to rush this.
- Schedule:
- PMIx and PRRTE changes coming at end of August.
- PMIx v3.2 released.
- Try to have bugfixes PRed by end of August, to give time to iterate and merge.
- Still using Critical v5.0.x Issues (https://github.com/open-mpi/ompi/projects/3) yesterday
- Docs
- `mpirun --help` is OUT OF DATE.
- Have to do this relatively quickly, before PRRTE releases.
- Austen, Geoff and Tomi will be
- REASON for this is that the mpirun command line is in PRRTE.
- mpirun manpage needs to be re-written.
- Docs are online and can be updated asynchronously.
- Jeff posted PR to document runpath vs rpath
- Our configure checks some linker flags, but there might be a default in the linker or in the system that really governs what happens.
- Symbol Pollution - Need an issue posted.
- OPAL_DECLSPEC - Do we have docs on this?
- No. Intent is where do you want a symbol available?
- Outside of your library, then use OPAL_DECLSPEC (like Windows DECLSPEC)
- I want you to export this symbol.
- Need to clean up as much as possible.
- Open-MPI community's perspective, our ABI is just MPI_Symbols
- Still unfortunate. We need to clean up as much as possible.
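
The export intent described in the symbol pollution notes can be sketched with compiler symbol visibility. This is an illustration only: `DEMO_DECLSPEC`, `demo_internal_scale` and `demo_public_combine` are hypothetical names, and the macro is a simplified stand-in, not the actual definition of OPAL_DECLSPEC. It assumes the library is built with `-fvisibility=hidden` on GCC/Clang, so that only annotated symbols are exported.

```c
/* Hypothetical export macro, modeled on the idea behind OPAL_DECLSPEC. */
#if defined(_WIN32)
#  define DEMO_DECLSPEC __declspec(dllexport)
#elif defined(__GNUC__)
#  define DEMO_DECLSPEC __attribute__((visibility("default")))
#else
#  define DEMO_DECLSPEC
#endif

/* Not annotated: with -fvisibility=hidden this symbol stays internal to
 * the shared library, but other source files in the same library can
 * still call it. */
int demo_internal_scale(int x)
{
    return 2 * x;
}

/* Annotated: "I want you to export this symbol." */
DEMO_DECLSPEC int demo_public_combine(int a, int b)
{
    return demo_internal_scale(a) + b;
}
```

Under those assumptions, building this as a shared library and listing its dynamic symbols (for example with `nm -D`) should show only the annotated function, which is the kind of cleanup the notes ask for.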
- Community CI Jenkins had some errors last week.
- Needed to upgrade to Java (11 or 17?) from Java 8, and that caused some subtle issues.
- Cisco student to upgrade or replace use of some now-deprecated Jenkins plugins to improve stability/performance of Jenkins.
- Bulk of the work is merged. Some follow up patches, etc.
- Then once this is done, will backport to v5.0.x
- We're probably not getting together in person anytime soon.
- So we'll send around a doodle to have time to talk about our rules.
- Reflect the way we worked several years ago, but not really right now.
- We're to review the admin steering committee in July (per our rules):
- We're to review the technical steering committee in July (per our rules):
- We should also review all the OMPI github, slack, and coverity members during the month of July.
- Jeff will kick that off sometime this week or next week.
- In the call we mentioned this, but no real discussion.
- Wiki for face to face: https://github.com/open-mpi/ompi/wiki/Meeting-2022
- Might be better to do a half-day/day-long virtual working session.
- Due to companies' travel policies, and for convenience.
- Could do administrative tasks here too.