WeeklyTelcon_20230613
- Dialup Info: (Do not post to public mailing list or public wiki)
- Geoff Paulsen (IBM)
- Jeff Squyres (Cisco)
- Brian Barrett (Amazon)
- Luke Robison (Amazon)
- Quincey (AWS)
- Thomas Huber (Cornelis Networks)
- Todd Kordenbrok
- Tomislav Janjusic (nVidia)
- Joseph Schuchart
- Issue # with OMPI v4.1.5 - and latest PMIx (v4.2.5???)
- Just got a patch to fix it.
- Will drive a v4.1.x release
- fix is on the branch, so works in latest nightlies
- Potential Issue #11749 - OMPI won't spawn
-
#11532 - mca base params file - No progress yet
-
submodule pointer update got merged.
- Are we on tags in these submodule pointers?
-
Doc Issues: https://github.com/open-mpi/ompi/projects/3
-
async modex issues - just need to increase the timeout
- Trying to make the timeout directly related to scale.
- Very hard to know what this will look like
- Please update help message
-
Opened many other document Issues to make it easier to
-
NIC selection coming back from main PR #11739
- Somewhat large change. We really want this back, but it broke EFA
-
#11683 - Grequest issue, just a straight up bug fix.
- Not a v5.0.0 blocker
-
Quincey is still working on mpirun docs #11730
-
Quincey talked to PRRTE last week to see how we could better manage documentation across repos
- Okay if we update the text in PRRTE to make this easier.
- Would like to keep the text up to date
- mpirun --help and prterun --help pull from plain text.
- Can pull in text into .rst with includes
- He's updated it so manpage output is pulled from same plain text file
-
Idea: what if the "source of truth" were in rst instead?
- Then render rst into text, and then build man pages and docs.
- Would require rst anyway, this is already needed/done in places.
- Can do makefile logic to be optional
-
One thing we're losing with the manpage option is the ability to have internal links to jump around.
- Maybe we could keep this if we adopt the new idea (calling it the "inverted process")
-
Trying to keep all of the source for the document in the PRRTE repo, so need support from their community
- If we can't get PRRTE support for this, how about the current approach?
- Bummer, because we lose nice pretty HTML text in RST.
- We moved to RST thinking this HTML would be the primary source for users
-
One set of source that generates man, help, and HTML3
- x2 for mpirun and prte
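A rough sketch of the rst-to-text step in the "inverted" idea above, in Python. It assumes only a handful of inline reST constructs need stripping. The sample text, the flag name, and the function name rst_to_help_text are made up for illustration and are not from the mpirun or PRRTE sources. A real build would more likely lean on Sphinx or docutils.

```python
import re

# Hypothetical single-source snippet; not text from the real mpirun docs.
RST_SOURCE = """\
``--example-flag <value>``
    Set an **illustrative** option; see :ref:`some-label` for details.
"""


def rst_to_help_text(rst: str) -> str:
    """Strip a few inline reST constructs so the text suits --help output.

    This handles only roles, inline literals, and emphasis. Directives,
    tables, and section underlines are deliberately out of scope.
    """
    # :role:`text` or :role:`text <target>` -> text
    text = re.sub(r":\w+:`([^`<]+?)(?:\s*<[^>]*>)?`", r"\1", rst)
    # ``literal`` -> literal
    text = re.sub(r"``(.+?)``", r"\1", text)
    # **strong** -> strong, then *emphasis* -> emphasis
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)
    text = re.sub(r"\*(.+?)\*", r"\1", text)
    return text


if __name__ == "__main__":
    print(rst_to_help_text(RST_SOURCE), end="")
```

The same RST_SOURCE would feed the HTML and man builds unchanged, so only the --help text needs this extra rendering pass.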
-
PMIX v4.2 async modex issue: https://github.com/openpmix/openpmix/issues/3077
- Work around: -x PMIX_MCA_gds= or enable opal_pmix_collect_all_data
- Need to raise the timeout: fix in OMPI before the PMIX_Get call, increasing the timeout as a function of scale, with a user override.
- Likely that the original issue is a missing additional variable for async modex passed to ompi_pml_base_check_pml
- A new parameter exists for v5.0.x and MUST be documented.
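One possible shape for "timeout as a function of scale with user override", sketched in Python rather than the C it would actually be written in. The logarithmic growth, the constants, and the function and parameter names are all assumptions for illustration; the notes do not settle on a formula.

```python
import math
from typing import Optional


def modex_timeout_seconds(num_procs: int,
                          user_override: Optional[float] = None,
                          base: float = 30.0,
                          per_doubling: float = 10.0) -> float:
    """Return a timeout that grows with job size unless the user sets one.

    All names and default values here are hypothetical.
    """
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    if user_override is not None:
        # An explicit user setting always wins over the scaled default.
        return user_override
    # Assumed scaling: add a fixed increment each time the job size doubles.
    return base + per_doubling * math.log2(num_procs)
```

For example, modex_timeout_seconds(1024) gives 130.0 with these defaults, while passing user_override=5.0 returns 5.0 regardless of scale.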
-
MCA params issues are the biggest issues now - no new updates.
-
https://github.com/openmpi/ompi/issues/11532
- PMIX command line parsing issue: the first stage of the fix is complete; the next stage is expected over the next few days.
- https://github.com/openpmix/prrte/issues/1731
- Plan is to have 2 of the 3 fixes for v5.0.0, 3rd issue can wait for 5.0.x
- Quincey assigned, working on docs first.
-
Need to cherry-pick NIC selection (distances PR fixes) to v5.0.x
- Several PRs will go into main, including Coverity fixes.
- Amir to open up a v5.0.x PR to track all main commits and cherry-pick to v5.0.x when finished.
- Pending review -
- Will create initial v5.0.x PR as a pre-PR for the NIC selection: needs review
- #11726 -N bind ppr:X:node, map by package (socket), or core
- What we've confirmed is that there is a change to the way that binding works if you just specify -N
- Seems like we should try to change the schizo component so that we maintain behavior from v4 to v5.
- With this, we can decide what to do.
- #11722 - Cannot build+install with out of source builds (VPATH)
- Possible blocker, need to update submodule pointers.
- only on main
- main needs submodule update - Austen