WeeklyTelcon_20210615
- Austen Lauria (IBM)
- Brendan Cunningham (Cornelis Networks)
- Brian Barrett (AWS)
- David Bernholdt (ORNL)
- Geoffrey Paulsen (IBM)
- Jeff Squyres (Cisco)
- Josh Hursey (IBM)
- Matthew Dosanjh (Sandia)
- Michael Heinz (Cornelis Networks)
- Raghu Raja
- Sam Gutierrez (LANL)
- Tomislav Janjusic (NVIDIA)
- William Zhang (AWS)
- Akshay Venkatesh (NVIDIA)
- Artem Polyakov (NVIDIA)
- Aurelien Bouteiller (UTK)
- Brandon Yates (Intel)
- Charles Shereda (LLNL)
- Christoph Niethammer (HLRS)
- Edgar Gabriel (UH)
- Erik Zeiske (HPE)
- Geoffroy Vallee (ARM)
- George Bosilca (UTK)
- Harumi Kuno (HPE)
- Hessam Mirsadeghi (NVIDIA)
- Howard Pritchard (LANL)
- Joseph Schuchart (HLRS)
- Joshua Ladd (NVIDIA)
- Marisa Roman (Cornelis Networks)
- Mark Allen (IBM)
- Matias Cabral (Intel)
- Nathan Hjelm (Google)
- Thomas Naughton III (ORNL)
- Noah Evans (Sandia)
- Ralph Castain (Intel)
- Scott Breyer (Sandia?)
- Shintaro Iwasaki
- Todd Kordenbrock (Sandia)
- Xin Zhao (NVIDIA)
- Will ship v4.0.6 today.
- No driver to rush, so now just in bugfix phase.
-
PMIx / PRRTE plan to release in the next few weeks.
-
Need to do a v5.0 rc as soon as PRRTE v2 ships.
- Need feedback if we've missed an important one.
-
PMIx Tools support is still not functional. Opened tickets in PRRTE.
- Not a common case for most users.
- This also impacts the MPIR shim.
- PRRTE v2 will probably ship with broken tool support.
-
Is the driving force for PRRTE v2.0 OMPI?
- So we'd be indirectly/directly responsible for PRRTE shipping with broken tool support?
- Ralph would like to retire, and really wants to finish PRRTE v2.0 before he retires.
- Or just fix it in PRRTE v2.0?
- Is broken tool support a blocker for PRRTE v2.0?
- Don't ship OMPI v5.0 with broken Tools support.
-
Are there any objections to delaying
- Either we resource this
-
https://github.com/openpmix/pmix-tests/issues/88#issuecomment-861006665
- Current state of PMIx tool support.
- We'd like to get Tool support in CI, but need it to be working to enable the CI.
-
https://github.com/openpmix/prrte/issues/978#issuecomment-856205950
- Blocking issue for Open-MPI
- Brian
-
PR 9014 - new blocker.
- fix should just be a couple of lines of code... hard to decide what we want.
- Ralph, Jeff and Brian started talking.
-
Need some configury changes in before we RC.
-
Issue 8850, 8990 and more
-
Brian will file 3-ish issues
- One is configure pmix
-
Dynamic Windows fix in for UCX.
-
Any update on debugger support?
-
Need some documentation that Open MPI v5.0 supports PMIx based debuggers, and that if
-
MPIR Shim - pushed up fixes, and enabled CI.
- Could add it to some more CI, to ensure that PMIx doesn't break
- IBM is working on some CI testing with MPIR (typically very brittle)
- Need some guidance on pmix version.
- Right now, this is probably not a big deal. But in 2 years, when we have 3 release branches each using a different pmix version, it might make sense to do open-mpi CI testing.
- Shouldn't be too much work to do.
-
UCC coll component is being updated to be the default when UCX is selected. PR 8969
- Intent is that this will eventually replace hcoll.
- Solid progress is happening on Read the Docs.
- These docs would be on the readthedocs.io site, or on our site?
- Haven't thought either way yet.
- No strong opinion yet.
-
Now released.
-
We don't KNOW that OMPI v6.0 won't be an ABI break
- So nice to get MPIX_ rename into v5.0
-
Would be NICE to get MPIX symbols into a separate library.
- What's left in MPIX after persistent collectives?
- Short Float,
- Pcall_req - persistent collective
- Affinity
- If they're NOT built by default, it's not too high of a priority.
-
Should just be some code-shuffling.
- On the surface shouldn't be too much.
- If they use wrapper compilers, or official mechanism
- Top level library, since app -> MPI and app -> MPIX lib.
- libmpi_x library can then be versioned differently.
-
Don't change to build MPIX by default.
-
Open an issue to track all of our MPI 4.0 items
- MPI Forum will want, certainly before supercomputing.
-
Do we want an MPI 4.0 Design meeting in place of a Tuesday meeting?
- An in-person meeting is off the table for many of us. We might want an out-of-sequence meeting.
- Let's doodle something a couple of weeks out.
- Doodle and send it out
- Trivial wiki page in the style of the other in-person meeting wiki pages.
- Mellanox hasn't been reporting for a while. Tommi will follow up.
- Jeff did some work on Cisco MTT.
- There are a bunch of one-sided issues across nodes.
- Austen and Jeff looking into.
- Narrowed it down to strange results from MPI_Comm_split
- Local Peers value appears to be set wrong under PRRTE
- Joseph saw hwloc being installed in the installation path, which leads to warnings when another hwloc is used.
- We changed how all of this worked a few weeks ago.
- We shouldn't be installing one unless we can't find an external one.
- Problem is if you link the application to a different hwloc, it now complains.
- This has always been true, we just warn now. Don't do this.
- Austen filed a couple of issues from MTT.
- No discussion
- No update
- No discussion.