ASC Q4 2021 Meeting

Josh Hursey edited this page Oct 28, 2021 · 15 revisions

PMIx Standard Administrative Steering Committee (ASC) 4Q 2021 Meeting

Quick Links

  • Governance Document [latest]

Agenda (Finalized on Oct. 13, 2021)

This meeting has a floating agenda with specific synchronization points to keep us on track. Rough time estimates are provided per agenda item, and the co-chairs plan to cover the topics in the order shown below. However, because some agenda items will take more or less time than anticipated, exact start/end times are not guaranteed and some items may float to the second day. If you are presenting and cannot attend the full meeting, please let the co-chairs know so we can plan accordingly.

  • Meeting Slides (posted after the meeting)

Day 1: Oct. 26 (10 am - 1 pm US Central Daylight Time)

Start End Topic
10:00 am 10:05 am Gathering (Josh)
10:05 am 10:10 am Roll Call (We will start roll call promptly at this time)
10:10 am 11:30 am Discussion of agenda items
11:30 am 11:45 am Break
11:45 am 1:00 pm Discussion of agenda items

Day 2: Oct. 28 (10 am - 1 pm US Central Daylight Time)

Start End Topic
10:00 am 10:05 am Gathering (Kathryn)
10:05 am 11:30 am Discussion of agenda items
11:30 am 11:50 am Voting and Break (Doodle Vote Link & Officer Doodle Vote Link)
11:50 am 12:30 pm Administrative and Working Group agenda items
12:30 pm 12:45 pm Technical and Use Case Presentation(s)
12:45 pm 1:00 pm Closing discussion and wrap up

Agenda Items

Administrative and Working Group Agenda Items

  • Review 2022 quarterly meeting dates and plans
1Q 2022 - Virtual
  - Feb 15 & 17 (10 am - 1 pm US Central)
2Q 2022 - Virtual
  - May 10 & 12 (10 am - 1 pm US Central)
3Q 2022 - Virtual (tentative)
  - Aug 9 & 11 (10 am - 1 pm US Central)
4Q 2022 - Virtual (tentative)
  - Oct. 25 & 27 (10 am - 1 pm US Central)
  • ASC Membership
    • Vote on new ASC Members
    • Call for new ASC Members
  • Election of "even year" Co-Chair and Secretary Positions
  • Release Planning
  • Working Group Updates (~ 10-15 minutes each)
    • Client Separation / Implementation Agnostic Document
    • Tools
    • Dynamic Workflows
    • Open Call for New Working Groups
  • Technical and Use Case presentations
    • Isaías A. Comprés Ureña: "Experiences with a Slurm and MPICH based Malleability Prototype"
  • Additional discussion items

Meeting Notes:

Attendance

Person Institution Day 1 Day 2
Josh Hursey IBM x x
Kathryn Mohror LLNL x x
Brice Goglin Inria x x
David Solt IBM x x
Howard Pritchard LANL x x
Ken Raffenetti ANL x x
Nat Shineman OSU x x
Ralph Castain Nanook x x
John Delsignore Perforce TotalView x x
Justin Wozniak ANL x
Isaias Compres TUM x x
Thomas Naughton ORNL x x
Aurelien Bouteiller UTK x x
Albeaus Bayucan Altair x x
Norbert Eicker JSC x x
Jai Dyal Intel x x

Day 1: Oct. 26, 2021

  • Intro
  • PMIx Standard PRs (2nd votes)
  • PMIx Standard PRs (Readings/1st votes)
    • Assign specific values to constants (Ralph/John/IAWG ~ 10 min)
    • Comment: This may bump up against the ABI discussion; some points here may need to be revisited if that discussion identifies things that need to change.
    • Comment: Question about how far this gets you toward an ABI; the structures were recognized as another key point. (See later discussion.)
    • Ralph: May also need to consider padding in the structures.
    • John: Also dealt w/ these issues with the OpenMP versioning ABI. Example: a downloadable header file on the OpenMP website details everything for use with a debugger/tool. Everything needed on the debugger side is defined in that header file.
    • Ralph: Note that the Standard already has all of those details; the question raised is whether it would be better to also create an implementation-agnostic header file.
    • John: The header file can offer convenience. Some have even used the header to validate the specification/consistency. That header was stand-alone.
    • See also open ticket https://github.com/pmix/pmix-abi/issues/1 with a similar idea
    • Segue to ABI: look at https://github.com/pmix/pmix-abi, which is being used to work through this idea for headers.
    • Work has started, for example, on trying to dlopen() the OpenPMIx library and use the headers as a proof-of-concept prototype.
    • Note - The ABI discussion is being worked on in the Implementation Agnostic WG, currently starting with the tool/client side as the driving use case.
    • Reading through governance guidance regarding ABI: https://github.com/pmix/governance/pull/34
      • Outline terms and the stages/release points when ABI changes occur
      • John: Possibly useful to look into “semantic versioning” (https://semver.org). Gist: provide interfaces to get the versioning details (major, minor, patch) so that the version is easy to obtain, and define at which levels capabilities may change in ways that introduce breakage. This allows you to find a version that will work when running in a dynamic scenario.
      • Ken: Question about how much this is a Standard vs. Implementation issue. A standard can publish the canonical interface for a version, but ultimately the implementation is responsible for how you get at that information (e.g., the hooks for getting at the versioning and underlying capabilities).
      • Do need some ability for tools/users to negotiate the versions. Put the burden on tools and not users.
      • Note - Linux rdma_core provides multiple ABI versions and is possibly a good example of a library that does this well. Also see libfabric.
      • Can be painful, but very useful to the end user
      • Q: Any concerns about going down this path?
      • Ralph: Some caution that placing too much burden on supporting this could, realistically, be a problem for getting it supported/implemented.
      • A prototype implementation may be a good way to guide standardization enhancements related to ABI versioning.
      • The calls are often fairly simple, e.g., get the version (maj, min, patch), then call back into the library to see whether it can support a given maj/min/patch, and fail out if not. In short: the tool gets the library's supported version, then asks the library whether it supports a given version, yes/no.
      • For OpenPMIx there is a table with backward ABI versioning, and it works as long as you are going backward. This gets tested by packagers.
      • Q: What about macros?
      • Those are tougher. A few are very important beyond just convenience. It may be a good idea to turn those into a standardized interface, i.e., standardize the functions underneath those handy macros.
      • Gist: Could have macros in the header, but in a few cases the macros use non-standard functions. So the suggestion is to take those non-standard underlying functions used in macros and propose them for addition to the PMIx Standard, e.g., pmix_info_list_init().
      • Seems good to have these as generic interfaces and avoid being too prescriptive (e.g., linked lists) in the standard itself.
      • Gist: Converting macros to standardized APIs would make things easier (b/c the functionality of the macros is very helpful).
      • David: In the ABI review, also looked at pmix_value_t, which is currently a union; we would not want to add anything larger than the current largest member (to avoid growing that data structure). Fairly confident this is not a major issue (fingers crossed), so hopefully in good shape here.
      • Josh: The other item is pmix_proc_t; any comments for the ABI discussion?
      • Ralph: This is also a performance-sensitive item, b/c proc_t gets used/stored a lot.
      • John: Are there explicit copy functions for proc_t structures, or are they just directly assignable?
      • proc_t’s use a static array (256-char array), so the size is fixed and the stride length is known when using the data structure.
      • Comment: The spec says PMIX_MAX_NSLEN has a minimum of 63 characters, but an earlier PR (#359) included a default of 255. Do we need to change the spec (advice to users, section 3.1)?
      • Are there creative solutions that avoid these static arrays and their large memory consumption?
      • Discussion of indirection/translation tables, and of how that would be manifested in the standard.
      • One point raised was possibly using an encoding on the string (e.g., the namespace), with a fixed-size prefix in the string indicating that the rest is a location. Example: in “@xyz”, the “@” signals that the rest is a location.
      • A similar encoding was used in the RegEx.
      • Discussed some implementation impacts of such changes, e.g., breaking shared-memory pieces due to the move from fixed- to variable-length data.
      • Q: If proc_t switches to a pointer instead of a fixed size, who is responsible for resource mgmt/cleanup?
      • Comment: Could possibly use reference counting in the library, b/c you are already making PROC_CREATE/PROC_FREE calls that could be used for this. Possibly something to look into, but need to keep things lightweight and flexible.
      • Kathryn asked whether standardizing constants could cause backward incompatibilities
        • Ralph: Standardizing things like error constants should be fine
        • Ralph: Standardizing key sizes, namespace sizes, etc. could be more problematic; we could defer these to a separate PR where we can double-check whether that causes issues
        • This is just one step toward the ABI; it will make the subsequent ABI PRs easier to read/review
        • Ralph will do a revision exception change to PR 359 to remove anything that we still want to discuss, and will present it again on Thursday.
        • A (new) overarching issue on ABI will have a ‘checklist’ that tracks the steps of all attached PRs.
      • Aurelien: What about the abi-header repo being separate from the main standard repo?
        • David: we did not intend it to be ‘separate’ forever, it is a placeholder for WIP
        • Josh: we will add some text to the governance document when this is fleshed out
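As a rough illustration of the proc_t sizing discussion above, the following Python sketch uses ctypes to model a fixed-size process identifier and show how the maximum namespace length drives the struct size and the array stride. DemoProc, make_proc_type, and array_bytes are hypothetical names, and the layout (a char array of max_nslen + 1 bytes followed by a 32-bit rank) is an assumption made for illustration, not text from the Standard.

```python
import ctypes


def make_proc_type(max_nslen):
    """Build a hypothetical fixed-size process-identifier struct.

    Assumption: the namespace is a char array of max_nslen + 1 bytes
    (room for a terminating NUL) followed by a 32-bit rank.
    """
    class DemoProc(ctypes.Structure):
        _fields_ = [
            ("nspace", ctypes.c_char * (max_nslen + 1)),
            ("rank", ctypes.c_uint32),
        ]
    return DemoProc


def array_bytes(proc_type, count):
    """Bytes used by a contiguous array of `count` identifiers."""
    return ctypes.sizeof(proc_type * count)


proc_255 = make_proc_type(255)  # default mentioned for PR 359
proc_63 = make_proc_type(63)    # minimum mentioned in the spec

# Fixed size means a fixed stride: element i starts at i * sizeof(struct).
print(ctypes.sizeof(proc_255))  # 260
print(ctypes.sizeof(proc_63))   # 68

# Memory needed to store one million identifiers at each size.
print(array_bytes(proc_255, 1_000_000))
print(array_bytes(proc_63, 1_000_000))
```

The fixed stride is what makes these structures easy to place in shared memory, and the per-entry size is what motivates the question about large memory consumption.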
  • Cross-Version Support Use Case (Josh ~ 15 min)
    • https://github.com/pmix/pmix-standard/pull/357
    • Read
    • Aurelien: What is the rationale for the piece of text about other libraries also having cross-version compat issues?
      • Josh: interactions with users have highlighted that they often misunderstand this: PMIx having solved its own cross-version issue does not mean it can solve cross-version issues for all libraries
    • Kathryn: should we add pmix_get_version as one of the ways of checking cross-version compatibility (along with pmix_query_info)?
      • Josh: sure can add to the list
      • John: is the version string implementation dependent? It can be hard for users to parse arbitrary strings; we could have another issue to look into how to obtain a standardized version (e.g., MAJOR, MINOR, PATCH)
      • Josh: could we add some text to get_version, get_version_ex, etc. about standardized formatting before we work on ABI? Consensus is to defer to the ABI effort.
    • This ticket will appear as-is for ballot (the revision exception to add get_version changes was dropped during discussion; see above)
  • PMIx Standard PRs (Reading Errata)
  • Plenary discussion items
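The version negotiation described on Day 1 (obtain major/minor/patch, then ask the library whether it supports a given version) might look roughly like the Python sketch below. Version, parse_version, library_supports, and negotiate are hypothetical helpers, not PMIx APIs. Two assumptions are made for illustration only: the version string has the form "MAJOR.MINOR.PATCH", and a library supports a requested version when the major numbers match and the request is no newer than the library (a semver-style rule, not something the committee agreed on).

```python
from typing import NamedTuple, Optional, Sequence


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse an assumed 'MAJOR.MINOR.PATCH' string."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {text!r}")
    return Version(*(int(p) for p in parts))


def library_supports(library: Version, requested: Version) -> bool:
    """Assumed rule: same major, and requested is not newer than library."""
    if requested.major != library.major:
        return False
    return (requested.minor, requested.patch) <= (library.minor, library.patch)


def negotiate(library: Version,
              tool_candidates: Sequence[Version]) -> Optional[Version]:
    """Return the newest tool-side version the library supports, or None."""
    supported = [v for v in tool_candidates if library_supports(library, v)]
    return max(supported) if supported else None


library = parse_version("4.2.0")
tool_candidates = [parse_version(s) for s in ("5.0.0", "4.1.2", "4.0.0")]
print(negotiate(library, tool_candidates))
# prints: Version(major=4, minor=1, patch=2)
```

Keeping the yes/no check on the library side and the candidate list on the tool side matches the suggestion to put the negotiation burden on tools rather than users.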

Day 2: Oct. 28, 2021

  • Intro
  • Planning for 2022 ASC meetings
  • Call for new members and member organizations
    • Kathryn to reach out to Norbert Eicker (JSC, aka Julich) as he expressed interest
  • Roll call passed
    • Note: NVIDIA/Mellanox not present at roll-call
  • PR 359 takes a first step by defining values for most constants
  • ABI Summary from Day 1
  • Voting block
    • Everything passed; PR 359 did not pass w/o revisions, but passed with the revision
    • Kathryn and Thomas renewed for a new term \o/
  • Discussion about moving changes made to Open PMIx into standard tickets
    • Ralph is trying to scale back involvement in writing standard tickets and focus on implementation
    • We need more people to step up and take on turning RFC changes into standard tickets/PRs
    • Release timeline for PMIx 4.2: we’d like to have all these changes integrated
  • Release timeline for PMIx 5.0
    • Is ABI part of 5.0 or is it not realistic timeline-wise? We have to have 2Qs before we can vote on 5.0
    • Ralph would prefer to have it in v5, it could be in v6 if we have a strong driver to release v5 early
    • List of v5 things: Storage support (provisional, will remain provisional barring implementation effort), use-cases, clarifications, nothing very major
    • Consensus from discussion seems to be to add ABI in the v5 timeframe, but to avoid other large additions so that scope/time does not creep
    • Remark: possibly worth reconsidering the governance rules on large text changes (e.g., use cases) and whether they can go into a minor or major release. We want major text changes to undergo the double review voting process but still be eligible for a minor release if they are semantically neutral. Aurelien to prepare a PR on this topic.
  • Working Group Updates
    • IA (David)
      • Mainly been spending time on ABI items
      • Return codes pushed through
      • Review of IN vs. INOUT on APIs may need attention (help wanted item)
    • Tools (Isaias)
      • Work w/ debugger
      • Malleability (see presentation later); do not currently see a need for additional APIs/attributes
      • Containers - research project looking into this & use case
      • Looking into generic topology description/format
    • Dynamic Workflows (Justin)
      • Justin W is the new lead; he is involved in many projects with interest in PMIx and is looking at other use cases, including workflows + MPI, etc.
      • CANDLE (deep learning app) interesting possibility for using PMIx
      • ExaWorks is also a possible candidate to leverage PMIx
      • Working group may return to regular meeting schedule, contact Justin to coordinate.
      • Tools WG meets bi-weekly on Wed at 5pm Germany / 8am Pacific
      • Possibly interleave with the Tools WG by meeting on its off week
  • Open call for new WG
    • No new WGs proposed
    • Q: Would ABI be good to have as its own WG, or should it stay inside the IAWG?
      • No strong objections, so informally keep in IAWG
  • Technical presentations
    • Presentation by Isaias
      • TODO: Insert link to slides