Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Chunk distribution strategies #824

Open
wants to merge 11 commits into
base: dev
Choose a base branch
from

Conversation

franzpoeschel
Copy link
Contributor

@franzpoeschel franzpoeschel commented Nov 23, 2020

Based on topic-available-chunks, see #802.
Based on #1043

Adds strategies for guiding applications to efficient parallel chunk loading patterns.
Idea: PR #802 lets reading applications inquire the chunks that are available for loading in the backend. This PR adds a function chunk_assignment::assignChunks, taking such a ChunkTable as well as a strategy as input and produces a further ChunkTable that may be used as a load pattern.

Implemented strategies:

  • Round Robin: Assign chunks to reading processes in turn.
  • by hostname: Assign chunks to processes on the same host. Parameterized by the interior distribution strategy.
  • by cuboid slice: Assign chunks according to a slicing into n_ranks hyperslabs of the data space.
  • by binpacking: Assign chunks in an approximatively balanced manner via an approximation algorithm for the NP complete binpacking problem.

Also, add an attribute to the global Series object (currently called rankMetaInfo, I'm sure we'll change this) that can be used to write string-formatted meta information for the single writing processes and assign meaning to them (e.g. hostnames). Used by the hostname-based strategy.

TODO:

  • Python bindings
  • Testing, mostly done manually so far
  • Cleanup
  • Documentation
  • This PR adds many MPI-based methods to the Series class. It would probably be good to first separate MPI and non-MPI headers, and then to add this functionality to the MPI headers.

@franzpoeschel franzpoeschel changed the title Chunk distribution strategies [WIP] Chunk distribution strategies Nov 23, 2020
@lgtm-com
Copy link
Contributor

lgtm-com bot commented Nov 23, 2020

This pull request introduces 1 alert when merging 96841f9 into b2664b7 - view on LGTM.com

new alerts:

  • 1 for Catching by value

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Nov 24, 2020

This pull request introduces 1 alert when merging 6da1db3 into a6287fc - view on LGTM.com

new alerts:

  • 1 for Catching by value

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Feb 23, 2021

This pull request introduces 1 alert when merging 43b1e31 into f6b0054 - view on LGTM.com

new alerts:

  • 1 for Catching by value

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Feb 23, 2021

This pull request introduces 1 alert when merging 5f2b347 into f6b0054 - view on LGTM.com

new alerts:

  • 1 for Catching by value

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Mar 8, 2021

This pull request introduces 1 alert when merging bb7fa45 into 9b349eb - view on LGTM.com

new alerts:

  • 1 for Catching by value

@franzpoeschel franzpoeschel force-pushed the topic-chunk-distribution branch 3 times, most recently from 8c7fc78 to 9fc6d6b Compare March 9, 2021 17:09
@franzpoeschel
Copy link
Contributor Author

@ax3l ping

@ax3l ax3l self-requested a review April 1, 2021 05:47
@lgtm-com
Copy link
Contributor

lgtm-com bot commented Apr 26, 2021

This pull request introduces 1 alert when merging ebea6cf into 86a15ba - view on LGTM.com

new alerts:

  • 1 for Unused local variable

@franzpoeschel franzpoeschel force-pushed the topic-chunk-distribution branch 2 times, most recently from 993cfb6 to 0fca8b7 Compare April 27, 2021 08:06
@franzpoeschel
Copy link
Contributor Author

I've added Python bindings now, including usage of one distribution strategy in openpmd-pipe as an integrated usage example.

@franzpoeschel
Copy link
Contributor Author

franzpoeschel commented Aug 18, 2021

One issue of this approach is that the attribute rankMetaInfo is a vector of as many strings as there are parallel writers which I suspect is part of the reason why we saw scaling issues in some benchmarks.
Instead, a parallel dataset should preferrably be written.

EDIT: I've transformed this to a parallel dataset now. Needs some more checking for char portability. Also, we should probably buffer these values on reading, since the table can only be read in steps in which it was written. In short: Make this less like an attribute that can be set and inquired, and rather more automatic during Series construction.

@franzpoeschel franzpoeschel force-pushed the topic-chunk-distribution branch 3 times, most recently from 62758f5 to 86d5a8c Compare August 19, 2021 08:51
@franzpoeschel franzpoeschel force-pushed the topic-chunk-distribution branch 3 times, most recently from ae23675 to cc87760 Compare February 6, 2024 09:42
@franzpoeschel franzpoeschel force-pushed the topic-chunk-distribution branch 2 times, most recently from 37d0a0e to 8b1a41d Compare February 14, 2024 10:13
@franzpoeschel franzpoeschel force-pushed the topic-chunk-distribution branch 2 times, most recently from 166b3be to d734636 Compare February 28, 2024 13:53
@@ -1044,6 +1131,53 @@
}
#endif

#if openPMD_HAVE_MPI
TEST_CASE("unavailable_backend", "[core][parallel]")

Check notice

Code scanning / CodeQL

Unused static function Note test

Static function C_A_T_C_H_T_E_S_T_22 is unreachable (
autoRegistrar23
must be removed at the same time)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: new additions to the API discussion
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant