
Performance: Flush IOHandler only once, not for each Iteration #1642

Merged
merged 8 commits into openPMD:dev on Jul 11, 2024

Conversation

franzpoeschel
Contributor

@franzpoeschel franzpoeschel commented Jun 27, 2024

This might address a performance regression seen by @guj.

Until now, Series::flushFileBased() and Series::flushGorVBased() flushed the IOHandler for every Iteration. Since flushing has a constant overhead, Series::flush() had linear complexity in the number of Iterations, even if only a single Iteration had modifications.

Fixing this was not entirely trivial: in file-based encoding, some frontend objects must unset their written flag, since they must be written anew for each file. Before this PR, this could easily be done synchronously in the frontend; but since all Iterations are now flushed at once, it must instead be done asynchronously in the backend, as a backend task.

Also speed up the common case of parsing flush options: No options at all.

  • Check if this addresses the performance regressions seen by @guj
  • Check this against PIConGPU; the new written logic might behave weirdly in more complex setups

@franzpoeschel
Contributor Author

I just tested this against @guj's data on Perlmutter and it clearly improves the performance.

@guj
Contributor

guj commented Jun 29, 2024

> I just tested this against @guj's data on Perlmutter and it clearly improves the performance.

Thanks Franz. I just checked out this branch. On my Mac the observed times are pretty much the same as before, but on Perlmutter it saved about 10% (20 seconds).

@franzpoeschel
Contributor Author

No difference at all? I observed a clear system-independent performance regression for issue380_f (/pscratch/sd/j/junmin/perlmutter/Jan2024/Test/issue380/bp5-f/defaultBP5-rank-ews_diags_f/diag1/openpmd_ bp f): the IOHandler is flushed for each Iteration even if no IOTasks are enqueued. The PR fuses all flushes into a single one, thus getting rid of this overhead.

You can further improve performance by using deferred iteration parsing:

```cpp
Series series = Series(fileName, readMode, "defer_iteration_parsing = true");
// ...
auto ex = series.iterations[i].open().meshes["E"]["x"];
```

And even further by closing the Iterations after using them:

```cpp
series.iterations[i].close();
```

Even without these improvements, I see a performance improvement far above just 20 seconds for going through 4000 Iterations.

@guj
Contributor

guj commented Jul 8, 2024

> No difference at all? I observed a clear system-independent performance regression for issue380_f: the IOHandler is flushed for each Iteration even if no IOTasks are enqueued. [...]

Ah, I was running issue380.cpp.

Yes, issue380_f has significant improvement after your fix!!

@franzpoeschel
Contributor Author

Great to hear!

> Ah, I was running issue380.cpp

Does that one have noticeable performance issues that we need to look into?

@guj
Contributor

guj commented Jul 8, 2024

> Does that one have noticeable performance issues that we need to look into?

This fix basically made the time gap between issue380.cpp and issue380_f.cpp disappear, so it would be great to merge it.

```diff
@@ -2071,6 +2071,7 @@ inline void fileBased_write_test(const std::string &backend)
         .makeConstant<double>(1.0);

     o.iterations[overlong_it].setTime(static_cast<double>(overlong_it));
+    o.flush();
```
Contributor Author


Note: This change makes the test stricter, since Series::flush() unlike the destructor will not swallow exceptions.

@franzpoeschel franzpoeschel enabled auto-merge (squash) July 11, 2024 10:17
@franzpoeschel franzpoeschel merged commit bda3544 into openPMD:dev Jul 11, 2024
30 of 31 checks passed
@ax3l ax3l requested a review from guj July 16, 2024 17:08
@ax3l ax3l added this to the 0.16.0 milestone Jul 16, 2024
3 participants