Description:
When running a BTD (back-transformed diagnostics) input, WarpX hangs when writing with BP5, which uses PerformDataWrite(). This function must be called collectively, with all ranks writing to the same ADIOS file.
The program hangs when using file-based encoding (i.e. one file per time step). I ran with two ranks. All was fine until, at a certain point:
rank1 needs to write to step 1, the file is diags/diag2/openpmd_000001.bp
rank0 needs to write to step 0, the file is diags/diag2/openpmd_000002.bp
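In short: at step 2, both ranks have the BP5 writer open for file *00002, and BTD needs to write back to step 1 (file *00001) on rank 0 (but not on rank 1). Rank 0 therefore enters a collective call on a file that rank 1 is not writing to, and the call never completes.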
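To make the mismatch concrete, here is a toy model in plain Python (no MPI, no ADIOS2). It only compares the order of collective calls per rank; the call sequences are hypothetical and are not taken from a WarpX trace. A collective can only complete when every rank makes the same call on the same file, so the first position where the sequences differ is where the run would hang.

```python
from itertools import zip_longest


def first_mismatch(schedules):
    """Return (position, calls) for the first collective call that the ranks
    do not all make identically, or None if every rank's sequence matches."""
    for pos, calls in enumerate(zip_longest(*schedules)):
        if len(set(calls)) != 1:
            return pos, calls
    return None


F1 = "openpmd_000001.bp"
F2 = "openpmd_000002.bp"

# Hypothetical per-rank sequences of collective calls, as (operation, file).
file_based = [
    [("open", F2), ("PerformDataWrite", F1)],  # rank 0 writes back to step 1
    [("open", F2), ("PerformDataWrite", F2)],  # rank 1 stays on step 2
]

print(first_mismatch(file_based))
```

In this sketch the ranks agree on the open of *00002 and diverge at the next call, which matches the behavior described above.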
When using PerformPuts() instead of PerformDataWrite(), the run succeeds, because PerformPuts() defers the actual data write until EndStep(), and EndStep() is collective.
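The difference between an eager collective flush and a write deferred to the end of the step can be sketched with a toy writer. This is not the ADIOS2 API: `ToyRankWriter`, `eager_flush`, and `end_step` are made-up names, and the model assumes that every rank reaches the end-of-step call for the same file.

```python
class ToyRankWriter:
    """Toy stand-in for one rank's writer; this is not the ADIOS2 API."""

    def __init__(self):
        self.pending = []      # chunks queued locally, not yet written
        self.written = []      # chunks that reached the "file"
        self.collectives = []  # collective calls made by this rank, in order

    def put(self, chunk):
        self.pending.append(chunk)  # purely local, no synchronisation

    def _drain(self):
        self.written.extend(self.pending)
        self.pending.clear()

    def eager_flush(self):
        # Models a flush that writes immediately and must be collective.
        self.collectives.append("eager_flush")
        self._drain()

    def end_step(self):
        # Models the collective end of a step, which also writes queued data.
        self.collectives.append("end_step")
        self._drain()


def run(eager):
    rank0, rank1 = ToyRankWriter(), ToyRankWriter()
    rank0.put("btd-slice")   # only rank 0 has data to write back
    if eager:
        rank0.eager_flush()  # rank 1 makes no matching call
    rank0.end_step()
    rank1.end_step()
    return rank0.collectives, rank1.collectives, rank0.written
```

With `run(eager=True)` the two ranks end up with different collective sequences, which is the hang condition; with `run(eager=False)` both ranks only make the end-of-step call and the data is still written.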
With group-based encoding (i.e. all steps in one file), it also works, because both ranks write to the same file.
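A small sketch of why the encoding matters: it maps the step each rank is writing to the file that step lands in. The file-based names follow the paths quoted above; the single group-based file name `openpmd.bp` is an assumption made for illustration, as is the helper itself.

```python
def target_file(iteration, encoding):
    """Map an iteration to the file it lands in (illustrative names only)."""
    if encoding == "file":
        return f"diags/diag2/openpmd_{iteration:06d}.bp"
    if encoding == "group":
        return "diags/diag2/openpmd.bp"  # assumed name for the single file
    raise ValueError(f"unknown encoding: {encoding}")


def files_touched(iteration_by_rank, encoding):
    """Set of files that the ranks are writing to at the same moment."""
    return {target_file(it, encoding) for it in iteration_by_rank.values()}


# rank 0 writes back to step 1 while rank 1 is on step 2
writes = {0: 1, 1: 2}

print(files_touched(writes, "file"))   # two different files
print(files_touched(writes, "group"))  # one shared file
```

When the set has more than one entry, a collective write that must target the same file on all ranks cannot be matched; with one shared file that condition does not arise in this model.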
Workaround:
One option is to force BTD to use group-based encoding; alternatively, we could have an option to not call PerformDataWrite() in BP5 for BTD.