I believe the error is caused by an out-of-bounds memory access.
In transfer_engine_bench, a 1GB buffer is allocated by default.
According to the offset calculation, the total memory accessed is batch_size * block_size * threads.
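In your case, with block_size = 65536, max_offset = 1024 * 65536 * 16 = 1GB, which exactly fills the buffer. With block_size = 65537, the total is 1024 * 65537 * 16 = 1,073,758,208 bytes, which exceeds the 1GB buffer (1,073,741,824 bytes) by 16,384 bytes and leads to this error.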
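To double-check the arithmetic, here is a minimal Python sketch of the bound described above. It is not code from the Mooncake repo: the function names are hypothetical, the 1GB buffer is taken to be 2^30 bytes, and the values 1024 and 16 are assigned to batch_size and threads simply by following the order of the formula (the product is the same either way).

```python
# Sketch only: reproduces the arithmetic from the comment above,
# not the actual offset logic in transfer_engine_bench.

BUFFER_SIZE = 1 << 30  # assumed: the default 1GB buffer is 2**30 bytes


def max_offset(batch_size: int, block_size: int, threads: int) -> int:
    """Total bytes touched, per the formula batch_size * block_size * threads."""
    return batch_size * block_size * threads


def fits_in_buffer(batch_size: int, block_size: int, threads: int,
                   buffer_size: int = BUFFER_SIZE) -> bool:
    """True when every access stays inside the buffer."""
    return max_offset(batch_size, block_size, threads) <= buffer_size


if __name__ == "__main__":
    for block_size in (65536, 65537):
        total = max_offset(1024, block_size, 16)
        status = "ok" if fits_in_buffer(1024, block_size, 16) else "out of bounds"
        print(f"block_size={block_size}: {total} bytes, {status}")
```

If this reading is right, any block_size above 65536 needs either a larger buffer or a smaller batch_size or thread count to stay in bounds.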
We use commit 0d9e226 of the Mooncake repo and run "transfer_engine_bench" on 2 nodes: h1 is the server side and h2 is the client side. Our scripts are here:
Other configuration:
When we set --block_size=65536, everything is OK. But setting it to --block_size=65537 or larger causes an error:
I just have no idea.