GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x #48180
kou
left a comment
Could you fix the lint failure?
https://github.com/apache/arrow/actions/runs/19504115732/job/55873075753?pr=48180#step:6:84
diff --git a/cpp/src/arrow/compute/util.cc b/cpp/src/arrow/compute/util.cc
index 66c48631dc..3b671db021 100644
--- a/cpp/src/arrow/compute/util.cc
+++ b/cpp/src/arrow/compute/util.cc
@@ -325,10 +325,11 @@ void bytes_to_bits(int64_t hardware_flags, const int num_bits, const uint8_t* by
bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
#else
if (tail == 8) {
- bytes_next = util::SafeLoad(reinterpret_cast<const uint64_t*>(bytes + num_bits - tail));
+ bytes_next =
+ util::SafeLoad(reinterpret_cast<const uint64_t*>(bytes + num_bits - tail));
} else {
- // On Big-endian systems, for bytes_to_bits, load all tail bytes in little-endian order
- // to ensure compatibility with subsequent bit operations
+ // On Big-endian systems, for bytes_to_bits, load all tail bytes in little-endian
+ // order to ensure compatibility with subsequent bit operations
bytes_next = 0;
for (int i = 0; i < tail; ++i) {
bytes_next |= static_cast<uint64_t>((bytes + num_bits - tail)[i]) << (8 * i);
You can use nice pre-commit run --show-diff-on-failure --color=always --all-files cpp.
#endif
  ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
  if (num_bytes == 8) {
    return util::SafeLoad(reinterpret_cast<const uint64_t*>(bytes));
Does this work on a big-endian system?
Thanks for pointing this out. With the way we now handle tail_bytes and load the word data, we don't actually need to change SafeLoadUpTo8Bytes(): because of the conditional compilation, that function is never called on big-endian architectures.
I have reverted this change, tested fully on s390x to check that all the tests work, and pushed a new commit. Please share your review comments. Thanks.
#if ARROW_LITTLE_ENDIAN
  uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
#else
  int tail_bytes = (tail + 7) / 8;
  uint64_t word;
  if (tail_bytes == 8) {
    word = util::SafeLoad(reinterpret_cast<const uint64_t*>(bits_tail));
  } else {
    // For bit manipulation, always load into least significant bits
    // to ensure compatibility with CountTrailingZeros on Big-endian systems
    word = 0;
    for (int i = 0; i < tail_bytes; ++i) {
      word |= static_cast<uint64_t>(bits_tail[i]) << (8 * i);
    }
  }
#endif
Why do we need this?
The SafeLoadUpTo8Bytes() change adds support for big-endian, right?
Now that I have removed the big-endian support from SafeLoadUpTo8Bytes(), these changes are required because they control how tail_bytes is handled on big-endian systems. If tail_bytes equals 8, we call SafeLoad directly to load the data into word. In all other cases, we have to load the bytes into the least significant bits to stay compatible with CountTrailingZeros. That is why we can't simply call SafeLoadUpTo8Bytes() for every value of tail_bytes.
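To illustrate the point about CountTrailingZeros, here is a minimal standalone sketch, not the Arrow code itself. The helper names LoadTailLittleEndianOrder and CountTrailingZerosPortable are hypothetical stand-ins. It assumes only what the snippet above shows: bytes are combined with shifts, so bytes[0] always ends up in the least significant byte of the word, whatever the host byte order.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Arrow): assemble up to 8 bytes so that
// bytes[0] occupies the least significant byte of the result. Shifts operate
// on values, not on memory layout, so the result is the same on any host.
inline uint64_t LoadTailLittleEndianOrder(const uint8_t* bytes, int num_bytes) {
  assert(num_bytes >= 0 && num_bytes <= 8);
  uint64_t word = 0;
  for (int i = 0; i < num_bytes; ++i) {
    word |= static_cast<uint64_t>(bytes[i]) << (8 * i);
  }
  return word;
}

// Portable stand-in for a count-trailing-zeros primitive.
// The input must be nonzero.
inline int CountTrailingZerosPortable(uint64_t word) {
  assert(word != 0);
  int count = 0;
  while ((word & 1) == 0) {
    word >>= 1;
    ++count;
  }
  return count;
}
```

For the bytes {0x00, 0x10, 0x01} the loaded word is 0x011000, and the trailing-zero count is 12, which is bit 4 of byte 1 (8 + 4). That is the index of the first set bit in the bitmap, which is what the bit-scanning code relies on.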
I have fixed the lint errors and pushed my changes. Thanks.
Force-pushed from d80ee37 to ec51da8
Vishwanatha-HD
left a comment
I have addressed all the review comments. Please re-review. Thanks.
#if ARROW_LITTLE_ENDIAN
  bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
#else
  if (tail == 8) {
Does this happen?
I think that tail can never be 8, because unroll is always 8 and tail = num_bits % unroll.
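The arithmetic behind this question can be checked with a small sketch. It assumes, as stated in the comment above, that unroll is fixed at 8 and that tail is computed as num_bits % unroll. The names kUnroll and TailLength are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Assumption taken from the review comment: the unroll factor is always 8.
constexpr int kUnroll = 8;

// Hypothetical helper: number of leftover elements after processing
// full groups of kUnroll. For non-negative num_bits the result is in [0, 7].
inline int TailLength(int num_bits) {
  assert(num_bits >= 0);
  return num_bits % kUnroll;
}
```

Under these assumptions the remainder is always in the range 0 to 7, so a `tail == 8` branch would be unreachable.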
kou
left a comment
Does this work?
diff --git a/cpp/src/arrow/compute/util.cc b/cpp/src/arrow/compute/util.cc
index b90b3a6405..163a80d9d4 100644
--- a/cpp/src/arrow/compute/util.cc
+++ b/cpp/src/arrow/compute/util.cc
@@ -30,33 +30,41 @@ namespace util {
namespace bit_util {
inline uint64_t SafeLoadUpTo8Bytes(const uint8_t* bytes, int num_bytes) {
- // This will not be correct on big-endian architectures.
-#if !ARROW_LITTLE_ENDIAN
- ARROW_DCHECK(false);
-#endif
ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
if (num_bytes == 8) {
- return util::SafeLoad(reinterpret_cast<const uint64_t*>(bytes));
+ auto word = util::SafeLoad(reinterpret_cast<const uint64_t*>(bytes));
+#if !ARROW_LITTLE_ENDIAN
+ word = bit_util::ByteSwap(word);
+#endif
+ return word;
} else {
uint64_t word = 0;
for (int i = 0; i < num_bytes; ++i) {
+#if ARROW_LITTLE_ENDIAN
word |= static_cast<uint64_t>(bytes[i]) << (8 * i);
+#else
+ word |= static_cast<uint64_t>(bytes[num_bytes - 1 - i]) << (8 * i);
+#endif
}
return word;
}
}
inline void SafeStoreUpTo8Bytes(uint8_t* bytes, int num_bytes, uint64_t value) {
- // This will not be correct on big-endian architectures.
-#if !ARROW_LITTLE_ENDIAN
- ARROW_DCHECK(false);
-#endif
ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
if (num_bytes == 8) {
+#if ARROW_LITTLE_ENDIAN
util::SafeStore(reinterpret_cast<uint64_t*>(bytes), value);
+#else
+ util::SafeStore(reinterpret_cast<uint64_t*>(bytes), bit_util::ByteSwap(value));
+#endif
} else {
for (int i = 0; i < num_bytes; ++i) {
+#if ARROW_LITTLE_ENDIAN
bytes[i] = static_cast<uint8_t>(value >> (8 * i));
+#else
+ bytes[i] = static_cast<uint8_t>(value >> (8 * (num_bytes - 1 - i)));
+#endif
}
}
}
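One way to explore the question is to run both partial-load and partial-store branches of the proposed diff on a single machine. The sketch below is only an illustration under stated assumptions. It replaces the ARROW_LITTLE_ENDIAN macro with a hypothetical runtime flag, little_endian_branch, and the names LoadPartial and StorePartial are invented for this example. It covers only the num_bytes < 8 path, not the SafeLoad/ByteSwap path.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical runtime version of the partial-load loop from the diff above.
// little_endian_branch selects which #if branch is mimicked.
inline uint64_t LoadPartial(const uint8_t* bytes, int num_bytes,
                            bool little_endian_branch) {
  assert(num_bytes >= 0 && num_bytes < 8);
  uint64_t word = 0;
  for (int i = 0; i < num_bytes; ++i) {
    const int src = little_endian_branch ? i : num_bytes - 1 - i;
    word |= static_cast<uint64_t>(bytes[src]) << (8 * i);
  }
  return word;
}

// Hypothetical runtime version of the partial-store loop from the diff above.
inline void StorePartial(uint8_t* bytes, int num_bytes, uint64_t value,
                         bool little_endian_branch) {
  assert(num_bytes >= 0 && num_bytes < 8);
  for (int i = 0; i < num_bytes; ++i) {
    const int shift = little_endian_branch ? 8 * i : 8 * (num_bytes - 1 - i);
    bytes[i] = static_cast<uint8_t>(value >> shift);
  }
}
```

For the input bytes {0x01, 0x02, 0x03}, the little-endian branch yields 0x030201 and the big-endian branch yields 0x010203. Each branch round-trips through its matching store, but the two loads do not produce the same numeric value. Callers that index bits from the least significant end, such as code using CountTrailingZeros, would therefore see different bit positions depending on the branch.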
Mostly looks good to me, just one thought after reading through the recent back-and-forth. Given the updated handling of tail bytes and the SafeLoadUpTo8Bytes discussion, I think this PR's direction still makes sense. I'd just double-check that the tail == 8 path really can't happen with the current unroll logic, since @kou raised that question. Willing to help test once the approach is finalized.
Rationale for this change
This PR is intended to enable Parquet DB support on big-endian (s390x) systems. It fixes the failure of the "arrow-acero-asof-join-node-test" test case, which aborted with a core dump on big-endian platforms.
What changes are included in this PR?
The fix changes "util.cc" to address the abort/core dump.
Are these changes tested?
Yes. The changes were tested on s390x to make sure everything works, and on x86 to make sure no regression is introduced.
Are there any user-facing changes?
No