Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDF5IOHandler: Support for float128 on ARM64/PPC64 #1364

Merged

Conversation

skalarproduktraum
Copy link
Contributor

This PR (hopefully) addresses #1363 by adding h5py-compatible float128 data types in HDF5IOHandler. The original issue seems to originate from a AMD64 vs. ARM64/PPC64 difference in the handling of long double types. While on AMD64, 128bit float with 80 bits of precision directly maps to H5T_NATIVE_LDOUBLE, it does not on ARM64/PPC64.

For the time being, I have also added some more debug output to the else branch where an unknown type is encountered, such that the HDF5 type info can be more easily recovered.

The PR currently still has a Heap corruption detected, free list is damaged error, the attributes that failed to read correctly originally read correctly now, though. Maybe @franzpoeschel has an idea there 👍

@skalarproduktraum
Copy link
Contributor Author

Figured to work around the heap corruption. Issue is that on macOS/ARM64, sizeof(long double) is 8, which I did not anticipate, but apparently has its reasons in ABI compatibility...

@franzpoeschel
Copy link
Contributor

Hey Ulrik,
thank you for this! The issue on Summit seems to be another one, the dataset is still not legible after this PR, but I can have a look at that separately. I'll add a test for this PR and add support for float128 to other places in the HDF5 backend where it's relevant.

@franzpoeschel
Copy link
Contributor

Aaah, monday has caught me again. After using the correct branch, this also solves the issue on Summit/PPC64.

@franzpoeschel
Copy link
Contributor

Hmm, we'll probably also need complex 128bit double types then..

@skalarproduktraum
Copy link
Contributor Author

This is actually quite an interesting issue. I had completely forgotten that x86 uses 80bit precision for long double. So would it make actually more sense to not use H5T_NATIVE_LDOUBLE at all?

@franzpoeschel
Copy link
Contributor

So, you would suggest instead to do something like H5Tset_size(m_H5T_LONG_DOUBLE_VARLEN_BE, sizeof(long double));?
Also, is there any reason to have both the big and little endian versions, or could we just use the endianness of the platform instead? (I have not written the HDF5 backend, so I don't know the details of HDF5 here too well..)

@franzpoeschel
Copy link
Contributor

I opened a PR with suggestions skalarproduktraum#1

else if (H5Tequal(attr_type, m_H5T_LONG_DOUBLE_80_LE))
{
// worst case, sizeof(long double) is only 8, so allocate enough
// memory to fit 16 bytes per member
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should maybe use malloc here to avoid running into aliasing issues?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that makes much more sense. This was just my stop-gap solution for the prototype.

@ax3l ax3l self-requested a review January 25, 2023 17:39
@ax3l
Copy link
Member

ax3l commented Jan 25, 2023

Yes, I think that generally cross-platform long double varies, also see x86-64 Windows...

long double - extended precision floating-point type. Matches IEEE-754 binary128 format if supported, otherwise matches IEEE-754 binary64-extended format if supported, otherwise matches some non-IEEE-754 extended floating-point format as long as its precision is better than binary64 and range is at least as good as binary64, otherwise matches IEEE-754 binary64 format.

  • binary128 format is used by some HP-UX, SPARC, MIPS, ARM64, and z/OS implementations.
  • The most well known IEEE-754 binary64-extended format is 80-bit x87 extended precision format. It is used by many x86 and x86-64 implementations (a notable exception is MSVC, which implements long double in the same format as double, i.e. binary64).

So far, we did avoid converting (casting) data during I/O and kept it platform-specific. I am open to suggestions to enable platform-portability, but we need to check if that works on all platforms and is worth the performance hit (ideally, opt-in).

@@ -73,6 +73,8 @@ HDF5IOHandlerImpl::HDF5IOHandlerImpl(
, m_H5T_CFLOAT{H5Tcreate(H5T_COMPOUND, sizeof(float) * 2)}
, m_H5T_CDOUBLE{H5Tcreate(H5T_COMPOUND, sizeof(double) * 2)}
, m_H5T_CLONG_DOUBLE{H5Tcreate(H5T_COMPOUND, sizeof(long double) * 2)}
, m_H5T_LONG_DOUBLE_80_BE{H5Tcopy(H5T_IEEE_F64BE)}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BE is not really used: virtually all PPC machines in HPC use LE these days :)

@franzpoeschel franzpoeschel force-pushed the enhancement/float128-on-arm64 branch from 3d1190b to 223f2b3 Compare May 4, 2023 13:35
@franzpoeschel franzpoeschel marked this pull request as ready for review May 4, 2023 13:40
@franzpoeschel franzpoeschel force-pushed the enhancement/float128-on-arm64 branch from d711f78 to 223f2b3 Compare May 4, 2023 14:14
@franzpoeschel
Copy link
Contributor

franzpoeschel commented May 4, 2023

@ax3l I have reduced this to little endian and added a comment that this conversion fix specifically targets the AMD64 -> ARM64/PPC64 special case. The workaround is used only if the double type is not recognized as a native type, so I'd say that the opt-in criterium is sufficiently fulfilled.
Should be ready for review :) (Reviewing should ideally include testing if an ARM Mac can open the datasets linked here #1363, I don't have one)

@@ -98,7 +98,7 @@ hid_t openPMD::GetH5DataType::operator()(Attribute const &att)
return H5Tcopy(H5T_NATIVE_DOUBLE);
case DT::LONG_DOUBLE:
case DT::VEC_LONG_DOUBLE:
return H5Tcopy(H5T_NATIVE_LDOUBLE);
return H5Tcopy(m_userTypes.at(typeid(long double).name()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return H5Tcopy(m_userTypes.at(typeid(long double).name()));
return H5Tcopy(H5T_NATIVE_LDOUBLE);

I think that the change can be reverted again. In writing procedures, we can always use the native datatype anyway. In HDF5IOHandlerImpl::readDataset() we need to manually check if the dataset can be read as native long double anyway before attempting the workaround with the manually defined long double. As a result, it's fine to return native long double here in ::openDataset(). Other reading procedures don't use this anyway.

@franzpoeschel franzpoeschel force-pushed the enhancement/float128-on-arm64 branch 2 times, most recently from 88f9f9d to 3aaecc0 Compare May 11, 2023 09:20
@franzpoeschel franzpoeschel added this to the 0.15.2 milestone May 11, 2023
skalarproduktraum and others added 5 commits May 24, 2023 14:27
Unify little/big endian, use double80 also in other places

Long double size 8 -> 16

Use malloc to avoid alignment issues

Same treatment for complex long double

Add this for readAttribute too

Avoid non-native datatypes in writing
@franzpoeschel franzpoeschel force-pushed the enhancement/float128-on-arm64 branch from 3aaecc0 to 7f023fe Compare May 24, 2023 12:27
src/IO/HDF5/HDF5IOHandler.cpp Show resolved Hide resolved
src/IO/HDF5/HDF5IOHandler.cpp Show resolved Hide resolved
Comment on lines 2295 to 2296
auto *tmpBuffer =
static_cast<long double *>(malloc(16 * 2 * dims[0]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In C++, I would use new and delete over malloc and free.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need a reinterpret_cast then though, as you can't new a void pointer. But I'll push a commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We found today: can be new long double 😅

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out, we cannot 🙃
This branch is active when sizeof(long double) == 8, but the dataset on disk has 16-byte doubles. There is no way to allocate a correctly-sized buffer using native float types, so new char[] it is.

@ax3l
Copy link
Member

ax3l commented Jun 27, 2023

ping @franzpoeschel @skalarproduktraum pls see review comments to finalize :)

We would like to cut the 0.15.2 release soon, let me know if this works to be updated :)

Copy link
Member

@ax3l ax3l left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks awesome, we just want to update the new long double.

@franzpoeschel franzpoeschel enabled auto-merge (squash) July 25, 2023 18:14
@franzpoeschel
Copy link
Contributor

Looks awesome, we just want to update the new long double.

I added an update and made a comment a bit clearer

Also, add a more explanative comment
Copy link
Member

@ax3l ax3l left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, good point!

@franzpoeschel franzpoeschel merged commit e7d7a4b into openPMD:dev Jul 26, 2023
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants