HDF5IOHandler: Support for float128 on ARM64/PPC64 #1364

skalarproduktraum · 2023-01-20T23:19:25Z

This PR (hopefully) addresses #1363 by adding h5py-compatible float128 data types in HDF5IOHandler. The original issue seems to originate from an AMD64 vs. ARM64/PPC64 difference in the handling of long double types. On AMD64, a 128-bit float with 80 bits of precision maps directly to H5T_NATIVE_LDOUBLE; on ARM64/PPC64 it does not.

For the time being, I have also added some more debug output to the else branch where an unknown type is encountered, such that the HDF5 type info can be more easily recovered.

The PR currently still hits a "Heap corruption detected, free list is damaged" error, though the attributes that originally failed to read now read correctly. Maybe @franzpoeschel has an idea there 👍

skalarproduktraum · 2023-01-21T00:19:42Z

Figured out how to work around the heap corruption. The issue is that on macOS/ARM64, sizeof(long double) is 8, which I did not anticipate, but which apparently has its reasons in ABI compatibility...

franzpoeschel · 2023-01-23T10:47:25Z

Hey Ulrik,
thank you for this! The issue on Summit seems to be a different one: the dataset is still not readable with this PR, but I can have a look at that separately. I'll add a test for this PR and add support for float128 to the other places in the HDF5 backend where it's relevant.

franzpoeschel · 2023-01-23T10:50:41Z

Aaah, Monday has caught me again. After using the correct branch, this also solves the issue on Summit/PPC64.

franzpoeschel · 2023-01-23T11:00:27Z

Hmm, we'll probably also need complex 128bit double types then..

skalarproduktraum · 2023-01-23T11:44:53Z

This is actually quite an interesting issue. I had completely forgotten that x86 uses 80-bit precision for long double. So would it actually make more sense to not use H5T_NATIVE_LDOUBLE at all?
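
For illustration only, here is a minimal sketch of the on-disk layout being discussed, assuming each value is a little-endian x87 80-bit extended float padded to a 16-byte slot. The names decodeFloat80LE, decodeBuffer, and kSlotSize are hypothetical and belong to neither openPMD-api nor HDF5, and this thread does not show the PR decoding bytes by hand. The result is narrowed to double, so precision beyond 53 bits is lost and very large exponents overflow to infinity.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Assumption: every element occupies 16 bytes, of which the first 10 are used.
constexpr std::size_t kSlotSize = 16;

// Decode one little-endian x87 80-bit extended value (hypothetical helper).
double decodeFloat80LE(unsigned char const *slot)
{
    // Bytes 0..7: 64-bit significand with an explicit integer bit (bit 63).
    std::uint64_t significand = 0;
    for (int i = 7; i >= 0; --i)
        significand = (significand << 8) | slot[i];

    // Bytes 8..9: sign bit (bit 15) and 15-bit exponent, biased by 16383.
    auto const signAndExponent =
        static_cast<std::uint16_t>(slot[8] | (slot[9] << 8));
    bool const negative = (signAndExponent & 0x8000u) != 0;
    int const exponent = signAndExponent & 0x7FFF;

    double magnitude = 0.0;
    if (exponent == 0x7FFF)
    {
        // Fraction bits all zero means infinity, anything else is NaN.
        magnitude = (significand << 1) == 0
            ? std::numeric_limits<double>::infinity()
            : std::numeric_limits<double>::quiet_NaN();
    }
    else if (significand != 0)
    {
        // Denormals store exponent 0 but scale like exponent 1.
        int const unbiased = (exponent == 0 ? 1 : exponent) - 16383;
        magnitude =
            std::ldexp(static_cast<double>(significand), unbiased - 63);
    }
    return negative ? -magnitude : magnitude;
}

// Decode a whole raw buffer of 16-byte slots into doubles.
std::vector<double> decodeBuffer(std::vector<unsigned char> const &raw)
{
    assert(raw.size() % kSlotSize == 0);
    std::vector<double> result;
    result.reserve(raw.size() / kSlotSize);
    for (std::size_t offset = 0; offset < raw.size(); offset += kSlotSize)
        result.push_back(decodeFloat80LE(raw.data() + offset));
    return result;
}
```

The sketch only makes the "128 bits of storage, 80 bits of payload" point concrete; it says nothing about how HDF5 converts between types internally.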

franzpoeschel · 2023-01-23T12:25:52Z

So, you would suggest instead to do something like H5Tset_size(m_H5T_LONG_DOUBLE_VARLEN_BE, sizeof(long double));?
Also, is there any reason to have both the big and little endian versions, or could we just use the endianness of the platform instead? (I have not written the HDF5 backend, so I don't know the details of HDF5 here too well..)
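
As a small aside on the endianness question, this is one way to find out the byte order of the host at runtime in portable C++. The function name hostIsLittleEndian is hypothetical and the snippet is independent of HDF5, which has its own notion of byte order for datatypes.

```cpp
#include <cstdint>
#include <cstring>

// Returns true if the least significant byte of an integer is stored first.
// Hypothetical helper for illustration, not part of openPMD-api.
bool hostIsLittleEndian()
{
    std::uint32_t const probe = 1;
    unsigned char firstByte = 0;
    // memcpy avoids aliasing problems when inspecting the object's bytes.
    std::memcpy(&firstByte, &probe, 1);
    return firstByte == 1;
}
```

With C++20, std::endian::native from the <bit> header gives the same answer at compile time.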

franzpoeschel · 2023-01-23T13:18:58Z

I opened a PR with suggestions skalarproduktraum#1

franzpoeschel · 2023-01-23T13:35:54Z

src/IO/HDF5/HDF5IOHandler.cpp

+        else if (H5Tequal(attr_type, m_H5T_LONG_DOUBLE_80_LE))
+        {
+            // worst case, sizeof(long double) is only 8, so allocate enough
+            // memory to fit 16 bytes per member


We should maybe use malloc here to avoid running into aliasing issues?

skalarproduktraum · 2023-01-23

Yep, that makes much more sense. This was just my stop-gap solution for the prototype.

ax3l · 2023-01-25T17:44:53Z

Yes, I think that generally cross-platform long double varies, also see x86-64 Windows...

long double - extended precision floating-point type. Matches IEEE-754 binary128 format if supported, otherwise matches IEEE-754 binary64-extended format if supported, otherwise matches some non-IEEE-754 extended floating-point format as long as its precision is better than binary64 and range is at least as good as binary64, otherwise matches IEEE-754 binary64 format.

binary128 format is used by some HP-UX, SPARC, MIPS, ARM64, and z/OS implementations.

The most well known IEEE-754 binary64-extended format is 80-bit x87 extended precision format. It is used by many x86 and x86-64 implementations (a notable exception is MSVC, which implements long double in the same format as double, i.e. binary64).

So far, we have avoided converting (casting) data during I/O and kept it platform-specific. I am open to suggestions to enable platform-portability, but we need to check if that works on all platforms and is worth the performance hit (ideally, opt-in).
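
To see which of the formats quoted above a given compiler uses, a minimal sketch is to look at the number of mantissa digits reported by std::numeric_limits. The function describeLongDouble is a hypothetical name; the 106-digit case (IBM double-double, the traditional long double on PPC64 Linux) is added from general knowledge and is not mentioned in the quote.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Classify the native long double by its mantissa width (hypothetical helper).
std::string describeLongDouble()
{
    // Sanity check: long double is never narrower than double.
    assert(sizeof(long double) >= sizeof(double));

    switch (std::numeric_limits<long double>::digits)
    {
    case 53:
        return "binary64 (same format as double)";
    case 64:
        return "x87 80-bit extended";
    case 106:
        return "IBM double-double";
    case 113:
        return "IEEE-754 binary128";
    default:
        return "unrecognized format";
    }
}
```

Note that sizeof(long double) alone is not enough to tell the formats apart: the x87 format and binary128 can both occupy 16 bytes.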

ax3l · 2023-02-21T18:25:03Z

src/IO/HDF5/HDF5IOHandler.cpp

@@ -73,6 +73,8 @@ HDF5IOHandlerImpl::HDF5IOHandlerImpl(
    , m_H5T_CFLOAT{H5Tcreate(H5T_COMPOUND, sizeof(float) * 2)}
    , m_H5T_CDOUBLE{H5Tcreate(H5T_COMPOUND, sizeof(double) * 2)}
    , m_H5T_CLONG_DOUBLE{H5Tcreate(H5T_COMPOUND, sizeof(long double) * 2)}
+    , m_H5T_LONG_DOUBLE_80_BE{H5Tcopy(H5T_IEEE_F64BE)}


BE is not really used: virtually all PPC machines in HPC use LE these days :)

franzpoeschel · 2023-05-04T14:27:20Z

@ax3l I have reduced this to little endian and added a comment that this conversion fix specifically targets the AMD64 -> ARM64/PPC64 special case. The workaround is used only if the long double type is not recognized as a native type, so I'd say that the opt-in criterion is sufficiently fulfilled.
Should be ready for review :) (Review should ideally include testing whether an ARM Mac can open the datasets linked in #1363; I don't have one.)

franzpoeschel · 2023-05-04T14:30:42Z

src/IO/HDF5/HDF5Auxiliary.cpp

@@ -98,7 +98,7 @@ hid_t openPMD::GetH5DataType::operator()(Attribute const &att)
        return H5Tcopy(H5T_NATIVE_DOUBLE);
    case DT::LONG_DOUBLE:
    case DT::VEC_LONG_DOUBLE:
-        return H5Tcopy(H5T_NATIVE_LDOUBLE);
+        return H5Tcopy(m_userTypes.at(typeid(long double).name()));


Suggested change

-        return H5Tcopy(m_userTypes.at(typeid(long double).name()));
+        return H5Tcopy(H5T_NATIVE_LDOUBLE);

I think that this change can be reverted again. In writing procedures, we can always use the native datatype. In HDF5IOHandlerImpl::readDataset(), we need to manually check whether the dataset can be read as native long double before attempting the workaround with the manually defined long double type. As a result, it's fine to return native long double here in ::openDataset(). Other reading procedures don't use this.
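
The order of checks described in this comment can be sketched as a small decision function. Everything here is hypothetical naming (LongDoubleReadPath, chooseReadPath); in the backend, the two boolean inputs would presumably come from comparisons such as the H5Tequal(attr_type, m_H5T_LONG_DOUBLE_80_LE) call shown in the diff above, which this sketch deliberately leaves out.

```cpp
// Which strategy to use when reading a long double dataset or attribute.
enum class LongDoubleReadPath
{
    Native,            // the file type is the platform's own long double
    Float80Workaround, // the file type is the manually defined 80-bit type
    Unsupported        // neither type matches
};

// Prefer the native type; fall back to the workaround only when needed.
// Hypothetical helper, not part of openPMD-api.
LongDoubleReadPath
chooseReadPath(bool fileTypeIsNativeLongDouble, bool fileTypeIsFloat80LE)
{
    if (fileTypeIsNativeLongDouble)
        return LongDoubleReadPath::Native;
    if (fileTypeIsFloat80LE)
        return LongDoubleReadPath::Float80Workaround;
    return LongDoubleReadPath::Unsupported;
}
```

Checking the native type first keeps the workaround out of the way on platforms where the file can be read directly.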


ax3l · 2023-06-06T16:47:09Z

src/IO/HDF5/HDF5IOHandler.cpp

+            auto *tmpBuffer =
+                static_cast<long double *>(malloc(16 * 2 * dims[0]));


In C++, I would use new and delete over malloc and free.

franzpoeschel · 2023-06-27

This will need a reinterpret_cast then, though, as you can't new a void pointer. But I'll push a commit.

ax3l · 2023-07-25

We found out today: this can be new long double 😅

franzpoeschel · 2023-07-25

It turns out we cannot 🙃
This branch is active when sizeof(long double) == 8, but the dataset on disk has 16-byte long doubles. There is no way to allocate a correctly sized buffer using native float types, so new char[] it is.
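
A minimal sketch of the point made here: the buffer is sized by the on-disk element width rather than by sizeof(long double), and values are only copied out as native long double when the widths agree. The names kOnDiskElementSize, allocateRawSlots, and tryLoadNative are hypothetical, and std::make_unique stands in for the new char[] mentioned above so that the memory is released automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Assumption taken from the comment above: 16 bytes per element in the file.
constexpr std::size_t kOnDiskElementSize = 16;

// Allocate zero-initialized raw storage for numElements on-disk values.
std::unique_ptr<unsigned char[]> allocateRawSlots(std::size_t numElements)
{
    assert(numElements > 0);
    return std::make_unique<unsigned char[]>(
        numElements * kOnDiskElementSize);
}

// Copy element `index` into `out` if the native long double has the on-disk
// width; return false otherwise, since a plain copy would be wrong then.
bool tryLoadNative(
    unsigned char const *raw, std::size_t index, long double &out)
{
    if (sizeof(long double) != kOnDiskElementSize)
        return false;
    // memcpy sidesteps aliasing and alignment concerns on the byte buffer.
    std::memcpy(&out, raw + index * kOnDiskElementSize, sizeof(long double));
    return true;
}
```

On a platform where sizeof(long double) is 8, tryLoadNative always returns false, which is exactly the situation in which a byte buffer plus an explicit conversion is needed.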

ax3l · 2023-06-27T09:46:06Z

ping @franzpoeschel @skalarproduktraum pls see review comments to finalize :)

We would like to cut the 0.15.2 release soon, let me know if this works to be updated :)

ax3l

Looks awesome, we just want to update the new long double.

franzpoeschel · 2023-07-25T18:14:49Z

Looks awesome, we just want to update the new long double.

I added an update and made a comment a bit clearer

ax3l

Excellent, good point!

github-advanced-security bot found potential problems Jan 20, 2023

skalarproduktraum force-pushed the enhancement/float128-on-arm64 branch from 846b8f7 to f61263b Compare January 21, 2023 00:14

franzpoeschel reviewed Jan 23, 2023

ax3l self-requested a review January 25, 2023 17:39

ax3l added the backend: HDF5 label Jan 25, 2023

ax3l reviewed Feb 21, 2023

franzpoeschel force-pushed the enhancement/float128-on-arm64 branch from 3d1190b to 223f2b3 Compare May 4, 2023 13:35

franzpoeschel marked this pull request as ready for review May 4, 2023 13:40

franzpoeschel force-pushed the enhancement/float128-on-arm64 branch from d711f78 to 223f2b3 Compare May 4, 2023 14:14

franzpoeschel reviewed May 4, 2023

franzpoeschel force-pushed the enhancement/float128-on-arm64 branch 2 times, most recently from 88f9f9d to 3aaecc0 Compare May 11, 2023 09:20

franzpoeschel added this to the 0.15.2 milestone May 11, 2023

franzpoeschel mentioned this pull request May 23, 2023

Cleanup for 128bit double support skalarproduktraum/openPMD-api#1

Closed

2 tasks

skalarproduktraum and others added 5 commits May 24, 2023 14:27

HDF5IOHandler: Support for float128 data types with 80bit precision o…

d9cc0d0

…n ARM64/PPC64

[pre-commit.ci] auto fixes from pre-commit.com hooks

02218fa

for more information, see https://pre-commit.ci

Cleanup

c06322f

Unify little/big endian, use double80 also in other places
Long double size 8 -> 16
Use malloc to avoid alignment issues
Same treatment for complex long double
Add this for readAttribute too
Avoid non-native datatypes in writing

Make this fix little-endian only

43594b5

Add comment

7f023fe

franzpoeschel force-pushed the enhancement/float128-on-arm64 branch from 3aaecc0 to 7f023fe Compare May 24, 2023 12:27

ax3l requested changes Jun 6, 2023

Suggestions from review

59f1a3d

ax3l requested changes Jul 25, 2023

franzpoeschel enabled auto-merge (squash) July 25, 2023 18:14

Use new instead of malloc everywhere

0c53b05

Also, add a more explanative comment

franzpoeschel force-pushed the enhancement/float128-on-arm64 branch from cb44a14 to 0c53b05 Compare July 25, 2023 18:16

ax3l approved these changes Jul 26, 2023

franzpoeschel merged commit e7d7a4b into openPMD:dev Jul 26, 2023
28 checks passed
