bfloat16: byteswap does not swap bytes

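Calling byteswap() on a NumPy array with the ml_dtypes bfloat16 datatype returns the bytes unchanged, while the same call on a uint16 array swaps them as expected. Here is my reproducer: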
```
import ml_dtypes
import numpy as np
import sys

print ('sys.version: {}'.format (sys.version))
print ('sys.byteorder: {}'.format (sys.byteorder))
print ('ml_dtypes.__version__: {}'.format (ml_dtypes.__version__))
print ('np.__version__: {}'.format (np.__version__))
print ('----------------')

print ('Input bytes:')
raw_data=bytes.fromhex('f5 3e f6 3e 00 3f 52 3f f1 3e 51 3f 58 3e 39 3f c0 7f 80 7f 80 7f 80 ff')
print (' '.join('{:02x}'.format(x) for x in raw_data))
print ('----------------')

np_dtype = np.dtype(ml_dtypes.bfloat16)
np_arr = np.frombuffer(raw_data, dtype=np_dtype)
np_arr_swapped = np_arr.byteswap()
np_arr_uint16 = np.frombuffer(raw_data, dtype=np.uint16)
np_arr_uint16_swapped = np_arr_uint16.byteswap()

print ('np_arr:')
print (np_arr)
print ('np_arr_swapped:')
print (np_arr_swapped)
print ('----------------')

print ('np_arr.tobytes():')
np_arr_bytes = np_arr.tobytes()
print (' '.join('{:02x}'.format(x) for x in np_arr_bytes))

print ('np_arr_swapped.tobytes():')
np_arr_swapped_bytes = np_arr_swapped.tobytes()
print (' '.join('{:02x}'.format(x) for x in np_arr_swapped_bytes))

print ('np_arr_uint16.tobytes():')
print (' '.join('{:02x}'.format(x) for x in np_arr_uint16.tobytes()))

print ('np_arr_uint16_swapped.tobytes():')
print (' '.join('{:02x}'.format(x) for x in np_arr_uint16_swapped.tobytes()))
print ('----------------')

i = 0
errors = 0
while i < len(raw_data):
    if i % 2 == 0:
        j = i + 1
    else:
        j = i - 1
    if np_arr_swapped_bytes[i] == np_arr_bytes[j]:
        msg = 'OK'
    else:
        msg = 'ERROR'
        errors += 1
    print ('np_arr_swapped_bytes[{}]={:02x} == np_arr_bytes[{}]={:02x} => {}'
           .format (i, np_arr_swapped_bytes[i], j, np_arr_bytes[j], msg))
    i += 1

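# Use sys.exit() (sys is already imported); the bare exit() helper is meant for interactive use.
if errors == 0:
    print ('=> SUCCESS')
    sys.exit (0)
else:
    print ('=> ERRORS')
    sys.exit (1)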

```

It fails on both x86_64 (little-endian) and s390x (big-endian; only sys.byteorder differs there). Output from the little-endian run:
```
sys.version: 3.13.5 (main, Jun 12 2025, 00:00:00) [GCC 15.1.1 20250521 (Red Hat 15.1.1-2)]
sys.byteorder: little
ml_dtypes.__version__: 0.5.3
np.__version__: 2.3.2
----------------
Input bytes:
f5 3e f6 3e 00 3f 52 3f f1 3e 51 3f 58 3e 39 3f c0 7f 80 7f 80 7f 80 ff
----------------
np_arr:
[0.478516 0.480469 0.5 0.820312 0.470703 0.816406 0.210938 0.722656 nan
 inf inf -inf]
np_arr_swapped:
[0.478516 0.480469 0.5 0.820312 0.470703 0.816406 0.210938 0.722656 nan
 inf inf -inf]
----------------
np_arr.tobytes():
f5 3e f6 3e 00 3f 52 3f f1 3e 51 3f 58 3e 39 3f c0 7f 80 7f 80 7f 80 ff
np_arr_swapped.tobytes():
f5 3e f6 3e 00 3f 52 3f f1 3e 51 3f 58 3e 39 3f c0 7f 80 7f 80 7f 80 ff
np_arr_uint16.tobytes():
f5 3e f6 3e 00 3f 52 3f f1 3e 51 3f 58 3e 39 3f c0 7f 80 7f 80 7f 80 ff
np_arr_uint16_swapped.tobytes():
3e f5 3e f6 3f 00 3f 52 3e f1 3f 51 3e 58 3f 39 7f c0 7f 80 7f 80 ff 80
----------------
np_arr_swapped_bytes[0]=f5 == np_arr_bytes[1]=3e => ERROR
np_arr_swapped_bytes[1]=3e == np_arr_bytes[0]=f5 => ERROR
np_arr_swapped_bytes[2]=f6 == np_arr_bytes[3]=3e => ERROR
np_arr_swapped_bytes[3]=3e == np_arr_bytes[2]=f6 => ERROR
np_arr_swapped_bytes[4]=00 == np_arr_bytes[5]=3f => ERROR
np_arr_swapped_bytes[5]=3f == np_arr_bytes[4]=00 => ERROR
np_arr_swapped_bytes[6]=52 == np_arr_bytes[7]=3f => ERROR
np_arr_swapped_bytes[7]=3f == np_arr_bytes[6]=52 => ERROR
np_arr_swapped_bytes[8]=f1 == np_arr_bytes[9]=3e => ERROR
np_arr_swapped_bytes[9]=3e == np_arr_bytes[8]=f1 => ERROR
np_arr_swapped_bytes[10]=51 == np_arr_bytes[11]=3f => ERROR
np_arr_swapped_bytes[11]=3f == np_arr_bytes[10]=51 => ERROR
np_arr_swapped_bytes[12]=58 == np_arr_bytes[13]=3e => ERROR
np_arr_swapped_bytes[13]=3e == np_arr_bytes[12]=58 => ERROR
np_arr_swapped_bytes[14]=39 == np_arr_bytes[15]=3f => ERROR
np_arr_swapped_bytes[15]=3f == np_arr_bytes[14]=39 => ERROR
np_arr_swapped_bytes[16]=c0 == np_arr_bytes[17]=7f => ERROR
np_arr_swapped_bytes[17]=7f == np_arr_bytes[16]=c0 => ERROR
np_arr_swapped_bytes[18]=80 == np_arr_bytes[19]=7f => ERROR
np_arr_swapped_bytes[19]=7f == np_arr_bytes[18]=80 => ERROR
np_arr_swapped_bytes[20]=80 == np_arr_bytes[21]=7f => ERROR
np_arr_swapped_bytes[21]=7f == np_arr_bytes[20]=80 => ERROR
np_arr_swapped_bytes[22]=80 == np_arr_bytes[23]=ff => ERROR
np_arr_swapped_bytes[23]=ff == np_arr_bytes[22]=80 => ERROR
=> ERRORS
```
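The uint16 lines in the output above suggest a possible workaround until this is fixed: reinterpret the array as an unsigned integer of the same width, swap that, and view the result as the original dtype again. The following is a minimal sketch, not part of ml_dtypes; byteswap_via_uint_view is a hypothetical helper name, and the demo uses np.float16 as a stand-in so that it runs without ml_dtypes installed. I would expect the same call to work for a bfloat16 array, but treat that as an assumption.

```python
import numpy as np


def byteswap_via_uint_view(arr):
    """Return a byte-swapped copy of arr (hypothetical helper, not an ml_dtypes API).

    The array is reinterpreted as an unsigned integer of the same item size,
    so the swap is performed by NumPy's built-in integer code path.
    """
    uint_dtype = np.dtype(f"u{arr.dtype.itemsize}")  # e.g. 'u2' for 2-byte items
    # byteswap() returns a swapped copy; the final view restores the original dtype.
    return arr.view(uint_dtype).byteswap().view(arr.dtype)


# Demo with float16 as a stand-in for a 2-byte custom float type.
raw = bytes.fromhex("f5 3e f6 3e")
arr = np.frombuffer(raw, dtype=np.float16)
swapped = byteswap_via_uint_view(arr)
print(swapped.tobytes().hex(" "))  # prints: 3e f5 3e f6
```

This only sidesteps the problem on the caller's side; byteswap() on the bfloat16 array itself is still affected.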

Debugging with gdb showed that the call ends up in ml_dtypes/_src/custom_float.h:NPyCustomFloat_CopySwapN() with swap=1 and src set to NULL. Because src is NULL, the function returns immediately without swapping, instead of swapping dst in place. From reading the sources, the same applies to NPyCustomFloat_CopySwap().
In theory the same also applies to ml_dtypes/_src/intn_numpy.h, but I think there are no datatypes of 2 or more bytes there.
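To make the expected behavior concrete, here is a small pure-Python model of a copyswapn-style routine. It is only an illustration of the semantics described above, not the ml_dtypes or NumPy implementation; copyswapn_model and its parameters are hypothetical names, and buffers are modelled as bytearray objects with None standing in for a NULL src.

```python
def copyswapn_model(dst, src, n, itemsize, swap):
    """Hypothetical model: copy n items from src to dst, then optionally swap.

    When src is None (NULL in C), no copy takes place, but a requested swap
    is still applied to dst in place. That second part is the step the
    observed behavior appears to skip.
    """
    nbytes = n * itemsize
    if src is not None:
        dst[:nbytes] = src[:nbytes]
    if swap:
        for k in range(n):
            start = k * itemsize
            # Reverse the bytes of item k within dst.
            dst[start:start + itemsize] = dst[start:start + itemsize][::-1]


# In-place swap: src is None, swap is requested.
buf = bytearray.fromhex("f5 3e f6 3e")
copyswapn_model(buf, None, n=2, itemsize=2, swap=True)
print(buf.hex(" "))  # prints: 3e f5 3e f6
```

With src given, the same routine copies first and then swaps dst, so both call patterns end with swapped bytes in dst.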
