feat: Binary value type for optimized binary arrays #6
The original UBJSON solution for binary data was an array of `uint8` values. While this does sufficiently address the encoding of such data in the UBJSON format, it does not allow parsers to differentiate between a generic list of numbers and binary data.

When dealing with large quantities of binary data this can have a significant negative impact on performance, as many languages provide optimized storage for binary data that is much more efficient than a standard array.
In the nlohmann C++ JSON library, for example, a standard array can require 16 bytes per byte of data, while an optimized binary format would require exactly one.
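
The same kind of gap can be observed outside C++. The following is a rough Python sketch, not a measurement of the nlohmann library: it compares a generic list of integers with a `bytearray` holding the same values. The variable names are illustrative, and the exact sizes depend on the interpreter and platform.

```python
import sys

# 1 MiB of sample binary data.
payload = bytes(range(256)) * 4096

# Generic representation: a list holding one reference per element.
as_list = list(payload)

# Optimized representation: one contiguous byte per element.
as_bytes = bytearray(payload)

# getsizeof reports the container itself. For the list this is the table of
# references only, so it understates the true cost of a generic array.
list_size = sys.getsizeof(as_list)
bytes_size = sys.getsizeof(as_bytes)

print(f"list:      {list_size / len(payload):.2f} bytes per element")
print(f"bytearray: {bytes_size / len(payload):.2f} bytes per element")
```

Both containers hold identical values; only the storage strategy differs, which is the distinction a parser cannot make when binary data arrives as a plain `uint8` array.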
The introduction of the other unsigned data types in BJData furthers the need for a dedicated byte type. `uint8` is no longer the lone unsigned data type, and for parsers to treat `uint8` arrays differently, as suggested in the UBJSON solution, would lead to further confusion.

This proposal aims to address this issue with the introduction of a dedicated `byte` (`B`) type. This type would be identical to a `uint8`, but would be explicitly recommended for serializers/parsers to implement as an optimized data format type. Where such a type is not available, or parsers have not been upgraded to support the format, a standard integer array can be used instead.

C++ provides `std::vector<std::byte>` or `std::vector<uint8_t>`, JavaScript provides `Uint8Array`, Dart provides `Uint8List`, and Python provides `bytearray`.

UBJSON also states:
This solution does not fundamentally add any complexity, and without it many may be forced to use these other data formats along with all their baggage in order to achieve the desired efficiency.
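
To make the proposal concrete, here is a minimal Python sketch of how a serializer and parser might treat the `B` marker. It assumes the strongly typed array layout `[$<type>#<count>` followed by the raw payload, little-endian counts, and the BJData count markers `U`, `u`, `m` and `M`; the function names `encode_binary` and `decode_array` are hypothetical and not part of any existing library.

```python
import struct

BYTE_MARKER = b"B"    # proposed dedicated byte type
UINT8_MARKER = b"U"   # existing uint8 type

# Count encodings, smallest first: (marker, struct format, largest value).
# Assumed from BJData's unsigned integer markers, little-endian.
_COUNT_TYPES = (
    (b"U", "<B", 0xFF),
    (b"u", "<H", 0xFFFF),
    (b"m", "<I", 0xFFFFFFFF),
    (b"M", "<Q", 0xFFFFFFFFFFFFFFFF),
)


def _encode_count(n):
    """Encode an element count using the smallest unsigned type that fits."""
    for marker, fmt, limit in _COUNT_TYPES:
        if 0 <= n <= limit:
            return marker + struct.pack(fmt, n)
    raise ValueError("count out of range")


def _decode_count(buf, pos):
    """Return (count, position just past the count) starting at buf[pos]."""
    marker = buf[pos:pos + 1]
    for candidate, fmt, _ in _COUNT_TYPES:
        if marker == candidate:
            (n,) = struct.unpack_from(fmt, buf, pos + 1)
            return n, pos + 1 + struct.calcsize(fmt)
    raise ValueError("unsupported count marker")


def encode_binary(data):
    """Serialize binary data as a strongly typed array of type B."""
    return b"[$" + BYTE_MARKER + b"#" + _encode_count(len(data)) + bytes(data)


def decode_array(buf):
    """Decode a strongly typed B or U array.

    B arrays come back as bytes (the optimized type); U arrays come back
    as a generic list of integers.
    """
    if buf[:2] != b"[$" or buf[3:4] != b"#":
        raise ValueError("not a strongly typed array")
    elem_type = buf[2:3]
    count, offset = _decode_count(buf, 4)
    body = buf[offset:offset + count]
    if len(body) != count:
        raise ValueError("truncated payload")
    if elem_type == BYTE_MARKER:
        return bytes(body)
    if elem_type == UINT8_MARKER:
        return list(body)
    raise ValueError("unsupported element type")
```

Under these assumptions the `B` and `U` payloads are byte-for-byte identical on the wire; only the type marker differs, so a parser without an optimized binary type could decode a `B` array with the same code path it uses for `U` and return a plain integer array.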