[C++] parquet-reader cannot display Decimal column stats when precision > 38 #47596

@pitrou

Describe the bug, including details regarding any error messages, version, and platform.

Snippet:

$ /build/build-test/debug/parquet-reader --only-metadata /tmp/pqfuzz/pq-table-1
...
Column 5: col_6 (FIXED_LEN_BYTE_ARRAY(11) / Decimal(precision=24, scale=7) / DECIMAL(24,7))
Column 6: col_7 (FIXED_LEN_BYTE_ARRAY(18) / Decimal(precision=43, scale=7) / DECIMAL(43,7))
...
Column 5
  Values: 375, Null Values: 74, Distinct Values: 0
  Max (exact: true): 98505381700645007.0205463, Min (exact: true): -99708959786297168.1726196
  Compression: UNCOMPRESSED, Encodings: PLAIN(DICT_PAGE) RLE_DICTIONARY
  Uncompressed Size: 3754, Compressed Size: 3754
Column 6
  Values: 375, Null Values: 69, Distinct Values: 0
  Max (exact: true): Parquet error: Failed to parse decimal value: Length of byte array passed to Decimal128::FromBigEndian was 18, but must be between 1 and 16
...
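
The 18-byte length in the error follows from the column's precision. As a rough check, assuming the writer picks the smallest fixed-length width that can hold every value of the declared precision (which matches the 11-byte and 18-byte widths shown above), the width can be estimated as follows. `MinBytesForPrecision` is a hypothetical helper written for this illustration, not an Arrow or Parquet API.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: smallest byte width whose signed two's-complement
// range covers every decimal with `precision` digits.
int MinBytesForPrecision(int precision) {
  assert(precision >= 1);
  // Bits needed for the magnitude 10^precision - 1. Since 10^precision is
  // never a power of two, this is ceil(precision * log2(10)).
  const int magnitude_bits =
      static_cast<int>(std::ceil(precision * std::log2(10.0)));
  // One extra bit for the sign, then round up to whole bytes.
  return (magnitude_bits + 1 + 7) / 8;
}
```

Under that assumption, precision 38 is the largest that still fits in the 16 bytes that Decimal128 accepts, while precision 43 needs 18 bytes.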

This should be relatively easy to fix, as we can just use Decimal256 instead of Decimal128 when displaying statistics.
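
To illustrate why a wider integer type removes the limit, here is a standard-library-only sketch that formats a big-endian two's-complement value of any byte length with a given scale. It is not Arrow's implementation, and `FormatBigEndianDecimal` is a hypothetical name. In the actual fix one would presumably call the Decimal256 counterpart of the `FromBigEndian` function named in the error message, which I am assuming accepts up to 32 bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: formats a big-endian two's-complement integer of any
// byte length as a decimal string with `scale` fractional digits.
std::string FormatBigEndianDecimal(const std::vector<uint8_t>& bytes, int scale) {
  assert(!bytes.empty() && scale >= 0);
  std::vector<uint8_t> mag(bytes);
  const bool negative = (mag[0] & 0x80) != 0;
  if (negative) {
    // Two's-complement negate: invert every byte, then add one.
    int carry = 1;
    for (auto it = mag.rbegin(); it != mag.rend(); ++it) {
      const int v = static_cast<uint8_t>(~*it) + carry;
      *it = static_cast<uint8_t>(v & 0xFF);
      carry = v >> 8;
    }
  }
  // Repeatedly divide the magnitude by 10, collecting remainders as digits
  // (least significant digit first).
  std::string digits;
  bool nonzero = true;
  while (nonzero) {
    int remainder = 0;
    nonzero = false;
    for (auto& b : mag) {
      const int cur = remainder * 256 + b;
      b = static_cast<uint8_t>(cur / 10);
      remainder = cur % 10;
      if (b != 0) nonzero = true;
    }
    digits.push_back(static_cast<char>('0' + remainder));
  }
  // Pad so there is at least one digit before the decimal point.
  while (static_cast<int>(digits.size()) <= scale) digits.push_back('0');
  std::reverse(digits.begin(), digits.end());
  if (scale > 0) digits.insert(digits.size() - scale, ".");
  return (negative ? "-" : "") + digits;
}
```

Because the digits are produced byte by byte, an 18-byte statistic such as the one in Column 6 goes through the same path as an 11-byte one; only the fixed 16-byte cap of a 128-bit type is what makes the current display fail.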

Component(s)

C++, Parquet
