Describe the bug
The table metadata returned by the JSON reader contains an extra child for string columns located within deeply nested structs and lists. The extra child trips an assert when the read table and its metadata are written to (say) Parquet. Writing the read table to Parquet without using the metadata from read_json succeeds.
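To make the mismatch easier to picture, here is a minimal, self-contained sketch of the kind of consistency check that fails. It does not use libcudf: `ColumnNode`, `NameNode`, `find_mismatches`, and `check_example` are hypothetical names invented for illustration, and the child counts in the example are illustrative only, not the real libcudf column layout. The idea is simply to walk the column hierarchy and the metadata hierarchy together and report every path where the number of children differs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a column's nesting; only the shape matters here.
struct ColumnNode {
  std::vector<ColumnNode> children;
};

// Hypothetical stand-in for the per-column name metadata returned by a reader.
struct NameNode {
  std::string name;
  std::vector<NameNode> children;
};

// Walk both trees together and record the dotted path of every node whose
// metadata child count differs from the column's child count.
void find_mismatches(ColumnNode const& col,
                     NameNode const& meta,
                     std::string const& path,
                     std::vector<std::string>& out)
{
  std::string const here = path.empty() ? meta.name : path + "." + meta.name;
  if (col.children.size() != meta.children.size()) {
    out.push_back(here);
    return;  // children cannot be paired up reliably below this node
  }
  for (std::size_t i = 0; i < col.children.size(); ++i) {
    find_mismatches(col.children[i], meta.children[i], here, out);
  }
}

// Build a toy hierarchy a -> b -> s where the metadata for the leaf "s"
// carries one child more than the column does (illustrative counts only).
std::vector<std::string> check_example()
{
  ColumnNode s_col;  // leaf: no children in this toy model
  ColumnNode b_col;
  b_col.children.push_back(s_col);
  ColumnNode a_col;
  a_col.children.push_back(b_col);

  NameNode extra{"extra", {}};
  NameNode s_meta{"s", {}};
  s_meta.children.push_back(extra);  // the unexpected extra child
  NameNode b_meta{"b", {}};
  b_meta.children.push_back(s_meta);
  NameNode a_meta{"a", {}};
  a_meta.children.push_back(b_meta);

  std::vector<std::string> mismatches;
  find_mismatches(a_col, a_meta, "", mismatches);
  return mismatches;
}
```

Calling `check_example()` returns the paths where the two hierarchies disagree, which in this toy model is the nested leaf only. A similar walk over the real table and its metadata could help pinpoint which nested string column carries the extra child.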
Steps/Code to reproduce bug
Paste the contents of the attached parquet_io.txt file into the parquet_io.cpp example as is, then build libcudf and the parquet_io example.
Expected behavior
The read table metadata should not have an extra child for string columns.
Environment overview (please complete the following information)
Machine: dgx05 at RDS Lab, cudf branch-24.12, cudf conda devcontainer with CUDA 12.5
Environment details
N/A
Additional context
Note that once write_parquet succeeds with the fix, the last read_parquet (verification) portion of the example may still fail until the changes from #17059 have been pulled in.
For more context, here is the schema seen by write_parquet (column names normalized) when using the metadata from read_json: with_schema.txt. And here is the schema seen by write_parquet otherwise: without_schema.txt.