Fix arrow-based round trip of empty dataframes #15373

Merged: 3 commits, Mar 23, 2024


@wence- (Contributor) commented Mar 22, 2024

Description

When materializing range indices, we were not previously creating the correct metadata. This PR fixes that.

While here, tidy up a few corner cases around creating range indices when constructing empty data frames.
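
For readers unfamiliar with the metadata in question, here is a minimal stdlib-only sketch of the two ways a range index can appear in pandas-style schema metadata (the `index_columns` entry): as a compact `{"kind": "range", ...}` descriptor, or, when materialized, as a reference to a real column. This only illustrates the metadata layout and is not the cuDF code changed in this PR; the function name `range_index_metadata` is hypothetical and the `columns` entries are simplified.

```python
import json


def range_index_metadata(index: range, name=None, materialize=False):
    """Build a simplified pandas-style metadata dict for a frame with no data columns.

    Hypothetical helper for illustration only; real writers store more fields.
    """
    if materialize:
        # The index values are written as a real column, so the metadata
        # refers to that column by its field name instead of a descriptor.
        field_name = name if name is not None else "__index_level_0__"
        return {
            "index_columns": [field_name],
            "columns": [
                {
                    "name": name,
                    "field_name": field_name,
                    "pandas_type": "int64",
                    "numpy_type": "int64",
                    "metadata": None,
                }
            ],
        }
    # Not materialized: only start/stop/step are recorded, no column is written.
    return {
        "index_columns": [
            {
                "kind": "range",
                "name": name,
                "start": index.start,
                "stop": index.stop,
                "step": index.step,
            }
        ],
        "columns": [],
    }


# An empty frame has a zero-length range index.
empty = range(0, 0, 1)
compact = range_index_metadata(empty)
materialized = range_index_metadata(empty, materialize=True)
print(json.dumps(compact["index_columns"]))
print(json.dumps(materialized["index_columns"]))
```

The point of the sketch is that a reader deciding how to rebuild the index looks at `index_columns`: if the writer materializes the index but leaves the metadata describing something else, the round trip cannot recover the original frame.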

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@wence- wence- requested a review from a team as a code owner March 22, 2024 12:47
@wence- wence- requested review from bdice and mroeschke March 22, 2024 12:47
@wence- (Contributor, Author) commented Mar 22, 2024

cc @rjzamora and @mhaseeb123

@github-actions github-actions bot added the "Python" label (Affects Python cuDF API.) Mar 22, 2024
@wence- wence- added the "bug" (Something isn't working) and "non-breaking" (Non-breaking change) labels Mar 22, 2024
@galipremsagar (Contributor) left a comment

Thanks for these fixes, @wence-! Minor code suggestions.

python/cudf/cudf/core/dataframe.py (outdated, resolved)
python/cudf/cudf/core/dataframe.py (outdated, resolved)

For an empty data frame (with no columns), we previously did not
write the correct metadata, so the frame did not round-trip correctly
through the parquet read/write cycle.

- Closes rapidsai#12243

When preserving the index of a frame that has a RangeIndex, we must
materialize it and write that information into the metadata correctly.

- Closes rapidsai#14159

@wence- (Contributor, Author) commented Mar 22, 2024

Dropped the attempt to fix #15372 because that unravelled a huge ball of string.

@wence- (Contributor, Author) commented Mar 23, 2024

/merge

@rapids-bot rapids-bot bot merged commit dda3f31 into rapidsai:branch-24.06 Mar 23, 2024
68 checks passed
@wence- wence- deleted the wence/fix/12243 branch March 23, 2024 00:09