Add Implementation of L1 arrow compaction #642

Closed
wants to merge 1 commit into from


gernest
Contributor

@gernest gernest commented Dec 20, 2023

Closes #433

@gernest gernest force-pushed the arrow-compaction branch 3 times, most recently from 88bd738 to 06ae0f7 Compare December 20, 2023 02:46
@gernest
Contributor Author

gernest commented Dec 20, 2023

Parts are not guaranteed to have the same number of columns (when dynamic columns are involved). For each compaction call we collect all columns observed across all parts. For dynamic columns, to ensure all columns have the same number of rows, we fill them with nulls for every part in which they are not found.
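A minimal sketch of the column union and null fill described above, in plain Go rather than against the Arrow API. The `Part` type, `unionColumns`, and `concatWithNulls` are hypothetical names used only for illustration and are not taken from this PR; a nil pointer stands in for an Arrow null.

```go
package main

import (
	"fmt"
	"sort"
)

// Part is a hypothetical stand-in for an arrow part: column name -> values.
// A nil pointer represents a null. Every column holds NumRows entries.
type Part struct {
	NumRows int
	Columns map[string][]*int64
}

// unionColumns returns the sorted set of column names seen across all parts.
func unionColumns(parts []Part) []string {
	seen := map[string]struct{}{}
	for _, p := range parts {
		for name := range p.Columns {
			seen[name] = struct{}{}
		}
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// concatWithNulls appends the parts in order. When a part lacks a column,
// that column receives NumRows nulls so all columns keep the same length.
func concatWithNulls(parts []Part) Part {
	names := unionColumns(parts)
	out := Part{Columns: make(map[string][]*int64, len(names))}
	for _, p := range parts {
		for _, name := range names {
			if col, ok := p.Columns[name]; ok {
				out.Columns[name] = append(out.Columns[name], col...)
			} else {
				// make yields a slice of nil pointers, i.e. nulls.
				out.Columns[name] = append(out.Columns[name], make([]*int64, p.NumRows)...)
			}
		}
		out.NumRows += p.NumRows
	}
	return out
}

func ptr(v int64) *int64 { return &v }

func main() {
	a := Part{NumRows: 2, Columns: map[string][]*int64{
		"ts":       {ptr(1), ptr(2)},
		"labels.a": {ptr(10), ptr(11)},
	}}
	b := Part{NumRows: 1, Columns: map[string][]*int64{
		"ts":       {ptr(3)},
		"labels.b": {ptr(20)},
	}}
	merged := concatWithNulls([]Part{a, b})
	for _, name := range unionColumns([]Part{a, b}) {
		fmt.Println(name, len(merged.Columns[name]))
	}
}
```

Here `labels.a` is missing from the second part and `labels.b` from the first, so each is padded with nulls for the rows of the part that lacks it. The result is only a concatenation; it says nothing about row order.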

Initially I had thought that, since we expect parts to be sorted, merging them should be enough. However, I have doubts about this.

  • We support sorting on multiple columns (including dynamic columns). The new part will contain nulls in no guaranteed order, which breaks the sorting expectation.
  • We operate under the assumption that the main sorting column behaves like a timestamp, which is always increasing. With a different main sorting column, it doesn't matter much whether the parts were sorted before: the new record must itself be sorted for the ordering to make sense.
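The concern in the two points above can be illustrated with a small sketch, again in plain Go. The `Row` type, the `less` comparator, and the choice to order nulls first are assumptions made for illustration, not the ordering this project actually uses. Two parts that are each sorted stop being sorted once they are concatenated with null fill, so the combined rows need an explicit sort.

```go
package main

import (
	"fmt"
	"sort"
)

// Row is a hypothetical row with two sort columns: a dynamic column that
// may be null (nil) and a timestamp.
type Row struct {
	Label *int64
	TS    int64
}

// less orders by Label first, then TS. Placing nulls first is an assumption.
func less(a, b Row) bool {
	switch {
	case a.Label == nil && b.Label != nil:
		return true
	case a.Label != nil && b.Label == nil:
		return false
	case a.Label != nil && b.Label != nil && *a.Label != *b.Label:
		return *a.Label < *b.Label
	}
	return a.TS < b.TS
}

// isSorted reports whether rows are ordered according to less.
func isSorted(rows []Row) bool {
	return sort.SliceIsSorted(rows, func(i, j int) bool { return less(rows[i], rows[j]) })
}

// compact concatenates the parts and then sorts the combined rows.
func compact(parts ...[]Row) []Row {
	var out []Row
	for _, p := range parts {
		out = append(out, p...)
	}
	sort.SliceStable(out, func(i, j int) bool { return less(out[i], out[j]) })
	return out
}

func ptr(v int64) *int64 { return &v }

func main() {
	partA := []Row{{ptr(1), 5}, {ptr(2), 1}} // has the dynamic column
	partB := []Row{{nil, 2}, {nil, 3}}       // dynamic column absent, filled with nulls

	concatenated := append(append([]Row{}, partA...), partB...)
	fmt.Println(isSorted(partA), isSorted(partB), isSorted(concatenated))
	fmt.Println(isSorted(compact(partA, partB)))
}
```

Whether a full sort or a k-way merge with a null-aware comparator is the better fit depends on the guarantees the parts actually provide; this sketch only shows that plain concatenation is not enough.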

This commit adds `parts.ArrowCompact`, which joins multiple arrow parts into a
single new part.
@gernest
Contributor Author

gernest commented Dec 21, 2023

This seems to be bigger than I anticipated. Closing this for now; I will focus on smaller patches.

@gernest gernest closed this Dec 21, 2023
Successfully merging this pull request may close these issues.

L1 arrow compaction