WIP: Implement inequality joins by translating to cross + filter #17000
Draft
wence-
wants to merge
23
commits into
rapidsai:branch-24.12
from
wence-:wence/fea/16926-polars-iejoin
+2,372
−336
Needs pola-rs/polars#19104.
github-actions
bot
added
Python
Affects Python cuDF API.
cudf.polars
Issues specific to cudf.polars
labels
Oct 4, 2024
pola-rs/polars#19104 is now merged but not yet released.
We will use this to provide infrastructure for making IR nodes easier to traverse. Expr nodes already use this facility, but we want to share it.
And tests of basic functionality.
This way we will be able to write generic traversals more easily.
Now that we have a uniform child attribute, this is easier.
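As a rough illustration of why a uniform child attribute helps, the sketch below writes one traversal and one bottom-up rewrite that work for any node type. The `Node` class, `traverse`, and `rewrite` are hypothetical names chosen for this example; they are not the classes or helpers used in this PR.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterator


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for an IR or Expr node (not the cudf.polars class)."""

    name: str
    children: tuple[Node, ...] = ()


def traverse(root: Node) -> Iterator[Node]:
    """Yield nodes in pre-order, relying only on the `children` attribute."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(node.children))


def rewrite(node: Node, fn: Callable[[Node], Node]) -> Node:
    """Rebuild the tree bottom-up, applying `fn` to each rebuilt node."""
    new_children = tuple(rewrite(child, fn) for child in node.children)
    return fn(Node(node.name, new_children))


plan = Node("Filter", (Node("Join", (Node("Scan:left"), Node("Scan:right"))),))
names = [n.name for n in traverse(plan)]
upper = rewrite(plan, lambda n: Node(n.name.upper(), n.children))
```

Because both functions touch only `children`, neither needs to know which concrete node types exist.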
We will use this for inequality joins and filter pushdown in the parquet reader. The handling is a bit complicated, since the subset of expressions that the parquet filter accepts is smaller than all possible expressions. Since much of the logic is similar, however, we just dispatch on a transformer state variable to determine which case we're handling.
We attempt to turn the predicate into a filter expression that the parquet reader understands. If successful, we don't have to apply the predicate as a post-filter. We can only do this when a row index is not requested.
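The dispatch-on-state idea and the post-filter fallback could look roughly like the sketch below. Everything here is an assumption made for illustration: the tiny expression classes, the `PARQUET_OPS` set, and the `to_filter` and `plan_scan` functions are invented for this example and do not reflect the actual transformer or the real set of expressions the parquet reader accepts.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: object


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class Func:
    name: str
    arg: object


# Assumed subset of operators a reader-side filter accepts (illustrative only).
PARQUET_OPS = {"<", "<=", ">", ">=", "==", "and"}


def to_filter(expr, *, for_parquet: bool):
    """Translate to a nested tuple, or return None if unsupported in this mode."""
    if isinstance(expr, Col):
        return ("col", expr.name)
    if isinstance(expr, Lit):
        return ("lit", expr.value)
    if isinstance(expr, BinOp):
        # Same logic for both cases; only the state variable changes the answer.
        if for_parquet and expr.op not in PARQUET_OPS:
            return None
        left = to_filter(expr.left, for_parquet=for_parquet)
        right = to_filter(expr.right, for_parquet=for_parquet)
        if left is None or right is None:
            return None
        return (expr.op, left, right)
    if isinstance(expr, Func):
        if for_parquet:
            return None  # assumed: function calls cannot be pushed down
        arg = to_filter(expr.arg, for_parquet=for_parquet)
        return None if arg is None else ("call", expr.name, arg)
    return None


def plan_scan(predicate, *, row_index_requested: bool):
    """Push the predicate into the reader when possible, else keep a post-filter."""
    pushed = None
    if not row_index_requested:
        pushed = to_filter(predicate, for_parquet=True)
    post = None if pushed is not None else predicate
    return {"reader_filter": pushed, "post_filter": post}


simple = BinOp("<", Col("a"), Lit(3))
complex_pred = BinOp(">", Func("abs", Col("a")), Lit(1))
```

In this sketch `plan_scan(simple, row_index_requested=False)` pushes the predicate down, while requesting a row index or using `complex_pred` leaves it as a post-filter.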
Before working through the plumbing in pylibcudf for mixed and conditional joins and the ast evaluator, let's just support inequality joins by doing the basic thing.
Expressions referring to the right table must be suffixed if the name overlaps with that in the left table.
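The cross + filter translation, including the suffixing of overlapping right-table names, can be sketched in plain Python over lists of dicts. This is only a model of the idea: `cross_join` and `inequality_join` are hypothetical helpers, the `"_right"` suffix is an assumed default, and the real implementation operates on GPU tables rather than Python rows.

```python
from itertools import product


def cross_join(left, right, suffix="_right"):
    """Pair every left row with every right row.

    Right-hand columns whose names clash with a left-hand column get `suffix`.
    """
    left_names = set(left[0]) if left else set()
    out = []
    for lrow, rrow in product(left, right):
        row = dict(lrow)
        for name, value in rrow.items():
            key = name + suffix if name in left_names else name
            row[key] = value
        out.append(row)
    return out


def inequality_join(left, right, predicate, suffix="_right"):
    """Inequality join expressed as a cross join followed by a filter."""
    return [row for row in cross_join(left, right, suffix) if predicate(row)]


left = [{"id": 1, "t": 5}, {"id": 2, "t": 10}]
right = [{"t": 7, "label": "x"}, {"t": 12, "label": "y"}]

# Both tables have a column "t", so the predicate must refer to the
# right-hand one by its suffixed name.
result = inequality_join(left, right, lambda r: r["t"] < r["t_right"])
```

The cross join materialises every pair of rows before filtering, so this approach trades memory and time for simplicity, which matches the "basic thing" framing above.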
wence-
force-pushed
the
wence/fea/16926-polars-iejoin
branch
from
October 16, 2024 16:01
fdfe737
to
4350006
Compare
Labels
cudf.polars
Issues specific to cudf.polars
pylibcudf
Issues specific to the pylibcudf package
Python
Affects Python cuDF API.
Description
Before working through the plumbing in pylibcudf for mixed and conditional joins and the ast evaluator, let's just support inequality joins by doing the basic thing.