SNOW-1830524 Add decoder logic for Dataframe.join
#2802
base: vbudati/SNOW-1794510-merge-decoder
@@ -6,9 +6,10 @@ df2 = session.create_dataframe([[1, 2, 3, 4, 5]], schema=['\"A\"','\"B\"','\"C\"

df3 = df1.filter(col("\"A\"") == 1).join(df2.select((col("\"A\"") + 1).as_("\"A\""), col("\"B\""), col("\"C\""), col("\"l_0001_C\""), col("\"l_0003_B\"")))

df4 = df3.sort(df3.columns)
# Commented out since df3.columns produces different results in the first encoding and in the encode-decode-encode result.
Can you find out why this is happening? We shouldn't be removing valid test cases here. I don't see a reason why encode-decode-encode needs to be value-equivalent, and this is a good example. As long as the two results are semantically equivalent (the generated unique column names correspond to each other correctly), that is all that is required.
The unique columns correspond to each other in both cases, but the values are different. I'm not sure what the best way around this is, since hardcoding the decoder seems like a bad idea. I can add the test back in.
I'm not familiar with how the column names are generated, but I can try to figure that out.
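For illustration only, the "semantically equivalent but not value-equivalent" check discussed above could be sketched roughly as follows. This is not code from the PR: the function name `columns_semantically_equal` and the regular expression are hypothetical, and the sketch assumes generated column names look like `"l_0001_C"` (a side letter, a four-character generated id, then the base column name), which is inferred only from the names visible in the diff.

```python
import re

# ASSUMPTION: generated names look like "l_0001_C" or "r_0003_B", optionally
# wrapped in double quotes. The real generation scheme may differ.
_GENERATED = re.compile(r'^"?([lr])_([0-9A-Za-z]{4})_(.+?)"?$')


def columns_semantically_equal(cols_a, cols_b):
    """Compare two column lists, ignoring the concrete generated ids.

    Plain columns must match exactly. Generated columns must agree on the
    side letter and base name, and the generated ids must map one-to-one
    between the two lists.
    """
    if len(cols_a) != len(cols_b):
        return False
    forward, backward = {}, {}  # id in cols_a -> id in cols_b, and the reverse
    for a, b in zip(cols_a, cols_b):
        match_a, match_b = _GENERATED.match(a), _GENERATED.match(b)
        if (match_a is None) != (match_b is None):
            return False  # one name is generated, the other is not
        if match_a is None:
            if a != b:
                return False  # plain names must be identical
            continue
        side_a, id_a, base_a = match_a.groups()
        side_b, id_b, base_b = match_b.groups()
        if side_a != side_b or base_a != base_b:
            return False
        # Enforce a consistent one-to-one mapping between generated ids.
        if forward.setdefault(id_a, id_b) != id_b:
            return False
        if backward.setdefault(id_b, id_a) != id_a:
            return False
    return True
```

With a helper along these lines, a test could compare `df3.columns` from the first encoding against the encode-decode-encode result without requiring the generated ids to be identical.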
fixed!
…wflakedb/snowpark-python into vbudati/SNOW-1830524-decoder-df-join
Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes SNOW-1830524
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
Added decoder logic for Dataframe.join. All join tests should work now.