SNOW-859943: Add basic support for functions.window #2545
CHANGELOG.md
Outdated
@@ -27,6 +27,7 @@
- Added support for `Index.to_numpy`.
- Added support for `DataFrame.align` and `Series.align` for `axis=0`.
- Added support for `size` in `GroupBy.aggregate`, `DataFrame.aggregate`, and `Series.aggregate`.
- Added partial support for `snowflake.snowpark.functions.window`
Move it up to 1.25.0.
Should we call it partial support? Let's document its capabilities in the function description rather than wording it as partial support.
"year": {"year", "y", "yy", "yyy", "yyyy", "yr", "years", "yrs"},
"quarter": {"quarter", "q", "qtr", "qtrs", "quarters"},
"month": {"month", "mm", "mon", "mons", "months"},
"week": {"week", "w", "wk", "weekofyear", "woy", "wy"},
"day": {"day", "d", "dd", "days", "dayofmonth"},
"hour": {"hour", "h", "hh", "hr", "hours", "hrs"},
"minute": {"minute", "m", "mi", "min", "minutes", "mins"},
"second": {"second", "s", "sec", "seconds", "secs"},
"millisecond": {"millisecond", "ms", "msec", "milliseconds"},
"microsecond": {"microsecond", "us", "usec", "microseconds"},
"nanosecond": {
    "nanosecond",
    "ns",
    "nsec",
    "nanosec",
    "nsecond",
    "nanoseconds",
    "nanosecs",
    "nseconds",
},
"dayofweek": {"dayofweek", "weekday", "dow", "dw"},
"dayofweekiso": {"dayofweekiso", "weekday_iso", "dow_iso", "dw_iso"},
"dayofyear": {"dayofyear", "yearday", "doy", "dy"},
"weekiso": {"weekiso", "week_iso", "weekofyeariso", "weekofyear_iso"},
"yearofweek": {"yearofweek"},
"yearofweekiso": {"yearofweekiso"},
"epoch_second": {"epoch_second", "epoch", "epoch_seconds"},
"epoch_millisecond": {"epoch_millisecond", "epoch_milliseconds"},
"epoch_microsecond": {"epoch_microsecond", "epoch_microseconds"},
"epoch_nanosecond": {"epoch_nanosecond", "epoch_nanoseconds"},
"timezone_hour": {"timezone_hour", "tzh"},
"timezone_minute": {"timezone_minute", "tzm"},
}
Did you do a lot of trial and error for this? Are these aliases documented somewhere?
There's a handy chart here: https://docs.snowflake.com/en/sql-reference/functions-date-time#label-supported-date-time-parts
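For readers following the thread, here is a minimal sketch of how an alias table like the one in the diff can be used: invert it once so any accepted spelling resolves to its canonical part. The names `ALIASES`, `ALIAS_TO_PART`, and `normalize_part` are hypothetical and only an excerpt of the table is copied; this is not the PR's actual implementation.

```python
from typing import Dict, Set

# Excerpt of the alias table from the diff (hypothetical variable name).
ALIASES: Dict[str, Set[str]] = {
    "year": {"year", "y", "yy", "yyy", "yyyy", "yr", "years", "yrs"},
    "month": {"month", "mm", "mon", "mons", "months"},
    "minute": {"minute", "m", "mi", "min", "minutes", "mins"},
    "second": {"second", "s", "sec", "seconds", "secs"},
}

# Reverse lookup built once: alias -> canonical part name.
ALIAS_TO_PART: Dict[str, str] = {
    alias: part for part, aliases in ALIASES.items() for alias in aliases
}


def normalize_part(name: str) -> str:
    """Return the canonical date/time part for any accepted alias.

    Matching is case-insensitive here; that is an assumption of this sketch.
    """
    key = name.strip().lower()
    try:
        return ALIAS_TO_PART[key]
    except KeyError:
        raise ValueError(f"Unsupported date/time part: {name!r}") from None
```

Note that `"m"` resolves to `minute` and `"mm"` to `month` in this table, so a flat reverse lookup only works because no alias appears under two parts.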
src/snowflake/snowpark/functions.py
Outdated
# SNOW-1063685: slideDuration changes this function from a 1:1 mapping to a 1:N mapping. That
# currently would require a udtf which may have significantly different performance.
raise NotImplementedError(
"snowflake.snowpark.functions.window does not support slideDuration parameter yet."
Since @sfc-gh-qding volunteered as a SQL expert for our team, we can discuss whether this is possible without a UDTF.
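To make the 1:1 versus 1:N distinction concrete, here is a small pure-Python sketch. It assumes half-open windows `[begin, end)` over integer timestamps, and the function names are hypothetical; it illustrates the arithmetic only and is not how Snowpark evaluates `window`.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # half-open interval [begin, end)


def tumbling_window(ts: int, duration: int, start: int = 0) -> Window:
    """Without a slide, each timestamp falls in exactly one window (1:1)."""
    begin = ts - (ts - start) % duration
    return begin, begin + duration


def sliding_windows(ts: int, duration: int, slide: int, start: int = 0) -> List[Window]:
    """With a slide shorter than the duration, one timestamp can fall in
    several overlapping windows (1:N)."""
    # Latest window start that is at or before ts.
    begin = ts - (ts - start) % slide
    windows: List[Window] = []
    # Walk backwards while the window still covers ts (begin + duration > ts).
    while begin > ts - duration:
        windows.append((begin, begin + duration))
        begin -= slide
    return list(reversed(windows))
```

With a duration of 60 and a slide of 30, the timestamp 65 belongs to both `(30, 90)` and `(60, 120)`, so a single input row would need to produce two output rows.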
Co-authored-by: Afroz Alam <[email protected]>
Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes SNOW-859943
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
This PR adds partial support for functions.window. This involved a few notable changes:
- `window`. These columns are often used as aggregate keys, though, and aggregate keys cannot be aliased. To support this use case I've modified the aggregation logic to remove the alias if present. This has the side effect of also allowing statements like `df.groupby(upper(col("cat")).alias("cat")).agg(...)`. The resulting aggregate key column would have the name `cat` instead of `"UPPER(""CAT"")"`.
- `Column` so that aliasing an already aliased column replaces the alias instead of trying to alias twice. This allows a statement like `col("cat").alias("a1").alias("a2")`, which results in a column named `a2`.
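The alias behaviour described above can be modelled with a toy class. `ToyColumn`, `without_alias`, and `to_sql` are hypothetical names invented for this sketch; they are not part of Snowpark, and the real `Column` is implemented differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyColumn:
    """Toy stand-in for a column expression with an optional alias."""

    expr: str
    name: Optional[str] = None

    def alias(self, name: str) -> "ToyColumn":
        # Re-aliasing replaces the previous alias instead of nesting it.
        return ToyColumn(self.expr, name)

    def without_alias(self) -> "ToyColumn":
        # What an aggregation step might do before using the column as a key.
        return ToyColumn(self.expr, None)

    def to_sql(self) -> str:
        if self.name is None:
            return self.expr
        return f'{self.expr} AS "{self.name}"'


key = ToyColumn('UPPER("CAT")').alias("a1").alias("a2")
print(key.to_sql())                  # UPPER("CAT") AS "a2"
print(key.without_alias().to_sql())  # UPPER("CAT")
```

The point of the design is that the alias is a single replaceable attribute of the column, so stripping it for a group-by key and restoring it as the output name are both cheap operations.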