-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-54442][SQL][FOLLOWUP] Add codegen for TIME numeric conversion functions #53370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[SPARK-54442][SQL][FOLLOWUP] Add codegen for TIME numeric conversion functions #53370
Conversation
|
cc @cloud-fan |
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala
Show resolved
Hide resolved
|
|
||
| override def replacement: Expression = StaticInvoke( | ||
| classOf[DateTimeUtils.type], | ||
| DecimalType(14, 6), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| DecimalType(14, 6), | |
| dataType, |
to avoid duplication
| case e: DateTimeException => | ||
| throw QueryExecutionErrors.ansiDateTimeArgumentOutOfRange(e) | ||
| case e: ArithmeticException => | ||
| val wrapped = new DateTimeException(s"Overflow in TIME conversion: ${e.getMessage}", e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| val wrapped = new DateTimeException(s"Overflow in TIME conversion: ${e.getMessage}", e) | |
| val wrapped = new DateTimeException(s"Overflow in TIME conversion", e) |
do not duplicate the info in the causedBy error
| * @param seconds Seconds (0 to 86399.999999) | ||
| * @return Nanoseconds since midnight | ||
| */ | ||
| def timeFromSeconds(seconds: Float): Long = withTimeConversionErrorHandling { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need the float version? the double version can cover it.
| if (seconds.isNaN || seconds.isInfinite) { | ||
| throw new DateTimeException("Cannot convert NaN or Infinite value to TIME") | ||
| } | ||
| (seconds * NANOS_PER_SECOND).toLong |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to consider overflow here when calling toLong?
| * @param nanos Nanoseconds since midnight | ||
| * @return Seconds as Decimal(14, 6) | ||
| */ | ||
| def timeToSeconds(nanos: Long): Decimal = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we wrap with withTimeConversionErrorHandling?
| SELECT time_from_seconds(-1); | ||
| SELECT time_from_seconds(86400); | ||
| SELECT time_from_seconds(90000); | ||
| SELECT time_from_seconds(-1); -- invalid: negative → exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| SELECT time_from_seconds(-1); -- invalid: negative → exception | |
| SELECT time_from_seconds(-1); -- invalid: negative -> exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's avoid unicode
| "errorClass" : "DATETIME_FIELD_OUT_OF_BOUNDS.WITH_SUGGESTION", | ||
| "sqlState" : "22023", | ||
| "messageParameters" : { | ||
| "ansiConfig" : "\"spark.sql.ansi.enabled\"", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the full message? why do we mention ansi conf?
907abae to
f921fcf
Compare
f921fcf to
ec99c46
Compare
| if (seconds.isNaN || seconds.isInfinite) { | ||
| throw new DateTimeException("Cannot convert NaN or Infinite value to TIME") | ||
| } | ||
| (seconds * NANOS_PER_SECOND).toLong |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to consider overflow here when calling toLong?
What changes were proposed in this pull request?
Implemented whole-stage codegen for TIME conversion functions (IntegralToTimeBase, TimeFromSeconds, TimeToSeconds) by removing CodegenFallback
Why are the changes needed?
Performance: CodegenFallback is 10-50x slower than generated code
Pattern consistency: Follows existing Spark patterns (IntegralToTimestampBase)
Does this PR introduce any user-facing change?
No. performance optimization with identical results.
How was this patch tested?
All existing tests pass
Generated code inspection confirms optimizations
Was this patch authored or co-authored using generative AI tooling?
No