@yzeng1618
Contributor

#10137

Purpose of this pull request

This pull request fixes several edge‑case bugs in the SQL transform functions (mainly numeric, string and date/time functions) and adds/updates SQL‑level tests such as SQLNumericFunctionsTest, SQLStringFunctionsTest and SQLSystemFunctionsTest to cover them. It also cleans up test comments to remove Chinese characters so that ChineseCharacterCheckTest and the CI checks can pass.

Does this PR introduce any user-facing change?

Yes.

Some date/time functions (for example month difference and week/day‑of‑week related functions) are corrected to follow the documented and common SQL semantics.
Numeric casting and SIGN behavior are adjusted (e.g. support for SHORT type, case‑insensitive type names, and SIGN(NULL) now returning NULL instead of 0).
String functions such as ASCII, LEFT and RIGHT now handle NULL, empty strings and negative/zero lengths more consistently (e.g. ASCII(NULL)/ASCII('') return NULL, LEFT/RIGHT with negative length return an empty string and over‑length counts return the whole string).
These are all bug fixes to existing SQL behavior; no new configuration or connector‑level behavior is introduced.
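
For reviewers who want the edge cases above in one place, here is a minimal, self-contained Java sketch of the semantics described in this section. It is not the code from this patch: the class name `SqlEdgeCaseSketch` and its method names are hypothetical and exist only for illustration. Two behaviors are assumptions that the description does not spell out: `LEFT`/`RIGHT` returning NULL for a NULL input string, and a zero length yielding an empty string just like a negative one.

```java
// Illustrative sketch only; class and method names are hypothetical,
// not the actual SeaTunnel implementation.
class SqlEdgeCaseSketch {

    // SIGN(x): NULL in, NULL out; otherwise -1, 0 or 1.
    public static Integer sign(Number value) {
        if (value == null) {
            return null;
        }
        return (int) Math.signum(value.doubleValue());
    }

    // ASCII(s): NULL for NULL or empty input, else the code of the first character.
    public static Integer ascii(String s) {
        if (s == null || s.isEmpty()) {
            return null;
        }
        return (int) s.charAt(0);
    }

    // LEFT(s, n): empty string for n <= 0 (zero is an assumption),
    // the whole string when n exceeds its length.
    public static String left(String s, int count) {
        if (s == null) {
            return null; // assumption: NULL propagates
        }
        if (count <= 0) {
            return "";
        }
        return count >= s.length() ? s : s.substring(0, count);
    }

    // RIGHT(s, n): mirror image of LEFT, taken from the end of the string.
    public static String right(String s, int count) {
        if (s == null) {
            return null; // assumption: NULL propagates
        }
        if (count <= 0) {
            return "";
        }
        return count >= s.length() ? s : s.substring(s.length() - count);
    }

    public static void main(String[] args) {
        System.out.println(sign(null));        // null
        System.out.println(ascii(""));         // null
        System.out.println(left("hello", 99)); // hello
        System.out.println(right("hello", 2)); // lo
    }
}
```

Returning boxed `Integer` rather than `int` is what lets `SIGN(NULL)` and `ASCII('')` surface as NULL instead of collapsing to 0.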

How was this patch tested?

Added and updated SQL‑level unit tests for numeric, string and system functions, including SQLNumericFunctionsTest, SQLStringFunctionsTest and SQLSystemFunctionsTest.
Verified the behavior locally by running module tests such as mvn -pl seatunnel-transforms-v2 -Dtest=SQL*FunctionsTest test and ensuring ChineseCharacterCheckTest passes after removing Chinese comments.

@github-actions bot added the e2e label on Dec 4, 2025
Contributor

@corgy-w left a comment

Thank you for your contributions. I would like to sync on two points:

1. Let's keep the historical unit tests (UTs) for now instead of deleting them. I noticed that you've changed some edge-case behavior, and these historical UTs may still be useful. We can clean them up after verification, to be thorough.

2. The granularity of the UTs is still too coarse. I didn't go through the entire test code, but I looked at the cast function and noticed that it is only tested with a simple conversion. There are many different types, and we have had historical issues where certain types couldn't be converted to others. These conversions need validation. It may not be possible to cover every case, but I encourage you to make the tests as comprehensive as possible.

Thanks again for your contribution, and feel free to refer to my suggestions. If there are any questions, please reach out.
