-
Notifications
You must be signed in to change notification settings - Fork 2.1k
[Improve][Elasticsearch] Add LocalDateTime serialization test and simplify serializer #10135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
...pache/seatunnel/connectors/seatunnel/elasticsearch/serialize/ElasticsearchRowSerializer.java
Outdated
Show resolved
Hide resolved
LiJie20190102
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
corgy-w
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for improving
Carl-Zhou-CN
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Purpose of this pull request
Fixes #9806.
When using MySQL-CDC as source and Elasticsearch as sink, MySQL
TIMESTAMPcolumns are deserialized into JDK8LocalDateTimeand then serialized inElasticsearchRowSerializerviaTemporal#toString(), producing ISO-8601 strings such as2023-01-01T12:34:56or2023-01-01T12:34:56.123456789.This behaviour is already compatible with Elasticsearch’s default
strict_date_optional_time[_nanos]formats, which expect ISO-8601 style timestamps. Changing the pattern toyyyy-MM-dd HH:mm:ssas suggested in #9806 would not be compatible with ES defaults and would require usersto customize index mappings.In this PR I keep the existing
Temporal#toString()behaviour, add a unit test to lock in the ISO-8601 expectation, and slightly simplify the serialization code by removing unnecessary formatters.Does this PR introduce any user-facing change?
No.
The connector continues to serialize date/time values using
Temporal#toString(), as before. The only changes are:LocalDateTimeis serialized in ISO‑8601 form with a'T'separator.ElasticsearchRowSerializerto reuse the existingTemporal#toString()behaviour instead of introducing custom formatters.Why not use
yyyy-MM-dd HH:mm:ssas suggested in #9806?The pattern
yyyy-MM-dd HH:mm:ssis not accepted by Elasticsearch’s defaultstrict_date_optional_timeparser. Using it would require users to adjust theformatof theirdatefields in index mappings, and would break out-of-the-box compatibility.In contrast, ISO-8601 strings produced by
LocalDateTime#toString()(for example2023-01-01T12:34:56or with fractional seconds) are directly supported bystrict_date_optional_time[_nanos], so no mapping changes are needed.How was this patch tested?
testSerializeLocalDateTimeFieldFormatinElasticsearchRowSerializerTestto verify that the serializedLocalDateTimeuses ISO‑8601 with a'T'separator.Check list
New License Guide
incompatible-changes.mdto describe the incompatibility caused by this PR.