Contributor

@yzeng1618 yzeng1618 commented Dec 1, 2025

Purpose of this pull request

Fixes #9806.

When using MySQL-CDC as source and Elasticsearch as sink, MySQL TIMESTAMP columns are deserialized into JDK8 LocalDateTime and then serialized in ElasticsearchRowSerializer via Temporal#toString(), producing ISO-8601 strings such as 2023-01-01T12:34:56 or 2023-01-01T12:34:56.123456789.

This behaviour is already compatible with Elasticsearch’s default strict_date_optional_time[_nanos] formats, which expect ISO-8601 style timestamps. Changing the pattern to yyyy-MM-dd HH:mm:ss as suggested in #9806 would not be compatible with the ES defaults and would require users to customize their index mappings.

In this PR I keep the existing Temporal#toString() behaviour, add a unit test to lock in the ISO-8601 expectation, and slightly simplify the serialization code by removing unnecessary formatters.
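For illustration, here is a minimal sketch of the behaviour described above. The class and the `serialize` helper are hypothetical and are not the actual ElasticsearchRowSerializer code; they only show what `Temporal#toString()` yields for the two sample values mentioned in this description.

```java
import java.time.LocalDateTime;
import java.time.temporal.Temporal;

public class TemporalToStringDemo {
    // Hypothetical helper: renders any Temporal the way the description says
    // the serializer does, by delegating to toString().
    static String serialize(Temporal value) {
        return value.toString();
    }

    public static void main(String[] args) {
        // Whole seconds: date and time are joined by a 'T'.
        System.out.println(serialize(LocalDateTime.of(2023, 1, 1, 12, 34, 56)));
        // Nanosecond precision: the fractional part is appended after the seconds.
        System.out.println(serialize(LocalDateTime.of(2023, 1, 1, 12, 34, 56, 123456789)));
    }
}
```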

Does this PR introduce any user-facing change?

No.

The connector continues to serialize date/time values using Temporal#toString(), as before. The only changes are:

  • A new unit test to verify that LocalDateTime is serialized in ISO‑8601 form with a 'T' separator.
  • A small internal refactor in ElasticsearchRowSerializer to reuse the existing Temporal#toString() behaviour instead of introducing custom formatters.

Why not use yyyy-MM-dd HH:mm:ss as suggested in #9806?

The pattern yyyy-MM-dd HH:mm:ss is not accepted by Elasticsearch’s default strict_date_optional_time parser. Using it would require users to adjust the format of their date fields in index mappings, and would break out-of-the-box compatibility.

In contrast, ISO-8601 strings produced by LocalDateTime#toString() (for example 2023-01-01T12:34:56 or with fractional seconds) are directly supported by strict_date_optional_time[_nanos], so no mapping changes are needed.
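The difference between the two representations can be sketched in plain Java. This is only an approximation: the JDK's `DateTimeFormatter.ISO_LOCAL_DATE_TIME` is used here as a stand-in for an ISO-8601 parser and is not Elasticsearch's own strict_date_optional_time implementation. The class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class TimestampFormatComparison {
    // The pattern proposed in the issue, with a space between date and time.
    static final DateTimeFormatter SPACE_PATTERN =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Returns true if the JDK's ISO local date-time parser accepts the text.
    // Assumption: this only approximates an ISO-8601 parser; it is not the ES parser.
    static boolean parsesAsIsoLocalDateTime(String text) {
        try {
            LocalDateTime.parse(text, DateTimeFormatter.ISO_LOCAL_DATE_TIME);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2023, 1, 1, 12, 34, 56);
        String iso = ts.toString();            // uses the 'T' separator
        String spaced = ts.format(SPACE_PATTERN); // uses a space separator
        System.out.println(iso + " -> " + parsesAsIsoLocalDateTime(iso));
        System.out.println(spaced + " -> " + parsesAsIsoLocalDateTime(spaced));
    }
}
```

In this sketch the `toString()` form parses and the space-separated form is rejected, which mirrors the compatibility argument made above.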

How was this patch tested?

  • Added testSerializeLocalDateTimeFieldFormat in ElasticsearchRowSerializerTest to verify that the serialized LocalDateTime uses ISO‑8601 with a 'T' separator.
  • Existing tests continue to pass.
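The kind of check the new test performs can be sketched without any test framework. Everything here is an assumption for illustration: `toDocument` is a hypothetical stand-in for the serializer, and the field names are made up; the real testSerializeLocalDateTimeFieldFormat lives in ElasticsearchRowSerializerTest and may be structured differently.

```java
import java.time.LocalDateTime;
import java.time.temporal.Temporal;
import java.util.LinkedHashMap;
import java.util.Map;

public class LocalDateTimeFieldFormatCheck {
    // Hypothetical stand-in for the serializer: copies fields into a document map,
    // rendering Temporal values with toString() and leaving other values unchanged.
    static Map<String, Object> toDocument(String[] fieldNames, Object[] values) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (int i = 0; i < fieldNames.length; i++) {
            Object v = values[i];
            doc.put(fieldNames[i], v instanceof Temporal ? v.toString() : v);
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = toDocument(
                new String[] {"id", "created_at"},
                new Object[] {1L, LocalDateTime.of(2023, 1, 1, 12, 34, 56)});
        String createdAt = (String) doc.get("created_at");
        // The serialized value must use the ISO-8601 'T' separator, not a space.
        if (!createdAt.contains("T") || createdAt.contains(" ")) {
            throw new AssertionError("expected ISO-8601 with 'T' separator: " + createdAt);
        }
        System.out.println(createdAt);
    }
}
```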

@github-actions github-actions bot added the CI&CD label Dec 2, 2025
@github-actions github-actions bot removed the CI&CD label Dec 3, 2025
@github-actions github-actions bot removed the e2e label Dec 5, 2025
@yzeng1618 yzeng1618 changed the title from "[Fix][Elasticsearch] Format LocalDateTime timestamps with ES-compatible pattern" to "[Improve][Elasticsearch] Add LocalDateTime serialization test and cleanup" Dec 5, 2025
@yzeng1618 yzeng1618 changed the title from "[Improve][Elasticsearch] Add LocalDateTime serialization test and cleanup" to "[Improve][Elasticsearch] Add LocalDateTime serialization test and simplify serializer" Dec 5, 2025
Contributor

@LiJie20190102 LiJie20190102 left a comment

LGTM

Contributor

@corgy-w corgy-w left a comment

Thanks for improving

Member

@Carl-Zhou-CN Carl-Zhou-CN left a comment

+1

@Carl-Zhou-CN Carl-Zhou-CN merged commit ddd9238 into apache:dev Dec 7, 2025
6 checks passed
Development

Successfully merging this pull request may close these issues.

[Bug] [ElasticSearch connector] Timestamp format incompatibility between MySQL-CDC and Elasticsearch connector

4 participants