@xuyangzhong
Contributor

## What is the purpose of the change

Support filter and project between the source and the delta join. After this PR, a join matching the pattern "source -> calc -> join" can be optimized into a delta join.

## Brief change log

  • Support filter and project between source and delta join in planner
  • Add calc flat map function in runtime
  • Add UT, IT and harness tests

## Verifying this change

UT, IT and harness tests are added.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): no
  - The public API, i.e., is any changed class annotated with @Public(Evolving): no
  - The serializers: no
  - The runtime per-record code paths (performance sensitive): no
  - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  - The S3 file system connector: no

## Documentation

  - Does this pull request introduce a new feature? no
  - If yes, how is the feature documented? 

@xuyangzhong xuyangzhong marked this pull request as ready for review October 27, 2025 12:40
@flinkbot
Collaborator

flinkbot commented Oct 27, 2025

CI report:

@Au-Miner
Contributor

Thanks for advancing the feature. Let me leave some comments.

@github-actions github-actions bot added the community-reviewed PR has been reviewed by the community. label Oct 27, 2025
@xuyangzhong
Contributor Author

@Au-Miner Thanks for the review. Updated!

}

ChangelogMode changelogMode = getChangelogMode((StreamPhysicalRel) tableScan);
if (changelogMode.containsOnly(RowKind.INSERT)) {
Contributor

I remember that CDC sources were allowed before, so why is only INSERT allowed here?

Contributor Author

It also allows CDC sources. The logic is:

  1. If the source is an INSERT-only source, pass and return true.
  2. Otherwise, the source is a CDC source:
    2.1 If the filter pushed down into the source is not applied on one set of upsert keys, return false.
    2.2 Otherwise, pass and return true.

As for why the filter must be applied on one set of upsert keys when consuming a CDC source, we discussed this earlier in #27111 (comment), and I have also raised a Jira ticket for it.
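
The decision logic described in this reply can be sketched roughly as follows. This is a simplified, standalone illustration and not the planner code from this PR: `DeltaJoinSourceCheck`, `isSourceSupported`, and the local `RowKind` enum are hypothetical stand-ins. It also assumes two things the thread does not spell out: that "applied on one set of upsert keys" means every field referenced by the pushed-down filter belongs to a single upsert key set, and that a CDC source with no pushed-down filter passes.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the planner check; these are not Flink's actual classes.
final class DeltaJoinSourceCheck {

    // Simplified local mirror of the changelog row kinds.
    enum RowKind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    static boolean isSourceSupported(
            Set<RowKind> changelogMode,
            Set<String> pushedDownFilterFields,
            List<Set<String>> upsertKeys) {
        // Step 1: an INSERT-only source always passes.
        if (changelogMode.equals(EnumSet.of(RowKind.INSERT))) {
            return true;
        }
        // Assumption: a CDC source without any pushed-down filter has nothing to check.
        if (pushedDownFilterFields.isEmpty()) {
            return true;
        }
        // Steps 2.1 / 2.2: the filter must be covered by a single upsert key set
        // (assumed interpretation of "applied on one set of upsert keys").
        for (Set<String> key : upsertKeys) {
            if (key.containsAll(pushedDownFilterFields)) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions, a CDC source with upsert key `{id}` and a pushed-down filter on `id` passes, while the same source with a filter on a non-key column is rejected.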
