[HUDI-8004] avoid unnecessary record rewrite during merging with base file #12683

Open
wants to merge 2 commits into
base: master
from

cshuo
Contributor

@cshuo cshuo commented Jan 21, 2025

Change Logs

  • Fix the logic of AvroSchemaUtils#isStrictProjectionOf, which always returns false when the source schema contains a decimal-type field.
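
To illustrate the kind of check this change concerns, here is a minimal, self-contained sketch. It uses a toy schema model, not Avro's Schema class and not Hudi's actual code: the names ProjectionSketch, Node, Kind, struct, atom, decimal, and atomicTypeEquals are all hypothetical. It assumes that "strict projection" means every field of the target exists in the source with an identical atomic type, and that a decimal is an atomic type annotated with precision and scale. The recursion delegates the atomic comparison to a predicate, mirroring the shape of the isProjectionOfInternal(sourceSchema, targetSchema, ...) call visible in the diff excerpt further down.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.BiPredicate;

/** Toy model for illustration only; NOT Avro's Schema and NOT Hudi's implementation. */
final class ProjectionSketch {

    enum Kind { RECORD, INT, LONG, STRING, BYTES }

    /** Hypothetical schema node: a kind, an optional decimal annotation, and named fields for records. */
    static final class Node {
        final Kind kind;
        final Integer precision; // non-null only for decimal-annotated types (assumption of this sketch)
        final Integer scale;     // non-null only for decimal-annotated types (assumption of this sketch)
        final Map<String, Node> fields = new LinkedHashMap<>();

        private Node(Kind kind, Integer precision, Integer scale) {
            this.kind = kind;
            this.precision = precision;
            this.scale = scale;
        }

        static Node atom(Kind kind) { return new Node(kind, null, null); }
        static Node decimal(int precision, int scale) { return new Node(Kind.BYTES, precision, scale); }
        static Node struct() { return new Node(Kind.RECORD, null, null); }

        /** Adds a named field to a record node and returns the node for chaining. */
        Node with(String name, Node field) {
            fields.put(name, field);
            return this;
        }
    }

    /** True when every field of target exists in source and all atomic leaves satisfy atomicEquals. */
    static boolean isProjectionOf(Node source, Node target, BiPredicate<Node, Node> atomicEquals) {
        if (source.kind == Kind.RECORD && target.kind == Kind.RECORD) {
            for (Map.Entry<String, Node> entry : target.fields.entrySet()) {
                Node sourceField = source.fields.get(entry.getKey());
                if (sourceField == null || !isProjectionOf(sourceField, entry.getValue(), atomicEquals)) {
                    return false;
                }
            }
            return true;
        }
        if (source.kind == Kind.RECORD || target.kind == Kind.RECORD) {
            return false; // a record never matches an atomic type
        }
        return atomicEquals.test(source, target);
    }

    /** Strict atomic comparison: same kind and, for decimals, same precision and scale. */
    static boolean atomicTypeEquals(Node source, Node target) {
        return source.kind == target.kind
            && Objects.equals(source.precision, target.precision)
            && Objects.equals(source.scale, target.scale);
    }

    static boolean isStrictProjectionOf(Node source, Node target) {
        return isProjectionOf(source, target, ProjectionSketch::atomicTypeEquals);
    }

    public static void main(String[] args) {
        Node source = Node.struct()
            .with("id", Node.atom(Kind.LONG))
            .with("price", Node.decimal(10, 2));
        Node sameDecimal = Node.struct().with("price", Node.decimal(10, 2));
        Node widerDecimal = Node.struct().with("price", Node.decimal(12, 2));

        System.out.println(isStrictProjectionOf(source, sameDecimal));
        System.out.println(isStrictProjectionOf(source, widerDecimal));
    }
}
```

In this sketch a target that keeps the decimal field with the same precision and scale counts as a strict projection, while one that changes the precision does not. Whether Hudi draws the line in exactly the same place is not stated in this PR description.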

Impact

  • Avoids unnecessary record rewriting when merging with records in the base file if the schema contains a decimal-type field, which improves write performance.

Risk level (write none, low, medium, or high below)

low

Documentation Update

Describe any necessary documentation update if there is any new feature, config, or user-facing change. If not, put "none".

  • The config description must be updated if new configs are added or the default values of configs are changed
  • Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
    ticket number here, and follow the instructions to make
    changes to the website.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@github-actions github-actions bot added the size:S PR with lines of changes in (10, 100] label Jan 21, 2025
@cshuo cshuo changed the title from "[HUDI-8004] avoid unnecessary record rewriting for merge handler" to "[HUDI-8004] avoid unnecessary record rewrite during merging with base file" Jan 21, 2025
@cshuo
Contributor Author

cshuo commented Jan 21, 2025

@hudi-bot run azure

return isProjectionOfInternal(sourceSchema, targetSchema, AvroSchemaUtils::isAtomicTypeStrictProject);
}

private static boolean isAtomicTypeStrictProject(Schema source, Schema target) {
Contributor

isAtomicTypeEquals ?

@danny0405
Contributor

@cshuo Can you take care of the Azure CI failures?

@cshuo
Contributor Author

cshuo commented Jan 24, 2025

@cshuo Can you take care of the Azure CI failures?

The Azure CI failed due to a timeout. There is a JIRA ticket tracking the problem: https://issues.apache.org/jira/browse/HUDI-8893.

@cshuo
Contributor Author

cshuo commented Jan 24, 2025

@hudi-bot run azure

@hudi-bot

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

3 participants