Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow sync all columns for Delta incremental models #1088

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

Jeremynadal33
Copy link

resolves #594
docs dbt-labs/docs.getdbt.com/#

Problem

When having an incremental model with Delta file format, we could not use the sync_all_columns as on_schema_change because in the beginning, Delta did not support dropping columns. It now can (see issue 594 for details) but only when having certains table properties.

Solution

As suggested in issue 594, I added a check on current table properties, comparing it to expected table properties for allowing dropping columns. It then raises an error if trying to remove columns with right table properties.

If table properties are correct, it then first add new columns and then remove columns.

DISCLAIMERS :

  • There is a repeating step, I don't know if I should refacto this into another macro ?
  • I did not add unit tests for this new behavior. I saw PR Convert incremental on_schema_change tests #593 but I did not have time to understand the process yet. If possible, I would love an explanation on this, else I will try to understand on my own when I can!

Any feedback on this one will be much appreciated since this is my first open source PR on such a big project, thanks in advance !

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

@Jeremynadal33 Jeremynadal33 force-pushed the feature/allow-delta-drop-columns branch from 6ba88de to 2ccad38 Compare August 21, 2024 08:05
@sp-cveeragandham
Copy link

Any update on this PR? We have implemented an override macro as a workaround but would be nice if this gets fixed in the adapter code base.

@ggng-jaz
Copy link

ggng-jaz commented Sep 9, 2024

Hi there, is there an update on this PR? We have the same issue with incremental models in databricks in databricks/dbt-databricks#780 , thanks.

@Jeremynadalmirakl
Copy link

I apologize, I don't have time to investigate more currently. If anyone wants to add its stone, feel free to copy the code or tell me and I will give you access to my repo so you can push... Otherwise, I will come back to this as soon as I have enough time 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-1842] [Feature] Support on_schema_change='sync_all_columns' for Delta tables
4 participants