Feature/v0.0.10 #229
ravi-databricks commented on Sep 12, 2025
- Added support for CDC Multiple sequence cols PR
- Added custom function support for kafka and delta tables PR
- Updated the project overview and features tables in docs + readme
- Updated release notes and changelogs
Adding multiple col support for auto_cdc api
Added custom function support for kafka and delta tables
sequence_by = cdc_apply_changes.sequence_by
if ',' in sequence_by:
    sequence_cols = [col.strip() for col in sequence_by.split(',')]
    sequence_by = struct(*sequence_cols)  # Use struct() from pyspark.sql.functions
How do we guarantee consistent ordering of the sequencing columns? "col1, col2" and "col2, col1" are different structs and will have different sorting and uniqueness logic.
Yes! The code preserves the order exactly as provided:
- split(',') returns tokens in the original order
- struct(*sequence_cols) builds the struct with fields in that same order
So "col1, col2" and "col2, col1" produce different structs. That is expected, and swapping the order changes the sequencing semantics.
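To make the ordering point concrete, here is a minimal pure-Python sketch. It is not the code from this PR: the helper names `parse_sequence_cols` and `sequence_key` and the sample rows are hypothetical, and a Python tuple stands in for the Spark struct, on the assumption that both compare field by field, left to right.

```python
from typing import Any, Dict, List, Tuple


def parse_sequence_cols(sequence_by: str) -> List[str]:
    """Split a comma-separated sequence_by string, keeping the given order."""
    return [col.strip() for col in sequence_by.split(",")]


def sequence_key(row: Dict[str, Any], sequence_by: str) -> Tuple[Any, ...]:
    # Tuple used as a stand-in for struct(*sequence_cols):
    # comparison goes field by field, left to right.
    return tuple(row[col] for col in parse_sequence_cols(sequence_by))


# Hypothetical change records for the same key.
rows = [
    {"id": "a", "col1": 1, "col2": 9},
    {"id": "b", "col1": 2, "col2": 3},
]

# The "latest" record depends on which column comes first.
latest_12 = max(rows, key=lambda r: sequence_key(r, "col1, col2"))
latest_21 = max(rows, key=lambda r: sequence_key(r, "col2, col1"))
print(latest_12["id"], latest_21["id"])  # prints "b a"
```

With "col1, col2" the second record wins because col1 is compared first; with "col2, col1" the first record wins. The column order in the config is therefore part of the sequencing contract and should be chosen deliberately.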
sounds good.
Overall, it looks good. Just one concern regarding the ordering of the sequence_by columns.
LGTM
Looks good