Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix invalid UPDATE stmt when none of the columns change #883

Conversation

arajkumar
Copy link
Contributor

@arajkumar arajkumar commented Sep 18, 2024

When all column values in the SET clause are equal to those in the WHERE clause, we remove all columns from the SET clause. This results in an invalid UPDATE statement like the one shown below:

UPDATE table SET WHERE "id" = 1;

Usually, the above could happen when REPLICA IDENTITY is set to FULL, and the UPDATE statement executed with the same values as the old ones.
For e.g.

UPDATE table SET "id" = 1 WHERE "id" = 1;

Solution: Skip the UPDATE when all column values in SET clause are equal to the WHERE clause values.

Fixes #750

@arajkumar arajkumar marked this pull request as draft September 18, 2024 14:35
@arajkumar
Copy link
Contributor Author

Hmmm, I think the alternative approach is the right one.

@dimitri
Copy link
Owner

dimitri commented Sep 18, 2024

Do we need to worry about triggers on the target side? UPDATE statement that look like they change nothing can be used to execute a trigger... on the replica system though, this would have to be a specifically crafted trigger that still is executed with session_replication_role set to replica.

@arajkumar
Copy link
Contributor Author

@dimitri, Should we really worry about replica triggers? Other option would be just eliminating the IDENTITY column and revert the duplication removal, as we already have the generated column infra.

@arajkumar arajkumar force-pushed the arajkumar/fix-update-on-single-column-table-upstream branch from 0f8ab67 to e1096ff Compare September 18, 2024 17:34
@arajkumar arajkumar marked this pull request as ready for review September 18, 2024 17:36
@arajkumar
Copy link
Contributor Author

@dimitri In this PR, I'm skipping the UPDATE stmt when all columns in SET clause are duplicates of WHERE clause values.

Once we fix #844, we shall also remove the duplicate removal logic.

@arajkumar arajkumar force-pushed the arajkumar/fix-update-on-single-column-table-upstream branch from e1096ff to d4bed85 Compare September 18, 2024 18:13
@arajkumar arajkumar force-pushed the arajkumar/fix-update-on-single-column-table-upstream branch from d4bed85 to a6a586b Compare October 3, 2024 14:19
Copy link
Owner

@dimitri dimitri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally had time to review. Thanks for the great work again!

I was initially confused about what we already do in pgcopydb before your PR and what's added in your PR. I now got clarity on that, and appreciate your changes. Apart from the nitpicking in the review, I think we should add some mention of what we're doing here in the docs.

Maybe between https://pgcopydb.readthedocs.io/en/latest/ref/pgcopydb_follow.html#replica-identity-and-lack-of-primary-keys and https://pgcopydb.readthedocs.io/en/latest/ref/pgcopydb_follow.html#logical-decoding-pre-fetching in a new section titled “Statement Rewriting” or something.

src/bin/pgcopydb/ld_transform.c Outdated Show resolved Hide resolved
src/bin/pgcopydb/ld_transform.c Outdated Show resolved Hide resolved
@arajkumar
Copy link
Contributor Author

@dimitri Thanks for reviewing this PR.

I think we should add some mention of what we're doing here in the docs.

I don't think we should add to the doc, This is just a big fix to avoid generation of invalid SQL UPDATE statement. Also, I believe we should just implement skipping identity columns being included as part of UPDATE's SET clause(#844) and remove the deduplication logic all together. WDYT?

We need at least two columns to be able to skip a column
if the value is the same in the old and new rows.

Otherwise, we would end up with an invalid UPDATE statement
like below:
```
UPDATE table SET WHERE "id" = 1;
```

Usually, the above could happen when REPLICA IDENTITY is set
to FULL, and the UPDATE statement executed with the same
values as the old ones.
For e.g.
```
UPDATE table SET "id" = 1 WHERE "id" = 1;
```

Solution: Skip the update when all columns in SET clause is equal to the WHERE clause.

Signed-off-by: Arunprasad Rajkumar <[email protected]>
@arajkumar arajkumar force-pushed the arajkumar/fix-update-on-single-column-table-upstream branch from a6a586b to 50af2d3 Compare October 4, 2024 03:03
@arajkumar arajkumar requested a review from dimitri October 4, 2024 03:16
@dimitri dimitri merged commit 8361a93 into dimitri:main Oct 4, 2024
20 checks passed
@dimitri
Copy link
Owner

dimitri commented Oct 4, 2024

I don't think we should add to the doc, This is just a big fix to avoid generation of invalid SQL UPDATE statement.

As a change of behavior, I think you're right, no need to document the fix. That said, I think our overall approach should be documented. Saying that makes me realize it's not fair to add that burden to your PR (now merged), I will think more about what I think is needed and take care of it.

Also, I believe we should just implement skipping identity columns being included as part of UPDATE's SET clause(#844) and remove the deduplication logic all together. WDYT?

First part, agreed. Second part (remove the deduplication logic), I'm not sure I understand it. Can you provide a couple examples showing what we do, why it's wrong, and what you propose we should do instead?

@dimitri dimitri added the bug Something isn't working label Oct 4, 2024
@dimitri dimitri added this to the v0.18 milestone Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migration failed while update for schema containing composite primary key
2 participants