Wrap updates in transaction to avoid one BEGIN/COMMIT per row #56

Open
wants to merge 1 commit into
base: master
from

Conversation

JasonBarnabe
Contributor

@JasonBarnabe JasonBarnabe commented Mar 29, 2018

Given n rows processed in m batches, 3n statements are currently sent to the DB for updates: a BEGIN, an UPDATE, and a COMMIT for each row. If each batch's updates were wrapped in a single transaction, only n + 2m statements would be sent: n UPDATEs plus one BEGIN and one COMMIT per batch.
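To illustrate the pattern, here is a minimal sketch of one transaction per batch. It is not this project's code: it uses Python's built-in `sqlite3` module rather than Postgres, and the `users` table, the `anonymize_in_batches` helper, and the replacement email format are all hypothetical.

```python
import sqlite3


def anonymize_in_batches(conn, batch_size):
    """Rewrite every email, committing once per batch instead of once per row."""
    ids = [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")]
    batches = 0
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        # `with conn` wraps the whole batch in one transaction:
        # it commits when the block succeeds and rolls back on an exception.
        with conn:
            for row_id in batch:
                conn.execute(
                    "UPDATE users SET email = ? WHERE id = ?",
                    (f"user{row_id}@example.com", row_id),
                )
        batches += 1
    return batches


# Hypothetical demo data: 10 rows, processed in batches of 4.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [(f"real{i}@corp.test",) for i in range(10)],
)
conn.commit()

batch_count = anonymize_in_batches(conn, batch_size=4)
emails = [row[0] for row in conn.execute("SELECT email FROM users ORDER BY id")]
```

One trade-off of this design is that an error partway through a batch rolls back the whole batch rather than a single row, which may be part of why making it configurable is worth considering.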

Benchmark on a local PostgreSQL table with 40,000 rows, batch size 1000, anonymizing a single email field:

Before changes: 2m 52s
With transactions: 2m 26s (15% faster)
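For a sense of scale, the statement counts for this benchmark can be worked out from the 3n and n + 2m formulas. The function names below are made up for illustration, and the batch count assumes the last batch may be partial.

```python
def statements_with_commit_per_row(n):
    """BEGIN + UPDATE + COMMIT for every row."""
    return 3 * n


def statements_with_transaction_per_batch(n, m):
    """One UPDATE per row, plus one BEGIN and one COMMIT per batch."""
    return n + 2 * m


rows, batch_size = 40_000, 1_000
batches = -(-rows // batch_size)  # ceiling division: 40 batches

before = statements_with_commit_per_row(rows)
after = statements_with_transaction_per_batch(rows, batches)
print(before, after, before - after)  # 120000 40080 79920
```

So the change sends roughly a third as many statements, while the measured wall-clock gain is about 15%, which suggests the UPDATEs themselves account for most of the run time.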

I'm not sure if this would have undesired effects for others, so maybe this should be configurable?

@coveralls

coveralls commented Mar 29, 2018

Coverage Status

Coverage decreased (-2.3%) to 91.541% when pulling d4f1c30 on kickbooster:transactions into db4f509 on sunitparekh:master.

@JasonBarnabe
Contributor Author

Note that #57 avoids transactions altogether, which brings the number of statements down to n.

2 participants