This repository was archived by the owner on Sep 2, 2025. It is now read-only.

[Microbatch] Optimizations: use view for temp relation + remove using clause during delete statement #1192

Merged
MichelleArk merged 6 commits into main from microbatch-use-temp-view on Nov 6, 2024

@MichelleArk MichelleArk commented Sep 26, 2024

resolves #1228
docs dbt-labs/docs.getdbt.com/# N/A

Problem

Microbatch performance is underwhelming for dbt-snowflake! A couple of underlying reasons:

  1. Microbatch largely inherits from the delete+insert strategy in Snowflake, which has been optimized to use a temp view instead of a temp table. However, microbatch is still using a temp table! More details: https://docs.getdbt.com/reference/resource-configs/snowflake-configs#temporary-tables
     • Note that it is always safe to use a view for microbatch, since no unique_key is necessary
  2. The using clause is unnecessary in the microbatch delete statement, because microbatch does not require a merge key and the where clause is sufficient for a safe deletion.
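
To make the two points above concrete, here is a minimal sketch of a single batch run, assuming SQLite as a stand-in for Snowflake. The table names (`target`, `source`), the view name (`batch_tmp`), the `event_time` column, and the `run_microbatch` helper are all hypothetical, and this is not the SQL that dbt-snowflake generates. It only illustrates a temp view feeding the insert, and a delete that relies on a where clause over the batch window with no using clause.

```python
import sqlite3


def run_microbatch(conn, batch_start, batch_end):
    """Replace one event-time window in `target` with rows from `source`.

    Hypothetical helper for illustration only.
    """
    cur = conn.cursor()

    # Temp view instead of temp table: nothing is materialized up front.
    # SQLite does not allow bound parameters inside a view definition,
    # so the window bounds are inlined here (acceptable for a sketch only).
    cur.execute("drop view if exists batch_tmp")
    cur.execute(
        "create temp view batch_tmp as "
        "select id, event_time from source "
        f"where event_time >= '{batch_start}' and event_time < '{batch_end}'"
    )

    # Delete by batch window only: no join against the temp relation,
    # because no unique_key / merge key is needed to find the rows.
    cur.execute(
        "delete from target where event_time >= ? and event_time < ?",
        (batch_start, batch_end),
    )

    # Re-insert the batch from the view.
    cur.execute(
        "insert into target (id, event_time) "
        "select id, event_time from batch_tmp"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table target (id integer, event_time text);
    create table source (id integer, event_time text);
    insert into target values (1, '2024-11-01'), (2, '2024-11-02');
    insert into source values
        (20, '2024-11-02'), (21, '2024-11-02'), (3, '2024-11-03');
    """
)

run_microbatch(conn, "2024-11-02", "2024-11-03")
rows = conn.execute("select id, event_time from target order by id").fetchall()
print(rows)
```

Rows outside the window (id 1 in `target`, id 3 in `source`) are untouched, which is the sense in which the where clause alone is enough for a safe deletion.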

Solution

  1. Apply the temp view optimization to microbatch. This was just a matter of extending a couple of if statements that previously only checked for delete+insert
  2. Remove the extraneous using clause from the microbatch delete statement
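
As a rough illustration of the first change, the branch logic can be sketched in Python. The function name `tmp_relation_type` is hypothetical, the real logic lives in a Jinja macro, and this sketch covers only the branches visible in this PR (it ignores any explicit config override or other conditions the macro may handle). The `"table"` fallback is an assumption based on the description that microbatch previously used a temp table.

```python
from typing import Optional


def tmp_relation_type(strategy: str, unique_key: Optional[str] = None) -> str:
    """Pick the temp relation type for an incremental strategy (sketch only)."""
    if strategy in ("default", "merge", "append"):
        return "view"
    # Before this PR the check was effectively: strategy == "delete+insert".
    # Extending it to microbatch is the whole optimization.
    if strategy in ("delete+insert", "microbatch") and unique_key is None:
        return "view"
    # Assumed fallback: anything else keeps using a temp table.
    return "table"


print(tmp_relation_type("microbatch"))
print(tmp_relation_type("delete+insert", unique_key="id"))
```

With no `unique_key`, microbatch now takes the same view branch as delete+insert; a delete+insert model with a `unique_key` still falls through to the table fallback in this sketch.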

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

@cla-bot cla-bot Bot added the cla:yes label Sep 26, 2024
@github-actions commented

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the dbt-snowflake contributing guide.

@MichelleArk MichelleArk changed the title optimization: use view for temp relation for microbatch [Microbatch] Optimizations: use view for temp relation + remove using clause during insert Nov 1, 2024
@MichelleArk MichelleArk changed the title [Microbatch] Optimizations: use view for temp relation + remove using clause during insert [Microbatch] Optimizations: use view for temp relation + remove using clause during delete statement Nov 1, 2024
@MichelleArk MichelleArk marked this pull request as ready for review November 5, 2024 17:03
@MichelleArk MichelleArk requested a review from a team as a code owner November 5, 2024 17:03

@QMalcolm QMalcolm left a comment

Looks good to me! :shipit:

  {% elif strategy in ("default", "merge", "append") %}
    {{ return("view") }}
- {% elif strategy == "delete+insert" and unique_key is none %}
+ {% elif strategy in ["delete+insert", "microbatch"] and unique_key is none %}

🚀 This is a greeeeaaaat catch! Nice!

Development

Successfully merging this pull request may close these issues.

[Bug] Cartesian Join based deletion is causing performance problems when it hits a certain scale for microbatch models