
[Spike] Conflict with concurrent transaction #903

Open
jennlw opened this issue Sep 6, 2024 · 5 comments
Assignees: mikealfare
Labels: bug (Something isn't working)

Comments


jennlw commented Sep 6, 2024

Is this a new bug in dbt-redshift?

  • I believe this is a new bug in dbt-redshift
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Job runs sporadically hit a "conflict with concurrent transaction" error. The failure is inconsistent; the customer cannot reproduce it on every execution. In one of their dbt Cloud environments (5 threads) it happened twice, but on their local dbt Core instances (20+ threads) they were able to reproduce it on roughly every third execution. In their local executions, a different model is the victim.

They checked their logs and could not find any other thread running in parallel that would cause a conflict.

Expected Behavior

The job run does not fail with a "conflict with concurrent transaction" error.

Steps To Reproduce

Not reproduced yet, but the customer mentioned:

"Neither of the models are using incremental materialization, only thing that might be related to this failure is that these models being used in the materialized view downstream."

I ran this past our team internally, and it was mentioned:

I suspect this could be the culprit. I remember that querying the object dependency tree in Redshift is not easy when using MVs. So I wonder if dbt is dropping the MV in parallel with the refresh of this table, and that's attempting to drop this table at the same time.
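
To make the suspected race concrete, here is a simplified sketch pieced together from the drop statement in the log below; the object names, the downstream MV name, and the exact statement sequence are illustrative, not taken from the actual materialization macros:

-- Thread A (sketch): rebuilding the table model; names follow the log pattern
create table "abcd"."dbt_1234"."modelA__dbt_tmp" as (/* model SQL */);
alter table "abcd"."dbt_1234"."modelA" rename to "modelA__dbt_backup";
alter table "abcd"."dbt_1234"."modelA__dbt_tmp" rename to "modelA";
drop table if exists "abcd"."dbt_1234"."modelA__dbt_backup" cascade; -- fails with the conflict error

-- Thread B (sketch, hypothetical MV name): replacing a downstream materialized view
-- that may still track the renamed (backup) table by object id, so both
-- transactions end up touching the same objects at the same time
drop materialized view if exists "abcd"."dbt_1234"."downstream_mv" cascade;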

Relevant log output

2024-09-03 00:40:17.026263 (Thread-3 (worker)): 00:40:17 On model.A_dbt.modelA: /* {"app": "dbt", "dbt_version": "1.7.17", "profile_name": "user", "target_name": "default", "node_id": "model.A_dbt.modelA"} */ drop table if exists "abcd"."dbt_1234"."modelA__dbt_backup" cascade

2024-09-03 00:40:22.922586 (Thread-3 (worker)): 00:40:22 Redshift adapter: Redshift error: could not complete because of conflict with concurrent transaction
2024-09-03 00:40:22.923500 (Thread-3 (worker)): 00:40:22 On model.A_dbt.modelA: ROLLBACK
2024-09-03 00:40:23.147961 (Thread-3 (worker)): 00:40:23 Redshift adapter: Error running SQL: macro drop_relation

Environment

- OS:
- Python:
- dbt-core: 1.7.17 (cloud)
- dbt-redshift: 1.7.7

Additional Context

Please feel free to reach out to me and we can discuss the logs further. I did not attach them here because they are customer logs, not logs from a dbt sandbox or test environment.

@jennlw jennlw added the triage label Sep 6, 2024
@mikealfare mikealfare self-assigned this Sep 6, 2024
@mikealfare mikealfare added bug Something isn't working and removed triage labels Sep 6, 2024

jennlw commented Sep 6, 2024

I should also note that I removed the customer's model names and substituted fake ones. Please reach out to me if you need more information.

colin-rogers-dbt (Contributor) commented

@jennlw has this behavior been seen in earlier dbt versions as well? When was this behavior first seen?

@colin-rogers-dbt colin-rogers-dbt changed the title Conflict with concurrent transaction [Spike] Conflict with concurrent transaction Sep 13, 2024

jennlw commented Sep 16, 2024

Hey Colin, I've reached out and I'm asking if it was seen prior to version 1.7. I'll let you know what I find.


jennlw commented Sep 18, 2024

Hey Colin, the customer stated that they have only experienced the issue on 1.7, as this piece of code has only run on that version.


VersusFacit commented Nov 5, 2024

Spike

My goal was to whittle this down to the simplest setup that reproduces this error as often as possible, and to gather insight from those results.

First, this is a very tricky issue to reproduce. I had much, much more luck on main than 1.7.7 or 1.7.18.

  • project: Jaffle shop classic
  • all seeds and models already in jaffle shop classic are set to materialize as table
  • orders[1-100].sql copies are materialized as either view, table, or materialized_view (at one point I was testing with 500, but that took too long, and I observed that the concurrency error occurred early if it occurred at all); a sketch of one such model file follows this list
  • 25 threads
  • these tests were done at night to minimize conflicts with other CI runs
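
A minimal sketch of one of the generated model files in this setup; the file name and SQL body are illustrative, and only the materialized value was varied between trials:

-- models/orders_42.sql (one of the orders[1-100].sql copies; body is illustrative)
{{ config(materialized='materialized_view') }}  -- swapped to 'view' or 'table' per trial

select * from {{ ref('orders') }}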

Results

[debug] [Thread-1 (]: Applying DROP to: "ci"."dbt_mila"."stg_customers__dbt_backup"
[debug] [Thread-1 (]: Using redshift connection "model.jaffle_shop.stg_customers"
[debug] [Thread-1 (]: On model.jaffle_shop.stg_customers: /* {"app": "dbt", "dbt_version": "1.9.0b3", "profile_name": "redshift", "target_name": "dev", "node_id": "model.jaffle_shop.stg_customers"} */
drop table if exists "ci"."dbt_mila"."stg_customers__dbt_backup" cascade
[debug] [Thread-3 (]: Redshift adapter: Redshift error: could not complete because of conflict with concurrent transaction
  1. The stg_payments table is where the concurrent transaction failure happens most often, if not every time.
  2. table materialization hits this only rarely, maybe once out of 10+ runs trying to trigger it.
  3. view materialization never failed on me in 10 runs.
  4. materialized_view materializations hit this SUPER often, independent of on_configuration_change -- all on main:
    • Apply -- 2/5 (only lower by laws of probability imo, but reporting my formal trials as they happened for transparency)
    • Continue -- 3/5
    • Fail -- 4/5
  5. Out of 10 attempts on 1.7.18, I got the concurrency error only once.

Other questions I had

Why is the focal point stg_payments, the last of the non-materialized-view tables?

Why is its drop causing this error?

Why are materialized views the main issue here?
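
One follow-up idea, which is an assumption on my part rather than something run during this spike: when the error fires, Redshift's SVV_TRANSACTIONS system view should show which other in-flight transactions hold locks on the objects involved, which might help pin down why stg_payments is the victim:

-- Diagnostic sketch: list in-flight transactions and the locks they hold around
-- the time the DROP ... CASCADE fails; joining "relation" back to object names
-- is left out to keep this minimal.
select xid,
       pid,
       txn_owner,
       txn_start,
       lock_mode,
       lockable_object_type,
       relation,
       granted
from svv_transactions
order by txn_start desc;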
