
[Bug] Index errors when using split_part #1132

Open
2 tasks done
benc-db opened this issue Oct 29, 2024 · 0 comments
Labels
bug Something isn't working triage

Comments


benc-db commented Oct 29, 2024

Is this a new bug in dbt-spark?

  • I believe this is a new bug in dbt-spark
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When called with a part index that is out of bounds while ANSI mode (spark.sql.ansi.enabled=true) is on, the split_part macro raises an exception.

Expected Behavior

Per the tests in BaseSplitPart in the adapter test suite, the expectation is that this macro can be invoked with a part index greater than the number of parts generated, without throwing an exception. Specifically, this row in the seed exercises that case:

,|,,,,

We can accommodate this behavior by using get rather than indexing the array directly, but get is only available in Spark 3.4.0 and later.
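As a sketch of the proposed fix (hypothetical; the actual macro body in dbt-spark may differ, e.g. in how it escapes the delimiter or handles negative indexes), the direct array index could be replaced with Spark's get function, which returns NULL for an out-of-bounds index instead of raising an error under ANSI mode:

```sql
-- Current approach (sketch): direct indexing raises INVALID_ARRAY_INDEX
-- when the index is out of bounds and spark.sql.ansi.enabled=true:
--   split(string_text, delimiter_text)[part_number - 1]

-- Proposed approach (requires Spark 3.4.0+): get() returns NULL instead
-- of failing for out-of-bounds indexes, matching the BaseSplitPart tests:
select get(split(string_text, delimiter_text), part_number - 1)
```

Here string_text, delimiter_text, and part_number stand in for the macro's arguments; the names are illustrative, not taken from the macro source.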

Steps To Reproduce

  1. Set spark.sql.ansi.enabled=true
  2. Call split_part, passing an out-of-bounds part index
  3. Observe the exception
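The steps above can be reproduced directly in a Spark SQL session (a minimal sketch; the exact error class reported may vary by Spark version):

```sql
SET spark.sql.ansi.enabled=true;

-- The array from split() has 2 elements, so index 5 is out of bounds;
-- under ANSI mode this raises an array-index error rather than returning NULL:
SELECT split('a|b', '\\|')[5];

-- With ANSI mode off, the same expression returns NULL instead of failing.
```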

Relevant log output

No response

Environment

This issue has existed for a while, but I'm only hitting it now due to new defaults in a Databricks environment I was asked to test against.

Additional Context

No response

@benc-db benc-db added bug Something isn't working triage labels Oct 29, 2024