Column descriptions with backslashes cause database errors #529

Open
1 of 6 tasks
PaddyAlton opened this issue Feb 5, 2025 · 0 comments
Labels
bug Something isn't working triage

PaddyAlton commented Feb 5, 2025

Describe the bug

We had been using v0.8.0 of dbt_project_evaluator for some time. When attempting to upgrade we encountered a database error in the base_source_columns and base_node_columns models (introduced in v0.9.0).

It looks like the specific bug I am reporting emerged in v0.11.0 (following a fix for a different bug related to multiline column descriptions). It's possible (but not certain) that the bug only occurs with BigQuery.

Example error message: Database Error in model base_node_columns (models/staging/graph/base/base_node_columns.sql) Syntax error: Illegal escape sequence: \S

Running in debug mode (using v1.0.0 of the package), I was able to verify that the issue is due to backslashes being present in a multiline column description which is then passed to BigQuery when we try to insert values into the table.

Steps to reproduce

  • use v0.11.0 - v1.0.0 of the package
  • use the BigQuery dbt adapter (v1.9.1 in my tests)
  • set up a model YAML file that contains a column description with single backslashes preceding a letter
  • run dbt run --select base_node_columns

Expected results

I expect the problem models to run without incident. The literal backslashes included in my column descriptions should be escaped if necessary.

Actual results

Received the following error:
Database Error in model base_node_columns (models/staging/graph/base/base_node_columns.sql) Syntax error: Illegal escape sequence: \S

(plus another similar error in base_source_columns)

Screenshots and log output

Here is the relevant output from dbt run --debug --select base_node_columns:

insert into [project name].`dbt_dev`.`base_node_columns` values
(
...
  
    '''model.[project].[model name]'''
  
, 

  
    '''[column_name]'''
  
, 

  
    '''A multiline description containing example values that might
be held in this column. These examples include one with a format
like something\Something\Another\Thing 
'''
  
, 

  
    '''None'''
  
, 

  
    '''[]'''
  
, 
False, 
0, 

  
    '''None'''

...
)
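
To illustrate why BigQuery rejects the statement above, here is a minimal Python sketch that scans a description for backslashes that do not start a recognised escape sequence. The set of permitted characters reflects my reading of BigQuery's string literal rules and should be treated as an assumption. The function name `find_illegal_escapes` is hypothetical, and the octal/hex/unicode handling is deliberately simplified (it only checks the first character after the backslash).

```python
# Characters that may follow a backslash in a BigQuery quoted string literal.
# ASSUMPTION: taken from my reading of BigQuery's lexical rules; digits 0-7
# begin octal escapes, x/X hex escapes, u/U unicode escapes.
ALLOWED_AFTER_BACKSLASH = set("abfnrtv\\?\"'`01234567xXuU")


def find_illegal_escapes(text: str) -> list[str]:
    """Return each backslash sequence that is not a recognised escape."""
    illegal = []
    i = 0
    while i < len(text):
        if text[i] == "\\":
            # Character following the backslash ("" if the backslash is last).
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt not in ALLOWED_AFTER_BACKSLASH:
                illegal.append("\\" + nxt)
            i += 2  # skip the backslash and the character it applies to
        else:
            i += 1
    return illegal


# The description from the debug output above (backslashes are literal).
description = (
    "A multiline description containing example values that might\n"
    "be held in this column. These examples include one with a format\n"
    "like something\\Something\\Another\\Thing \n"
)
print(find_illegal_escapes(description))
```

The first sequence reported is `\S`, which matches the sequence named in the error message.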

System information

The contents of your packages.yml file:

packages:
  - package: calogica/dbt_expectations
    version: 0.10.4
  - package: data-mie/dbt_profiler
    version: 0.8.4
  - package: dbt-labs/codegen
    version: 0.13.1
  - package: dbt-labs/dbt_project_evaluator
    version: 1.0.0
  - package: dbt-labs/dbt_utils
    version: 1.3.0
  - package: dbt-labs/logging
    version: 0.8.0

Which database are you using dbt with?

  • [ ] postgres
  • [ ] redshift
  • [x] bigquery
  • [ ] snowflake
  • [ ] trino/starburst
  • [ ] other (specify: ____________)

The output of dbt --version:

Core:
  - installed: 1.9.2
  - latest:    1.9.2 - Up to date!

Plugins:
  - bigquery: 1.9.1 - Up to date!

Additional context

I think that the fix should probably be to escape backslashes in column descriptions before they are inserted into the table.
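
As a rough illustration of that idea, here is a minimal Python sketch that doubles every backslash before wrapping the description in a triple-quoted literal, as in the debug output. It is not the package's code: the function names `escape_backslashes` and `to_triple_quoted_literal` are hypothetical. In the package itself the equivalent would presumably be done in Jinja (for example with the built-in `replace` filter), but I have not checked where that would belong.

```python
def escape_backslashes(description: str) -> str:
    """Double each backslash so it is read as a literal backslash."""
    return description.replace("\\", "\\\\")


def to_triple_quoted_literal(description: str) -> str:
    """Wrap an escaped description in triple single quotes.

    ASSUMPTION: the description does not itself contain three consecutive
    single quotes; handling that case is out of scope for this sketch.
    """
    return "'''" + escape_backslashes(description) + "'''"


raw = "like something\\Something\\Another\\Thing"
print(to_triple_quoted_literal(raw))
```

With this change each `\S` in the description is sent as `\\S`, which BigQuery should read as a literal backslash followed by `S`.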

Are you interested in contributing the fix?

I'm not sure how I would get started, as it would be my first contribution to this project. In principle, though, I would be happy to contribute.

@PaddyAlton PaddyAlton added bug Something isn't working triage labels Feb 5, 2025