
Conversation

@sundarshankar89
Collaborator

Changes

What does this PR do?

Improves the assessment pipeline to handle SQL queries that return empty result sets (0 rows) without failing or creating empty tables. The change validates the row count before attempting any table operation and logs clearly when a step is skipped.
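The guard described above can be sketched roughly as follows. This is a minimal sketch, not the PR's actual code: it uses sqlite3 in place of the pipeline's DuckDB connection, and `save_results`, `step_name`, and the result shape are illustrative assumptions.

```python
import logging
import sqlite3

def save_results(conn, step_name, columns, rows):
    """Create and populate a per-step table, skipping entirely on 0 rows.

    Hypothetical helper illustrating the row-count guard; all names here
    are assumptions, not the pipeline's real API.
    """
    row_count = len(rows)
    if row_count == 0:
        # Empty result set: do not create a table, just log and bail out.
        logging.info("Query for step '%s' returned 0 rows; skipping table creation.", step_name)
        return False
    schema = ', '.join(f"{col} TEXT" for col in columns)
    conn.execute(f"CREATE TABLE {step_name} ({schema})")
    placeholders = ', '.join(['?'] * len(columns))
    conn.executemany(f"INSERT INTO {step_name} VALUES ({placeholders})", rows)
    logging.info("Successfully inserted %d rows into table '%s'.", row_count, step_name)
    return True

conn = sqlite3.connect(":memory:")
ok_empty = save_results(conn, "empty_step", ["a", "b"], [])
ok_full = save_results(conn, "full_step", ["a", "b"], [("1", "2")])
print(ok_empty, ok_full)
```

The key point is that the guard runs before `CREATE TABLE`, so a 0-row query leaves no empty table behind.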

Relevant implementation details

Caveats/things to watch out for when reviewing:

Linked issues

Resolves #..

Functionality

  • added relevant user documentation
  • added new CLI command
  • modified existing command: databricks labs lakebridge ...
  • ... +add your own

Tests

  • manually tested
  • added unit tests
  • added integration tests

@codecov

codecov bot commented Dec 2, 2025

Codecov Report

❌ Patch coverage is 0% with 9 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (patch/profiler_test_tmp_path@55b3b87). Learn more about missing BASE report.

Files with missing lines Patch % Lines
...databricks/labs/lakebridge/assessments/pipeline.py 0.00% 9 Missing ⚠️
Additional details and impacted files
@@                       Coverage Diff                       @@
##             patch/profiler_test_tmp_path    #2172   +/-   ##
===============================================================
  Coverage                                ?   63.52%           
===============================================================
  Files                                   ?      100           
  Lines                                   ?     8508           
  Branches                                ?      886           
===============================================================
  Hits                                    ?     5405           
  Misses                                  ?     2936           
  Partials                                ?      167           

☔ View full report in Codecov by Sentry.

@github-actions

github-actions bot commented Dec 2, 2025

✅ 52/52 passed, 6 flaky, 4m28s total

Flaky tests:

  • 🤪 test_transpiles_informatica_to_sparksql_non_interactive[True] (22.751s)
  • 🤪 test_transpiles_informatica_to_sparksql (22.537s)
  • 🤪 test_transpiles_informatica_to_sparksql_non_interactive[False] (4.284s)
  • 🤪 test_transpile_teradata_sql_non_interactive[False] (21.108s)
  • 🤪 test_transpile_teradata_sql_non_interactive[True] (21.306s)
  • 🤪 test_transpile_teradata_sql (6.441s)

Running from acceptance #3156

with duckdb.connect(db_path) as conn:
    # TODO: Add support for figuring out data types from the SQLAlchemy result object; result.cursor.description is not reliable
-   schema = ' STRING, '.join(result.columns) + ' STRING'
+   schema = ', '.join(f"{col} STRING" for col in result.columns)
Contributor

I'm not sure what the source of the column names is, but they should probably be escaped.

A quick glance at the DuckDB docs hints that they don't provide a method for this, but essentially escaping identifiers means something like:

def escape_duckdb_identifier(name: str) -> str:
    escaped = name.replace('"', '""')  # double any embedded quotes
    return f'"{escaped}"'

Collaborator Author

This is for extracting the information schema. Do we want to handle this here? I would rather not accept any SQL query that has non-compliant column names.
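The stricter alternative suggested here (rejecting non-compliant names rather than quoting them) could look something like this; the identifier pattern and the `validate_column_names` helper are assumptions for illustration, not code from the PR:

```python
import re

# Hypothetical validator: accept only simple unquoted identifiers
# (letters, digits, underscores; must not start with a digit).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def validate_column_names(columns: list[str]) -> list[str]:
    """Raise if any column name would need quoting/escaping."""
    bad = [c for c in columns if not _IDENTIFIER.match(c)]
    if bad:
        raise ValueError(f"Non-compliant column names: {bad}")
    return columns

print(validate_column_names(["table_name", "row_count"]))
```

Failing fast at validation time keeps the escaping concern out of every downstream consumer (data retrieval, dashboards), at the cost of rejecting otherwise-valid quoted identifiers.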

Collaborator Author

@sundarshankar89 sundarshankar89 Dec 5, 2025

This escape has a larger footprint, requiring updates for data retrieval and dashboard creation. Can we create a ticket to address this issue everywhere?

Comment on lines +259 to +261
placeholders = ', '.join(['?'] * len(result.columns))
conn.executemany(f"INSERT INTO {step_name} VALUES ({placeholders})", result.rows)
logging.info(f"Successfully inserted {row_count} rows into table '{step_name}'.")
Contributor

Is there a way for us to see this statement in the logging before it executes?

Collaborator Author

I can store it in a pandas DataFrame and print it.
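One simple way to surface the statement before it runs (independent of the pandas suggestion) is to log it at DEBUG level just before `executemany`. A sketch under stated assumptions: sqlite3 stands in for DuckDB, and `insert_rows`/`step1` are illustrative names:

```python
import logging
import sqlite3

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")

def insert_rows(conn, step_name, columns, rows):
    placeholders = ', '.join(['?'] * len(columns))
    statement = f"INSERT INTO {step_name} VALUES ({placeholders})"
    # Log the statement and row count BEFORE execution so a failure is traceable.
    logging.debug("About to execute: %s (%d rows)", statement, len(rows))
    conn.executemany(statement, rows)
    logging.info("Successfully inserted %d rows into table '%s'.", len(rows), step_name)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE step1 (a TEXT, b TEXT)")
insert_rows(conn, "step1", ["a", "b"], [("1", "2"), ("3", "4")])
print(conn.execute("SELECT COUNT(*) FROM step1").fetchone()[0])
```

Logging before the call means the statement appears even when `executemany` itself raises.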

@sundarshankar89 sundarshankar89 changed the base branch from main to patch/profiler_test_tmp_path December 4, 2025 08:11
@sundarshankar89 sundarshankar89 added stacked PR Should be reviewed, but not merged feat/profiler Issues related to profilers labels Dec 4, 2025
Base automatically changed from patch/profiler_test_tmp_path to main December 12, 2025 15:06

Labels

feat/profiler Issues related to profilers stacked PR Should be reviewed, but not merged

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG]: SQL Script execution crashes the profiler where empty output returns

3 participants