Skip to content

Conversation

asnare
Copy link
Contributor

@asnare asnare commented Oct 2, 2025

Changes

This PR updates our transpile command to support some additional arguments:

  • --overrides-file: for specifying an overrides (JSON) file with BB-based conversions.
  • --target-technology: for specifying the target technology when using BB to perform ETL conversions.

Relevant implementation details

Adding support for these options uncovered some related problems relating to the way dialect-specific options are configured and handled at runtime. This meant some yak-shaving, including:

  • During transpile, always checking the dialect-specific options even if the configuration file is missing or invalid.
  • Properly handling the <none> sentinel value used by BB at both configuration time and transpile-time.
  • Tidying up some of the unit test fixtures and tests that cover this area of the code.

Caveats/things to watch out for when reviewing:

Previously it looks like during install-transpile if the user hit enter on the overrides-file option they would end up with the string "None" as the value in the transpiler_options mapping. It's unclear to me where this was ultimately dealt with, or whether these configurations will just cause transpile to fail.

The integration tests here are flaky, and it's unclear why.
[Late addition: I've just added some additional plumbing to the integration tests to help with understanding why they're flaky. For some reason the underlying dbxconv process sometimes terminates with 255 as the exit code. Hopefully this will help understand why.]

Linked issues

Resolves #2070.

Functionality

  • updated relevant user documentation
  • modified existing command: databricks labs lakebridge install

Tests

  • manually tested
  • added unit tests
  • added integration tests

@asnare asnare self-assigned this Oct 2, 2025
@asnare asnare added enhancement New feature or request feat/cli actions that are visible to the user bb converter Issues related to BB converter labels Oct 2, 2025
Copy link

github-actions bot commented Oct 2, 2025

✅ 32/32 passed, 6 flaky, 2m29s total

Flaky tests:

  • 🤪 test_transpiles_informatica_to_sparksql_non_interactive[True] (13.438s)
  • 🤪 test_transpile_teradata_sql_non_interactive[False] (14.427s)
  • 🤪 test_transpiles_informatica_to_sparksql (13.532s)
  • 🤪 test_transpiles_informatica_to_sparksql_non_interactive[False] (4.081s)
  • 🤪 test_transpile_teradata_sql (18.502s)
  • 🤪 test_transpile_teradata_sql_non_interactive[True] (7.398s)

Running from acceptance #2603

asnare added 2 commits October 2, 2025 19:50
Other changes include:
 - Always test against the current (published) version of the BB plugin.
 - Only install and set up the plugin once.
 - Include the config/loading (from the workspace) and verification.
 - Check more of the details from running the transpiler, including the JSON output summary.
 - Remove the async bits.
@asnare asnare requested a review from m-abulazm October 6, 2025 17:20
Copy link
Collaborator

@sundarshankar89 sundarshankar89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I know why this is required, but we need to simplify on a better lsp_config.yml so this plumbing goes away.

@sundarshankar89
Copy link
Collaborator

LGTM, I know why this is required, but we need to simplify on a better lsp_config.yml so this plumbing goes away.

Bold Thought, what if we have install interface from the server side ?

if self.method == LSPPromptMethod.CONFIRM:
return prompts.confirm(self.prompt)
if self.method == LSPPromptMethod.QUESTION:
default = self.default if self.default else "None"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sundarshankar89: This is the bit where I think it would insert the string "None" as a value into the config.yml instead of null.

Copy link
Contributor

@m-abulazm m-abulazm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not blocking. some suggestions need alignment first

return
if transpiler_options is None:
transpiler_options = {}
elif not isinstance(transpiler_options, Mapping):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we properly type this since we know what it looks like instead of passing around Maps e.g.

class  TranspilerOptions
  target_tech: str | None
  overrides_file: Path | None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really know what it looks like though: it's filled in during the questionnaire based on a configuration file bundled with each LSP server.

It certainly can be more constrained than RootJsonValue, but we can't just use a static dataclass.

This PR adds specific support for 2 known options, but there might be others passed via this mechanism. (The code is supposed to prompt for missing values, but #2086 describes how this is futile and leads to the CLI simply hanging.)

# As a corner case, if there is no entry present it means the user wasn't prompted. Here we have
# some complexity. We have two ways of obtaining a value:
# - The user could provide it on the command-line, using --target-technology or --overrides-file.
# Problem: via command-line options there's no way to indicate 'no value'.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use_target_technology and use_overrides_file set <none> if user does not provide it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not the way the code was authored, and configuration files are already out there.

…ather than having the CLI check directly for the sentinel.
@asnare
Copy link
Contributor Author

asnare commented Oct 8, 2025

This PR is blocked on #2089, and once that's merged into this PR the documentation changes in here will need to be refreshed.

github-merge-queue bot pushed a commit that referenced this pull request Oct 9, 2025
## Changes

This PR updates the `labs.yml` file used by the Databricks CLI to locate
and provide `--help` for `databricks labs lakebridge` and subcommands.

Changes include:

- Fixing argument placeholders in flags; many of these were accidentally
filled with misleading text.
 - Marking all flags as optional: we don't have any mandatory flags yet.
- Consistent and concise styling of descriptions. For example, no
trailing periods (`.`).
 - Removing `null` as default values; this is unnecessary.
- Removing the `--skip-validation` default: it's incorrect because the
default value comes from `config.yml` in the workspace, which in turn
normally comes from the value provided during `install-transpile` (if
any).

### Linked issues

Blocks: #2072 

### Functionality

- [x] updated relevant user documentation
- [X] modified existing commands:

  - `databricks labs lakebridge [--help]`
  - `databricks labs lakebridge aggregates-reconcile --help`
  - `databricks labs lakebridge analyse --help`
  - `databricks labs lakebridge configure-database-provider --help`
  - `databricks labs lakebridge configure-reconcile --help`
  - `databricks labs lakebridge describe-transpile --help`
  - `databricks labs lakebridge install-transpile --help`
  - `databricks labs lakebridge reconcile --help`
  - `databricks labs lakebridge transpile --help`

### Tests

- [X] manually tested
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bb converter Issues related to BB converter enhancement New feature or request feat/cli actions that are visible to the user
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE]: Allow dialect-specific command-line arguments during transpile
3 participants