asnare
Contributor

@asnare asnare commented Sep 23, 2025

Changes

This PR implements unit tests for the TranspilerRepository class and extends the coverage of the integration tests. The main aim is to capture existing behaviour so that bugs are not introduced.

Tests

  • added unit tests
  • added integration tests

@asnare asnare self-assigned this Sep 23, 2025
@asnare asnare requested a review from a team as a code owner September 23, 2025 16:01
@asnare asnare added the "tech debt" (design flaws and other cascading effects) and "internal" (technical pr's not end user facing) labels Sep 23, 2025

github-actions bot commented Sep 23, 2025

✅ 36/36 passed, 2 flaky, 1m16s total

Flaky tests:

  • 🤪 test_transpiles_informatica_with_sparksql (9.687s)
  • 🤪 test_transpile_sql_file (8.453s)

Running from acceptance #2395

asnare added a commit that referenced this pull request Sep 25, 2025
…e installed transpilers (#2051)

## Changes

### What does this PR do?

This PR implements a `describe-transpile` subcommand that describes the
currently installed transpilers and associated dialects and
configuration. This is intended for diagnostics and use by the UI. When
run normally, it provides output like this:
```
% databricks labs lakebridge describe-transpile
Transpiler   Installed Version  Plugin Configuration
==========   =================  ====================
Morpheus     0.6.6              /Users/me/.databricks/labs/remorph-transpilers/databricks-morph-plugin/lib/config.yml
Bladebridge  0.1.15             /Users/me/.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml

Supported Source Dialects
=========================
 - datastage
 - informatica (desktop edition)
 - informatica cloud
 - mssql
 - netezza
 - oracle
 - snowflake
 - synapse
 - teradata
 - tsql
```

When the `--output=json` option is provided to the Databricks CLI, more
information is available:
```json
% databricks labs lakebridge describe-transpile --output=json
{
  "available-dialects": [
    "datastage",
    "informatica (desktop edition)",
    "informatica cloud",
    "mssql",
    "netezza",
    "oracle",
    "snowflake",
    "synapse",
    "teradata",
    "tsql"
  ],
  "installed-transpilers": [
    {
      "config-path":"/Users/andrew.snare/.databricks/labs/remorph-transpilers/databricks-morph-plugin/lib/config.yml",
      "name":"Morpheus",
      "supported-dialects": {
        "snowflake": {
          "options": []
        },
        "tsql": {
          "options": []
        }
      },
      "versions": {
        "installed":"0.6.6",
        "latest":null
      }
    },
    {
      "config-path":"/Users/andrew.snare/.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml",
      "name":"Bladebridge",
      "supported-dialects": {
        "datastage": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            },
            {
              "choices": [
                "SPARKSQL",
                "PYSPARK"
              ],
              "flag":"target-tech",
              "method":"CHOICE",
              "prompt":"Specify which technology should be generated"
            }
          ]
        },
        "informatica (desktop edition)": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            },
            {
              "choices": [
                "SPARKSQL",
                "PYSPARK"
              ],
              "flag":"target-tech",
              "method":"CHOICE",
              "prompt":"Specify which technology should be generated"
            }
          ]
        },
        "informatica cloud": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            }
          ]
        },
        "mssql": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            }
          ]
        },
        "netezza": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            }
          ]
        },
        "oracle": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            }
          ]
        },
        "synapse": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            }
          ]
        },
        "teradata": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            }
          ]
        }
      },
      "versions": {
        "installed":"0.1.15",
        "latest":null
      }
    }
  ]
}
```
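
As a rough illustration of how a consumer such as the UI might use the JSON
form, the sketch below parses an abbreviated sample in the shape shown above
and groups the installed transpilers by source dialect. The sample is cut
down, and the `/path/to/...` values and the `transpilers_by_dialect` helper
are assumptions for illustration only; they are not part of this PR.

```python
import json

# Abbreviated sample in the shape shown above. Paths are placeholders and the
# prompt text is elided; real output lists more dialects and options.
SAMPLE = """
{
  "available-dialects": ["oracle", "snowflake", "tsql"],
  "installed-transpilers": [
    {
      "config-path": "/path/to/morpheus/config.yml",
      "name": "Morpheus",
      "supported-dialects": {"snowflake": {"options": []}, "tsql": {"options": []}},
      "versions": {"installed": "0.6.6", "latest": null}
    },
    {
      "config-path": "/path/to/bladebridge/config.yml",
      "name": "Bladebridge",
      "supported-dialects": {
        "oracle": {
          "options": [
            {"default": "<none>", "flag": "overrides-file", "method": "QUESTION", "prompt": "..."}
          ]
        }
      },
      "versions": {"installed": "0.1.15", "latest": null}
    }
  ]
}
"""


def transpilers_by_dialect(description: dict) -> dict[str, list[str]]:
    """Map each source dialect to the names of the transpilers that support it."""
    result: dict[str, list[str]] = {}
    for transpiler in description["installed-transpilers"]:
        for dialect in transpiler["supported-dialects"]:
            result.setdefault(dialect, []).append(transpiler["name"])
    return result


description = json.loads(SAMPLE)
by_dialect = transpilers_by_dialect(description)
```

Note that `"latest": null` arrives in Python as `None`, so a consumer should
not assume that a latest version is always present.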

### Relevant implementation details

The formatting of the details is handled by a new
`TranspilerDescription` class, so that the output format can be properly
controlled. (For compatibility this will need to be controlled tightly,
even if the internal details change.)

Currently there is no lookup to figure out the latest version of an
installed transpiler; that's out of scope for this PR, although it has a
position in the JSON output.
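
To illustrate the idea of keeping the external format under tight control
while the internals stay free to change, here is a minimal sketch. It is not
the actual `TranspilerDescription` implementation: the class names, fields
and the `as_json` method are hypothetical, and only the output keys are taken
from the JSON shown earlier.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VersionsSketch:
    """Hypothetical holder for version details."""

    installed: str
    # No lookup exists yet, so this stays None and is rendered as JSON null.
    latest: Optional[str] = None


@dataclass(frozen=True)
class TranspilerDescriptionSketch:
    """Hypothetical description object; attribute names are internal details."""

    name: str
    config_path: str
    versions: VersionsSketch

    def as_json(self) -> dict:
        # Keys are written out by hand so that renaming an attribute does not
        # silently change the externally visible format.
        return {
            "config-path": self.config_path,
            "name": self.name,
            "versions": {
                "installed": self.versions.installed,
                "latest": self.versions.latest,
            },
        }


description = TranspilerDescriptionSketch(
    name="Morpheus",
    config_path="/path/to/config.yml",  # placeholder path
    versions=VersionsSketch(installed="0.6.6"),
)
rendered = json.dumps(description.as_json(), sort_keys=True)
```

The design choice sketched here is that the mapping from attributes to output
keys lives in one place, which makes the compatibility surface easy to test.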

### Linked issues

Additional tests in the areas modified by this code are implemented in:

 - #2041 
 - #2042

### Functionality

- added new CLI command: `databricks labs lakebridge describe-transpile`

### Tests

- manually tested
- added unit tests
- added integration tests
Collaborator

@sundarshankar89 sundarshankar89 left a comment


LGTM

@asnare
Contributor Author

asnare commented Sep 26, 2025

❌ test_describe_installed_transpilers: AssertionError: […]

This is a flaky test on main, addressed in #2060.

The choices field for dialect-specific options now defaults to None instead of an empty list.
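
A minimal sketch of what this change in default means for comparisons in
tests. The `OptionSketch` class is a hypothetical stand-in and not the real
option type; only the field names are borrowed from the JSON output shown
earlier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OptionSketch:
    """Hypothetical stand-in for a dialect-specific option."""

    flag: str
    method: str
    prompt: str
    # None means "no choices apply", which is distinct from an empty list.
    choices: Optional[list[str]] = None


question = OptionSketch(flag="overrides-file", method="QUESTION", prompt="...")
choice = OptionSketch(
    flag="target-tech",
    method="CHOICE",
    prompt="...",
    choices=["SPARKSQL", "PYSPARK"],
)
```

With this default, an expected value written with `choices=[]` no longer
compares equal to an option that was built without choices.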
Contributor

@m-abulazm m-abulazm left a comment


Thank you for the thorough testing. This is a good example PR on how to write tests.

I added a few questions inline to guide the review forward. Regarding my remark about the interface, I don't expect us to solve this in this PR, but we should create a follow-up to address it. This point blocked my PR about the transpiler version telemetry, so it would be helpful to have a plan forward.

  """
- all_configs = self._all_transpiler_configs()
- return {config.name: config for _, config in all_configs}
+ return {config.name: config for _, config in self._all_transpiler_configs()}
Contributor


I feel we really need to standardize this. As discussed before, the transpiler path is the ultimate identifier of the transpiler, but all the methods that deal with transpiler paths are private.
A few methods return transpiler names instead of paths, and the public interface mostly uses transpiler names rather than paths.
In some places it is called the product name, and in others the name or the transpiler name. We should keep one going forward. Please check my other comment.


transpiler_names = transpiler_repository.all_transpiler_names()

assert transpiler_names == {"A Transpiler", "B Transpiler"}
Contributor


Should we add a test for all_transpiler_configs() for this scenario?
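
A rough sketch of what such a test could look like, assuming that all_transpiler_configs() returns a mapping from transpiler name to config. The stub classes below are hypothetical stand-ins written only for this illustration; the real TranspilerRepository is set up differently, and the all_transpiler_names() body here is an assumption.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ConfigStub:
    """Hypothetical stand-in for a transpiler config; only `name` matters here."""

    name: str


class RepositoryStub:
    """Hypothetical stand-in mirroring the two public methods under discussion."""

    def __init__(self, configs: dict[Path, ConfigStub]) -> None:
        self._configs = configs

    def _all_transpiler_configs(self):
        # Yields (path, config) pairs, matching the shape unpacked in the diff.
        yield from self._configs.items()

    def all_transpiler_names(self) -> set[str]:
        return {config.name for _, config in self._all_transpiler_configs()}

    def all_transpiler_configs(self) -> dict[str, ConfigStub]:
        return {config.name: config for _, config in self._all_transpiler_configs()}


def test_all_transpiler_configs_keyed_by_name() -> None:
    """Verify that configs are keyed by name and agree with the reported names."""
    a, b = ConfigStub("A Transpiler"), ConfigStub("B Transpiler")
    repository = RepositoryStub({Path("a"): a, Path("b"): b})

    configs = repository.all_transpiler_configs()

    assert configs == {"A Transpiler": a, "B Transpiler": b}
    assert set(configs) == repository.all_transpiler_names()
```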



def test_get_installed_transpiler_version() -> None:
    """Verify that the installed version of a transpiler can be queried."""
Contributor


In some places we treat product names as transpiler names, and in other places we do not.

assert transpilers == {'Bladebridge'}


@pytest.mark.parametrize(("product_name", "version"), (("morpheus", "0.4.0"), ("bladebridge", "0.1.9")))
Contributor


Is this a moving target?

assert config_path.is_file()


@pytest.mark.parametrize(
Contributor


Same question here.


3 participants