Skip to content

Conversation

asnare
Copy link
Contributor

@asnare asnare commented Sep 23, 2025

Changes

This PR updates type hints on the configuration classes, and implements proper checking of the configuration during loading. Changes include:

  • We now distinguish between missing attributes and attributes that are present but malformed.
  • With the options for a dialect, the prompt is now optional instead of stubbing with an empty string. (The prompt is not needed for the FIXED option type.)
  • For options, the combinations of choices/default are now dealt with properly depending on the kind of option it is.
  • Some tests that were using invalid fixtures have been fixed.

Caveats/things to watch out for when reviewing

During deserialisation, in many places we now handle unhappy paths via exceptions rather than checking in advance: the latter is unambiguous whereas its easy to have subtle issues when the check doesn't quite correctly anticipate or correspond to the actual usage.

Relevant implementation details

Sadly, most of this has been done by hand even though there are libraries that do this sort of thing for us.

Tests

  • existing unit tests

…from YAML.

Error messages are now more accurate when something is wrong.

Further to the above, some test fixtures were using invalid configurations, and this needed fixing.
@asnare asnare self-assigned this Sep 23, 2025
@asnare asnare requested a review from a team as a code owner September 23, 2025 15:10
@asnare asnare added tech debt design flaws and other cascading effects internal technical pr's not end user facing labels Sep 23, 2025
Copy link

github-actions bot commented Sep 23, 2025

✅ 29/29 passed, 2 flaky, 1m39s total

Flaky tests:

  • 🤪 test_transpiles_informatica_with_sparksql (9.94s)
  • 🤪 test_transpile_sql_file (9.592s)

Running from acceptance #2383

Copy link
Contributor

@m-abulazm m-abulazm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this greatly improves the user experience. thanks for this

but I would have not implemented it this way. if you are working on this part, why didnt you try to use cattrs or something similar?

I also added a few suggestions to improve readability.

@asnare
Copy link
Contributor Author

asnare commented Sep 24, 2025

why didnt you try to use cattrs or something similar?

That's a good question, and we've asked ourselves the same question especially with respect to similar code in blueprint. I think there's a preference for Pydantic over cattrs, but the point stands.

The general situation with respect to dependencies is:

  • Any new dependencies need to go through legal review. (In this case it would probably be fine, and from a practical viewpoint I think it's a transitive dependency anyway. But still legal need to look at it.)
  • We need to consider the situation with respect to other dependencies that use the library and how likely conflicts are. This tends to be a bit messy, and cause headaches.
  • It needs to be a library that we think that all developers can use and understand.
  • Needs to be a strong benefit to including it.

Back to the situation at hand: I agree we should be using a library to do this, and have created #2050 to cover this.

This hints at being more than just a getter.
Copy link
Contributor

@m-abulazm m-abulazm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added to some of the open points and resolved others

Copy link
Contributor

@m-abulazm m-abulazm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks better. it is a good basis if more config is added in the future.
I left one more comment regarding one more refactoring.
I also don't mind leaving things as is as long we are aligned

asnare added a commit that referenced this pull request Sep 25, 2025
…e installed transpilers (#2051)

## Changes

### What does this PR do?

This PR implements a `describe-transpile` subcommand that describes the
currently installed transpilers and associated dialects and
configuration. This is intended for diagnostics and use by the UI. When
run normally, it provides output like this:
```
% databricks labs lakebridge describe-transpile
Transpiler   Installed Version  Plugin Configuration
==========   =================  ====================
Morpheus     0.6.6              /Users/me/.databricks/labs/remorph-transpilers/databricks-morph-plugin/lib/config.yml
Bladebridge  0.1.15             /Users/me/.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml

Supported Source Dialects
=========================
 - datastage
 - informatica (desktop edition)
 - informatica cloud
 - mssql
 - netezza
 - oracle
 - snowflake
 - synapse
 - teradata
 - tsql
```

When the `--output=json` option is provided to the Databricks CLI, more
information is available:
```json
% databricks labs lakebridge describe-transpile --output=json
{
  "available-dialects": [
    "datastage",
    "informatica (desktop edition)",
    "informatica cloud",
    "mssql",
    "netezza",
    "oracle",
    "snowflake",
    "synapse",
    "teradata",
    "tsql"
  ],
  "installed-transpilers": [
    {
      "config-path":"/Users/andrew.snare/.databricks/labs/remorph-transpilers/databricks-morph-plugin/lib/config.yml",
      "name":"Morpheus",
      "supported-dialects": {
        "snowflake": {
          "options": []
        },
        "tsql": {
          "options": []
        }
      },
      "versions": {
        "installed":"0.6.6",
        "latest":null
      }
    },
    {
      "config-path":"/Users/andrew.snare/.databricks/labs/remorph-transpilers/bladebridge/lib/config.yml",
      "name":"Bladebridge",
      "supported-dialects": {
        "datastage": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            },
            {
              "choices": [
                "SPARKSQL",
                "PYSPARK"
              ],
              "flag":"target-tech",
              "method":"CHOICE",
              "prompt":"Specify which technology should be generated"
            }
          ]
        },
        "informatica (desktop edition)": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            },
            {
              "choices": [
                "SPARKSQL",
                "PYSPARK"
              ],
              "flag":"target-tech",
              "method":"CHOICE",
              "prompt":"Specify which technology should be generated"
            }
          ]
        },
        "informatica cloud": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            }
          ]
        },
        "mssql": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            }
          ]
        },
        "netezza": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            }
          ]
        },
        "oracle": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            }
          ]
        },
        "synapse": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            }
          ]
        },
        "teradata": {
          "options": [
            {
              "default":"\u003cnone\u003e",
              "flag":"overrides-file",
              "method":"QUESTION",
              "prompt":"Specify the config file to override the default[Bladebridge] config - press \u003center\u003e for none"
            }
          ]
        }
      },
      "versions": {
        "installed":"0.1.15",
        "latest":null
      }
    }
  ]
}
```

### Relevant implementation details

The formatting of the details is handled by a new
`TranspilerDescription` class, so that the output format can be properly
controlled. (For compatibility this will need to be controlled tightly,
even if the internal details change.)

Currently there is no lookup to figure out the latest version of an
installed transpiler, that's out of scope for this PR although it has a
position in the JSON output.

### Linked issues

Additional tests in the areas modified by this code are implemented in:

 - #2041 
 - #2042

### Functionality

- added new CLI command: `databricks labs lakebridge describe-transpile`

### Tests

- manually tested
- added unit tests
- added integration tests
Copy link
Contributor

@m-abulazm m-abulazm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved

Copy link
Collaborator

@sundarshankar89 sundarshankar89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@asnare asnare added this pull request to the merge queue Sep 26, 2025
Merged via the queue into main with commit 7efd6db Sep 26, 2025
9 checks passed
@asnare asnare deleted the lsp-config-type-checking branch September 26, 2025 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal technical pr's not end user facing tech debt design flaws and other cascading effects
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants