[BUG]: CREATE TABLE transpiler leaves DISTRIBUTION / HEAP / ROUND_ROBIN clauses even with custom config (Synapse -> Databricks) #1995

@Lomeek

Description

Is there an existing issue for this?

  • I have searched the existing issues

Category of Bug / Issue

Converter bug

Current Behavior

I’m using a custom configuration for the Synapse -> Databricks transpiler that correctly converts data types and syntax to Databricks. However, when converting DDL (CREATE TABLE) statements, the transpiler still leaves unsupported clauses such as DISTRIBUTION = HASH(...), HEAP and ROUND_ROBIN.

Example:

  • A statement such as CREATE TABLE #IND ... WITH (DISTRIBUTION = HASH(ROW_WID_SKEY), HEAP), or one using DISTRIBUTION = ROUND_ROBIN, is only partially converted. The output becomes CREATE OR REPLACE TABLE TEMP_TABLE_IND with the expected type conversions, but the unsupported WITH clause is left in place, which is not valid Databricks SQL.

Custom Config:

{
  "inherit_from": [
    "base_synapse2databricks_sql.json"
  ],
  "line_subst": [
    {
      "from": "\\bSET ANSI_NULLS ON\\b",
      "to": "__BLANK__"
    },
    {
      "from": "\\bSET QUOTED_IDENTIFIER ON\\b",
      "to": "__BLANK__"
    },
    {
      "from": "\\bGO\\b",
      "to": "__BLANK__"
    },
    {
      "from": "\\buniqueidentifier\\b",
      "to": "STRING"
    },
    {
      "from": "\\bnvarchar\\b",
      "to": "STRING"
    },
    {
      "from": "\\bdatetime\\b",
      "to": "TIMESTAMP"
    },
    {
      "from": "\\bnumeric\\b",
      "to": "DECIMAL"
    },
    {
      "from": "\\bint\\b",
      "to": "INT"
    },
    {
      "from": "\\bCREATE\\s+TABLE\\b",
      "to": "CREATE OR REPLACE TABLE"
    },
    {
      "from": "\\[(.*?)\\]",
      "to": "`$1`"
    },
    {
      "from": "^\\s*;\\s*$",
      "to": "__BLANK__"
    }
  ],
  "block_subst": [
    {
      "from": "/\\*[\\s\\S]*?\\*/",
      "to": "__BLANK__"
    },
    {
      "from": "WITH\\s*\\([\\s\\S]*?\\)\\s*(?:GO)?",
      "to": "__BLANK__"
    },
    {
      "from": "IDENTITY\\s*\\([^)]*\\)",
      "to": "__BLANK__"
    }
  ],
  "function_subst": [
    {
      "from": "STRING_AGG",
      "output_template": "array_join(collect_list($1), $2)",
      "num_args": 2
    }
  ]
}
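
One thing that may be worth checking in the config above: the WITH rule in block_subst uses a lazy [\s\S]*? followed by \), so in common regex engines it stops at the first closing parenthesis. The sketch below uses Python's re module purely as a stand-in. The transpiler's own regex engine, its flags, and the order in which it applies rules are not confirmed here, and both the sample statement and the nested_with pattern are hypothetical.

```python
import re

# Pattern copied from the block_subst rule in the config (after JSON unescaping).
with_rule = re.compile(r"WITH\s*\([\s\S]*?\)\s*(?:GO)?")

# Hypothetical variant that tolerates one level of nested parentheses,
# such as HASH(...) inside WITH (...). Not part of the original config.
nested_with = re.compile(r"WITH\s*\((?:[^()]|\([^()]*\))*\)\s*(?:GO)?")

# Hypothetical sample statement; the column list is made up for illustration.
sample = (
    "CREATE TABLE #IND (ROW_WID_SKEY INT)\n"
    "WITH (DISTRIBUTION = HASH(ROW_WID_SKEY), HEAP)"
)

# The lazy rule ends its match at the ')' that closes HASH(...),
# so the tail of the WITH clause is left behind.
lazy_result = with_rule.sub("", sample)

# The nested-aware variant consumes the whole WITH (...) clause.
nested_result = nested_with.sub("", sample)

print(repr(lazy_result))
print(repr(nested_result))
```

Under these assumptions the lazy rule would leave a fragment of the clause behind for the HASH(...) form. It would not by itself explain the DISTRIBUTION = ROUND_ROBIN case, which has no nested parentheses, so the engine's matching behavior (for example case sensitivity or multi-line handling) may also play a role.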

Expected Behavior

The transpiler should remove the unsupported clauses (DISTRIBUTION, HEAP, ROUND_ROBIN) from CREATE TABLE statements even when a custom config is in use, while still converting data types and syntax correctly.

Steps To Reproduce

  • Apply the custom config above (line, block, and function substitutions) to a CREATE TABLE statement containing WITH (DISTRIBUTION = HASH(...), HEAP) or DISTRIBUTION = ROUND_ROBIN.
  • Observe that the output still contains the unsupported clause.
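
The second step can be made mechanical with a small check over the transpiled text. This is a minimal sketch in Python, not part of the transpiler. The helper name leftover_clauses and the sample strings are hypothetical, and a plain keyword scan like this would also flag a column or table that happens to be named HEAP.

```python
import re
from typing import List

# Clauses named in this report as unsupported in Databricks.
UNSUPPORTED = re.compile(r"\b(DISTRIBUTION|HEAP|ROUND_ROBIN)\b", re.IGNORECASE)


def leftover_clauses(sql: str) -> List[str]:
    """Return unsupported keywords still present in the SQL text.

    Keywords are upper-cased, de-duplicated, and listed in order of
    first appearance.
    """
    seen: List[str] = []
    for match in UNSUPPORTED.finditer(sql):
        keyword = match.group(1).upper()
        if keyword not in seen:
            seen.append(keyword)
    return seen


# Hypothetical transpiler outputs, modeled on the behavior described above.
observed = (
    "CREATE OR REPLACE TABLE TEMP_TABLE_IND (ROW_WID_SKEY INT) "
    "WITH (DISTRIBUTION = HASH(ROW_WID_SKEY), HEAP)"
)
expected = "CREATE OR REPLACE TABLE TEMP_TABLE_IND (ROW_WID_SKEY INT)"

print(leftover_clauses(observed))
print(leftover_clauses(expected))
```

An empty list for a converted file would indicate that none of the three clauses survived the conversion.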

Relevant log output or Exception details

Logs Confirmation

  • I ran the command line with --debug
  • I have attached the lsp-server.log under USER_HOME/.databricks/labs/remorph-transpilers/<converter_name>/lib/lsp-server.log

Sample Query

Operating System

Windows

Version

latest via Databricks CLI
