Collaborator

@sundarshankar89 sundarshankar89 commented Sep 15, 2025

Changes

What does this PR do?

  • Introduces the Profiler Skeleton for the Lakebridge project.
  • Adds initial Profiler class with supporting utilities and constants.
  • Implements core logic for profiling supported source technologies with placeholder support for MSSQL and Synapse.
  • Sets up the structure for profiling pipelines, including config file handling and extraction logic.

Relevant implementation details

Caveats/things to watch out for when reviewing:

Linked issues

Resolves #..

Functionality

  • added relevant user documentation
  • added new CLI command
  • modified existing command: databricks labs lakebridge ...
  • ... +add your own

Tests

  • manually tested
  • added unit tests
  • added integration tests

@sundarshankar89 sundarshankar89 requested a review from a team as a code owner September 15, 2025 08:37
@sundarshankar89 sundarshankar89 self-assigned this Sep 15, 2025
@sundarshankar89 sundarshankar89 added the feat/profiler (Issues related to profilers) and stacked PR (Should be reviewed, but not merged) labels Sep 15, 2025

github-actions bot commented Sep 15, 2025

✅ 32/32 passed, 3 flaky, 2m14s total

Flaky tests:

  • 🤪 test_validate_mixed_checks (193ms)
  • 🤪 test_transpiles_informatica_with_sparksql (12.05s)
  • 🤪 test_transpile_sql_file (12.5s)

Running from acceptance #2538

@sundarshankar89 sundarshankar89 changed the base branch from main to feature/configure-assessment September 15, 2025 08:46
Contributor

@m-abulazm m-abulazm left a comment

I proposed two small refactorings:

  1. Use a better name for the constant PLATFORM_TO_PIPELINE_CFG.
  2. Inject the config in the constructor so we can get rid of patching in the tests.
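
A minimal sketch of the two suggestions above, not the code from this PR. The names PLATFORM_TO_PIPELINE_CONFIG_PATH and load_pipeline_config, the fields of PipelineConfig, and the shortened path are all assumptions made for illustration. It shows how an optional constructor argument lets a test hand in a config directly instead of patching the lookup.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PipelineConfig:
    """Hypothetical stand-in for the project's PipelineConfig."""

    name: str
    extract_folder: str


# Hypothetical, more descriptive name for PLATFORM_TO_PIPELINE_CFG.
# The path below is illustrative only.
PLATFORM_TO_PIPELINE_CONFIG_PATH: dict[str, Path] = {
    "synapse": Path("resources/assessments/synapse/pipeline_config.yml"),
}


def load_pipeline_config(path: Path) -> PipelineConfig:
    """Placeholder loader: only checks the file exists; a real one would parse the YAML."""
    if not path.exists():
        raise FileNotFoundError(f"Pipeline config not found: {path}")
    return PipelineConfig(name=path.stem, extract_folder=str(path.parent))


class Profiler:
    def __init__(self, platform: str, pipeline_config: PipelineConfig | None = None) -> None:
        self._platform = platform.lower()
        # An injected config takes precedence; the default is loaded lazily.
        self._pipeline_config = pipeline_config

    @property
    def pipeline_config(self) -> PipelineConfig:
        if self._pipeline_config is None:
            path = PLATFORM_TO_PIPELINE_CONFIG_PATH[self._platform]
            self._pipeline_config = load_pipeline_config(path)
        return self._pipeline_config
```

With this shape, a unit test can construct Profiler("synapse", pipeline_config=fake_config) and never touch the filesystem or a patching helper.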

Contributor

@goodwillpunning goodwillpunning left a comment

Great foundation for an extensible profiler class. LGTM!

    "synapse": "src/databricks/labs/lakebridge/resources/assessments/synapse/pipeline_config.yml",
}

# TODO modify this PLATFORM_TO_SOURCE_TECHNOLOGY.keys() once all platforms are supported
Contributor

+1

        pipeline_config: PipelineConfig | None = None,
    ) -> None:
        platform = self._platform.lower()
        if not pipeline_config:
Contributor

This function seems to only validate that pipeline_config is not None, but _execute() also handles file-not-found (FNF) exceptions. Consider collapsing these two functions into one for simplicity's sake.
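
One way to read this suggestion, sketched under assumptions: the function name run_profiler, its boolean return value, and the execute callback are hypothetical and do not come from the PR. The None check and the file-not-found handling sit in a single function, so there is only one place to look for both failure modes.

```python
from __future__ import annotations

import logging
from pathlib import Path
from typing import Callable

logger = logging.getLogger(__name__)


def run_profiler(config_path: Path | None, execute: Callable[[Path], None]) -> bool:
    """Validate the config and run the pipeline in one place.

    Returns True on success and False when the config is missing or unreadable.
    `execute` is a hypothetical callback standing in for the pipeline run.
    """
    if config_path is None:
        logger.error("No pipeline config supplied")
        return False
    try:
        execute(config_path)
    except FileNotFoundError as exc:
        # Same handling that would otherwise live in a separate helper.
        logger.error("Pipeline config not found: %s", exc)
        return False
    return True
```

Whether to return a flag or re-raise is a design choice; the sketch returns a flag only to keep the example short.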


    @staticmethod
    def path_modifier(*, config_file: str | Path, path_prefix: Path = PRODUCT_PATH_PREFIX) -> PipelineConfig:
        # TODO: Make this work install during developer mode
Contributor

What does this TODO mean?

Collaborator Author

databricks labs install .
This is the developer mode we use. With the way path_modifier is currently defined, it doesn't pick up the latest changes unless the project is installed in ~/.databricks/labs.
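
One possible direction for that TODO, offered only as a sketch: the helper resolve_config_path and its dev_prefix parameter are hypothetical and are not part of path_modifier in this PR. The idea is to prefer a file from the local checkout when it exists and otherwise fall back to the installed location.

```python
from __future__ import annotations

from pathlib import Path


def resolve_config_path(
    config_file: str | Path,
    *,
    installed_prefix: Path,
    dev_prefix: Path | None = None,
) -> Path:
    """Resolve a config file against a local checkout first, then the install dir.

    Both prefixes are assumptions for illustration: `installed_prefix` plays the
    role of the installed product path, `dev_prefix` the developer's working copy.
    """
    candidate = Path(config_file)
    if candidate.is_absolute():
        return candidate
    # Prefer the working copy so local edits are picked up without reinstalling.
    if dev_prefix is not None and (dev_prefix / candidate).exists():
        return dev_prefix / candidate
    return installed_prefix / candidate
```

Passing dev_prefix=None keeps the current behavior of resolving only against the installed prefix.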

@gueniai gueniai requested a review from m-abulazm October 6, 2025 16:10
Base automatically changed from feature/configure-assessment to main October 6, 2025 17:03
@sundarshankar89 sundarshankar89 changed the base branch from main to feature/synapse_profiler_scripts October 7, 2025 06:12
Contributor

@m-abulazm m-abulazm left a comment

LGTM

Labels: feat/profiler (Issues related to profilers), stacked PR (Should be reviewed, but not merged)
4 participants