Add llm-transpile command with Switch runner and integration tests #2078
Open
hiroyukinakazato-db wants to merge 19 commits into main from feature/llm-transpile
- Add Switch installer with resource configuration and job creation
- Implement uninstall functionality with proper cleanup
- Add comprehensive test coverage for SwitchInstaller
- Improve path handling and type-safe configuration
- Add include-llm-transpiler option for flexible installation
✅ 29/29 passed, 2 flaky, 2m17s total
Running from acceptance #2615
Implement SwitchInstaller to integrate Switch transpiler with Lakebridge:
- Install Switch package to local virtual environment and deploy to workspace
- Create and manage Databricks job for Switch transpilation
- Configure Switch resources (catalog, schema, volume) interactively
- Support job-level parameters with JobParameterDefinition for flexibility
- Handle installation state and job lifecycle management
- Add comprehensive test suite covering installation, job management, and configuration
804507e to d884a20
The SwitchInstaller was failing to find the config when the config.yml used "Switch" (capitalized) as the name, while the code only checked for "switch" (lowercase). This caused job creation to fail with a "config.yml not found" error. Updated _get_switch_job_parameters() to check both the display name (capitalized) and transpiler ID (lowercase) to handle both cases.
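The lookup this commit describes could be sketched as below. This is an illustration under assumptions, not the code from the PR: `find_transpiler_config` and the `configs` mapping (name to parsed config) are hypothetical, and only the two names "Switch" and "switch" come from the commit message.

```python
from typing import Any, Mapping, Optional

# The two names quoted in the commit message.
DISPLAY_NAME = "Switch"   # capitalized name as written in config.yml
TRANSPILER_ID = "switch"  # lowercase transpiler ID


def find_transpiler_config(
    configs: Mapping[str, Mapping[str, Any]],
) -> Optional[Mapping[str, Any]]:
    """Return the config registered under either accepted name, else None.

    `configs` is a hypothetical mapping from transpiler name to its parsed
    config; the real structure in the PR may differ.
    """
    for name in (DISPLAY_NAME, TRANSPILER_ID):
        if name in configs:
            return configs[name]
    return None
```

Checking the two known names explicitly, rather than lowercasing every key, keeps the match narrow: a config named "SWITCH" would still be rejected in this sketch.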
Separates Switch transpiler's local installation logic from workspace deployment, following established patterns (BladebridgeInstaller for local installation, ReconDeployment for workspace deployment).
Key changes:
- Add SwitchDeployment class (~260 lines) for workspace operations
- Simplify SwitchInstaller to match BladebridgeInstaller pattern (~20 lines)
- Add include_llm and switch_resources fields to TranspileConfig
- Update WorkspaceInstallation to use SwitchDeployment
- Refactor tests to avoid protected member access using fixture separation
- Group Switch-related tests in TestSwitchInstallation class
Implement llm-transpile command for LLM-based code transpilation:
- Add SwitchInstaller for Switch transpiler package management
  - Install Switch package and deploy to workspace
  - Create and manage Databricks jobs with job-level parameters
  - Configure Switch resources (catalog, schema, volume)
- Add SwitchRunner for executing Switch transpilation jobs
  - Upload source files to workspace volume
  - Execute transpilation via Databricks job
  - Download results and handle job lifecycle
- Add llm-transpile CLI command with Switch transpiler support
- Add comprehensive unit and integration tests
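The runner flow listed in this commit (upload, execute, download) might be sketched as follows. Everything here is hypothetical and only illustrates the ordering of the steps: `RunnerSteps`, `run_transpile`, and the callable signatures are invented for this example, and no Databricks SDK calls are shown. The `wait_for_completion` flag mirrors the local-only option named in a later commit in this PR.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RunnerSteps:
    """Hypothetical injected steps; the real SwitchRunner may be structured differently."""

    upload: Callable[[str], str]    # local source dir -> uploaded location
    run_job: Callable[[str], int]   # uploaded location -> run identifier
    wait: Callable[[int], None]     # block until the run has finished
    download: Callable[[int], str]  # run identifier -> local results path


def run_transpile(
    steps: RunnerSteps,
    source_dir: str,
    wait_for_completion: bool = True,
) -> Optional[str]:
    """Upload sources, start the job, and optionally wait and fetch results."""
    location = steps.upload(source_dir)
    run_id = steps.run_job(location)
    if not wait_for_completion:
        # Fire-and-forget: the caller checks the job in the workspace later.
        return None
    steps.wait(run_id)
    return steps.download(run_id)
```

Passing the steps in as callables keeps the ordering testable without a workspace, which is one way unit tests for such a runner could avoid real job runs.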
Move _get_switch_package_path() from WorkspaceInstallation to SwitchDeployment as a protected method, following Single Responsibility Principle. SwitchDeployment now resolves its own package path internally.
Changes:
- Add _get_switch_package_path() protected method to SwitchDeployment
- Update SwitchDeployment.install() signature to remove path parameter
- Remove duplicate _get_switch_package_path() from WorkspaceInstallation
- Remove unused sys and TranspilerRepository imports from installation.py
- Update tests to use new interface with mocked path resolution
Update test_installation.py to match the refactored SwitchDeployment.install() interface that now takes only resources parameter (path resolution is internal).
Changes:
- Remove switch_repository fixture parameter from test methods
- Delete unused _StubTranspilerRepository stub class
- Remove unused imports (Path, TranspilerRepository)
- Update assertions to check only resources argument
The tests verify that:
1. Switch installation uses configured resources correctly
2. Missing resources logs appropriate error message
4f551ce to bacd5f6
Sync with main branch to incorporate latest documentation updates
# Conflicts:
#	labs.yml
The wait_for_completion option is intended for local CLI execution only and should not be included in Databricks job parameters. This change filters it out when building job parameter definitions.
Changes:
- Add excluded_options set to filter local-only options
- Skip wait_for_completion when converting config.yml options
- Add test using FriendOfSwitchDeployment pattern to verify exclusion
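The filtering described above could look roughly like this. It is a sketch under assumptions: `build_job_parameters`, the `flag`/`default` option keys, and the plain-dict output are hypothetical stand-ins (the PR mentions JobParameterDefinition, which is deliberately not used here to keep the example dependency-free). Only the `wait_for_completion` name and the idea of an excluded-options set come from the commit message.

```python
from typing import Any, Dict, Iterable, List, Mapping

# Options that only make sense for the local CLI (name from the commit message).
EXCLUDED_OPTIONS = frozenset({"wait_for_completion"})


def build_job_parameters(
    options: Iterable[Mapping[str, Any]],
) -> List[Dict[str, str]]:
    """Convert config options to job parameters, skipping local-only ones.

    Each option is assumed (hypothetically) to carry a "flag" name and an
    optional "default"; the real config.yml schema may differ.
    """
    params: List[Dict[str, str]] = []
    for option in options:
        name = option["flag"]
        if name in EXCLUDED_OPTIONS:
            continue  # never forwarded to the Databricks job
        params.append({"name": name, "default": str(option.get("default", ""))})
    return params
```

For example, passing options named `source_tech` and `wait_for_completion` (the first name is made up for illustration) would yield a single parameter for `source_tech`.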
Changes
This PR adds the llm-transpile command for LLM-powered SQL conversion using the Switch transpiler.

What does this PR do?
Adds llm-transpile CLI command that runs Switch transpiler jobs with parameter passing support.

Relevant implementation details
CLI Integration:
- … llm-transpile command to Lakebridge CLI
Switch Runner Implementation:
- SwitchConfig: manages Switch resources and job ID retrieval from InstallState
- SwitchRunner: orchestrates Switch job execution with parameters
Testing:
Development Environment:
- … .env to .gitignore for local development credentials

Caveats/things to watch out for when reviewing:
- … transpile and recon command patterns
- … recon pattern)
- … --output-ws-folder (not --output-folder) to explicitly indicate workspace folder
- … --include-llm-transpiler flag #2066 (Switch installation) to be merged first

Linked issues
Resolves #2047
Functionality
databricks labs lakebridge llm-transpile
Tests