52 changes: 52 additions & 0 deletions docs/content/faq/app_faq.md
@@ -0,0 +1,52 @@
---
title: "App"
date: 2025-08-29T14:50:11-04:00
weight: 63
draft: false
---

**Q.1. Do I need to run an initial setup before using the DLT Meta App?**
Yes. Before you can use the DLT Meta App, you must click the Setup button to create the required DLT Meta environment. This initializes the app and enables you to onboard and manage Delta Live Tables (DLT) pipelines.

**Q.2. What are the main features of the DLT Meta App?**
The DLT Meta App provides several key capabilities:

- Onboard new DLT pipelines through an interactive interface.
- Deploy and manage DLT pipelines directly in the app.
- Run DLT Meta App demo flows to explore example pipelines and usage patterns.
- Use the command-line interface (CLI) to automate onboarding, deployment, and management operations (a minimal sketch follows this list).
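
For scripted automation, a thin Python wrapper around the same commands the app executes is usually enough. The sketch below mirrors the `subprocess` pattern used in `lakehouse_app/app.py`; the demo script path and catalog name are placeholders, not fixed values:

```python
import subprocess

def launch_demo(demo_script: str, uc_catalog: str, profile: str = "DEFAULT") -> str:
    """Run one of the dlt-meta demo launcher scripts, as the app itself does."""
    result = subprocess.run(
        f"python {demo_script} --uc_catalog_name {uc_catalog} --profile {profile}",
        shell=True,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr)
    return result.stdout

# Example (placeholder catalog name):
# print(launch_demo("demo/launch_af_cloudfiles_demo.py", "my_uc_catalog"))
```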

**Q.3. Who can access and use the DLT Meta App?**
Only authenticated Databricks workspace users with appropriate permissions can access and use the app:

- You need CAN_USE permission to run the app and CAN_MANAGE permission to administer it.
- The app can be shared within your workspace or account, but not with external users.
- Every user must log in with their Databricks account credentials.

**Q.4. How does catalog and schema access work in the DLT Meta App?**
By default, the app uses a dedicated Service Principal (SP) identity for all data and resource access:

- The SP must have explicit permissions (such as USE CATALOG, USE SCHEMA, and SELECT) on every Unity Catalog resource the DLT pipelines reference (see the grant sketch after this list).
- What any user can do in the app (even via the same URL) is bounded by the SP's granted access; if the SP lacks a permission, that functionality fails for all users.
- Optionally, the app can be configured to use each user's own permissions via On-Behalf-Of (OBO) mode, but this requires additional setup.
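
For illustration, a workspace admin could grant the SP the minimum privileges a pipeline needs. This is a minimal sketch using the `databricks-sdk` Python package, assuming it is installed and can authenticate; the catalog, schema, table, and SP application ID are placeholder names:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import catalog

w = WorkspaceClient()  # picks up credentials from the environment or a profile

sp_id = "0000-app-service-principal"  # placeholder: the app SP's application ID

# Catalog-level: USE CATALOG
w.grants.update(
    securable_type=catalog.SecurableType.CATALOG,
    full_name="main",
    changes=[catalog.PermissionsChange(principal=sp_id,
                                       add=[catalog.Privilege.USE_CATALOG])],
)

# Schema-level: USE SCHEMA
w.grants.update(
    securable_type=catalog.SecurableType.SCHEMA,
    full_name="main.raw",
    changes=[catalog.PermissionsChange(principal=sp_id,
                                       add=[catalog.Privilege.USE_SCHEMA])],
)

# Table-level: SELECT on a source table the pipeline reads
w.grants.update(
    securable_type=catalog.SecurableType.TABLE,
    full_name="main.raw.events",
    changes=[catalog.PermissionsChange(principal=sp_id,
                                       add=[catalog.Privilege.SELECT])],
)
```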

**Q.5. How should I resolve access errors or permission issues in the app?**
If you experience errors related to catalog, schema, or table access:

- Verify the app's Service Principal has the required permissions in Unity Catalog (one way to check is sketched after this list).
- Confirm the app is attached to the necessary resources, such as warehouses or secrets.
- Check whether recent administrative or sharing changes have affected your privileges.
- Review audit logs for permission denials or configuration changes.
- Consult your Databricks workspace administrator if necessary.
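
One way to check the first point is to list the SP's effective privileges on the securable that is failing. A sketch with the `databricks-sdk` Python package; the table name and SP application ID are placeholders:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import catalog

w = WorkspaceClient()

# Placeholders: the securable the pipeline failed on, and the app SP's ID.
effective = w.grants.get_effective(
    securable_type=catalog.SecurableType.TABLE,
    full_name="main.raw.events",
    principal="0000-app-service-principal",
)
for assignment in effective.privilege_assignments or []:
    for priv in assignment.privileges or []:
        print(priv.privilege, "inherited from:", priv.inherited_from_name)
```

An empty result means the SP has no direct or inherited privileges on that object, which matches the all-users failure mode described in Q.4.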

**Q.6. How is sharing, security, and isolation managed for the DLT Meta App?**
The app runs on a multi-tenant platform but provides strong isolation between customer accounts and apps:

- Each app runs in a dedicated, isolated, serverless compute environment.
- Sharing is restricted to specific users, groups, or all account users; every sharing and permission event is audit-logged.
- There is no option for public or anonymous access; only Databricks account users can run the app.

**Q.7. What are best practices for securing and operating the DLT Meta App?**

- Grant only the minimum required catalog and schema permissions to the app's Service Principal (principle of least privilege).
- Regularly review permission and sharing changes using audit logs (a query sketch follows this list).
- Allow only trusted application code to run, especially if enabling OBO mode.
- Use workspace monitoring tools and work with your Databricks administrator on access adjustments or troubleshooting.
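
To make the audit-log review concrete, the sketch below queries the `system.access.audit` system table for recent Unity Catalog permission events via the SQL Statement Execution API in the `databricks-sdk` Python package. The warehouse ID is a placeholder, and the action-name filter is an assumption to adapt to the events you care about:

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

resp = w.statement_execution.execute_statement(
    warehouse_id="<your-warehouse-id>",  # placeholder
    statement="""
        SELECT event_time, user_identity.email, action_name
        FROM system.access.audit
        WHERE service_name = 'unityCatalog'
          AND action_name ILIKE '%permission%'
          AND event_date >= current_date() - INTERVAL 7 DAYS
        ORDER BY event_time DESC
        LIMIT 50
    """,
)
if resp.result and resp.result.data_array:
    for row in resp.result.data_array:
        print(row)
```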
Binary file added docs/static/images/app_cli.png
Binary file added docs/static/images/app_deploy_pipeline.png
Binary file added docs/static/images/app_onboarding.png
Binary file added docs/static/images/app_run_demos.png
64 changes: 52 additions & 12 deletions lakehouse_app/app.py
@@ -7,7 +7,6 @@
import logging
import errno
import re
# Use pty to create a pseudo-terminal for better interactive support
import pty
import select
import fcntl
@@ -16,7 +15,6 @@
import signal
import json

# Configure logging
logging.basicConfig(level=logging.INFO,
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
handlers=[logging.FileHandler("dlt-meta-app.log"),
@@ -227,15 +225,15 @@ def start_command():
if 'PYTHONPATH' not in os.environ or not os.path.isdir(os.environ.get('PYTHONPATH', '')):
commands = [
"pip install databricks-cli",
# "git clone https://github.com/databrickslabs/dlt-meta.git",
"git clone https://github.com/dattawalake/dlt-meta.git",
"git clone https://github.com/databrickslabs/dlt-meta.git",
f"python -m venv {current_directory}/dlt-meta/.venv",
f"export HOME={current_directory}",
"cd dlt-meta",
"source .venv/bin/activate",
f"export PYTHONPATH={current_directory}/dlt-meta/",
"pwd",
"pip install databricks-sdk",
"pip install PyYAML",
]
print("Start setting up dlt-meta environment")
for c in commands:
@@ -322,6 +320,7 @@ def handle_onboard_form():
"silver_schema": request.form.get('silver_schema', 'dltmeta_silver_7b4e981029b843c799bf61a0a121b3ca'),
"dlt_meta_layer": request.form.get('dlt_meta_layer', '1'),
"bronze_table": request.form.get('bronze_table', 'bronze_dataflowspec'),
"silver_table": request.form.get('silver_table', 'silver_dataflowspec'),
"overwrite": "1" if request.form.get('overwrite') == "1" else "0",
"version": request.form.get('version', 'v1'),
"environment": request.form.get('environment', 'prod'),
@@ -375,26 +374,67 @@ def handle_deploy_form():
def run_demo():
code_to_run = request.json.get('demo_name', '')
print(f"processing demo for :{request.json}")
current_directory = os.environ['PYTHONPATH'] # os.getcwd()
current_directory = os.environ['PYTHONPATH']
demo_dict = {"demo_cloudfiles": "demo/launch_af_cloudfiles_demo.py",
"demo_acf": "demo/launch_acfs_demo.py",
"demo_silverfanout": "demo/launch_silver_fanout_demo.py",
"demo_dias": "demo/launch_dais_demo.py"
"demo_dias": "demo/launch_dais_demo.py",
"demo_dlt_sink": "demo/launch_dlt_sink_demo.py",
"demo_dabs": "demo/generate_dabs_resources.py"
}
demo_file = demo_dict.get(code_to_run, None)
uc_name = request.json.get('uc_name', '')
result = subprocess.run(f"python {current_directory}/{demo_file} --uc_catalog_name {uc_name} --profile DEFAULT",
shell=True,
capture_output=True,
text=True
)

if code_to_run == 'demo_dabs':

# Step 1: Generate Databricks resources
subprocess.run(f"python {current_directory}/{demo_file} --uc_catalog_name {uc_name} "
f"--source=cloudfiles --profile DEFAULT",
shell=True,
capture_output=True,
text=True
)

# Step 2: Validate the bundle, running from the demo/dabs directory
subprocess.run("databricks bundle validate --profile=DEFAULT", cwd=f"{current_directory}/demo/dabs",
shell=True,
capture_output=True,
text=True)

# Step 3: Deploy the bundle
subprocess.run("databricks bundle deploy --target dev --profile=DEFAULT",
cwd=f"{current_directory}/demo/dabs", shell=True,
capture_output=True,
text=True)

# Step 4: Run the 'onboard_people' task
rs1 = subprocess.run("databricks bundle run onboard_people -t dev --profile=DEFAULT",
cwd=f"{current_directory}/demo/dabs", shell=True,
capture_output=True,
text=True)
print(f"onboarding completed: {rs1.stdout}")
# Step 5: Run the 'execute_pipelines_people' task
result = subprocess.run("databricks bundle run execute_pipelines_people -t dev --profile=DEFAULT",
cwd=f"{current_directory}/demo/dabs",
shell=True,
capture_output=True,
text=True
)
print(f"execution of pipeline completed: {result.stdout}")
else:
result = subprocess.run(f"python {current_directory}/{demo_file} --uc_catalog_name {uc_name} "
f"--profile DEFAULT",
shell=True,
capture_output=True,
text=True
)
return extract_command_output(result)


def extract_command_output(result):
stdout = result.stdout
job_id_match = re.search(r"job_id=(\d+)|pipeline=(\d+)", stdout)  # no spaces around '|': they would become part of the pattern
url_match = re.search(r"url=(https?://[^\s]+)", stdout)
url_match = re.search(r"(https?://[^\s]+)", stdout)

job_id = (job_id_match.group(1) or job_id_match.group(2)) if job_id_match else None
job_url = url_match.group(1) if url_match else None
14 changes: 11 additions & 3 deletions lakehouse_app/templates/landingPage.html
@@ -655,6 +655,11 @@ <h2 class='step-heading'>Step 1 : Onboarding</h2>
<input id="bronze_table" name="bronze_table" placeholder="Enter bronze table name"
type="text" value="bronze_dataflowspec">
</div>
<div class="form-group">
<label>Provide silver dataflow spec table name:</label>
<input id="silver_table" name="silver_table" placeholder="Enter silver table name"
type="text" value="silver_dataflowspec">
</div>
<div class="form-group">
<label>Overwrite dataflow spec?</label>
<div class="radio-group">
@@ -841,6 +846,8 @@ <h3 class='step-heading'>Available Demos</h3>
<button class="command-button2" data-command="demo_acf">Demo Apply Changes Snapshot</button>
<button class="command-button2" data-command="demo_silverfanout">Demo Silver fanout</button>
<button class="command-button2" data-command="demo_dias">Demo Dias</button>
<button class="command-button2" data-command="demo_dlt_sink">Demo Sink</button>


</div>

@@ -956,7 +963,7 @@ <h5 class="modal-title">Please wait...</h5>
const modalContent = `
<div class="modal-content">
<h3 class='step-heading'>${data.modal_content.title}</h3>
<p>Job ID: ${data.modal_content.job_id}</p>
${data.modal_content.job_id ? `<p>Job ID: ${data.modal_content.job_id}</p>`:""}

<p><a href="${url}" target="_blank">Open Job in Databricks</a></p>

@@ -994,7 +1001,8 @@ <h3 class='step-heading'>${data.modal_content.title}</h3>
const modalContent = `
<div class="modal-content">
<h3 class='step-heading'>${data.modal_content.title}</h3>
<p>Job ID: ${data.modal_content.job_id}</p>
${data.modal_content.job_id ? `<p>Job ID: ${data.modal_content.job_id}</p>`:""}

<p><a href="${url}" target="_blank">Open Job in Databricks</a></p>

@@ -1067,7 +1075,7 @@ <h3 class='step-heading'>${data.modal_content.title}</h3>
const modalContent = `
<div class="modal-content">
<h3 class='step-heading'>${data.modal_content.title}</h3>
<p>Job ID: ${data.modal_content.job_id}</p>
${data.modal_content.job_id ? `<p>Job ID: ${data.modal_content.job_id}</p>`:""}

<p><a href="${url}" target="_blank">Open Job in Databricks</a></p>

2 changes: 1 addition & 1 deletion src/cli.py
@@ -638,7 +638,7 @@ def _load_onboard_config_ui(self, form_data) -> OnboardCommand:
onboard_cmd_dict["bronze_dataflowspec_path"] = f'{self._install_folder()}/bronze_dataflow_specs'

if onboard_cmd_dict["onboard_layer"] == "silver" or onboard_cmd_dict["onboard_layer"] == "bronze_silver":
onboard_cmd_dict["silver_dataflowspec_table"] = 'silver_dataflowspec' # Not in form, using default
onboard_cmd_dict["silver_dataflowspec_table"] = form_data.get('silver_table', 'silver_dataflowspec')
if not onboard_cmd_dict["uc_enabled"]:
onboard_cmd_dict["silver_dataflowspec_path"] = f'{self._install_folder()}/silver_dataflow_specs'
