A comprehensive collection of SQL functions and alert configurations for Databricks FinOps monitoring and alerting.
This repository contains SQL functions and configurations for creating Databricks alerts using the Databricks SQL Alerts API. The solution provides a template-based, configuration-driven approach to creating monitoring alerts for FinOps use cases.
- π§ Template Variables: SQL files use
{{VARIABLE}}placeholders that get replaced during deployment - βοΈ Configuration-Driven: All parameters (warehouse IDs, emails, thresholds) come from YAML config
- π Environment-Agnostic: Same SQL templates work for dev, staging, and production
- π Modern Python Tool: Beautiful CLI with validation and error handling
finops_alerts/
βββ README.md # This file
βββ deploy.py # Python deployment tool
βββ requirements.txt # Python dependencies
βββ config/ # Configuration files
β βββ deployment.yaml # Main deployment configuration
βββ src/ # Source code
β βββ functions/ # SQL functions
β β βββ create_alert.sql # Base alert creation function (uses template variables)
β βββ finops/ # FinOps specific alerts
β βββ cost_monitoring/ # Cost-related alerts
β βββ usage_monitoring/ # Usage-related alerts
β βββ performance/ # Performance-related alerts
βββ docs/ # Documentation
βββ examples.md # Usage examples
- π§ Template-Based SQL: SQL files use
{{VARIABLE}}placeholders for complete parameterization - βοΈ Configuration-Driven: All settings (warehouse IDs, emails, thresholds) managed via YAML
- π Environment-Agnostic: Same templates work across dev, staging, and production
- π Modern Python Tool: Beautiful CLI with validation, error handling, and colorful output
- π FinOps Focused: Pre-built alert templates for cost, usage, and performance monitoring
- π Comprehensive Documentation: Detailed examples and usage guides
# Install dependencies
pip3 install -r requirements.txt
# List available alerts
python3 deploy.py list-alerts config/deployment.yaml
# Validate configuration
python3 deploy.py validate config/deployment.yaml
# Deploy alerts (alerts defined in config)
python3 deploy.py deploy dev config/deployment.yaml
# Deploy to different environment
python3 deploy.py deploy prod config/deployment-prod.yaml
# For different environments, create environment-specific config files:
# config/deployment-dev.yaml, config/deployment-prod.yaml, etc.# Deploy just the base function manually
# Copy and paste the SQL from deploy.py output into Databricks SQL EditorThe system uses template variables for complete flexibility:
-- Query file: src/finops/your_category/your_alert_query.sql
CREATE OR REPLACE VIEW {{CATALOG_NAME}}.{{SCHEMA_NAME}}.your_alert_summary_vw AS
SELECT COUNT(*) as error_count
FROM error_logs
WHERE error_date >= CURRENT_DATE() - interval '{{TIME_INTERVAL}} hours'
AND warehouse_id = '{{WAREHOUSE_ID}}';
-- Alert file: src/finops/your_category/your_alert_alert.sql
SELECT {{CATALOG_NAME}}.{{SCHEMA_NAME}}.create_alert(
display_name => '{{ALERT_NAME}}_alert',
query_text => 'select * from {{CATALOG_NAME}}.{{SCHEMA_NAME}}.{{ALERT_NAME}}_summary_vw',
warehouse_id => '{{WAREHOUSE_ID}}',
threshold_value => {{THRESHOLD_VALUE}},
user_email => '{{USER_EMAIL}}',
cron_schedule => '{{CRON_SCHEDULE}}'
);All {{VARIABLE}} placeholders get automatically replaced with values from your YAML configuration!
- Databricks SQL Alerts Documentation
- Databricks SQL Functions
- Examples - Detailed usage examples
The deployment system uses YAML configuration files to manage environment-specific settings:
# Deployment configuration
deployment:
alerts_to_deploy:
- idle_time
- queue_time_percentage
# - daily_spend # Uncomment to include
environments:
dev:
# Database settings
catalog_name: "aa_catalog"
schema_name: "dw_ops"
# Infrastructure settings
warehouse_id: "4b9b953939869799"
# Notification settings
user_email: "[email protected]"
# Workspace settings
parent_path_root: "/Workspace/Users/[email protected]/"
# Global query settings (can be overridden per alert)
query_settings:
time_interval_hours: 26
default_statement_type: "SELECT"
alerts:
idle_time:
enabled: true # Enable/disable this alert
category: "usage_monitoring"
threshold_value: 10
cron_schedule: "0 0 */1 * * ?" # Every hour
source_display: "idle_cluster_count"
source_name: "idle_cluster_count"
parent_path_suffix: "finops_alerts/usage_monitoring"
description: "Monitors idle cluster count"
# Optional: Override global query settings
# query_settings:
# time_interval_hours: 48 # Override default 26 hours# Deployment configuration
deployment:
alerts_to_deploy:
- idle_time
- queue_time_percentage
environments:
prod:
warehouse_id: "prod-warehouse-id-67890"
user_email: "[email protected]"
parent_path_root: "/Workspace/Users/[email protected]/"
alerts:
idle_time:
enabled: true # Enable/disable this alert
threshold_value: 5 # More sensitive
cron_schedule: "0 */15 * * * ?" # Every 15 minutes
description: "Monitors idle cluster count"| Parameter | Description | Example |
|---|---|---|
catalog_name |
Databricks catalog name | "aa_catalog" |
schema_name |
Databricks schema name | "dw_ops" |
warehouse_id |
SQL warehouse ID | "4b9b953939869799" |
user_email |
Notification email | "[email protected]" |
parent_path_root |
Workspace path root | "/Workspace/Users/finops/" |
threshold_value |
Alert threshold | 10 |
cron_schedule |
Evaluation schedule | "0 */15 * * * ?" |
enabled |
Enable/disable alert | true or false |
source_display |
Column alias for display | "error_count" |
source_name |
Column alias for API | "error_count" |
description |
Human-readable description | "Monitors idle cluster count" |
Each FinOps alert consists of two files:
_query.sql: Creates a view with the monitoring query (uses template variables)_alert.sql: Creates the alert using the view (uses template variables)
The SQL files use template variables that get replaced during deployment:
Note: The create_alert function itself also uses template variables for catalog and schema names.
| Template Variable | Purpose | Config Source |
|---|---|---|
{{CATALOG_NAME}} |
Database catalog | environments.[env].catalog_name |
{{SCHEMA_NAME}} |
Database schema | environments.[env].schema_name |
{{WAREHOUSE_ID}} |
SQL warehouse ID | environments.[env].warehouse_id |
{{USER_EMAIL}} |
Notification email | environments.[env].user_email |
{{PARENT_PATH_ROOT}} |
Workspace path root | environments.[env].parent_path_root |
{{ALERT_NAME}} |
Alert name | From alert configuration |
{{THRESHOLD_VALUE}} |
Alert threshold | alerts.[alert].threshold_value |
{{CRON_SCHEDULE}} |
Evaluation schedule | alerts.[alert].cron_schedule |
{{TIME_INTERVAL}} |
Query time window | environments.[env].query_settings.time_interval_hours |
{{STATEMENT_TYPE}} |
SQL statement type | environments.[env].query_settings.default_statement_type |
{{SOURCE_DISPLAY}} |
Column display name | alerts.[alert].source_display |
{{SOURCE_NAME}} |
Column API name | alerts.[alert].source_name |
{{PARENT_PATH_SUFFIX}} |
Subfolder path | alerts.[alert].parent_path_suffix |
- queue_time_percentage: Monitors query queue time percentage
- Files:
src/finops/performance/queue_time_percentage_query.sql,queue_time_percentage_alert.sql
- Files:
- idle_time: Monitors idle cluster time
- Files:
src/finops/usage_monitoring/idle_time_query.sql,idle_time_alert.sql
- Files:
- daily_spend: Monitors daily spending
- Files:
src/finops/cost_monitoring/daily_spend_query.sql,daily_spend_alert.sql
- Files:
To add a new alert:
-
Create the query file:
src/finops/[category]/[alert_name]_query.sqlCREATE OR REPLACE VIEW {{CATALOG_NAME}}.{{SCHEMA_NAME}}.[alert_name]_summary_vw AS SELECT your_metric_column FROM your_table WHERE compute.warehouse_id = '{{WAREHOUSE_ID}}' AND start_time >= current_timestamp() - interval '{{TIME_INTERVAL}} hours' AND statement_type = '{{STATEMENT_TYPE}}';
-
Create the alert file:
src/finops/[category]/[alert_name]_alert.sqlSELECT {{CATALOG_NAME}}.{{SCHEMA_NAME}}.create_alert( display_name => '{{ALERT_NAME}}_alert', query_text => 'select * from {{CATALOG_NAME}}.{{SCHEMA_NAME}}.{{ALERT_NAME}}_summary_vw', warehouse_id => '{{WAREHOUSE_ID}}', comparison_operator => 'GREATER_THAN', threshold_value => {{THRESHOLD_VALUE}}, user_email => '{{USER_EMAIL}}', cron_schedule => '{{CRON_SCHEDULE}}', source_display => '{{SOURCE_DISPLAY}}', source_name => '{{SOURCE_NAME}}', parent_path => '{{PARENT_PATH_ROOT}}{{PARENT_PATH_SUFFIX}}' );
-
Add configuration to your config file:
alerts: your_alert_name: enabled: true category: "your_category" threshold_value: 10 cron_schedule: "0 */15 * * * ?" source_display: "your_column_name" source_name: "your_column_name" parent_path_suffix: "finops_alerts/your_category" description: "A brief description of your alert" # Optional: Override global query settings query_settings: time_interval_hours: 48 # Override default 26 hours
-
Add to deployment list:
deployment: alerts_to_deploy: - your_alert_name
- Budget Threshold Alerts: Monitor spending against budget limits
- Cost Spike Detection: Alert on unusual cost increases
- Resource Cost Optimization: Identify expensive resources
- Cluster Utilization: Monitor cluster usage patterns
- Storage Growth: Track storage consumption trends
- User Activity: Monitor user engagement and activity
- Query Performance: Monitor slow-running queries
- Resource Contention: Alert on resource bottlenecks
- System Health: Monitor overall system performance
display_name: Unique name for the alertquery_text: SQL query that returns the value to monitorwarehouse_id: SQL warehouse ID to run the query on
comparison_operator: GREATER_THAN, LESS_THAN, EQUAL, etc.threshold_value: Numeric threshold to compare againstuser_email: Email for notificationscron_schedule: Cron expression for evaluation schedulesource_display/name: Column alias from query_textparent_path: Workspace path for alert location
The Python deployment tool (deploy.py) provides a modern, cross-platform solution for deploying FinOps alerts:
- π§ Template Variable Replacement: Automatically replaces
{{VARIABLE}}placeholders with config values - π¨ Beautiful UI: Rich, colorful output with tables and syntax highlighting
- β Configuration Validation: Comprehensive validation of YAML configuration
- π‘οΈ Error Handling: Robust error handling with clear error messages
- π Cross-Platform: Works on Windows, Mac, and Linux
- π€ Interactive: Confirms deployment before proceeding
list-alerts: Display all available alerts with their statusvalidate: Validate configuration file for errorsdeploy: Deploy alerts to specified environment (with template variable replacement)
During deployment, the tool automatically replaces template variables:
{{WAREHOUSE_ID}}β Your warehouse ID from config{{USER_EMAIL}}β Your email from config{{THRESHOLD_VALUE}}β Alert threshold from config- And many more...
For detailed examples, see examples.md.
- Fork the repository
- Create a feature branch
- Add your alert configurations to the appropriate folder
- Update configuration files as needed
- Test with
python3 deploy.py validate config/deployment.yaml - Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
For issues and questions:
- Check the Databricks documentation
- Review the examples
- Open an issue in this repository