This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

Snowflake support (#1763)
* Snowflake support

Signed-off-by: markfinksfx <[email protected]>

Co-authored-by: markfinksfx <[email protected]>
Co-authored-by: Ben Keith <[email protected]>
3 people authored May 14, 2021
1 parent 31049db commit 2299480
Showing 8 changed files with 287 additions and 34 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -30,3 +30,5 @@ integration_results.*
local-etc/
test_output/
tmp/
snowenv.sh
snowsql*
29 changes: 27 additions & 2 deletions docs/monitors/sql.md
@@ -116,6 +116,7 @@ currently support and documentation on the connection string:
- `postgres`: https://godoc.org/github.com/lib/pq#hdr-Connection_String_Parameters
- `mysql`: https://github.com/go-sql-driver/mysql#dsn-data-source-name
- `sqlserver`: https://github.com/denisenkom/go-mssqldb#connection-parameters-and-dsn
- `snowflake`: https://pkg.go.dev/github.com/snowflakedb/gosnowflake#hdr-Connection_Parameters

## Parameterized Connection String

@@ -125,6 +126,30 @@ the `params` config option map. You interpolate variables into it
with the Go template syntax `{{.varname}}` (see example config
above).
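
The interpolation uses Go's `text/template` syntax, so its behavior can be sketched with the standard library directly. The following is a minimal, hypothetical illustration of how a templated connection string is rendered from `params` — not the agent's actual code:

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// renderConnStr interpolates params into a connection string template
// using Go template syntax, e.g. {{.user}} looks up params["user"].
func renderConnStr(connStr string, params map[string]string) (string, error) {
	tmpl, err := template.New("connstr").Parse(connStr)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, params); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	params := map[string]string{
		"account": "account.region", "database": "SNOWFLAKE",
		"schema": "ACCOUNT_USAGE", "role": "ACCOUNTADMIN",
		"user": "user", "password": "password",
	}
	out, err := renderConnStr(
		"{{.user}}:{{.password}}@{{.account}}/{{.database}}/{{.schema}}?role={{.role}}",
		params,
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	// prints: user:password@account.region/SNOWFLAKE/ACCOUNT_USAGE?role=ACCOUNTADMIN
}
```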

## Snowflake Performance and Usage Metrics

To configure the agent to collect Snowflake performance and usage metrics:
- Copy `pkg/monitors/sql/snowflake-metrics.yaml` from this repo into the same directory as your `agent.yaml` file (for example, `/etc/signalfx`).
- Configure the `sql` monitor as follows:
```yaml
monitors:
  - type: sql
    intervalSeconds: 3600
    dbDriver: snowflake
    params:
      account: "account.region"
      database: "SNOWFLAKE"
      schema: "ACCOUNT_USAGE"
      role: "ACCOUNTADMIN"
      user: "user"
      password: "password"
    connectionString: "{{.user}}:{{.password}}@{{.account}}/{{.database}}/{{.schema}}?role={{.role}}"
    queries:
      {"#from": "/etc/signalfx/snowflake-metrics.yaml"}
```

You can also paste the contents of snowflake-metrics.yaml directly into agent.yaml under `queries`, if preferred, and you can edit snowflake-metrics.yaml to include only the metrics you care about.
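
Inlined, the monitor block might look like the following sketch, showing just the first storage query from snowflake-metrics.yaml (the remaining queries follow the same shape):

```yaml
monitors:
  - type: sql
    intervalSeconds: 3600
    dbDriver: snowflake
    params:
      account: "account.region"
      user: "user"
      password: "password"
    connectionString: "{{.user}}:{{.password}}@{{.account}}/SNOWFLAKE/ACCOUNT_USAGE?role=ACCOUNTADMIN"
    queries:
      - query: "SELECT STORAGE_BYTES, STAGE_BYTES, FAILSAFE_BYTES from STORAGE_USAGE ORDER BY USAGE_DATE DESC LIMIT 1;"
        metrics:
          - metricName: "snowflake.storage.storage_bytes.total"
            valueColumn: "STORAGE_BYTES"
```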


## Configuration

@@ -146,8 +171,8 @@ Configuration](../monitor-config.md#common-configuration).**
| `host` | no | `string` | |
| `port` | no | `integer` | (**default:** `0`) |
| `params` | no | `map of strings` | Parameters to the connectionString that can be templated into that option using Go template syntax (e.g. `{{.key}}`). |
| `dbDriver` | no | `string` | The database driver to use, valid values are `postgres`, `mysql` and `sqlserver`. |
| `connectionString` | no | `string` | A URL or simple option string used to connect to the database. If using PostgreSQL, [see the list of connection string params](https://godoc.org/github.com/lib/pq#hdr-Connection_String_Parameters). |
| `dbDriver` | no | `string` | The database driver to use, valid values are `postgres`, `mysql`, `sqlserver`, and `snowflake`. |
| `connectionString` | no | `string` | A URL or simple option string used to connect to the database. For example, if using PostgreSQL, [see the list of connection string params](https://godoc.org/github.com/lib/pq#hdr-Connection_String_Parameters). |
| `queries` | **yes** | `list of objects (see below)` | A list of queries to make against the database that are used to generate datapoints. |
| `logQueries` | no | `bool` | If true, query results will be logged at the info level. (**default:** `false`) |

3 changes: 2 additions & 1 deletion go.mod
@@ -87,6 +87,7 @@ require (
github.com/signalfx/signalfx-go-tracing v1.2.0
github.com/sirupsen/logrus v1.8.0
github.com/smartystreets/goconvey v1.6.4
github.com/snowflakedb/gosnowflake v1.4.3
github.com/soniah/gosnmp v0.0.0-20190220004421-68e8beac0db9 // indirect; required; first version with go modules
github.com/stretchr/testify v1.7.0
github.com/tidwall/gjson v1.6.4 // indirect
@@ -99,7 +100,7 @@ require (
go.etcd.io/etcd/client/v2 v2.305.0-alpha.0
golang.org/x/net v0.0.0-20210119194325-5f4716e94777
golang.org/x/sync v0.0.0-20201207232520-09787c993a3a
golang.org/x/sys v0.0.0-20210124154548-22da62e12c0c
golang.org/x/sys v0.0.0-20210303074136-134d130e1a04
golang.org/x/tools v0.1.0
google.golang.org/grpc v1.33.2
gopkg.in/fatih/set.v0 v0.1.0
65 changes: 40 additions & 25 deletions go.sum

Large diffs are not rendered by default.

25 changes: 25 additions & 0 deletions pkg/monitors/sql/metadata.yaml
@@ -110,6 +110,7 @@ monitors:
- `postgres`: https://godoc.org/github.com/lib/pq#hdr-Connection_String_Parameters
- `mysql`: https://github.com/go-sql-driver/mysql#dsn-data-source-name
- `sqlserver`: https://github.com/denisenkom/go-mssqldb#connection-parameters-and-dsn
- `snowflake`: https://pkg.go.dev/github.com/snowflakedb/gosnowflake#hdr-Connection_Parameters
## Parameterized Connection String
@@ -118,3 +119,27 @@ monitors:
the `params` config option map. You interpolate variables into it
with the Go template syntax `{{.varname}}` (see example config
above).
## Snowflake Performance and Usage Metrics
To configure the agent to collect Snowflake performance and usage metrics:
- Copy `pkg/monitors/sql/snowflake-metrics.yaml` from this repo into the same directory as your `agent.yaml` file (for example, `/etc/signalfx`).
- Configure the sql monitor as follows:
```yaml
monitors:
  - type: sql
    intervalSeconds: 3600
    dbDriver: snowflake
    params:
      account: "account.region"
      database: "SNOWFLAKE"
      schema: "ACCOUNT_USAGE"
      role: "ACCOUNTADMIN"
      user: "user"
      password: "password"
    connectionString: "{{.user}}:{{.password}}@{{.account}}/{{.database}}/{{.schema}}?role={{.role}}"
    queries:
      {"#from": "/etc/signalfx/snowflake-metrics.yaml"}
```
You can also paste the contents of snowflake-metrics.yaml directly into agent.yaml under `queries`, if preferred, and you can edit snowflake-metrics.yaml to include only the metrics you care about.
8 changes: 5 additions & 3 deletions pkg/monitors/sql/monitor.go
@@ -20,6 +20,7 @@ import (
_ "github.com/go-sql-driver/mysql"
_ "github.com/jackc/pgx/v4/stdlib"
_ "github.com/lib/pq"
_ "github.com/snowflakedb/gosnowflake"
)

var logger = logrus.WithFields(logrus.Fields{"monitorType": monitorMetadata.MonitorType})
@@ -82,10 +82,11 @@ type Config struct {
// Go template syntax (e.g. `{{.key}}`).
Params map[string]string `yaml:"params"`

// The database driver to use, valid values are `postgres`, `mysql` and `sqlserver`.
// The database driver to use, valid values are `postgres`, `mysql`, `sqlserver`,
// and `snowflake`.
DBDriver string `yaml:"dbDriver"`
// A URL or simple option string used to connect to the database.
// If using PostgreSQL, [see the list of connection string
// For example, if using PostgreSQL, [see the list of connection string
// params](https://godoc.org/github.com/lib/pq#hdr-Connection_String_Parameters).
ConnectionString string `yaml:"connectionString"`

@@ -98,7 +100,7 @@ type Config struct {

// Validate that the config is right
func (c *Config) Validate() error {
if c.DBDriver != "postgres" && c.DBDriver != "mysql" && c.DBDriver != "sqlserver" {
if c.DBDriver != "postgres" && c.DBDriver != "mysql" && c.DBDriver != "sqlserver" && c.DBDriver != "snowflake" {
return fmt.Errorf("database driver %s is not supported", c.DBDriver)
}

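
As the list of drivers grows, the chained comparisons in `Validate` could equally be written as a set lookup. A sketch of the equivalent check, not the actual implementation:

```go
package main

import "fmt"

// supportedDrivers mirrors the driver names the monitor accepts.
var supportedDrivers = map[string]bool{
	"postgres":  true,
	"mysql":     true,
	"sqlserver": true,
	"snowflake": true,
}

// validateDriver returns an error for any driver name not in the set.
func validateDriver(d string) error {
	if !supportedDrivers[d] {
		return fmt.Errorf("database driver %s is not supported", d)
	}
	return nil
}

func main() {
	fmt.Println(validateDriver("snowflake")) // <nil>
	fmt.Println(validateDriver("oracle"))    // database driver oracle is not supported
}
```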
183 changes: 183 additions & 0 deletions pkg/monitors/sql/snowflake-metrics.yaml
@@ -0,0 +1,183 @@
- query: "SELECT STORAGE_BYTES, STAGE_BYTES, FAILSAFE_BYTES from STORAGE_USAGE ORDER BY USAGE_DATE DESC LIMIT 1;"
metrics:
- metricName: "snowflake.storage.storage_bytes.total"
valueColumn: "STORAGE_BYTES"
- metricName: "snowflake.storage.stage_bytes.total"
valueColumn: "STAGE_BYTES"
- metricName: "snowflake.storage.failsafe_bytes.total"
valueColumn: "FAILSAFE_BYTES"
- query: "SELECT DATABASE_NAME, AVERAGE_DATABASE_BYTES, AVERAGE_FAILSAFE_BYTES from DATABASE_STORAGE_USAGE_HISTORY ORDER BY USAGE_DATE DESC LIMIT 1;"
metrics:
- metricName: "snowflake.storage.database.storage_bytes"
valueColumn: "AVERAGE_DATABASE_BYTES"
dimensionColumns: ["DATABASE_NAME"]
- metricName: "snowflake.storage.database.failsafe_bytes"
valueColumn: "AVERAGE_FAILSAFE_BYTES"
dimensionColumns: ["DATABASE_NAME"]
- query: "select SERVICE_TYPE, NAME, sum(CREDITS_USED_COMPUTE), avg(CREDITS_USED_COMPUTE), sum(CREDITS_USED_CLOUD_SERVICES), avg(CREDITS_USED_CLOUD_SERVICES), sum(CREDITS_USED), avg(CREDITS_USED) from METERING_HISTORY where start_time >= date_trunc(day, current_date) group by 1, 2;"
metrics:
- metricName: "snowflake.billing.virtual_warehouse.sum"
valueColumn: "SUM(CREDITS_USED_COMPUTE)"
dimensionColumns: ["SERVICE_TYPE", "NAME"]
- metricName: "snowflake.billing.virtual_warehouse.avg"
valueColumn: "AVG(CREDITS_USED_COMPUTE)"
dimensionColumns: ["SERVICE_TYPE", "NAME"]
- metricName: "snowflake.billing.cloud_service.sum"
valueColumn: "SUM(CREDITS_USED_CLOUD_SERVICES)"
dimensionColumns: ["SERVICE_TYPE", "NAME"]
- metricName: "snowflake.billing.cloud_service.avg"
valueColumn: "AVG(CREDITS_USED_CLOUD_SERVICES)"
dimensionColumns: ["SERVICE_TYPE", "NAME"]
- metricName: "snowflake.billing.total_credit.sum"
valueColumn: "SUM(CREDITS_USED)"
dimensionColumns: ["SERVICE_TYPE", "NAME"]
- metricName: "snowflake.billing.total_credit.avg"
valueColumn: "AVG(CREDITS_USED)"
dimensionColumns: ["SERVICE_TYPE", "NAME"]
- query: "select WAREHOUSE_NAME, sum(CREDITS_USED_COMPUTE), avg(CREDITS_USED_COMPUTE), sum(CREDITS_USED_CLOUD_SERVICES), avg(CREDITS_USED_CLOUD_SERVICES), sum(CREDITS_USED), avg(CREDITS_USED) from WAREHOUSE_METERING_HISTORY where start_time >= date_trunc(day, current_date) group by 1;"
metrics:
- metricName: "snowflake.billing.warehouse.virtual_warehouse.sum"
valueColumn: "SUM(CREDITS_USED_COMPUTE)"
dimensionColumns: ["WAREHOUSE_NAME"]
- metricName: "snowflake.billing.warehouse.virtual_warehouse.avg"
valueColumn: "AVG(CREDITS_USED_COMPUTE)"
dimensionColumns: ["WAREHOUSE_NAME"]
- metricName: "snowflake.billing.warehouse.cloud_service.sum"
valueColumn: "SUM(CREDITS_USED_CLOUD_SERVICES)"
dimensionColumns: ["WAREHOUSE_NAME"]
- metricName: "snowflake.billing.warehouse.cloud_service.avg"
valueColumn: "AVG(CREDITS_USED_CLOUD_SERVICES)"
dimensionColumns: ["WAREHOUSE_NAME"]
- metricName: "snowflake.billing.warehouse.total_credit.sum"
valueColumn: "SUM(CREDITS_USED)"
dimensionColumns: ["WAREHOUSE_NAME"]
- metricName: "snowflake.billing.warehouse.total_credit.avg"
valueColumn: "AVG(CREDITS_USED)"
dimensionColumns: ["WAREHOUSE_NAME"]
- query: "select REPORTED_CLIENT_TYPE, sum(iff(IS_SUCCESS = 'NO', 1, 0)), sum(iff(IS_SUCCESS = 'YES', 1, 0)), count(*) from LOGIN_HISTORY group by REPORTED_CLIENT_TYPE;"
metrics:
- metricName: "snowflake.logins.fail.count"
valueColumn: "SUM(IFF(IS_SUCCESS = 'NO', 1, 0))"
isCumulative: true
dimensionColumns: ["REPORTED_CLIENT_TYPE"]
- metricName: "snowflake.logins.success.count"
valueColumn: "SUM(IFF(IS_SUCCESS = 'YES', 1, 0))"
isCumulative: true
dimensionColumns: ["REPORTED_CLIENT_TYPE"]
- metricName: "snowflake.logins.total"
valueColumn: "COUNT(*)"
isCumulative: true
dimensionColumns: ["REPORTED_CLIENT_TYPE"]
- query: "select WAREHOUSE_NAME, AVG(AVG_RUNNING), AVG(AVG_QUEUED_LOAD), AVG(AVG_QUEUED_PROVISIONING), AVG(AVG_BLOCKED) from WAREHOUSE_LOAD_HISTORY where start_time >= date_trunc(day, current_date) group by 1;"
metrics:
- metricName: "snowflake.query.executed"
valueColumn: "AVG(AVG_RUNNING)"
dimensionColumns: ["WAREHOUSE_NAME"]
- metricName: "snowflake.query.queued_overload"
valueColumn: "AVG(AVG_QUEUED_LOAD)"
dimensionColumns: ["WAREHOUSE_NAME"]
- metricName: "snowflake.query.queued_provision"
valueColumn: "AVG(AVG_QUEUED_PROVISIONING)"
dimensionColumns: ["WAREHOUSE_NAME"]
- metricName: "snowflake.query.blocked"
valueColumn: "AVG(AVG_BLOCKED)"
dimensionColumns: ["WAREHOUSE_NAME"]
- query: "select QUERY_TYPE, WAREHOUSE_NAME, DATABASE_NAME, SCHEMA_NAME, AVG(EXECUTION_TIME), AVG(COMPILATION_TIME), AVG(BYTES_SCANNED), AVG(BYTES_WRITTEN), AVG(BYTES_DELETED), AVG(BYTES_SPILLED_TO_LOCAL_STORAGE), AVG(BYTES_SPILLED_TO_REMOTE_STORAGE) from QUERY_HISTORY where start_time >= date_trunc(day, current_date) group by 1, 2, 3, 4;"
metrics:
- metricName: "snowflake.query.execution_time"
valueColumn: "AVG(EXECUTION_TIME)"
dimensionColumns: ["QUERY_TYPE", "WAREHOUSE_NAME", "DATABASE_NAME", "SCHEMA_NAME"]
- metricName: "snowflake.query.compilation_time"
valueColumn: "AVG(COMPILATION_TIME)"
dimensionColumns: ["QUERY_TYPE", "WAREHOUSE_NAME", "DATABASE_NAME", "SCHEMA_NAME"]
- metricName: "snowflake.query.bytes_scanned"
valueColumn: "AVG(BYTES_SCANNED)"
dimensionColumns: ["QUERY_TYPE", "WAREHOUSE_NAME", "DATABASE_NAME", "SCHEMA_NAME"]
- metricName: "snowflake.query.bytes_written"
valueColumn: "AVG(BYTES_WRITTEN)"
dimensionColumns: ["QUERY_TYPE", "WAREHOUSE_NAME", "DATABASE_NAME", "SCHEMA_NAME"]
- metricName: "snowflake.query.bytes_deleted"
valueColumn: "AVG(BYTES_DELETED)"
dimensionColumns: ["QUERY_TYPE", "WAREHOUSE_NAME", "DATABASE_NAME", "SCHEMA_NAME"]
- metricName: "snowflake.query.bytes_spilled.local"
valueColumn: "AVG(BYTES_SPILLED_TO_LOCAL_STORAGE)"
dimensionColumns: ["QUERY_TYPE", "WAREHOUSE_NAME", "DATABASE_NAME", "SCHEMA_NAME"]
- metricName: "snowflake.query.bytes_spilled.remote"
valueColumn: "AVG(BYTES_SPILLED_TO_REMOTE_STORAGE)"
dimensionColumns: ["QUERY_TYPE", "WAREHOUSE_NAME", "DATABASE_NAME", "SCHEMA_NAME"]
- query: "select source_cloud, source_region, target_cloud, target_region, transfer_type, avg(bytes_transferred), sum(bytes_transferred) from DATA_TRANSFER_HISTORY where start_time >= date_trunc(day, current_date) group by 1, 2, 3, 4, 5;"
metrics:
- metricName: "snowflake.data_transfer.bytes.avg"
valueColumn: "AVG(BYTES_TRANSFERRED)"
dimensionColumns: ["SOURCE_CLOUD", "SOURCE_REGION", "TARGET_CLOUD", "TARGET_REGION", "TRANSFER_TYPE"]
- metricName: "snowflake.data_transfer.bytes.sum"
valueColumn: "SUM(BYTES_TRANSFERRED)"
dimensionColumns: ["SOURCE_CLOUD", "SOURCE_REGION", "TARGET_CLOUD", "TARGET_REGION", "TRANSFER_TYPE"]
- query: "select table_name, database_name, schema_name, avg(credits_used), sum(credits_used), avg(num_bytes_reclustered), sum(num_bytes_reclustered), avg(num_rows_reclustered), sum(num_rows_reclustered) from automatic_clustering_history where start_time >= date_trunc(day, current_date) group by 1, 2, 3;"
metrics:
- metricName: "snowflake.auto_recluster.credits_used.avg"
valueColumn: "AVG(CREDITS_USED)"
dimensionColumns: ["TABLE_NAME", "DATABASE_NAME", "SCHEMA_NAME"]
- metricName: "snowflake.auto_recluster.credits_used.sum"
valueColumn: "SUM(CREDITS_USED)"
dimensionColumns: ["TABLE_NAME", "DATABASE_NAME", "SCHEMA_NAME"]
- metricName: "snowflake.auto_recluster.bytes_reclustered.avg"
valueColumn: "AVG(NUM_BYTES_RECLUSTERED)"
dimensionColumns: ["TABLE_NAME", "DATABASE_NAME", "SCHEMA_NAME"]
- metricName: "snowflake.auto_recluster.bytes_reclustered.sum"
valueColumn: "SUM(NUM_BYTES_RECLUSTERED)"
dimensionColumns: ["TABLE_NAME", "DATABASE_NAME", "SCHEMA_NAME"]
- metricName: "snowflake.auto_recluster.rows_reclustered.avg"
valueColumn: "AVG(NUM_ROWS_RECLUSTERED)"
dimensionColumns: ["TABLE_NAME", "DATABASE_NAME", "SCHEMA_NAME"]
- metricName: "snowflake.auto_recluster.rows_reclustered.sum"
valueColumn: "SUM(NUM_ROWS_RECLUSTERED)"
dimensionColumns: ["TABLE_NAME", "DATABASE_NAME", "SCHEMA_NAME"]
- query: "select table_name, table_schema, avg(ACTIVE_BYTES), avg(TIME_TRAVEL_BYTES), avg(FAILSAFE_BYTES), avg(RETAINED_FOR_CLONE_BYTES) from table_storage_metrics group by 1, 2;"
metrics:
- metricName: "snowflake.storage.table.active_bytes.avg"
valueColumn: "AVG(ACTIVE_BYTES)"
dimensionColumns: ["TABLE_NAME", "TABLE_SCHEMA"]
- metricName: "snowflake.storage.table.time_travel_bytes.avg"
valueColumn: "AVG(TIME_TRAVEL_BYTES)"
dimensionColumns: ["TABLE_NAME", "TABLE_SCHEMA"]
- metricName: "snowflake.storage.table.failsafe_bytes.avg"
valueColumn: "AVG(FAILSAFE_BYTES)"
dimensionColumns: ["TABLE_NAME", "TABLE_SCHEMA"]
- metricName: "snowflake.storage.table.retained_bytes.avg"
valueColumn: "AVG(RETAINED_FOR_CLONE_BYTES)"
dimensionColumns: ["TABLE_NAME", "TABLE_SCHEMA"]
- query: "select pipe_name, avg(credits_used), sum(credits_used), avg(bytes_inserted), sum(bytes_inserted), avg(files_inserted), sum(files_inserted) from pipe_usage_history where start_time >= date_trunc(day, current_date) group by 1;"
metrics:
- metricName: "snowflake.pipe.credits_used.avg"
valueColumn: "AVG(CREDITS_USED)"
dimensionColumns: ["PIPE_NAME"]
- metricName: "snowflake.pipe.credits_used.sum"
valueColumn: "SUM(CREDITS_USED)"
dimensionColumns: ["PIPE_NAME"]
- metricName: "snowflake.pipe.bytes_inserted.avg"
valueColumn: "AVG(BYTES_INSERTED)"
dimensionColumns: ["PIPE_NAME"]
- metricName: "snowflake.pipe.bytes_inserted.sum"
valueColumn: "SUM(BYTES_INSERTED)"
dimensionColumns: ["PIPE_NAME"]
- metricName: "snowflake.pipe.files_inserted.avg"
valueColumn: "AVG(FILES_INSERTED)"
dimensionColumns: ["PIPE_NAME"]
- metricName: "snowflake.pipe.files_inserted.sum"
valueColumn: "SUM(FILES_INSERTED)"
dimensionColumns: ["PIPE_NAME"]
- query: "select database_name, avg(credits_used), sum(credits_used), avg(bytes_transferred), sum(bytes_transferred) from replication_usage_history where start_time >= date_trunc(day, current_date) group by 1;"
metrics:
- metricName: "snowflake.replication.credits_used.avg"
valueColumn: "AVG(CREDITS_USED)"
dimensionColumns: ["DATABASE_NAME"]
- metricName: "snowflake.replication.credits_used.sum"
valueColumn: "SUM(CREDITS_USED)"
dimensionColumns: ["DATABASE_NAME"]
- metricName: "snowflake.replication.bytes_transferred.avg"
valueColumn: "AVG(BYTES_TRANSFERRED)"
dimensionColumns: ["DATABASE_NAME"]
- metricName: "snowflake.replication.bytes_transferred.sum"
valueColumn: "SUM(BYTES_TRANSFERRED)"
dimensionColumns: ["DATABASE_NAME"]
