Missing host Parameter Support in spark-snowflake Connector #601

Open

maciekgrochowicz opened this issue Feb 3, 2025 · 0 comments
Description:

I'm encountering an issue where some connection parameters available in the standard Snowflake JDBC driver (and the Python connector) are missing from the spark-snowflake connector. In particular, the host parameter is not available.

Current Behavior:

When configuring the connection with the spark-snowflake connector, the options available are as follows:

val sfOptions = Map(
    "sfURL" -> "<account_identifier>.snowflakecomputing.com",
    "sfUser" -> "<user_name>",
    "sfPassword" -> "<password>",
    "sfDatabase" -> "<database>",
    "sfSchema" -> "<schema>",
    "sfWarehouse" -> "<warehouse>"
)
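
For reference, this is how the options map is consumed on a read (standard connector usage; <table_name> is a placeholder and an existing SparkSession named spark is assumed):

// Standard read path: the options map is handed straight to the DataFrame reader.
val df = spark.read
  .format("net.snowflake.spark.snowflake")
  .options(sfOptions)
  .option("dbtable", "<table_name>")
  .load()

There is no key in this map through which a host value could reach the underlying JDBC driver.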

In contrast, the Python connector (mirroring the underlying JDBC driver) accepts a host parameter:

import snowflake.connector

ctx = snowflake.connector.connect(
    user="<username>",
    host="<hostname>",
    account="<account_identifier>",
    authenticator="oauth",
    token="<oauth_access_token>",
    warehouse="test_warehouse",
    database="test_db",
    schema="test_schema"
)
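
The same is true of the plain Snowflake JDBC driver, where the host is simply part of the connection URL and can therefore be pointed at a specific endpoint already. A rough Scala sketch (placeholders as above; treat the exact property set as an assumption, not a complete configuration):

import java.sql.DriverManager
import java.util.Properties

// The JDBC URL itself carries the host; <hostname> and the token are placeholders.
val props = new Properties()
props.setProperty("user", "<user_name>")
props.setProperty("authenticator", "oauth")
props.setProperty("token", "<oauth_access_token>")
val conn = DriverManager.getConnection("jdbc:snowflake://<hostname>", props)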

Impact:

I am running my Spark application inside a Snowpark Container Services container. With the current spark-snowflake connector, the connection reaches Snowflake over a public IP address, which our network policy blocks. I would prefer the container's traffic to be recognized as originating from Snowflake's network, which appears to be possible when a host parameter is provided, as with the standard JDBC connector. Since the spark-snowflake connector uses the Snowflake JDBC driver under the hood, exposing this parameter looks like a straightforward enhancement.
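
Inside the container, the runtime already exposes the internal hostname and an OAuth token (the names below are the ones documented for Snowpark Container Services), but there is currently no connector option that can consume them:

import scala.io.Source

// Snowpark Container Services injects the internal hostname as an environment
// variable and mounts an OAuth token file into each container.
val host  = sys.env("SNOWFLAKE_HOST")
val token = Source.fromFile("/snowflake/session/token").mkString
// No sfOptions key accepts a host value today, so this cannot be wired into
// the spark-snowflake connector.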

Expected Behavior:

It would be ideal if the spark-snowflake connector allowed the host parameter (and any similar missing parameters) to be passed through to the underlying JDBC driver, similar to how the Python connector handles it; a sketch of what that could look like follows.
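
To make the shape of the change explicit, "sfHost" below is a hypothetical option name I am proposing, not an existing one:

// Hypothetical sketch: "sfHost" does not exist today; the connector would pass
// it through to the JDBC driver the way the Python connector passes `host`.
val sfOptions = Map(
  "sfURL"  -> "<account_identifier>.snowflakecomputing.com",
  "sfHost" -> sys.env("SNOWFLAKE_HOST"),  // proposed pass-through
  "sfUser" -> "<user_name>",
  "sfDatabase" -> "<database>",
  "sfSchema" -> "<schema>",
  "sfWarehouse" -> "<warehouse>"
)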

Additional Information:

Could you please consider exposing this parameter in the connector configuration? It would resolve our network configuration issue without requiring changes to our network policies.

I could create a PR myself. Please let me know if you are okay with this approach.
