Add Redshift Iam Idc token authentication method with an eye towards future supported Idps #970

Merged
VersusFacit merged 10 commits into main from ADAP-898/add_token_iam_authentication
Jan 16, 2025

Conversation

Contributor

@VersusFacit VersusFacit commented Dec 17, 2024

resolves #898

Problem

Add a refresh_token-based authentication method. refresh_token allows use of Iam Idc tokens that we generate ad hoc for the redshift_connector.connect call. We had originally sought to provide a single access token and reuse it until its TTL expired, but this is impossible: tokens are only good for one use in this integration. Hence we must generate one ad hoc each time, and refresh tokens are an expedient and industry-compliant method for doing so.

Solution

Provide a refresh token endpoint and necessary information to enable this.

Additional testing

End-to-end test using the following profile, which points to a test Redshift cluster on an AWS account with an integrated Okta <> Iam Idc <> Redshift token authentication suite.

import pytest
from dbt.tests.util import run_dbt


class TestMyTest:
    @pytest.fixture(scope="class")
    def models(self):
        # A single trivial model is enough to exercise the connection.
        return {
            "base_table.sql": "{{ config(materialized='table') }} select 1 as id",
        }

    @pytest.fixture(scope="class")
    def dbt_profile_target(self):
        return {
            "type": "redshift",
            "host": "...",
            "port": 5439,
            "dbname": "dev",
            "threads": 1,
            "token_endpoint": {
                "request_url": "https://....oktapreview.com/oauth2/default/v1/token",
                "idp_auth_credentials": ...,
                "request_data": 'grant_type=refresh_token&redirect_uri=http%3A%2F%2Flocalhost%3A8080%2Flogin%2Foauth2%2Fcode%2Fokta&refresh_token=...',
            },
            "method": "oauth_token_identity_center",
            "schema": "dbt_mila",
        }

    def test_my_test(self, project):
        # Building the model forces a connection through the token endpoint.
        run_dbt()

where ... is some credential I've used for testing.

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

@cla-bot cla-bot bot added the cla:yes label Dec 17, 2024
Contributor

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the dbt-redshift contributing guide.

@VersusFacit VersusFacit self-assigned this Dec 18, 2024
We expect users of this method to provide a YAML-structured set of params including a URI, an authentication string, and whatever parameters are needed to construct the payload equivalent to the data in a curl request. Under the hood there is an all-important POST, which needs a set of params unique to each identity provider, to generate access tokens for use with TokenAuthIdpPlugin.
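As a rough illustration of the POST described above, the sketch below builds (but does not send) a token request from the three profile fields, using only the standard library. The helper name `build_token_request`, the choice to pass the credentials string straight through as the `Authorization` header, and all example values are assumptions for illustration; this is not the adapter's actual code.

```python
import urllib.request


def build_token_request(
    request_url: str, idp_auth_credentials: str, request_data: str
) -> urllib.request.Request:
    """Build the POST that exchanges a refresh token for a fresh access token.

    Hypothetical helper: the header layout is an assumption, and each
    identity provider may expect different fields.
    """
    headers = {
        "Accept": "application/json",
        # Assumed: the credentials string is used as-is for Authorization.
        "Authorization": idp_auth_credentials,
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return urllib.request.Request(
        request_url,
        data=request_data.encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_token_request(
    "https://idp.example.com/oauth2/default/v1/token",  # placeholder URL
    "Basic <base64 client id and secret>",  # placeholder credentials
    "grant_type=refresh_token&refresh_token=<refresh token>",  # placeholder payload
)
```

Since tokens are single use in this integration, a request along these lines would have to be issued before each redshift_connector.connect call.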
@VersusFacit VersusFacit force-pushed the ADAP-898/add_token_iam_authentication branch from 829426f to afd9d13 Compare January 14, 2025 05:08
@VersusFacit VersusFacit marked this pull request as ready for review January 14, 2025 05:46
@VersusFacit VersusFacit requested a review from a team as a code owner January 14, 2025 05:46
@VersusFacit VersusFacit changed the title Fix tests and add token authentication method to auth flow Add Redshift Iam Idc token authentication method with an eye towards future supported Idps Jan 14, 2025
normal request failures.
"""
# Handle the 429 rate-limiting case first
if response.status_code == 429:
Contributor

if we're hitting the rate limit could we add an exponential backoff/retry strategy here?

Contributor Author

@VersusFacit VersusFacit Jan 14, 2025

haha, I kind of knew you were going to prompt me for that. We're actively determining that customer experience. Let me filter this up to the wider team to see if it makes sense for now

Contributor

What are the scenarios where we're hitting the rate limit? I don't see a retry on this, so is this using some generalized retry? Does this get run on each model? Each run?

Contributor Author

@VersusFacit VersusFacit Jan 14, 2025

If we actually do a retry strategy -- the necessity of which is undetermined and trending towards 'not needed' based on @jenniferjsmmiller 's current research -- it'd be used on every connection. The 429 is specifically a rate limiting error status from okta

Contributor

Since we're reusing connections where possible, this feels like a low occurrence event. In other words, if we hit the rate limit, my guess is that something else is actually going wrong or the user is simply abusing the application (e.g. running a lot of threads of dbt-redshift in parallel via something like airflow, which isn't officially supported).

Contributor Author

We don't reuse Redshift connections, do we? There was also concern about 100 threads being set and this occurring at startup, but even that was something Jenn didn't seem to trigger in their research 🤞
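For reference, the exponential backoff suggested earlier in this thread could look roughly like the sketch below. It is not part of this PR (the thread leans towards not needing it); `fetch_with_backoff`, `RateLimitedError`, and the default attempt count and delay are hypothetical. The `send` callable stands in for whatever performs the token POST and returns its HTTP status code.

```python
import time
from typing import Callable


class RateLimitedError(Exception):
    """Raised when the identity provider keeps answering 429."""


def fetch_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a status other than 429.

    The wait doubles after each rate-limited attempt:
    base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return status
        # No point sleeping after the final attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2**attempt)
    raise RateLimitedError(f"still rate limited after {max_attempts} attempts")
```

Injecting `sleep` keeps the helper testable without real waiting; in production the default `time.sleep` would apply.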

@VersusFacit
Contributor Author

There was an internal request to check in a version of this integration test without the creds, which we skip for now. We can't support this on our current infra. I've made a Jira ticket to track this for our backlog though.

@VersusFacit
Contributor Author

VersusFacit commented Jan 15, 2025

Okay team, I've generalized the framework out to include an Entra option which may work (may need some slight adjustment but I've tested this pretty well considering our infra options imo). Moreover, the okta framework is better:tm: now.

A profile that wants to use this method will thus specify at least:

method: oauth_token_identity_center
token_endpoint:
    type: okta|entra|...
    request_url: <https url to host with api endpoint>
    request_data: <data params needed>
    ...specific fields for each individual Idp...

I've added a bunch more unit tests and re-tested this using my refresh token end to end model build test :D
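One way the `type` field in the profile outline above could select per-IdP handling is a small dispatch table, sketched below. Every name here (`COMMON_KEYS`, `HANDLERS`, `select_handler`, and the two `describe_*` stand-ins) is hypothetical, and the IdP-specific fields are deliberately left out because the outline does not list them.

```python
from typing import Callable, Dict

# Keys that every token_endpoint block carries in the outline above.
COMMON_KEYS = ("type", "request_url", "request_data")


def describe_okta(endpoint: dict) -> str:
    # Stand-in for Okta-specific token handling.
    return f"okta token request to {endpoint['request_url']}"


def describe_entra(endpoint: dict) -> str:
    # Stand-in for Entra-specific token handling.
    return f"entra token request to {endpoint['request_url']}"


HANDLERS: Dict[str, Callable[[dict], str]] = {
    "okta": describe_okta,
    "entra": describe_entra,
}


def select_handler(endpoint: dict) -> Callable[[dict], str]:
    """Validate the shared keys, then pick the handler for the given type."""
    missing = [key for key in COMMON_KEYS if key not in endpoint]
    if missing:
        raise ValueError(f"token_endpoint is missing: {', '.join(missing)}")
    try:
        return HANDLERS[endpoint["type"]]
    except KeyError:
        raise ValueError(
            f"unsupported token_endpoint type: {endpoint['type']!r}"
        ) from None
```

Supporting a further IdP in this layout would mean registering one more entry in `HANDLERS`, which matches the stated aim of leaving room for future providers.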

@VersusFacit VersusFacit enabled auto-merge (squash) January 15, 2025 23:43
@VersusFacit VersusFacit merged commit de078b8 into main Jan 16, 2025
23 checks passed
@VersusFacit VersusFacit deleted the ADAP-898/add_token_iam_authentication branch January 16, 2025 00:42
Successfully merging this pull request may close these issues.

Support IAM Identity Center Authentication - browser and token based
5 participants