-
Notifications
You must be signed in to change notification settings - Fork 52
Option to disable getting secrets from env vars/global scope #182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 1 commit
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
63e35e9
introduce option to disable getting secrets from env vars/global scope
Tmonster 9677977
PR comments. extens KeyValueSecretReader to use a reader that can che…
Tmonster 041b8e8
remove dead code, fix typo
Tmonster d95c868
move keyvalue reader to s3fs.cpp
Tmonster 45dd657
Merge branch 'v1.4-andium' into no_credentials_from_env
Tmonster 8790732
fix merge conflict with v1.4-andium
Tmonster 24f0394
rename config option
Tmonster File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| # name: test/sql/test_read_public_bucket.test | ||
| # description: test aws extension with different chain configs | ||
| # group: [sql] | ||
|
|
||
| require parquet | ||
|
|
||
| require httpfs | ||
|
|
||
| require-env AWS_ACCESS_KEY_ID | ||
|
|
||
| require-env AWS_SECRET_ACCESS_KEY | ||
|
|
||
| # override the default behaviour of skipping HTTP errors and connection failures: this test fails on connection issues | ||
| set ignore_error_messages | ||
|
|
||
| statement ok | ||
| set s3_region='us-east-2'; | ||
|
|
||
| # set endpoint to the correct default, otherwise it will pick up the env variable | ||
| statement ok | ||
| set s3_endpoint='s3.amazonaws.com'; | ||
|
|
||
| # see duckdb-internal/issues/6620 | ||
| # env vars for access_key_id and secret_key_id are used | ||
| # which results in 403 | ||
| statement error | ||
| SELECT * FROM read_parquet('s3://coiled-datasets/timeseries/20-years/parquet/part.0.parquet') LIMIT 5; | ||
| ---- | ||
| <REGEX>:.*HTTP Error:.*403.*Authentication Failure.* | ||
|
|
||
| # default to not using globally scoped settings for secrets | ||
| statement ok | ||
| set auto_fetch_secret_info_from_env=false; | ||
|
|
||
| statement ok | ||
| SELECT * FROM read_parquet('s3://coiled-datasets/timeseries/20-years/parquet/part.0.parquet') LIMIT 5; |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these functions are a little bit hard to understand right now. I think we should make it a bit more clear what order of checking for settings is when using this.
Perhaps this can be rewritten to match the API used by the KeyValueSecretReader to read settings in a cascading way? I would propose to create a new class
CustomKeyValueReaderwhich wraps the base KeyValueReader and injects the env variables .There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR does not look up a separate
env_var_namefor secret variables. We can still add the option, but I believe that will make secret initialization more confusing, and unnecessary logic to the PR. This PR is to just check a setting if env vars (Or GLOBAL scope settings) should be used when creating secrets.The environment variables for
AWS_ACCESS_KEY_ID,AWS_SECRET_KEY_IDare set inAWSEnvironmentCredentialsProvider::SetExtensionOptionValueon extension load, and then they are placed in db_config.options.set_options. We cannot disable loading the AWS_ACCESS_KEY, or AWS_SECRET_KEY_ID variables, because it is only run once on extension load before theauto_fetch_secret_info_from_envsetting can be set/unset.Also, I wonder if wrapping the KeyValueSecretReader is necessary. Happy to do it for the bug fix PR because it touches less code. I think a cleaner idea for v1.6 would be to add a
max_scopeparameter that defaults toSettingScope::GLOBAL. For Secret initialization you can passmax_scope = SettingScope::LocalorSettingScope::SECRETdepending on how strict secret initialization should be, and for other settings you can leave it as the default. If more cases like this come along, it will be easier to modify if the global or local setting should be used.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh maybe i'm misunderstanding things here, but i'm confused now. The issue states that:
I feel like this does not align with the statement you make in this PR:
Are we sure this issue is solving the right correct problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-> environment variables like AWS_REGION, AWS_ACCESS_KEY_ID etc.
-> Environment variables like AWS_REGION, AWS_ACCESS_KEY_ID etc. These variablees are used when no secret has been created/matches scope for the requested url
I think the confusion comes from when the environment variables are loaded and when they get used. I can expand
On httpfs extension load, the environment variables
AWS_REGION,AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY_IDetc. are loaded into the httpfs optionss3_region,s3_access_key_id,s3_secret_access_key. See here for the whole list of loaded environment variables. Here is where they actually loaded and stored in the extension configuration optionsEnvironment variables are never checked or read after extension load. After extension load only the extension configuration options are used
After httpfs is loaded, when a user attempts to read a remote url and the S3KeyValueReader cannot find a secret that matches scope etc., the extension config options "s3_access_key_id", "s3_secret_access_key" etc. are used. If these settings are not set by the user after the httpfs extension loads, the environment variable values that were loaded into the extension options are used.
This is how DuckDB ends up "silently" using env variables for secrets, or automatically uses AWS environment credentials when no secret has explicitly been created.
The issue our client is experiencing suffers from this
-> here the invalid AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY_ID values are used when creating the auth params
This PR will allow a user to set the
auto_fetch_secret_info_from_envoption to false, so that when a secret options are initialized inReadFrom(S3KeyValueReader &secret_reader, const std::string &file_path), the string settings that are returned withGLOBAL_SCOPEare not used.GLOBAL_SCOPEis returned for default values, or env values. Default values for all secret config options are empty, and I don't see that ever changing, so I think this way is relatively safe. Otherwise we have to still check global scope, and also check the current env variable value