-
Notifications
You must be signed in to change notification settings - Fork 80
Implement databricks creds manager #2123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2123 +/- ##
==========================================
+ Coverage 65.14% 65.25% +0.11%
==========================================
Files 100 100
Lines 8503 8540 +37
Branches 885 886 +1
==========================================
+ Hits 5539 5573 +34
- Misses 2774 2777 +3
Partials 190 190 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
✅ 51/51 passed, 11 flaky, 3m41s total Flaky tests:
Running from acceptance #3131 |
# Conflicts: # src/databricks/labs/lakebridge/reconcile/connectors/jdbc_reader.py # src/databricks/labs/lakebridge/reconcile/connectors/oracle.py
…e plus major cleanup
… in one test for it to be green
asnare
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've highlighted some style and design issues that I think need to be resolved, but appreciate that this is the start of what is needed for #1008. On the testing side I really like that we've eliminated some monkey-patching during tests. (Some integration tests would be nice.)
One big concern I have is that I don't see where we're using the new provider because we don't pass the WorkspaceClient in anywhere that I can see. Can you elaborate a bit on the situation there?
src/databricks/labs/lakebridge/connections/credential_manager.py
Outdated
Show resolved
Hide resolved
src/databricks/labs/lakebridge/connections/credential_manager.py
Outdated
Show resolved
Hide resolved
src/databricks/labs/lakebridge/connections/credential_manager.py
Outdated
Show resolved
Hide resolved
src/databricks/labs/lakebridge/connections/credential_manager.py
Outdated
Show resolved
Hide resolved
src/databricks/labs/lakebridge/connections/credential_manager.py
Outdated
Show resolved
Hide resolved
| @dataclass | ||
| class ReconcileCredentialConfig: | ||
| vault_type: str # supports local, env, databricks creds. | ||
| source_creds: dict[str, str] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need a better name for this: it doesn't hold the credentials… it's more of a vault configuration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree but renaming this will require changing across three PRs. I would rename it after reviewing all PRs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid that's not a compelling argument, even though I appreciate the frustration. The situation is:
- Once it's merged to
main, we need to treat it as live: it could be released in that state without the other PRs. We can't just say "mainis not releasable because these other PRs need to be merged": that would hamstring us in terms of when we release, particularly if something urgent pops up. - Even if we don't release, end users can install
mainin advance of a release. This is something we often do when asking them to test a bug-fix or check if something has been fixed in advance of a release being made. - Once it's 'live' we have to either live with it or those other PRs will then need to implement migration. That's even more work.
src/databricks/labs/lakebridge/connections/credential_manager.py
Outdated
Show resolved
Hide resolved
| spark=spark, | ||
| ws=ws_client, | ||
| secret_scope=reconcile_config.secret_scope, | ||
| secret_scope=reconcile_config.creds.source_creds["__secret_scope"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this now be using the CredentialManager mechanism?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/databricks/labs/lakebridge/connections/credential_manager.py
Outdated
Show resolved
Hide resolved
| except NotFound as e: | ||
| raise KeyError(f'Secret does not exist with scope: {scope} and key: {key_only} : {e}') from e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is different to the other providers: they just return the key if the secret cannot be found, whereas here we raise an exception instead.
What do you think the providers should do? I think they need to be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should not raise an error. the return type should be optional and it is up to the caller how to handle missing secrets.
I did not want to change lots of things in one go so the
the implementation you see here of DatabricksSecretProvider is copied from src/databricks/labs/lakebridge/reconcile/connectors/secrets.py without changing the way it works which led to some inconsistency.
I would address your comment here in a later PR if you dont mind
Co-authored-by: Andrew Snare <[email protected]>
Changes
What does this PR do?
Linked issues
Progresses #1008
Functionality
databricks labs lakebridge ...Tests