[GCP] Activate service account for storage and controller #4529
Conversation
/quicktest-core
Awesome @Michaelvll!
sky/cloud_stores.py
Outdated
@@ -114,7 +114,11 @@ class GcsCloudStorage(CloudStorage):
     def _gsutil_command(self):
         gsutil_alias, alias_gen = data_utils.get_gsutil_command()
         return (f'{alias_gen}; GOOGLE_APPLICATION_CREDENTIALS='
-                f'{gcp.DEFAULT_GCP_APPLICATION_CREDENTIAL_PATH} {gsutil_alias}')
+                f'{gcp.DEFAULT_GCP_APPLICATION_CREDENTIAL_PATH}; '
+                'gcloud auth activate-service-account '
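For readers skimming the diff, here is a hedged illustration of the shell snippet this helper builds before and after the change. The stand-in constants and the --key-file argument are assumptions: the real values come from data_utils.get_gsutil_command() and gcp.DEFAULT_GCP_APPLICATION_CREDENTIAL_PATH, and the rest of the added hunk is not shown above.

```python
# Stand-ins (assumptions) for data_utils.get_gsutil_command() and
# gcp.DEFAULT_GCP_APPLICATION_CREDENTIAL_PATH.
ALIAS_GEN = 'alias skypilot_gsutil="gsutil -m"'
GSUTIL_ALIAS = 'skypilot_gsutil'
CREDENTIAL_PATH = '~/.config/gcloud/application_default_credentials.json'

# Before: only set the key path and invoke gsutil, which does not pick up a
# service-account key on its own (#4528).
before = (f'{ALIAS_GEN}; GOOGLE_APPLICATION_CREDENTIALS='
          f'{CREDENTIAL_PATH} {GSUTIL_ALIAS}')

# After: explicitly activate the service account before invoking gsutil.
# The --key-file argument is an assumption about the elided part of the hunk.
after = (f'{ALIAS_GEN}; GOOGLE_APPLICATION_CREDENTIALS='
         f'{CREDENTIAL_PATH}; '
         'gcloud auth activate-service-account '
         f'--key-file={CREDENTIAL_PATH}; '
         f'{GSUTIL_ALIAS}')

print(before)
print(after)
```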
Does something similar need to be done for our VM provisioning pipeline on the jobs/serve controller?
For example, if the user is using a service account for auth, runs a controller on k8s, and wants to launch a managed job on GCP, will they run into the same issue (see #4512)?
Good point! I changed the PR to fix the issue.
Awesome, thanks for the quick turnaround @Michaelvll!
Actually, after testing, the controller could correctly launch clusters on GCP even without activating the service account. It seems the only GCP-related commands that require activating the service account are the storage ones, so we don't have to activate the service account for the controller. (Note: the kubernetes cluster is started with sky local up, so there should be no default GCP credential.) I reverted the changes for the controller. PTAL @romilbhardwaj
Thanks for verifying @Michaelvll!
Fixes #4528 and #4512.

When a service account is used locally, the service account key is uploaded, but it cannot be used by the code for cloud storage. This is because gcloud auth activate-service-account is required for a service account key, while it is not needed for normal user application credentials. We now activate the service account regardless, to make sure it works.
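For background, the two credential kinds can be told apart by the "type" field in the credential JSON file. Below is a minimal, hedged sketch (not the PR's code; the default path is an assumption) of checking that field and activating only a service-account key:

```python
import json
import os
import subprocess

# Illustrative default path (an assumption); the PR's remote-side code uses
# gcp.DEFAULT_GCP_APPLICATION_CREDENTIAL_PATH instead.
cred_path = os.path.expanduser(
    os.environ.get('GOOGLE_APPLICATION_CREDENTIALS',
                   '~/.config/gcloud/application_default_credentials.json'))

with open(cred_path, 'r', encoding='utf-8') as f:
    cred_type = json.load(f).get('type')

if cred_type == 'service_account':
    # A service-account key is only visible to gsutil after activation.
    subprocess.run(
        ['gcloud', 'auth', 'activate-service-account',
         f'--key-file={cred_path}'],
        check=True)
# A normal user credential ('authorized_user') needs no activation.
```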
To reproduce #4528:

1. Set a ~/.sky/config.yaml.
2. gcloud auth activate-service-account --key-file=$GOOGLE_APPLICATION_CREDENTIAL
3. sky jobs launch test.yaml

gsutil fails to find the credential.

To reproduce #4512:

1. Set a ~/.sky/config.yaml.
2. Have the service account activated for GCP locally: gcloud auth activate-service-account --key-file=$GOOGLE_APPLICATION_CREDENTIAL
3. sky jobs launch -n test --cloud gcp --cpus 2 echo hi
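As a side note, a small hedged script (not part of the PR; the key path is illustrative) that exercises the failure mode from the repro steps above, running gsutil with and without activating the key:

```python
import os
import subprocess

# Illustrative key path (an assumption); point it at your service-account key.
KEY = os.path.expanduser('~/gcp-service-account.json')


def try_gsutil(activate: bool) -> int:
    """Runs `gsutil ls` with the key exported, optionally after activating it."""
    activate_cmd = (f'gcloud auth activate-service-account --key-file={KEY}; '
                    if activate else '')
    cmd = (f'export GOOGLE_APPLICATION_CREDENTIALS={KEY}; '
           f'{activate_cmd}gsutil ls')
    return subprocess.run(cmd, shell=True, executable='/bin/bash').returncode


# With a service-account key only, the first call fails to find the
# credential (#4528); the second succeeds after activation.
print('without activation:', try_gsutil(False))
print('with activation:   ', try_gsutil(True))
```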
Actually, after testing, the controller could correctly launch clusters on GCP even without activating the service account. It seems the only GCP-related commands that require activating the service account are the storage ones, so we don't have to activate the service account for the controller. (Note: the kubernetes cluster is started with sky local up, so there should be no default GCP credential.)

Tested (run the relevant ones):

- bash format.sh
- pytest tests/test_smoke.py
- pytest tests/test_smoke.py --managed-jobs --gcp with a service account activated and the controller on k8s
- pytest tests/test_smoke.py::test_fill_in_the_name
- conda deactivate; bash -i tests/backward_compatibility_tests.sh