We run a Vault cluster in each of our datacenters around the world. All of these Vault clusters are configured via Terraform using parallel GitLab jobs. Whenever we add a new roleset we always have at least one job that fails with:
URL: PUT https://vault.datacenterX.sub.tld/v1/gcp/roleset/testingsecrets
Code: 400. Errors:
* unable to create new service account under project 'projects/our-project': googleapi: Error 409: Service account vaulttestingsecrets-1583518825 already exists within project projects/our-project., alreadyExists
If this happened once in 100 runs, or even once in 50, we would probably not be concerned and would just retry the GitLab job, but we run into it every time a new roleset is added.
According to the docs, the service account name has the form:
vault<roleset-prefix>-<creation-unix-timestamp>
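It makes sense that the Unix timestamp is included, since GCP doesn't maintain a creation timestamp on service accounts, and it makes sense that the vault prefix is included. That leaves 14 characters for the roleset-prefix, since a service account name is capped at 30 characters: 5 for vault, 1 for the hyphen, and 10 for the timestamp. Presumably the conflicts arise because the timestamp only has one-second resolution, so parallel jobs that create the same roleset within the same second generate identical names in the same project.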
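To make the length budget and the collision concrete, here is a minimal sketch of the documented naming scheme. It is not the plugin's actual code; the function name and constants are hypothetical, and it only assumes the format and the 30-character limit quoted above.

```python
# Hypothetical reconstruction of the documented format:
#   vault<roleset-prefix>-<creation-unix-timestamp>
SA_NAME_MAX = 30          # service account name limit mentioned above
FIXED_PREFIX = "vault"


def service_account_name(roleset: str, created_unix: int) -> str:
    """Build a name in the documented format, truncating the roleset to fit."""
    timestamp = str(created_unix)
    # 30 - len("vault") - len("-") - 10 timestamp digits = 14
    budget = SA_NAME_MAX - len(FIXED_PREFIX) - 1 - len(timestamp)
    return f"{FIXED_PREFIX}{roleset[:budget]}-{timestamp}"


# Two parallel jobs creating the same roleset within the same second
# produce the same name, which is what triggers the 409.
job_a = service_account_name("testingsecrets", 1583518825)
job_b = service_account_name("testingsecrets", 1583518825)
print(job_a)           # vaulttestingsecrets-1583518825
print(job_a == job_b)  # True
```

With this scheme, the only way two jobs avoid a conflict is by landing in different seconds, which parallel jobs rarely do.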
Could the plugin support a naming convention that avoids these 409 conflicts? Obviously, 14 characters for the roleset-prefix is pretty limiting, so cutting into that with a timestamp that includes milliseconds isn't ideal. The same is true for including a random number. But what about HMAC'ing the roleset name and using the first 14 characters of the result? The drawback is that an HMAC is one-way, so going from a service account found in GCP back to a roleset in Vault isn't really possible, and going from a roleset in Vault to a GCP service account requires querying the HMAC endpoint before searching GCP. A possible workaround might be to include the roleset-prefix in the response, so that audit logs record the HMAC'd value; since audit logs are usually (hopefully) retained for a long time, they could be consulted as needed.
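A rough sketch of the HMAC idea, under one reading of the proposal: each Vault cluster holds its own HMAC key, so the same roleset name maps to a different 14-character prefix per cluster. The key values, function names, and the choice of SHA-256 with a hex-truncated digest are all assumptions made for illustration, not anything the plugin provides today.

```python
import hashlib
import hmac

PREFIX_LEN = 14  # characters available for the roleset-prefix


def hmac_prefix(cluster_key: bytes, roleset: str) -> str:
    """Derive a 14-character hex prefix from the roleset name (hypothetical)."""
    digest = hmac.new(cluster_key, roleset.encode("utf-8"), hashlib.sha256)
    # Hex output is lowercase alphanumeric and always follows the fixed
    # "vault" prefix, so the full name still starts with a letter.
    return digest.hexdigest()[:PREFIX_LEN]


def hmac_service_account_name(cluster_key: bytes, roleset: str, created_unix: int) -> str:
    """Same documented layout, with the HMAC'd prefix in place of the roleset."""
    return f"vault{hmac_prefix(cluster_key, roleset)}-{created_unix}"


# Two clusters with different (made-up) keys, same roleset, same second.
name_dc1 = hmac_service_account_name(b"key-for-datacenter-1", "testingsecrets", 1583518825)
name_dc2 = hmac_service_account_name(b"key-for-datacenter-2", "testingsecrets", 1583518825)
print(name_dc1)
print(name_dc2)
```

The names stay within 30 characters and differ between clusters, at the cost described above: the roleset can no longer be read off the service account name, so the mapping has to be recovered from Vault or from audit logs.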