Zachary Seguin edited this page Jul 5, 2021 · 4 revisions

This document outlines how the Data Analytics as a Service (DAaaS) project is using Vault to manage secrets.

Custom Vault image

Our setup uses a custom Vault image that adds a plugin for generating MinIO credentials dynamically. The Dockerfile that builds this image is:

FROM golang:1.14.2-stretch AS build

# Set workdir
WORKDIR /work

# Add dependencies
COPY go.mod .
COPY go.sum .
RUN go mod download

# Build
COPY . .
RUN CGO_ENABLED=0 go build ./cmd/vault-plugin-secrets-minio

FROM vault:1.4.0
COPY --from=build /work/vault-plugin-secrets-minio /plugins/vault-plugin-secrets-minio

Vault auto-unseal

Vault is set up to auto-unseal using an Azure Managed Identity, which AAD Pod Identity assigns to the pod, and a key stored in an Azure Key Vault.

The sequence of events is:

  1. At pod creation, the pod label aadpodidbinding=vault tells AAD Pod Identity to assign the Managed Identity to the pod
  2. Vault starts and requests a Managed Identity token. AAD Pod Identity intercepts this request, performs it against Azure, and returns the token
  3. Vault uses the returned token to connect to the Azure Key Vault, which holds the decryption key for the information stored in the Storage Account
  4. With the Azure token and decryption key, Vault connects to the Storage Account and reads the information stored in a Blob container. That information is encrypted, and Vault decrypts it with the key from Key Vault.
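
Step 2 can be illustrated with a minimal sketch. This is not Vault's own code: it only builds (and does not send) the kind of HTTP request a process makes to the Azure Instance Metadata Service (IMDS) to obtain a Managed Identity token for Key Vault, which is the request AAD Pod Identity intercepts. The function name build_token_request is hypothetical; the endpoint, api-version, and Metadata header are the standard IMDS values.

```python
import urllib.parse
import urllib.request

# Standard Azure IMDS token endpoint (link-local address).
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_request(resource: str = "https://vault.azure.net") -> urllib.request.Request:
    """Build, without sending, a Managed Identity token request.

    `resource` is the audience the token is requested for; the default
    is the Azure Key Vault resource URI.
    """
    query = urllib.parse.urlencode({"api-version": "2018-02-01", "resource": resource})
    # IMDS rejects requests that lack the "Metadata: true" header.
    return urllib.request.Request(f"{IMDS_TOKEN_URL}?{query}", headers={"Metadata": "true"})
```

Calling build_token_request().full_url shows the exact URL that would be requested from inside the pod.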

Manual unseal

If Vault has re-sealed itself, you can unseal it manually:

  1. Open a shell in the Vault pod: kubectl -n vault exec -it vault-0 -- sh
  2. Run vault operator unseal three times, pasting a different unseal key each time
  3. Run vault status to confirm that Vault is no longer sealed
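
To make the final check less error-prone, the output of vault status -format=json can be summarized with a small helper. The sketch below assumes that JSON output carries the usual sealed, t (threshold), and progress fields; the function name describe_seal_status is hypothetical.

```python
import json


def describe_seal_status(status_json: str) -> str:
    """Summarize the JSON printed by `vault status -format=json`."""
    status = json.loads(status_json)
    if not status["sealed"]:
        return "unsealed"
    # "progress" counts the keys entered so far; "t" is the required threshold.
    remaining = status["t"] - status["progress"]
    return f"sealed: {remaining} more unseal key(s) required"
```

For example, feeding it the status captured after entering one of three required keys reports that two more keys are needed.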

Integration with Boathouse

Vault integrates with Boathouse to automatically mount MinIO-backed storage into Notebook containers. See the diagram at https://github.com/StatCan/boathouse/blob/master/docs/diagram.pdf

  1. A Notebook is launched in a user's namespace

  2. The goofys-injector webhook is called and adds the volume spec and volume mount for boathouse:

    volumes:
      - name: $NAME
        flexVolume:
          driver: statcan.gc.ca/boathouse
          options:
            bucket: shared
            endpoint: https://MINIO
            gid: "100"
            region: us-east-1
            uid: "1000"
            vault-path: $INSTANCE/keys/profile-$PROFILE
            vault-ttl: 24h
    volumeMounts:
      - mountPath: /home/jovyan/minio/$INSTANCE/$SHARED_OR_PRIVATE
        name: $NAME
  3. The kubelet invokes the Boathouse flexvolume driver (binary at /etc/kubernetes/volumeplugins/statcan.gc.ca~boathouse/boathouse, copied there by the boathouse daemonset) to mount the volume

  4. The flexvolume driver calls the boathouse daemonset pod on that node and asks for credentials from Vault

  5. The boathouse daemonset pod contacts Vault, requests credentials, and returns them to the flexvolume driver

  6. The flexvolume driver starts goofys with the credentials

  7. The flexvolume driver refreshes the goofys credentials once per day, following the same process as step 5
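
The placeholders in the volume spec from step 2 can be made concrete with a short sketch. This is only an illustration of how $NAME, $INSTANCE, $PROFILE, and $SHARED_OR_PRIVATE fit together, not the goofys-injector's actual code; the function name boathouse_volume and its parameters are hypothetical, and the https://MINIO endpoint is left as the placeholder it is in the spec.

```python
def boathouse_volume(name: str, instance: str, profile: str, bucket: str, scope: str):
    """Return (volume, volume_mount) dicts mirroring the spec in step 2.

    `scope` stands in for $SHARED_OR_PRIVATE; all arguments are
    illustrative values supplied by the caller.
    """
    volume = {
        "name": name,
        "flexVolume": {
            "driver": "statcan.gc.ca/boathouse",
            "options": {
                "bucket": bucket,
                "endpoint": "https://MINIO",  # placeholder, as in the spec
                "gid": "100",
                "region": "us-east-1",
                "uid": "1000",
                # Vault path the daemonset reads credentials from.
                "vault-path": f"{instance}/keys/profile-{profile}",
                # Matches the once-per-day credential refresh in step 7.
                "vault-ttl": "24h",
            },
        },
    }
    mount = {"name": name, "mountPath": f"/home/jovyan/minio/{instance}/{scope}"}
    return volume, mount
```

The volume and its mount share the same name, which is how Kubernetes ties the mount path in the Notebook container to the flexvolume.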

Common issues

Vault pod startup error

Error parsing Seal configuration: error fetching Azure Key Vault wrapper key information: azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://$VAULT.vault.azure.net/keys/vault/?api-version=7.0: StatusCode=403 -- Original Error: adal: Refresh request failed. Status Code = '403'. Response body: failed to refresh token, error: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"}

It is normal to see this error once or twice on startup, while the identity is still being assigned to the underlying VMSS.

If the error persists, there is a problem in the AAD Pod Identity configuration.

  1. Check that the clientID and resourceID in the output of kubectl -n vault get azureidentities.aadpodidentity.k8s.io -o yaml are correct
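
One rough way to tell a transient startup error from a persistent one is to count how often it appears in the pod's logs. The sketch below assumes the log text has already been collected (for example with kubectl -n vault logs); the function name and the limit of two occurrences are assumptions that simply encode the "once or twice" guidance.

```python
IDENTITY_ERROR = "Identity not found"
TRANSIENT_LIMIT = 2  # assumption: "once or twice" on startup is normal


def identity_error_persists(log_text: str, limit: int = TRANSIENT_LIMIT) -> bool:
    """Return True when the error shows up on more log lines than `limit`."""
    occurrences = sum(1 for line in log_text.splitlines() if IDENTITY_ERROR in line)
    return occurrences > limit
```

A True result suggests moving on to the AAD Pod Identity configuration check rather than waiting longer.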

New node: boathouse errors

It is normal to see errors related to boathouse for up to 5 minutes after a node has been added to the cluster. This is because the DaemonSet pod that sets up boathouse is being deployed to the node.
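
When triaging such errors, it can help to check whether the node is still inside that 5 minute window. The sketch below assumes the node's metadata.creationTimestamp is available in the usual Kubernetes format (for example 2021-07-05T12:00:00Z); the function name within_boathouse_grace is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

BOATHOUSE_GRACE = timedelta(minutes=5)  # window stated above


def within_boathouse_grace(creation_timestamp: str, now: Optional[datetime] = None) -> bool:
    """Return True if the node was created no more than 5 minutes before `now`."""
    created = datetime.strptime(creation_timestamp, "%Y-%m-%dT%H:%M:%SZ")
    created = created.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return now - created <= BOATHOUSE_GRACE
```

If this returns False and boathouse errors continue, the node is past the expected startup window and the DaemonSet pod on that node is worth inspecting.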