GCS storage - failing to fetch credentials in GKE using workload identity #4396

@nmdanny

Describe the bug

Dragonfly with GCS storage fails to initialize when running in a GKE (Kubernetes) pod configured to authenticate via Workload Identity.

I20250102 14:47:52.847270    14 gcs.cc:46] Could not find ~/.config/gcloud
E20250102 14:47:52.847338     1 server_family.cc:895] Failed to initialize GCS snapshot storage: No such file or directory

Looking at the relevant code:
https://github.com/romange/helio/blob/493804db4110cf1631f787dd14484efc57f9575d/util/cloud/gcp/gcs.cc#L203C1-L227C1

The folder ~/.config/gcloud (and therefore the file ~/.config/gcloud/gce) doesn't exist when running on GKE (and, I assume, in other containerized environments like Cloud Run).

IMO, a solution here is to assume is_cloud_env = True if ~/.config/gcloud doesn't exist.

Furthermore, mounting a file containing True at /home/dfly/.config/gcloud/gce still yields the same error (even after creating the entire /home/dfly folder and chowning it to dfly).
Adding --reset-env to the exec setpriv command in entrypoint.sh fixes the problem.
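
To make the proposed fallback concrete, here is a minimal Python sketch of the decision I have in mind. It is not helio's code: the function name `is_cloud_env` is borrowed from the flag above, the "True"/"False" content of the gce file is inferred from what I tried to mount, and resolving the directory from HOME is an assumption (though it would be consistent with --reset-env changing the outcome).

```python
import os
from pathlib import Path


def is_cloud_env(home: str) -> bool:
    """Decide whether credentials should come from the metadata server.

    Hypothetical helper illustrating the proposal; not helio's actual logic.
    """
    gcloud_dir = Path(home) / ".config" / "gcloud"
    if not gcloud_dir.is_dir():
        # Proposed fallback: no gcloud config at all (the GKE pod case),
        # so assume a cloud environment instead of failing with ENOENT.
        return True
    gce_marker = gcloud_dir / "gce"
    if gce_marker.is_file():
        # Assumption: the marker file holds the string "True" or "False".
        return gce_marker.read_text().strip() == "True"
    # gcloud config exists but has no marker: keep using local credentials.
    return False


if __name__ == "__main__":
    # Assumption: "~" is resolved from the HOME environment variable.
    print(is_cloud_env(os.environ.get("HOME", "/")))
```

With this logic a pod without ~/.config/gcloud would go straight to the metadata endpoint, while a developer machine with a gcloud config keeps its current behavior.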

To Reproduce

(These are GKE instructions; I assume reproducing on Cloud Run or on plain VMs might be simpler.)

  1. Create a GCP service account, and K8S service account (or use the default one in a namespace)
  2. Grant that GCP SA 'Storage Admin' permissions on a bucket
  3. Link the GCP SA to K8S via the following guide
  4. Create the following K8S manifest (adjust gs://my-dragonfly-bucket and my-service-account accordingly):
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dragonflydb
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: dragonflydb
  template:
    metadata:
      labels:
        app: dragonflydb
    spec:
      terminationGracePeriodSeconds: 5
      serviceAccountName: my-service-account
      containers:
      # Container name is required by Kubernetes; the value here is assumed.
      - name: dragonflydb
        image: docker.dragonflydb.io/dragonflydb/dragonfly:v1.26.0
        args:
          - "--dir"
          - "gs://my-dragonfly-bucket"
          - "--v=1"
          - "--logtostderr"
        imagePullPolicy: Always
        ports:
        - containerPort: 6379

Expected behavior

Dragonfly should fetch the credentials via the metadata endpoint even when ~/.config/gcloud does not exist.
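
For reference, this is roughly the request I would expect to succeed from inside the pod. It is a minimal Python sketch against the documented GCE/GKE metadata server token endpoint, not a description of what Dragonfly does internally. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import Tuple

# Metadata server token endpoint for the pod's default service account.
TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_token_request() -> urllib.request.Request:
    # The metadata server rejects requests that lack this header.
    return urllib.request.Request(TOKEN_URL, headers={"Metadata-Flavor": "Google"})


def parse_token_response(body: bytes) -> Tuple[str, int]:
    """Return (access_token, expires_in seconds) from the JSON response."""
    payload = json.loads(body)
    return payload["access_token"], int(payload["expires_in"])


def fetch_access_token(timeout: float = 2.0) -> Tuple[str, int]:
    # Only works where the metadata server is reachable, e.g. inside the pod.
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        return parse_token_response(resp.read())
```

Calling `fetch_access_token()` from a shell in the pod is a quick way to confirm that Workload Identity itself is set up correctly, independent of the ~/.config/gcloud check.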

Environment (please complete the following information):

  • OS: Same as the image (Ubuntu 22.04)
  • Kernel: 6.1.100+
  • Containerized?: Kubernetes (GKE)
  • Dragonfly Version: v1.26.0

Metadata

Labels: bug (Something isn't working)
