
Bug: store-gateway loads all its blocks before being ready with lazy loading option enabled #10649

Open
agardiman opened this issue Feb 14, 2025 · 4 comments
Labels
bug Something isn't working

Comments

@agardiman

What is the bug?

When lazy loading is enabled, the store-gateway should become ready very quickly after it starts, because blocks should be loaded on demand.
Instead, store-gateways can take up to 2.5 hours to become ready, slowing down deployments.
With older versions of Mimir we had been seeing this sporadically, on one or two instances, during normal restarts.
Now it has also happened with the new version of Mimir, 2.15, and this time not just on one instance but on all of them. The difference is that we had deleted all the store-gateway PVs in one zone; we expected that zone to become ready very quickly, but every instance in the zone took hours to load all of its blocks from S3.

We are running in Kubernetes; the following is the store-gateway configuration:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  annotations:
    rollout-max-unavailable: "50"
  labels:
    rollout-group: store-gateway
    zone: a
  name: store-gateway-zone-a
  namespace: cortex
spec:
  podManagementPolicy: Parallel
  replicas: 55
  selector:
    matchLabels:
      name: store-gateway-zone-a
      rollout-group: store-gateway
  serviceName: store-gateway-zone-a
  template:
    metadata:
      labels:
        gossip_ring_member: "true"
        name: store-gateway-zone-a
        rollout-group: store-gateway
        zone: a
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: cortex.pharos.inday.io/zone
                operator: In
                values:
                - mimir-a
              - key: kubernetes.io/arch
                operator: In
                values:
                - arm64
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                name: store-gateway-zone-a
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -auth.multitenancy-enabled=true
        - -blocks-storage.bucket-store.chunks-cache.backend=memcached
        - -blocks-storage.bucket-store.chunks-cache.memcached.addresses=dnssrvnoa+memcached.cortex.svc.cluster.local.:11211
        - -blocks-storage.bucket-store.chunks-cache.memcached.max-async-concurrency=50
        - -blocks-storage.bucket-store.chunks-cache.memcached.max-get-multi-batch-size=500
        - -blocks-storage.bucket-store.chunks-cache.memcached.max-get-multi-concurrency=100
        - -blocks-storage.bucket-store.chunks-cache.memcached.max-idle-connections=50
        - -blocks-storage.bucket-store.chunks-cache.memcached.max-item-size=1048576
        - -blocks-storage.bucket-store.chunks-cache.memcached.min-idle-connections-headroom-percentage=50
        - -blocks-storage.bucket-store.chunks-cache.memcached.timeout=4s
        - -blocks-storage.bucket-store.index-cache.backend=memcached
        - -blocks-storage.bucket-store.index-cache.memcached.addresses=dnssrvnoa+memcached-index-queries.cortex.svc.cluster.local.:11211
        - -blocks-storage.bucket-store.index-cache.memcached.max-async-concurrency=50
        - -blocks-storage.bucket-store.index-cache.memcached.max-get-multi-batch-size=500
        - -blocks-storage.bucket-store.index-cache.memcached.max-get-multi-concurrency=100
        - -blocks-storage.bucket-store.index-cache.memcached.max-idle-connections=50
        - -blocks-storage.bucket-store.index-cache.memcached.max-item-size=5242880
        - -blocks-storage.bucket-store.index-cache.memcached.min-idle-connections-headroom-percentage=50
        - -blocks-storage.bucket-store.index-cache.memcached.timeout=4s
        - -blocks-storage.bucket-store.index-header.lazy-loading-concurrency=0
        - -blocks-storage.bucket-store.metadata-cache.backend=memcached
        - -blocks-storage.bucket-store.metadata-cache.memcached.addresses=dnssrvnoa+memcached-metadata.cortex.svc.cluster.local.:11211
        - -blocks-storage.bucket-store.metadata-cache.memcached.max-async-concurrency=50
        - -blocks-storage.bucket-store.metadata-cache.memcached.max-get-multi-concurrency=100
        - -blocks-storage.bucket-store.metadata-cache.memcached.max-idle-connections=50
        - -blocks-storage.bucket-store.metadata-cache.memcached.max-item-size=1048576
        - -blocks-storage.bucket-store.metadata-cache.memcached.min-idle-connections-headroom-percentage=50
        - -blocks-storage.bucket-store.metadata-cache.memcached.timeout=4s
        - -blocks-storage.bucket-store.sync-dir=/data/tsdb
        - -blocks-storage.bucket-store.sync-interval=15m
        - -blocks-storage.s3.bucket-name=<REDACTED>
        - -blocks-storage.tsdb.block-postings-for-matchers-cache-max-bytes=209715200
        - -blocks-storage.tsdb.block-postings-for-matchers-cache-ttl=20s
        - -blocks-storage.tsdb.series-hash-cache-max-size-bytes=1073741824
        - -common.storage.backend=s3
        - -common.storage.s3.endpoint=s3.us-west-2.amazonaws.com
        - -memberlist.bind-port=7946
        - -memberlist.join=dns+gossip-ring.cortex.svc.cluster.local.:7946
        - -runtime-config.file=/etc/mimir/overrides.yaml
        - -server.grpc.keepalive.min-time-between-pings=10s
        - -server.grpc.keepalive.ping-without-stream-allowed=true
        - -server.http-listen-port=80
        - -server.http-read-timeout=5m
        - -server.http-write-timeout=5m
        - -store-gateway.sharding-ring.heartbeat-period=1m
        - -store-gateway.sharding-ring.heartbeat-timeout=4m
        - -store-gateway.sharding-ring.instance-availability-zone=zone-a
        - -store-gateway.sharding-ring.prefix=multi-zone/
        - -store-gateway.sharding-ring.replication-factor=3
        - -store-gateway.sharding-ring.store=memberlist
        - -store-gateway.sharding-ring.tokens-file-path=/data/tokens
        - -store-gateway.sharding-ring.unregister-on-shutdown=false
        - -store-gateway.sharding-ring.wait-stability-min-duration=1m
        - -store-gateway.sharding-ring.zone-awareness-enabled=true
        - -target=store-gateway
        - -tenant-federation.enabled=true
        - -usage-stats.enabled=false
        - -usage-stats.installation-mode=jsonnet
        env:
        - name: GOMAXPROCS
          value: "7"
        - name: GOMEMLIMIT
          valueFrom:
            resourceFieldRef:
              resource: requests.memory
        - name: JAEGER_REPORTER_MAX_QUEUE_SIZE
          value: "1000"
        image: <REDACTED>
        imagePullPolicy: IfNotPresent
        name: store-gateway
        ports:
        - containerPort: 80
          name: http-metrics
        - containerPort: 9095
          name: grpc
        - containerPort: 7946
          name: gossip-ring
        readinessProbe:
          httpGet:
            path: /ready
            port: 80
          initialDelaySeconds: 15
          timeoutSeconds: 5
        resources:
          limits:
            memory: 50Gi
          requests:
            cpu: "3"
            memory: 30Gi
        volumeMounts:
        - mountPath: /data
          name: store-gateway-data
        - mountPath: /etc/mimir
          name: overrides
      securityContext:
        runAsUser: 0
      serviceAccountName: cortex
      terminationGracePeriodSeconds: 120
      tolerations:
      - effect: NoSchedule
        key: arch
        operator: Equal
        value: arm64
      volumes:
      - configMap:
          name: overrides
        name: overrides
  updateStrategy:
    type: OnDelete
  volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: store-gateway-data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1.1Ti
      storageClassName: gp3
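
For what it's worth, lazy loading is not set explicitly in the arguments above, so we are relying on the default (enabled, as far as I know, in these Mimir versions). If we wanted to pin it explicitly, I believe the flag sits in the same index-header namespace as the lazy-loading-concurrency option we already pass, i.e. something like:

        - -blocks-storage.bucket-store.index-header.lazy-loading-enabled=true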

How to reproduce it?

Not sure; it doesn't happen in our dev clusters, but that may be because they don't have much data to load.

What did you think would happen?

The store-gateway should become ready almost immediately.

What was your environment?

It happened on Kubernetes 1.28 and earlier.
Mimir 2.14 and 2.15 (and, if I remember correctly, also 2.13).
Both on x86 and ARM.
Both when the persistent volume is kept as is across the restart and when it is deleted.

Any additional context to share?

No response

@agardiman added the bug (Something isn't working) label on Feb 14, 2025
@56quarters
Contributor

With lazy loading enabled, store-gateways don't load the TSDB index-header into memory until it's needed, but they still must download the index-header (a subset of the TSDB index) to local disk. That should only happen with an empty disk, such as when new store-gateways are started, not when existing ones are restarted.
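
One way to sanity-check this, assuming the usual on-disk layout where each block directory under the bucket-store sync dir (/data/tsdb in your config) holds an index-header file, is to compare what's on the volume before and after a restart, e.g.:

kubectl -n cortex exec store-gateway-zone-a-0 -- sh -c 'find /data/tsdb -name index-header | wc -l'

If those files are already present on a kept PV, a restart should not need to download them again.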

@agardiman
Author

Hi Nick, thank you for your reply. I see; so this line in the logs
ts=2025-02-14T11:25:04.953879005Z caller=bucket.go:452 level=info user=default msg="loaded new block" elapsed=19.921888892s id=REDACTED
is about the index-header being downloaded to disk, not about the block itself being loaded?

@56quarters
Contributor

There are a few different things going on here. "Loading a block" involves several pieces of work, and the index-header is one part of it. Lazy loading controls whether the index-header is downloaded and immediately loaded into memory, or just downloaded and only loaded into memory once a query involves that particular block.

So regardless of the lazy loading setting, starting a store-gateway is going to involve some work. That's what that log message is about: all the work required for a store-gateway to load a block.
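
If you want to watch that work while a pod is starting, a rough sketch, assuming the metric names I remember from the store-gateway dashboards (cortex_bucket_store_blocks_loaded and cortex_bucket_store_block_loads_total) are still current, is to scrape the metrics endpoint on the HTTP port you already expose (80):

kubectl -n cortex port-forward store-gateway-zone-a-0 8080:80
curl -s localhost:8080/metrics | grep -E 'cortex_bucket_store_blocks_loaded|cortex_bucket_store_block_loads_total'

Seeing the blocks-loaded gauge climb slowly toward the expected block count while /ready is still failing would confirm the readiness wait is dominated by this per-block work.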

@agardiman
Author

Thanks for the clarification. That explains the last event, when the PVs were deleted.
That leaves only the case where it happens occasionally during normal restarts on isolated pods. I'll see what I can gather when we encounter it again.
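
For the record, when it happens again on a normally restarted pod I plan to pull the per-block timings from the same "loaded new block" log line quoted above, roughly like:

kubectl -n cortex logs store-gateway-zone-a-0 | grep 'loaded new block' | tail -n 20

That should show whether a pod with an intact PV is still spending tens of seconds per block (the elapsed= field) or whether the time is going somewhere else.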
