
[bitnami/etcd] etcd pods are unable to join existing cluster on node drain #16069

Closed
abhayycs opened this issue Apr 14, 2023 · 42 comments · Fixed by bitnami/containers#75906 or #31161
Labels: etcd, solved, tech-issues

Comments

@abhayycs

abhayycs commented Apr 14, 2023

Name and Version

bitnami/etcd-3.5.8

What architecture are you using?

None

What steps will reproduce the bug?

I'm using a 3-node Kubernetes cluster with 3 etcd instances.

When I delete a pod, the pod is able to restart.
When I only drain a node, the pod is not able to re-join the cluster and fails to start.

Observations:
ETCD_INITIAL_CLUSTER_STATE is 'new' when the cluster starts from scratch (first time).
CASE-1: When deleting a pod, ETCD_INITIAL_CLUSTER_STATE changes from 'new' to 'existing', and the pod is able to start.
CASE-2: When draining a node, ETCD_INITIAL_CLUSTER_STATE stays 'new', and the newly created pod is unable to join the cluster and unable to restart.
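The two cases can be modeled with a tiny sketch (my own simplification, not the chart's actual logic): whether the member's data directory survived the rescheduling determines which bootstrap state the startup scripts pick.

```shell
# Simplified model of the two cases above (illustrative only; the real
# decision is made by the Bitnami image's startup scripts).
expected_state() {
    data_dir_present="$1"   # did the member's data dir follow the pod?
    if [ "$data_dir_present" = yes ]; then
        echo "existing"     # CASE-1: pod deleted, data intact -> rejoin cluster
    else
        echo "new"          # CASE-2: rescheduled to a fresh node -> bootstrap
    fi
}
```

In CASE-2 the "new" state conflicts with a cluster that still lists the old member, which is consistent with the join failure reported here.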

Are you using any custom parameters or values?

I tried with and without persistence.

What is the expected behavior?

The pod should start after a node drain. As I understand it, ETCD_INITIAL_CLUSTER_STATE should change to 'existing' on a node drain as well.

What do you see instead?

The etcd pod does not start after a node drain.

Additional information

Please let me know whether this behavior is expected, and how I can prevent the pod restart failure on node drain.

I'm not sure if this will help:

  1. I have tried this in different clusters (RHEL- and Ubuntu-based).
  2. Other services (Kafka, ZooKeeper) are working fine; the network is fine.
@abhayycs abhayycs added the tech-issues The user has a technical issue about an application label Apr 14, 2023
@github-actions github-actions bot added the triage Triage is needed label Apr 14, 2023
@carrodher carrodher added the etcd label Apr 18, 2023
@github-actions github-actions bot added in-progress and removed triage Triage is needed labels Apr 18, 2023
@abhayycs
Author

I've been experiencing the same problem on my end and would appreciate any updates or solutions.

@aoterolorenzo
Contributor

Hi @abhayycs ,

Could you provide the values you are using and a set of commands to reproduce the issue?

@abhayycs
Author

abhayycs commented Apr 19, 2023

values.yaml

etcd:
  fullnameOverride: 'voltha-etcd-cluster-client'
  global:
    storageClass: "manual"
  persistence:
    enabled: false
  auth:
    rbac:
      create: false
      enabled: false
  replicaCount: 3
  resources:
    limits:
      cpu: 1900m
      memory: 1800Mi
    requests:
      cpu: 950m
      memory: 1Gi

@abhayycs
Author

abhayycs commented Apr 19, 2023

To reproduce the issue, I tried the following:

tester@bln-k8s-161 ➜ etcd cat etcd_pv.yaml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-myetcd-0
spec:
  capacity:
    storage: 8Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/home/tester/data/etcd/data-myetcd-0"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-myetcd-1
spec:
  capacity:
    storage: 8Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/home/tester/data/etcd/data-myetcd-1"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-myetcd-2
spec:
  capacity:
    storage: 8Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/home/tester/data/etcd/data-myetcd-2"

helm command:

helm install myetcd bitnami/etcd --namespace etcd --create-namespace --set replicaCount=3 --set persistence.enabled=true --set volumePermissions.enabled=true

@aoterolorenzo
Contributor

aoterolorenzo commented Apr 24, 2023

It seems this could be related to the libetcd.sh logic:

else
        info "Detected data from previous deployments"
        if [[ $(stat -c "%a" "$ETCD_DATA_DIR") != *700 ]]; then
            debug "Setting data directory permissions to 700 in a recursive way (required in etcd >=3.4.10)"
            debug_execute chmod -R 700 "$ETCD_DATA_DIR" || true
        fi
        if [[ ${#initial_members[@]} -gt 1 ]]; then
            if is_boolean_yes "$ETCD_DISABLE_PRESTOP"; then
                info "The member will try to join the cluster by it's own"
                export ETCD_INITIAL_CLUSTER_STATE=existing
            fi
            member_id="$(get_member_id)"
            if ! is_healthy_etcd_cluster; then
                warn "Cluster not responding!"
                if is_boolean_yes "$ETCD_DISASTER_RECOVERY"; then
                    latest_snapshot_file="$(find /snapshots/ -maxdepth 1 -type f -name 'db-*' | sort | tail -n 1)"
                    if [[ "${latest_snapshot_file}" != "" ]]; then
                        info "Restoring etcd cluster from snapshot"
                        rm -rf "$ETCD_DATA_DIR"
                        ETCD_INITIAL_CLUSTER="$(recalculate_initial_cluster)"
                        export ETCD_INITIAL_CLUSTER
                        [[ -f "$ETCD_CONF_FILE" ]] && etcd_conf_write "initial-cluster" "$ETCD_INITIAL_CLUSTER"
                        debug_execute etcdctl snapshot restore "${latest_snapshot_file}" \
                            --name "$ETCD_NAME" \
                            --data-dir "$ETCD_DATA_DIR" \
                            --initial-cluster "$ETCD_INITIAL_CLUSTER" \
                            --initial-cluster-token "$ETCD_INITIAL_CLUSTER_TOKEN" \
                            --initial-advertise-peer-urls "$ETCD_INITIAL_ADVERTISE_PEER_URLS"
                        etcd_store_member_id
                    else
                        error "There was no snapshot to restore!"
                        exit 1
                    fi
                else
                    warn "Disaster recovery is disabled, the cluster will try to recover on it's own"
                fi
            elif was_etcd_member_removed; then
                info "Adding new member to existing cluster"
                read -r -a extra_flags <<<"$(etcdctl_auth_flags)"
                is_boolean_yes "$ETCD_ON_K8S" && extra_flags+=("--endpoints=$(etcdctl_get_endpoints)")
                extra_flags+=("--peer-urls=$ETCD_INITIAL_ADVERTISE_PEER_URLS")
                etcdctl member add "$ETCD_NAME" "${extra_flags[@]}" | grep "^ETCD_" >"$ETCD_NEW_MEMBERS_ENV_FILE"
                replace_in_file "$ETCD_NEW_MEMBERS_ENV_FILE" "^" "export "
                # The value of ETCD_INITIAL_CLUSTER_STATE must be changed for it to be correctly added to the existing cluster
                # https://etcd.io/docs/v3.3/op-guide/configuration/#--initial-cluster-state
                export ETCD_INITIAL_CLUSTER_STATE=existing
                etcd_store_member_id
            elif ! is_empty_value "$member_id"; then
                info "Updating member in existing cluster"
                export ETCD_INITIAL_CLUSTER_STATE=existing
                [[ -f "$ETCD_CONF_FILE" ]] && etcd_conf_write "initial-cluster-state" "$ETCD_INITIAL_CLUSTER_STATE"
                read -r -a extra_flags <<<"$(etcdctl_auth_flags)"
                extra_flags+=("--peer-urls=$ETCD_INITIAL_ADVERTISE_PEER_URLS")
                if is_boolean_yes "$ETCD_ON_K8S"; then
                    extra_flags+=("--endpoints=$(etcdctl_get_endpoints)")
                    etcdctl member update "$member_id" "${extra_flags[@]}"
                else
                    etcd_start_bg
                    etcdctl member update "$member_id" "${extra_flags[@]}"
                    etcd_stop
                fi
            else
                info "Member ID wasn't properly stored, the member will try to join the cluster by it's own"
                export ETCD_INITIAL_CLUSTER_STATE=existing
                [[ -f "$ETCD_CONF_FILE" ]] && etcd_conf_write "initial-cluster-state" "$ETCD_INITIAL_CLUSTER_STATE"
            fi
        fi
    fi
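For readability, the branch selection in that excerpt can be condensed into a decision function (my own sketch with stand-in arguments, not part of libetcd.sh; I'm assuming the state is left at its default of "new" when no branch exports it):

```shell
# Condensed sketch of the libetcd.sh branch selection quoted above.
# The real script derives these inputs from the data dir and etcdctl;
# here they are plain yes/no arguments for illustration.
decide_initial_cluster_state() {
    has_prev_data="$1"     # data from a previous deployment present?
    cluster_healthy="$2"   # stand-in for is_healthy_etcd_cluster
    member_removed="$3"    # stand-in for was_etcd_member_removed
    member_id="$4"         # stored member ID ("" if it was lost)

    if [ "$has_prev_data" != yes ]; then
        echo "new"         # no previous data: bootstrap path (outside the excerpt)
    elif [ "$cluster_healthy" != yes ]; then
        echo "new"         # "Cluster not responding!": state left at its default
    elif [ "$member_removed" = yes ]; then
        echo "existing"    # re-added via 'etcdctl member add'
    elif [ -n "$member_id" ]; then
        echo "existing"    # "Updating member in existing cluster"
    else
        echo "existing"    # member ID lost: member rejoins on its own
    fi
}
```

On a drain without surviving data, the script never reaches the "Detected data from previous deployments" branch at all, so the state stays "new" even though the running cluster still lists the old member.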

Could you show the log output in your specific scenario?

@abhayycs
Author

abhayycs commented Apr 25, 2023

The myetcd-1 pod is stuck in the CrashLoopBackOff state after draining node bln-k8s-162:

NAME       READY   STATUS             RESTARTS   AGE     IP               NODE          NOMINATED NODE   READINESS GATES
myetcd-0   1/1     Running            0          8m19s   10.233.89.218    bln-k8s-163   <none>           <none>
myetcd-1   0/1     CrashLoopBackOff   5          6m2s    10.233.89.191    bln-k8s-163   <none>           <none>
myetcd-2   1/1     Running            0          8m19s   10.233.115.154   bln-k8s-161   <none>           <none>

When a node is drained, the pod gets scheduled to another node and starts with the logs below, then goes to the CrashLoopBackOff state; on the next restart the logs are different (attached below):

etcd 03:36:22.23
etcd 03:36:22.24 Welcome to the Bitnami etcd container
etcd 03:36:22.24 Subscribe to project updates by watching https://github.com/bitnami/containers
etcd 03:36:22.24 Submit issues and feature requests at https://github.com/bitnami/containers/issues
etcd 03:36:22.24
etcd 03:36:22.24 INFO  ==> ** Starting etcd setup **
etcd 03:36:22.27 INFO  ==> Validating settings in ETCD_* env vars..
etcd 03:36:22.27 INFO  ==> Initializing etcd
etcd 03:36:22.28 INFO  ==> Generating etcd config file using env variables
etcd 03:36:22.30 INFO  ==> There is no data from previous deployments
etcd 03:36:22.30 INFO  ==> Bootstrapping a new cluster
etcd 03:36:22.30 DEBUG ==> Waiting for the headless svc domain to have an IP per initial member in the cluster
etcd 03:36:22.32 DEBUG ==> Skipping RBAC configuration in member myetcd-1
etcd 03:36:22.32 INFO  ==> Obtaining cluster member ID
etcd 03:36:22.33 INFO  ==> Starting etcd in background
{"level":"info","ts":"2023-04-25T03:36:22.371713Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_ADVERTISE_CLIENT_URLS","variable-value":"http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2379,http://myetcd.etcd.svc.cluster.local:2379"}
{"level":"info","ts":"2023-04-25T03:36:22.3719Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_AUTH_TOKEN","variable-value":"jwt,priv-key=/opt/bitnami/etcd/certs/token/jwt-token.pem,sign-method=RS256,ttl=10m"}
{"level":"info","ts":"2023-04-25T03:36:22.37193Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_AUTO_TLS","variable-value":"false"}
{"level":"info","ts":"2023-04-25T03:36:22.37196Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_CLIENT_CERT_AUTH","variable-value":"false"}
{"level":"info","ts":"2023-04-25T03:36:22.371991Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_DATA_DIR","variable-value":"/bitnami/etcd/data"}
{"level":"info","ts":"2023-04-25T03:36:22.372095Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_INITIAL_ADVERTISE_PEER_URLS","variable-value":"http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2380"}
{"level":"info","ts":"2023-04-25T03:36:22.372132Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_INITIAL_CLUSTER","variable-value":"myetcd-0=http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380,myetcd-1=http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2380,myetcd-2=http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380"}
{"level":"info","ts":"2023-04-25T03:36:22.372167Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_INITIAL_CLUSTER_STATE","variable-value":"new"}
{"level":"info","ts":"2023-04-25T03:36:22.372196Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_INITIAL_CLUSTER_TOKEN","variable-value":"etcd-cluster-k8s"}
{"level":"info","ts":"2023-04-25T03:36:22.372239Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_LISTEN_CLIENT_URLS","variable-value":"http://0.0.0.0:2379"}
{"level":"info","ts":"2023-04-25T03:36:22.372283Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_LISTEN_PEER_URLS","variable-value":"http://0.0.0.0:2380"}
{"level":"info","ts":"2023-04-25T03:36:22.372327Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_LOG_LEVEL","variable-value":"debug"}
{"level":"info","ts":"2023-04-25T03:36:22.372361Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_NAME","variable-value":"myetcd-1"}
{"level":"info","ts":"2023-04-25T03:36:22.372392Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_PEER_AUTO_TLS","variable-value":"false"}
{"level":"warn","ts":"2023-04-25T03:36:22.372497Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_TRUSTED_CA_FILE="}
{"level":"warn","ts":"2023-04-25T03:36:22.372544Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DISABLE_STORE_MEMBER_ID=no"}
{"level":"warn","ts":"2023-04-25T03:36:22.372561Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_CONF_FILE=/opt/bitnami/etcd/conf/etcd.yaml"}
{"level":"warn","ts":"2023-04-25T03:36:22.372583Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_SNAPSHOT_HISTORY_LIMIT=1"}
{"level":"warn","ts":"2023-04-25T03:36:22.372602Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_ON_K8S=yes"}
{"level":"warn","ts":"2023-04-25T03:36:22.372633Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_SNAPSHOTS_DIR=/snapshots"}
{"level":"warn","ts":"2023-04-25T03:36:22.372659Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_BIN_DIR=/opt/bitnami/etcd/bin"}
{"level":"warn","ts":"2023-04-25T03:36:22.372676Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_VOLUME_DIR=/bitnami/etcd"}
{"level":"warn","ts":"2023-04-25T03:36:22.372714Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_ROOT_PASSWORD=Hz9EMW6o00"}
{"level":"warn","ts":"2023-04-25T03:36:22.372745Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_CLUSTER_DOMAIN=myetcd-headless.etcd.svc.cluster.local"}
{"level":"warn","ts":"2023-04-25T03:36:22.372778Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DISASTER_RECOVERY=no"}
{"level":"warn","ts":"2023-04-25T03:36:22.372812Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_KEY_FILE="}
{"level":"warn","ts":"2023-04-25T03:36:22.372848Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_CONF_DIR=/opt/bitnami/etcd/conf"}
{"level":"warn","ts":"2023-04-25T03:36:22.372882Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DAEMON_GROUP=etcd"}
{"level":"warn","ts":"2023-04-25T03:36:22.3729Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_START_FROM_SNAPSHOT=no"}
{"level":"warn","ts":"2023-04-25T03:36:22.372927Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_INIT_SNAPSHOT_FILENAME="}
{"level":"warn","ts":"2023-04-25T03:36:22.37298Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_INIT_SNAPSHOTS_DIR=/init-snapshot"}
{"level":"warn","ts":"2023-04-25T03:36:22.373001Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DISABLE_PRESTOP=no"}
{"level":"warn","ts":"2023-04-25T03:36:22.373032Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_TMP_DIR=/opt/bitnami/etcd/tmp"}
{"level":"warn","ts":"2023-04-25T03:36:22.373074Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_BASE_DIR=/opt/bitnami/etcd"}
{"level":"warn","ts":"2023-04-25T03:36:22.373118Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_CERT_FILE="}
{"level":"warn","ts":"2023-04-25T03:36:22.373149Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_NEW_MEMBERS_ENV_FILE=/bitnami/etcd/data/new_member_envs"}
{"level":"warn","ts":"2023-04-25T03:36:22.373176Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DAEMON_USER=etcd"}
{"level":"warn","ts":"2023-04-25T03:36:22.373853Z","caller":"embed/config.go:673","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2023-04-25T03:36:22.373926Z","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["etcd"]}
{"level":"warn","ts":"2023-04-25T03:36:22.374019Z","caller":"embed/config.go:673","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2023-04-25T03:36:22.374052Z","caller":"embed/etcd.go:127","msg":"configuring peer listeners","listen-peer-urls":["http://0.0.0.0:2380"]}
{"level":"info","ts":"2023-04-25T03:36:22.374423Z","caller":"embed/etcd.go:135","msg":"configuring client listeners","listen-client-urls":["http://0.0.0.0:2379"]}
{"level":"info","ts":"2023-04-25T03:36:22.374619Z","caller":"embed/etcd.go:309","msg":"starting an etcd server","etcd-version":"3.5.8","git-sha":"217d183e5","go-version":"go1.19.8","go-os":"linux","go-arch":"amd64","max-cpu-set":32,"max-cpu-available":32,"member-initialized":false,"name":"myetcd-1","data-dir":"/bitnami/etcd/data","wal-dir":"","wal-dir-dedicated":"","member-dir":"/bitnami/etcd/data/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":100000,"max-wals":5,"max-snapshots":5,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2380"],"listen-peer-urls":["http://0.0.0.0:2380"],"advertise-client-urls":["http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2379","http://myetcd.etcd.svc.cluster.local:2379"],"listen-client-urls":["http://0.0.0.0:2379"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"myetcd-0=http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380,myetcd-1=http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2380,myetcd-2=http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380","initial-cluster-state":"new","initial-cluster-token":"etcd-cluster-k8s","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","compact-check-time-enabled":false,"compact-check-time-interval":"1m0s","auto-compaction-mode":"periodic","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
{"level":"info","ts":"2023-04-25T03:36:22.376673Z","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/bitnami/etcd/data/member/snap/db","took":"1.317094ms"}
{"level":"info","ts":"2023-04-25T03:36:22.383263Z","caller":"etcdserver/raft.go:495","msg":"starting local member","local-member-id":"c330302d396f2768","cluster-id":"cba674359d9edebe"}
{"level":"info","ts":"2023-04-25T03:36:22.383493Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"c330302d396f2768 switched to configuration voters=()"}
{"level":"info","ts":"2023-04-25T03:36:22.383575Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"c330302d396f2768 became follower at term 0"}
{"level":"info","ts":"2023-04-25T03:36:22.383631Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"newRaft c330302d396f2768 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]"}
{"level":"info","ts":"2023-04-25T03:36:22.383668Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"c330302d396f2768 became follower at term 1"}
{"level":"info","ts":"2023-04-25T03:36:22.383791Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"c330302d396f2768 switched to configuration voters=(2531763403718161201)"}
{"level":"info","ts":"2023-04-25T03:36:22.38385Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"c330302d396f2768 switched to configuration voters=(2531763403718161201 12987238743530219448)"}
{"level":"info","ts":"2023-04-25T03:36:22.383916Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"c330302d396f2768 switched to configuration voters=(2531763403718161201 12987238743530219448 14064794607073306472)"}
{"level":"info","ts":"2023-04-25T03:36:22.387004Z","caller":"mvcc/kvstore.go:393","msg":"kvstore restored","current-rev":1}
{"level":"debug","ts":"2023-04-25T03:36:22.387128Z","caller":"etcdserver/server.go:619","msg":"restore consistentIndex","index":0}
{"level":"info","ts":"2023-04-25T03:36:22.387516Z","caller":"etcdserver/quota.go:94","msg":"enabled backend quota with default value","quota-name":"v3-applier","quota-size-bytes":2147483648,"quota-size":"2.1 GB"}
{"level":"info","ts":"2023-04-25T03:36:22.387729Z","caller":"rafthttp/peer.go:133","msg":"starting remote peer","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T03:36:22.387773Z","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T03:36:22.388552Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T03:36:22.38956Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T03:36:22.391571Z","caller":"rafthttp/peer.go:137","msg":"started remote peer","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T03:36:22.391584Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"debug","ts":"2023-04-25T03:36:22.391652Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"2322a166ddf47331","address":"http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/msgapp/c330302d396f2768"}
{"level":"info","ts":"2023-04-25T03:36:22.39167Z","caller":"rafthttp/transport.go:317","msg":"added remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331","remote-peer-urls":["http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380"]}
{"level":"info","ts":"2023-04-25T03:36:22.391719Z","caller":"rafthttp/peer.go:133","msg":"starting remote peer","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T03:36:22.391697Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"debug","ts":"2023-04-25T03:36:22.391842Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"2322a166ddf47331","address":"http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/message/c330302d396f2768"}
{"level":"info","ts":"2023-04-25T03:36:22.39176Z","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T03:36:22.393214Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T03:36:22.39347Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T03:36:22.394385Z","caller":"rafthttp/peer.go:137","msg":"started remote peer","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"debug","ts":"2023-04-25T03:36:22.394427Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"2322a166ddf47331","error":"failed to dial 2322a166ddf47331 on stream MsgApp v2 (the member has been permanently removed from the cluster)"}
{"level":"debug","ts":"2023-04-25T03:36:22.394507Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"2322a166ddf47331","address":"http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/msgapp/c330302d396f2768"}
{"level":"info","ts":"2023-04-25T03:36:22.394462Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"debug","ts":"2023-04-25T03:36:22.394522Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"2322a166ddf47331","error":"failed to dial 2322a166ddf47331 on stream Message (the member has been permanently removed from the cluster)"}
{"level":"debug","ts":"2023-04-25T03:36:22.394553Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"b43bf0d3f14f27b8","address":"http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/msgapp/c330302d396f2768"}
{"level":"debug","ts":"2023-04-25T03:36:22.394597Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"2322a166ddf47331","address":"http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/message/c330302d396f2768"}
{"level":"info","ts":"2023-04-25T03:36:22.394524Z","caller":"rafthttp/transport.go:317","msg":"added remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8","remote-peer-urls":["http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380"]}
{"level":"info","ts":"2023-04-25T03:36:22.394556Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"debug","ts":"2023-04-25T03:36:22.394669Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"b43bf0d3f14f27b8","address":"http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/message/c330302d396f2768"}
{"level":"info","ts":"2023-04-25T03:36:22.394718Z","caller":"etcdserver/server.go:854","msg":"starting etcd server","local-member-id":"c330302d396f2768","local-server-version":"3.5.8","cluster-version":"to_be_decided"}
{"level":"info","ts":"2023-04-25T03:36:22.394929Z","caller":"etcdserver/server.go:754","msg":"starting initial election tick advance","election-ticks":10}
{"level":"info","ts":"2023-04-25T03:36:22.394948Z","caller":"fileutil/purge.go:44","msg":"started to purge file","dir":"/bitnami/etcd/data/member/snap","suffix":"snap.db","max":5,"interval":"30s"}
{"level":"info","ts":"2023-04-25T03:36:22.395078Z","caller":"fileutil/purge.go:44","msg":"started to purge file","dir":"/bitnami/etcd/data/member/snap","suffix":"snap","max":5,"interval":"30s"}
{"level":"warn","ts":"2023-04-25T03:36:22.395026Z","caller":"etcdserver/server.go:1127","msg":"server error","error":"the member has been permanently removed from the cluster"}
{"level":"info","ts":"2023-04-25T03:36:22.395196Z","caller":"fileutil/purge.go:44","msg":"started to purge file","dir":"/bitnami/etcd/data/member/wal","suffix":"wal","max":5,"interval":"30s"}
{"level":"warn","ts":"2023-04-25T03:36:22.395235Z","caller":"etcdserver/server.go:1128","msg":"data-dir used by this member must be removed"}
{"level":"warn","ts":"2023-04-25T03:36:22.395437Z","caller":"etcdserver/server.go:2073","msg":"stopped publish because server is stopped","local-member-id":"c330302d396f2768","local-member-attributes":"{Name:myetcd-1 ClientURLs:[http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2379 http://myetcd.etcd.svc.cluster.local:2379]}","publish-timeout":"7s","error":"etcdserver: server stopped"}
{"level":"info","ts":"2023-04-25T03:36:22.396016Z","caller":"rafthttp/peer.go:330","msg":"stopping remote peer","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T03:36:22.396087Z","caller":"rafthttp/stream.go:294","msg":"stopped TCP streaming connection with remote peer","stream-writer-type":"unknown stream","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T03:36:22.396169Z","caller":"rafthttp/stream.go:294","msg":"stopped TCP streaming connection with remote peer","stream-writer-type":"unknown stream","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T03:36:22.396231Z","caller":"rafthttp/pipeline.go:85","msg":"stopped HTTP pipelining with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"debug","ts":"2023-04-25T03:36:22.396455Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"2322a166ddf47331","error":"failed to dial 2322a166ddf47331 on stream MsgApp v2 (context canceled)"}
{"level":"info","ts":"2023-04-25T03:36:22.39654Z","caller":"rafthttp/stream.go:442","msg":"stopped stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"debug","ts":"2023-04-25T03:36:22.396836Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"2322a166ddf47331","error":"failed to dial 2322a166ddf47331 on stream Message (context canceled)"}
{"level":"info","ts":"2023-04-25T03:36:22.396869Z","caller":"rafthttp/stream.go:442","msg":"stopped stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T03:36:22.39691Z","caller":"rafthttp/peer.go:335","msg":"stopped remote peer","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T03:36:22.396941Z","caller":"rafthttp/peer.go:330","msg":"stopping remote peer","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T03:36:22.396978Z","caller":"rafthttp/stream.go:294","msg":"stopped TCP streaming connection with remote peer","stream-writer-type":"unknown stream","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T03:36:22.397061Z","caller":"rafthttp/stream.go:294","msg":"stopped TCP streaming connection with remote peer","stream-writer-type":"unknown stream","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T03:36:22.397153Z","caller":"rafthttp/pipeline.go:85","msg":"stopped HTTP pipelining with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"debug","ts":"2023-04-25T03:36:22.39717Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"b43bf0d3f14f27b8","error":"failed to dial b43bf0d3f14f27b8 on stream Message (the member has been permanently removed from the cluster)"}
{"level":"debug","ts":"2023-04-25T03:36:22.397215Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"b43bf0d3f14f27b8","address":"http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/message/c330302d396f2768"}
{"level":"debug","ts":"2023-04-25T03:36:22.397227Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"b43bf0d3f14f27b8","error":"failed to dial b43bf0d3f14f27b8 on stream MsgApp v2 (the member has been permanently removed from the cluster)"}
{"level":"info","ts":"2023-04-25T03:36:22.397254Z","caller":"rafthttp/stream.go:442","msg":"stopped stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"debug","ts":"2023-04-25T03:36:22.397312Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"b43bf0d3f14f27b8","error":"failed to dial b43bf0d3f14f27b8 on stream Message (context canceled)"}
{"level":"info","ts":"2023-04-25T03:36:22.397362Z","caller":"rafthttp/stream.go:442","msg":"stopped stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T03:36:22.397398Z","caller":"rafthttp/peer.go:335","msg":"stopped remote peer","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T03:36:22.397959Z","caller":"embed/etcd.go:597","msg":"serving peer traffic","address":"[::]:2380"}
{"level":"info","ts":"2023-04-25T03:36:22.397975Z","caller":"embed/etcd.go:278","msg":"now serving peer/client/metrics","local-member-id":"c330302d396f2768","initial-advertise-peer-urls":["http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2380"],"listen-peer-urls":["http://0.0.0.0:2380"],"advertise-client-urls":["http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2379","http://myetcd.etcd.svc.cluster.local:2379"],"listen-client-urls":["http://0.0.0.0:2379"],"listen-metrics-urls":[]}
{"level":"info","ts":"2023-04-25T03:36:22.397988Z","caller":"embed/etcd.go:569","msg":"cmux::serve","address":"[::]:2380"}
{"level":"info","ts":"2023-04-25T03:36:22.398116Z","caller":"etcdmain/main.go:44","msg":"notifying init daemon"}
{"level":"info","ts":"2023-04-25T03:36:22.398151Z","caller":"etcdmain/main.go:50","msg":"successfully notified init daemon"}

After entering the CrashLoopBackOff state, the pod restarts with the logs below:

tester@bln-k8s-161 ➜  etcd ke logs myetcd-1
etcd 04:12:58.80
etcd 04:12:58.81 Welcome to the Bitnami etcd container
etcd 04:12:58.81 Subscribe to project updates by watching https://github.com/bitnami/containers
etcd 04:12:58.81 Submit issues and feature requests at https://github.com/bitnami/containers/issues
etcd 04:12:58.82
etcd 04:12:58.82 INFO  ==> ** Starting etcd setup **
etcd 04:12:58.84 INFO  ==> Validating settings in ETCD_* env vars..
etcd 04:12:58.85 INFO  ==> Initializing etcd
etcd 04:12:58.85 INFO  ==> Generating etcd config file using env variables
etcd 04:12:58.88 INFO  ==> Detected data from previous deployments
etcd 04:12:59.09 DEBUG ==> myetcd-0.myetcd-headless.etcd.svc.cluster.local:2379 endpoint is active
etcd 04:12:59.24 DEBUG ==> myetcd-2.myetcd-headless.etcd.svc.cluster.local:2379 endpoint is active
etcd 04:12:59.26 INFO  ==> Member ID wasn't properly stored, the member will try to join the cluster by it's own
etcd 04:12:59.27 INFO  ==> ** etcd setup finished! **

etcd 04:12:59.29 INFO  ==> ** Starting etcd **
{"level":"info","ts":"2023-04-25T04:12:59.316146Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_ADVERTISE_CLIENT_URLS","variable-value":"http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2379,http://myetcd.etcd.svc.cluster.local:2379"}
{"level":"info","ts":"2023-04-25T04:12:59.316322Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_AUTH_TOKEN","variable-value":"jwt,priv-key=/opt/bitnami/etcd/certs/token/jwt-token.pem,sign-method=RS256,ttl=10m"}
{"level":"info","ts":"2023-04-25T04:12:59.316352Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_AUTO_TLS","variable-value":"false"}
{"level":"info","ts":"2023-04-25T04:12:59.316381Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_CLIENT_CERT_AUTH","variable-value":"false"}
{"level":"info","ts":"2023-04-25T04:12:59.31641Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_DATA_DIR","variable-value":"/bitnami/etcd/data"}
{"level":"info","ts":"2023-04-25T04:12:59.316499Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_INITIAL_ADVERTISE_PEER_URLS","variable-value":"http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2380"}
{"level":"info","ts":"2023-04-25T04:12:59.316523Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_INITIAL_CLUSTER","variable-value":"myetcd-0=http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380,myetcd-1=http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2380,myetcd-2=http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380"}
{"level":"info","ts":"2023-04-25T04:12:59.316544Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_INITIAL_CLUSTER_STATE","variable-value":"new"}
{"level":"info","ts":"2023-04-25T04:12:59.316572Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_INITIAL_CLUSTER_TOKEN","variable-value":"etcd-cluster-k8s"}
{"level":"info","ts":"2023-04-25T04:12:59.316607Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_LISTEN_CLIENT_URLS","variable-value":"http://0.0.0.0:2379"}
{"level":"info","ts":"2023-04-25T04:12:59.316635Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_LISTEN_PEER_URLS","variable-value":"http://0.0.0.0:2380"}
{"level":"info","ts":"2023-04-25T04:12:59.316662Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_LOG_LEVEL","variable-value":"debug"}
{"level":"info","ts":"2023-04-25T04:12:59.316696Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_NAME","variable-value":"myetcd-1"}
{"level":"info","ts":"2023-04-25T04:12:59.316725Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_PEER_AUTO_TLS","variable-value":"false"}
{"level":"warn","ts":"2023-04-25T04:12:59.316807Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_TRUSTED_CA_FILE="}
{"level":"warn","ts":"2023-04-25T04:12:59.316837Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DISABLE_STORE_MEMBER_ID=no"}
{"level":"warn","ts":"2023-04-25T04:12:59.316854Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_CONF_FILE=/opt/bitnami/etcd/conf/etcd.yaml"}
{"level":"warn","ts":"2023-04-25T04:12:59.316875Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_SNAPSHOT_HISTORY_LIMIT=1"}
{"level":"warn","ts":"2023-04-25T04:12:59.316904Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_ON_K8S=yes"}
{"level":"warn","ts":"2023-04-25T04:12:59.316925Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_SNAPSHOTS_DIR=/snapshots"}
{"level":"warn","ts":"2023-04-25T04:12:59.316943Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_BIN_DIR=/opt/bitnami/etcd/bin"}
{"level":"warn","ts":"2023-04-25T04:12:59.316963Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_VOLUME_DIR=/bitnami/etcd"}
{"level":"warn","ts":"2023-04-25T04:12:59.316983Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_ROOT_PASSWORD=BdUzwPlMMo"}
{"level":"warn","ts":"2023-04-25T04:12:59.317004Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_CLUSTER_DOMAIN=myetcd-headless.etcd.svc.cluster.local"}
{"level":"warn","ts":"2023-04-25T04:12:59.317021Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DISASTER_RECOVERY=no"}
{"level":"warn","ts":"2023-04-25T04:12:59.31704Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_KEY_FILE="}
{"level":"warn","ts":"2023-04-25T04:12:59.317057Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_CONF_DIR=/opt/bitnami/etcd/conf"}
{"level":"warn","ts":"2023-04-25T04:12:59.317078Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DAEMON_GROUP=etcd"}
{"level":"warn","ts":"2023-04-25T04:12:59.317094Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_START_FROM_SNAPSHOT=no"}
{"level":"warn","ts":"2023-04-25T04:12:59.317118Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_INIT_SNAPSHOT_FILENAME="}
{"level":"warn","ts":"2023-04-25T04:12:59.317135Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_INIT_SNAPSHOTS_DIR=/init-snapshot"}
{"level":"warn","ts":"2023-04-25T04:12:59.317156Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DISABLE_PRESTOP=no"}
{"level":"warn","ts":"2023-04-25T04:12:59.317172Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_TMP_DIR=/opt/bitnami/etcd/tmp"}
{"level":"warn","ts":"2023-04-25T04:12:59.317196Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_BASE_DIR=/opt/bitnami/etcd"}
{"level":"warn","ts":"2023-04-25T04:12:59.317215Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_CERT_FILE="}
{"level":"warn","ts":"2023-04-25T04:12:59.317236Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_NEW_MEMBERS_ENV_FILE=/bitnami/etcd/data/new_member_envs"}
{"level":"warn","ts":"2023-04-25T04:12:59.317252Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DAEMON_USER=etcd"}
{"level":"warn","ts":"2023-04-25T04:12:59.317935Z","caller":"embed/config.go:673","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2023-04-25T04:12:59.318002Z","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["etcd"]}
{"level":"info","ts":"2023-04-25T04:12:59.318144Z","caller":"etcdmain/etcd.go:116","msg":"server has been already initialized","data-dir":"/bitnami/etcd/data","dir-type":"member"}
{"level":"warn","ts":"2023-04-25T04:12:59.3182Z","caller":"embed/config.go:673","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2023-04-25T04:12:59.318222Z","caller":"embed/etcd.go:127","msg":"configuring peer listeners","listen-peer-urls":["http://0.0.0.0:2380"]}
{"level":"info","ts":"2023-04-25T04:12:59.318578Z","caller":"embed/etcd.go:135","msg":"configuring client listeners","listen-client-urls":["http://0.0.0.0:2379"]}
{"level":"info","ts":"2023-04-25T04:12:59.318782Z","caller":"embed/etcd.go:309","msg":"starting an etcd server","etcd-version":"3.5.8","git-sha":"217d183e5","go-version":"go1.19.8","go-os":"linux","go-arch":"amd64","max-cpu-set":32,"max-cpu-available":32,"member-initialized":true,"name":"myetcd-1","data-dir":"/bitnami/etcd/data","wal-dir":"","wal-dir-dedicated":"","member-dir":"/bitnami/etcd/data/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":100000,"max-wals":5,"max-snapshots":5,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2380"],"listen-peer-urls":["http://0.0.0.0:2380"],"advertise-client-urls":["http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2379","http://myetcd.etcd.svc.cluster.local:2379"],"listen-client-urls":["http://0.0.0.0:2379"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"","initial-cluster-state":"new","initial-cluster-token":"","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","compact-check-time-enabled":false,"compact-check-time-interval":"1m0s","auto-compaction-mode":"periodic","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
{"level":"info","ts":"2023-04-25T04:12:59.322258Z","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/bitnami/etcd/data/member/snap/db","took":"3.026124ms"}
{"level":"info","ts":"2023-04-25T04:12:59.322682Z","caller":"etcdserver/server.go:530","msg":"No snapshot found. Recovering WAL from scratch!"}
{"level":"info","ts":"2023-04-25T04:12:59.323731Z","caller":"etcdserver/raft.go:530","msg":"restarting local member","cluster-id":"cba674359d9edebe","local-member-id":"c330302d396f2768","commit-index":3}
{"level":"info","ts":"2023-04-25T04:12:59.32391Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"c330302d396f2768 switched to configuration voters=()"}
{"level":"info","ts":"2023-04-25T04:12:59.323981Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"c330302d396f2768 became follower at term 1"}
{"level":"info","ts":"2023-04-25T04:12:59.323994Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"newRaft c330302d396f2768 [peers: [], term: 1, commit: 3, applied: 0, lastindex: 3, lastterm: 1]"}
{"level":"info","ts":"2023-04-25T04:12:59.326331Z","caller":"mvcc/kvstore.go:393","msg":"kvstore restored","current-rev":1}
{"level":"debug","ts":"2023-04-25T04:12:59.326413Z","caller":"etcdserver/server.go:619","msg":"restore consistentIndex","index":3}
{"level":"info","ts":"2023-04-25T04:12:59.326816Z","caller":"etcdserver/quota.go:94","msg":"enabled backend quota with default value","quota-name":"v3-applier","quota-size-bytes":2147483648,"quota-size":"2.1 GB"}
{"level":"info","ts":"2023-04-25T04:12:59.327209Z","caller":"etcdserver/server.go:854","msg":"starting etcd server","local-member-id":"c330302d396f2768","local-server-version":"3.5.8","cluster-version":"to_be_decided"}
{"level":"info","ts":"2023-04-25T04:12:59.327986Z","caller":"fileutil/purge.go:44","msg":"started to purge file","dir":"/bitnami/etcd/data/member/snap","suffix":"snap.db","max":5,"interval":"30s"}
{"level":"info","ts":"2023-04-25T04:12:59.328173Z","caller":"fileutil/purge.go:44","msg":"started to purge file","dir":"/bitnami/etcd/data/member/snap","suffix":"snap","max":5,"interval":"30s"}
{"level":"info","ts":"2023-04-25T04:12:59.328202Z","caller":"fileutil/purge.go:44","msg":"started to purge file","dir":"/bitnami/etcd/data/member/wal","suffix":"wal","max":5,"interval":"30s"}
{"level":"info","ts":"2023-04-25T04:12:59.327897Z","caller":"etcdserver/server.go:754","msg":"starting initial election tick advance","election-ticks":10}
{"level":"debug","ts":"2023-04-25T04:12:59.328309Z","caller":"etcdserver/server.go:2142","msg":"Applying entries","num-entries":3}
{"level":"debug","ts":"2023-04-25T04:12:59.328637Z","caller":"etcdserver/server.go:2145","msg":"Applying entry","index":1,"term":1,"type":"EntryConfChange"}
{"level":"info","ts":"2023-04-25T04:12:59.329451Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"c330302d396f2768 switched to configuration voters=(2531763403718161201)"}
{"level":"info","ts":"2023-04-25T04:12:59.329662Z","caller":"membership/cluster.go:421","msg":"added member","cluster-id":"cba674359d9edebe","local-member-id":"c330302d396f2768","added-peer-id":"2322a166ddf47331","added-peer-peer-urls":["http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380"]}
{"level":"info","ts":"2023-04-25T04:12:59.329723Z","caller":"rafthttp/peer.go:133","msg":"starting remote peer","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T04:12:59.329835Z","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T04:12:59.334208Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T04:12:59.33425Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T04:12:59.334748Z","caller":"rafthttp/peer.go:137","msg":"started remote peer","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T04:12:59.334891Z","caller":"rafthttp/transport.go:317","msg":"added remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331","remote-peer-urls":["http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380"]}
{"level":"debug","ts":"2023-04-25T04:12:59.334961Z","caller":"etcdserver/server.go:2145","msg":"Applying entry","index":2,"term":1,"type":"EntryConfChange"}
{"level":"info","ts":"2023-04-25T04:12:59.33485Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T04:12:59.335399Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T04:12:59.335503Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"c330302d396f2768 switched to configuration voters=(2531763403718161201 12987238743530219448)"}
{"level":"debug","ts":"2023-04-25T04:12:59.335503Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"2322a166ddf47331","address":"http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/msgapp/c330302d396f2768"}
{"level":"debug","ts":"2023-04-25T04:12:59.335532Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"2322a166ddf47331","address":"http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/message/c330302d396f2768"}
{"level":"info","ts":"2023-04-25T04:12:59.335597Z","caller":"membership/cluster.go:421","msg":"added member","cluster-id":"cba674359d9edebe","local-member-id":"c330302d396f2768","added-peer-id":"b43bf0d3f14f27b8","added-peer-peer-urls":["http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380"]}
{"level":"info","ts":"2023-04-25T04:12:59.335639Z","caller":"rafthttp/peer.go:133","msg":"starting remote peer","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T04:12:59.3357Z","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T04:12:59.336103Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T04:12:59.33634Z","caller":"rafthttp/stream.go:169","msg":"started stream writer with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T04:12:59.336733Z","caller":"rafthttp/peer.go:137","msg":"started remote peer","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T04:12:59.336774Z","caller":"rafthttp/transport.go:317","msg":"added remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8","remote-peer-urls":["http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380"]}
{"level":"info","ts":"2023-04-25T04:12:59.336764Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"debug","ts":"2023-04-25T04:12:59.336814Z","caller":"etcdserver/server.go:2145","msg":"Applying entry","index":3,"term":1,"type":"EntryConfChange"}
{"level":"debug","ts":"2023-04-25T04:12:59.336825Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"b43bf0d3f14f27b8","address":"http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/msgapp/c330302d396f2768"}
{"level":"info","ts":"2023-04-25T04:12:59.336838Z","caller":"rafthttp/stream.go:395","msg":"started stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"debug","ts":"2023-04-25T04:12:59.336963Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"b43bf0d3f14f27b8","address":"http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/message/c330302d396f2768"}
{"level":"info","ts":"2023-04-25T04:12:59.336974Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"c330302d396f2768 switched to configuration voters=(2531763403718161201 12987238743530219448 14064794607073306472)"}
{"level":"info","ts":"2023-04-25T04:12:59.337073Z","caller":"membership/cluster.go:421","msg":"added member","cluster-id":"cba674359d9edebe","local-member-id":"c330302d396f2768","added-peer-id":"c330302d396f2768","added-peer-peer-urls":["http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2380"]}
{"level":"info","ts":"2023-04-25T04:12:59.337397Z","caller":"embed/etcd.go:597","msg":"serving peer traffic","address":"[::]:2380"}
{"level":"info","ts":"2023-04-25T04:12:59.337448Z","caller":"embed/etcd.go:569","msg":"cmux::serve","address":"[::]:2380"}
{"level":"info","ts":"2023-04-25T04:12:59.337517Z","caller":"embed/etcd.go:278","msg":"now serving peer/client/metrics","local-member-id":"c330302d396f2768","initial-advertise-peer-urls":["http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2380"],"listen-peer-urls":["http://0.0.0.0:2380"],"advertise-client-urls":["http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2379","http://myetcd.etcd.svc.cluster.local:2379"],"listen-client-urls":["http://0.0.0.0:2379"],"listen-metrics-urls":[]}
{"level":"debug","ts":"2023-04-25T04:12:59.338903Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"2322a166ddf47331","error":"failed to dial 2322a166ddf47331 on stream Message (the member has been permanently removed from the cluster)"}
{"level":"warn","ts":"2023-04-25T04:12:59.338952Z","caller":"etcdserver/server.go:1127","msg":"server error","error":"the member has been permanently removed from the cluster"}
{"level":"debug","ts":"2023-04-25T04:12:59.339087Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"2322a166ddf47331","address":"http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/message/c330302d396f2768"}
{"level":"warn","ts":"2023-04-25T04:12:59.339087Z","caller":"etcdserver/server.go:1128","msg":"data-dir used by this member must be removed"}
{"level":"warn","ts":"2023-04-25T04:12:59.339213Z","caller":"etcdserver/server.go:2083","msg":"failed to publish local member to cluster through raft","local-member-id":"c330302d396f2768","local-member-attributes":"{Name:myetcd-1 ClientURLs:[http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2379 http://myetcd.etcd.svc.cluster.local:2379]}","request-path":"/0/members/c330302d396f2768/attributes","publish-timeout":"7s","error":"etcdserver: request cancelled"}
{"level":"debug","ts":"2023-04-25T04:12:59.339234Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"2322a166ddf47331","error":"failed to dial 2322a166ddf47331 on stream MsgApp v2 (the member has been permanently removed from the cluster)"}
{"level":"debug","ts":"2023-04-25T04:12:59.33935Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"2322a166ddf47331","address":"http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/msgapp/c330302d396f2768"}
{"level":"warn","ts":"2023-04-25T04:12:59.339405Z","caller":"etcdserver/server.go:2083","msg":"failed to publish local member to cluster through raft","local-member-id":"c330302d396f2768","local-member-attributes":"{Name:myetcd-1 ClientURLs:[http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2379 http://myetcd.etcd.svc.cluster.local:2379]}","request-path":"/0/members/c330302d396f2768/attributes","publish-timeout":"7s","error":"etcdserver: request cancelled"}
{"level":"warn","ts":"2023-04-25T04:12:59.33945Z","caller":"etcdserver/server.go:2073","msg":"stopped publish because server is stopped","local-member-id":"c330302d396f2768","local-member-attributes":"{Name:myetcd-1 ClientURLs:[http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2379 http://myetcd.etcd.svc.cluster.local:2379]}","publish-timeout":"7s","error":"etcdserver: server stopped"}
{"level":"info","ts":"2023-04-25T04:12:59.339533Z","caller":"rafthttp/peer.go:330","msg":"stopping remote peer","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T04:12:59.339638Z","caller":"rafthttp/stream.go:294","msg":"stopped TCP streaming connection with remote peer","stream-writer-type":"unknown stream","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T04:12:59.33972Z","caller":"rafthttp/stream.go:294","msg":"stopped TCP streaming connection with remote peer","stream-writer-type":"unknown stream","remote-peer-id":"2322a166ddf47331"}
{"level":"debug","ts":"2023-04-25T04:12:59.339719Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"b43bf0d3f14f27b8","error":"failed to dial b43bf0d3f14f27b8 on stream Message (the member has been permanently removed from the cluster)"}
{"level":"info","ts":"2023-04-25T04:12:59.339825Z","caller":"rafthttp/pipeline.go:85","msg":"stopped HTTP pipelining with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"debug","ts":"2023-04-25T04:12:59.339837Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"b43bf0d3f14f27b8","address":"http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/message/c330302d396f2768"}
{"level":"debug","ts":"2023-04-25T04:12:59.340145Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"2322a166ddf47331","error":"failed to dial 2322a166ddf47331 on stream MsgApp v2 (context canceled)"}
{"level":"info","ts":"2023-04-25T04:12:59.340174Z","caller":"rafthttp/stream.go:442","msg":"stopped stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"debug","ts":"2023-04-25T04:12:59.34032Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"b43bf0d3f14f27b8","error":"failed to dial b43bf0d3f14f27b8 on stream MsgApp v2 (the member has been permanently removed from the cluster)"}
{"level":"debug","ts":"2023-04-25T04:12:59.340376Z","caller":"rafthttp/stream.go:570","msg":"dial stream reader","from":"c330302d396f2768","to":"b43bf0d3f14f27b8","address":"http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380/raft/stream/msgapp/c330302d396f2768"}
{"level":"debug","ts":"2023-04-25T04:12:59.340405Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"2322a166ddf47331","error":"failed to dial 2322a166ddf47331 on stream Message (context canceled)"}
{"level":"info","ts":"2023-04-25T04:12:59.340546Z","caller":"rafthttp/stream.go:442","msg":"stopped stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"c330302d396f2768","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T04:12:59.340677Z","caller":"rafthttp/peer.go:335","msg":"stopped remote peer","remote-peer-id":"2322a166ddf47331"}
{"level":"info","ts":"2023-04-25T04:12:59.340703Z","caller":"rafthttp/peer.go:330","msg":"stopping remote peer","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T04:12:59.340729Z","caller":"rafthttp/stream.go:294","msg":"stopped TCP streaming connection with remote peer","stream-writer-type":"unknown stream","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T04:12:59.340809Z","caller":"rafthttp/stream.go:294","msg":"stopped TCP streaming connection with remote peer","stream-writer-type":"unknown stream","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T04:12:59.34103Z","caller":"rafthttp/pipeline.go:85","msg":"stopped HTTP pipelining with remote peer","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"debug","ts":"2023-04-25T04:12:59.34115Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"b43bf0d3f14f27b8","error":"failed to dial b43bf0d3f14f27b8 on stream MsgApp v2 (context canceled)"}
{"level":"info","ts":"2023-04-25T04:12:59.341199Z","caller":"rafthttp/stream.go:442","msg":"stopped stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"debug","ts":"2023-04-25T04:12:59.341253Z","caller":"rafthttp/peer_status.go:76","msg":"peer deactivated again","peer-id":"b43bf0d3f14f27b8","error":"failed to dial b43bf0d3f14f27b8 on stream Message (context canceled)"}
{"level":"info","ts":"2023-04-25T04:12:59.341274Z","caller":"rafthttp/stream.go:442","msg":"stopped stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"c330302d396f2768","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T04:12:59.341312Z","caller":"rafthttp/peer.go:335","msg":"stopped remote peer","remote-peer-id":"b43bf0d3f14f27b8"}
{"level":"info","ts":"2023-04-25T04:12:59.344079Z","caller":"etcdmain/main.go:44","msg":"notifying init daemon"}
{"level":"info","ts":"2023-04-25T04:12:59.344113Z","caller":"etcdmain/main.go:50","msg":"successfully notified init daemon"}

@abhayycs
Author

I'm not sure if it helps, but here are the logs from deleting one of the pods:

tester@bln-k8s-161 ➜  etcd ke delete pod myetcd-1
pod "myetcd-1" deleted

tester@bln-k8s-161 ➜  etcd ke get pods -o wide
NAME       READY   STATUS    RESTARTS   AGE    IP               NODE          NOMINATED NODE   READINESS GATES
myetcd-0   1/1     Running   0          3m3s   10.233.114.73    bln-k8s-162   <none>           <none>
myetcd-1   1/1     Running   0          73s    10.233.89.3      bln-k8s-163   <none>           <none>
myetcd-2   1/1     Running   0          3m3s   10.233.115.174   bln-k8s-161   <none>           <none>

tester@bln-k8s-161 ➜  etcd ke logs myetcd-1
etcd 04:21:23.48
etcd 04:21:23.48 Welcome to the Bitnami etcd container
etcd 04:21:23.48 Subscribe to project updates by watching https://github.com/bitnami/containers
etcd 04:21:23.49 Submit issues and feature requests at https://github.com/bitnami/containers/issues
etcd 04:21:23.49
etcd 04:21:23.49 INFO  ==> ** Starting etcd setup **
etcd 04:21:23.51 INFO  ==> Validating settings in ETCD_* env vars..
etcd 04:21:23.52 INFO  ==> Initializing etcd
etcd 04:21:23.52 INFO  ==> Generating etcd config file using env variables
etcd 04:21:23.55 INFO  ==> Detected data from previous deployments
etcd 04:21:25.77 DEBUG ==> myetcd-0.myetcd-headless.etcd.svc.cluster.local:2379 endpoint is active
etcd 04:21:25.92 DEBUG ==> myetcd-2.myetcd-headless.etcd.svc.cluster.local:2379 endpoint is active
etcd 04:21:25.93 DEBUG ==> Removal was properly recorded in member_removal.log
etcd 04:21:25.95 INFO  ==> Adding new member to existing cluster
etcd 04:21:26.14 INFO  ==> Obtaining cluster member ID
etcd 04:21:26.15 INFO  ==> Starting etcd in background
{"level":"info","ts":"2023-04-25T04:21:26.187624Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_ADVERTISE_CLIENT_URLS","variable-value":"http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2379,http://myetcd.etcd.svc.cluster.local:2379"}
{"level":"info","ts":"2023-04-25T04:21:26.18783Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_AUTH_TOKEN","variable-value":"jwt,priv-key=/opt/bitnami/etcd/certs/token/jwt-token.pem,sign-method=RS256,ttl=10m"}
{"level":"info","ts":"2023-04-25T04:21:26.187869Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_AUTO_TLS","variable-value":"false"}
{"level":"info","ts":"2023-04-25T04:21:26.187903Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_CLIENT_CERT_AUTH","variable-value":"false"}
{"level":"info","ts":"2023-04-25T04:21:26.187944Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_DATA_DIR","variable-value":"/bitnami/etcd/data"}
{"level":"info","ts":"2023-04-25T04:21:26.188059Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_INITIAL_ADVERTISE_PEER_URLS","variable-value":"http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2380"}
{"level":"info","ts":"2023-04-25T04:21:26.188085Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_INITIAL_CLUSTER","variable-value":"myetcd-0=http://myetcd-0.myetcd-headless.etcd.svc.cluster.local:2380,myetcd-1=http://myetcd-1.myetcd-headless.etcd.svc.cluster.local:2380,myetcd-2=http://myetcd-2.myetcd-headless.etcd.svc.cluster.local:2380"}
{"level":"info","ts":"2023-04-25T04:21:26.188106Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_INITIAL_CLUSTER_STATE","variable-value":"existing"}
{"level":"info","ts":"2023-04-25T04:21:26.188132Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_INITIAL_CLUSTER_TOKEN","variable-value":"etcd-cluster-k8s"}
{"level":"info","ts":"2023-04-25T04:21:26.188186Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_LISTEN_CLIENT_URLS","variable-value":"http://0.0.0.0:2379"}
{"level":"info","ts":"2023-04-25T04:21:26.18822Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_LISTEN_PEER_URLS","variable-value":"http://0.0.0.0:2380"}
{"level":"info","ts":"2023-04-25T04:21:26.188243Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_LOG_LEVEL","variable-value":"debug"}
{"level":"info","ts":"2023-04-25T04:21:26.188276Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_NAME","variable-value":"myetcd-1"}
{"level":"info","ts":"2023-04-25T04:21:26.188306Z","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_PEER_AUTO_TLS","variable-value":"false"}
{"level":"warn","ts":"2023-04-25T04:21:26.188388Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_TRUSTED_CA_FILE="}
{"level":"warn","ts":"2023-04-25T04:21:26.188426Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DISABLE_STORE_MEMBER_ID=no"}
{"level":"warn","ts":"2023-04-25T04:21:26.188453Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_CONF_FILE=/opt/bitnami/etcd/conf/etcd.yaml"}
{"level":"warn","ts":"2023-04-25T04:21:26.188469Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_SNAPSHOT_HISTORY_LIMIT=1"}
{"level":"warn","ts":"2023-04-25T04:21:26.188493Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_ON_K8S=yes"}
{"level":"warn","ts":"2023-04-25T04:21:26.18851Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_SNAPSHOTS_DIR=/snapshots"}
{"level":"warn","ts":"2023-04-25T04:21:26.188532Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_BIN_DIR=/opt/bitnami/etcd/bin"}
{"level":"warn","ts":"2023-04-25T04:21:26.188551Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_VOLUME_DIR=/bitnami/etcd"}
{"level":"warn","ts":"2023-04-25T04:21:26.188571Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_ROOT_PASSWORD=gMhRJrcpC9"}
{"level":"warn","ts":"2023-04-25T04:21:26.188592Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_CLUSTER_DOMAIN=myetcd-headless.etcd.svc.cluster.local"}
{"level":"warn","ts":"2023-04-25T04:21:26.188611Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DISASTER_RECOVERY=no"}
{"level":"warn","ts":"2023-04-25T04:21:26.188628Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_KEY_FILE="}
{"level":"warn","ts":"2023-04-25T04:21:26.188648Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_CONF_DIR=/opt/bitnami/etcd/conf"}
{"level":"warn","ts":"2023-04-25T04:21:26.188664Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_ACTIVE_ENDPOINTS=myetcd-0.myetcd-headless.etcd.svc.cluster.local:2379,myetcd-2.myetcd-headless.etcd.svc.cluster.local:2379"}
{"level":"warn","ts":"2023-04-25T04:21:26.188686Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DAEMON_GROUP=etcd"}
{"level":"warn","ts":"2023-04-25T04:21:26.188702Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_START_FROM_SNAPSHOT=no"}
{"level":"warn","ts":"2023-04-25T04:21:26.188727Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_INIT_SNAPSHOT_FILENAME="}
{"level":"warn","ts":"2023-04-25T04:21:26.188743Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_INIT_SNAPSHOTS_DIR=/init-snapshot"}
{"level":"warn","ts":"2023-04-25T04:21:26.188764Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DISABLE_PRESTOP=no"}
{"level":"warn","ts":"2023-04-25T04:21:26.18878Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_TMP_DIR=/opt/bitnami/etcd/tmp"}
{"level":"warn","ts":"2023-04-25T04:21:26.188804Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_BASE_DIR=/opt/bitnami/etcd"}
{"level":"warn","ts":"2023-04-25T04:21:26.188826Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_CERT_FILE="}
{"level":"warn","ts":"2023-04-25T04:21:26.188846Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_NEW_MEMBERS_ENV_FILE=/bitnami/etcd/data/new_member_envs"}
{"level":"warn","ts":"2023-04-25T04:21:26.188862Z","caller":"flags/flag.go:93","msg":"unrecognized environment variable","environment-variable":"ETCD_DAEMON_USER=etcd"}
{"level":"warn","ts":"2023-04-25T04:21:26.1891Z","caller":"embed/config.go:673","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2023-04-25T04:21:26.189166Z","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["etcd"]}
{"level":"warn","ts":"2023-04-25T04:21:26.189306Z","caller":"etcdmain/etcd.go:446","msg":"found invalid file under data directory","filename":"new_member_envs","data-dir":"/bitnami/etcd/data"}
{"level":"warn","ts":"2023-04-25T04:21:26.189354Z","caller":"embed/config.go:673","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2023-04-25T04:21:26.189377Z","caller":"embed/etcd.go:127","msg":"configuring peer listeners","listen-peer-urls":["http://0.0.0.0:2380"]}

@aoterolorenzo
Contributor

This does indeed seem to be an issue with the ETCD_INITIAL_CLUSTER_STATE logic. I will create an internal task for the team to address it. We will get back to you here as soon as we can (bear in mind we can't give an exact ETA, since it depends on the team workload). Meanwhile, I will mark the issue as on-hold.

@aoterolorenzo aoterolorenzo added the on-hold Issues or Pull Requests with this label will never be considered stale label Apr 27, 2023
@nagyzekkyandras

Hi @aoterolorenzo !
I have the same issue with the latest version.
Is there any progress on this topic?

@github-actions github-actions bot added triage Triage is needed and removed on-hold Issues or Pull Requests with this label will never be considered stale labels Jun 27, 2023
@github-actions github-actions bot added the stale 15 days without activity label Dec 24, 2023
@6ixfalls

not fixed

@github-actions github-actions bot removed the stale 15 days without activity label Dec 25, 2023
@juan131 juan131 added the on-hold Issues or Pull Requests with this label will never be considered stale label Jan 2, 2024
@djpenka

djpenka commented Jan 11, 2024

Is a fix for this on the roadmap?

@rmuddana

rmuddana commented Jan 18, 2024

The Bitnami etcd cluster seems to be very fragile when it comes to upgrades or even pod restarts, and there is no clue as to why it is failing!

rmuddana@cf08:~$ kbl c8-iad0 describe pod etcd-client-2 | grep INIT
      ETCD_INITIAL_ADVERTISE_PEER_URLS:  http://$(MY_POD_NAME).etcd-client-headless.com:2380
      ETCD_INITIAL_CLUSTER_TOKEN:        etcd-cluster-k8s
      ETCD_INITIAL_CLUSTER_STATE:        existing
      ETCD_INITIAL_CLUSTER:              etcd-client-0=http://etcd-client-0.etcd-client-headless.com:2380,etcd-client-1=http://etcd-client-1.etcd-client-headless.com:2380,etcd-client-2=http://etcd-client-2.etcd-client-headless.com:2380

rmuddana@cf08:~$ kbl c8-iad0 logs etcd-client-2 | more
etcd 05:31:10.15 
etcd 05:31:10.15 Welcome to the Bitnami etcd container
etcd 05:31:10.15 Subscribe to project updates by watching https://github.com/bitnami/containers
etcd 05:31:10.15 Submit issues and feature requests at https://github.com/bitnami/containers/issues
etcd 05:31:10.15 
etcd 05:31:10.15 INFO  ==> ** Starting etcd setup **
etcd 05:31:10.18 INFO  ==> Validating settings in ETCD_* env vars..
etcd 05:31:10.18 WARN  ==> You set the environment variable ALLOW_NONE_AUTHENTICATION=yes. For safety reasons, do not use this flag in a production environment.
etcd 05:31:10.18 INFO  ==> Initializing etcd
etcd 05:31:10.18 INFO  ==> Generating etcd config file using env variables
etcd 05:31:10.20 INFO  ==> Detected data from previous deployments
etcd 05:31:10.33 INFO  ==> Adding new member to existing cluster
{"level":"warn","ts":"2024-01-18T05:31:10.369314Z","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0004daa80/etcd-client-0.etcd-client-headless.com:2379","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: unhealthy cluster"}
Error: etcdserver: unhealthy cluster

@rmuddana-ns

rmuddana-ns commented Mar 7, 2024

Here is a workaround to recover the pod from crash loopback and make it join the cluster:

Find the etcd client that is down. In this case, it was etcd-client-0

$ kubectl get pods | grep etcd
etcd-client-0                           0/1     CrashLoopBackOff   67 (76s ago)    5h17m    knode01.c1.net   <none>           <none>
etcd-client-1                           1/1     Running            0               18d      knode02.c1.net   <none>           <none>
etcd-client-2                           1/1     Running            0               29d      knode05.c1.net   <none>           <none>

Find its member ID

Exec into an etcd pod that is running OK, like etcd-client-1 in this case, and list the members

$ kubectl exec -it etcd-client-1 -- etcdctl member list
64d19d2f86bef81, started, etcd-client-0, http://etcd-client-0.etcd-client-headless.svc.c1.net:2380, http://etcd-client-0.etcd-client-headless.svc.c1.net:2379,http://etcd-client.svc.c1.net:2379, false
4050c251f9d2dd49, started, etcd-client-2, http://etcd-client-2.etcd-client-headless.svc.c1.net:2380, http://etcd-client-2.etcd-client-headless.svc.c1.net:2379,http://etcd-client.svc.c1.net:2379, false
814a84a90f431141, started, etcd-client-1, http://etcd-client-1.etcd-client-headless.svc.c1.net:2380, http://etcd-client-1.etcd-client-headless.svc.c1.net:2379,http://etcd-client..svc.c1.net:2379, false

The member to delete is etcd-client-0 with ID 64d19d2f86bef81

Delete that member

$ kubectl exec -it etcd-client-1 -- etcdctl member remove 64d19d2f86bef81
Member  64d19d2f86bef81 removed from cluster 2929d47079397a18

Wait until a new member joins the etcd cluster

$ kubectl exec -it etcd-client-1 -- etcdctl member list
4050c251f9d2dd49, started, etcd-client-2, http://etcd-client-2.etcd-client-headless.svc.c1.net:2380, http://etcd-client-2.etcd-client-headless.svc.c1.net:2379,http://etcd-client.svc.c1.net:2379, false
814a84a90f431141, started, etcd-client-1, http://etcd-client-1.etcd-client-headless.svc.c1.net:2380, http://etcd-client-1.etcd-client-headless.svc.c1.net:2379,http://etcd-client.svc.c1.net:2379, false
cb476fe6302e4350, started, etcd-client-0, http://etcd-client-0.etcd-client-headless.svc.c1.net:2380, http://etcd-client-0.etcd-client-headless.svc.c1.net:2379,http://etcd-client.svc.c1.net:2379, false

Check the etcd pods are all running

$ kubectl get pods | grep etcd
etcd-client-0                           1/1     Running   101 (37m ago)   8h
etcd-client-1                           1/1     Running   0               19d
etcd-client-2                           1/1     Running   0               30d

@6ixfalls

6ixfalls commented Mar 7, 2024

@rmuddana-ns That doesn't seem to be a real workaround: after doing that, it just moves around which pod is stuck in a crash loop. After the deleted member joins the cluster again, another node gets kicked off, starting the loop again.

@rmuddana-ns

rmuddana-ns commented Mar 7, 2024

@6ixfalls That seems to be a different problem. You probably have a rolling upgrade pending in your deployment due to the CrashLoopBackOff issue, and once the pod recovers, the upgrade moves on to the next pod.

@6ixfalls

6ixfalls commented Mar 7, 2024

@6ixfalls That seems to be a different problem. You probably have a rolling upgrade pending in your deployment due to the CrashLoopBackOff issue, and once the pod recovers, the upgrade moves on to the next pod.

Looks like you're right, thanks. The next step would be to get this fixed in the chart itself. Have you tried whether this works with 2 or all 3 pods in a crash loop?

@rmuddana-ns

It should work as long as you have at least one working pod. If all 3 of them are in CrashLoopBackOff, there is no pod left to serve the etcdctl commands.

Yes, the root cause has to be identified and fixed. It appears etcd is somehow holding on to the old member ID. I do not know at this point whether this is coming from the chart or somewhere else.

@melkypie

The issue seems to be that when the pod goes down it removes itself from the cluster, and then keeps failing to add itself back. In the preStop hook script (https://github.com/bitnami/containers/blob/main/bitnami/etcd/3.5/debian-12/rootfs/opt/bitnami/scripts/etcd/prestop.sh) we can see the container removing itself from the etcd cluster, but it is unable to join back after restarting. I looked more closely into why this could be happening: the first time the pod tries to rejoin after the restart, it still has the member ID saved and uses this part of the script (https://github.com/bitnami/containers/blob/main/bitnami/etcd/3.5/debian-12/rootfs/opt/bitnami/scripts/libetcd.sh#L711-L721). The second time around, the member ID is already lost and it starts throwing errors about not being able to find the member ID. Unfortunately I cannot see what exactly is wrong there, as I am not that well versed in etcd.

Currently the only workaround that I have found to actually help is to set removeMemberOnContainerTermination: false (https://github.com/bitnami/charts/blob/main/bitnami/etcd/values.yaml#L236). This means the pod is not removed from the etcd cluster's member list when it goes down for a restart.

However, this also means that if you scale down the etcd cluster, the scaled-down pod will not be removed from the etcd member list; you can do that manually by exec-ing into one of the remaining etcd pods and removing the member.

@carrodher
Member

Thank you for bringing this issue to our attention. We appreciate your involvement! If you're interested in contributing a solution, we welcome you to create a pull request. The Bitnami team is excited to review your submission and offer feedback. You can find the contributing guidelines here.

Your contribution will greatly benefit the community. Feel free to reach out if you have any questions or need assistance.

@pietrogoddibit2win

pietrogoddibit2win commented Sep 2, 2024

Hi, it seems that this workaround configuration is not working as expected.
Moving the pod to another node means the PVC is created from scratch; when the pod starts, it stays stuck with this log:
(screenshot of pod logs)

As far as is written here, moving a pod to another Kubernetes node should work, but in practice it no longer does.

@pietrogoddibit2win

pietrogoddibit2win commented Oct 4, 2024

Any news about this problem? Using removeMemberOnContainerTermination: false didn't work.

@juan131
Contributor

juan131 commented Oct 7, 2024

As far as is written here, moving a pod to another Kubernetes node should work, but in practice it no longer does.

@pietrogoddibit2win that assumes you're using a PersistentVolume that can be reattached to a different node (e.g. a GCE persistent disk, an Azure disk, an AWS EBS volume, etc.). However, if you're using local storage, that won't work.

@pietrogoddibit2win

As far as is written here, moving a pod to another Kubernetes node should work, but in practice it no longer does.

@pietrogoddibit2win that assumes you're using a PersistentVolume that can be reattached to a different node (e.g. a GCE persistent disk, an Azure disk, an AWS EBS volume, etc.). However, if you're using local storage, that won't work.

Yes, we're on GCE using PVC

@forestgagnon

The only way I have found to deploy a stable etcd cluster using this Helm chart on GKE is with removeMemberOnContainerTermination: false. Without that flag I eventually hit this bug and the cluster becomes unstable after node pool churn. The nature of the flaw makes it unlikely to be noticed when the cluster is first installed and tested, which makes it quite dangerous and impactful.

If the root cause is not on track to be fixed in the foreseeable future, consider making removeMemberOnContainerTermination: false the new default or strongly recommending it in the chart docs to avoid stability issues.

Stable values.yaml for me:

replicaCount: 5
autoCompactionMode: periodic
autoCompactionRetention: 10m
removeMemberOnContainerTermination: false
auth:
  rbac:
    enabled: true
    allowNoneAuthentication: false
pdb:
  create: true
  minAvailable: 4
extraEnvVars:
  - name: ETCD_SELF_SIGNED_CERT_VALIDITY
    value: "100" # 100 years

@TankerAnker

TankerAnker commented Oct 11, 2024

I have the same issue running etcd in a mayastor deployment with etcd binding to local PVs (etcd replicas and PVs are distributed to multiple nodes).

With replicaCount > 1 and initialClusterState = new, this leads to ETCD_INITIAL_CLUSTER_STATE=new on each replica, causing the above problem.

Workaround that works for me:

  • Initial Helm deployment with only 1 etcd replica:
  replicaCount: 1
  initialClusterState: "new"
  persistence:
    enabled: true
    storageClass: "mayastor-etcd-localpv"
  • Scale up with second Helm deployment and provide initialClusterState=existing:
  replicaCount: 3
  initialClusterState: "existing"
  persistence:
    enabled: true
    storageClass: "mayastor-etcd-localpv"

@furucong

In recent days I also encountered this error. I deployed 3 etcd instances on a GKE Autopilot cluster. When a node drain happened, one etcd pod was rescheduled onto a new node. It failed to restart, reporting the error "etcdserver: member not found.", then kept restarting until the pod ended up in CrashLoopBackOff.

I checked the documentation and implementation of bitnami/etcd.
When the pod is rescheduled by Kubernetes for any reason, a "pre-stop" lifecycle hook is used to ensure that the etcdctl member remove command is executed. The hook stores the output of this command into a file named member_removal.log in the persistent volume attached to the etcd pod. prestop.sh
When the pod starts, it reads that local file to verify whether it is a removed etcd member. If it determines that the member was not removed successfully, it sends a “member update” command to the etcd cluster at startup. libetcd.sh

I checked the log and found that the “member remove” command was sent, but the member_removal.log file in the PVC may not have been saved successfully. I don’t know the exact reason because this bug does not occur every time and it may be related to the specific implementation of Kubernetes. In short, relying on the local file is not reliable!

Suggestions for users:

  • Use more etcd instances, for example set replicaCount=5

Suggestions for bitnami/etcd:

  • Disable “member update” and always use “member add”. Even if the previous member was not removed from the etcd cluster, there is no harm; only some warning logs will be output.
  • If you do want to use "member update", you should first send "member list" to check if the member exists, rather than relying on a local file.
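The member-list check suggested above can be sketched in bash. This is a hypothetical helper, not the actual libetcd.sh code; the sample output mimics the `etcdctl member list` format shown earlier in this thread:

```shell
#!/usr/bin/env bash
# Sketch: decide whether this node needs "member add" by consulting the
# cluster's member list instead of trusting the local member_removal.log file.

# In the real scripts, this list would come from:
#   etcdctl --endpoints="$ETCD_ACTIVE_ENDPOINTS" member list
member_list='4050c251f9d2dd49, started, etcd-client-2, http://etcd-client-2:2380, http://etcd-client-2:2379, false
814a84a90f431141, started, etcd-client-1, http://etcd-client-1:2380, http://etcd-client-1:2379, false'

# etcdctl prints: <id>, <status>, <name>, <peer-urls>, <client-urls>, <learner>
is_registered_member() {
  local member_name="$1"
  grep -q ", ${member_name}," <<<"$member_list"
}

if is_registered_member "etcd-client-1"; then
  echo "etcd-client-1 is registered: start etcd directly"
fi
if ! is_registered_member "etcd-client-0"; then
  echo "etcd-client-0 is not registered: run 'member add' before starting"
fi
```

Unlike a file on the pod's volume, the member list is the cluster's own source of truth, so it cannot go stale when a preStop hook is interrupted.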

@pckhoi
Contributor

pckhoi commented Dec 17, 2024

The documentation says that you should use "member update" only to update peer URLs, not for a member to rejoin. The fix seems straightforward to me: just replace "member update" with a pair of "member remove" and "member add". Am I missing something here?
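As a dry-run sketch of that remove+add pair, the function below just prints the etcdctl commands instead of running them (the member ID and URL are example values from this thread, and `rejoin_member` is a hypothetical name):

```shell
#!/usr/bin/env bash
# Dry-run sketch: rejoin by removing the stale member and adding a fresh one.
rejoin_member() {
  local old_id="$1" name="$2" peer_url="$3"
  # "member update" only changes an existing member's peer URLs; a member that
  # must rejoin needs a fresh member ID, hence remove + add.
  echo "etcdctl member remove ${old_id}"
  echo "etcdctl member add ${name} --peer-urls=${peer_url}"
}

rejoin_member 64d19d2f86bef81 etcd-client-0 \
  http://etcd-client-0.etcd-client-headless.svc.c1.net:2380
```

This mirrors the manual workaround posted earlier in the thread, where removing the crashed member by hand let a pod with a fresh member ID join.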

@pckhoi
Contributor

pckhoi commented Dec 18, 2024

@abhayycs so I looked into your original logs and it seems that things happened in this order:

  1. When your pod is drained from a node, it gets rescheduled onto a new node and appears to start with an empty data directory. In that case, the image doesn't adjust the ETCD_INITIAL_CLUSTER_STATE env var any further; if you want to rejoin an existing cluster from an empty data dir, you have to set ETCD_INITIAL_CLUSTER_STATE to "existing" yourself. Since the image assumes the cluster is new, it doesn't call "member add" first, and so it fails.
  2. When the pod is started again, the data dir already exists, so it no longer matters what ETCD_INITIAL_CLUSTER_STATE is. etcd reads the existing member ID from the data dir and tries to join with it, but again, the cluster was never told about this member ID, so it just crash loops from there.

I guess the permanent fix to this issue would be for the image to ignore the ETCD_INITIAL_CLUSTER_STATE env var altogether and figure out on its own, by contacting other members, whether the cluster is live and healthy.
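A minimal sketch of that idea: derive the initial cluster state by probing peers rather than trusting the env var. `probe_endpoint` and `derive_cluster_state` are hypothetical names, and the probe is stubbed here; the real scripts would run an `etcdctl endpoint health` check per peer:

```shell
#!/usr/bin/env bash
# Stub: pretend only myetcd-0 answers the health probe.
probe_endpoint() { [ "$1" = "myetcd-0:2379" ]; }

derive_cluster_state() {
  local healthy=0 ep
  for ep in "$@"; do
    probe_endpoint "$ep" && healthy=$((healthy + 1))
  done
  # Any healthy peer means there is a live cluster to join.
  if [ "$healthy" -gt 0 ]; then echo existing; else echo new; fi
}

derive_cluster_state myetcd-0:2379 myetcd-1:2379 myetcd-2:2379  # prints "existing"
```

With this approach, a pod rescheduled onto a fresh node with an empty volume would correctly conclude it must join an existing cluster, regardless of what the chart set at install time.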

@pckhoi
Contributor

pckhoi commented Dec 18, 2024

Now moving on to my problem. Let me know if I should create a separate issue for this but there are several problems with the etcd image:

  1. It tries to test whether the member was removed by reading ${ETCD_VOLUME_DIR}/member_removal.log, which could be wrong for any number of reasons: perhaps the member removed itself successfully but failed to write the file, or an operator removed the member by hand during debugging. The permanent fix for me is to use the member list command in all cases.
  2. It tries to read the old member ID from the file ${ETCD_DATA_DIR}/member_id, which could be inaccurate for the same reasons as above. The short-term fix is to set ETCD_DISABLE_STORE_MEMBER_ID=yes; the long-term fix is to remove this file from the scripts altogether.
  3. If the old member is intact, there is zero reason to call member update; just start the etcd node
  4. If the old member is already removed, we shouldn't start a new node from a non-empty data directory
  5. During startup, if it is detected that the local member ID and the registered member ID differ, the scripts must remove the data directory and re-add/start the new member from scratch.
  6. There are many parts of the flow of the etcd_initialize function that simply don't make sense

Overall, I think etcd_initialize needs a major rewrite. Here's how the flow should look:

```mermaid
stateDiagram-v2
    state "etcd_initialize" as ei
    state "get_number_of_healthy_nodes" as hn
    state "start_new_cluster" as nc
    state "echo 'manual recovery required'" as em
    state "is_data_dir_empty" as ed
    state "start_etcd" as se
    state "stop_etcd" as st
    state "remove_data_dir" as rd
    state "remove_old_member_if_exist" as ro
    state "join_as_new_member" as jc

    [*] --> ei
    ei --> hn
    hn --> nc: equal 0
    nc --> [*]
    hn --> em: less than majority
    em --> [*]
    hn --> ed: equal or more than majority
    ed --> se: no
    se --> st: if succeeds
    st --> [*]
    se --> rd: if member is permanently removed
    rd --> ro
    ro --> jc
    jc --> [*]
    ed --> ro: yes
```
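The same flow can be sketched as a bash skeleton. Every helper below is a hypothetical stub named after a state in the diagram, included only so the control flow is runnable; none of it is the real libetcd.sh code:

```shell
#!/usr/bin/env bash
# Stubs standing in for real probes of the cluster and the data directory.
get_number_of_healthy_nodes() { echo 2; }    # stub: 2 of 3 peers healthy
is_data_dir_empty()           { return 0; }  # stub: fresh volume after reschedule
remove_old_member_if_exist()  { echo "removed stale member"; }
join_as_new_member()          { echo "joined as new member"; }
start_new_cluster()           { echo "started new cluster"; }
start_etcd()                  { echo "started etcd with existing data"; }
remove_data_dir()             { echo "wiped data dir"; }

etcd_initialize() {
  local majority=2 healthy
  healthy=$(get_number_of_healthy_nodes)
  if [ "$healthy" -eq 0 ]; then
    start_new_cluster                          # no cluster exists yet
  elif [ "$healthy" -lt "$majority" ]; then
    echo "manual recovery required"            # quorum already lost
  elif is_data_dir_empty; then
    remove_old_member_if_exist && join_as_new_member
  else
    # Try to start with the existing data; if the member was permanently
    # removed, wipe the data dir and rejoin from scratch.
    start_etcd || { remove_data_dir; remove_old_member_if_exist; join_as_new_member; }
  fi
}

etcd_initialize
```

With the stubbed values (healthy majority, empty data dir), this takes the "remove old member, then join as new member" path, which is exactly the branch the drained-pod scenario in this issue needs.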

@github-actions github-actions bot added solved and removed on-hold Issues or Pull Requests with this label will never be considered stale labels Jan 20, 2025