This repository has been archived by the owner on Nov 7, 2018. It is now read-only.

Pods elasticsearch failed with "Back-off restarting failed container" #205

Open
maryxu opened this issue Jul 10, 2018 · 5 comments

Comments


maryxu commented Jul 10, 2018

Can you help us understand why the elasticsearch pods fail with "Back-off restarting failed container"? Thanks a lot!

[root@####### ~]# kubectl describe pod efk3-elasticsearch-2 --namespace=efk --insecure-skip-tls-verify=true
Name: efk3-elasticsearch-2
Namespace: efk
Node: lvdevk8sw23/10.219.161.3
Start Time: Tue, 03 Jul 2018 14:58:28 +0800
Labels: app=elasticsearch
component=master
controller-revision-hash=efk3-elasticsearch-569cf776f
release=efk3
statefulset.kubernetes.io/pod-name=efk3-elasticsearch-2
Annotations:
Status: Running
IP: 10.42.6.19
Controlled By: StatefulSet/efk3-elasticsearch
Init Containers:
sysctl:
Container ID: docker://bed338fd0e395678abbbd1c49be7d14e1636faacdadba334c0e5607c4eb07251
Image: busybox
Image ID: docker-pullable://busybox@sha256:141c253bc4c3fd0a201d32dc1f493bcf3fff003b6df416dea4f41046e0f37d47
Port:
Command:
sysctl
-w
vm.max_map_count=262144
State: Terminated
Reason: Completed
Exit Code: 0
Started: Tue, 03 Jul 2018 14:58:32 +0800
Finished: Tue, 03 Jul 2018 14:58:32 +0800
Ready: True
Restart Count: 0
Environment:
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from efk3-elasticsearch-token-5vqzr (ro)
Containers:
elasticsearch:
Container ID: docker://336bfce82124136d78de952552de2f2688c5f4249c9c440b1259fd7b3e230046
Image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.2.4
Image ID: docker-pullable://docker.elastic.co/elasticsearch/elasticsearch-oss@sha256:2d9c774c536bd1f64abc4993ebc96a2344404d780cbeb81a8b3b4c3807550e57
Ports: 9300/TCP, 9200/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 10 Jul 2018 13:20:21 +0800
Finished: Tue, 10 Jul 2018 13:20:26 +0800
Ready: False
Restart Count: 1929
Limits:
cpu: 1
Requests:
cpu: 25m
memory: 512Mi
Readiness: http-get http://:9200/_cluster/health%3Flocal=true delay=5s timeout=1s period=10s #success=1 #failure=3
Environment:
cluster.name: efk3-cluster
discovery.zen.ping.unicast.hosts: efk3-elasticsearch
discovery.zen.minimum_master_nodes: 2
KUBERNETES_NAMESPACE: efk (v1:metadata.namespace)
discovery.zen.ping.unicast.hosts: efk3-elasticsearch
PROCESSORS: 1 (limits.cpu)
ES_JAVA_OPTS: -Djava.net.preferIPv4Stack=true -Xms512m -Xmx512m
Mounts:
/usr/share/elasticsearch/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from efk3-elasticsearch-token-5vqzr (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-efk3-elasticsearch-2
ReadOnly: false
efk3-elasticsearch-token-5vqzr:
Type: Secret (a volume populated by a Secret)
SecretName: efk3-elasticsearch-token-5vqzr
Optional: false
QoS Class: Burstable
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type     Reason   Age                  From                  Message
----     ------   ----                 ----                  -------
Normal   Pulled   48m (x1921 over 6d)  kubelet, lvdevk8sw23  Container image "docker.elastic.co/elasticsearch/elasticsearch-oss:6.2.4" already present on machine
Warning  BackOff  3m (x44220 over 6d)  kubelet, lvdevk8sw23  Back-off restarting failed container


mat1010 commented Jul 10, 2018

@maryxu Could you run kubectl logs po/efk3-elasticsearch-2 --namespace=efk --insecure-skip-tls-verify=true and check the pod's logs for problems with the process inside the container?

It would also be helpful to always paste output in code blocks for easier reading.
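Because the container is in CrashLoopBackOff, the logs of the current instance may be empty while it waits in back-off. A minimal sketch of a helper that asks for the previously terminated instance instead is shown here; the function name `crash_logs` is hypothetical, and the container name `elasticsearch` is taken from the describe output above:

```shell
# crash_logs POD NAMESPACE [CONTAINER]
# Prints the logs of the previously terminated container instance,
# which is usually where the reason for the crash shows up.
# CONTAINER defaults to "elasticsearch" (the name in the describe output).
crash_logs() {
  pod="$1"
  ns="$2"
  container="${3:-elasticsearch}"
  kubectl logs "$pod" --namespace="$ns" --container="$container" --previous
}
```

It could be called as `crash_logs efk3-elasticsearch-2 efk`. If the cluster needs it, `--insecure-skip-tls-verify=true` would have to be added to the `kubectl` call inside the function.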

Author

maryxu commented Jul 10, 2018 via email


mat1010 commented Jul 10, 2018

It looks like a permission issue:

Caused by: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes/0/node.lock
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:?]
        at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177) ~[?:?]
        at java.nio.channels.FileChannel.open(FileChannel.java:287) ~[?:1.8.0_161]
        at java.nio.channels.FileChannel.open(FileChannel.java:335) ~[?:1.8.0_161]
        at org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:125) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
        at org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
        at org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
        at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:209) ~[elasticsearch-6.2.4.jar:6.2.4]
        at org.elasticsearch.node.Node.<init>(Node.java:264) ~[elasticsearch-6.2.4.jar:6.2.4]
        at org.elasticsearch.node.Node.<init>(Node.java:246) ~[elasticsearch-6.2.4.jar:6.2.4]
        at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:213) ~[elasticsearch-6.2.4.jar:6.2.4]
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:213) ~[elasticsearch-6.2.4.jar:6.2.4]
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:323) ~[elasticsearch-6.2.4.jar:6.2.4]
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-6.2.4.jar:6.2.4]
        ... 6 more

Elasticsearch seems unable to write to /usr/share/elasticsearch/data/.

Author

maryxu commented Jul 10, 2018 via email


mat1010 commented Jul 10, 2018

> For the other applications, the container can write to the NFS data without any problem.

What are the other applications? Do you have multiple applications running in the same container? Please post your Kubernetes StatefulSet, including the NFS volumes that are attached and mounted to the pods.

The elasticsearch container is most likely not started as root, so the elasticsearch user needs permission to write to /usr/share/elasticsearch/data.
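One way to confirm this is to probe the data directory as the same user Elasticsearch runs as. The sketch below is only an illustration: the function name `check_data_dir` and the probe file name are hypothetical, and it assumes you can run a shell in some container that mounts the same claim with the same UID (the official image documents the elasticsearch user as UID 1000, but please verify that for your image).

```shell
# check_data_dir DIR
# Prints the numeric owner, group and mode of DIR, then tries to create
# and remove a probe file in it, similar in spirit to Elasticsearch
# creating node.lock. Returns 0 if writable, 1 if not, 2 if DIR is missing.
check_data_dir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 2
  fi
  # -n shows numeric uid/gid so they can be compared with the output of id -u
  ls -ldn "$dir"
  probe="$dir/.write-probe.$$"
  if touch "$probe" 2>/dev/null; then
    rm -f "$probe"
    echo "writable: $dir (uid $(id -u))"
  else
    echo "not writable: $dir (uid $(id -u))"
    return 1
  fi
}
```

Running `check_data_dir /usr/share/elasticsearch/data` as the elasticsearch user should show whether the owner and mode of the NFS export match that UID. Note that it only probes the top-level directory; the stack trace points at nodes/0/node.lock, so an existing file or subdirectory owned by a different user could still be the cause.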
