Error while loading data #247

Closed · sonfrau opened this issue Dec 18, 2017 · 15 comments

sonfrau commented Dec 18, 2017

I was using a dockerized cerebro 0.6.7 to manage five Elasticsearch 5.6.2 nodes, but I was getting continuous "Error while loading data" messages.

I have updated the dockerized cerebro to the latest version, 0.7.2, but I still get the same messages.
[screenshot: devops-749_1]
I can only ask for node status; I'm not able to perform any other operation.

I can connect to other Elasticsearch clusters without these problems.

How can I troubleshoot this problem?

Thanks and kind regards

sonfrau commented Dec 18, 2017

We are not even able to list the snapshots we have created:

[screenshot: devops-749_2]

We are creating a daily snapshot of some of our indexes.

Regards

@lmenezes (Owner)

@sonfrau Can you see what the error message says? If you click on the message that should expand. If not, could you share the logs?

sonfrau commented Dec 18, 2017

In application.log I'm getting these messages:

2017-12-18 08:28:57,557 - [ERROR] - from application in ForkJoinPool-2-worker-1 
Error processing request [path: /overview, body: {"host":"******","username":"******","password":"***********"}]
org.asynchttpclient.exception.RemotelyClosedException: Remotely closed

[error] application - Error processing request [path: /snapshots, body: {"host":"******","username":"******","password":"***********"}]
play.api.libs.json.JsResultException: JsResultException(errors:List((,List(ValidationError(List(error.expected.jsarray),WrappedArray())))))
at play.api.libs.json.JsReadable$$anonfun$2.apply(JsReadable.scala:23)
at play.api.libs.json.JsReadable$$anonfun$2.apply(JsReadable.scala:23)
at play.api.libs.json.JsResult$class.fold(JsResult.scala:73)
at play.api.libs.json.JsError.fold(JsResult.scala:13)
at play.api.libs.json.JsReadable$class.as(JsReadable.scala:21)

Can I give you any further information?

Is there any way to start cerebro 0.7.2 in debug mode to get more logging information?

Is it safe to change values in conf/logback.xml to raise the logger level, e.g. from INFO to DEBUG?
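
For reference, I assume the change would be something like this in conf/logback.xml (just a sketch based on a standard logback file; the appender name is a guess, not necessarily what cerebro ships with):

```xml
<!-- Sketch: raise the root logger from INFO to DEBUG.
     The appender-ref name "FILE" is an assumption about the existing config. -->
<root level="DEBUG">
  <appender-ref ref="FILE" />
</root>
```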

@lmenezes (Owner)

@sonfrau It should be safe to change the log level, although I must admit this might not make a lot of difference.

Do you have any error logs on your ES (for the first error posted)?

And for the second, can you give me the output of running these requests against your cluster:

/_cat/indices?format=json
/_snapshot
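
In case it's easier to script, something along these lines should do it using the requests library (just a sketch; the host and credentials are placeholders for whatever you use through your proxy):

```python
# Sketch: fetch the two diagnostic endpoints and print the raw responses.
# ES_HOST and AUTH are placeholders for the proxied cluster.
import requests

ES_HOST = "https://your-es-host:9200"  # placeholder
AUTH = ("username", "password")        # placeholder

for path in ("/_cat/indices?format=json", "/_snapshot"):
    response = requests.get(ES_HOST + path, auth=AUTH, timeout=60)
    print(path, "->", response.status_code)
    print(response.text[:2000])  # the first part of the body is usually enough
```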

sonfrau commented Dec 18, 2017

Good morning,

OK, then I'm not going to raise the level.

On our ES I don't see any particular errors. I'm sharing some of the latest logs, from the same time we were trying to perform actions in cerebro 0.7.2:


[2017-12-18T08:22:33,251][INFO ][o.e.c.r.a.AllocationService] [Node04] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[index-transactions-2017.12.18][3], [index-transactions-2017.12.18][2]] ...]).
[2017-12-18T08:22:33,539][INFO ][o.e.c.m.MetaDataMappingService] [Node04] [index-transactions-2017.12.18/YPPZ0I1HT2W4V8JjOn5-lw] create_mapping [logevent]
[2017-12-18T08:22:33,543][INFO ][o.e.c.m.MetaDataMappingService] [Node04] [index-core-development-2017.12.18/P04DiEi-Shedd2h6n5USdg] create_mapping [index-core-development]
[2017-12-18T08:23:01,587][INFO ][o.e.c.m.MetaDataMappingService] [Node04] [gateway-2017.12.18/022QYhN6QzGZE_xEHvNT4w] create_mapping [quote]
[2017-12-18T08:23:02,981][INFO ][o.e.c.m.MetaDataMappingService] [Node04] [gateway-2017.12.18/022QYhN6QzGZE_xEHvNT4w] update_mapping [quote]
[2017-12-18T08:23:24,339][INFO ][o.e.c.m.MetaDataMappingService] [Node04] [index-dev-core-cluster-2017.12.18/h0tFvY9kRIuRblKUUDtuaA] update_mapping [fluentd]
[2017-12-18T08:25:53,590][INFO ][o.e.c.m.MetaDataMappingService] [Node04] [index-transactions-2017.12.18/YPPZ0I1HT2W4V8JjOn5-lw] update_mapping [logevent]
[2017-12-18T08:28:48,861][INFO ][o.e.c.m.MetaDataMappingService] [Node04] [gateway-2017.12.18/022QYhN6QzGZE_xEHvNT4w] create_mapping [organizations]
[2017-12-18T08:28:51,002][INFO ][o.e.c.m.MetaDataMappingService] [Node04] [index-common-services-cluster-2017.12.18/wXwmnZeETYefS6WtIz6qmg] update_mapping [fluentd]
[2017-12-18T08:43:36,671][INFO ][o.e.c.m.MetaDataMappingService] [Node04] [index-2017.12.18/4PwY78qbStKoeygDxHw2eg] update_mapping [logevent]
[2017-12-18T08:46:45,947][INFO ][o.e.c.m.MetaDataMappingService] [Node04] [index-2017.12.18/4PwY78qbStKoeygDxHw2eg] update_mapping [logevent]
[2017-12-18T08:54:39,818][INFO ][o.e.c.m.MetaDataCreateIndexService] [Node04] [index-2017.12.18] creating index, cause [auto(bulk api)], templates [template_1, template-index], shards [5]/[2], mappings [_default_]
[2017-12-18T08:55:35,304][INFO ][o.e.c.r.a.AllocationService] [Node04] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[index-2017.12.18][4]] ...]).
[2017-12-18T08:55:35,540][INFO ][o.e.c.m.MetaDataMappingService] [Node04] [index-2017.12.18/MgGjNhO9R-6HoZRdzrscRA] create_mapping [logevent]
[2017-12-18T09:36:48,535][INFO ][o.e.c.m.MetaDataMappingService] [Node04] [index-audit-2017.12.18/j4e-aqO6RBOek--Ig2gHRg] update_mapping [logevent]
[2017-12-18T09:46:49,153][INFO ][o.e.c.m.MetaDataMappingService] [Node04] [index-audit-2017.12.18/j4e-aqO6RBOek--Ig2gHRg] update_mapping [logevent]

I attach screenshots of the output from your second request:

[screenshot: devops-749_3]

[screenshot: devops-749_4]

I guess that level of detail is enough for you; let me know if you need the whole output.

Thanks and kind regards.

sonfrau commented Dec 18, 2017

You should know that there is an Nginx proxy between cerebro 0.7.2 and our ES cluster.

We need that proxy to protect our ES cluster.

We have the same infrastructure protecting another ES cluster, but we haven't detected the same behaviour there.

Regards

sonfrau commented Dec 18, 2017

While we were checking this issue we saw, for the first time, a correct output:
[screenshot: devops-749_5]

That is to say, cerebro 0.7.2 is able to show the ES cluster. Maybe there is a response timeout problem, don't you think?

Maybe the response is too big and there isn't enough time to receive all the data.

Maybe there is some way to raise that response timeout. What do you think?

That ES cluster holds a lot of indices.
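
If it helps, this is roughly how I would test the timeout/size theory, by timing the same stats call cerebro makes, through the proxy (a sketch; host and credentials are placeholders):

```python
# Sketch: time the stats request cerebro's overview uses and measure the
# response size, to test the "response too big / too slow through the proxy"
# theory. PROXY_HOST and AUTH are placeholders.
import time
import requests

PROXY_HOST = "https://our-nginx-proxy"  # placeholder
AUTH = ("username", "password")         # placeholder

start = time.time()
response = requests.get(PROXY_HOST + "/_stats/docs,store", auth=AUTH, timeout=300)
elapsed = time.time() - start

print(f"status={response.status_code} bytes={len(response.content)} seconds={elapsed:.1f}")
```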

sonfrau commented Dec 18, 2017

Sorry, I hadn't seen your first message: "@sonfrau Can you see what the error message says? If you click on the message that should expand. If not, could you share the logs?"

This is the detail I get.

When I ask for the overview:

Error while loading data 
{
  "error": "Failure for [_stats/docs,store]"
}


When I ask for the snapshots:

Error loading repositories
{
"error": "JsResultException(errors:List((,List(ValidationError(List(error.expected.jsarray),WrappedArray())))))"
}


andrewkcarter commented Jan 11, 2018

I just started getting the exact same error in 0.7.2. In my case, it was immediately after I closed an index. With a closed index in the cluster, Cerebro fails to load the overview. I re-opened the index and the error went away.

[screenshot: screen shot 2018-01-11 at 10 14 07 am]

@lmenezes (Owner)

@andrewkcarter which version of ES are you running? Can't reproduce this with 6.X

@andrewkcarter (Contributor)

@lmenezes I'm testing on Elastic Cloud hosted ES 5.6.5 and 6.0.1, and I learned a little more about the conditions as I was just verifying the issue again. The index has to have one or more aliases to trigger the failure when closed. If the index has no aliases, then closing the index has no negative impact on Cerebro.

I also tested against ES 5.6.5, ES 6.0.1, and ES 6.1.1 running locally and I couldn't reproduce it. So my issue seems to be isolated to Elastic Cloud hosted clusters.
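
For reference, the reproduction I'm describing boils down to this (a sketch against a throwaway cluster; the host, credentials and alias name are just examples):

```python
# Sketch: close an index that has an alias, then hit the stats endpoint
# cerebro uses for the overview. HOST and AUTH are placeholders.
import requests

HOST = "https://example.found.io:9243"  # placeholder Elastic Cloud endpoint
AUTH = ("elastic", "password")          # placeholder

requests.put(f"{HOST}/test", auth=AUTH)                    # create a test index
requests.put(f"{HOST}/test/_alias/test-alias", auth=AUTH)  # give it an alias
requests.post(f"{HOST}/test/_close", auth=AUTH)            # close it

response = requests.get(f"{HOST}/_stats/docs,store", auth=AUTH)
print(response.status_code)  # 400 (index_closed_exception) on the failing clusters
print(response.json())
```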


andrewkcarter commented Jan 15, 2018

Here is the request which is failing in my case:

[debug] o.a.n.h.HttpHandler -

Request DefaultFullHttpRequest(decodeResult: success, version: HTTP/1.1, content: UnpooledUnsafeHeapByteBuf(freed))
GET /_stats/docs,store HTTP/1.1
Accept-Encoding: gzip,deflate
Host: *******************************
Authorization: ************************************
Accept: */*
User-Agent: AHC/2.0

Response DefaultHttpResponse(decodeResult: success, version: HTTP/1.1)
HTTP/1.1 400 Bad Request
Content-Type: application/json; charset=UTF-8
Date: Mon, 15 Jan 2018 01:55:26 GMT
Server: ***********
X-Found-Handling-Cluster: ************************
X-Found-Handling-Instance: ***************
X-Found-Handling-Server: ****************
Connection: keep-alive
Transfer-Encoding: chunked

with response body

{
  "error": {
    "root_cause": [
      {
        "type": "index_closed_exception",
        "reason": "closed",
        "index_uuid": "DSo67-IrSMiawsynB2e2pA",
        "index": "test"
      }
    ],
    "type": "index_closed_exception",
    "reason": "closed",
    "index_uuid": "DSo67-IrSMiawsynB2e2pA",
    "index": "test"
  },
  "status": 400
}

I also made the request directly, and got the same 400 response back. In fact all /_stats calls fail when an index with an alias is closed. What I've got going on seems to be an issue with EC, not Cerebro. I'm going to open a support ticket with them and ask what's up.


andrewkcarter commented Jan 15, 2018

Actually, this seems to be the solution: lmenezes/elasticsearch-kopf#372
I tested the query parameter and the stats call works. @lmenezes

I don't understand why it's only an issue for me with hosted ES clusters. Perhaps there is something Elastic configures in their hosted clusters that makes this happen.
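
Concretely, the difference is just that query parameter on the stats call (a quick sketch with placeholder host and credentials):

```python
# Sketch: the same stats call, with and without ignore_unavailable.
# With a closed, aliased index present, only the second call succeeds for me.
import requests

HOST = "https://example.found.io:9243"  # placeholder Elastic Cloud endpoint
AUTH = ("elastic", "password")          # placeholder

print(requests.get(f"{HOST}/_stats/docs,store", auth=AUTH).status_code)  # 400
print(requests.get(f"{HOST}/_stats/docs,store",
                   params={"ignore_unavailable": "true"},
                   auth=AUTH).status_code)                                # 200
```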

@andrewkcarter (Contributor)

@lmenezes, can we get the ignore_unavailable=true query parameter mentioned here ^ in your next release?

@andrewkcarter (Contributor)

Thanks!
