Unexpected client session timed out #82

Open
neterium opened this issue Jun 12, 2019 · 3 comments

Comments

@neterium

I'm using version 3.6.3, with ZooKeeper as the cluster manager.
I've started putting some pressure on our servers by increasing the workload. In terms of response time, everything is OK: the event bus seems to absorb the traffic, there are no "Thread Blocked" events, etc. (The number of verticles is #cores - 1.)
However, after a while it looks like the connection to ZK is lost:

2019-06-12 10:15:49.981  WARN 1 --- [.internal:2181)] org.apache.zookeeper.ClientCnxn          : Client session timed out, have not heard from server in 34816ms for sessionid 0x100000048e50034
2019-06-12 10:15:49.981  WARN 1 --- [.internal:2181)] org.apache.zookeeper.ClientCnxn          : Client session timed out, have not heard from server in 34839ms for sessionid 0x100000048e50035
2019-06-12 10:15:49.981  WARN 1 --- [.internal:2181)] org.apache.zookeeper.ClientCnxn          : Client session timed out, have not heard from server in 34789ms for sessionid 0x100000048e50033
2019-06-12 10:15:49.981  WARN 1 --- [.internal:2181)] org.apache.zookeeper.ClientCnxn          : Client session timed out, have not heard from server in 29528ms for sessionid 0x100000048e5003b
2019-06-12 10:15:50.082  WARN 1 --- [tor-TreeCache-0] i.v.s.c.zookeeper.impl.ZKAsyncMultiMap   : connection to the zookeeper server have suspended.
2019-06-12 10:15:50.082 ERROR 1 --- [worker-thread-5] i.v.s.c.z.ZookeeperClusterManager        : java.lang.IllegalStateException: Not acquired
2019-06-12 10:15:51.954  WARN 1 --- [.internal:2181)] org.apache.zookeeper.ClientCnxn          : Unable to reconnect to ZooKeeper service, session 0x100000048e5003b has expired
2019-06-12 10:15:51.955 ERROR 1 --- [orker-thread-15] i.v.s.c.z.ZookeeperClusterManager        : java.lang.IllegalStateException: Not acquired
2019-06-12 10:15:51.955  WARN 1 --- [d-0-EventThread] org.apache.curator.ConnectionState       : Session expired event received
2019-06-12 10:15:51.957 ERROR 1 --- [tor-TreeCache-0] i.v.s.c.zookeeper.impl.ZKAsyncMultiMap   : connection to the zookeeper server have lost, all the temporary node will be remove.
2019-06-12 10:15:51.993  INFO 1 --- [ntloop-thread-0] i.v.s.c.zookeeper.impl.ZKAsyncMultiMap   : restore eventbus snapshot cache success.

How can I prevent this from happening?

Thanks

@neterium
Author

It looks like this comes from a "stop the world" major GC at that time. I can fine-tune the GC settings, but is there a way to increase the session timeout?
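
On the session timeout question, a rough sketch of the arithmetic may help. It assumes standard ZooKeeper behaviour: the server clamps the timeout the client requests to the range [minSessionTimeout, maxSessionTimeout], which default to 2x and 20x the server's tickTime, and the client logs "Client session timed out" once it has heard nothing for about two thirds of the negotiated value. The tickTime of 2000 ms and the requested values below are assumptions for illustration, not values taken from this issue. The cluster manager's own setting is, as far as I recall, a `sessionTimeout` entry (in milliseconds) in its JSON configuration; please check that key against the documentation of the version in use.

```java
// Sketch only: models how a requested ZooKeeper session timeout is bounded
// by the server. Class and method names are hypothetical, not a library API.
final class SessionTimeoutSketch {

    /** Clamp the requested timeout to the server defaults [2 * tickTime, 20 * tickTime]. */
    static int negotiated(int requestedMs, int tickTimeMs) {
        int min = 2 * tickTimeMs;   // default minSessionTimeout
        int max = 20 * tickTimeMs;  // default maxSessionTimeout
        return Math.max(min, Math.min(max, requestedMs));
    }

    /** Silence tolerated by the client before it reports a timeout (2/3 of the session). */
    static int readTimeout(int negotiatedMs) {
        return negotiatedMs * 2 / 3;
    }

    public static void main(String[] args) {
        int tickTimeMs = 2000; // assumed server tickTime
        for (int requestedMs : new int[] {20_000, 60_000}) {
            int n = negotiated(requestedMs, tickTimeMs);
            System.out.printf("requested=%d ms, negotiated=%d ms, readTimeout=%d ms%n",
                    requestedMs, n, readTimeout(n));
        }
    }
}
```

Under these assumptions, asking for 60 s only yields 40 s unless maxSessionTimeout (or tickTime) is also raised on the ZooKeeper servers, so a client-side change alone may not be enough to ride out a pause of 30 s or more.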

@stream-iori
Contributor

stream-iori commented Jun 12, 2019 via email

@Viking18

Viking18 commented Jan 19, 2020

> Looks like it comes from a "stop the world" major GC at this time. I can finetune the GC settings, but is there a way to increase the session timeout?

@neterium I'm puzzled by a ZK disconnect problem in my own project. Why do you think it comes from a major GC? Can a "stop the world" major GC really last for 29528 ms?

I'd appreciate your reply :)
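
One way to test the GC hypothesis rather than guess is to measure stalls from inside the JVM. In the log above, all four "Client session timed out" warnings are written at the same millisecond (10:15:49.981), which is consistent with the whole process having been frozen, but it does not prove that GC was the cause. The sketch below is a minimal stall detector under stated assumptions: it sleeps for a short interval, measures how late it wakes up, and compares that with the cumulative collection time reported by the standard `GarbageCollectorMXBean`s. The class name, interval and threshold are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch only: detects process-wide stalls and shows how much GC time accrued meanwhile.
final class PauseDetectorSketch {

    /** Sleep for intendedMs and return how many ms later than intended the thread resumed. */
    static long overshootMillis(long intendedMs) {
        long start = System.nanoTime();
        try {
            Thread.sleep(intendedMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        return Math.max(0, elapsedMs - intendedMs);
    }

    /** Sum of accumulated collection time over all collectors, in ms. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Run a bounded number of checks and return how many stalls exceeded thresholdMs. */
    static int watch(int iterations, long intervalMs, long thresholdMs) {
        int stalls = 0;
        long gcBefore = totalGcMillis();
        for (int i = 0; i < iterations; i++) {
            long late = overshootMillis(intervalMs);
            long gcNow = totalGcMillis();
            if (late > thresholdMs) {
                stalls++;
                System.out.printf("stall of ~%d ms, GC time accrued meanwhile: %d ms%n",
                        late, gcNow - gcBefore);
            }
            gcBefore = gcNow;
        }
        return stalls;
    }

    public static void main(String[] args) {
        // Hypothetical settings: check every 100 ms for about a minute, flag stalls above 1 s.
        watch(600, 100, 1000);
    }
}
```

If a stall of many seconds shows up with little GC time attached, other causes such as swapping or CPU throttling of the container are worth checking. The JVM's own GC logs (for example `-Xlog:gc*` on JDK 9 and later) give the authoritative pause durations. Note that the collection time from the MXBean is accumulated elapsed time, which for concurrent collectors is not the same as stop-the-world pause time.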
