Scylla version: 6.3.0~dev-20241101.19a43b58599c with build-id 8db6e96d95afbd2686d3ceaa32521f5696ac3b23
Kernel Version: 6.8.0-1017-azure
Issue description
It is unknown whether this issue is a regression.
cassandra-stress (c-s) had reported validation errors ~2h before this,
and at some point, while some nodes were down, the stress command failed without a clear reason.
It looks like a cassandra-stress internal error.
WARN [cluster1-nio-worker-7] 2024-11-03 06:52:33,299 Connection.java:284 - Error creating netty channel to /10.0.0.5:9042
com.datastax.shaded.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /10.0.0.5:9042
Caused by: java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
at com.datastax.shaded.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)
at com.datastax.shaded.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
at com.datastax.shaded.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at com.datastax.shaded.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at com.datastax.shaded.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Unknown Source)
WARN [cluster1-nio-worker-1] 2024-11-03 06:57:16,626 Connection.java:284 - Error creating netty channel to /10.0.0.8:9042
com.datastax.shaded.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: /10.0.0.8:9042
Caused by: java.net.NoRouteToHostException: No route to host
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
at com.datastax.shaded.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)
at com.datastax.shaded.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
at com.datastax.shaded.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at com.datastax.shaded.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at com.datastax.shaded.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Unknown Source)
FAILURE
java.lang.RuntimeException: Failed to execute stress action
2 c-s processes failed and exited with exit code 1. It looks like this happened because c-s lost its connection to 2 nodes:
10.0.0.8 - which was decommissioned earlier
10.0.0.5 - which was stopped
This happened on 3 of the 4 loader nodes,
while the c-s processes were finishing their execution.
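The two warnings in the log above correspond to two different failure modes: `ConnectException: Connection refused` (the host is up but nothing listens on 9042, i.e. the stopped node 10.0.0.5) and `NoRouteToHostException` (the host itself is gone, i.e. the decommissioned node 10.0.0.8). A minimal probe to distinguish the two cases from a loader node could look like the sketch below; the IPs and port are taken from this run, and `probe` is a hypothetical helper, not part of SCT or cassandra-stress:

```python
import errno
import socket

def probe(host: str, port: int = 9042, timeout: float = 3.0) -> str:
    """Classify why a CQL port is unreachable, mirroring the two
    driver errors seen in the c-s logs."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except ConnectionRefusedError:
        # host up, scylla stopped -- ConnectException in the Java driver
        return "refused"
    except OSError as e:
        if e.errno == errno.EHOSTUNREACH:
            # host gone, e.g. decommissioned -- NoRouteToHostException
            return "no route"
        return "other error"

# e.g. probe("10.0.0.5") -> "refused", probe("10.0.0.8") -> "no route"
# (on a machine with nothing listening on 9042, localhost is refused too)
print(probe("127.0.0.1", 9042))
```

Either outcome makes the driver mark the host down, so the open question is why c-s treated this as a fatal internal error instead of routing requests to the remaining live nodes.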
Impact
Describe the impact this issue causes to the user.
How frequently does it reproduce?
Describe the frequency with how this issue can be reproduced.
OS / Image: /subscriptions/6c268694-47ab-43ab-b306-3c5514bc4112/resourceGroups/SCYLLA-IMAGES/providers/Microsoft.Compute/images/scylla-6.3.0-dev-x86_64-2024-11-02T02-16-27 (azure: undefined_region)
Test: longevity-1tb-5days-azure-test
Test id: 327f9b62-4806-4c0d-9e65-d49b897e6bc8
Test name: scylla-master/tier1/longevity-1tb-5days-azure-test
Test method: longevity_test.LongevityTest.test_custom_time
Test config file(s):
Installation details
Cluster size: 4 nodes (Standard_L16s_v3)
Scylla Nodes used in this run:
Logs and commands
$ hydra investigate show-monitor 327f9b62-4806-4c0d-9e65-d49b897e6bc8
$ hydra investigate show-logs 327f9b62-4806-4c0d-9e65-d49b897e6bc8
Logs:
Jenkins job URL
Argus