Master node gets killed during redis-benchmark without proper logs. #122

Open
sv6375261073 opened this issue Jul 12, 2023 · 2 comments

sv6375261073 commented Jul 12, 2023

-> Redis-cluster version: 0.15.0
-> Master/Slave resources
  Request:
    cpu: 1
    memory: 1Gi
  Limit:
    cpu: 2
    memory: 10Gi

Hi Team,

When we run redis-benchmark, the master node restarts automatically and the only trace in the log is a "Killed" message. With just that, we cannot determine the exact cause of the restart.

################### LOG OF RESTARTED MASTER POD ##################

kubectl logs redis-cluster-follower-2 -n ot-operators -p
E0712 19:20:23.945129 76790 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0712 19:20:25.084599 76790 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0712 19:20:25.349431 76790 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
Defaulted container "redis-cluster-follower" out of: redis-cluster-follower, redis-exporter
Running without TLS mode
Starting redis service in cluster mode.....
10:C 12 Jul 2023 13:46:27.237 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
10:C 12 Jul 2023 13:46:27.237 # Redis version=7.0.5, bits=64, commit=00000000, modified=0, pid=10, just started
10:C 12 Jul 2023 13:46:27.237 # Configuration loaded
10:M 12 Jul 2023 13:46:27.238 * monotonic clock: POSIX clock_gettime
10:M 12 Jul 2023 13:46:27.238 * Node configuration loaded, I'm 5991471e7d5ee1526607badc6a9164eb304546b0
10:M 12 Jul 2023 13:46:27.238 * Running mode=cluster, port=6379.
10:M 12 Jul 2023 13:46:27.238 # Server initialized
10:M 12 Jul 2023 13:46:27.240 * Loading RDB produced by version 7.0.5
10:M 12 Jul 2023 13:46:27.240 * RDB age 88407 seconds
10:M 12 Jul 2023 13:46:27.240 * RDB memory usage when created 1.65 Mb
10:M 12 Jul 2023 13:46:27.240 * Done loading RDB, keys loaded: 0, keys expired: 0.
10:M 12 Jul 2023 13:46:27.240 * DB loaded from disk: 0.000 seconds
10:M 12 Jul 2023 13:46:27.240 * Ready to accept connections
10:M 12 Jul 2023 13:46:27.246 * Replica 10.2.206.97:6379 asks for synchronization
10:M 12 Jul 2023 13:46:27.246 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for 'abc43f577a7ba63f4f52f602139fc2bb2b5b81a2', my replication IDs are '689a30c17b12760d95e4fd33f19cfe0d743290d3' and '846e7597232a975fd8d427100919ba33d27c7ef5')
10:M 12 Jul 2023 13:46:27.246 * Delay next BGSAVE for diskless SYNC
10:M 12 Jul 2023 13:46:28.315 # Address updated for node 0d13dfe1abcc1c4b33ebf1d96e16b6d230d32609, now 10.2.173.117:6379
10:M 12 Jul 2023 13:46:29.245 # Cluster state changed: ok
10:M 12 Jul 2023 13:46:29.566 # Address updated for node 5c2a062721b4e41e76929747955ba62fc3f01cab, now 10.2.189.11:6379
10:M 12 Jul 2023 13:46:32.255 * Starting BGSAVE for SYNC with target: replicas sockets
10:M 12 Jul 2023 13:46:32.255 * Background RDB transfer started by pid 23
23:C 12 Jul 2023 13:46:32.256 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB
10:M 12 Jul 2023 13:46:32.256 # Diskless rdb transfer, done reading from pipe, 1 replicas still up.
10:M 12 Jul 2023 13:46:32.260 * Background RDB transfer terminated with success
10:M 12 Jul 2023 13:46:32.260 * Streamed RDB transfer with replica 10.2.206.97:6379 succeeded (socket). Waiting for REPLCONF ACK from slave to enable streaming
10:M 12 Jul 2023 13:46:32.260 * Synchronization with replica 10.2.206.97:6379 succeeded
/usr/bin/entrypoint.sh: line 91: 10 Killed redis-server /etc/redis/redis.conf
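
The final `Killed` line is printed by the shell running `entrypoint.sh` when `redis-server` is terminated by SIGKILL, which is why Redis itself logs nothing about the shutdown. One possible way to see what Kubernetes recorded for the previous run is to read the container's last terminated state. This is only a sketch: the helper name `last_termination` is made up for illustration, and it assumes the kill was a memory-limit (OOM) kill, which the log above does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<reason> <exitCode>" for the previous run of a container.
# Usage: last_termination <pod> <namespace> <container>
last_termination() {
    local pod="$1" ns="$2" container="$3"
    kubectl get pod "$pod" -n "$ns" \
        -o jsonpath="{range .status.containerStatuses[?(@.name=='${container}')]}{.lastState.terminated.reason}{' '}{.lastState.terminated.exitCode}{end}"
}

# Example call, using the names from the log above:
#   last_termination redis-cluster-follower-2 ot-operators redis-cluster-follower
# A reason of OOMKilled, or an exit code of 137 (128 + 9, SIGKILL), would be
# consistent with the "Killed" line; any other result points elsewhere.
```

`kubectl describe pod redis-cluster-follower-2 -n ot-operators` shows the same information under "Last State" in a more readable form.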

################## CLIENT-SIDE LOG WITH THE REDIS-BENCHMARK COMMAND #############

root@ubuntu-deployment-5474b4864f-hg8zq:/# redis-benchmark -h redis-cluster-leader.ot-operators.svc  -p 6379 -a password -t get,set,lpush -c 1000 -n 3000000 -r 1000000 -d 102400 --cluster -l
Cluster has 3 master nodes:

Master 0: 2ded0507d646d251a338f1f0e0c63f9fd751a943 10.2.134.141:6379
Master 1: 5c2a062721b4e41e76929747955ba62fc3f01cab redis-cluster-leader.ot-operators.svc:6379
Master 2: 5991471e7d5ee1526607badc6a9164eb304546b0 10.2.196.248:6379

Error: Connection reset by peer
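
For context on how much data this command can generate, the following back-of-the-envelope calculation uses only the flags from the command above and the 10Gi memory limit from the resource spec. It is a rough sketch, not a diagnosis: it assumes an even slot distribution across the three masters, and it ignores Redis per-key overhead, replication buffers, and the list growth caused by `lpush`.

```shell
#!/usr/bin/env bash
# Values copied from the redis-benchmark command and the resource spec in this issue.
keyspace=1000000        # -r: upper bound on distinct random keys
payload_bytes=102400    # -d: value size in bytes
clients=1000            # -c: parallel connections
masters=3               # reported by redis-benchmark --cluster
limit_bytes=$((10 * 1024 * 1024 * 1024))   # memory limit: 10Gi per pod

gib=$((1024 * 1024 * 1024))

# Worst case for SET: every key in the keyspace holds one full-size value.
total_bytes=$((keyspace * payload_bytes))
per_master_bytes=$((total_bytes / masters))   # assumes an even slot distribution

# Payload bytes that can be in flight at once, one request per client.
inflight_bytes=$((clients * payload_bytes))

echo "keyspace worst case : $((total_bytes / gib)) GiB total, $((per_master_bytes / gib)) GiB per master"
echo "pod memory limit    : $((limit_bytes / gib)) GiB"
echo "in-flight payloads  : $((inflight_bytes / 1024 / 1024)) MiB"
```

Under these assumptions the worst-case keyspace per master is several times the pod's memory limit, so a smaller `-r` or `-d` would be one way to check whether the restart depends on data volume.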
@sv6375261073

@shubham-cmyk ,

I am facing this issue in version 0.15.3 as well. The pod crashes during the benchmark, but I cannot find the exact point of failure.

Any update here?


sv6375261073 commented Jul 18, 2023

Chart Version : 0.15.3

Benchmarking from a running pod in the cluster:
`# redis-benchmark installation
apt update && apt install -y redis`
root@ubuntu-deployment-5474b4864f-h75zv:/#

########################### REDIS BENCHMARKING ###############
COUNT=0
while [ $COUNT -lt 20 ]; 
do 
    echo "################ ITERATION : $COUNT ##############"; 
    redis-benchmark -h redis-cluster-leader.ot-operators.svc  -p 6379 -a password --cluster -t get,set,ping,sadd,hmset,incr,lpush -c 1000 -n 3000000 -r 1000000 -d 102400; 
    ((COUNT=COUNT+1))
done

############ MASTER POD CRASHED LOG #################

kubectl logs redis-cluster-follower-0 -n ot-operators -p
E0718 18:15:35.394017 60813 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0718 18:15:36.297639 60813 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0718 18:15:36.372458 60813 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
Defaulted container "redis-cluster-follower" out of: redis-cluster-follower, redis-exporter
Running without TLS mode
Starting redis service in cluster mode.....
10:C 18 Jul 2023 12:42:35.244 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
10:C 18 Jul 2023 12:42:35.244 # Redis version=7.0.5, bits=64, commit=00000000, modified=0, pid=10, just started
10:C 18 Jul 2023 12:42:35.244 # Configuration loaded
10:M 18 Jul 2023 12:42:35.244 * monotonic clock: POSIX clock_gettime
10:M 18 Jul 2023 12:42:35.245 * Node configuration loaded, I'm a6869e3a644f3c84a890b58cb06919c17d956f3f
10:M 18 Jul 2023 12:42:35.245 * Running mode=cluster, port=6379.
10:M 18 Jul 2023 12:42:35.245 # Server initialized
10:M 18 Jul 2023 12:42:35.246 * Loading RDB produced by version 7.0.5
10:M 18 Jul 2023 12:42:35.246 * RDB age 651 seconds
10:M 18 Jul 2023 12:42:35.246 * RDB memory usage when created 1.93 Mb
10:M 18 Jul 2023 12:42:35.246 * Done loading RDB, keys loaded: 0, keys expired: 0.
10:M 18 Jul 2023 12:42:35.246 * DB loaded from disk: 0.001 seconds
10:M 18 Jul 2023 12:42:35.246 * Ready to accept connections
10:M 18 Jul 2023 12:42:35.252 * Replica 10.2.171.188:6379 asks for synchronization
10:M 18 Jul 2023 12:42:35.252 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '9a64d9bd51f3730a510ee72590d4ab96b24391c1', my replication IDs are 'f36924802bcdb635e523c9da46d2761e0bebc7ba' and '4bf3e69ef1ab4ae8ca18c86b073f6f9157cb99aa')
10:M 18 Jul 2023 12:42:35.252 * Delay next BGSAVE for diskless SYNC
10:M 18 Jul 2023 12:42:37.253 # Cluster state changed: ok
10:M 18 Jul 2023 12:42:40.263 * Starting BGSAVE for SYNC with target: replicas sockets
10:M 18 Jul 2023 12:42:40.263 * Background RDB transfer started by pid 23
23:C 18 Jul 2023 12:42:40.264 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB
10:M 18 Jul 2023 12:42:40.264 # Diskless rdb transfer, done reading from pipe, 1 replicas still up.
10:M 18 Jul 2023 12:42:40.268 * Background RDB transfer terminated with success
10:M 18 Jul 2023 12:42:40.268 * Streamed RDB transfer with replica 10.2.171.188:6379 succeeded (socket). Waiting for REPLCONF ACK from slave to enable streaming
10:M 18 Jul 2023 12:42:40.268 * Synchronization with replica 10.2.171.188:6379 succeeded
10:M 18 Jul 2023 12:44:27.128 * Clear FAIL state for node fb8237edfda4f76fe6bfd2038a7898bb0fbb4597: replica is reachable again.
10:M 18 Jul 2023 12:44:36.657 * 10000 changes in 60 seconds. Saving...
10:M 18 Jul 2023 12:44:36.668 * Background saving started by pid 218
10:M 18 Jul 2023 12:44:37.177 # Client id=3 addr=10.2.171.188:37726 laddr=10.2.173.117:6379 fd=16 name= age=122 idle=0 flags=S db=0 sub=0 psub=0 ssub=0 multi=-1 qbuf=0 qbuf-free=20474 argv-mem=0 multi-mem=0 rbs=1024 rbp=0 obl=0 oll=2625 omem=268490264 tot-mem=268512536 events=rw cmd=replconf user=default redir=-1 resp=2 scheduled to be closed ASAP for overcoming of output buffer limits.
10:M 18 Jul 2023 12:44:37.178 # Connection with replica 10.2.171.188:6379 lost.
/usr/bin/entrypoint.sh: line 91: 10 Killed redis-server /etc/redis/redis.conf
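
The line logged at 12:44:37.177 shows the replica connection being closed for exceeding its output buffer limits, with `omem=268490264`. Redis ships with a default of `client-output-buffer-limit replica 256mb 64mb 60`; whether this deployment overrides it is not shown in the issue. The sketch below extracts `omem` from an abbreviated copy of that log line and compares it with the 256 MiB hard limit, which is an assumption here.

```shell
#!/usr/bin/env bash
# Abbreviated copy of the client line from the log above (only the fields used here).
line='Client id=3 addr=10.2.171.188:37726 flags=S oll=2625 omem=268490264 tot-mem=268512536 cmd=replconf'

# Assumption: the Redis default hard limit for replica clients (256mb); the
# actual value in this deployment's redis.conf is not shown in the issue.
hard_limit_bytes=$((256 * 1024 * 1024))

# Pull the number that follows "omem=".
omem=$(printf '%s\n' "$line" | sed -n 's/.* omem=\([0-9][0-9]*\).*/\1/p')

if [ "$omem" -gt "$hard_limit_bytes" ]; then
    echo "omem=$omem exceeds assumed hard limit $hard_limit_bytes by $((omem - hard_limit_bytes)) bytes"
else
    echo "omem=$omem is within assumed hard limit $hard_limit_bytes"
fi
```

The value actually in effect can be read from a running node with `redis-cli -a password config get client-output-buffer-limit`. A replica that falls this far behind suggests the master is buffering a large amount of write traffic in memory in addition to the dataset itself.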
