[🐛 BUG]: execTTL timeout elapsed #1992
Comments
Hey @cv65kr 👋 Thanks for the report 👍 Note for myself: load test with exec_ttl set. Build RR with the |
I am experiencing a similar problem on gRPC with exec_ttl.
rr version 2024.1.5 (build time: 2024-06-20T19:10:34+0000, go1.22.4), OS: linux, arch: amd64
@rustatian
{"level":"info","ts":1724864808550491636,"logger":"grpc","msg":"grpc server was started","address":"tcp://0.0.0.0:5002"}
{"level":"warn","ts":1724869469478675017,"logger":"server","msg":"worker stopped, and will be restarted","reason":"execTTL timeout elapsed","pid":148,"internal_event_name":"EventExecTTL","error":"worker_exec_with_timeout: ExecTTL: context canceled"}
{"level":"error","ts":1724869469478755068,"logger":"grpc","msg":"method call was finished with error","error":"rpc error: code = Internal desc = worker_exec_with_timeout: ExecTTL: context canceled","method":"/my.Service/HealthCheck","start":1724869468475914858,"elapsed":1002}
{"level":"warn","ts":1724890028555570695,"logger":"server","msg":"worker doesn't respond on stop command, killing process","PID":742}
{"level":"error","ts":1724890028555645505,"logger":"server","msg":"no free workers in the pool, wait timeout exceed","reason":"no free workers","internal_event_name":"EventNoFreeWorkers","error":"worker_watcher_get_free_worker: NoFreeWorkers:\n\tcontext canceled"}
{"level":"error","ts":1724890028555717208,"logger":"grpc","msg":"method call was finished with error","error":"rpc error: code = Internal desc = static_pool_exec: NoFreeWorkers:\n\tstatic_pool_exec:\n\tworker_watcher_get_free_worker:\n\tcontext canceled","method":"/my.Service/HealthCheck","start":1724890018477624974,"elapsed":10078}
That's the log on a pod with no traffic, just health checks every 5 seconds.
grpc:
  listen: tcp://0.0.0.0:5002
  max_concurrent_streams: 10
  max_connection_age: 0s
  max_connection_age_grace: 0s8h
  max_connection_idle: 0s
  max_recv_msg_size: 50
  max_send_msg_size: 50
  ping_time: 2h
  pool:
    allocate_timeout: 30s
    destroy_timeout: 60s
    max_jobs: 10000
    num_workers: 8
    supervisor:
      exec_ttl: 1m
      idle_ttl: 2m
      max_worker_memory: 100
      ttl: 1h
      watch_tick: 1s
  proto: ../my.service/service.proto
  timeout: 1h
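For anyone comparing these numbers against `exec_ttl`, here is a small sketch that recomputes the request duration from the two `grpc` error lines in the log above. It assumes `ts` and `start` are Unix timestamps in nanoseconds and `elapsed` is in milliseconds, which is what the values suggest but is not confirmed here. The log lines are trimmed to the fields the script uses, and `elapsed_ms` and `EXEC_TTL_MS` are names made up for this example.

```python
import json

# The two "method call was finished with error" lines from the log above,
# trimmed to the fields used here (the values themselves are unchanged).
LOG_LINES = [
    '{"ts":1724869469478755068,"method":"/my.Service/HealthCheck",'
    '"start":1724869468475914858,"elapsed":1002}',
    '{"ts":1724890028555717208,"method":"/my.Service/HealthCheck",'
    '"start":1724890018477624974,"elapsed":10078}',
]

# exec_ttl: 1m from the supervisor config, expressed in milliseconds.
EXEC_TTL_MS = 60_000


def elapsed_ms(record: dict) -> int:
    """Duration between request start and the log entry, in whole milliseconds.

    Assumes both fields are nanosecond Unix timestamps.
    """
    return (record["ts"] - record["start"]) // 1_000_000


for line in LOG_LINES:
    record = json.loads(line)
    computed = elapsed_ms(record)
    print(
        f'{record["method"]}: computed={computed}ms '
        f'logged={record["elapsed"]}ms '
        f'below_exec_ttl={computed < EXEC_TTL_MS}'
    )
```

Under those assumptions the computed values match the logged `elapsed` fields, and both requests ended well below the configured `exec_ttl` of 1m, including the one that failed with `ExecTTL: context canceled`.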
Thank you, guys, for the comments. I hope to be able to dig into this problem next week. I will keep you all posted.
No duplicates 🥲.
What happened?
On the metrics screen you can see that the average request time is 5ms, but at random points requests start queuing up, and the logs show these messages:
My config:
It's pretty interesting because the elapsed time for the request is 9 sec. I checked the endpoint logic, and it is a simple query that does not appear in any slow log of the database.
Version (rr --version)
v2024.1.5
How to reproduce the issue?
I didn't find a way; it happens randomly.
Relevant log output
No response