Monitoring watchdog for gearman #315
Comments
Well, 1.1.15 is a very old version. The latest release is version 1.1.19.1. I suggest looking into upgrading. That said, I doubt there have been any significant improvements to the persistent storage options. While not officially deprecated, active development on that feature has tailed off, and it is in maintenance mode. The consensus best practice instead is to implement persistent storage as tasks that your workers employ. There are two frameworks for implementing such a system: one is called Gearstore and the other is called Garavini. You might want to look into them. Just Google those words with Gearman to find more information. It's also easy to implement your own persistent storage tasks once you understand the design pattern.
Your gearmand process weighing 20-30 GB sounds like the real problem. I don't use gearmand's persistent storage feature personally, but that could be a memory leak. As a bandage, you might want to consider restarting your gearmand process periodically (daily, perhaps). If you find a memory leak in gearmand, patches and/or more detailed information are welcome, of course.
Apologies for interrupting this thread (he he). Do you have a reference (documentation, code, whatnot) to this
@anderslauri asked:
It's just a task that returns the string "pong". I believe I said that. If you already have workers who have registered other tasks with gearmand, it couldn't be more trivial to implement. It's basically just like the reverse string example, but it's even simpler. If the job doesn't return "pong", the healthcheck fails and Docker will restart my gearmand container.
HEALTHCHECK --interval=5m --timeout=3s --retries=2 \
    CMD test $(/path/to/ping_test | grep -c 'pong') -eq 1 || exit 1
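For readers wondering what a script like /path/to/ping_test might contain, here is a minimal sketch, not the commenter's actual script. It assumes the `gearman` command-line client is installed and that some worker has registered a task named `ping` that returns "pong" (for instance, a worker started with `gearman -w -f ping -- echo pong`). The task name, the 3000 ms timeout, the `GEARMAN_HOST`/`GEARMAN_PORT` variables, and the function names are all hypothetical.

```shell
#!/bin/sh
# Hypothetical ping_test helper: submit a "ping" job and check the reply.
# Assumption: a worker has registered a task named "ping" that returns "pong".

GEARMAN_HOST="${GEARMAN_HOST:-127.0.0.1}"
GEARMAN_PORT="${GEARMAN_PORT:-4730}"

# submit_ping: send an empty job to the "ping" task and print the result.
# -s submits the job without reading stdin; -t is a timeout in milliseconds.
submit_ping() {
    gearman -h "$GEARMAN_HOST" -p "$GEARMAN_PORT" -t 3000 -s -f ping
}

# ping_ok: succeed only if the job ran and exactly one reply line contains "pong".
ping_ok() {
    reply=$(submit_ping 2>/dev/null) || return 1
    [ "$(printf '%s\n' "$reply" | grep -c 'pong')" -eq 1 ]
}
```

Ending the script with a call to `submit_ping` would print the reply for the `grep -c 'pong'` in the HEALTHCHECK above, while `ping_ok` performs the same comparison inside the script. Because the job has to travel through gearmand to a worker and back, this checks the whole path rather than just the listening socket.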
That is great, thank you.
If you are using Gearman in a container, well, I just wouldn't. I would never recommend it: in many cases Gearman is as important as the O/S to what you are doing, and all containers rely on some other service being active before they can start on reboot. So you are asking for problems if you go that way. But anyway, you mention systemd. You can set the jobs up in systemd to recover from any timeouts and to start a bunch of workers as needed, for example: gearman@{1..5}.service. Then you just set WantedBy/target in systemd and scope it out properly. To be fair, systemd is pretty rock solid, so you can't go far wrong as long as you feed it the proper info, and avoid containers if it is important.
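The gearman@{1..5}.service notation refers to instances of a systemd template unit. A small sketch of starting several such instances follows. It assumes a template file named gearman@.service already exists, which the comment implies but does not show; the helper names `worker_units` and `start_workers` are hypothetical.

```shell
#!/bin/sh
# Sketch: start N worker instances of an assumed gearman@.service template.

# worker_units N: print the N instance names, space separated.
worker_units() {
    count=$1
    i=1
    units=""
    while [ "$i" -le "$count" ]; do
        units="$units gearman@$i.service"
        i=$((i + 1))
    done
    # Drop the leading space left by the first append.
    echo "${units# }"
}

# start_workers N: enable and start N instances (requires root).
start_workers() {
    # Deliberately unquoted so each unit name becomes its own argument.
    systemctl enable --now $(worker_units "$1")
}
```

Calling `start_workers 5` would then be equivalent to the bash one-liner `systemctl enable --now gearman@{1..5}.service`, without relying on brace expansion.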
As an example on systemd: I put a delayed start on the workers to make sure everything is available first:
[Unit]
[Service]
KillMode=process
ExecStart=/your-worker gearman_worker
[Install]
@tomcoxcreative wrote:
I couldn't disagree more. I've been using gearmand in a Docker container on a production system for 6 years or so, and I wouldn't use gearmand any other way. I think container technology is here to stay in modern IT infrastructure and operations. The advantages are too great, and I personally haven't encountered any downsides.
@esabol that's interesting. I may need to reconsider my way of thinking around containers in that case. I've always avoided using them where possible.
Hello!
My gearmand (1.1.15) sometimes freezes, and I want to monitor such cases and restart it. I have made a systemd service with a watchdog that starts gearmand along with a watchdog bash script, which every N seconds calls
gearadmin --status
and sends a signal to systemd. That all works, but my gearmand uses persistent storage that sometimes weighs 20-30 GB, so gearmand needs time to read the data and initialize, and
gearadmin --status
does not work during that time. Is it possible to tell from outside gearmand whether everything is OK with it at that moment? Thanks in advance!
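One possible way to separate "still loading the queue" from "frozen" is to treat the first successful `gearadmin --status` as the readiness signal and to send watchdog keep-alives only after that. The sketch below is an assumption-laden illustration and has not been tested against gearmand. It assumes a unit with `Type=notify`, `NotifyAccess=all`, a `WatchdogSec` value, and a `TimeoutStartSec` long enough to cover the initial load; it relies on systemd's documented behaviour that the watchdog is activated only once start-up completes. The function names, variables, and interval values are hypothetical.

```shell
#!/bin/sh
# Sketch of a watchdog helper that defers READY=1 until gearmand answers.
# Assumptions: Type=notify, NotifyAccess=all, WatchdogSec set in the unit.

STATUS_TIMEOUT="${STATUS_TIMEOUT:-5}"    # seconds to wait for gearadmin
CHECK_INTERVAL="${CHECK_INTERVAL:-10}"   # seconds between checks

# gearmand_responds: succeed if gearadmin --status answers in time.
gearmand_responds() {
    timeout "$STATUS_TIMEOUT" gearadmin --status >/dev/null 2>&1
}

# notify: thin wrapper around systemd-notify, kept quiet.
notify() {
    systemd-notify "$@" >/dev/null 2>&1
}

# watchdog_step STATE: run one check and print the resulting state.
#   starting: stay "starting" until the first success, then send READY=1.
#   running:  send WATCHDOG=1 on success; on failure send nothing, so the
#             WatchdogSec timer in systemd is left to expire.
watchdog_step() {
    state=$1
    if gearmand_responds; then
        if [ "$state" = starting ]; then
            notify --ready
        else
            notify WATCHDOG=1
        fi
        echo running
    else
        echo "$state"
    fi
}

# watchdog_loop: intended entry point, run alongside gearmand.
watchdog_loop() {
    state=starting
    while :; do
        state=$(watchdog_step "$state")
        sleep "$CHECK_INTERVAL"
    done
}
```

With this arrangement, a slow start is bounded by `TimeoutStartSec` rather than by the watchdog interval, so a 20-30 GB load does not trigger a restart as long as it finishes within that limit. `CHECK_INTERVAL` should stay comfortably below `WatchdogSec`.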