This issue documents all the custom AWX code that is integrated into the dispatcher, and tracks the implementation of callbacks that could support it. I'm taking the approach of a code walkthrough.
I'll list check boxes for all the items, which I tried to keep as separate topics.
- Main Process Before Loop
- Run Task on Startup
- Task Schedule with Main Process Stats
- Broker self-check and recycle
- Corner case of AWX exception handling
- Worker Process Startup
- Worker Process Shutdown
Main Process Startup
Obviously this is all synchronous code... which we could call before starting the async loop.
It is confusing that this lives with the worker code, because it actually runs in the main process. There appears to be a distinction between this task and the other startup task, because this one happens after we have a database connection. Also, this isn't how we would want to do this.
Since this runs general / misc. logic, I would suggest this becomes a "run task on startup" callback, meaning that it would run in a worker, after all system checks.
Task Schedule with Main Process Stats
This logic would be best moved into the worker. The difficulty is getting the stats from the main process into the worker, but we would still strongly prefer that the connection to Redis be managed in the worker.
```python
# It's frustrating that we have to do this, but the python k8s
# client leaves behind cacert files in /tmp, so we must clean up
# the tmpdir per-dispatcher process every time a new task comes in
try:
    kube_config._cleanup_temp_files()
except Exception:
    logger.exception('failed to cleanup k8s client tmp files')
```
This would be a fairly straightforward callback, just like the worker startup callback.
Worker exception handling
Worker exception logic does:
```python
if getattr(exc, 'is_awx_task_error', False):
    # Error caused by user / tracked in job output
    logger.warning("{}".format(exc))
```
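As a rough sketch of how this check could become a pluggable hook, the function below classifies an exception by the same marker attribute. The names `UserFacingTaskError` and `handle_task_exception` are made up for illustration, and the behavior of the non-marked branch is an assumption, since the snippet above only shows the warning case.

```python
import logging

logger = logging.getLogger("dispatcher.sketch")


class UserFacingTaskError(Exception):
    """Hypothetical error type carrying the marker attribute the worker checks."""

    is_awx_task_error = True


def handle_task_exception(exc: Exception) -> str:
    """Log the exception at a level that depends on the marker attribute."""
    if getattr(exc, "is_awx_task_error", False):
        # Error caused by user / tracked in job output: a warning is enough.
        logger.warning("%s", exc)
        return "warning"
    # Assumption: anything else is unexpected and is logged with a traceback.
    logger.error("unexpected task failure", exc_info=exc)
    return "error"
```

Keeping the decision in one function means a dispatcher library would only need an "on task exception" callback slot, and the AWX-specific attribute name stays on the AWX side.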
Worker callbacks, errbacks
This is an exceptional case: these are task-level callbacks.
And
```python
for callback in body.get('errbacks', []) or []:
    callback['uuid'] = body['uuid']
    self.perform_work(callback)
```
And
```python
for callback in body.get('callbacks', []) or []:
    callback['uuid'] = body['uuid']
    self.perform_work(callback)
```
These are submitted as part of the `.apply_async` call, and the confusing thing here is that these are implemented in the worker. My belief is that this is a holdover from Celery that never got fully implemented as one would expect. So I would suggest the options are:
- Support callbacks & errbacks as Celery did, which means they are handled in the main process
- Hack the specific case for job running, which is fairly trivial
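To make the first option more concrete, here is a minimal sketch of how the main process might turn a finished task's `callbacks` or `errbacks` into follow-up messages, instead of the worker calling `perform_work` on them directly. The function name `follow_up_messages` and the `succeeded` flag are hypothetical; only the `callbacks`, `errbacks`, and `uuid` keys come from the snippets above.

```python
from typing import Dict, List


def follow_up_messages(body: Dict, succeeded: bool) -> List[Dict]:
    """Build the messages the main process would dispatch after a task ends."""
    key = "callbacks" if succeeded else "errbacks"
    follow_ups = []
    for callback in body.get(key, []) or []:
        message = dict(callback)        # copy so the original body is not mutated
        message["uuid"] = body["uuid"]  # same uuid propagation as the worker code
        follow_ups.append(message)
    return follow_ups
```

The main process would then submit each returned message like any other task, so the worker no longer needs to know about callbacks at all.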
Main Process Startup
Obviously this is all synchronous code... which we could call before starting the async loop.
https://github.com/ansible/awx/blob/764dcbf94b51f9b55b467824208e4c8b9dc15786/awx/main/management/commands/run_dispatcher.py#L66
It also runs logic to get the settings. So if the dispatcher is not called from AWX code, this would need some custom callback.
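One possible shape for such a callback, sketched with invented names (`on_main_startup`, `load_settings`, `run_dispatcher` here are not existing AWX or dispatcher functions): synchronous hooks are registered ahead of time and called before the event loop is created.

```python
import asyncio
from typing import Callable, List

# Hypothetical registry of synchronous hooks to run before the async loop.
startup_hooks: List[Callable[[], None]] = []
events: List[str] = []


def on_main_startup(func: Callable[[], None]) -> Callable[[], None]:
    """Register a synchronous function to run before the event loop starts."""
    startup_hooks.append(func)
    return func


@on_main_startup
def load_settings() -> None:
    # Stand-in for the AWX-specific settings and setup logic.
    events.append("settings loaded")


async def main_loop() -> None:
    events.append("loop running")


def run_dispatcher() -> None:
    for hook in startup_hooks:
        hook()  # plain synchronous calls; no event loop exists yet
    asyncio.run(main_loop())


run_dispatcher()
```

Because the hooks run before `asyncio.run`, they can block freely without needing to be rewritten as coroutines.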
Main Process Startup (Run Task on Startup)
We already have a callback on the worker side.
https://github.com/ansible/awx/blob/764dcbf94b51f9b55b467824208e4c8b9dc15786/awx/main/dispatch/worker/task.py#L141
However, it is called by the main process
https://github.com/ansible/awx/blob/764dcbf94b51f9b55b467824208e4c8b9dc15786/awx/main/dispatch/worker/base.py#L237
It is confusing that this lives with the worker code, because it actually runs in the main process. There appears to be a distinction between this task and the other startup task, because this one happens after we have a database connection. Also, this isn't how we would want to do this.
Since this runs general / misc. logic, I would suggest this becomes a "run task on startup" callback, meaning that it would run in a worker, after all system checks.
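A minimal sketch of that idea, assuming a hypothetical `task` decorator with a `run_on_startup` flag: the main process only enqueues the startup tasks once its checks pass, and a worker executes them like any other message. A plain `queue.Queue` stands in for the real main-to-worker channel, and `system_checks_pass` is a placeholder.

```python
import queue
from typing import Callable, Dict, List

# Hypothetical registries; none of these names come from AWX.
TASKS: Dict[str, Callable[[], str]] = {}
STARTUP_TASKS: List[str] = []


def task(run_on_startup: bool = False):
    """Register a task, optionally marking it to run once at startup."""
    def decorator(func: Callable[[], str]) -> Callable[[], str]:
        TASKS[func.__name__] = func
        if run_on_startup:
            STARTUP_TASKS.append(func.__name__)
        return func
    return decorator


@task(run_on_startup=True)
def startup_housekeeping() -> str:
    return "startup work done"


def system_checks_pass() -> bool:
    return True  # placeholder for database / broker checks


def main_process_start(work_queue: queue.Queue) -> None:
    """Main process: run checks, then hand startup tasks to the workers."""
    if not system_checks_pass():
        raise RuntimeError("system checks failed")
    for name in STARTUP_TASKS:
        work_queue.put({"task": name})


def worker_drain(work_queue: queue.Queue) -> List[str]:
    """Worker: execute queued messages by looking up the registered task."""
    results = []
    while not work_queue.empty():
        message = work_queue.get()
        results.append(TASKS[message["task"]]())
    return results
```

With this layout the main process never runs the general-purpose logic itself; it only decides when the startup tasks become eligible.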
Task Schedule with Main Process Stats
This logic would be best moved into the worker. The difficulty is getting the stats from the main process into the worker, but we would still strongly prefer that the connection to Redis be managed in the worker.
https://github.com/ansible/awx/blob/764dcbf94b51f9b55b467824208e4c8b9dc15786/awx/main/dispatch/worker/base.py#L202
The need for main->worker communication is described in #6
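One way this could look, purely as a sketch: the main process snapshots its stats into an ordinary task message, and the worker is the only side that writes to Redis. `FakeRedis`, `build_stats_message`, `record_stats`, and the `dispatcher_stats` key are all invented for this example; a real client would replace the stand-in class.

```python
import json
import time
from typing import Dict


class FakeRedis:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self.data[key] = value


def build_stats_message(pool_stats: dict) -> dict:
    """Main process: embed a stats snapshot in a normal task message."""
    return {"task": "record_stats", "kwargs": {"stats": pool_stats, "ts": time.time()}}


def record_stats(conn: FakeRedis, stats: dict, ts: float) -> None:
    """Worker: the only place that touches the Redis connection."""
    conn.set("dispatcher_stats", json.dumps({"stats": stats, "ts": ts}))


message = build_stats_message({"workers": 4, "queued": 2})
conn = FakeRedis()
record_stats(conn, **message["kwargs"])
```

The stats travel as plain JSON-serializable data, so the main process needs no Redis connection of its own.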
Main Process Periodic
The health of the Postgres connection is also tracked periodically. This would require a new, ground-up solution in the main async code.
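A ground-up version might be little more than a coroutine that awaits a check on an interval. The sketch below uses made-up names (`run_periodic`, `fake_db_check`) and a `max_runs` bound so that it terminates; a real loop would presumably run until the dispatcher shuts down and would react to a failed check rather than just recording it.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_periodic(
    check: Callable[[], Awaitable[bool]], interval: float, max_runs: int
) -> List[bool]:
    """Await `check` every `interval` seconds and collect the outcomes."""
    results = []
    for _ in range(max_runs):
        try:
            results.append(await check())
        except Exception:
            results.append(False)  # treat an exception as an unhealthy result
        await asyncio.sleep(interval)
    return results


async def fake_db_check() -> bool:
    # Placeholder for something like running "SELECT 1" on the connection.
    return True


results = asyncio.run(run_periodic(fake_db_check, interval=0.01, max_runs=3))
```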
Worker Process Startup
The signal handling and other generic setup becomes internal to the dispatcher (the logic is moved here), but the following is highly AWX-specific.
https://github.com/ansible/awx/blob/764dcbf94b51f9b55b467824208e4c8b9dc15786/awx/main/dispatch/worker/base.py#L295
This would require a callback called on the worker process startup.
Worker Process Shutdown
In a completely different location we have this on worker process shutdown.
https://github.com/ansible/awx/blob/764dcbf94b51f9b55b467824208e4c8b9dc15786/awx/main/dispatch/worker/task.py#L127-L133
This would be a fairly straightforward callback, just like the worker startup callback.
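Both worker lifecycle callbacks could share one small mechanism. The sketch below is an assumption about how that might look, not existing dispatcher code: `WorkerLifecycle` and its callback lists are hypothetical, and `work` stands in for the worker's main loop.

```python
from typing import Callable, List


class WorkerLifecycle:
    """Hypothetical holder for worker startup and shutdown callbacks."""

    def __init__(self) -> None:
        self.startup_callbacks: List[Callable[[], None]] = []
        self.shutdown_callbacks: List[Callable[[], None]] = []

    def run(self, work: Callable[[], None]) -> None:
        for callback in self.startup_callbacks:
            callback()
        try:
            work()
        finally:
            # Shutdown callbacks run even if the work loop raised.
            for callback in self.shutdown_callbacks:
                try:
                    callback()
                except Exception:
                    pass  # one failing cleanup should not block the others


log: List[str] = []
lifecycle = WorkerLifecycle()
lifecycle.startup_callbacks.append(lambda: log.append("startup"))
lifecycle.shutdown_callbacks.append(lambda: log.append("shutdown"))
lifecycle.run(lambda: log.append("work"))
```

AWX would register its process-specific setup as a startup callback and its cleanup as a shutdown callback, leaving signal handling inside the dispatcher.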