-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AAP-39559 Wait for all event processing to finish, add fallback task #15798
AAP-39559 Wait for all event processing to finish, add fallback task #15798
Conversation
|
right_now_time = now() | ||
window_end = right_now_time - timedelta(seconds=settings.INDIRECT_HOST_QUERY_FALLBACK_MINUTES) | ||
window_start = right_now_time - timedelta(days=settings.INDIRECT_HOST_QUERY_FALLBACK_GIVEUP_DAYS) | ||
for job in Job.objects.filter(event_queries_processed=False, finished__lte=window_end, finished__gte=window_start).iterator(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When this runs the first time, it could lead to a large number of processing jobs (all jobs completed in the last three days). Are we ok with that? Most of them will probably end quickly because there are no queries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well that seems like a good argument for having a limit to the start of the window which I did here. Otherwise that was more of a gut feeling.
But I still sense something to be worried about. This will get scheduled for a lot of jobs that don't need the processing done. The periodic task will run and schedule a job for each unprocessed job. Each should be a no-op with a single write, but that could still be a lot of jobs. Worse, it could temporarily overwhelm the dispatcher, because it can't delay tasks (ideal might be to scramble them with random delays).
I have a thought for this, and I'll work on putting something up. With a different angle, it'll get more intuitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just one comment above. With the fallback minutes set to 60 there shouldn't be any overlap between first processing of a job and the fallback periodic task.
@AlanCoding going to merge this, so I can add my code |
bc08e03
into
ansible:feature_indirect-host-counting
…15798) * Wait for all event processing to finish, add fallback task * Add flag check to periodic task
…nsible#15798) * Wait for all event processing to finish, add fallback task * Add flag check to periodic task
…nsible#15798) * Wait for all event processing to finish, add fallback task * Add flag check to periodic task
…15798) * Wait for all event processing to finish, add fallback task * Add flag check to periodic task
…15798) * Wait for all event processing to finish, add fallback task * Add flag check to periodic task
…15798) * Wait for all event processing to finish, add fallback task * Add flag check to periodic task
…15798) * Wait for all event processing to finish, add fallback task * Add flag check to periodic task
* AAP-37282 Add parse JQ data and test it for a `job` object in isolation (#15774) * Add jq dependency * Add file in progress * Add license for jq * Write test and get it passing * Successfully test collection of `event_query.yml` data (#15761) * Callback plugin method from cmeyers adapted to global collection list Get tests passing Mild rebranding Put behind feature flag, flip true in dev Add noqa flag * Add missing wait_for_events * feat: try grabbing query files from artifacts directory (#15776) * Contract changes for the event_query collection callback plugin (#15785) * Minor import changes to collection processing in callback plugin * Move agreed location of event_query file * feat: remaining schema changes for indirect host audits (#15787) * Re-organize test file and move artifacts processing logic to callback (#15784) * Rename the indirect host counting test file * Combine artifacts saving logic * Connect host audit model to jq logic via new task * Add unit tests for indirect host counting (#15792) * Do not get django flags from database (#15794) * Document, implement, and test remaining indirect host audit fields (#15796) * Document, implement, and test remaining indirect host audit fields * Fix hashing * AAP-39559 Wait for all event processing to finish, add fallback task (#15798) * Wait for all event processing to finish, add fallback task * Add flag check to periodic task * feat: cleanup of old indirect host audit records (#15800) * By default, do not count indirect hosts (#15801) * By default, do not count indirect hosts * Fix copy paste goof * Fix linter issue from base branch * prevent multiple tasks from processing the same job events, prevent p… (#15805) prevent multiple tasks from processing the same job events, prevent periodic task from spawning another task per job * Fix typos and other bugs found by Pablo review * fix: rely on resolved_action instead of task, adapt to proposed query… (#15815) * fix: rely on resolved_action instead of task, adapt to proposed query structure * tests: update indirect host tests * update remaining queries to new format * update live test * Remove polling loop for job finishing event processing (#15811) * Remove polling loop for job finishing event processing * Make awx/main/tests/live dramatically faster (#15780) * AAP-37282 Add parse JQ data and test it for a `job` object in isolation (#15774) * Add jq dependency * Add file in progress * Add license for jq * Write test and get it passing * Successfully test collection of `event_query.yml` data (#15761) * Callback plugin method from cmeyers adapted to global collection list Get tests passing Mild rebranding Put behind feature flag, flip true in dev Add noqa flag * Add missing wait_for_events * feat: try grabbing query files from artifacts directory (#15776) * Contract changes for the event_query collection callback plugin (#15785) * Minor import changes to collection processing in callback plugin * Move agreed location of event_query file * feat: remaining schema changes for indirect host audits (#15787) * Re-organize test file and move artifacts processing logic to callback (#15784) * Rename the indirect host counting test file * Combine artifacts saving logic * Connect host audit model to jq logic via new task * Document, implement, and test remaining indirect host audit fields (#15796) * Document, implement, and test remaining indirect host audit fields * Fix hashing * AAP-39559 Wait for all event processing to finish, add fallback task (#15798) * Wait for all event processing to finish, add fallback task * Add flag check to periodic task * feat: cleanup of old indirect host audit records (#15800) * prevent multiple tasks from processing the same job events, prevent p… (#15805) prevent multiple tasks from processing the same job events, prevent periodic task from spawning another task per job * Remove polling loop for job finishing event processing (#15811) * Remove polling loop for job finishing event processing * Make awx/main/tests/live dramatically faster (#15780) * temp * remove test * reorder migrations to allow indirect instances backport * cleanup for rebase and merge into devel --------- Co-authored-by: Peter Braun <[email protected]> Co-authored-by: jessicamack <[email protected]> Co-authored-by: Peter Braun <[email protected]>
SUMMARY
Flaw of logic up until this point was that we could be missing events. This does the stuff needed to close those loopholes by adding a task to pick up processing of yet-unprocessed jobs. This does some windowing just so that the task doesn't accidentally blow up in unforeseen circumstances.
@pb82 You should rename the periodic task added in this, and then piggyback the cleanup code on top of this. That way there's only 1 periodic task running for the management of these things.
ISSUE TYPE
COMPONENT NAME