Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AAP-39559 Wait for all event processing to finish, add fallback task #15798

Merged

Conversation

AlanCoding
Copy link
Member

SUMMARY

Flaw of logic up until this point was that we could be missing events. This does the stuff needed to close those loopholes by adding a task to pick up processing of yet-unprocessed jobs. This does some windowing just so that the task doesn't accidentally blow up in unforeseen circumstances.

@pb82 You should rename the periodic task added in this, and then piggyback the cleanup code on top of this. That way there's only 1 periodic task running for the management of these things.

ISSUE TYPE
  • Bug, Docs Fix or other nominal change
COMPONENT NAME
  • API

Copy link

right_now_time = now()
window_end = right_now_time - timedelta(seconds=settings.INDIRECT_HOST_QUERY_FALLBACK_MINUTES)
window_start = right_now_time - timedelta(days=settings.INDIRECT_HOST_QUERY_FALLBACK_GIVEUP_DAYS)
for job in Job.objects.filter(event_queries_processed=False, finished__lte=window_end, finished__gte=window_start).iterator():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When this runs the first time, it could lead to a large number of processing jobs (all jobs completed in the last three days). Are we ok with that? Most of them will probably end quickly because there are no queries.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that seems like a good argument for having a limit to the start of the window which I did here. Otherwise that was more of a gut feeling.

But I still sense something to be worried about. This will get scheduled for a lot of jobs that don't need the processing done. The periodic task will run and schedule a job for each unprocessed job. Each should be a no-op with a single write, but that could still be a lot of jobs. Worse, it could temporarily overwhelm the dispatcher, because it can't delay tasks (ideal might be to scramble them with random delays).

I have a thought for this, and I'll work on putting something up. With a different angle, it'll get more intuitive.

Copy link
Contributor

@pb82 pb82 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just one comment above. With the fallback minutes set to 60 there shouldn't be any overlap between first processing of a job and the fallback periodic task.

@pb82
Copy link
Contributor

pb82 commented Jan 31, 2025

@AlanCoding going to merge this, so I can add my code

@pb82 pb82 merged commit bc08e03 into ansible:feature_indirect-host-counting Jan 31, 2025
25 of 27 checks passed
pb82 pushed a commit that referenced this pull request Feb 14, 2025
…15798)

* Wait for all event processing to finish, add fallback task

* Add flag check to periodic task
AlanCoding added a commit to AlanCoding/awx that referenced this pull request Feb 24, 2025
…nsible#15798)

* Wait for all event processing to finish, add fallback task

* Add flag check to periodic task
AlanCoding added a commit to AlanCoding/awx that referenced this pull request Feb 24, 2025
…nsible#15798)

* Wait for all event processing to finish, add fallback task

* Add flag check to periodic task
pb82 pushed a commit that referenced this pull request Feb 24, 2025
…15798)

* Wait for all event processing to finish, add fallback task

* Add flag check to periodic task
pb82 pushed a commit that referenced this pull request Feb 24, 2025
…15798)

* Wait for all event processing to finish, add fallback task

* Add flag check to periodic task
pb82 pushed a commit that referenced this pull request Feb 24, 2025
…15798)

* Wait for all event processing to finish, add fallback task

* Add flag check to periodic task
pb82 pushed a commit that referenced this pull request Feb 24, 2025
…15798)

* Wait for all event processing to finish, add fallback task

* Add flag check to periodic task
pb82 added a commit that referenced this pull request Feb 24, 2025
* AAP-37282 Add parse JQ data and test it for a `job` object in isolation (#15774)

* Add jq dependency

* Add file in progress

* Add license for jq

* Write test and get it passing

* Successfully test collection of `event_query.yml` data (#15761)

* Callback plugin method from cmeyers adapted to global collection list

Get tests passing

Mild rebranding

Put behind feature flag, flip true in dev

Add noqa flag

* Add missing wait_for_events

* feat: try grabbing query files from artifacts directory (#15776)

* Contract changes for the event_query collection callback plugin (#15785)

* Minor import changes to collection processing in callback plugin

* Move agreed location of event_query file

* feat: remaining schema changes for indirect host audits (#15787)

* Re-organize test file and move artifacts processing logic to callback (#15784)

* Rename the indirect host counting test file

* Combine artifacts saving logic

* Connect host audit model to jq logic via new task

* Add unit tests for indirect host counting (#15792)

* Do not get django flags from database (#15794)

* Document, implement, and test remaining indirect host audit fields (#15796)

* Document, implement, and test remaining indirect host audit fields

* Fix hashing

* AAP-39559 Wait for all event processing to finish, add fallback task (#15798)

* Wait for all event processing to finish, add fallback task

* Add flag check to periodic task

* feat: cleanup of old indirect host audit records (#15800)

* By default, do not count indirect hosts (#15801)

* By default, do not count indirect hosts

* Fix copy paste goof

* Fix linter issue from base branch

* prevent multiple tasks from processing the same job events, prevent p… (#15805)

prevent multiple tasks from processing the same job events, prevent periodic task from spawning another task per job

* Fix typos and other bugs found by Pablo review

* fix: rely on resolved_action instead of task, adapt to proposed query… (#15815)

* fix: rely on resolved_action instead of task, adapt to proposed query structure

* tests: update indirect host tests

* update remaining queries to new format

* update live test

* Remove polling loop for job finishing event processing (#15811)

* Remove polling loop for job finishing event processing

* Make awx/main/tests/live dramatically faster (#15780)

* AAP-37282 Add parse JQ data and test it for a `job` object in isolation (#15774)

* Add jq dependency

* Add file in progress

* Add license for jq

* Write test and get it passing

* Successfully test collection of `event_query.yml` data (#15761)

* Callback plugin method from cmeyers adapted to global collection list

Get tests passing

Mild rebranding

Put behind feature flag, flip true in dev

Add noqa flag

* Add missing wait_for_events

* feat: try grabbing query files from artifacts directory (#15776)

* Contract changes for the event_query collection callback plugin (#15785)

* Minor import changes to collection processing in callback plugin

* Move agreed location of event_query file

* feat: remaining schema changes for indirect host audits (#15787)

* Re-organize test file and move artifacts processing logic to callback (#15784)

* Rename the indirect host counting test file

* Combine artifacts saving logic

* Connect host audit model to jq logic via new task

* Document, implement, and test remaining indirect host audit fields (#15796)

* Document, implement, and test remaining indirect host audit fields

* Fix hashing

* AAP-39559 Wait for all event processing to finish, add fallback task (#15798)

* Wait for all event processing to finish, add fallback task

* Add flag check to periodic task

* feat: cleanup of old indirect host audit records (#15800)

* prevent multiple tasks from processing the same job events, prevent p… (#15805)

prevent multiple tasks from processing the same job events, prevent periodic task from spawning another task per job

* Remove polling loop for job finishing event processing (#15811)

* Remove polling loop for job finishing event processing

* Make awx/main/tests/live dramatically faster (#15780)

* temp

* remove test

* reorder migrations to allow indirect instances backport

* cleanup for rebase and merge into devel

---------

Co-authored-by: Peter Braun <[email protected]>
Co-authored-by: jessicamack <[email protected]>
Co-authored-by: Peter Braun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants