[Job Scheduler] Add additional checks for initializing the stats job collector #4362
apply spotless?
Thanks @Zhangxunmt! Addressed the comments!
…o minimize jvm usage Signed-off-by: Pavan Yekbote <[email protected]>
Signed-off-by: Pavan Yekbote <[email protected]>
Signed-off-by: Pavan Yekbote <[email protected]>
aebfc8d to 4cb83b8
Signed-off-by: Pavan Yekbote <[email protected]>
Codecov Report
❌ Patch coverage is …
Additional details and impacted files
@@            Coverage Diff             @@
##               main    #4362    +/-   ##
============================================
+ Coverage     80.09%   80.12%   +0.03%
- Complexity    10199    10212     +13
============================================
  Files           855      855
  Lines         44374    44413     +39
  Branches       5135     5139      +4
============================================
+ Hits          35540    35585     +45
+ Misses         6670     6666      -4
+ Partials       2164     2162      -2
Can we please merge this PR? It's very important for large clusters.
Yes, waiting on the required CI to pass! It is failing due to flakiness and throttling.
To backport manually to 3.1, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-3.1 3.1
# Navigate to the new working tree
cd .worktrees/backport-3.1
# Create a new branch
git switch --create backport/backport-4362-to-3.1
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5ac2984bf59a2a0ede147262a8ab477d41859463
# Push it to GitHub
git push --set-upstream origin backport/backport-4362-to-3.1
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-3.1
Then, create a pull request where the …
…collector (#4362)
* fix: add additional checks for initializing the stats job collector to minimize jvm usage
Signed-off-by: Pavan Yekbote <[email protected]>
* fix: spotless apply and add debug logs
Signed-off-by: Pavan Yekbote <[email protected]>
* fix: tests
Signed-off-by: Pavan Yekbote <[email protected]>
* fix: spotless
Signed-off-by: Pavan Yekbote <[email protected]>
---------
Signed-off-by: Pavan Yekbote <[email protected]>
(cherry picked from commit 5ac2984)
…collector (#4362) (#4378)
* fix: add additional checks for initializing the stats job collector to minimize jvm usage
* fix: spotless apply and add debug logs
* fix: tests
* fix: spotless
---------
(cherry picked from commit 5ac2984)
Signed-off-by: Pavan Yekbote <[email protected]>
Co-authored-by: Pavan Yekbote <[email protected]>
Description
During a blue/green (B/G) deployment of a very large cluster, the metrics job can be indexed multiple times. We only want to index this job once, on startup.
In that case, multiple threads end up waiting on the index action to complete and, because the ClusterState object is present, this can use up a lot of heap.
Therefore, this PR adds checks to validate whether the index has already been created, along with local checks within the same node. It also removes the condition for the offline batch polling task job, since that job gets created on a batch predict.
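To illustrate the two kinds of checks described above, here is a minimal sketch, not the code from this PR. It assumes a hypothetical `JobStore` interface standing in for the job index, and a hypothetical `StatsJobInitializer` class. None of these names come from ml-commons. The node-local check is an `AtomicBoolean` flag, and the second check asks the store whether the job already exists before indexing it.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the job index; this is NOT the ml-commons API.
interface JobStore {
    boolean exists(String jobId);
    void index(String jobId);
}

// In-memory store so the sketch runs without a cluster (assumption for illustration).
class InMemoryJobStore implements JobStore {
    private final Set<String> jobs = ConcurrentHashMap.newKeySet();
    final AtomicInteger indexCalls = new AtomicInteger();

    @Override
    public boolean exists(String jobId) {
        return jobs.contains(jobId);
    }

    @Override
    public void index(String jobId) {
        indexCalls.incrementAndGet();
        jobs.add(jobId);
    }
}

// Hypothetical initializer: one instance per node.
class StatsJobInitializer {
    // Node-local guard: flips to true the first time this node attempts the start.
    private final AtomicBoolean attemptedOnThisNode = new AtomicBoolean(false);
    private final JobStore store;

    StatsJobInitializer(JobStore store) {
        this.store = store;
    }

    // Returns true only if this call actually indexed the job.
    boolean startStatsJobIfNeeded(String jobId) {
        // Local check: later calls on the same node return immediately
        // instead of queueing up behind another index action.
        if (!attemptedOnThisNode.compareAndSet(false, true)) {
            return false;
        }
        // Shared check: skip indexing if the job has already been created.
        if (store.exists(jobId)) {
            return false;
        }
        store.index(jobId);
        return true;
    }
}
```

In this sketch, repeated calls on one node are stopped by the flag, and a second node sharing the same store is stopped by the existence check. The existence check followed by the index call is not atomic across nodes, so a real implementation would still need the index operation itself to tolerate a concurrent create.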
Check List
--signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.