fix running tasks when circuit breaker is open #542

Merged 2 commits on Nov 8, 2022


@ylwu-amzn ylwu-amzn commented Nov 8, 2022

Signed-off-by: Yaliang Wu [email protected]

Description

When the disk circuit breaker is open, the upload model task never completes or fails.

In the original design, String errorMsg = checkAndAddRunningTask(mlTask, maxUploadTasksPerNode); never threw an exception; the caller checked errorMsg and threw an exception itself when it was set. The circuit breaker now throws a limit-exceeded exception inside checkAndAddRunningTask, so the call should move inside the try block, where the exception is caught and the task can be marked as failed.
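
The change described above can be illustrated with a minimal sketch. This is not the actual ml-commons code: the class UploadTaskSketch, the fields taskStates, circuitBreakerOpen and runningTasks, the handleFailure helper, and the use of IllegalStateException as the exception type are all hypothetical stand-ins. Only the name checkAndAddRunningTask and the idea of moving the call inside the try block come from the description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the upload task flow; not the actual ml-commons classes.
class UploadTaskSketch {
    // Final state per task id. A task that never lands here is stuck "running".
    static final Map<String, String> taskStates = new HashMap<>();
    static boolean circuitBreakerOpen = false;
    static int runningTasks = 0;

    // Returns an error message when the per-node limit is reached,
    // but throws when the (simulated) circuit breaker is open.
    static String checkAndAddRunningTask(String taskId, int maxTasksPerNode) {
        if (circuitBreakerOpen) {
            // Assumption: IllegalStateException stands in for the real limit-exceeded exception.
            throw new IllegalStateException("Disk circuit breaker is open");
        }
        if (runningTasks >= maxTasksPerNode) {
            return "exceeded max running task limit";
        }
        runningTasks++;
        return null;
    }

    // Hypothetical failure handler: records the failed state for the task.
    static void handleFailure(String taskId, Exception e) {
        taskStates.put(taskId, "FAILED: " + e.getMessage());
    }

    static void uploadModel(String taskId, int maxTasksPerNode) {
        try {
            // The check sits inside the try block, so an exception thrown by the
            // circuit breaker reaches the catch clause and the task is marked failed.
            String errorMsg = checkAndAddRunningTask(taskId, maxTasksPerNode);
            if (errorMsg != null) {
                throw new IllegalStateException(errorMsg);
            }
            try {
                // The real upload work would happen here.
                taskStates.put(taskId, "COMPLETED");
            } finally {
                // Release the slot once the task has been added and has finished.
                runningTasks--;
            }
        } catch (Exception e) {
            handleFailure(taskId, e);
        }
    }
}
```

If the call to checkAndAddRunningTask sat before the try block instead, an exception thrown while the circuit breaker is open would bypass handleFailure, and the task would have no final state recorded.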

Issues Resolved

[List any issues this PR will resolve]

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@ylwu-amzn ylwu-amzn requested a review from a team November 8, 2022 01:06
rbhavna previously approved these changes Nov 8, 2022
Zhangxunmt previously approved these changes Nov 8, 2022

ylwu-amzn commented Nov 8, 2022

CI failed

https://github.com/opensearch-project/ml-commons/actions/runs/3415922046/jobs/5685553293

> Could not resolve all files for configuration ':classpath'.
   > Could not resolve com.netflix.nebula:nebula-core:3.0.0.
     Required by:
         project : > org.opensearch.gradle:build-tools:2.4.0-SNAPSHOT:20221105.061416-146 > com.netflix.nebula:nebula-publishing-plugin:4.4.4
      > Could not resolve com.netflix.nebula:nebula-core:3.0.0.
         > Could not get resource 'https://d1nvenhzbhpy0q.cloudfront.net/snapshots/lucene/com/netflix/nebula/nebula-core/3.0.0/nebula-core-3.0.0.pom'.
            > Could not GET 'https://d1nvenhzbhpy0q.cloudfront.net/snapshots/lucene/com/netflix/nebula/nebula-core/3.0.0/nebula-core-3.0.0.pom'. Received status code 403 from server: Forbidden

This is a common issue, reported in opensearch-project/OpenSearch#5121.

Signed-off-by: Yaliang Wu <[email protected]>
@ylwu-amzn ylwu-amzn dismissed stale reviews from Zhangxunmt and rbhavna via 5d1d9ee November 8, 2022 02:45
@ylwu-amzn ylwu-amzn merged commit 34bb2ab into opensearch-project:2.x Nov 8, 2022
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 8, 2022
* fix running tasks when circuit breaker is open

Signed-off-by: Yaliang Wu <[email protected]>

* fix log error message

Signed-off-by: Yaliang Wu <[email protected]>

Signed-off-by: Yaliang Wu <[email protected]>
(cherry picked from commit 34bb2ab)
ylwu-amzn added a commit that referenced this pull request Nov 8, 2022
* fix running tasks when circuit breaker is open

Signed-off-by: Yaliang Wu <[email protected]>

* fix log error message

Signed-off-by: Yaliang Wu <[email protected]>

Signed-off-by: Yaliang Wu <[email protected]>
(cherry picked from commit 34bb2ab)

Co-authored-by: Yaliang Wu <[email protected]>
@ylwu-amzn ylwu-amzn added the bug Something isn't working label Nov 8, 2022
b4sjoo pushed a commit to b4sjoo/ml-commons that referenced this pull request Dec 2, 2022
* fix running tasks when circuit breaker is open

Signed-off-by: Yaliang Wu <[email protected]>

* fix log error message

Signed-off-by: Yaliang Wu <[email protected]>

Signed-off-by: Yaliang Wu <[email protected]>
Signed-off-by: Sicheng Song <[email protected]>
b4sjoo added a commit that referenced this pull request Dec 2, 2022
* fix running tasks when circuit breaker is open

Signed-off-by: Yaliang Wu <[email protected]>

* fix log error message

Signed-off-by: Yaliang Wu <[email protected]>

Signed-off-by: Yaliang Wu <[email protected]>
Signed-off-by: Sicheng Song <[email protected]>

Signed-off-by: Yaliang Wu <[email protected]>
Signed-off-by: Sicheng Song <[email protected]>
Co-authored-by: Yaliang Wu <[email protected]>
Labels: backport 2.4, bug (Something isn't working)