Retry different disk when bootstrap fails in Full auto mode #2659
ambry-store/src/main/java/com/github/ambry/store/DiskManager.java
Codecov Report
@@              Coverage Diff               @@
##             master    #2659       +/-   ##
=============================================
- Coverage     69.18%   33.36%   -35.83%
+ Complexity    11010     5142     -5868
=============================================
  Files           806      809        +3
  Lines         65526    65560       +34
  Branches       8006     8001        -5
=============================================
- Hits          45337    21875    -23462
- Misses        17620    41991    +24371
+ Partials       2569     1694      -875
View full report in Codecov by Sentry.
  throw new StateTransitionException("Failed to add store " + partitionName + " into storage manager",
      ReplicaOperationFailure);
} else {
  // TODO: Delete any files added in store and reserve directory
We should do this; otherwise the leftover files just take up disk space that isn't released until we restart. I suppose we can add a new method in the disk manager to clean up the "unexpected dirs". I have a PR that does that, so let's merge these two PRs and then use the method from my PR to remove the partition directories later.
Sure. I will merge this PR and put up a new PR
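As a rough illustration of the cleanup discussed in this thread, here is a minimal sketch of recursively removing a leftover partition directory. It is not the method from the reviewer's PR; `PartitionDirCleanup` and `deleteRecursively` are hypothetical names, and only standard `java.nio.file` calls are used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Ambry: removes a directory left behind by a failed bootstrap.
class PartitionDirCleanup {
  /** Deletes dir and everything under it; does nothing if dir does not exist. */
  static void deleteRecursively(Path dir) {
    if (!Files.exists(dir)) {
      return;
    }
    try (Stream<Path> paths = Files.walk(dir)) {
      // Reverse order puts children before their parents, so each directory is empty when deleted.
      List<Path> ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
      for (Path path : ordered) {
        Files.delete(path);
      }
    } catch (IOException e) {
      throw new UncheckedIOException("Failed to clean up " + dir, e);
    }
  }
}
```

A real implementation would presumably also need to release the reserved space in the reserve directory, which this sketch does not attempt.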
@@ -16,6 +16,7 @@

import com.github.ambry.account.AccountService;
import com.github.ambry.clustermap.DiskId;
import com.github.ambry.clustermap.HardwareState;
This import is not used anywhere.
@@ -113,6 +113,7 @@ public class HelixClusterManager implements ClusterMap {
  private final PartitionSelectionHelper partitionSelectionHelper;
  private final Map<String, Map<String, String>> partitionOverrideInfoMap = new HashMap<>();
  private final Map<String, ReplicaId> bootstrapReplicas = new ConcurrentHashMap<>();
  private final Map<String, Set<DiskId>> disksAttemptedForBootstrap = new ConcurrentHashMap<>();
A slightly different approach here would be to avoid using a disk whenever it fails a bootstrap replica. But this would also work.
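To make the per-partition bookkeeping concrete, here is a minimal sketch of how a map shaped like `disksAttemptedForBootstrap` could drive the retry. It is an assumption-laden illustration rather than the PR's code: disks are plain mount-path strings instead of `DiskId`, and `BootstrapDiskTracker`, `pickDisk`, and `clear` are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: disks are mount-path strings here, not Ambry's DiskId.
class BootstrapDiskTracker {
  // Partition name -> disks already tried for that partition's bootstrap.
  private final Map<String, Set<String>> disksAttemptedForBootstrap = new ConcurrentHashMap<>();

  /** Returns the first candidate disk not yet attempted for this partition and records the attempt. */
  Optional<String> pickDisk(String partitionName, List<String> candidateDisks) {
    Set<String> attempted =
        disksAttemptedForBootstrap.computeIfAbsent(partitionName, k -> ConcurrentHashMap.newKeySet());
    for (String disk : candidateDisks) {
      // Set.add returns false when the disk was already attempted, so it is skipped.
      if (attempted.add(disk)) {
        return Optional.of(disk);
      }
    }
    return Optional.empty(); // every candidate has been tried for this partition
  }

  /** Forgets the attempt history for a partition, e.g. once its bootstrap succeeds. */
  void clear(String partitionName) {
    disksAttemptedForBootstrap.remove(partitionName);
  }
}
```

The alternative suggested above would amount to replacing the map with a single set of failed disks shared by all partitions, so a disk that failed one bootstrap would be skipped for every later one.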
Retry different disk when bootstrap fails