Enable Safe Bidirectional CCR via Alias policy on Restore #19368
Add RestoreSnapshotRequest:
alias_write_index_policy: preserve (default), strip_write_index, custom_suffix
alias_suffix for custom_suffix
Apply policy in RestoreService during alias restore (post-rename):
strip_write_index: force is_write_index=false on follower aliases
custom_suffix: append suffix and set is_write_index=false
This prevents multi-write alias conflicts on followers, unlocking bidirectional CCR with write aliases.
Signed-off-by: Atri Sharma <[email protected]>
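The policy described above can be sketched in isolation as follows. This is a minimal illustration, not the code from this PR: the `Alias` class, the `Policy` enum, and `applyPolicy` are hypothetical stand-ins rather than OpenSearch types, and the sketch only assumes the three behaviors named in the description (keep the flag, clear the flag, or append a suffix and clear the flag).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AliasWritePolicySketch {

    /** Policy values as named in the PR description. */
    enum Policy { PRESERVE, STRIP_WRITE_INDEX, CUSTOM_SUFFIX }

    /** Hypothetical stand-in for alias metadata; not an OpenSearch class. */
    static final class Alias {
        final String name;
        final boolean writeIndex;

        Alias(String name, boolean writeIndex) {
            this.name = name;
            this.writeIndex = writeIndex;
        }

        @Override
        public String toString() {
            return name + "(is_write_index=" + writeIndex + ")";
        }
    }

    /** Applies the policy to aliases that have already been through any rename step. */
    static List<Alias> applyPolicy(List<Alias> renamed, Policy policy, String suffix) {
        List<Alias> result = new ArrayList<>();
        for (Alias alias : renamed) {
            switch (policy) {
                case STRIP_WRITE_INDEX:
                    // Same alias name, write flag forced off.
                    result.add(new Alias(alias.name, false));
                    break;
                case CUSTOM_SUFFIX:
                    if (suffix == null || suffix.isEmpty()) {
                        throw new IllegalArgumentException("custom_suffix requires alias_suffix");
                    }
                    // Suffixed alias name, write flag forced off.
                    result.add(new Alias(alias.name + suffix, false));
                    break;
                default:
                    // PRESERVE: restore the alias unchanged.
                    result.add(alias);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Alias> restored = Arrays.asList(new Alias("products", true));
        System.out.println(applyPolicy(restored, Policy.PRESERVE, null));
        System.out.println(applyPolicy(restored, Policy.STRIP_WRITE_INDEX, null));
        System.out.println(applyPolicy(restored, Policy.CUSTOM_SUFFIX, "_dc2"));
    }
}
```

The suffix value `_dc2` is only an example; how the real request validates `alias_suffix` is not shown in this conversation.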
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #19368 +/- ##
============================================
+ Coverage 72.93% 72.96% +0.03%
- Complexity 69947 69952 +5
============================================
Files 5676 5676
Lines 321121 321158 +37
Branches 46427 46432 +5
============================================
+ Hits 234195 234323 +128
+ Misses 68032 67904 -128
- Partials 18894 18931 +37
@ankitkala Can you take a look and see if this matches what you had in mind?
I was thinking about trying to fix this on the CCR plugin side.
The rename approach doesn’t fix the writeIndex conflict; it only changes alias names. renameAliasPattern can’t modify alias properties. With bidirectional CCR, both clusters end up with writeIndex=true on the same alias. Renaming “products” -> “products_dc2” still restores writeIndex=true, and you hit the conflict.
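A small sketch of the point about renaming: a rename step that rewrites only the alias name leaves the write flag as it was in the snapshot. The `Alias` class and `rename` helper here are hypothetical, and the regex-style pattern and replacement are assumed for illustration; the sketch does not model cluster state or the conflict check itself.

```java
public class RenameOnlySketch {

    /** Hypothetical stand-in for alias metadata; not an OpenSearch class. */
    static final class Alias {
        final String name;
        final boolean writeIndex;

        Alias(String name, boolean writeIndex) {
            this.name = name;
            this.writeIndex = writeIndex;
        }
    }

    /** Rewrites the alias name with a regex; every other property is copied as-is. */
    static Alias rename(Alias alias, String pattern, String replacement) {
        return new Alias(alias.name.replaceAll(pattern, replacement), alias.writeIndex);
    }

    public static void main(String[] args) {
        Alias renamed = rename(new Alias("products", true), "(.+)", "$1_dc2");
        // The name changes, but the write flag from the snapshot is carried over.
        System.out.println(renamed.name + " writeIndex=" + renamed.writeIndex);
    }
}
```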
Can you help me understand the use case for custom_suffix on the alias? From my understanding of the issue Also from your test
.../main/java/org/opensearch/action/admin/cluster/snapshots/restore/RestoreSnapshotRequest.java
server/src/main/java/org/opensearch/snapshots/RestoreService.java
Thanks for the review. I’ve simplified the change per your feedback:
This keeps restore focused on one job (toggling the write flag) and uses the existing rename fields when the user wants suffixed aliases.
❌ Gradle check result for e5b6100: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
This LGTM. Once this is merged and pushed to Maven, I'm thinking we should also add an IT in the CCR plugin to ensure this bug is fixed?
@ankitkala Wondering if you could pls make a pass here as well
Thanks @mch2. @ankitkala Please let me know if this looks OK. I am hoping to get it into 3.3.0.
Thanks. Took a brief look at it. Overall approach looks good to me. Are we planning to also expose this as a parameter when starting CCR for an index? Like @mch2 called out, let's add some CCR ITs as well.
Ack. I will add it in a follow-up PR.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.