Problem
When a Valkey cluster is scaled out (e.g. 3 -> 4 replicas) and a targeted switchover is then issued to the freshly added replica, the OpsRequest fails with:
WARNING: could not confirm new primary within 300s
even though Sentinel has already promoted the fresh candidate, the post-settle topology is correct, and replica-priority has been restored.
Root cause
addons/valkey/scripts/switchover.sh iterates a member list sourced from the container env variable VALKEY_POD_FQDN_LIST, which is rendered into the pod environment at pod creation time via componentVarRef.podFQDNs. KubeBlocks does not refresh an existing pod's container env after scale-out.
So when scale-out grows replicas from N to N+1, the old primary's action container still sees the old N-entry list. All iteration points in switchover.sh then miss the freshly added candidate:
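For context, the rendering path described above looks roughly like the following ComponentDefinition fragment. This is a hedged sketch of the KubeBlocks vars API shape, not the actual Valkey addon spec; surrounding fields are omitted:

```yaml
# Sketch: how VALKEY_POD_FQDN_LIST is rendered once, at pod creation,
# from the component's pod FQDNs. The exact addon definition may differ.
vars:
  - name: VALKEY_POD_FQDN_LIST
    valueFrom:
      componentVarRef:
        optional: false
        podFQDNs: Required   # resolved when the pod is created, never refreshed
```

Because the value is materialized into the pod spec, a later scale-out cannot change what a running container sees.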
- set_priorities_with_candidate_bias() — does not set replica-priority=1 on the fresh candidate
- restore_priorities() — does not restore it on the fresh candidate
- wait_for_new_master() — never probes the fresh candidate, so it cannot observe role:master even after Sentinel promotion
- check_* helpers consuming the same list
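The miss can be reduced to a few lines. This is an illustration, not code from switchover.sh: the helper name probe_candidate_in_list and the FQDN values are made up; the real iteration points are named above.

```shell
#!/usr/bin/env bash
# Stale-list repro: the env list still has 3 entries while the
# cluster now has 4 pods after scale-out.
VALKEY_POD_FQDN_LIST="valkey-0.valkey-headless,valkey-1.valkey-headless,valkey-2.valkey-headless"
KB_SWITCHOVER_CANDIDATE_FQDN="valkey-3.valkey-headless"   # fresh scale-out pod

probe_candidate_in_list() {
  local fqdn fqdns
  IFS=',' read -ra fqdns <<< "$VALKEY_POD_FQDN_LIST"
  for fqdn in "${fqdns[@]}"; do
    [ "$fqdn" = "$KB_SWITCHOVER_CANDIDATE_FQDN" ] && return 0
  done
  return 1   # candidate never probed -> role:master never observed
}

probe_candidate_in_list || echo "candidate missing from env list"
```

Every loop that walks VALKEY_POD_FQDN_LIST has this blind spot, which is why the wait loop times out at 300s even though the promotion already happened.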
Fix
Introduce pod_fqdns_with_candidate(), which unions KB_SWITCHOVER_CANDIDATE_FQDN (passed at action time as expected_fqdn / candidate_fqdn) into the env-sourced list. All iteration points now consume the union list.
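A sketch of the union helper: the function name and the two env variables come from this PR, but the body below is an illustration rather than the exact implementation, and the FQDN values are made up.

```shell
#!/usr/bin/env bash
# Union the action-time candidate into the (possibly stale) env list.
pod_fqdns_with_candidate() {
  local list="${VALKEY_POD_FQDN_LIST:-}"
  local candidate="${KB_SWITCHOVER_CANDIDATE_FQDN:-}"
  if [ -n "$candidate" ]; then
    case ",$list," in
      *",$candidate,"*) ;;                    # candidate already in the env list
      *) list="${list:+$list,}$candidate" ;;  # append the missing candidate
    esac
  fi
  printf '%s\n' "$list"
}

VALKEY_POD_FQDN_LIST="valkey-0.valkey-headless,valkey-1.valkey-headless,valkey-2.valkey-headless"
KB_SWITCHOVER_CANDIDATE_FQDN="valkey-3.valkey-headless"
pod_fqdns_with_candidate
# -> valkey-0.valkey-headless,valkey-1.valkey-headless,valkey-2.valkey-headless,valkey-3.valkey-headless
```

The duplicate check makes the helper a no-op when the candidate is already in the list, so pre-scale-out switchovers behave exactly as before.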
Validation
- ShellSpec: 55 examples, 0 failures (scripts-ut-spec/valkey_switchover_spec.sh), with new cases covering stale-list scenarios.
- Broader live smoke test: 143 PASS / 4 FAIL / 2 SKIP (the 4 failures are non-product environment/capability gaps). Highlights: T09 fresh scale-out targeted switchover passes in one shot, T14 targeted switchover OpsRequest succeeds with the candidate becoming primary, T15 Sentinel failover behaves normally.
- Live chaos suite: 143 PASS / 0 FAIL / 0 SKIP, covering master kill, all-Sentinel kill, all-6-pods kill, rapid master kill, restart, scale-out/in during writes, and vertical scaling during writes — the fix holds under concurrent writes and chaos.
Same-pattern risk in other addons
Redis (addons/redis/scripts/redis-switchover.sh) follows the identical pattern, with REDIS_POD_FQDN_LIST and SENTINEL_POD_FQDN_LIST injected via componentVarRef.podFQDNs. The equivalent iteration points (set_redis_priorities, recover_redis_priorities, check_redis_kernel_status, check_switchover_result) carry the same architectural risk. This PR does not modify Redis; that is left for a follow-up evaluation.