Skip to content

Conversation

@fuweid
Copy link
Member

@fuweid fuweid commented Dec 5, 2025

There are three places where zombie membership information can be cleaned up:

1. ApplierV2 -> UpdateAttributes

   Ideally, each member updates its own attributes only once before serving
   requests. Since this happens only at startup and at low frequency, it is
   safe to clean up zombie members during this operation.

   Additionally, pending logs are always applied before attributes are updated,
   so there is no data race.

2. Applying a snapshot via Raft message

   When a member receives a snapshot through a Raft message, the v2store is
   synchronized with the v3store. This makes it safe to detect and clean up
   zombie membership entries at that time.

3. Applying ConfChange

   Check and cleanup zombie members after ConfChange. Ideally, there aren't
   too many ConfChange, the impact on performance should be minimal.

Fixes: #20967

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

@fuweid fuweid marked this pull request as draft December 5, 2025 16:45
@fuweid fuweid force-pushed the reproduce-20967 branch 2 times, most recently from e47654e to affcd21 Compare December 8, 2025 00:10
@fuweid fuweid changed the title [WIP on release-3.5] Reproduce 20967 [release-3.5] server: cleanup zombie membership information Dec 8, 2025
@fuweid fuweid marked this pull request as ready for review December 8, 2025 04:45
Comment on lines +25 to +59
// createClustewithZombieMembers creates a cluster with zombie members in v3store.
//
// There are two ways to create zombie members:
//
// 1. Restoring a cluster using a snapshot taken from version <= v3.4.39.
//
// The `etcdctl snapshot restore` tool in versions <= v3.4.39 contains a bug
// that fails to clean up old member information during restore. This leaves
// behind stale (zombie) member entries in the v3store.
//
// 2. Using ForceNewCluster on a cluster running a version < v3.5.22.
//
// If ETCD commits a ConfChangeAddNode entry to the v3store but crashes before
// the corresponding hard state is persisted to the WAL, then after restarting
// with ForceNewCluster, the member entries in the v3store are not cleaned up,
// resulting in zombie members. This issue was fixed in [v3.5.22][1].
//
// It may be possible to create zombie member in versions < [v3.4.0][2] through a
// single-node ForceNewCluster restart. For example, suppose we start with a
// one-node cluster. We add a second node, but the second node fails to join
// and the cluster loses quorum. We then restart the first node with ForceNewCluster,
// which results in a zombie member entry for the second node. However, it
// is still unlikely for three zombie members (like [issue-20967][3]) to
// appear at the same time.
//
// So in theory, this could happen when using ForceNewCluster. However, it is
// extremely unlikely in practice because multiple members are running, and it
// would require all of them to panic at the same time. It would also require
// selecting one of the crashed members to restart with ForceNewCluster, where
// that member had failed to persist the hard state to the WAL—an improbable
// combination of events (what a coincidence).
//
// So, in this function, we use the first method to create zombie members for
// testing.
//
// REF:
//
// [1]: https://github.com/etcd-io/etcd/commit/ccf66456e7237ef5251d9fdc96f1624d1aa4daf1
// [2]: https://github.com/etcd-io/etcd/issues/14370
// [3]: https://github.com/etcd-io/etcd/issues/20967
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the comment per #20967 (comment)

Comment on lines +778 to +800
func CleanupZombieMembersIfNeeded(lg *zap.Logger, be backend.Backend, st v2store.Store) {
v2Members, _ := membersFromStore(lg, st)
v3Members, _ := membersFromBackend(lg, be)

for id, v3Member := range v3Members {
_, ok := v2Members[id]
if !ok {
lg.Warn("Removing zombie member from v3store", zap.String("member", fmt.Sprintf("%+v", *v3Member)))
if err := unsafeDeleteMemberFromBackend(be, id); err != nil {
lg.Warn("failed to delete zombie member from backend", zap.String("member-id", id.String()), zap.Error(err))
}
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking probably we should sync from v2store to v3store to address the following cases:

  • zombie members in v3store (exist in v3, but not in v2). It's already reproduced and confirmed.
  • missing members in v3store (exist in v2, but not in v3). We haven't see any reproduction for such case so far. Since v2store is the only source of truth for membership data for v3.5.x and older versions, so we should completely trust v2store, so probably we should take a proactive measure to sync from v2store to v3store for such case as well. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we should completely trust v2store, so probably we should take a proactive measure to sync from v2store to v3store for such case as well. WDYT?

Agree so we can sync after confChange

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above, I am thinking probably we should handle the case missing members in v3store (exist in v2, but not in v3) as well, although we haven't seen any reproduction for such case so far.

}

// restoreSnapshotIntoMemberDir restores snapshot into the given member's data dir.
func restoreSnapshotIntoMemberDir(t *testing.T, etcdctlBin string, snapshotPath string, initialCluster string, proc e2e.EtcdProcess) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add such function into tests/framework/e2e/etcdctl.go. I won't insist on it since it's stable releases, as we won't add any new features, so I don't expect any major refatoring.

Comment on lines +778 to +800
func CleanupZombieMembersIfNeeded(lg *zap.Logger, be backend.Backend, st v2store.Store) {
v2Members, _ := membersFromStore(lg, st)
v3Members, _ := membersFromBackend(lg, be)

for id, v3Member := range v3Members {
_, ok := v2Members[id]
if !ok {
lg.Warn("Removing zombie member from v3store", zap.String("member", fmt.Sprintf("%+v", *v3Member)))
if err := unsafeDeleteMemberFromBackend(be, id); err != nil {
lg.Warn("failed to delete zombie member from backend", zap.String("member-id", id.String()), zap.Error(err))
}
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above, I am thinking probably we should handle the case missing members in v3store (exist in v2, but not in v3) as well, although we haven't seen any reproduction for such case so far.

There are three places where zombie membership information can be cleaned up:

1. ApplierV2 -> UpdateAttributes

   Ideally, each member updates its own attributes only once before serving
   requests. Since this happens only at startup and at low frequency, it is
   safe to clean up zombie members during this operation.

   Additionally, pending logs are always applied before attributes are updated,
   so there is no data race.

2. Applying a snapshot via Raft message

   When a member receives a snapshot through a Raft message, the v2store is
   synchronized with the v3store. This makes it safe to detect and clean up
   zombie membership entries at that time.

3. Applying ConfChange

   Check and cleanup zombie members after ConfChange. Ideally, there aren't
   too many ConfChange, the impact on performance should be minimal.

Signed-off-by: Wei Fu <[email protected]>
@fuweid
Copy link
Member Author

fuweid commented Dec 10, 2025

@ahrtr for this comment,

As mentioned above, I am thinking probably we should handle the case missing members in v3store (exist in v2, but not in v3) as well, although we haven't seen any reproduction for such case so far.

I think we can keep it as it is right now. based on current situation, v3store has more information than v2store. If we don't have such case like v2store contains more information than v3, we can assume it's not going to happen.

@ahrtr
Copy link
Member

ahrtr commented Dec 10, 2025

If we don't have such case like v2store contains more information than v3, we can assume it's not going to happen.

Hopefully we won't have to fix a follow-up issue and generate a follow-up blog.

Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks for the great work!

Please also update the changelog.

@ivanvc can we plan for the next 3.5.x patch? cc @joshjms

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr, fuweid

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ahrtr ahrtr merged commit 83553fc into etcd-io:release-3.5 Dec 12, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants