vdev_open: clear async remove flag after reopen #16921
Merged
[Sponsors: Klara, Inc., Wasabi Technology, Inc.]
Motivation and Context
It's possible[*] for a vdev to be flagged for async remove after the pool has suspended. If the removed device has been returned by the time the pool is resumed, the ASYNC_REMOVE task will still run at the end of the txg and remove the device from the pool again.
[*] Maybe only theoretical. I think for this to happen, the pool would have to suspend, then a different IO would have to return an error and the media check fail, flagging the async remove. I haven't triggered it on stock OpenZFS. I can trip it regularly on a (not yet published) branch that chains writes and flush IOs, and so is more likely to have outstanding IO after the pool has suspended.
Description
To fix, we clear the async remove flag at reopen, just as we did for the async fault flag in 5de3ac2.
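As a rough illustration of the sequence described above, here is a minimal, self-contained toy model. It is not the OpenZFS code: `toy_vdev_t`, its fields, and every `toy_*` function are hypothetical names invented for this sketch, and the `clear_remove` switch exists only so the old and new behaviour can be compared side by side.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a vdev; NOT the OpenZFS vdev_t. */
typedef struct toy_vdev {
	bool present;		/* device is physically attached */
	bool in_pool;		/* device is currently usable by the pool */
	bool fault_wanted;	/* pending async fault request */
	bool remove_wanted;	/* pending async remove request */
} toy_vdev_t;

/* An IO error followed by a failed media check flags an async remove. */
static void
toy_flag_async_remove(toy_vdev_t *vd)
{
	vd->remove_wanted = true;
}

/*
 * Reopen the device. The async fault flag is always cleared here; the
 * async remove flag is cleared only when clear_remove is set, which
 * models the behaviour with the fix applied.
 */
static void
toy_reopen(toy_vdev_t *vd, bool clear_remove)
{
	vd->fault_wanted = false;
	if (clear_remove)
		vd->remove_wanted = false;
	vd->in_pool = vd->present;
}

/* End of txg: act on any async request that is still pending. */
static void
toy_run_async_tasks(toy_vdev_t *vd)
{
	if (vd->remove_wanted) {
		vd->remove_wanted = false;
		vd->in_pool = false;
	}
}

/* Replay the scenario; returns true if the device is still in the pool. */
static bool
toy_device_survives_resume(bool clear_remove)
{
	toy_vdev_t vd = { .present = true, .in_pool = true };

	vd.present = false;		/* device goes away, pool suspends */
	toy_flag_async_remove(&vd);	/* late IO error flags the remove */
	vd.present = true;		/* device is returned */
	toy_reopen(&vd, clear_remove);	/* pool is resumed */
	toy_run_async_tasks(&vd);	/* end-of-txg async work */
	return (vd.in_pool);
}
```

In this model, `toy_device_survives_resume(false)` shows the stale flag removing a device that has already come back, while `toy_device_survives_resume(true)` shows the reopen discarding the stale request.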
How Has This Been Tested?
ZTS run completed successfully.
I am not really sure how to write a test to prove this (we don't really have the tools), but I think intuitively it makes sense - it matches the existing async fault case.