-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Fix gang write late_arrival bug #17824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I have feeling this is a workaround on a workaround. The |
I have taken this to its logical extreme; why don't we just always enable the feature in syncing_txg + 1? That seems simpler and like it shouldn't cause any issues. |
|
I wonder if |
This seems possible, using the max of the txg values here seems prudent. |
When a write comes in via dmu_sync_late_arrival, its txg is equal to the open TXG. If that write gangs, and we have not yet activated the new gang header feature, and the gang header we pick can store a larger gang header, we will try to schedule the upgrade for the open TXG + 1. In debug mode, this causes an assertion to trip. This PR sets the TXG for activating the feature to be at most the current open TXG. Signed-off-by: Paul Dagnelie <[email protected]> Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Paul Dagnelie <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
When a write comes in via dmu_sync_late_arrival, its txg is equal to the open TXG. If that write gangs, and we have not yet activated the new gang header feature, and the gang header we pick can store a larger gang header, we will try to schedule the upgrade for the open TXG + 1. In debug mode, this causes an assertion to trip. This PR sets the TXG for activating the feature to be the larger of either the current open TXG or the syncing TXG + 1. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Closes openzfs#17824
When a write comes in via dmu_sync_late_arrival, its txg is equal to the open TXG. If that write gangs, and we have not yet activated the new gang header feature, and the gang header we pick can store a larger gang header, we will try to schedule the upgrade for the open TXG + 1. In debug mode, this causes an assertion to trip. This PR sets the TXG for activating the feature to be the larger of either the current open TXG or the syncing TXG + 1. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Closes openzfs#17824
Sponsored by: [Klara, Inc.; Wasabi Technology, Inc.]
Motivation and Context
When a write comes in via dmu_sync_late_arrival, its txg is equal to the open TXG. If that write gangs, and we have not yet activated the new gang header feature, and the gang header we pick can store a larger gang header, we will try to schedule the upgrade for the open TXG + 1. In debug mode, this causes an assertion to trip in
txg_verify. I can pretty reliably reproduce this on a performance test setup I have.Description
This PR sets the TXG for activating the feature to be at most the current open TXG. Activating the feature a TXG early shouldn't cause any problems, since we don't use the activation txg directly. I believe this method of doing accessing the current open TXG is safe; the current open TXG could increase while the comparison/replacement is happening, but I don't believe a value larger than the open txg can get into
txg_verifywith this code. And we don't use atomics anywhere else to access this field, so it shouldn't be necessary here either.How Has This Been Tested?
Manual testing with the workload that originally triggered the problem.
Unfortunately I haven't been able to find a small reproducer for this. I've tried a few things, but we need the first thing that gangs to be a
dmu_sync_late_arrivalcall, which is not trivial to orchestrate. If anyone has ideas for a test that would work, I'm happy to try them out.Types of changes
Checklist:
Signed-off-by.