-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Fix concurrency bug in NamespacedHierarchicalStore.computeIfAbsent
#5209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Pankraz76
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great fix. Precisely executed and documented without any flaw, thanks a lot.
+1
...ngine/src/main/java/org/junit/platform/engine/support/store/NamespacedHierarchicalStore.java
Outdated
Show resolved
Hide resolved
...ngine/src/main/java/org/junit/platform/engine/support/store/NamespacedHierarchicalStore.java
Outdated
Show resolved
Hide resolved
✅ All tests passed ✅🏷️ Commit: e7357f7 Learn more about TestLens at testlens.app. |
|
You're welcome @Pankraz76, thanks as well for the praise and the review, I really appreciate it! :) |
Pankraz76
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 feat. complete.
Well done, thanks again for dedication leading to incrementation.
Now its just about polish, giving optional potential dedication - striving for excellence.
But also this is danger land, might better to extract this into clean PR afterwards. Scout principle is nice, still tend ppl. to tilt on this, likely to be overwhelmed.
...ngine/src/main/java/org/junit/platform/engine/support/store/NamespacedHierarchicalStore.java
Show resolved
Hide resolved
| @SuppressWarnings("ReferenceEquality") | ||
| @API(status = MAINTAINED, since = "6.0") | ||
| public <K, V> Object computeIfAbsent(N namespace, K key, Function<? super K, ? extends V> defaultCreator) { | ||
| Preconditions.notNull(defaultCreator, "defaultCreator must not be null"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Preconditions.notNull(defaultCreator, "defaultCreator must not be null"); | |
| notNull(defaultCreator, MANDATORY_DEFAULT_CREATOR); |
imho, useless coupling, context and noisy burden.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to do this (Preconditions.notNull => notNull) too, but since it was like this in the code already I wasn't sure if it was a convention here, as I imagine it could potentially be confused with requireNonNull or it could be intentional to make it clear it's a precondition. Could a maintainer please give a second opinion here? I'd be glad to change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally use Preconditions.notNull.
| rejectIfClosed(); | ||
| return computedValue; | ||
| }); | ||
| return Preconditions.notNull(defaultCreator.apply(key), "defaultCreator must not return null"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| return Preconditions.notNull(defaultCreator.apply(key), "defaultCreator must not return null"); | |
| return notNull(defaultCreator.apply(key), MANDATORY_DEFAULT_CREATOR_VALUE); |
| @API(status = MAINTAINED, since = "1.13.3") | ||
| public final class NamespacedHierarchicalStore<N> implements AutoCloseable { | ||
|
|
||
| private final AtomicInteger insertOrderSequence = new AtomicInteger(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| private static final String MANDATORY_DEFAULT_CREATOR_VALUE = "defaultCreator must not return `null`"; | |
| private static final String MANDATORY_DEFAULT_CREATOR = "defaultCreator must not be `null`"; | |
| private final AtomicInteger insertOrderSequence = new AtomicInteger(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As there aren't that many constants in the code for repeating strings like this I'm not sure this is on purpose - I'd agree with that change, can a maintainer please give a second opinion here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't extract constants for pre-condition messages. It creates a layer of indirection that makes the code harder to understand. And I can't give you chapter and verse, but Java will de duplicate constant expressions.
Also note that we don't put backticks around null.
Pankraz76
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2cts.
going fully functional separating the concerns (SoC/SRP).
...ngine/src/main/java/org/junit/platform/engine/support/store/NamespacedHierarchicalStore.java
Outdated
Show resolved
Hide resolved
...ngine/src/main/java/org/junit/platform/engine/support/store/NamespacedHierarchicalStore.java
Outdated
Show resolved
Hide resolved
...ngine/src/main/java/org/junit/platform/engine/support/store/NamespacedHierarchicalStore.java
Show resolved
Hide resolved
...ngine/src/main/java/org/junit/platform/engine/support/store/NamespacedHierarchicalStore.java
Show resolved
Hide resolved
`computeIfAbsent` previously invoked `defaultCreator` while holding the store's internal map lock. Under parallel execution this could cause threads using the store to block each other and temporarily see missing or incorrectly initialized state for values created via `computeIfAbsent`. The implementation now wraps `defaultCreator` in a `MemoizingSupplier` and installs that supplier via the map operation, evaluating it only after the update has completed. This avoids running user code while holding the lock and aligns `computeIfAbsent`'s behavior with the deprecated `getOrComputeIfAbsent`, preserving the intended "one initialization per key" semantics. Issue: junit-team#5171 Signed-off-by: martinfrancois <[email protected]>
|
You’re welcome, and thanks again, @Pankraz76, for taking another careful look. |
mpkorstanje
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've given this a quick read through and left some comments to resolve open questions, but this is not a full review yet.
It seems that you've found and solved a different problem than described in #5171. And you claim the solutions overlap.
Unfortunately the original problem is quite tricky and the description for this pull request incredibly verbose. We'll need some time to go through the details. You can help us process this by writing a much more concise PR description.
| @API(status = MAINTAINED, since = "1.13.3") | ||
| public final class NamespacedHierarchicalStore<N> implements AutoCloseable { | ||
|
|
||
| private final AtomicInteger insertOrderSequence = new AtomicInteger(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't extract constants for pre-condition messages. It creates a layer of indirection that makes the code harder to understand. And I can't give you chapter and verse, but Java will de duplicate constant expressions.
Also note that we don't put backticks around null.
| @SuppressWarnings("ReferenceEquality") | ||
| @API(status = MAINTAINED, since = "6.0") | ||
| public <K, V> Object computeIfAbsent(N namespace, K key, Function<? super K, ? extends V> defaultCreator) { | ||
| Preconditions.notNull(defaultCreator, "defaultCreator must not be null"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally use Preconditions.notNull.
| block each other and temporarily see a missing or incorrectly initialized state | ||
| for values created via `computeIfAbsent`. The method now evaluates | ||
| `defaultCreator` outside the critical section using a memoizing supplier, | ||
| aligning its behavior with the deprecated `getOrComputeIfAbsent`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you write this down more concisely? The release notes generally focus on a top-line understanding of what was fixed. You could express this as having solved the symptoms of #5209 rather than its root cause.
You could for clarity also add a second item that describes how computeIfAbsent no longer deadlocks.
| } | ||
|
|
||
| @Test | ||
| void computeIfAbsentCanDeadlockWithCollidingKeys() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming of this test suggests that computeIfAbsent can currently deadlock. But I assume that after your fix this is no longer the case?
mpkorstanje
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've already found one reason this will not work as expected. See comment below.
simulateRaceConditionInComputeIfAbsent did not catch this issue because it only exercises contention on a single key and relies on ConcurrentHashMap's per-key atomicity; it does not force different keys into the same bucket or run user code in a way that re-enters the store while the map lock is held, so the problematic interaction never occurs in that test
With this in mind, I would have expected to see a test like simulateRaceConditionInComputeIfAbsent that forces keys into the same bucket.
| return requireNonNull(newStoredValue.evaluate()); | ||
| } | ||
| catch (Throwable t) { | ||
| storedValues.remove(compositeKey, newStoredValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a period of time between storedValues.compute() and storedValues.remove() where a different thread via getStoredValue() can briefly access the newStoredValue and encounter its stored exception. As such the stores operations are not atomic.
And I think this invalidates any approach that tries to avoid execution of the defaultCreator outside the compute method.
| var result = StoredValue.evaluateIfNotNull(storedValue); | ||
| if (result == null) { | ||
| StoredValue newStoredValue = this.storedValues.compute(compositeKey, (__, oldStoredValue) -> { | ||
| if (StoredValue.evaluateIfNotNull(oldStoredValue) == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In your analysis you said:
In NamespacedHierarchicalStore#computeIfAbsent, the implementation previously relied on ConcurrentMap.computeIfAbsent, which provides the "one logical initialization per key" behavior. After the change to storedValues.compute(…), every call to NamespacedHierarchicalStore.computeIfAbsent for the same key can rerun the initialization logic and replace the existing StoredValue.
That means that even though each compute call is atomic, two threads calling NamespacedHierarchicalStore.computeIfAbsent for the same key can:
- Have Thread A initialize the stored value and start tracking statistics.
- Then have Thread B rerun the initialization and replace that value, effectively resetting the statistics.
But looking at the existing implementation, the defaultCreator is not applied until after the oldStoredValue has been checked. So when defaultCreator is applied for a given key a value was either not set at all or that value was set and set not null. So on the face of it the defaultCreator should be applied at most once and point 2 shouldn't happen.
I fixed a concurrency bug in
NamespacedHierarchicalStore.computeIfAbsentwhere thedefaultCreatorfunction was executed while holding the store's internal map lock. Under parallel execution, this could cause threads using the store to block each other and temporarily see a missing or incorrectly initialized state for values created viacomputeIfAbsent.Concretely, the changes are:
I reworked
NamespacedHierarchicalStore.computeIfAbsentso that:defaultCreatorin aMemoizingSupplierand installs that supplier via the map operation.ConcurrentHashMapupdate has completed, so user code is no longer run while holding the internal map lock.StoredValue(storedValue) and the current mapping in the store (oldStoredValue) to decide when it is safe to install a new value:oldStoredValue == null), it installs a new value.oldStoredValue == storedValue, by reference equality), it overwrites that mapping (for example, when it is evaluated tonull).StoredValuefrom the map so that subsequent calls can retry.computeIfAbsentwith the behavior of the deprecatedgetOrComputeIfAbsent, preserving the intended "one initialization per key" semantics while avoiding execution of user code inside the map's critical section.I introduced a
CollidingKeyhelper type in the tests to deterministically force different keys into the sameConcurrentHashMapbucket.I added a regression test
computeIfAbsentDoesNotDeadlockWithCollidingKeysinNamespacedHierarchicalStoreTests:computeIfAbsentfor two differentCollidingKeyinstances in the same namespace.defaultCreatorwaits for the second, while the second only proceeds after the first has started.computeIfAbsentdoes not block subsequent computations on colliding keys.I added an equivalent test
getOrComputeIfAbsentDoesNotDeadlockWithCollidingKeysfor the deprecatedgetOrComputeIfAbsent:CollidingKeyand latches.getOrComputeIfAbsentdoes not exhibit this deadlock pattern, because it already evaluatesdefaultCreatoroutside the critical section.computeIfAbsentwithgetOrComputeIfAbsentis the correct fix.I added
computeIfAbsentOverridesParentNullValueto verify parent/child store semantics purely in terms ofcomputeIfAbsent,put, andget:nullvalue in the parent viaparent.put(namespace, key, null).nullviachild.get(namespace, key).child.computeIfAbsent(namespace, key, __ -> "value")and asserts that the call returns"value"and that subsequentchild.get(namespace, key)also returns"value".nullvalue in the parent is treated as "logically absent" forcomputeIfAbsentand does not prevent the child from installing its own non-null value.I verified the behavior of the new tests against both the old and new implementations:
computeIfAbsentimplementation,computeIfAbsentDoesNotDeadlockWithCollidingKeysfails, whilegetOrComputeIfAbsentDoesNotDeadlockWithCollidingKeysand the existing tests (includingsimulateRaceConditionInComputeIfAbsent) pass.computeIfAbsentimplementation, all tests pass.computeIfAbsentand that the fix restores the expected behavior without changinggetOrComputeIfAbsent.simulateRaceConditionInComputeIfAbsentdid not catch this issue because it only exercises contention on a single key and relies onConcurrentHashMap's per-key atomicity; it does not force different keys into the same bucket or run user code in a way that re-enters the store while the map lock is held, so the problematic interaction never occurs in that testThis change should also fix the flakiness observed in AssertJ's
SoftAssertionsExtension_PER_CLASS_Concurrency_Test(SoftAssertionsExtension_PER_CLASS_Concurrency_Testflaky test assertj/assertj#1996). In that scenario, two tests run in parallel, andSoftAssertionsExtensionuses the JUnitExtensionContextstore (backed byNamespacedHierarchicalStore) to obtain per-testAssertionErrorCollectorinstances viacomputeIfAbsent. With the old implementation, concurrent initialization under the map lock could block or race so that collectors were sometimes never registered, and the engine-level statistics occasionally reportedstarted(0)/failed(0)instead ofstarted(2)/failed(2). With the updatedcomputeIfAbsent, the store computes each collector outside the lock and only once per key, so the collectors are always stored and the test statistics are stable even when the tests execute in parallel.Fixes #5171
I hereby agree to the terms of the JUnit Contributor License Agreement.
Definition of Done
@APIannotations