@schiemon schiemon commented Sep 15, 2025

Motivation

RetryingRpcClientTest.doNotRetryWhenResponseIsCancelled is brittle because it assumes that cancelling the response after delegating to RetryingRpcClient will always result in exactly one attempt. However, there is no guarantee about the number of attempts made between res = delegate.execute() and res.cancel().

I propose rewriting the test to verify two guarantees:

  1. If the response is cancelled before RetryingRpcClient.execute is called, RetryingRpcClient performs no attempts.
  2. If the response is cancelled after RetryingRpcClient.execute is called, RetryingRpcClient performs at most one additional attempt after cancellation.

The first guarantee is straightforward to test. For the second, we currently lack a way to signal cancellation before invoking RetryingRpcClient.execute(ctx, req), since we do not yet have a handle to the response. Additionally, RetryingRpcClient currently ignores ctx.isCancelled() during retries.

To address this, I propose adding a ctx.isCancelled() check before each retry attempt. This allows the doNotRetryWhenResponseIsCancelled test to call ctx.cancel() before proceeding through the decorator chain, ensuring RetryingRpcClient observes the cancelled state and skips further retries.
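The proposed check can be illustrated with a minimal sketch. This is a simplified model, not Armeria's actual `RetryingRpcClient` code: `SketchContext`, `RetrySketch`, and `executeWithRetry` are hypothetical names introduced only for illustration, and the loop is synchronous whereas the real client schedules retries asynchronously. It only shows the idea of consulting the cancellation flag before every attempt, including the first.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for a request context; NOT Armeria's ClientRequestContext. */
final class SketchContext {
    private final AtomicBoolean cancelled = new AtomicBoolean();

    void cancel() {
        cancelled.set(true);
    }

    boolean isCancelled() {
        return cancelled.get();
    }
}

/** Hypothetical retry loop that checks for cancellation before each attempt. */
final class RetrySketch {
    private RetrySketch() {}

    /**
     * Runs {@code attempt} until it returns true, {@code maxAttempts} is reached,
     * or {@code ctx} is cancelled. Returns the number of attempts actually made.
     */
    static int executeWithRetry(SketchContext ctx, int maxAttempts, BooleanSupplier attempt) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            // The check proposed in this PR, modelled here: skip the attempt if cancelled.
            if (ctx.isCancelled()) {
                break;
            }
            attempts++;
            if (attempt.getAsBoolean()) {
                break; // the attempt succeeded, so no retry is needed
            }
        }
        return attempts;
    }
}
```

In this model, cancelling the context before calling `executeWithRetry` yields zero attempts, and cancelling from inside an attempt lets that attempt finish but prevents any further one.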

See also this comment for additional context.

Modifications

  • Added a ctx.isCancelled() check in RetryingRpcClient before scheduling each retry.
  • Enhanced RetryingRpcClientTest.doNotRetryWhenResponseIsCancelled to cancel the request/response with varying delays.
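
The "at most one additional attempt after cancellation" bound can be sketched with a toy concurrent model. This is not the actual test in this PR: `CancelDelaySketch` and `attemptsStartedAfterCancel` are hypothetical names, and a plain `AtomicBoolean` stands in for the context's cancelled state. A worker thread retries in a loop that checks the flag before each attempt, while the calling thread sets the flag after a configurable delay. An attempt can slip through only if the flag flips between the check and the start of the attempt, so the count of such late attempts is 0 or 1.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of cancelling a retry loop from another thread after a delay. */
final class CancelDelaySketch {
    private CancelDelaySketch() {}

    /**
     * Starts an always-failing retry loop on a worker thread, cancels it after
     * {@code delayMillis}, and returns how many attempts began after the
     * cancellation flag was already set.
     */
    static int attemptsStartedAfterCancel(long delayMillis) {
        AtomicBoolean cancelled = new AtomicBoolean();
        AtomicInteger lateAttempts = new AtomicInteger();

        Thread worker = new Thread(() -> {
            while (!cancelled.get()) {       // check before each attempt
                // The flag may have flipped between the check and this point;
                // if so, this attempt counts as started after cancellation.
                if (cancelled.get()) {
                    lateAttempts.incrementAndGet();
                }
                Thread.yield();              // the "attempt" fails, so the loop retries
            }
        });
        worker.start();

        try {
            Thread.sleep(delayMillis);
            cancelled.set(true);
            worker.join();
        } catch (InterruptedException e) {
            cancelled.set(true);             // make sure the worker stops
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return lateAttempts.get();
    }
}
```

Calling `attemptsStartedAfterCancel` with several delays (for example 0, 1, 5, and 20 milliseconds) exercises different interleavings; the assertion in each case is only that the result never exceeds one, not that it equals a particular value.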

Result

  • RetryingRpcClientTest.doNotRetryWhenResponseIsCancelled is no longer flaky.
  • RetryingRpcClient now respects ctx.isCancelled() and stops retrying once the context is cancelled.

@schiemon schiemon force-pushed the fix-flaky-doNotRetryWhenResponseIsCancelled-take-2 branch from f021914 to 5316498 Compare September 15, 2025 10:20
schiemon referenced this pull request Sep 15, 2025
…#6382)

Motivation:

I thought the previous modification #6354 would fix the flakiness, but
it turned out it didn't.

The first attempt may not be cancelled by `res.cancel(true)` because of
the race between an event loop and the main thread.
https://github.com/line/armeria/blob/a0e2225cb7aac6b229b86e25e9b2f34633a9f45b/thrift/thrift0.13/src/test/java/com/linecorp/armeria/it/client/retry/RetryingRpcClientTest.java#L354-L356
Therefore, I propose removing the flaky assertions and adding a new
clear assertion that verifies retry occurs only once.

Modifications:

- Remove flaky assertions in
`RetryingRpcClientTest.doNotRetryWhenResponseIsCancelled()`

Result:

Make CI stable
@schiemon schiemon marked this pull request as ready for review September 15, 2025 11:11
@ikhoon ikhoon left a comment

Looks good!

@minwoox minwoox left a comment

Thanks!

codecov bot commented Sep 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.09%. Comparing base (8150425) to head (f6cc34e).
⚠️ Report is 189 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #6399      +/-   ##
============================================
- Coverage     74.46%   74.09%   -0.37%     
- Complexity    22234    22993     +759     
============================================
  Files          1963     2061      +98     
  Lines         82437    86125    +3688     
  Branches      10764    11310     +546     
============================================
+ Hits          61385    63813    +2428     
- Misses        15918    16898     +980     
- Partials       5134     5414     +280     

@ikhoon ikhoon merged commit b450943 into line:main Sep 25, 2025
15 of 16 checks passed

4 participants