@jrhee17
Contributor

@jrhee17 jrhee17 commented Sep 23, 2025

Motivation:

This changeset attempts to solve the same problem as #6318.

Retry limiting caps the number of retries when a system undergoes a prolonged period of service degradation.
gRPC offers a token-based configuration that limits retries depending on certain predicates, whereas Envoy offers a simple concurrency-limiting configuration based on the number of active retries.

To support token-based retry limiters, I propose adding RetryDecision#permits as metadata that signals to RetryLimiter whether further retries should be made. This could be useful for systems behind load balancers, as the load balancer may return certain status codes depending on the health of the upstream.
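
For reference, a minimal sketch of the gRPC-style token throttling rule described in gRPC proposal A6: the token count starts at maxTokens, each failure removes one token, each success adds tokenRatio, and retries are allowed only while the count is above maxTokens / 2. The class name `TokenThrottleSketch` and the methods `onSuccess`, `onFailure`, and `shouldRetry` are hypothetical names chosen for illustration. This is not the RetryLimiter implementation in this changeset, and it deliberately avoids assuming how RetryDecision#permits maps to successes and failures.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of gRPC A6-style retry throttling; not this PR's code.
final class TokenThrottleSketch {
    // Tokens are stored scaled by 1000 so that a fractional tokenRatio
    // (up to three decimal places) can be tracked with integer updates.
    private static final int SCALE = 1000;

    private final int maxTokens;   // scaled by SCALE
    private final int tokenRatio;  // scaled by SCALE
    private final int threshold;   // scaled by SCALE; equals maxTokens / 2
    private final AtomicInteger tokens;

    TokenThrottleSketch(int maxTokens, double tokenRatio) {
        if (maxTokens <= 0 || maxTokens > 1000 || !(tokenRatio > 0)) {
            throw new IllegalArgumentException(
                    "maxTokens must be in (0, 1000] and tokenRatio must be > 0");
        }
        this.maxTokens = maxTokens * SCALE;
        // Cap the increment at maxTokens so the int cast cannot overflow.
        this.tokenRatio = (int) Math.min(Math.round(tokenRatio * SCALE), this.maxTokens);
        this.threshold = this.maxTokens / 2;
        this.tokens = new AtomicInteger(this.maxTokens);
    }

    // A successful response adds tokenRatio tokens, capped at maxTokens.
    void onSuccess() {
        tokens.updateAndGet(current -> Math.min(current + tokenRatio, maxTokens));
    }

    // A failed response removes one whole token, floored at zero.
    void onFailure() {
        tokens.updateAndGet(current -> Math.max(current - SCALE, 0));
    }

    // Retries are allowed only while the count is strictly above the threshold.
    boolean shouldRetry() {
        return tokens.get() > threshold;
    }
}
```

With `new TokenThrottleSketch(10, 0.1)`, five consecutive failures bring the count down to the threshold and retries stop until a success pushes the count back above it.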

To support simple concurrency-based retry limiters, I propose that a RetryLimiter#shouldRetry method is called right before a retry is executed.
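
As an illustration of the concurrency-based idea, here is a minimal sketch that caps the number of in-flight retries with a compare-and-set loop. The class `ConcurrencyLimitSketch` and its `tryAcquire`/`release` methods are hypothetical; how this changeset tracks the completion of a retry is not shown here and may differ.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a concurrency-based retry limit; not this PR's code.
final class ConcurrencyLimitSketch {
    private final long maxActiveRetries;
    private final AtomicLong active = new AtomicLong();

    ConcurrencyLimitSketch(long maxActiveRetries) {
        if (maxActiveRetries <= 0) {
            throw new IllegalArgumentException("maxActiveRetries must be > 0");
        }
        this.maxActiveRetries = maxActiveRetries;
    }

    // Called right before a retry would be sent. Reserves a slot and returns
    // true, or returns false when the limit has already been reached.
    boolean tryAcquire() {
        while (true) {
            final long current = active.get();
            if (current >= maxActiveRetries) {
                return false;
            }
            if (active.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    // Must be called exactly once when a retry that acquired a slot completes.
    void release() {
        active.decrementAndGet();
    }

    long activeRetries() {
        return active.get();
    }
}
```

The compare-and-set loop keeps the counter from ever exceeding the limit under contention, which a plain increment followed by a check would not guarantee.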

Modifications:

  • Introduced RetryLimiter which acts as an extension to dynamically limit retries.
    • RetryLimiter#shouldRetry decides whether a retry should be executed
    • RetryLimiter#handleDecision is invoked when a RetryDecision is made. RetryDecision#permits may be used to update the internal state and decide whether retries should be allowed.
  • Added RetryDecision#permits
  • Added APIs so users can set a RetryLimiter on RetryConfig
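
To make the interplay of the two hooks concrete, here is a rough sketch of a retry loop that reports every decision to the limiter and asks it for permission right before the next attempt. All types below (`LimiterSketch`, `FailureBudgetSketch`, `RetryLoopSketch`) are hypothetical stand-ins, not the Armeria classes, and the convention that a positive permit means success and a negative permit means failure is an assumption made only for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntPredicate;

// Hypothetical stand-in for the two hooks described above.
interface LimiterSketch {
    boolean shouldRetry();
    void handleDecision(int permits);
}

// Hypothetical limiter: stops retries after a fixed number of failures.
final class FailureBudgetSketch implements LimiterSketch {
    private final AtomicInteger remaining;

    FailureBudgetSketch(int budget) {
        remaining = new AtomicInteger(budget);
    }

    @Override
    public boolean shouldRetry() {
        return remaining.get() > 0;
    }

    @Override
    public void handleDecision(int permits) {
        // Assumption for this sketch: negative permits signal a failure.
        if (permits < 0) {
            remaining.decrementAndGet();
        }
    }
}

final class RetryLoopSketch {
    // Runs attempts until one succeeds, maxAttempts is reached, or the
    // limiter refuses a retry. Returns the number of attempts made.
    static int execute(IntPredicate attempt, int maxAttempts, LimiterSketch limiter) {
        for (int n = 1; ; n++) {
            final boolean ok = attempt.test(n);
            // Report every decision so the limiter can update its state.
            limiter.handleDecision(ok ? 1 : -1);
            if (ok || n >= maxAttempts) {
                return n;
            }
            // Ask the limiter right before the next attempt would be sent.
            if (!limiter.shouldRetry()) {
                return n;
            }
        }
    }
}
```

For example, `RetryLoopSketch.execute(n -> false, 5, new FailureBudgetSketch(2))` stops after two attempts because the budget is exhausted before the attempt limit is reached.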

Result:

@jrhee17 jrhee17 added this to the 1.34.0 milestone Sep 23, 2025
@jrhee17 jrhee17 marked this pull request as ready for review September 23, 2025 03:34
Contributor

@ikhoon ikhoon left a comment

Overall looks good.

* Returns a {@link RetryLimiter} which limits the number of concurrent retry requests.
* This limiter does not consider {@link RetryDecision#permits()} when limiting retries.
*/
static RetryLimiter concurrencyLimiting(long maxRequests) {
Contributor

(Optional) Would it be useful to support rate limiting by default?

Contributor Author

I forgot to leave a comment on this: the RetryLimiter API doesn't fit well with Guava's RateLimiter, as there is no way to determine whether the state should be rate-limited at a given time.

I'd like to revisit this later on unless I'm missing something/you have a better idea.

* <a href="https://github.com/grpc/proposal/blob/master/A6-client-retries.md#throttling-retry-attempts-and-hedged-rpcs">gRPC's documentation</a>
* for more details.
*/
static RetryLimiter tokenBased(float maxTokens, float tokenRatio) {
Contributor

It seems that int is a more appropriate type for maxTokens. Is there a reason a float is being used?
https://github.com/grpc/proposal/blob/master/A6-client-retries.md#validation-of-retrythrottling

maxTokens MUST be specified and MUST be a JSON integer value in the range (0, 1000].

Comment on lines 73 to 77
if (permits > 0) {
    if (currentCount == 0) {
        break;
    }
    newCount = Math.max(currentCount - THREE_DECIMAL_PLACES_SCALE_UP, 0);
Contributor

The Javadoc says:

each positive {@link RetryDecision#permits()} will increment available tokens by {@code tokenRatio}

which differs from the actual logic.
Is the permit always -1, 0, or 1?
I'm asking because this logic doesn't take the permit value into account, so I was wondering whether we could use a more specific name than 'permit'.

Contributor Author

I've generalized tokenBased. Although users who prefer gRPC-style retry limiting will need to add a little more logic, this probably makes more sense for general users.

@codecov
codecov bot commented Oct 17, 2025

Codecov Report

❌ Patch coverage is 75.22124% with 28 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.13%. Comparing base (8150425) to head (35f7573).
⚠️ Report is 216 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...m/linecorp/armeria/client/retry/RetryDecision.java | 50.00% | 10 Missing ⚠️ |
| ...p/armeria/client/retry/TokenBasedRetryLimiter.java | 65.21% | 5 Missing and 3 partials ⚠️ |
| ...ria/client/retry/ConcurrencyBasedRetryLimiter.java | 66.66% | 4 Missing and 1 partial ⚠️ |
| ...ecorp/armeria/client/retry/RetryConfigBuilder.java | 66.66% | 2 Missing ⚠️ |
| ...rp/armeria/client/retry/RetryLimitedException.java | 80.00% | 0 Missing and 1 partial ⚠️ |
| ...m/linecorp/armeria/client/retry/RetryLimiters.java | 94.11% | 1 Missing ⚠️ |
| ...necorp/armeria/client/retry/RetryingRpcClient.java | 87.50% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #6409      +/-   ##
============================================
- Coverage     74.46%   74.13%   -0.33%     
- Complexity    22234    23036     +802     
============================================
  Files          1963     2067     +104     
  Lines         82437    86222    +3785     
  Branches      10764    11323     +559     
============================================
+ Hits          61385    63924    +2539     
- Misses        15918    16881     +963     
- Partials       5134     5417     +283     

Contributor

@ikhoon ikhoon left a comment

Thanks! 👍 👍

Contributor

@minwoox minwoox left a comment

👍 👍 👍

Development

Successfully merging this pull request may close these issues.

Enable retry throttling

3 participants