fix: Rate limiter returns retry after None instead of a duration#1269
fix: Rate limiter returns retry after None instead of a duration#1269nickpismenkov merged 3 commits intostagingfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical issue where rate-limited LLM and embedding providers would return an undefined retry duration, leading to immediate retries or no retries at all. By introducing a default 60-second fallback for the Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request addresses a bug where rate limit errors would not include a retry-after duration, causing issues with backoff and retry logic. The fix involves adding a 60-second fallback duration when the Retry-After header is missing or unparseable. This change has been consistently applied across the anthropic_oauth, nearai_chat, and embeddings providers.
You've also added comprehensive regression tests for each provider to validate the new fallback behavior, which is excellent. My review includes a few suggestions to improve the maintainability of these new tests by refactoring the test helpers to avoid duplicating production logic, aligning with best practices for robust and maintainable test suites.
4413859 to
0503c6c
Compare
Add regression test to src/llm/retry.rs that verifies RateLimited errors always have a fallback duration (never None) due to the 60-second fallback applied in all rate limit error creation sites (nearai_chat.rs, anthropic_oauth.rs, embeddings.rs). The production code fix adds `.or(Some(Duration::from_secs(60)))` to ensure the error message never displays "retry after None" to the user. [skip-regression-check] Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
d5e70e3 to
2d608c3
Compare
zmanian
left a comment
There was a problem hiding this comment.
Review: fix rate limiter "retry after None" display bug
Simple, correct fix. Adds .or(Some(Duration::from_secs(60))) as a fallback when the Retry-After header is missing or unparseable. All 4 HTTP Retry-After parsing sites are covered:
- anthropic_oauth.rs
- nearai_chat.rs
- embeddings.rs (OpenAiEmbeddings + NearAiEmbeddings)
I verified there are no other HTTP Retry-After parsing sites in the codebase -- other retry_after: None occurrences are internally constructed errors with their own fallback handling.
CI fully green.
Positives:
- Complete coverage of all affected sites
- 60s fallback is a reasonable default
- Good test coverage for each module
Minor notes:
- The test helpers (e.g.
parse_retry_after_anthropic_for_test) reimplement parsing logic rather than testing actual production code. If someone later changes the production parsing chain but not the test helper, tests could pass while production is broken. Consider integration-level tests with mock HTTP responses in a follow-up. - The
retry.rstest (commit 3) constructs aRateLimitedwithSome(60s)and asserts it'sSome(60s)-- this doesn't exercise any code path. It's harmless but not providing real regression protection. Option<Duration>is now effectively always-Some from these call sites. Minor type-level inaccuracy but not worth changing.
LGTM.
When the nearai_chat provider hits its rate limit, the error message
shows:
"LLM error: Provider nearai_chat rate limited, retry after None"
The "None" value indicates the retry delay is not being extracted or set correctly.
This also means the agent has no backoff duration to work with and may retry
immediately (causing more rate limit hits) or not retry at all