Combo falls through to next model on transient 503 instead of retrying #335

@East-rayyy

Problem

When all API keys for a provider are temporarily locked due to transient errors (e.g., Anthropic 503 "No capacity available"), the combo handler immediately falls through to the next model in the chain instead of waiting for the short cooldown to expire and retrying.

The cooldown for these transient errors starts at just 1-2 seconds (exponential backoff: 1s → 2s → 4s...), but the combo handler treats "all accounts temporarily locked" the same as "provider permanently unavailable" and moves on.
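
For reference, the doubling schedule described above can be sketched as follows. This is only an illustration of the 1s → 2s → 4s pattern, not the project's actual code: the name `cooldownMs`, the 1-second base, and the 2-minute cap are all assumptions.

```javascript
// Hypothetical helper: cooldown after the Nth consecutive transient failure.
// BASE_MS and MAX_MS are assumed values, not taken from the project.
const BASE_MS = 1000;
const MAX_MS = 120000;

function cooldownMs(consecutiveFailures) {
  if (consecutiveFailures < 1) return 0;
  // 1 failure -> 1s, 2 -> 2s, 3 -> 4s, ... capped at MAX_MS
  return Math.min(BASE_MS * 2 ** (consecutiveFailures - 1), MAX_MS);
}

console.log([1, 2, 3].map(cooldownMs)); // [ 1000, 2000, 4000 ]
```

The point is that the first lock on a key is very short, so a key locked by a single 503 is usable again almost immediately.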

This causes unnecessary fallthrough to providers that may not work (see #334), when simply waiting 1-2 seconds would have resolved the issue.

Example Flow

  1. Combo: antigravity/claude-opus-4-6-thinking → github/claude-opus-4-6-thinking
  2. Antigravity key seif gets 503 → locked for 1s
  3. Antigravity key personal gets 503 → locked for 1s
  4. handleSingleModelChat returns allRateLimited with retryAfter = 1 second from now
  5. Combo handler receives non-ok response → checks shouldFallback → moves to github
  6. GitHub fails → client gets a fatal error
  7. Meanwhile, both antigravity keys would have been available again in 1 second

Expected Behavior

When a provider returns allRateLimited with a short retryAfter (e.g., under 5-10 seconds), the combo handler should wait for the cooldown to expire and retry the same provider before falling through to the next combo model.
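
One way to express the "short enough to wait" decision is a small pure function. This is a sketch under assumptions: `retryAfter` is taken to be a timestamp (milliseconds since epoch or an ISO string), and the name `shortCooldownMs` and the 5-second default threshold are hypothetical.

```javascript
// Hypothetical check: is the provider's cooldown short enough to wait out?
// `retryAfter` is assumed to be a timestamp (ms since epoch or ISO string).
const MAX_WAIT_MS = 5000; // assumed threshold

function shortCooldownMs(retryAfter, now = Date.now(), maxWaitMs = MAX_WAIT_MS) {
  const until = typeof retryAfter === "number" ? retryAfter : Date.parse(retryAfter);
  if (!Number.isFinite(until)) return null; // unknown cooldown: do not wait
  const remaining = Math.max(0, until - now);
  // Return the wait in ms, or null to signal "fall through to the next model".
  return remaining <= maxWaitMs ? remaining : null;
}
```

Returning `null` for a missing or long cooldown keeps today's fallthrough behavior as the default, so only clearly short lockouts would change anything.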

Suggested Fix

In handleComboChat (open-sse/services/combo.js), after receiving a failed response from handleSingleModel:

  1. Check if the error response contains a Retry-After header or the error body contains retryAfter
  2. If the retry delay is short (under a configurable threshold, e.g., 5-10 seconds), await the delay and retry the same model
  3. Limit retries to 1-2 attempts per model to avoid infinite loops
  4. Only apply this to transient/capacity errors, not permanent failures (401, 403, etc.)
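
The four steps above could look roughly like the following. This is a simplified sketch, not a patch against the real `handleComboChat`: the function name `comboWithShortRetry`, the option names and defaults, the set of statuses treated as transient, and the injectable `sleep` are all assumptions. It reads `Retry-After` only in its seconds form and falls through on any other failure, ignoring the existing `shouldFallback` logic.

```javascript
// Sketch only: names, defaults, and the transient status set are assumptions.
const TRANSIENT_STATUSES = new Set([429, 503]);

// Read a delay in ms from a Retry-After header given in seconds, else null.
function retryDelayMs(response) {
  const raw = response.headers && response.headers.get("Retry-After");
  const seconds = Number(raw);
  return raw != null && raw !== "" && Number.isFinite(seconds) && seconds >= 0
    ? seconds * 1000
    : null;
}

async function comboWithShortRetry({
  models,
  handleSingleModel, // async (model) => Response-like { ok, status, headers }
  maxWaitMs = 5000,
  maxRetriesPerModel = 2,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
}) {
  let last = null;
  for (const model of models) {
    for (let attempt = 0; ; attempt++) {
      last = await handleSingleModel(model);
      if (last.ok) return last;
      // Only transient errors with a short, known delay are retried.
      const delay = TRANSIENT_STATUSES.has(last.status) ? retryDelayMs(last) : null;
      const canRetry =
        delay !== null && delay <= maxWaitMs && attempt < maxRetriesPerModel;
      if (!canRetry) break; // fall through to the next combo model
      await sleep(delay);
    }
  }
  return last; // every model failed: return the last error response
}
```

With these defaults, the example flow would wait about 1 second and retry antigravity up to twice before ever trying github. Permanent failures such as 401 or 403 carry no retryable status here, so they fall through immediately.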
