Combo falls through to next model on transient 503 instead of retrying #335

## Problem

When all API keys for a provider are temporarily locked due to transient errors (e.g., Anthropic 503 "No capacity available"), the combo handler immediately falls through to the next model in the chain instead of waiting for the short cooldown to expire and retrying.

The cooldown for these transient errors starts at just **1-2 seconds** (exponential backoff: 1s → 2s → 4s...), but the combo handler treats "all accounts temporarily locked" the same as "provider permanently unavailable" and moves on.
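For illustration, the backoff schedule described above can be sketched as follows. This is only a sketch: the function name `transientCooldownMs`, the 1-second base, and the 60-second cap are assumptions made for the example, not values taken from the project.

```javascript
// Hypothetical helper: cooldown (in ms) after the Nth consecutive transient failure.
// baseMs and maxMs are assumed defaults, not the project's actual settings.
function transientCooldownMs(failureCount, baseMs = 1000, maxMs = 60000) {
  // First failure -> exponent 0 -> baseMs; each further failure doubles the wait.
  const exponent = Math.max(0, failureCount - 1);
  return Math.min(baseMs * 2 ** exponent, maxMs);
}

// Schedule for the first three failures: 1000, 2000, 4000
const schedule = [1, 2, 3].map((n) => transientCooldownMs(n));
```

The point is that the first few cooldowns are short enough that waiting is cheaper than switching providers.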

This causes unnecessary fallthrough to providers that may not work (see #334), when simply waiting 1-2 seconds would have resolved the issue.

## Example Flow

1. Combo: `antigravity/claude-opus-4-6-thinking` → `github/claude-opus-4-6-thinking`
2. Antigravity key `seif` gets 503 → locked for 1s
3. Antigravity key `personal` gets 503 → locked for 1s
4. `handleSingleModelChat` returns `allRateLimited` with `retryAfter` = 1 second from now
5. Combo handler receives non-ok response → checks `shouldFallback` → moves to `github`
6. GitHub fails → client gets a fatal error
7. Meanwhile, both antigravity keys would have been available again in 1 second

## Expected Behavior

When a provider returns `allRateLimited` with a **short** `retryAfter` (e.g., under 5-10 seconds), the combo handler should **wait** for the cooldown to expire and **retry** the same provider before falling through to the next combo model.
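A minimal sketch of the "is this cooldown short enough to wait for?" check, assuming the result object exposes `allRateLimited` and a `retryAfter` timestamp as in the example flow. The helper names, the 5-second default threshold, and the assumption that `retryAfter` is an epoch-millisecond number or a date string are all hypothetical.

```javascript
// Hypothetical helper: milliseconds until retryAfter, or null if it cannot be parsed.
// Assumes retryAfter is an epoch-ms number, a Date, or a date string.
function msUntilRetry(retryAfter, nowMs = Date.now()) {
  if (retryAfter === null || retryAfter === undefined) return null;
  const at = new Date(retryAfter).getTime();
  if (Number.isNaN(at)) return null;
  return Math.max(0, at - nowMs);
}

// Hypothetical helper: true when every key is locked but the lock expires soon.
function isShortCooldown(result, { thresholdMs = 5000, nowMs = Date.now() } = {}) {
  if (!result || !result.allRateLimited) return false;
  const waitMs = msUntilRetry(result.retryAfter, nowMs);
  return waitMs !== null && waitMs <= thresholdMs;
}
```

With a check like this, the combo handler could distinguish "locked for one second" from "unavailable for a long time" before deciding to fall through.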

## Suggested Fix

In `handleComboChat` (open-sse/services/combo.js), after receiving a failed response from `handleSingleModel`:

1. Check if the error response contains a `Retry-After` header or the error body contains `retryAfter`
2. If the retry delay is short (under a configurable threshold, e.g., 5-10 seconds), `await` the delay and retry the same model
3. Limit retries to 1-2 attempts per model to avoid infinite loops
4. Only apply this to transient/capacity errors, not permanent failures (401, 403, etc.)
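The four steps above could be sketched roughly as below. This is not the actual `combo.js` code: `callWithShortCooldownRetry`, `readRetryDelayMs`, the set of transient status codes, and the default limits are assumptions for illustration, and the response is assumed to be a Fetch-style `Response` whose JSON body may carry a `retryAfter` timestamp.

```javascript
// Assumed set of statuses treated as transient; 401/403 and similar are never retried.
const TRANSIENT_STATUSES = new Set([429, 502, 503, 504]);

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Step 1: read the delay from the Retry-After header, else from body.retryAfter.
async function readRetryDelayMs(response, nowMs) {
  const header = response.headers.get("retry-after");
  if (header !== null && header.trim() !== "") {
    const seconds = Number(header);
    if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
    const at = Date.parse(header); // Retry-After may also be an HTTP date
    if (!Number.isNaN(at)) return Math.max(0, at - nowMs);
  }
  try {
    // Clone so the caller can still read the original body afterwards.
    const body = await response.clone().json();
    if (body && body.retryAfter != null) {
      const at = new Date(body.retryAfter).getTime();
      if (!Number.isNaN(at)) return Math.max(0, at - nowMs);
    }
  } catch {
    // Body was not JSON or was already consumed: no usable delay.
  }
  return null;
}

// Steps 2-4: wait and retry the same model only for short, transient failures.
async function callWithShortCooldownRetry(callModel, options = {}) {
  const { maxRetries = 2, thresholdMs = 5000, sleep = defaultSleep, now = Date.now } = options;
  let response = await callModel();
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    if (response.ok || !TRANSIENT_STATUSES.has(response.status)) break;
    const delayMs = await readRetryDelayMs(response, now());
    if (delayMs === null || delayMs > thresholdMs) break;
    await sleep(delayMs);
    response = await callModel();
  }
  return response; // caller applies its existing fallback logic to this result
}
```

Inside the combo loop, the existing per-model call could be wrapped as `callWithShortCooldownRetry(() => handleSingleModel(...))`, leaving the current `shouldFallback` check to run on whatever response comes back. Bounding the loop with `maxRetries` keeps the worst-case added latency at roughly `maxRetries × thresholdMs` per model.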
