feat(config): support multiple API keys for failover #1707

Merged
yinwm merged 2 commits into sipeed:main from liuy:feat/multi-api-key-failover
Mar 18, 2026


@liuy liuy commented Mar 17, 2026

Summary

Add api_keys field to ModelConfig to support multiple API keys with automatic failover. When multiple keys are configured, they are expanded into separate model entries with fallbacks set up for key-level failover.

Also changes cooldown tracking granularity from provider-level to (provider, model)-level, so that a rate-limited key no longer blocks the other keys that share the same provider.

Failover Flow

┌─────────────────────────────────────────────────────────────────┐
│                    Config: api_keys + fallbacks                  │
├─────────────────────────────────────────────────────────────────┤
│  {                                                              │
│    "model_name": "glm-4.7",                                     │
│    "model": "zhipu/glm-4.7",                                    │
│    "api_keys": ["k1", "k2"],           ← multiple keys          │
│    "fallbacks": ["minimax/m2.5"]      ← cross-model fallback    │
│  }                                                              │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼ ExpandMultiKeyModels()
┌─────────────────────────────────────────────────────────────────┐
│                      Expanded Models                             │
├─────────────────────────────────────────────────────────────────┤
│  [0] glm-4.7          (k1)  Fallbacks: [glm-4.7__key_1, minimax]│
│  [1] glm-4.7__key_1   (k2)  Fallbacks: [minimax]                │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼ FallbackChain.Execute()
┌─────────────────────────────────────────────────────────────────┐
│                      Failover Order                             │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   Request ──► glm-4.7 ──(429)──► glm-4.7__key_1 ──(429)──► ... │
│                 │                    │                          │
│              k1 fails              k2 fails                     │
│                 │                    │                          │
│                 └────────────────────┴───► minimax ──(ok)──► ✓ │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Key-level failover:  glm-4.7(k1) → glm-4.7__key_1(k2) → ...
Model-level fallback: zhipu keys exhausted → minimax

Cooldown Fix

Before: Cooldown tracked per-provider → all keys blocked when one fails

k1(429) ──► provider "zhipu" in cooldown ──► k2 also blocked ✗

After: Cooldown tracked per (provider, model) → each key has independent cooldown

k1(429) ──► "zhipu/glm-4.7" in cooldown ──► k2 (glm-4.7__key_1) still works ✓

Usage

{
  "model_name": "glm-4.7",
  "model": "zhipu/glm-4.7",
  "api_keys": ["key1", "key2", "key3"]
}

Expands internally to:

  • glm-4.7 (uses key1) → fallbacks: [glm-4.7__key_1, glm-4.7__key_2]
  • glm-4.7__key_1 (uses key2)
  • glm-4.7__key_2 (uses key3)

Backward Compatibility

Single api_key still works as before:

{"model_name": "gpt-4", "model": "openai/gpt-4o", "api_key": "single-key"}

Test Plan

Config expansion tests:

  • TestExpandMultiKeyModels_SingleKey - single key unchanged
  • TestExpandMultiKeyModels_APIKeysOnly - array only config
  • TestExpandMultiKeyModels_APIKeyAndAPIKeys - both fields merged
  • TestExpandMultiKeyModels_WithExistingFallbacks - prepends to existing fallbacks
  • TestExpandMultiKeyModels_EmptyAPIKeys - empty array handled
  • TestExpandMultiKeyModels_Deduplication - duplicate keys removed
  • TestExpandMultiKeyModels_PreservesOtherFields - other config preserved
  • TestMergeAPIKeys - key merging logic

Failover behavior tests:

  • TestMultiKeyFailover - key1 429 → key2 success
  • TestMultiKeyFailoverAllFail - all keys fail → FallbackExhaustedError
  • TestMultiKeyFailoverCooldown - key1 in cooldown → skipped
  • TestMultiKeyFailoverWithFormatError - non-retriable error → no fallback
  • TestMultiKeyWithModelFallback - keys exhausted → model fallback

@liuy liuy force-pushed the feat/multi-api-key-failover branch 3 times, most recently from 23f8c0e to e8058ae Compare March 17, 2026 17:55
Add api_keys field to ModelConfig to support multiple API keys with
automatic failover. When multiple keys are configured, they are expanded
into separate model entries with fallbacks set up for key-level failover.

Example config:
  {
    "model_name": "glm-4.7",
    "model": "zhipu/glm-4.7",
    "api_keys": ["key1", "key2", "key3"]
  }

Expands internally to:
  - glm-4.7 (key1) -> fallbacks: [glm-4.7__key_1, glm-4.7__key_2]
  - glm-4.7__key_1 (key2)
  - glm-4.7__key_2 (key3)

Backward compatible: single api_key still works as before.
@liuy liuy force-pushed the feat/multi-api-key-failover branch from e8058ae to ca153d5 Compare March 17, 2026 23:49
This enables proper key-switching when multiple API keys share the same
provider. Previously, when one key failed, all keys were blocked because
cooldown was tracked per-provider.

Now each (provider, model) combination has independent cooldown, allowing
fallback to alternate keys when one is rate limited.

Includes TestMultiKeyWithModelFallback and related failover tests.
@liuy liuy force-pushed the feat/multi-api-key-failover branch from ca153d5 to 38e144d Compare March 17, 2026 23:54

@yinwm yinwm left a comment


Pre-Landing Review: Approved ✅

Scope Check: CLEAN

  • Intent: Support multiple API keys with automatic failover + fix cooldown tracking granularity
  • Delivered: Complete implementation of api_keys field, ExpandMultiKeyModels expansion, cooldown changed from provider-level to (provider, model)-level

Backward Compatibility: 100%

  • Old configs with single api_key work unchanged
  • New fields (api_keys, fallbacks) are optional with omitempty
  • Cooldown granularity change is a positive improvement

Pass 1 (CRITICAL):

  • ✅ SQL & Data Safety - Not applicable
  • ✅ Race Conditions - Cooldown tracking correctly uses ModelKey(provider, model)
  • ✅ LLM Output Trust Boundary - Not applicable
  • ✅ Enum & Value Completeness - No new enum values

Pass 2 (INFORMATIONAL):

  • ✅ Conditional Side Effects - Fallback chain logic is clear
  • ✅ Magic Numbers - __key_ suffix is a reasonable naming convention
  • ✅ Dead Code - Clean code, comprehensive test coverage
  • ✅ Test Gaps - Tests cover: single key, multi key, mixed key scenarios, fallback chain, cooldown skip, format error non-retry, cross-model fallback combination

Code Quality Highlights:

  1. ExpandMultiKeyModels correctly copies all necessary fields (RPM, Timeout, ThinkingLevel, etc.)
  2. resolveAPIKeys handles both APIKey and APIKeys array decryption
  3. Backward compatible: single api_key continues to work normally

Optional suggestion (non-blocking):

  • Consider extracting a cloneModelConfig helper function if more fields are added in the future

Conclusion: High-quality PR with clear implementation, comprehensive tests, and full backward compatibility. Ready to merge.

@yinwm yinwm merged commit e73d9d9 into sipeed:main Mar 18, 2026
4 checks passed