
Centralizing Rate Limits #427

Closed · wants to merge 3 commits
Conversation

whitead (Collaborator) commented Sep 17, 2024

I created an OpenAI project set at tier 1 usage and tried to get LiteLLM to work with the lower rate limits. I centralized the Router so that it originates from the settings, but I am still running into a lot of issues.

I've built a Router with:

```python
from litellm import Router

config = {
    "model_list": [
        {
            "model_name": "gpt-4o-2024-08-06",
            "litellm_params": {"model": "gpt-4o-2024-08-06", "temperature": 0.0},
            "rpm": 500,
            "tpm": 30000,
        }
    ],
    "num_retries": 3,
    "retry_after": 10,
    "enable_pre_call_checks": True,
}
router = Router(**config)
```
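For context, the concurrency pattern is roughly the following (a minimal sketch; `ask()` and the synthetic prompts are illustrative, not the actual calling code):

```python
import asyncio

async def ask(router, prompt: str) -> str:
    # Router.acompletion is LiteLLM's async entry point for routed calls.
    response = await router.acompletion(
        model="gpt-4o-2024-08-06",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

async def main(router):
    prompts = [f"Question {i}" for i in range(100)]
    # Many requests in flight at once; the Router's rpm/tpm caps should
    # throttle these, but in practice 429s still come back from OpenAI.
    return await asyncio.gather(*(ask(router, p) for p in prompts))
```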

The issues, seen with async concurrency, are:

  1. The rate limits are not respected and we're getting 429s from OpenAI. LiteLLM also does not respect the wait time that the 429 message asks for (e.g., "try again in 3.5 seconds"); a manual workaround is sketched after this list.
  2. The logs are full of "Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new" regardless of my log settings.
  3. Even if it did work, we have a fundamental mismatch in usage. Because of how gpt-4o rate limits work, we need to share rate limits across multiple models, but litellm will only share them across deployments (which are keyed to one model or group of models). This is the opposite of what we need.
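For issue 1, a workaround sketch: catch the 429 and parse the suggested wait out of the error message ourselves, assuming the OpenAI message includes a "try again in Ns" hint. This is not LiteLLM behavior, just what respecting the hint could look like; the helper name and fallback wait are illustrative:

```python
import asyncio
import re

import litellm

async def completion_with_retry_hint(router, max_attempts: int = 3, **kwargs):
    """Retry on 429, sleeping for the wait time the error message suggests.

    Assumes the 429 message contains something like 'try again in 3.5s';
    falls back to a fixed 10 s wait when no hint is found.
    """
    for attempt in range(max_attempts):
        try:
            return await router.acompletion(**kwargs)
        except litellm.RateLimitError as err:
            if attempt == max_attempts - 1:
                raise
            match = re.search(r"try again in ([\d.]+)\s*s", str(err))
            wait = float(match.group(1)) if match else 10.0
            await asyncio.sleep(wait)
```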

I reproduced these errors in the minimal file router_test.py, which triggers 429s even though the rate limits are set to match my OpenAI rate limits.

Opening this PR to start a discussion on how to proceed.
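One possible shape for issue 3, to seed the discussion: a single client-side sliding-window limiter shared by every gpt-4o snapshot, rather than one bucket per deployment. This is a sketch under that assumption; `SharedRateLimiter` and the wiring are hypothetical, not an existing LiteLLM feature:

```python
import asyncio
import time

class SharedRateLimiter:
    """One requests-per-minute budget shared across many model names."""

    def __init__(self, rpm: int):
        self.rpm = rpm
        self._timestamps: list[float] = []
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        while True:
            async with self._lock:
                now = time.monotonic()
                # Drop request timestamps older than the 60 s window.
                self._timestamps = [t for t in self._timestamps if now - t < 60]
                if len(self._timestamps) < self.rpm:
                    self._timestamps.append(now)
                    return
                wait = 60 - (now - self._timestamps[0])
            await asyncio.sleep(wait)

# All gpt-4o snapshots draw from the same 500 rpm budget,
# matching how the account-level limit is actually accounted.
gpt4o_limiter = SharedRateLimiter(rpm=500)

async def limited_completion(router, **kwargs):
    await gpt4o_limiter.acquire()
    return await router.acompletion(**kwargs)
```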

whitead (Collaborator, Author) commented Sep 28, 2024

Closing in favor of a new plan.

whitead closed this on Sep 28, 2024