For the OpenAI embedding model
When applying the OpenAI embedding model text-embedding-ada-002
to textual content and storing the results in a vector database, you are subject to rate limits on tokens per minute (TPM) and requests per minute (RPM) that apply to all OpenAI models (see the table below for some examples).
Model | TPM | RPM |
---|---|---|
gpt-3.5-turbo | 90,000 | 3,500 |
gpt-3.5-turbo-16k | 180,000 | 3,500 |
gpt-4 | 10,000 | 200 |
text-embedding-ada-002 | 1,000,000 | 3,000 |
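For lookup purposes, the limits above can be kept in a small table like the one below. These values mirror the table and are illustrative defaults; actual limits vary by account tier, so check your own account's limits.

```python
# TPM/RPM limits per model, mirroring the table above.
# Treat these as illustrative defaults; real limits vary by account tier.
RATE_LIMITS = {
    "gpt-3.5-turbo":          {"tpm": 90_000,    "rpm": 3_500},
    "gpt-3.5-turbo-16k":      {"tpm": 180_000,   "rpm": 3_500},
    "gpt-4":                  {"tpm": 10_000,    "rpm": 200},
    "text-embedding-ada-002": {"tpm": 1_000_000, "rpm": 3_000},
}
```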
To avoid hitting the rate limit and stalling the embedding operation, here is a simple Python code snippet that uses binary search to estimate how many documents can be processed before reaching the TPM limit.
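The core idea can be sketched as follows. Here `estimate_tokens` is a rough character-based heuristic (for exact counts you could use tiktoken's cl100k_base encoding instead), and the binary search runs over cumulative token counts via the standard `bisect` module; the function names are illustrative, not from the actual `estimator.py`.

```python
import bisect
from itertools import accumulate

TPM_LIMIT = 1_000_000  # text-embedding-ada-002 tokens-per-minute limit


def estimate_tokens(text: str) -> int:
    # Rough heuristic: roughly 4 characters per token for English text.
    # Swap in tiktoken for exact token counts.
    return max(1, len(text) // 4)


def docs_before_limit(docs: list[str], tpm: int = TPM_LIMIT) -> int:
    """Largest number of leading docs whose total tokens fit within `tpm`,
    found by binary search over the cumulative token counts."""
    cumulative = list(accumulate(estimate_tokens(d) for d in docs))
    # bisect_right returns how many prefix sums are <= tpm,
    # i.e. how many docs can be embedded before hitting the limit.
    return bisect.bisect_right(cumulative, tpm)
```

For example, with five documents of roughly 100 tokens each and a budget of 250 tokens, only the first two documents fit under the limit.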
- Prepare your own documents in your designated folder
- Define your personal configuration and load the config file in the function below:

```python
import os

def get_config_path():
    cwd = os.getcwd()
    return os.path.join(cwd, "<YOUR_JSON_CONFIG_FILE>")
```
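For illustration, reading that config with the standard library might look like the sketch below; the filename `config.json` stands in for `<YOUR_JSON_CONFIG_FILE>`, and `load_config` is a hypothetical helper, not part of the original script.

```python
import json
import os

def get_config_path():
    # "config.json" is a stand-in for <YOUR_JSON_CONFIG_FILE>.
    return os.path.join(os.getcwd(), "config.json")

def load_config() -> dict:
    # Parse the JSON config into a plain dict for the estimator to use.
    with open(get_config_path()) as f:
        return json.load(f)
```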
- Set up your dev environment

```shell
python -m venv myvenv
source myvenv/bin/activate
```
- Install the required libraries

```shell
pip install -r requirements.txt
```
- Run the code

```shell
python estimator.py
```