Calculation of "5:1 -- Cost Ratio of generation of text using GPT-3.5-Turbo vs OpenAI embedding" #18
Comments
Hey Theo, Thanks for reaching out! The point is you don't use an LM, you use a vector database. You embed questions like capitals in a semantic search index. This includes products/OSS projects like FAISS, Chroma and commercial products like Vectara, Pinecone, etc. These are all examples of the general body of techniques called retrieval augmented generation.
Thank you for your answer Waleed! I understand the tools you mentioned help with the retrieval part; however, in retrieval augmented generation you have a retriever and a generator, the generator being a language model. You could just retrieve the document without using a language model, but that would just be called document retrieval. In both scenarios, creating embeddings, indexing, and performing semantic search are necessary steps. Regarding the 5:1 ratio, you mentioned vector lookup being considered free, but I'm curious about the calculations behind it, especially given potential expenses with large document sets. Add to that the cost of the generator, which I'm sure would be cheaper than an API call to GPT-3.5 Turbo, since you don't need a model that big once you feed it the info it needs on a case-by-case basis, but it still requires infrastructure to run on. Could you please provide insight into the calculation for the 5:1 ratio? Appreciate your help!
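As an aside, the retriever/generator split Theo describes can be sketched with a toy nearest-neighbor lookup. The documents, vectors, and query embedding below are all made up for illustration; a real system would use an embedding model (e.g. Ada v2) and a vector index such as FAISS or Chroma rather than hand-written vectors.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical corpus with made-up 3-d "embeddings".
corpus = {
    "The capital of Delaware is Dover.": [0.9, 0.1, 0.0],
    "FAISS is a library for similarity search.": [0.1, 0.8, 0.2],
}

def retrieve(query_vec, k=1):
    # Retriever step: rank documents by similarity to the query embedding.
    ranked = sorted(corpus, key=lambda d: cosine(corpus[d], query_vec),
                    reverse=True)
    return ranked[:k]

# Pretend embedding of "What is the capital of Delaware?"
docs = retrieve([0.85, 0.15, 0.05])
# Generator step (omitted): feed `docs` plus the question to an LM;
# for pure document retrieval, return `docs` directly.
print(docs[0])
```

The cost question in the thread is exactly about these two steps: embedding/lookup on the retriever side versus token generation on the generator side.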
Sure, here's the calculation breakdown. For text generation, GPT-3.5 Turbo with 4K context: $0.0015 per 1K input tokens, $0.002 per 1K output tokens. For embeddings, Ada v2: $0.0001 per 1K tokens. If you have further questions or need clarification, feel free to ask!
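For reference, the per-1K-token ratios implied by the prices quoted above work out as follows (plain arithmetic, not an official figure; notably they come to 15:1 and 20:1 rather than 5:1, which may reflect a change in pricing since the original footnote):

```python
# Prices quoted above, in $ per 1K tokens.
gpt35_input = 0.0015   # GPT-3.5 Turbo, 4K context, input
gpt35_output = 0.002   # GPT-3.5 Turbo, 4K context, output
ada_v2 = 0.0001        # Ada v2 embeddings

ratio_in = gpt35_input / ada_v2    # generation input vs. embedding
ratio_out = gpt35_output / ada_v2  # generation output vs. embedding
print(f"input ratio ~ {ratio_in:.0f}:1, output ratio ~ {ratio_out:.0f}:1")
```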
Hi, could you share the calculation for this one, and maybe add it to the footnote? I'm not sure I understand it.
Here are the numbers I find on the OpenAI pricing page:
- GPT-3.5 Turbo text generation
- Embeddings
You give the example of answering "What is the capital of Delaware?". If you had to answer this question with an LM that doesn't have the info in its weights, you'd have to embed all the documents of a corpus that contains the answer. You could choose an arbitrarily narrow scope, but that could also be the whole of Wikipedia, which is something like 5.6B tokens and would cost something like 5.6e9/1000 * $0.0001 = $560 just to index.
What am I missing?
Thanks!
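Theo's back-of-the-envelope indexing cost checks out arithmetically (the ~5.6B-token figure for Wikipedia is his estimate, not an official count):

```python
# One-time cost of embedding a ~5.6B-token corpus with Ada v2.
tokens = 5.6e9         # estimated tokens in English Wikipedia (from above)
price_per_1k = 0.0001  # $ per 1K tokens, Ada v2
cost = tokens / 1000 * price_per_1k
print(f"${cost:.0f}")
```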