Contribute advanced Hybrid search example in OpenAI Cookbook (python) #7

Open
tylerhutcherson opened this issue May 2, 2023 · 4 comments
Labels
enhancement New feature or request python

Comments

@tylerhutcherson
Collaborator

The existing cookbook only scratches the surface:
https://github.com/openai/openai-cookbook/blob/main/examples/vector_databases/redis/getting-started-with-redis-and-openai.ipynb

Contribute a Python notebook that demonstrates complex Hybrid queries with Redis VSS and other search features (an ecommerce dataset might work nicely), including:

  • Numeric range filters
  • Tag filters
  • Full text search "filters"
  • Client-side hybrid scoring combining both BM25 lexical AND semantic search. This could be done in a pipeline that sends one Redis call to fetch both search results (top K) and then merges the sets. Show the performance improvement of this technique over pure lexical or pure semantic search?
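
For illustration, the first three bullets could be combined into a single query string along these lines. This is only a rough sketch: the helper `build_hybrid_query`, the field names (`price`, `category`, `title`, `embedding`), and the `$vector` parameter name are all hypothetical, and the index schema is assumed to define matching NUMERIC, TAG, TEXT, and VECTOR fields.

```python
def build_hybrid_query(
    text=None,
    tag_field=None,
    tag_values=None,
    num_field=None,
    num_min=None,
    num_max=None,
    vector_field="embedding",  # hypothetical vector field name
    k=5,
):
    """Compose a filter expression and attach a KNN clause to it.

    All field names are placeholders; they must match the index schema.
    """
    filters = []
    if num_field is not None:
        # Numeric range filter, open-ended when a bound is missing.
        lo = "-inf" if num_min is None else num_min
        hi = "+inf" if num_max is None else num_max
        filters.append(f"@{num_field}:[{lo} {hi}]")
    if tag_field and tag_values:
        # Tag filter; several values are OR-ed together with "|".
        filters.append(f"@{tag_field}:{{{' | '.join(tag_values)}}}")
    if text:
        # Full-text "filter", passed through as-is (e.g. '@title:running').
        filters.append(text)
    prefilter = "(" + " ".join(filters) + ")" if filters else "*"
    # The query vector itself is supplied separately as the $vector parameter.
    return f"{prefilter}=>[KNN {k} @{vector_field} $vector AS vector_score]"


query = build_hybrid_query(
    text="@title:running",
    tag_field="category",
    tag_values=["shoes", "boots"],
    num_field="price",
    num_min=10,
    num_max=50,
    k=3,
)
print(query)
```

The resulting string would still need to be sent to Redis together with the query vector bytes; that part is left out here because it depends on the client setup.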
@tylerhutcherson tylerhutcherson added enhancement New feature or request python labels May 2, 2023
@tylerhutcherson tylerhutcherson changed the title Add advanced Hybrid search example in OpenAI Cookbook Contribute advanced Hybrid search example in OpenAI Cookbook (python) May 2, 2023
@michaelskyuan

Submitted PR

@tylerhutcherson
Collaborator Author

Initial review submitted from our end >>> openai/openai-cookbook#417

@michaelskyuan at some point we will also want to update this notebook to cover bullet point 4 above. This is a bit "green field" in the sense that we have not yet explicitly tried it. But it's theoretically possible to do true weighted hybrid search by using a Redis pipeline command and "merging" the results from the two scoring algorithms (BM25 + KNN/CosineD). I sense the lift will be a bit higher on this, and since it is not immediately pressing, I will spin it off into a separate issue that we can re-prioritize when the time is right, probably in the next month.
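
To make the "merging" idea concrete, here is a minimal client-side sketch. It has not been tried against Redis; it assumes the two result sets have already been fetched and reduced to `{doc_id: score}` dictionaries. The function names, the `alpha` weight, and the choice of min-max normalization are assumptions made for illustration, not something settled in this thread.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every document as a top match.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def merge_hybrid(bm25_scores, cosine_distances, alpha=0.5, top_k=5):
    """Blend lexical and semantic results into one ranked list.

    alpha weights the semantic side; (1 - alpha) weights the lexical side.
    A document missing from one result set contributes 0 for that side.
    """
    lexical = min_max(bm25_scores)
    # Cosine distance is "lower is better", so flip it into a similarity first.
    semantic = min_max({doc: 1.0 - d for doc, d in cosine_distances.items()})
    combined = {
        doc: alpha * semantic.get(doc, 0.0) + (1 - alpha) * lexical.get(doc, 0.0)
        for doc in set(lexical) | set(semantic)
    }
    # Highest blended score first; ties are broken by document id.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Made-up scores, only to show the call shape.
bm25 = {"a": 10.0, "b": 5.0, "c": 0.0}
knn = {"a": 0.4, "b": 0.1, "d": 0.2}
ranked = merge_hybrid(bm25, knn, alpha=0.5)
```

Setting `alpha=1.0` falls back to pure semantic ranking and `alpha=0.0` to pure lexical ranking, which would make it straightforward to compare the three approaches on the same queries.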

@michaelskyuan

I agree @tylerhutcherson. And I believe this topic deserves its own separate notebook with a denser text dataset.
Let's leave the OOTB Redis Hybrid search functionality in the current notebook and have a specific notebook that addresses normalization of lexical and semantic scoring, using a more appropriate dataset than an ecommerce one.

@Spartee
Contributor

Spartee commented May 26, 2023

@michaelskyuan This App/notebook was recently contributed by OpenAI. We could use some of this.


3 participants