-
Notifications
You must be signed in to change notification settings - Fork 214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Support for setting top_k for HybridRouteLayer #197
Conversation
I have tested the solution in one of my projects, and seen a change in performance. from semantic_router.encoders import AzureOpenAIEncoder, TfidfEncoder
from semantic_router.hybrid_layer import HybridRouteLayer
model = HybridRouteLayer(
encoder=AzureOpenAIEncoder(...),
sparse_encoder=TfidfEncoder(),
routes=routes,
alpha=0.3, # default = 0.3
top_k=3, # default = 5
) |
hi @andreped — this looks great, we have not focused on the Will complete the review once tests have finished running |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #197 +/- ##
==========================================
- Coverage 77.74% 77.72% -0.02%
==========================================
Files 42 42
Lines 2089 2092 +3
==========================================
+ Hits 1624 1626 +2
- Misses 465 466 +1 ☔ View full report in Codecov by Sentry. |
Hello, @jamescalam! Hybrid search stuff has worked great for me t in the past, so I would think that the HybridRouteLayer would offer similar benefits. By using the base HybridRouteLayer without any modifications, I achieved 4% higher macro F1-score on our application. I further gained a few percentages by lowering topk to 2 or 3 from 5. Runtime was twice as slow with Hybrid compared to without, but with a runtime per prompt of 0.16s this was fine. Now I basically just need to refine the utterances which were labelled wrongly occasionally or were suboptimal to describe some of the routes. I will be exploring it more today, and may make a new PR, if I find one technique to work better for my application. For instance, changing the aggregation method from SUM to MEAN or MAX, would also be of interest. I can make an issue and PR about this separately :] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
hey @andreped, that's awesome — yes I think the |
fixes #198.
Also did the following:
top_k
andalpha
for theHybridRouteLayer
CONTRIBUTIONS.md
to includeall-extras
during installation setup relevant for unit testsCONTRIUBTIONS.md
)