feat: Support for setting top_k for HybridRouteLayer #197

andreped · 2024-03-13T18:20:46Z

fixes #198.

Also did the following:

Added unit tests for top_k and alpha for the HybridRouteLayer
Updated CONTRIBUTIONS.md to include --all-extras in the installation setup, which the unit tests need
Performed linting (following CONTRIBUTIONS.md)

andreped · 2024-03-13T20:59:39Z

I have tested the solution in one of my projects and saw a change in performance.
To use this new feature, pass top_k when constructing the layer:

from semantic_router.encoders import AzureOpenAIEncoder, TfidfEncoder
from semantic_router.hybrid_layer import HybridRouteLayer

model = HybridRouteLayer(
    encoder=AzureOpenAIEncoder(...),
    sparse_encoder=TfidfEncoder(),
    routes=routes,
    alpha=0.3,  # default = 0.3
    top_k=3,  # default = 5
)
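
For readers unfamiliar with the two parameters, here is a minimal sketch of what alpha and top_k conceptually control. It is not the library's implementation: the function name `hybrid_top_k`, the plain-list inputs, and the convex blend `alpha * dense + (1 - alpha) * sparse` are assumptions made for illustration only.

```python
from typing import List, Tuple


def hybrid_top_k(
    dense_sims: List[float],
    sparse_sims: List[float],
    alpha: float = 0.3,
    top_k: int = 5,
) -> List[Tuple[int, float]]:
    """Blend dense and sparse similarities, then keep the top_k utterances.

    Hypothetical helper: assumes one similarity per stored utterance and a
    convex combination weighted by alpha (an assumption, not the library API).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if top_k < 1:
        raise ValueError("top_k must be a positive integer")
    if len(dense_sims) != len(sparse_sims):
        raise ValueError("dense_sims and sparse_sims must have equal length")

    # alpha weights the dense score; (1 - alpha) weights the sparse score.
    blended = [alpha * d + (1.0 - alpha) * s for d, s in zip(dense_sims, sparse_sims)]

    # Rank utterance indices by blended score, highest first.
    ranked = sorted(enumerate(blended), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


dense = [0.9, 0.1, 0.5]
sparse = [0.2, 0.8, 0.5]
top_two = hybrid_top_k(dense, sparse, alpha=0.3, top_k=2)
dense_only = hybrid_top_k(dense, sparse, alpha=1.0, top_k=2)
```

In this toy example, alpha=0.3 favours the sparse scores, so utterances 1 and 2 are kept, while alpha=1.0 ignores the sparse scores and keeps utterances 0 and 2.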

jamescalam · 2024-03-14T03:48:10Z

hi @andreped, this looks great. We have not focused on the HybridRouteLayer so far, so I'm curious whether you're seeing improved performance when using it? In any case, the hybrid approach should in theory unlock better performance, so it's something we'd like to spend more time on soon.

Will complete the review once tests have finished running

codecov · 2024-03-14T03:48:49Z

Codecov Report

Attention: Patch coverage is 80.00%, with 1 line in your changes missing coverage. Please review.

Project coverage is 77.72%. Comparing base (1e753d9) to head (f0ffac5).

Files	Patch %	Lines
semantic_router/hybrid_layer.py	75.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #197      +/-   ##
==========================================
- Coverage   77.74%   77.72%   -0.02%     
==========================================
  Files          42       42              
  Lines        2089     2092       +3     
==========================================
+ Hits         1624     1626       +2     
- Misses        465      466       +1

andreped · 2024-03-14T07:37:08Z

Hello, @jamescalam! Hybrid search has worked great for me in the past, so I would expect the HybridRouteLayer to offer similar benefits.

By using the base HybridRouteLayer without any modifications, I achieved a 4% higher macro F1-score on our application. I gained a few more percentage points by lowering top_k from 5 to 2 or 3. Hybrid routing took twice as long as routing without it, but at 0.16 s per prompt this was fine. Now I basically just need to refine the utterances that were occasionally labelled wrongly or that describe some of the routes poorly.
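
For reference, macro F1 is the unweighted mean of the per-class F1 scores, so every route counts equally regardless of how many prompts it receives. A minimal pure-Python sketch follows; the function name and the toy labels are illustrative and are not taken from the evaluation described above.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have equal length")

    labels = set(y_true) | set(y_pred)
    if not labels:
        return 0.0

    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)

    return sum(f1_scores) / len(f1_scores)


# Toy labels: route "a" has F1 = 2/3, route "b" has F1 = 4/5.
score = macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```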

I will be exploring it more today, and may make a new PR if I find a technique that works better for my application. For instance, changing the aggregation method from SUM to MEAN or MAX would also be of interest. I can make an issue and PR about this separately :]
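
To make the aggregation idea concrete, here is a rough sketch of how SUM, MEAN, and MAX could pick different routes from the same top_k hits. The helper names and the (route, score) tuple format are hypothetical and do not reflect the library's internals.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def aggregate_route_scores(
    hits: List[Tuple[str, float]], mode: str = "sum"
) -> Dict[str, float]:
    """Combine the scores of the retrieved utterances per route.

    hits: (route_name, score) pairs for the top_k utterances (assumed format).
    """
    reducers = {
        "sum": sum,
        "mean": lambda xs: sum(xs) / len(xs),
        "max": max,
    }
    if mode not in reducers:
        raise ValueError(f"unknown aggregation mode: {mode!r}")

    grouped: Dict[str, List[float]] = defaultdict(list)
    for route, score in hits:
        grouped[route].append(score)

    reduce_scores = reducers[mode]
    return {route: reduce_scores(scores) for route, scores in grouped.items()}


def best_route(hits: List[Tuple[str, float]], mode: str = "sum") -> str:
    """Return the route with the highest aggregated score."""
    totals = aggregate_route_scores(hits, mode)
    return max(totals, key=totals.get)


# Hits sorted by score, highest first: one strong hit versus two weaker ones.
hits = [("politics", 0.8), ("chitchat", 0.5), ("chitchat", 0.45)]
by_sum = best_route(hits, "sum")
by_mean = best_route(hits, "mean")
by_max = best_route(hits, "max")
```

With these toy scores, SUM picks "chitchat" because two moderate hits outweigh one strong hit, while MEAN and MAX pick "politics". Truncating to `hits[:1]` also yields "politics" under SUM, which hints at why a smaller top_k can change the outcome.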

jamescalam

lgtm!

jamescalam · 2024-03-14T11:39:36Z

hey @andreped, that's awesome. Yes, I think the HybridRouteLayer could definitely do with some optimization. Feel free to submit changes, it would help a lot! For now I'm merging this, thanks for the PR :)

andreped · 2024-03-14T14:22:54Z

> hey @andreped, that's awesome. Yes, I think the HybridRouteLayer could definitely do with some optimization. Feel free to submit changes, it would help a lot! For now I'm merging this, thanks for the PR :)

Made an issue #201 and I will draft a PR very soon.

EDIT: 🚀 PR #202 is ready to review now!

andreped added 9 commits March 13, 2024 18:35

Add support for setting top_k for HybridRouteLayer through class API

7a5dfc6

Added top_k check in class __init__

a2f363f

Add test for verifying that top_k selecting works

def1b25

Add alpha selection test for Hybrid route layer

9f60017

Updated contributing to include --all-extras for poetry install step

1bd99c5

Minor refactoring

1515169

Linted code

b5fa9ed

Updated coverage post lint; minor refactoring in contrib

c2d7396

Fixed remaining linting issues

f0ffac5

jamescalam self-requested a review March 14, 2024 03:46

jamescalam approved these changes Mar 14, 2024

jamescalam merged commit 19cf5d2 into aurelio-labs:main Mar 14, 2024
8 checks passed
