Is your feature request related to a problem? Please describe.
While working with the QdrantHybridRetriever, I found that the score_threshold parameter is tricky to understand. It seems to be applied only during the sparse vector search, as shown in the QdrantDocumentStore module.
The score returned from these requests is not normalized, which makes it more difficult to choose a threshold value.
Describe the solution you'd like
A simple solution to this problem can be implemented in one of these ways:
Solution A:
Remove the score_threshold from the sparse vector search
Add an optional parameter to the reciprocal_rank_fusion function so that it can filter out all points whose fused score is lower than the threshold
Solution B:
Remove the score_threshold from the sparse vector search
Add a condition on the point score to the following list comprehension:
results = [convert_qdrant_point_to_haystack_document(point, use_sparse_embeddings=True) for point in points if point.score >= score_threshold]
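To make Solution A more concrete, here is a minimal sketch of reciprocal rank fusion with an optional threshold applied to the fused score. It is not the qdrant-client implementation: the `ScoredPoint` stand-in, the function name `rrf_with_threshold`, and the ranking constant `k=60` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ScoredPoint:
    """Hypothetical stand-in for a scored point returned by a search request."""
    id: str
    score: float


def rrf_with_threshold(
    responses: List[List[ScoredPoint]],
    limit: int = 10,
    score_threshold: Optional[float] = None,
    k: int = 60,  # assumed ranking constant; the real library may use another value
) -> List[ScoredPoint]:
    """Fuse ranked result lists and optionally drop low-scoring points."""
    fused: Dict[str, float] = {}
    for response in responses:
        # Each list contributes 1 / (k + rank) for every point it contains.
        for rank, point in enumerate(response, start=1):
            fused[point.id] = fused.get(point.id, 0.0) + 1.0 / (k + rank)

    ranked = sorted(fused.items(), key=lambda item: item[1], reverse=True)
    points = [ScoredPoint(id=pid, score=score) for pid, score in ranked]

    # The threshold is applied to the fused score, not to the raw search scores.
    if score_threshold is not None:
        points = [p for p in points if p.score >= score_threshold]
    return points[:limit]
```

Note that fused scores are rank-based and small (with `k=60`, a single list contributes at most 1/61 per point), so a threshold has to be chosen on that scale rather than on the scale of the raw dense or sparse scores.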
Describe alternatives you've considered
Another alternative is to implement a separate component that filters out documents based on the score returned by the retriever.
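The snippet from the original issue is not reproduced here. As a rough sketch of what such a filter could look like, the following uses plain Python stand-ins: `Doc` and `ScoreThresholdFilter` are hypothetical names, and in a real Haystack pipeline the class would presumably be registered as a component and operate on Haystack `Document` objects instead.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document carrying a score."""
    content: str
    score: Optional[float] = None


class ScoreThresholdFilter:
    """Keeps only documents whose score reaches the configured threshold."""

    def __init__(self, score_threshold: float) -> None:
        self.score_threshold = score_threshold

    def run(self, documents: List[Doc]) -> Dict[str, List[Doc]]:
        # Documents without a score are dropped, since they cannot be compared.
        kept = [
            doc
            for doc in documents
            if doc.score is not None and doc.score >= self.score_threshold
        ]
        return {"documents": kept}
```

Placed after the retriever, such a filter sees the final scores the retriever emits, so the threshold has to match whatever scale those scores use.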
Although this is not a bad solution, I think it would be nice to be able to perform the filtering directly in the retriever component.