In one of my pg_search_scopes, turning on prefix search yields very weird ranks. Apart from slightly adjusting the rank (e.g. by a factor of 0.5 for old items), I don't do anything tricky:
    pg_search_scope :search,
      against: :text,
      using: {
        tsearch: {
          tsvector_column: 'search_tsvector',
          prefix: true,
          negation: true,
          dictionary: 'simple',
          normalization: 0,
        }
      },
      ranked_by: <<-SQL
        trunc(
          :tsearch * 1000000 *
          // slight boosting of results according to certain flags or item age, but never more than by a combined factor of 8.
        )
      SQL
The query gives low ranks to obviously important items (20+ occurrences of the search term) and results in a weird distribution of ranks. I would (for search in general) expect some distribution where ranks between neighboring results differ by maybe 10% on average, but I get ranks like [1'000'000, 30'000, 500, 10, ...].
Obviously, with such huge gaps, any custom rank boosting will have no effect on the order of the results. More importantly, I could understand such a clear-cut result if the best match were on top, but it isn't.
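To illustrate why the boosting cannot change the order, here is a minimal sketch. The helper `reorder_possible?` is hypothetical and not part of pg_search. It assumes the worst case, in which the lower-ranked neighbor gets the full factor of 8 and the higher-ranked one gets none. The second list is made up to show gaps of roughly 10%.

```ruby
# Hypothetical helper (not part of pg_search): can a boost of at most
# `max_boost` swap any pair of neighboring results?
def reorder_possible?(ranks, max_boost)
  ranks.sort.reverse.each_cons(2).any? { |hi, lo| lo * max_boost > hi }
end

observed = [1_000_000, 30_000, 500, 10] # ranks I actually get
expected = [1_000, 900, 810, 729]       # illustrative ranks with ~10% gaps

puts reorder_possible?(observed, 8) # false: every gap is larger than a factor of 8
puts reorder_possible?(expected, 8) # true: a boost can reorder neighbors
```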
This huge spread of ranks only happens with `prefix: true, any_word: false`. For the other three combinations of these flags, the ranks have a saner distribution, are much closer to each other, and the obvious best result is on top.
Is there any known problem with this combination? Is this possibly a bug, or is there a logical reason why this combination behaves differently from the others? Also, are there more advanced methods of debugging such a thing than simply displaying the rank in the output?
I would really like to keep the prefix search without messing up all of the ranks.