When searching for certain keywords that appear in the documentation of a versioned library/application/framework, I often see the same result over and over again, where the only difference is the documentation version, e.g. see the screenshot of a search for "kcl" "dict" "schema" below:
The hard part might be detecting those as "same results, but only the most recent version/latest is relevant", and I lack the knowledge to suggest what to do about this implementation-wise.
From a UX PoV it would probably make sense to hide those duplicates behind a "Show more similar results" fold-out or so.
I just discovered the "Copycats removal" Optic, which helps somewhat here, but it also removes the original result from the latest version of the documentation and shows a completely different set of results instead.
There is currently some soft deduplication based on the URL, title and body. Essentially, if a result's title is very similar to the title of a result higher in the list, the lower result gets deprioritized a bit. I think if we had more results in the index that matched your search terms, it would look a bit better, as I am pretty sure the older versions would be deprioritized based on their title and body similarity with the top result.
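To make the idea concrete, here is a minimal sketch of title-based soft deduplication. It is not the actual implementation: the function name `soft_dedup`, the `threshold` and `penalty` values, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration, and it only looks at titles, not the URL or body.

```python
from difflib import SequenceMatcher


def soft_dedup(results, threshold=0.9, penalty=0.5):
    """Lower the score of results whose title nearly matches an earlier title.

    `results` is a list of (title, score) pairs sorted by descending score.
    `threshold` and `penalty` are made-up values, not tuned ones.
    """
    seen_titles = []
    rescored = []
    for title, score in results:
        # Compare against every title that ranked higher than this one.
        is_near_duplicate = any(
            SequenceMatcher(None, title.lower(), seen.lower()).ratio() >= threshold
            for seen in seen_titles
        )
        if is_near_duplicate:
            score *= penalty  # deprioritize, but do not drop the result
        seen_titles.append(title)
        rescored.append((title, score))
    # Python's sort is stable, so ties keep their original relative order.
    return sorted(rescored, key=lambda r: r[1], reverse=True)


results = [
    ("Schema | KCL", 10.0),
    ("Schema | KCL", 9.0),
    ("Dict type reference", 8.0),
]
reranked = soft_dedup(results)
```

With only a handful of matching results, the penalized duplicates still show up near the top because there is little else to outrank them, which matches the behaviour described above.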
I agree it would probably be a good idea to hide very high similarity results behind some kind of button at the end of the search results here.
It's a very interesting problem to detect which documentation page points to the latest version. Right now, the ranking would probably rely fully on the harmonic centrality values to try to figure it out, but we might need to write some custom logic here. I don't know yet what the best way to implement it would be.
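One possible shape for such custom logic, purely as a sketch: treat URLs that differ only in a version-like path segment as one group, show the newest member, and keep the rest for a "Show more similar results" fold-out. Everything here is an assumption: the helper names, the `docs.example.com` URLs, the rule that a version is its own path segment (such as `0.6`, `v1.2.3` or `latest`), and the rule that an unversioned URL counts as the newest. A numeric segment that is not a version (a year, for instance) would be misdetected.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Assumption: a version appears as its own path segment.
VERSION_SEGMENT = re.compile(r"^(?:v?\d+(?:\.\d+)*|latest|stable)$")


def split_version(url):
    """Return (group_key, version) with the first version-like segment removed."""
    parts = urlsplit(url)
    version = None
    rest = []
    for segment in parts.path.split("/"):
        if version is None and VERSION_SEGMENT.match(segment):
            version = segment
        else:
            rest.append(segment)
    return (parts.netloc, "/".join(rest)), version


def version_rank(version):
    """Sort key: unversioned/latest/stable first, then highest version number."""
    if version is None or version in ("latest", "stable"):
        return (1, ())
    return (0, tuple(int(n) for n in version.lstrip("v").split(".")))


def collapse_versions(urls):
    """Return [(newest_url, [older_urls])], one entry per group, in first-seen order."""
    groups = defaultdict(list)
    for url in urls:
        key, version = split_version(url)
        groups[key].append((version_rank(version), url))
    collapsed = []
    for members in groups.values():
        members.sort(key=lambda m: m[0], reverse=True)
        collapsed.append((members[0][1], [u for _, u in members[1:]]))
    return collapsed


urls = [
    "https://docs.example.com/docs/0.6/reference/schema",
    "https://docs.example.com/docs/reference/schema",
    "https://docs.example.com/docs/0.10/reference/schema",
    "https://docs.example.com/docs/0.6/reference/dict",
]
collapsed = collapse_versions(urls)
```

Versions are compared as integer tuples rather than strings so that `0.10` ranks above `0.6`. Sites that encode the version in a subdomain or query parameter would need a different rule.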
Can you elaborate a bit on the optic problem? The "copycats removal" optic doesn't seem to remove the results from kcl-lang.io for me.
Might be related to #51