Semantic search strings that don't return the expected result #246

Open
LeMurphant opened this issue May 25, 2023 · 11 comments

Comments

@LeMurphant
Collaborator

see https://discord.com/channels/677546901339504640/1106385025177485362

Writers have noticed that simple search terms on aisafety.info sometimes don't return the expected page, even though that page is live on site. This is a place to collect those cases and notify the devs.

Entries should take the following form:
Search term: { term(s) }
Expected result: {aisafety.info or google docs page}

As of 2023-05-25, no such strings have been officially collected. I will make sure writers know to post these failed searches here.

@markovial

markovial commented Jun 2, 2023

searched term: pascals mugging
expected result: Aren't AI existential risk concerns just an example of Pascal's mugging?
returned result: none

@markovial

markovial commented Jun 2, 2023

searched term: academia
expected result: How can I work on AGI safety outreach in academia and among experts?
returned result: none

searched term: outreach
expected result: How can I work on public AI safety outreach?
returned result: none

searched term: mathematical, philosophical
expected result: How can I do conceptual, mathematical, or philosophical work on AI alignment?
returned result: none
comment: as soon as I add just one more word, semantic search kicks in instead of keyword search and I get the expected result

@Aprillion
Collaborator

Aprillion commented Jun 2, 2023

the exact-match failures (academia, outreach, ...) look like a problem with the cache => I need to find time to investigate the caching issues from #228 ... I just deleted the cache and it started to work again:
Screenshot 2023-06-02 at 14 35 05

feel free to ping me on Discord when we have a batch of new Live on site questions that cannot be found by single word exact match search 😅

non-exact matches like pascals -> pascal's will need more discussion about how to solve them properly ... but this particular case might be good enough, since the result shows up when you start typing pa:
Screenshot 2023-06-02 at 14 38 37
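For reference, one possible way to make pascals match pascal's is to normalize both the query and the title before comparing them. This is only a sketch under assumptions: `normalize` and `matches` are hypothetical helper names, not functions from the site's code, and the actual fix may work differently.

```python
import unicodedata

# Straight and typographic apostrophes that writers may type or paste.
APOSTROPHES = "'\u2019\u2018\u02bc`"


def normalize(text: str) -> str:
    """Lowercase, drop apostrophes, and turn other punctuation into spaces."""
    text = unicodedata.normalize("NFKC", text).lower()
    chars = []
    for ch in text:
        if ch in APOSTROPHES:
            continue  # "pascal's" becomes "pascals"
        chars.append(ch if ch.isalnum() else " ")
    # Collapse runs of whitespace left behind by removed punctuation.
    return " ".join("".join(chars).split())


def matches(query: str, title: str) -> bool:
    """True if every normalized query word appears as a word in the title."""
    title_words = set(normalize(title).split())
    return all(word in title_words for word in normalize(query).split())


title = "Aren't AI existential risk concerns just an example of Pascal's mugging?"
print(matches("pascals mugging", title))
print(matches("pascal's mugging", title))
```

With this kind of normalization, the query with and without the apostrophe reduces to the same words, so both would hit the same title.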

@Aprillion
Collaborator

actually, looks like we solved the apostrophes too, so that one was also not working because of cache issues...
Screenshot 2023-06-02 at 14 39 58

@markovial

I usually only notice this kind of thing once a month, when I put together the LessWrong update post and have to look up every question that goes into it. But since we push questions to live on site throughout the month as well, it might be worth setting up a manual reminder to clear the cache once a week or so. I don't know what the performance cost would be if we cleared it too often.
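As an alternative to a manual reminder, cached entries could expire on their own after a fixed time. The sketch below is a generic time-to-live cache, not a description of how the site's cache works: `TTLCache`, its parameters, and the one-week value are all assumptions for illustration.

```python
import time
from typing import Callable, Dict, Generic, Optional, Tuple, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries: Dict[str, Tuple[float, T]] = {}

    def set(self, key: str, value: T) -> None:
        self._entries[key] = (self._clock(), value)

    def get(self, key: str) -> Optional[T]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self.ttl_seconds:
            # Stale entry: drop it so the caller refetches fresh data.
            del self._entries[key]
            return None
        return value


ONE_WEEK = 7 * 24 * 60 * 60  # hypothetical expiry, matching "once a week or so"
questions_cache: TTLCache[list] = TTLCache(ttl_seconds=ONE_WEEK)
questions_cache.set("live_on_site", ["How can I work on public AI safety outreach?"])
print(questions_cache.get("live_on_site"))
```

The trade-off is the one raised above: a shorter expiry means newly published questions become searchable sooner, at the cost of refetching the data more often.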

@Aprillion
Collaborator

Caching issues from #228 are now fixed 🤞 so hopefully no more strange search results, but let's keep this ticket open in case we discover more problems...

@LeMurphant
Collaborator Author

LeMurphant commented Jul 25, 2023

Searching for "intelligence explosion" does not return What is an "intelligence explosion" in the top 5 results:
https://aisafety.info?state=6306_
intelligence_explosion
The 5 results are relevant, but the "What is" question seems like the most relevant one.

@Aprillion
Collaborator

Aprillion commented Jul 26, 2023

Searching for intelligence explosion

Dev note: 2 words => "baseline search", not "semantic search" (which uses a small model that wasn't good at exact matches of 1-2 words) ... both are the same "search" from the user's perspective, but fixing this case will involve some if/else code rather than playing with hyper-parameters 😅

(in any case, still a good test case for semantic search API too)
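To illustrate the split described in the dev note, here is a minimal sketch of routing a query by word count. The function name, the mode labels, and the exact threshold are assumptions drawn from this thread (1-2 words go to baseline, one more word triggers semantic search), not the site's actual code.

```python
BASELINE_MAX_WORDS = 2  # assumption: queries of 1-2 words use baseline search


def choose_search_mode(query: str) -> str:
    """Return "baseline" for short queries and "semantic" for longer ones."""
    words = query.split()
    return "baseline" if len(words) <= BASELINE_MAX_WORDS else "semantic"


for query in ["academia",
              "intelligence explosion",
              "how will we know if AI is conscious"]:
    print(query, "->", choose_search_mode(query))
```

Under this assumption, "intelligence explosion" never reaches the semantic model, which is why tuning that model would not change its results.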

@Aprillion
Collaborator

Boosting the "What is ..." / "What are ..." questions in baseline search in #288:

image
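A rough sketch of what such a boost could look like, for readers who don't want to open the PR. The scoring function, the boost factor, and the second and third example titles are hypothetical and are not taken from #288.

```python
from typing import List

DEFINITION_PREFIXES = ("what is ", "what are ")
DEFINITION_BOOST = 2.0  # hypothetical multiplier, not the value used in #288
PUNCTUATION = "\"'?.,!"


def _words(text: str) -> List[str]:
    """Lowercased words with surrounding punctuation stripped."""
    stripped = (w.strip(PUNCTUATION) for w in text.lower().split())
    return [w for w in stripped if w]


def baseline_score(query: str, title: str) -> float:
    """Count query words found in the title, boosting definition questions."""
    title_words = set(_words(title))
    score = float(sum(1 for w in _words(query) if w in title_words))
    if score > 0 and title.lower().startswith(DEFINITION_PREFIXES):
        score *= DEFINITION_BOOST
    return score


def rank(query: str, titles: List[str]) -> List[str]:
    """Titles sorted from highest to lowest score (stable for ties)."""
    return sorted(titles, key=lambda t: baseline_score(query, t), reverse=True)


titles = [
    "How likely is an intelligence explosion?",  # hypothetical title
    'What is an "intelligence explosion"?',
    "What is AI safety?",                        # hypothetical title
]
print(rank("intelligence explosion", titles)[0])
```

The boost only applies when the title already matches at least one query word, so unrelated "What is ..." questions are not pushed to the top.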

@LeMurphant
Collaborator Author

Searching for "Metaculus" or "Metaculus'" does not return anything, but this article contains the sentence "Metaculus’ forecasts for..."
Note that the search for "August 2023" also returns no results.

@LeMurphant
Collaborator Author

Not sure if semantic search is enabled at the moment, but searching for "how will we know if AI is conscious" should return the "Are AIs conscious?" article
conscious
