Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search results page fails to render if search string includes special characters #7618

Closed
tfmorris opened this issue Mar 9, 2023 · 3 comments · Fixed by #7931
Closed

Search results page fails to render if search string includes special characters #7618

tfmorris opened this issue Mar 9, 2023 · 3 comments · Fixed by #7931
Labels
Lead: @cdrini Issues overseen by Drini (Staff: Team Lead & Solr, Library Explorer, i18n) [managed] Module: Search Page(s) Issues pertaining to the Search page UI Priority: 2 Important, as time permits. [managed] Theme: Search Issues related to search UI and backend. [managed] Type: Bug Something isn't working. [managed]

Comments

@tfmorris
Copy link
Contributor

tfmorris commented Mar 9, 2023

Evidence / Screenshot (if possible)

Screen Shot 2023-03-08 at 2 56 35 PM

Relevant url?

https://openlibrary.org/search?q=Cornell+%2777%3A+The+Music%2C+the+Myth+and+the+Magnificence+of+the+Grateful+Dead+Show+at+Barton+Hall&mode=everything

Steps to Reproduce

  1. Go to ...
  2. Do ...
  • Actual:
  • Expected:

Details

  • Logged in (Y/N)?
  • Browser type/version?
  • Operating system?
  • Environment (prod/dev/local)? prod

Proposal & Constraints

Related files

Stakeholders

@tfmorris tfmorris added Needs: Lead Needs: Triage This issue needs triage. The team needs to decide who should own it, what to do, by when. [managed] Type: Bug Something isn't working. [managed] labels Mar 9, 2023
@mekarpeles mekarpeles added Lead: @cdrini Issues overseen by Drini (Staff: Team Lead & Solr, Library Explorer, i18n) [managed] Theme: Search Issues related to search UI and backend. [managed] Module: Search Page(s) Issues pertaining to the Search page UI and removed Needs: Lead labels Mar 13, 2023
@JaydenTeoh
Copy link
Collaborator

@cdrini Hi, I just pushed a fix for this issue, do let me know if I missed out on anything!

@cdrini cdrini added Priority: 2 Important, as time permits. [managed] and removed Needs: Triage This issue needs triage. The team needs to decide who should own it, what to do, by when. [managed] labels May 1, 2023
@cdrini
Copy link
Collaborator

cdrini commented May 30, 2023

This is an issue with the library we're using to parse solr queries. I've created issues there:

Their community did in the past fix another issue I reported: jurismarches/luqum#79 . Maybe you could try to fix these issues on that repo @JaydenTeoh and open a PR? Using their commit jurismarches/luqum@0cc7767 as a template?

In the meantime, we can try adding these to the escape flow. Currently all user-submitted queries pass through this function:

This function will try to parse the query, and if parsing errors, try to escape every character to convert the query to a plain solr string with no special features. This is done by this function:

def fully_escape_query(query: str) -> str:
"""
Try to convert a query to basically a plain lucene string.
>>> fully_escape_query('title:foo')
'title\\\\:foo'
>>> fully_escape_query('title:foo bar')
'title\\\\:foo bar'
>>> fully_escape_query('title:foo (bar baz:boo)')
'title\\\\:foo \\\\(bar baz\\\\:boo\\\\)'
>>> fully_escape_query('x:[A TO Z}')
'x\\\\:\\\\[A TO Z\\\\}'
>>> fully_escape_query('foo AND bar')
'foo and bar'
"""
escaped = query
# Escape special characters
escaped = re.sub(r'[\[\]\(\)\{\}:"\-+?~^/\\,]', r'\\\g<0>', escaped)
# Remove boolean operators by making them lowercase
escaped = re.sub(r'AND|OR|NOT', lambda _1: _1.group(0).lower(), escaped)
return escaped

We likely want to expand this function to include the single quote.

@JaydenTeoh
Copy link
Collaborator

@cdrini Noted on that, I have expanded the fully_escaped_query function in #7931 for now, I will take a look at the luqum parser repo to make a fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Lead: @cdrini Issues overseen by Drini (Staff: Team Lead & Solr, Library Explorer, i18n) [managed] Module: Search Page(s) Issues pertaining to the Search page UI Priority: 2 Important, as time permits. [managed] Theme: Search Issues related to search UI and backend. [managed] Type: Bug Something isn't working. [managed]
Projects
None yet
5 participants
@tfmorris @mekarpeles @cdrini @JaydenTeoh and others