Skip to content

Conversation

@eyeinsky
Copy link
Collaborator

@eyeinsky eyeinsky commented Nov 10, 2025

Move the ES searchable condition from the boosting part of the query to the filter part.

The query is functionally the same as before, but having the condition in the boosting part is confusing as that is (mostly) used to change the order of returned documents (demote less relevant documents further back in the search result) not whether they are returned. Why it's confusing is that the boosting query apparently also can behave as a filter when the condition is expressed in the positive subsection of the boosting query -- at least that's we figured by reading the documentation and why it currently also works.

The searchable condition was accidentally added to the boosting part of the query by me in this PR: #4786 (b51531c).

(Having the searchable condition in the boosting part looked like a bug when QA was testing it, but it turned out to be a red herring.)

Checklist

  • Add a new entry in an appropriate subdirectory of changelog.d
  • Read and follow the PR guidelines

@eyeinsky eyeinsky requested a review from a team as a code owner November 10, 2025 13:30
@eyeinsky
Copy link
Collaborator Author

@akshaymankar Looked at the ES searchable query while another test was running and I think I succeeded in also making it look ok. The tests succeed locally and the following is a JSON print out of the filter and boosts parts of the query:

Filter:

{
  "bool": {
    "must": [
      {
        "bool": {
          "should": [
            {
              "bool": {
                "must_not": [
                  {
                    "exists": {
                      "field": "team"
                    }
                  }
                ]
              }
            },
            {
              "bool": {
                "must": [
                  {
                    "exists": {
                      "field": "team"
                    }
                  },
                  {
                    "term": {
                      "search_visibility_inbound": {
                        "value": "searchable-by-all-teams"
                      }
                    }
                  }
                ]
              }
            },
            {
              "term": {
                "team": {
                  "value": "7f0cc8ab-1da4-4ce5-a4af-a230bfc0c1b7"
                }
              }
            }
          ]
        }
      },
      {
        "bool": {
          "should": [
            {
              "term": {
                "account_status": {
                  "value": "active"
                }
              }
            },
            {
              "bool": {
                "must_not": [
                  {
                    "exists": {
                      "field": "account_status"
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    ],
    "must_not": [
      {
        "term": {
          "_id": {
            "value": "356cb8d4-e295-4ed8-bb99-d758b667484f"
          }
        }
      },
      {
        "term": {
          "searchable": {
            "value": "false"
          }
        }
      }
    ]
  }
}

Boost:

{
  "boosting": {
    "negative": {
      "bool": {
        "must_not": [
          {
            "term": {
              "team": {
                "value": "7f0cc8ab-1da4-4ce5-a4af-a230bfc0c1b7"
              }
            }
          }
        ]
      }
    },
    "negative_boost": 0.1,
    "positive": {
      "bool": {
        "must": [
          {
            "bool": {
              "must_not": [
                {
                  "term": {
                    "handle": {
                      "value": "[email protected]"
                    }
                  }
                }
              ],
              "should": [
                {
                  "multi_match": {
                    "fields": [
                      "handle.prefix^2",
                      "normalized.prefix",
                      "normalized^3"
                    ],
                    "operator": "and",
                    "query": "[email protected]",
                    "type": "most_fields",
                    "zero_terms_query": "none"
                  }
                }
              ]
            }
          }
        ],
        "should": [
          {
            "exists": {
              "field": "handle"
            }
          }
        ]
      }
    }
  }
}

@stefanwire stefanwire changed the title Move searhable condition to the "filter" part of Elastic Search query Move searchable condition to the "filter" part of Elastic Search query Nov 13, 2025
@zebot zebot added the ok-to-test Approved for running tests in CI, overrides not-ok-to-test if both labels exist label Nov 13, 2025
@battermann battermann requested a review from Copilot November 19, 2025 15:22
Copilot finished reviewing on behalf of battermann November 19, 2025 15:23
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR moves the searchable condition filter from the query scoring context to the filter context of the ElasticSearch query, which is a performance optimization. Filters in ElasticSearch are cached and don't affect document scoring, making them more efficient for boolean conditions.

  • Moved the searchable != false condition from defaultUserQuery to the filter section in mkUserQuery
  • Maintained the same logical behavior while improving query performance

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 432 to 436
-- The following matches both where searchable is true
-- or where the field is missing. There didn't seem to
-- be a more readable way to express
-- "not(exists(searchable) or searchable = true" in
-- Elastic Search.
Copy link

Copilot AI Nov 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment has a typo in the logical expression. It currently reads "not(exists(searchable) or searchable = true" but this doesn't accurately describe the filter logic. The code boolQueryMustNotMatch = [TermQuery (Term "searchable" "false")] means "must NOT match searchable=false", which is equivalent to "(searchable = true) OR (searchable field is missing)".

The comment should be corrected to something like:

-- The following matches where searchable is true
-- or where the field is missing. There didn't seem to
-- be a more readable way to express
-- "not(searchable = false)" in Elastic Search.
Suggested change
-- The following matches both where searchable is true
-- or where the field is missing. There didn't seem to
-- be a more readable way to express
-- "not(exists(searchable) or searchable = true" in
-- Elastic Search.
-- The following matches where searchable is true
-- or where the field is missing. There didn't seem to
-- be a more readable way to express
-- "not(searchable = false)" in Elastic Search.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was also a parenthesis typo, fixed here 9762178.

I don't want to change the comment because the recommended not(searchable = false) leaves the "field being missing" case implicit, explicit is better.

@eyeinsky
Copy link
Collaborator Author

(I guess not adding a changelog entry for this is ok 🤔 )

@supersven
Copy link
Contributor

(I guess not adding a changelog entry for this is ok 🤔 )

Please add one: If there would be any issues with user search, it might be a very helpful pointer.

@supersven
Copy link
Contributor

supersven commented Nov 19, 2025

@eyeinsky Could you also please explain in the PR description or as a comment why you moved the condition?

@eyeinsky eyeinsky requested review from a team as code owners November 19, 2025 16:36
@eyeinsky
Copy link
Collaborator Author

@supersven Did both!

Copy link
Contributor

@supersven supersven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the explanations and for improving our code base ❤️

LGTM 👍

@eyeinsky eyeinsky merged commit 5e8a0eb into develop Nov 20, 2025
10 checks passed
@eyeinsky eyeinsky deleted the ml/searchable3 branch November 20, 2025 09:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ok-to-test Approved for running tests in CI, overrides not-ok-to-test if both labels exist

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants