@laminelam laminelam commented Aug 5, 2025

Implements a new ShardDocSortBuilder and ShardDocFieldComparatorSource to allow sorting by shard id and global doc id as a tie-breaker.
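To illustrate why a shard-and-doc tie-breaker matters when paging through results, here is a minimal, self-contained sketch. It is not the code from this PR: `Hit`, `BY_TS_THEN_SHARD_DOC`, and `pageAfter` are hypothetical names, and packing the shard id into the high 32 bits of a long is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TieBreakerSketch {
    /** Hypothetical hit: a primary sort value plus the shard and doc id it came from. */
    static final class Hit {
        final long timestamp;
        final int shardId;
        final int docId;

        Hit(long timestamp, int shardId, int docId) {
            this.timestamp = timestamp;
            this.shardId = shardId;
            this.docId = docId;
        }

        /** Assumed packing: shard id in the high 32 bits, doc id in the low 32 bits. */
        long shardDoc() {
            return ((long) shardId << 32) | (docId & 0xFFFFFFFFL);
        }
    }

    /** Primary sort on timestamp, ties broken by the packed shard/doc key. */
    static final Comparator<Hit> BY_TS_THEN_SHARD_DOC =
        Comparator.comparingLong((Hit h) -> h.timestamp).thenComparingLong(Hit::shardDoc);

    /** Returns up to `size` hits strictly after `cursor` (from the start when cursor is null). */
    static List<Hit> pageAfter(List<Hit> hits, Hit cursor, int size) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(BY_TS_THEN_SHARD_DOC);
        List<Hit> page = new ArrayList<>();
        for (Hit h : sorted) {
            if (page.size() == size) {
                break;
            }
            if (cursor == null || BY_TS_THEN_SHARD_DOC.compare(h, cursor) > 0) {
                page.add(h);
            }
        }
        return page;
    }

    public static void main(String[] args) {
        // Three hits share timestamp 100; a cursor holding only "100" could not tell them apart.
        List<Hit> hits = List.of(new Hit(100, 1, 2), new Hit(100, 0, 7), new Hit(100, 0, 3), new Hit(200, 0, 1));
        List<Hit> first = pageAfter(hits, null, 2);
        List<Hit> second = pageAfter(hits, first.get(first.size() - 1), 2);
        System.out.println("page 1: " + first.size() + " hits, page 2: " + second.size() + " hits");
    }
}
```

Because the composite key is unique per hit, the cursor taken from the last hit of one page identifies exactly where the next page starts, even when many hits share the same primary sort value.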

Resolves #17064

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Lamine Idjeraoui added 2 commits August 5, 2025 11:08
- Implements ShardDocSortBuilder + comparator
- TODO: Add unit + integ tests
- Registers in SearchModule

Signed-off-by: Lamine Idjeraoui <[email protected]>
Contributor

github-actions bot commented Aug 5, 2025

❕ Gradle check result for 7199347: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

codecov bot commented Aug 5, 2025

Codecov Report

❌ Patch coverage is 61.17647% with 33 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.95%. Comparing base (eb28e77) to head (60f438d).
⚠️ Report is 35 commits behind head on main.

Files with missing lines Patch % Lines
...rch/search/sort/ShardDocFieldComparatorSource.java 13.63% 19 Missing ⚠️
...rg/opensearch/search/sort/ShardDocSortBuilder.java 78.04% 7 Missing and 2 partials ⚠️
...va/org/opensearch/action/search/SearchRequest.java 81.25% 1 Missing and 2 partials ⚠️
...nsearch/search/searchafter/SearchAfterBuilder.java 0.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #18924      +/-   ##
============================================
+ Coverage     72.81%   72.95%   +0.14%     
- Complexity    69801    69925     +124     
============================================
  Files          5674     5676       +2     
  Lines        320850   320935      +85     
  Branches      46383    46398      +15     
============================================
+ Hits         233638   234153     +515     
+ Misses        68264    67842     -422     
+ Partials      18948    18940       -8     

@github-actions github-actions bot added enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc labels Aug 6, 2025
@LantaoJin
Member

@laminelam can we add any microbenchmark for it?

Contributor

❌ Gradle check result for f7d8f73: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❌ Gradle check result for 9467d06: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❌ Gradle check result for a719fb7: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❌ Gradle check result for a719fb7: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

✅ Gradle check result for a719fb7: SUCCESS

@gaobinlong
Contributor

Thanks @laminelam, this PR is very close to merging, but the test coverage seems low. Could you add some more unit tests?

@LantaoJin
Member

Thanks @laminelam, we are preparing the new release (3.3.0), for which the planned code freeze is Sep 30. Could you fix the test coverage check ASAP?

@laminelam
Author

Thanks @laminelam, we are preparing the new release (3.3.0), for which the planned code freeze is Sep 30. Could you fix the test coverage check ASAP?

Hi @LantaoJin & @gaobinlong
Yes sure, gimme a day or two.
Thx for the review

@ivenhov

ivenhov commented Sep 26, 2025

Hi
Any chance this is going to be backported to 2.19?

add more test cases

Signed-off-by: Lamine Idjeraoui <[email protected]>
Contributor

❌ Gradle check result for 60f438d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@LantaoJin
Member

LantaoJin commented Sep 28, 2025

@laminelam, will all PIT search requests add an implicit _shard_doc sort tiebreaker field with this PR?

Example 1:

GET /_search
{
  "size": 10000,
  "query": {
    "match" : {
      "user.id" : "elkbee"
    }
  },
  "pit": {
    "id":  "46ToAwMDaWR5BXV1aWQyKwZub2RlXzMAAAAAAAAAACoBYwADaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQADaWR5BXV1aWQyKgZub2RlXzIAAAAAAAAAAAwBYgACBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", 
    "keep_alive": "100m"
  },
  "sort": [ 
    {"@timestamp": {"order": "asc"}}
  ]
}

This will sort by @timestamp (asc) + _shard_doc (asc) and return:

{
  "pit_id" : "46ToAwMDaWR5BXV1aWQyKwZub2RlXzMAAAAAAAAAACoBYwADaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQADaWR5BXV1aWQyKgZub2RlXzIAAAAAAAAAAAwBYgACBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
  "took" : 17,
  "timed_out" : false,
  "_shards" : ...,
  "hits" : {
    "total" : ...,
    "max_score" : null,
    "hits" : [
      ...
      {
        "_index" : "my-index-000001",
        "_id" : "FaslK3QBySSL_rrj9zM5",
        "_score" : null,
        "_source" : ...,
        "sort" : [
          "2021-05-20T05:30:04.832Z",
          4294967298
        ]
      }
    ]
  }
}

Example 2:

GET /_search
{
  "size": 10000,
  "query": {
    "match" : {
      "user.id" : "elkbee"
    }
  },
  "pit": {
    "id":  "46ToAwMDaWR5BXV1aWQyKwZub2RlXzMAAAAAAAAAACoBYwADaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQADaWR5BXV1aWQyKgZub2RlXzIAAAAAAAAAAAwBYgACBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", 
    "keep_alive": "100m"
  },
  "sort": []
}

This will sort by _shard_doc (asc) and return:

{
  "pit_id" : "46ToAwMDaWR5BXV1aWQyKwZub2RlXzMAAAAAAAAAACoBYwADaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQADaWR5BXV1aWQyKgZub2RlXzIAAAAAAAAAAAwBYgACBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==",
  "took" : 17,
  "timed_out" : false,
  "_shards" : ...,
  "hits" : {
    "total" : ...,
    "max_score" : null,
    "hits" : [
      ...
      {
        "_index" : "my-index-000001",
        "_id" : "FaslK3QBySSL_rrj9zM5",
        "_score" : null,
        "_source" : ...,
        "sort" : [
          4294967298
        ]
      }
    ]
  }
}
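The sort value `4294967298` in the example responses equals 2^32 + 2. Assuming the `_shard_doc` value packs a shard index into the high 32 bits and a doc id into the low 32 bits (an assumption; this thread does not spell out the encoding), the value can be split as in the sketch below. `ShardDocValueSketch` and its methods are hypothetical names, not part of the PR.

```java
public class ShardDocValueSketch {
    /** Assumed layout: shard index in the high 32 bits, doc id in the low 32 bits. */
    static long pack(int shardIndex, int docId) {
        return ((long) shardIndex << 32) | (docId & 0xFFFFFFFFL);
    }

    /** Extracts the high 32 bits. */
    static int shardOf(long value) {
        return (int) (value >>> 32);
    }

    /** Extracts the low 32 bits. */
    static int docOf(long value) {
        return (int) value;
    }

    public static void main(String[] args) {
        long value = 4294967298L; // the sort value shown in the example responses
        System.out.println("shard=" + shardOf(value) + " doc=" + docOf(value));
    }
}
```

Under this assumed layout, comparing two packed values as plain longs orders hits by shard first and by doc id within a shard, which is what makes the value usable as a total-order tie-breaker.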

Comment on lines +388 to +390
if (sb instanceof FieldSortBuilder) {
    return ShardDocSortBuilder.NAME.equals(((FieldSortBuilder) sb).getFieldName());
}
Member

This logic is never used because we are not allowed to create a FieldSortBuilder("_shard_doc"). I got this exception if I create a FieldSortBuilder("_shard_doc"):

Caused by: org.opensearch.index.query.QueryShardException: No mapping found for [_shard_doc] in order to sort on
        at org.opensearch.search.sort.FieldSortBuilder.resolveUnmappedType(FieldSortBuilder.java:571) ~[opensearch-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at org.opensearch.search.sort.FieldSortBuilder.build(FieldSortBuilder.java:418) ~[opensearch-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at org.opensearch.search.sort.SortBuilder.buildSort(SortBuilder.java:168) ~[opensearch-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]

To fix it, we need to update SearchSourceBuilder.sort(String):

public SearchSourceBuilder sort(String name) {
    if (name.equals(ScoreSortBuilder.NAME)) {
        return sort(SortBuilders.scoreSort());
    }
    return sort(SortBuilders.fieldSort(name));
}

Author

@laminelam laminelam Sep 28, 2025

Hi @LantaoJin
I am not able to replicate the issue.
new FieldSortBuilder(ShardDocSortBuilder.NAME).order(SortOrder.ASC).build(context) is not throwing any exception for me.
I wonder if we are testing the same code.

Member

@LantaoJin LantaoJin Sep 29, 2025

This exception was thrown by calling SortBuilders.fieldSort("_shard_doc") to build a source builder first and then running client.search(request). Not a blocker IMO; I could call SortBuilders.shardDocSort() instead.

Author

@laminelam laminelam Sep 29, 2025

Oh ok, I get it now. Actually this will not work, and it's not the correct way of instantiating a ShardDocSortBuilder, because shard_doc is NOT a real field. It's a kind of pseudo field built on the fly, just as score is not a real field either.

SortBuilders.fieldSort("fieldName") is for real fields.

There is another method for shard_doc which is SortBuilders.shardDocSort()

In the same way, you would call scoreSort() for score, scriptSort() for script sort, or geoDistanceSort() for geoDistance.
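The distinction between real fields and pseudo fields can be sketched with a toy model. This is not OpenSearch code: `SortDispatchSketch`, `SortKind`, `resolve`, and `resolveAsField` are hypothetical names, and the set of mapped fields stands in for an index mapping. It only shows why a lookup that treats every name as a mapped field rejects a pseudo field, while a name-based dispatch does not.

```java
import java.util.Set;

public class SortDispatchSketch {
    static final String SCORE_NAME = "_score";
    static final String SHARD_DOC_NAME = "_shard_doc";

    enum SortKind { SCORE, SHARD_DOC, FIELD }

    /** Field-only path: every name must exist in the mapping, so pseudo fields fail. */
    static SortKind resolveAsField(String name, Set<String> mappedFields) {
        if (!mappedFields.contains(name)) {
            throw new IllegalArgumentException("No mapping found for [" + name + "] in order to sort on");
        }
        return SortKind.FIELD;
    }

    /** Name-based dispatch: pseudo fields are recognized before the mapping is consulted. */
    static SortKind resolve(String name, Set<String> mappedFields) {
        if (SCORE_NAME.equals(name)) {
            return SortKind.SCORE;
        }
        if (SHARD_DOC_NAME.equals(name)) {
            return SortKind.SHARD_DOC;
        }
        return resolveAsField(name, mappedFields);
    }

    public static void main(String[] args) {
        Set<String> mapping = Set.of("@timestamp", "user.id");
        System.out.println(resolve("_shard_doc", mapping));
        System.out.println(resolve("@timestamp", mapping));
        try {
            resolveAsField("_shard_doc", mapping);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```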

Comment on lines +365 to +370
SearchRequest searchRequest = new SearchRequest().source(
    new SearchSourceBuilder().pointInTimeBuilder(new PointInTimeBuilder("id"))
        .sort(new FieldSortBuilder(ShardDocSortBuilder.NAME).order(SortOrder.ASC))
);
ActionRequestValidationException e = searchRequest.validate();
assertNull(e);
Member

@LantaoJin LantaoJin Sep 28, 2025

Have you tried new FieldSortBuilder(ShardDocSortBuilder.NAME).order(SortOrder.ASC).build()? It will fail with

"No mapping found for [" + fieldName + "] in order to sort on"

@laminelam
Author

laminelam commented Sep 28, 2025

@laminelam, will all PIT search requests add an implicit _shard_doc sort tiebreaker field with this PR?

I thought about doing it, actually did it, and then reverted it, because I am a bit hesitant to implicitly tie a stable, proven feature (PIT) to a newly introduced one (shard_doc sorting) in a single PR. I think we had better give the user the option to use PIT without shard_doc, and at some point we can make it implicit in a separate PR. I think this is the safest path to avoid breaking anything and to stay backwards compatible.

What are your thoughts?

@LantaoJin
Member

@laminelam, will all PIT search requests add an implicit _shard_doc sort tiebreaker field with this PR?

I thought about doing it, actually did it, and then reverted it, because I am a bit hesitant to implicitly tie a stable, proven feature (PIT) to a newly introduced one (shard_doc sorting) in a single PR. I think we had better give the user the option to use PIT without shard_doc, and at some point we can make it implicit in a separate PR. I think this is the safest path to avoid breaking anything and to stay backwards compatible.

What are your thoughts?

Makes sense, LGTM.

@gaobinlong
Contributor

Hi. Any chance this is going to be backported to 2.19?

New features and enhancements won't be backported to 2.19; only bug-fix and security-fix PRs can be backported.

Contributor

❕ Gradle check result for 60f438d: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Labels
enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc
Development

Successfully merging this pull request may close these issues.

Add support for Elastic Search _shard_doc equivalent
4 participants