Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add _primary preference only for segment replication enabled indices #2040

Merged
merged 5 commits into from
Sep 1, 2023

Conversation

dreamer-89
Copy link
Member

Description

Adds _primary preference only for segment replication enabled indices.

Issues Resolved

#1801

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@codecov
Copy link

codecov bot commented Aug 30, 2023

Codecov Report

Merging #2040 (f9e6ee0) into main (56bc7d5) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##               main    #2040   +/-   ##
=========================================
  Coverage     97.28%   97.28%           
  Complexity     4624     4624           
=========================================
  Files           409      409           
  Lines         11942    11943    +1     
  Branches        828      828           
=========================================
+ Hits          11618    11619    +1     
  Misses          317      317           
  Partials          7        7           
Flag Coverage Δ
sql-engine 97.28% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
...s/storage/OpenSearchDataSourceMetadataStorage.java 98.75% <100.00%> (+<0.01%) ⬆️

SearchRequest searchRequest = new SearchRequest();
searchRequest.indices(DATASOURCE_INDEX_NAME);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(query);
searchSourceBuilder.size(DATASOURCE_QUERY_RESULT_SIZE);
searchRequest.source(searchSourceBuilder);
if (state.isSegmentReplicationEnabled(DATASOURCE_INDEX_NAME)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not add parameter in search Request? e.g. consistentRead = true. Current approach expose internal implementation detail to client. In future, in case we add any feature breaking consistent read, all the client require consistent read need to make code change accordingly.

Copy link
Member Author

@dreamer-89 dreamer-89 Aug 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point @penghuo ! I feel the existing parameter _primary serves the purpose for consistent reads and adding a new param can be confusing? I also agree isSegmentReplicationEnabled naming give insights to underlying replication strategy but this utility was initially develop for core internal usage and is re-used for plugins.

Copy link
Member

@vamsi-amazon vamsi-amazon Aug 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on Peng's comment.
Also are we doing this with all the plugins?
Now, I am really not sure is it any worth of identifying the replication strategy and set the primary preference. What is the worst that could happen if we set the primary preference it for all domains? It overwhelms the primary shard only if we cross certain limit?
By any chance do we have numbers for that?

Copy link
Member

@mch2 mch2 Aug 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a consistency param would be cleaner here for clients/plugins and doesn't expose implementation detail or use internal apis. However, this needs more vetting before being made api.

We need to define the behavior here - does this simply route to primary or does it guarantee a strong read every time it is invoked? Because if its the latter we need to perform a refresh internally to guarantee even the primary is serving the latest writes irrespective of what refresh policy may have been set on the write path. This concerns me because it changes refresh behavior where it may not be intended and have consequences. However, this is similar to the behavior of a GetRequest with realtime set as true so may be acceptable.

I think what we really are asking for here is the former, which is to "read the last refreshed on data" whatever that may be at the time, which before SR existed is default behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have any data on this @vamsi-amazon so I agree with leaving as is. The downside of primary preference is the request will fail if the primary is unavailable and not route to replicas that are healthy.

To alleviate some concern, I suggest we use _primary_first which will attempt primary first and fallback to replicas only if the node is weighed away.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@penghuo : Adding a new parameter may need more discussion and product feedback.
@vamsi-amazon : No, we don't have data around primary shard limits i.e. max request that it can support before causing instability.

Proceeding with @mch2 suggestion here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Request for review @penghuo @vamsi-amazon @mch2

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@penghuo @vamsi-amazon : Please have a look. Let know if you have any concerns.

CC @mch2

@dreamer-89
Copy link
Member Author

@penghuo @vamsi-amazon : Can you please help with review and merge.

@vamsi-amazon vamsi-amazon merged commit 9c12628 into opensearch-project:main Sep 1, 2023
21 checks passed
@dreamer-89
Copy link
Member Author

@vamsi-amazon @penghuo : Can you please add backport 2.x label to have this backport to 2.x/2.10 branch

opensearch-trigger-bot bot pushed a commit that referenced this pull request Sep 1, 2023
…2040)

* Add _primary preference only for segment replication enabled indices

Signed-off-by: Suraj Singh <[email protected]>

* Add test for segment replication tests

Signed-off-by: Suraj Singh <[email protected]>

* Remove isSegRepEnabled check and use _primary_first preference

Signed-off-by: Suraj Singh <[email protected]>

* Remove unused ClusterState reference

Signed-off-by: Suraj Singh <[email protected]>

* Self review

Signed-off-by: Suraj Singh <[email protected]>

---------

Signed-off-by: Suraj Singh <[email protected]>
(cherry picked from commit 9c12628)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
penghuo pushed a commit that referenced this pull request Sep 1, 2023
…2040) (#2045)

* Add _primary preference only for segment replication enabled indices



* Add test for segment replication tests



* Remove isSegRepEnabled check and use _primary_first preference



* Remove unused ClusterState reference



* Self review



---------


(cherry picked from commit 9c12628)

Signed-off-by: Suraj Singh <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants