Compatibility with segment replication #346

Closed
Tracked by #8211
dreamer-89 opened this issue Jun 29, 2023 · 8 comments
Labels: enhancement, v2.9.0

Comments

@dreamer-89
Member

dreamer-89 commented Jun 29, 2023

Summary

With the 2.9.0 release, many enhancements are going into the segment replication feature [1][2], which went GA in 2.7.0, so we need to ensure that plugins are compatible with its current state. Previously, we ran tests on plugin repos to verify compatibility, but we want plugin owners to be aware of these changes so that any required updates can be made. With the 2.10.0 release, the remote store feature is going GA; it internally uses only the SEGMENT replication strategy, i.e. it enforces SEGMENT replication on all indices. It is therefore important to validate that plugins are compatible with segment replication.

What changed

1. Refresh policy behavior

  1. RefreshPolicy.IMMEDIATE refreshes only the primary shard immediately, not the replica shards. After the refresh, the primary starts a round of segment replication to update the replica shard copies, so replicas are eventually consistent.
  2. RefreshPolicy.WAIT_UNTIL ensures that the indexing operation is searchable in your cluster, i.e. a read-after-write (RAW) guarantee. With segment replication this guarantee does not hold on replicas, because replica shard updates are delayed behind asynchronous background refreshes.
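To make the visibility gap described above concrete, here is a toy model in plain Java. It is not OpenSearch code: the class and method names are hypothetical, and it only simulates one primary and one replica copy, assuming that a refresh makes documents searchable on the primary and that a later replication round copies them to the replica.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of one primary and one replica shard copy. Hypothetical names; not OpenSearch code. */
final class SegmentReplicationToyModel {
    private final Set<String> primarySearchable = new HashSet<>();
    private final Set<String> replicaSearchable = new HashSet<>();

    /** Index a document and refresh the primary only, mirroring what the issue describes for RefreshPolicy.IMMEDIATE. */
    void indexWithImmediateRefresh(String docId) {
        primarySearchable.add(docId);
    }

    /** One round of segment replication: the replica copies what is searchable on the primary. */
    void replicateSegments() {
        replicaSearchable.addAll(primarySearchable);
    }

    boolean searchPrimary(String docId) {
        return primarySearchable.contains(docId);
    }

    boolean searchReplica(String docId) {
        return replicaSearchable.contains(docId);
    }

    public static void main(String[] args) {
        SegmentReplicationToyModel shard = new SegmentReplicationToyModel();
        shard.indexWithImmediateRefresh("doc-1");
        // The primary sees the document right away; the replica does not yet.
        System.out.println("primary sees doc-1: " + shard.searchPrimary("doc-1"));
        System.out.println("replica sees doc-1: " + shard.searchReplica("doc-1"));
        shard.replicateSegments();
        System.out.println("replica sees doc-1 after replication: " + shard.searchReplica("doc-1"));
    }
}
```

A search that lands on the replica between the refresh and the replication round misses the document, which is the behavior plugin tests may need to account for.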

2. Refresh lag on replicas

With segment replication, there is an inherent delay before documents become searchable on replica shard copies, because replicas copy data (segment) files over from the primary. Compared to document replication, replica shards will therefore, on average, take longer to become consistent with their primaries.
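One way a plugin or its tests could tolerate this lag is to poll until the data becomes visible instead of asserting immediately. The following is a minimal sketch in plain Java; the helper name `awaitVisible` and its parameters are hypothetical, and the visibility check itself (for example, a search against the index) is left to the caller.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper; not part of OpenSearch. */
final class ReplicaVisibilityWait {
    /**
     * Calls isVisible up to maxAttempts times, sleeping delayMillis between attempts.
     * Returns true as soon as the check passes, false if it never does or the thread is interrupted.
     */
    static boolean awaitVisible(BooleanSupplier isVisible, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (isVisible.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and give up waiting.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in check that only succeeds on the third call, simulating replica lag.
        boolean visible = awaitVisible(() -> ++calls[0] >= 3, 5, 10L);
        System.out.println("visible=" + visible + " after " + calls[0] + " checks");
    }
}
```

The attempt count and delay are illustrative only; suitable values depend on the cluster and workload.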

3. System/hidden indices support

With opensearch-project/OpenSearch#8200, system and hidden indices are now supported with the SEGMENT replication strategy. We need to ensure there are no bottlenecks that prevent system/hidden indices from working with segment replication.

Next steps

With segment replication, strong reads are not guaranteed. If a plugin needs strong read guarantees, in particular to work around the refresh policy changes and replica lag described in points 1 and 2 above, its search requests need to target primary shards only. With opensearch-project/OpenSearch#7375, core now supports searching primary shards only. Please follow the documentation for examples and details.
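As a rough illustration of what targeting primary shards looks like, the sketch below builds a search endpoint that carries the `_primary` preference mentioned in this thread. The class and method names are hypothetical and the index name is a placeholder. In plugin code that uses the Java client, the same value would presumably be set on the search request's preference instead of being built into a URL.

```java
/** Hypothetical helper that builds a search endpoint; the index name used below is a placeholder. */
final class PrimaryPreferenceSearch {
    /** Preference value named in this issue for primary-shards-only search. */
    static final String PRIMARY_PREFERENCE = "_primary";

    /** Returns the search path, adding the _primary preference only when a strong read is needed. */
    static String searchEndpoint(String index, boolean needsStrongRead) {
        String path = "/" + index + "/_search";
        return needsStrongRead ? path + "?preference=" + PRIMARY_PREFERENCE : path;
    }

    public static void main(String[] args) {
        System.out.println(searchEndpoint("my-index", true));
        System.out.println(searchEndpoint("my-index", false));
    }
}
```

Keeping the preference optional reflects the trade-off raised in this issue: primary-only search gives the latest data but puts the read load on primary shards.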

Open questions

If you have any questions or issues, please post them in the core issue.

Reference

[1] Design

[2] Documentation

@dreamer-89
Member Author

Requesting owners to add the v2.9.0 label to this issue.

heemin32 added a commit to heemin32/geospatial that referenced this issue Jun 29, 2023
1. Use _primary preference to get datasource metadata so that it can read the latest data. RefreshPolicy.IMMEDIATE won't refresh replica shards immediately according to opensearch-project#346
2. Set 0-all for auto_expand_replicas from the start so that geo data
can be available in all data nodes quickly enough after switching to the updated index
3. Update datasource metadata index mapping

Signed-off-by: Heemin Kim <[email protected]>
heemin32 added a commit to heemin32/geospatial that referenced this issue Jun 30, 2023
1. Use _primary preference to get datasource metadata so that it can read the latest data. RefreshPolicy.IMMEDIATE won't refresh replica shards immediately according to opensearch-project#346
2. Update datasource metadata index mapping

Signed-off-by: Heemin Kim <[email protected]>
heemin32 added a commit to heemin32/geospatial that referenced this issue Jun 30, 2023
1. Use _primary preference to get datasource metadata so that it can read the latest data. RefreshPolicy.IMMEDIATE won't refresh replica shards immediately according to opensearch-project#346
2. Update datasource metadata index mapping
3. Move batch size from static value to setting

Signed-off-by: Heemin Kim <[email protected]>
heemin32 added a commit that referenced this issue Jun 30, 2023
1. Use _primary preference to get datasource metadata so that it can read the latest data. RefreshPolicy.IMMEDIATE won't refresh replica shards immediately according to #346
2. Update datasource metadata index mapping
3. Move batch size from static value to setting

Signed-off-by: Heemin Kim <[email protected]>
@gaiksaya added the v2.9.0 label Jul 3, 2023
@dreamer-89
Member Author

Hi Plugin Owners,
Gentle reminder to look into this issue, as the code freeze date for the 2.9.0 release (July 11th) is near.

@heemin32 self-assigned this Jul 11, 2023
@heemin32
Collaborator

heemin32 commented Jul 11, 2023

There is one place where we use WAIT_UNTIL:

return client.prepareBulk().setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);

This is used when a customer uploads their GeoJSON file to the OpenSearch cluster. After the upload completes, the data might not yet be available for search on replicas when segment replication is enabled. We will accept this behavior as it is. Users can search using the primary-shards-only option if they need a strong read guarantee.

@dreamer-89
Member Author

Thanks @heemin32 for sharing the update and for the _primary based fix. I just wanted to check with you: using primary based search may add additional load on primary shards and may have performance implications. I assume you do not have a heavy read workload and have already verified that this is not a problem in your use case.

@heemin32
Collaborator

The read does not happen internally; it is user initiated. So the user should decide whether they want to wait or use the _primary option.

@dreamer-89
Member Author

dreamer-89 commented Jul 12, 2023

@heemin32: From the linked PR #347, I understand that _primary based search is used for all search requests that fetch datasource metadata, so I don't understand how the user would decide?

In my last comment, I meant that using the _primary search preference in high read throughput systems can have performance implications. Can you please confirm that your change will have no adverse impact?

@heemin32
Collaborator

heemin32 commented Jul 12, 2023

@dreamer-89 I am talking about what will be released in 2.9.0.

return client.prepareBulk().setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);

The PR #347 you mentioned will be released in 2.10.0.

Regarding PR #347: yes, there is no issue, as the datasource read is not a high throughput operation.

heemin32 added a commit to heemin32/geospatial that referenced this issue Jul 14, 2023
1. Use _primary preference to get datasource metadata so that it can read the latest data. RefreshPolicy.IMMEDIATE won't refresh replica shards immediately according to opensearch-project#346
2. Update datasource metadata index mapping
3. Move batch size from static value to setting

Signed-off-by: Heemin Kim <[email protected]>
heemin32 added a commit to heemin32/geospatial that referenced this issue Jul 21, 2023
1. Use _primary preference to get datasource metadata so that it can read the latest data. RefreshPolicy.IMMEDIATE won't refresh replica shards immediately according to opensearch-project#346
2. Update datasource metadata index mapping
3. Move batch size from static value to setting

Signed-off-by: Heemin Kim <[email protected]>
heemin32 added a commit that referenced this issue Jul 21, 2023
* Update gradle version to 7.6 (#265)

Signed-off-by: Vijayan Balasubramanian <[email protected]>

* Exclude lombok generated code from jacoco coverage report (#268)

Signed-off-by: Heemin Kim <[email protected]>

* Make jacoco report to be generated faster in local (#267)

Signed-off-by: Heemin Kim <[email protected]>

* Update dependency org.json:json to v20230227 (#273)

Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com>

* Baseline owners and maintainers (#275)

Signed-off-by: Vijayan Balasubramanian <[email protected]>

* Add Auto Release Workflow (#288)

Signed-off-by: Naveen Tatikonda <[email protected]>

* Change package for Strings.hasText (#314)

Signed-off-by: Heemin Kim <[email protected]>

* Adding release notes for 2.8 (#323)

Signed-off-by: Martin Gaievski <[email protected]>

* Add 2.9.0 release notes (#350)

Signed-off-by: Junqiu Lei <[email protected]>

* Update packages according to a change in OpenSearch core (#353)

Signed-off-by: Heemin Kim <[email protected]>

* Implement creation of ip2geo feature (#257)

* Update gradle version to 7.6 (#265)

Signed-off-by: Vijayan Balasubramanian <[email protected]>

* Implement creation of ip2geo feature

* Implementation of ip2geo datasource creation
* Implementation of ip2geo processor creation

Signed-off-by: Heemin Kim <[email protected]>
---------

Signed-off-by: Vijayan Balasubramanian <[email protected]>
Signed-off-by: Heemin Kim <[email protected]>
Co-authored-by: Vijayan Balasubramanian <[email protected]>

* Added unit tests with some refactoring of codes (#271)

* Add Unit tests
* Set cache true for search query
* Remove in memory cache implementation (Two way door decision)
 * Relying on search cache without custom cache
* Renamed datasource state from FAILED to CREATE_FAILED
* Renamed class name from *Helper to *Facade
* Changed updateIntervalInDays to updateInterval
* Changed value type of default update_interval from TimeValue to Long
* Read setting value from cluster settings directly

Signed-off-by: Heemin Kim <[email protected]>

* Sync from main (#280)

* Update gradle version to 7.6 (#265)

Signed-off-by: Vijayan Balasubramanian <[email protected]>

* Exclude lombok generated code from jacoco coverage report (#268)

Signed-off-by: Heemin Kim <[email protected]>

* Make jacoco report to be generated faster in local (#267)

Signed-off-by: Heemin Kim <[email protected]>

* Update dependency org.json:json to v20230227 (#273)

Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com>

* Baseline owners and maintainers (#275)

Signed-off-by: Vijayan Balasubramanian <[email protected]>

---------

Signed-off-by: Vijayan Balasubramanian <[email protected]>
Signed-off-by: Heemin Kim <[email protected]>
Co-authored-by: Vijayan Balasubramanian <[email protected]>
Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com>

* Add datasource name validation (#281)

Signed-off-by: Heemin Kim <[email protected]>

* Refactoring of code (#282)

1. Change variable name from datasourceName to name
2. Change variable name from id to name
3. Added helper methods in test code

Signed-off-by: Heemin Kim <[email protected]>

* Change field name from md5 to sha256 (#285)

Signed-off-by: Heemin Kim <[email protected]>

* Implement get datasource api (#279)

Signed-off-by: Heemin Kim <[email protected]>

* Update index option (#284)

1. Make geodata index as hidden
2. Make geodata index as read only allow delete after creation is done
3. Refresh datasource index immediately after update

Signed-off-by: Heemin Kim <[email protected]>

* Make some fields in manifest file as mandatory (#289)

Signed-off-by: Heemin Kim <[email protected]>

* Create datasource index explicitly (#283)

Signed-off-by: Heemin Kim <[email protected]>

* Add wrapper class of job scheduler lock service (#290)

Signed-off-by: Heemin Kim <[email protected]>

* Remove all unused client attributes (#293)

Signed-off-by: Heemin Kim <[email protected]>

* Update copyright header (#298)

Signed-off-by: Heemin Kim <[email protected]>

* Run system index handling code with stashed thread context (#297)

Signed-off-by: Heemin Kim <[email protected]>

* Reduce lock duration and renew the lock during update (#299)

Signed-off-by: Heemin Kim <[email protected]>

* Implements delete datasource API (#291)

Signed-off-by: Heemin Kim <[email protected]>

* Set User-Agent in http request (#300)

Signed-off-by: Heemin Kim <[email protected]>

* Implement datasource update API (#292)

Signed-off-by: Heemin Kim <[email protected]>

* Refactoring test code (#302)

Make buildGeoJSONFeatureProcessorConfig method to be more general

Signed-off-by: Heemin Kim <[email protected]>

* Add ip2geo processor integ test for failure case (#303)

Signed-off-by: Heemin Kim <[email protected]>

* Bug fix and refactoring of code (#305)

1. Bugfix: Ingest metadata can be null if there is no processor created
2. Refactoring: Moved private method to another class for better testing support
3. Refactoring: Set some private static final variable as public so that unit test can use it
4. Refactoring: Changed string value to static variable

Signed-off-by: Heemin Kim <[email protected]>

* Add integration test for Ip2GeoProcessor (#306)

Signed-off-by: Heemin Kim <[email protected]>

* Add ConcurrentModificationException (#308)

Signed-off-by: Heemin Kim <[email protected]>

* Add integration test for UpdateDatasource API (#307)

Signed-off-by: Heemin Kim <[email protected]>

* Bug fix on lock management and few performance improvements (#310)

* Release lock before response back to caller for update/delete API
* Release lock in background task for creation API
* Change index settings to improve indexing performance

Signed-off-by: Heemin Kim <[email protected]>

* Change index setting from read_only_allow_delete to write (#311)

read_only_allow_delete does not block write to an index.
The disk-based shard allocator may add and remove this block automatically.
Therefore, use index.blocks.write instead.

Signed-off-by: Heemin Kim <[email protected]>

* Fix bug in get datasource API and improve memory usage (#313)

Signed-off-by: Heemin Kim <[email protected]>

* Change package for Strings.hasText (#314) (#317)

Signed-off-by: Heemin Kim <[email protected]>

* Remove jitter and move index setting from DatasourceFacade to DatasourceExtension (#319)

Signed-off-by: Heemin Kim <[email protected]>

* Do not index blank value and do not enrich null property (#320)

Signed-off-by: Heemin Kim <[email protected]>

* Move index setting keys to constants (#321)

Signed-off-by: Heemin Kim <[email protected]>

* Return null index name for expired data (#322)

Return null index name for expired data so that it can be deleted
by the clean up process. The clean up process excludes the current index from deletion.
Signed-off-by: Heemin Kim <[email protected]>

* Add new fields in datasource (#325)

Signed-off-by: Heemin Kim <[email protected]>

* Delete index once it is expired (#326)

Signed-off-by: Heemin Kim <[email protected]>

* Add restoring event listener (#328)

In the listener, we trigger a geoip data update

Signed-off-by: Heemin Kim <[email protected]>

* Reverse forcemerge and refresh order (#331)

Otherwise, opensearch does not clear old segment files

Signed-off-by: Heemin Kim <[email protected]>

* Removed parameter and settings (#332)

* Removed first_only parameter
* Removed max_concurrency and batch_size setting

The first_only parameter was added because the current geoip processor has it.
However, the parameter has no benefit for the ip2geo processor, as we don't do a sequential search for array data but use multi search.

The max_concurrency and batch_size settings are removed because they only reveal internal implementation and could block future performance improvements.

Signed-off-by: Heemin Kim <[email protected]>

* Add a field in datasource for current index name (#333)

Signed-off-by: Heemin Kim <[email protected]>

* Delete GeoIP data indices after restoring complete (#334)

We don't want to use restored GeoIP data indices. Therefore we
delete the indices once restoring process complete.

When GeoIP metadata index is restored, we create a new GeoIP data index instead.

Signed-off-by: Heemin Kim <[email protected]>

* Use bool query for array form of IPs (#335)

Signed-off-by: Heemin Kim <[email protected]>

* Run update/delete request in a new thread (#337)

This is not to block transport thread

Signed-off-by: Heemin Kim <[email protected]>

* Remove IP2Geo processor validation (#336)

Cannot query index to get data to validate IP2Geo processor.
Will add validation when we decide to store some of data in cluster state metadata.

Signed-off-by: Heemin Kim <[email protected]>

* Acquire lock synchronously (#339)

By acquiring lock asynchronously, the remaining part of the code
is being run by transport thread which does not allow blocking code.
We want only a single update to happen in a node using a single thread. However,
it cannot be achieved if I acquire lock asynchronously and pass the listener.

Signed-off-by: Heemin Kim <[email protected]>

* Added a cache to store datasource metadata (#338)

Signed-off-by: Heemin Kim <[email protected]>

* Changed class name and package (#341)

Signed-off-by: Heemin Kim <[email protected]>

* Refactoring of code (#342)

1. Changed class name from Ip2GeoCache to Ip2GeoCachedDao
2. Moved the Ip2GeoCachedDao from cache to dao package

Signed-off-by: Heemin Kim <[email protected]>

* Add geo data cache (#340)

Signed-off-by: Heemin Kim <[email protected]>

* Add cache layer to reduce GeoIp data retrieval latency (#343)

Signed-off-by: Heemin Kim <[email protected]>

* Use _primary in query preference and few changes (#347)

1. Use _primary preference to get datasource metadata so that it can read the latest data. RefreshPolicy.IMMEDIATE won't refresh replica shards immediately according to #346
2. Update datasource metadata index mapping
3. Move batch size from static value to setting

Signed-off-by: Heemin Kim <[email protected]>

* Wait until GeoIP data to be replicated to all data nodes (#348)

Signed-off-by: Heemin Kim <[email protected]>

* Update packages according to a change in OpenSearch core (#354)

* Update packages according to a change in OpenSearch core

Signed-off-by: Heemin Kim <[email protected]>

* Update packages according to a change in OpenSearch core (#353)

Signed-off-by: Heemin Kim <[email protected]>

---------

Signed-off-by: Heemin Kim <[email protected]>

---------

Signed-off-by: Vijayan Balasubramanian <[email protected]>
Signed-off-by: Heemin Kim <[email protected]>
Signed-off-by: Naveen Tatikonda <[email protected]>
Signed-off-by: Martin Gaievski <[email protected]>
Signed-off-by: Junqiu Lei <[email protected]>
Co-authored-by: Vijayan Balasubramanian <[email protected]>
Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com>
Co-authored-by: Naveen Tatikonda <[email protected]>
Co-authored-by: Martin Gaievski <[email protected]>
Co-authored-by: Junqiu Lei <[email protected]>
heemin32 added a commit that referenced this issue Jul 24, 2023
opensearch-trigger-bot bot pushed a commit that referenced this issue Jul 24, 2023
4. Refactoring: Changed string value to static variable

Signed-off-by: Heemin Kim <[email protected]>

* Add integration test for Ip2GeoProcessor (#306)

Signed-off-by: Heemin Kim <[email protected]>

* Add ConcurrentModificationException (#308)

Signed-off-by: Heemin Kim <[email protected]>

* Add integration test for UpdateDatasource API (#307)

Signed-off-by: Heemin Kim <[email protected]>

* Bug fix on lock management and few performance improvements (#310)

* Release lock before response back to caller for update/delete API
* Release lock in background task for creation API
* Change index settings to improve indexing performance

Signed-off-by: Heemin Kim <[email protected]>

* Change index setting from read_only_allow_delete to write (#311)

read_only_allow_delete does not block write to an index.
The disk-based shard allocator may add and remove this block automatically.
Therefore, use index.blocks.write instead.

Signed-off-by: Heemin Kim <[email protected]>
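
For readers following along, the switch to `index.blocks.write` described in #311 amounts to an update-settings request on the index. Below is a minimal sketch of the request body only, using Python's standard library. The helper name `write_block_settings` is hypothetical and is not taken from the plugin code.

```python
import json


def write_block_settings(blocked: bool = True) -> str:
    """Return a JSON body for an update-settings request (PUT /<index>/_settings).

    Hypothetical helper for illustration; it only builds the body and
    does not send anything to a cluster.
    """
    # index.blocks.write is the setting named in the commit message above.
    return json.dumps({"index.blocks.write": blocked})


if __name__ == "__main__":
    # Body that blocks writes once the data index has been populated.
    print(write_block_settings(True))
```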

* Fix bug in get datasource API and improve memory usage (#313)

Signed-off-by: Heemin Kim <[email protected]>

* Change package for Strings.hasText (#314) (#317)

Signed-off-by: Heemin Kim <[email protected]>

* Remove jitter and move index setting from DatasourceFacade to DatasourceExtension (#319)

Signed-off-by: Heemin Kim <[email protected]>

* Do not index blank value and do not enrich null property (#320)

Signed-off-by: Heemin Kim <[email protected]>

* Move index setting keys to constants (#321)

Signed-off-by: Heemin Kim <[email protected]>

* Return null index name for expired data (#322)

Return null index name for expired data so that it can be deleted
by the cleanup process. The cleanup process excludes the current index from deletion.

Signed-off-by: Heemin Kim <[email protected]>

* Add new fields in datasource (#325)

Signed-off-by: Heemin Kim <[email protected]>

* Delete index once it is expired (#326)

Signed-off-by: Heemin Kim <[email protected]>

* Add restoring event listener (#328)

In the listener, we trigger a geoip data update

Signed-off-by: Heemin Kim <[email protected]>

* Reverse forcemerge and refresh order (#331)

Otherwise, OpenSearch does not clear old segment files

Signed-off-by: Heemin Kim <[email protected]>

* Removed parameter and settings (#332)

* Removed first_only parameter
* Removed max_concurrency and batch_size setting

The first_only parameter was added because the current geoip processor has it.
However, the parameter has no benefit for the ip2geo processor, as we don't do a sequential search for array data but use multi search.

The max_concurrency and batch_size settings are removed because they only reveal internal implementation details and could block future performance improvements.

Signed-off-by: Heemin Kim <[email protected]>

* Add a field in datasource for current index name (#333)

Signed-off-by: Heemin Kim <[email protected]>

* Delete GeoIP data indices after restoring complete (#334)

We don't want to use restored GeoIP data indices. Therefore, we
delete the indices once the restore process completes.

When GeoIP metadata index is restored, we create a new GeoIP data index instead.

Signed-off-by: Heemin Kim <[email protected]>

* Use bool query for array form of IPs (#335)

Signed-off-by: Heemin Kim <[email protected]>

* Run update/delete request in a new thread (#337)

This avoids blocking the transport thread

Signed-off-by: Heemin Kim <[email protected]>

* Remove IP2Geo processor validation (#336)

Cannot query index to get data to validate IP2Geo processor.
Will add validation when we decide to store some of the data in cluster state metadata.

Signed-off-by: Heemin Kim <[email protected]>

* Acquire lock synchronously (#339)

When the lock is acquired asynchronously, the remaining part of the code
is run by the transport thread, which does not allow blocking code.
We want only a single update to happen on a node, using a single thread. However,
that cannot be achieved if the lock is acquired asynchronously and a listener is passed.

Signed-off-by: Heemin Kim <[email protected]>

* Added a cache to store datasource metadata (#338)

Signed-off-by: Heemin Kim <[email protected]>

* Changed class name and package (#341)

Signed-off-by: Heemin Kim <[email protected]>

* Refactoring of code (#342)

1. Changed class name from Ip2GeoCache to Ip2GeoCachedDao
2. Moved the Ip2GeoCachedDao from cache to dao package

Signed-off-by: Heemin Kim <[email protected]>

* Add geo data cache (#340)

Signed-off-by: Heemin Kim <[email protected]>

* Add cache layer to reduce GeoIp data retrieval latency (#343)

Signed-off-by: Heemin Kim <[email protected]>

* Use _primary in query preference and few changes (#347)

1. Use _primary preference to get datasource metadata so that it can read the latest data. RefreshPolicy.IMMEDIATE won't refresh replica shards immediately according to #346
2. Update datasource metadata index mapping
3. Move batch size from static value to setting

Signed-off-by: Heemin Kim <[email protected]>

* Wait until GeoIP data to be replicated to all data nodes (#348)

Signed-off-by: Heemin Kim <[email protected]>

* Update packages according to a change in OpenSearch core (#354)

* Update packages according to a change in OpenSearch core

Signed-off-by: Heemin Kim <[email protected]>

* Update packages according to a change in OpenSearch core (#353)

Signed-off-by: Heemin Kim <[email protected]>

---------

Signed-off-by: Heemin Kim <[email protected]>

---------

Signed-off-by: Vijayan Balasubramanian <[email protected]>
Signed-off-by: Heemin Kim <[email protected]>
Co-authored-by: Vijayan Balasubramanian <[email protected]>
Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com>
(cherry picked from commit 0cd9153)
heemin32 added a commit that referenced this issue Jul 24, 2023
@dreamer-89
Member Author

Follow up for plugin owners

With #347, we also added the _primary preference for search queries. I want to call out the downside of using the _primary preference: the request will fail if the primary is unavailable, rather than being routed to healthy replicas, so there is an availability hit. If availability is a concern, _primary_first is another option; it attempts the primary first and falls back to replicas only if the node is weighed away. So it is a question of consistency (_primary) vs availability (_primary_first). I will let plugin owners decide on that and take the necessary action.
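
To make the choice concrete, here is a minimal sketch of how the two preferences might be selected when building a search request URL. It uses only Python's standard library and does not contact a cluster. The helper names, the `favor_consistency` flag, the host, and the index name `my-index` are all hypothetical; only the `preference` query parameter and the values `_primary` and `_primary_first` come from the discussion above.

```python
from urllib.parse import urlencode


def choose_preference(favor_consistency: bool) -> str:
    """Pick a shard preference for the trade-off described above.

    _primary favors consistency (reads the primary only);
    _primary_first favors availability (may fall back to replicas).
    """
    return "_primary" if favor_consistency else "_primary_first"


def build_search_url(base_url: str, index: str, favor_consistency: bool) -> str:
    """Build a _search URL carrying the chosen preference (hypothetical helper)."""
    query = urlencode({"preference": choose_preference(favor_consistency)})
    return f"{base_url.rstrip('/')}/{index}/_search?{query}"


if __name__ == "__main__":
    # Metadata reads that must see the latest write: favor consistency.
    print(build_search_url("http://localhost:9200", "my-index", True))
    # Reads that should survive a primary outage: favor availability.
    print(build_search_url("http://localhost:9200", "my-index", False))
```

Keeping the decision in one small function means a plugin can change its stance later without touching every call site.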

Pinging folks for traction @heemin32 @junqiu-lei @naveentatikonda @vamshin
