Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verify Usage of Opensearch Core System and Hidden Indices #1203

Open
Tracked by #9239
Rishikesh1159 opened this issue Aug 11, 2023 · 12 comments
Open
Tracked by #9239

Verify Usage of Opensearch Core System and Hidden Indices #1203

Rishikesh1159 opened this issue Aug 11, 2023 · 12 comments
Labels
enhancement New feature or request

Comments

@Rishikesh1159
Copy link
Member

Rishikesh1159 commented Aug 11, 2023

Description/Concept of System Indices and Hidden Indices on Opensearch core:

System Index - An index must extend system index plugin for a index to be called as system index.
Example: Security Plugin correctly extends and uses system Indices, more info here

Hidden index - An index must have hidden SETTING_INDEX_HIDDEN = "index.hidden" set on the index setting to call it as hidden. It doesn't matter if it starts with "." or not.
Example: Asynchronous-Search plugin correctly sets the index setting value here

Misconception:

Many plugins still misunderstand the actual definition of system and hidden indices. Usual misconception is that any index starting with . like .indexName is a system or hidden index, but this is incorrect. Any user can create an index with .indexName which is neither a system or hidden index. So users can mistakenly still create index starting with "." there is nothing stopping them from doing it.

To avoid this misconception all plugins should adopt/on-board with concept of system and hidden indices defined in opensearch core.

Goal:

The main ask of this issue is to make sure all plugins having/using system and hidden indices must on-board/adopt with concept defined in opensearch core.

Any plugin already on-board with opensearch core defined concept of system and hidden indices can ignore this issue and close the issue as completed on the plugin repo.

Additional info:

The following info provided below is not necessary for system/hidden indices, but might be useful info for plugins using system indices :

If your system indices need additonal security features/benefits provided by security plugin, follow the steps provided here. But to make sure these are additonal features provided by security plugin and it is completely decoupled from concept of system indices. It is upto the plugin owners to decide if they need these additional security benefits.

Open questions

In case of any questions or issues, please post it in core issue

@ylwu-amzn
Copy link
Collaborator

ylwu-amzn commented Aug 14, 2023

Have same confusion like opensearch-project/OpenSearch#9239 (comment)

Not to add more confusion to system indices, but the security plugin also has a notion of protected indices which are indices that are given special protections, but are not system indices.

ml-commons is using security protected indices, not system you defined as

An index must extend system index plugin for a index to be called as system index.

Should we close this issue for now?

@Rishikesh1159
Copy link
Member Author

@ylwu-amzn I think, we need to figure out what is the purpose of index here. If the index holds some important info related to system/plugin and if we don't want a regular user to play around with index then it makes sense to define the index as system index. I want to make sure we are not intermixing both system index and security provided protected indices, both are different concepts and are completely decoupled.

It's upto plugin owners to decide which indices should be
-> system indices and
-> which should be security protected indices and
-> which should be both system and security protected indices

@saratvemulapalli
Copy link
Member

Some confusion here @Rishikesh1159.
In ML-Commons/most plugins, they do not implement the system index interface[1] instead use security plugin to configure security plugin's "system indices"[2]. Security Plugin internally doesn't register[3] these with core i.e from core perspective ML-Commons/plugins do not use core system index SystemIndexDescriptor[1].

Security Plugin introduced the concept of system indices with commit[4], this is a new feature while deprecating protected indices[5].

That said coming back to the original issue, ML-Commons has 2 use cases:

  1. When no security in the cluster: ML-Commons doesn't care if a user is accessing plugin metadata, because all users can access all resources (models, connectors etc in this case) i.e it doesn't need a SystemIndex.
  2. When security is present in the cluster: ML-Commons wants to provide fine grained access control (FGAC) for its resources and it would only be possible by constructs provided by the security plugin (like user identity, roles, backend roles etc). So without security plugin, FGAC doesn't work which is why ML-Commons uses the system index concept introduced by security plugin (not core).

I hope this helps. This is how most plugins are configured to use today. Closing this issue, let me know if you need anything else.

[1] https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/plugins/SystemIndexPlugin.java
[2] https://github.com/opensearch-project/security/blob/main/tools/install_demo_configuration.sh#L385-L386
[3] https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/OpenSearchSecurityPlugin.java#L1886
[4] opensearch-project/security@d420f16
[5] https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/support/ConfigConstants.java#L296

@Arpit-Bandejiya
Copy link

Hi @saratvemulapalli , as part of this activity we want to move the indices created by the plugins to have a standardization. If your team feels that they should not onboard to system indices, they can create the indices with hidden property true(more details in the meta issue). Though I would strongly recommend that verify if you need special handling for the index created by your plugin by core and onboard to system indices.

That said coming back to the original issue, ML-Commons has 2 use cases:
When no security in the cluster: ML-Commons doesn't care if a user is accessing plugin metadata, because all users can access all resources (models, connectors etc in this case) i.e it doesn't need a SystemIndex.
When security is present in the cluster: ML-Commons wants to provide fine grained access control (FGAC) for its resources and it would only be possible by constructs provided by the security plugin (like user identity, roles, backend roles etc). So without security plugin, FGAC doesn't work which is why ML-Commons uses the system index concept introduced by security plugin (not core).

I think we are confusing the system indices with how a user can access the indices or not in non fgac or fgac setting.

You can take the following issues/PR/code on how other plugins are handling it:

https://github.com/opensearch-project/asynchronous-search/blob/e70aa78c38cf3fa0e7b29656acbd085473dbda4c/src/main/java/org/opensearch/search/asynchronous/plugin/AsynchronousSearchPlugin.java#L76

opensearch-project/anomaly-detection#975

opensearch-project/cross-cluster-replication#1290

@dhrubo-os
Copy link
Collaborator

Regarding this comment, I feel like we should re-visit our system indices implementation especially for encryption key. Currently we need to depend on security plugin, which is a bottleneck for us.

@ylwu-amzn
Copy link
Collaborator

ylwu-amzn commented Dec 13, 2023

I think we should follow same logic as AD opensearch-project/anomaly-detection#975 (comment)

All AD indices are correctly configured as hidden indices. We have intentionally not enabled them as system indices. While system indices benefit from a dedicated threadpool for read/write operations and have higher priority compared to user indices, we want to ensure that the operations related to AD indices do not disproportionately impact the overall system performance. Hence, we chose not to use system indices for AD.

It's same for ml-commons, we don't want to impact the overall system performance

@Arpit-Bandejiya
Copy link

Regarding opensearch-project/OpenSearch#9239 (comment), I feel like we should re-visit our system indices implementation especially for encryption key. Currently we need to depend on security plugin, which is a bottleneck for us.

Sorry about the initial confusion, the system indices of core has no correlation with accessibility and have updated the comment. Thanks!

@Arpit-Bandejiya
Copy link

It's same for ml-commons, we don't want to impact the overall system performance

What all operations we do on the system indices of ml-commons? Are we doing heavy ingestion/search on these indices?

@ylwu-amzn
Copy link
Collaborator

What all operations we do on the system indices of ml-commons? Are we doing heavy ingestion/search on these indices?

We save model/task related data into system indices when user create/update. User can get/search these data. No heavy ingestion/search on these indices.

@saratvemulapalli
Copy link
Member

@Arpit-Bandejiya
AFAIK system indices in ML-Commons is used for protecting the access to the plugin metadata index.
ML-Commons doesn't really have heavy ingest/search operations on these indices, which is why we really do not want use system indices and do not dent overall performance of the cluster.

@Arpit-Bandejiya
Copy link

Arpit-Bandejiya commented Dec 19, 2023

I think there is some confusion here:

  1. If the indices are not ingest/search operation heavy, we should ideally move the indices to the system indices. This is for better resiliency of the system indices. Why do we believe it will impact the performance of the cluster?
  2. The system indices list of security plugin is a totally different thing, that is used for security of the index being accessed. The current issue is for the system indices of the core.

As far as I understood, we can move the ml-commons indices to system indices of core as well(Do not change anything in system indices list of security plugin). This will be helpfull for cases when the backpressure is applied and the ml-common plugin is being ingest/searched. We won't be rejecting those requests and the plugin functionality won't be impacted.

@dblock dblock removed the untriaged label Jun 6, 2024
@dblock
Copy link
Member

dblock commented Jun 6, 2024

[Triage -- attendees 1, 2, 3, 4, 5, 6, 7]

@Arpit-Bandejiya what do you recommend we do with this issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

6 participants