Skip to content

Commit

Permalink
Merge branch 'main' into fix/explicit-security-config
Browse files Browse the repository at this point in the history
  • Loading branch information
Naarcha-AWS authored Apr 1, 2024
2 parents 5ad7dcf + 13eca4f commit b069d99
Show file tree
Hide file tree
Showing 52 changed files with 1,023 additions and 411 deletions.
2 changes: 2 additions & 0 deletions .github/vale/styles/Vocab/OpenSearch/Words/accept.txt
Original file line number Diff line number Diff line change
Expand Up @@ -81,6 +81,7 @@ Levenshtein
[Oo]nboarding
pebibyte
[Pp]erformant
[Pp]laintext
[Pp]luggable
[Pp]reconfigure
[Pp]refetch
Expand All @@ -92,6 +93,7 @@ pebibyte
[Pp]reprocess
[Pp]retrain
[Pp]seudocode
[Quantiz](e|ation|ing|er)
[Rr]ebalance
[Rr]ebalancing
[Rr]edownload
Expand Down
Binary file not shown.
15 changes: 0 additions & 15 deletions .github/workflows/delete_backport_branch.yml

This file was deleted.

22 changes: 22 additions & 0 deletions .github/workflows/delete_merged_branch.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
name: Delete merged branch of the PRs
on:
pull_request:
types:
- closed

jobs:
delete-branch:
runs-on: ubuntu-latest
if: |
startsWith(github.event.pull_request.head.repo.full_name, 'opensearch-project/documentation-website') &&
${{ !startsWith(github.event.pull_request.head.ref, 'main') }} &&
${{ !startsWith(github.event.pull_request.head.ref, '1.') }} &&
${{ !startsWith(github.event.pull_request.head.ref, '2.') }} &&
${{ !startsWith(github.event.pull_request.head.ref, 'version/') }}
steps:
- name: Echo remove branch
run: echo Removing ${{github.event.pull_request.head.ref}}
- name: Delete merged branch
uses: SvanBoxel/delete-merged-branch@main
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
2 changes: 1 addition & 1 deletion _automating-configurations/api/create-workflow.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ Creating a workflow adds the content of a workflow template to the flow framewor

To obtain the validation template for workflow steps, call the [Get Workflow Steps API]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-steps/).

You can include placeholder expressions in the value of workflow step fields. For example, you can specify a credential field in a template as `openAI_key: '${{ openai_key }}'`. The expression will be substituted with the user-provided value during provisioning, using the format `${{ <value> }}`. You can pass the actual key as a parameter using the [Provision Workflow API]({{site.url}}{{site.baseurl}}/automating-configurations/api/provision-workflow/) or using this API with the `provision` parameter set to `true`.
You can include placeholder expressions in the value of workflow step fields. For example, you can specify a credential field in a template as `openAI_key: '${{ openai_key }}'`. The expression will be substituted with the user-provided value during provisioning, using the format {% raw %}`${{ <value> }}`{% endraw %}. You can pass the actual key as a parameter by using the [Provision Workflow API]({{site.url}}{{site.baseurl}}/automating-configurations/api/provision-workflow/) or by using this API with the `provision` parameter set to `true`.

Once a workflow is created, provide its `workflow_id` to other APIs.

Expand Down
2 changes: 1 addition & 1 deletion _automating-configurations/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ redirect_from: /automating-configurations/
---

# Automating configurations
**Introduced 2.12**
**Introduced 2.13**
{: .label .label-purple }

You can automate complex OpenSearch setup and preprocessing tasks by providing templates for common use cases. For example, automating machine learning (ML) setup tasks streamlines the use of OpenSearch ML offerings.
Expand Down
147 changes: 147 additions & 0 deletions _automating-configurations/workflow-templates.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,147 @@
---
layout: default
title: Workflow templates
nav_order: 25
---

# Workflow templates

OpenSearch provides several workflow templates for some common machine learning (ML) use cases, such as semantic or conversational search.

You can specify a workflow template when you call the [Create Workflow API]({{site.url}}{{site.baseurl}}/automating-configurations/api/create-workflow/). To provision the workflow, specify `provision=true` as a query parameter. For example, you can configure [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/) by providing the `local_neural_sparse_search_bi_encoder` query parameter as `use_case`, as shown in the following request:

```json
POST /_plugins/_flow_framework/workflow?use_case=local_neural_sparse_search_bi_encoder
```
{% include copy-curl.html %}

The workflow created using this template performs the following configuration steps:

- Deploys the default pretrained sparse encoding model (`amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`).
- Creates an ingest pipeline that contains a `sparse_encoding` processor, which converts the text in a document field to vector embeddings using the deployed model.
- Creates a sample index for sparse search, specifying the default pipeline as the newly created ingest pipeline.

## Parameters

Each workflow template has a defined schema and a set of APIs with predefined default values for each step. For more information about template parameter defaults, see [Supported workflow templates](#supported-workflow-templates).

### Overriding default values

To override a template's default values, provide the new values in the request body when sending a create workflow request. For example, the following request changes the Cohere model, the name of the `text_embedding` processor output field, and the name of the sparse index of the `semantic_search_with_cohere_embedding` template:

```json
POST /_plugins/_flow_framework/workflow?use_case=semantic_search_with_cohere_embedding
{
"create_connector.model" : "embed-multilingual-v3.0",
"text_embedding.field_map.output": "book_embedding",
"create_index.name": "sparse-book-index"
}
```
{% include copy-curl.html %}

## Example

In this example, you'll configure the `semantic_search_with_cohere_embedding_query_enricher` workflow template. The workflow created using this template performs the following configuration steps:

- Deploys an externally hosted Cohere model
- Creates an ingest pipeline using the model
- Creates a sample k-NN index and configures a search pipeline to define the default model ID for that index

### Step 1: Create and provision the workflow

Send the following request to create and provision a workflow using the `semantic_search_with_cohere_embedding_query_enricher` workflow template. The only required request body field for this template is the API key for the Cohere Embed model:

```json
POST /_plugins/_flow_framework/workflow?use_case=semantic_search_with_cohere_embedding_query_enricher&provision=true
{
"create_connector.credential.key" : "<YOUR API KEY>"
}
```
{% include copy-curl.html %}

OpenSearch responds with a workflow ID for the created workflow:

```json
{
"workflow_id" : "8xL8bowB8y25Tqfenm50"
}
```

The workflow in the previous step creates a default k-NN index. The default index name is `my-nlp-index`:

```json
{
"create_index.name": "my-nlp-index"
}
```

For all default parameter values for this workflow template, see [Cohere Embed semantic search defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-embedding-semantic-search-defaults.json).

### Step 2: Ingest documents into the index

To ingest documents into the index created in the previous step, send the following request:

```json
PUT /my-nlp-index/_doc/1
{
"passage_text": "Hello world",
"id": "s1"
}
```
{% include copy-curl.html %}

### Step 3: Perform vector search

To perform a vector search on your index, use a [`neural` query]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural/) clause:

```json
GET /my-nlp-index/_search
{
"_source": {
"excludes": [
"passage_embedding"
]
},
"query": {
"neural": {
"passage_embedding": {
"query_text": "Hi world",
"k": 100
}
}
}
}
```
{% include copy-curl.html %}

## Viewing workflow resources

The workflow you created provisioned all the necessary resources for semantic search. To view the provisioned resources, call the [Get Workflow Status API]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-status/) and provide the `workflowID` for your workflow:

```json
GET /_plugins/_flow_framework/workflow/8xL8bowB8y25Tqfenm50/_status
```
{% include copy-curl.html %}

## Supported workflow templates

The following table lists the supported workflow templates. To use a workflow template, specify it in the `use_case` query parameter when creating a workflow.

| Template use case | Description | Required parameters | Defaults |
| `bedrock_titan_embedding_model_deploy` | Creates and deploys an Amazon Bedrock embedding model (by default, `titan-embed-text-v1`).| `create_connector.credential.access_key`, `create_connector.credential.secret_key`, `create_connector.credential.session_token` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/bedrock-titan-embedding-defaults.json)|
| `bedrock_titan_multimodal_model_deploy` | Creates and deploys an Amazon Bedrock multimodal embedding model (by default, `titan-embed-image-v1`). | `create_connector.credential.access_key`, `create_connector.credential.secret_key`, `create_connector.credential.session_token` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/bedrock-titan-multimodal-defaults.json). |
| `cohere_embedding_model_deploy`| Creates and deploys a Cohere embedding model (by default, `embed-english-v3.0`). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-embedding-defaults.json) |
| `cohere_chat_model_deploy` | Creates and deploys a Cohere chat model (by default, Cohere Command). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-chat-defaults.json) |
| `open_ai_embedding_model_deploy` | Creates and deploys an OpenAI embedding model (by default, `text-embedding-ada-002`). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/openai-embedding-defaults.json) |
| `openai_chat_model_deploy` | Creates and deploys an OpenAI chat model (by default, `gpt-3.5-turbo`). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/openai-chat-defaults.json) |
| `local_neural_sparse_search_bi_encoder` | Configures [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/): <br> - Deploys a pretrained sparse encoding model.<br> - Creates an ingest pipeline with a sparse encoding processor. <br> - Creates a sample index to use for sparse search, specifying the newly created pipeline as the default pipeline. | None |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/local-sparse-search-biencoder-defaults.json) |
| `semantic_search` | Configures [semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/): <br> - Creates an ingest pipeline with a `text_embedding` processor and a k-NN index <br> You must provide the model ID of the text embedding model to be used. | `create_ingest_pipeline.model_id` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/semantic-search-defaults.json) |
| `semantic_search_with_query_enricher` | Configures [semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/) similarly to the `semantic_search` template. Adds a [`query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) search processor that sets a default model ID for neural queries. You must provide the model ID of the text embedding model to be used. | `create_ingest_pipeline.model_id` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/semantic-search-query-enricher-defaults.json) |
| `semantic_search_with_cohere_embedding` | Configures [semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/) and deploys a Cohere embedding model. You must provide the API key for the Cohere model. | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-embedding-semantic-search-defaults.json) |
| `semantic_search_with_cohere_embedding_query_enricher` | Configures [semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/) and deploys a Cohere embedding model. Adds a [`query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) search processor that sets a default model ID for neural queries. You must provide the API key for the Cohere model. | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-embedding-semantic-search-with-query-enricher-defaults.json) |
| `multimodal_search` | Configures an ingest pipeline with a `text_image_embedding` processor and a k-NN index for [multimodal search]({{site.url}}{{site.baseurl}}/search-plugins/multimodal-search/). You must provide the model ID of the multimodal embedding model to be used. | `create_ingest_pipeline.model_id` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/multi-modal-search-defaults.json) |
| `multimodal_search_with_bedrock_titan` | Deploys an Amazon Bedrock multimodal model and configures an ingest pipeline with a `text_image_embedding` processor and a k-NN index for [multimodal search]({{site.url}}{{site.baseurl}}/search-plugins/multimodal-search/). You must provide your AWS credentials. | `create_connector.credential.access_key`, `create_connector.credential.secret_key`, `create_connector.credential.session_token` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/multimodal-search-bedrock-titan-defaults.json) |
| `hybrid_search` | Configures [hybrid search]({{site.url}}{{site.baseurl}}/search-plugins/hybrid-search/): <br> - Creates an ingest pipeline, a k-NN index, and a search pipeline with a `normalization_processor`. You must provide the model ID of the text embedding model to be used. | `create_ingest_pipeline.model_id` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/hybrid-search-defaults.json) |
| `conversational_search_with_llm_deploy` | Deploys a large language model (LLM) (by default, Cohere Chat) and configures a search pipeline with a `retrieval_augmented_generation` processor for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/conversational-search-defaults.json) |


Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@
---
layout: default
title: Creating OpenSearch Benchmark workloads
title: Creating custom workloads
nav_order: 10
parent: User guide
redirect_from:
- /benchmark/creating-custom-workloads/
- /benchmark/user-guide/creating-custom-workloads
- /benchmark/user-guide/creating-osb-workloads/
---

# Creating custom workloads
Expand Down
8 changes: 1 addition & 7 deletions _dashboards/dashboards-assistant/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,14 +6,11 @@ has_children: false
has_toc: false
---

This is an experimental feature and is not recommended for use in a production environment. For updates on the feature's progress or to leave feedback, go to the [`dashboards-assistant` repository](https://github.com/opensearch-project/dashboards-assistant) on GitHub or the associated [OpenSearch forum thread](https://forum.opensearch.org/t/feedback-opensearch-assistant/16741).
{: .warning}

Note that machine learning models are probabilistic and that some may perform better than others, so the OpenSearch Assistant may occasionally produce inaccurate information. We recommend evaluating outputs for accuracy as appropriate to your use case, including reviewing the output or combining it with other verification factors.
{: .important}

# OpenSearch Assistant for OpenSearch Dashboards
Introduced 2.12
**Introduced 2.13**
{: .label .label-purple }

The OpenSearch Assistant toolkit helps you create AI-powered assistants for OpenSearch Dashboards without requiring you to have specialized query tools or skills.
Expand Down Expand Up @@ -49,9 +46,6 @@ A screenshot of the interface is shown in the following image.

<img width="700" src="{{site.url}}{{site.baseurl}}/images/dashboards/opensearch-assistant-full-frame.png" alt="OpenSearch Assistant interface">

For more information about ways to enable experimental features, see [Experimental feature flags]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/experimental/).
{: .note}

## Configuring OpenSearch Assistant

You can use the OpenSearch Dashboards interface to configure OpenSearch Assistant. Go to the [Getting started guide](https://github.com/opensearch-project/dashboards-assistant/blob/main/GETTING_STARTED_GUIDE.md) for step-by-step instructions. For the chatbot template, go to the [Flow Framework plugin](https://github.com/opensearch-project/flow-framework) documentation. You can modify this template to use your own model and customize the chatbot tools.
Expand Down
1 change: 0 additions & 1 deletion _data-prepper/pipelines/configuration/sinks/opensearch.md
Original file line number Diff line number Diff line change
Expand Up @@ -91,7 +91,6 @@ Option | Required | Type | Description
`document_root_key` | No | String | The key in the event that will be used as the root in the document. The default is the root of the event. If the key does not exist, then the entire event is written as the document. If `document_root_key` is of a basic value type, such as a string or integer, then the document will have a structure of `{"data": <value of the document_root_key>}`.
`serverless` | No | Boolean | Determines whether the OpenSearch backend is Amazon OpenSearch Serverless. Set this value to `true` when the destination for the `opensearch` sink is an Amazon OpenSearch Serverless collection. Default is `false`.
`serverless_options` | No | Object | The network configuration options available when the backend of the `opensearch` sink is set to Amazon OpenSearch Serverless. For more information, see [Serverless options](#serverless-options).

<!-- vale on -->

## aws
Expand Down
Loading

0 comments on commit b069d99

Please sign in to comment.