Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add documentation for default use cases #6767

Merged
merged 13 commits into from
Mar 29, 2024
2 changes: 1 addition & 1 deletion _automating-configurations/api/create-workflow.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ Creating a workflow adds the content of a workflow template to the flow framewor

To obtain the validation template for workflow steps, call the [Get Workflow Steps API]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-steps/).

You can include placeholder expressions in the value of workflow step fields. For example, you can specify a credential field in a template as `openAI_key: '${{ openai_key }}'`. The expression will be substituted with the user-provided value during provisioning, using the format `${{ <value> }}`. You can pass the actual key as a parameter using the [Provision Workflow API]({{site.url}}{{site.baseurl}}/automating-configurations/api/provision-workflow/) or using this API with the `provision` parameter set to `true`.
You can include placeholder expressions in the value of workflow step fields. For example, you can specify a credential field in a template as `openAI_key: '${{ openai_key }}'`. The expression will be substituted with the user-provided value during provisioning, using the format {% raw %}`${{ <value> }}`{% endraw %}. You can pass the actual key as a parameter using the [Provision Workflow API]({{site.url}}{{site.baseurl}}/automating-configurations/api/provision-workflow/) or using this API with the `provision` parameter set to `true`.
kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved

Once a workflow is created, provide its `workflow_id` to other APIs.

Expand Down
145 changes: 145 additions & 0 deletions _automating-configurations/workflow-templates.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,145 @@
---
layout: default
title: Workflow templates
nav_order: 25
---

# Workflow templates

OpenSearch provides several workflow templates for some common machine learning (ML) use cases, such as semantic or conversational search.

You can specify a workflow template when you call the [Create Workflow API]({{site.url}}{{site.baseurl}}/automating-configurations/api/create-workflow/). To provision the workflow, specify `provision=true` as a query parameter. For example, you can configure [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/) by using the `local_neural_sparse_search_bi_encoder` workflow template, as shown in the following request:
kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved

```json
POST /_plugins/_flow_framework/workflow?use_case=local_neural_sparse_search_bi_encoder
```
{% include copy-curl.html %}

The workflow created using this template performs the following configuration steps:

- Deploys the default pretrained sparse encoding model (`amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`)
kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved
- Creates an ingest pipeline that contains a `sparse_encoding` processor, which converts the text in a document field to vector embeddings using the deployed model
kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved
- Creates a sample index for sparse search, specifying the default pipeline as the newly created ingest pipeline
kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I initially said no periods here, but once the fragments become complex/lengthy, as the second two are, then periods are preferred.


## Parameters

Each workflow template has a defined schema and a set of APIs with predefined defaults for each step. For more information about template parameter default values, see [Supported workflow templates](#supported-workflow-templates).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"default values" instead of "defaults"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to use "defaults" here because the next sentence says "default values".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, as long as it's OK to not clarify what type of defaults on first appearance.

kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved

### Overriding default values

To override a template's default values, provide the new values in the request body when sending a create workflow request. For example, the following request changes the Cohere model, the name of the `text_embedding` processor output field, and the name of the sparse index of the `semantic_search_with_cohere_embedding` template:

```json
POST /_plugins/_flow_framework/workflow?use_case=semantic_search_with_cohere_embedding
{
"create_connector.model" : "embed-multilingual-v3.0",
"text_embedding.field_map.output": "book_embedding",
"create_index.name": "sparse-book-index"
}
```
{% include copy-curl.html %}

## Example

In this example, you'll configure the `semantic_search_with_cohere_embedding_query_enricher` workflow template. The workflow created using this template performs the following configuration steps:

- Deploys a Cohere externally hosted model
kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved
- Creates an ingest pipeline using the model
- Creates a sample k-NN index and configures a search pipeline to define the default model ID for that index

### Step 1: Create and provision the workflow

Send the following request to create and provision a workflow using the `semantic_search_with_cohere_embedding_query_enricher` workflow template. The only required request body field for this template is the API key for the Cohere Embed model:

```json
POST /_plugins/_flow_framework/workflow?use_case=semantic_search_with_cohere_embedding_query_enricher&provision=true
{
"create_connector.credential.key" : "<YOUR API KEY>"
}
```
{% include copy-curl.html %}

OpenSearch responds with a workflow ID for the created workflow:

```json
{
"workflow_id" : "8xL8bowB8y25Tqfenm50"
}
```

The workflow in the previous step creates a default k-NN index. The default index name is `my-nlp-index`:

```json
{
"create_index.name": "my-nlp-index"
}
```

For all default parameter values for this workflow template, see [Cohere Embed semantic search defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-embedding-semantic-search-defaults.json).

### Step 2: Ingest documents into the index

To ingest documents into the index created in the previous step, send the following request:

```json
PUT /my-nlp-index/_doc/1
{
"passage_text": "Hello world",
"id": "s1"
}
```
{% include copy-curl.html %}

### Step 3: Perform vector search

To perform a vector search on your index, use a [`neural` query]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural/) clause:

```json
GET /my-nlp-index/_search
{
"_source": {
"excludes": [
"passage_embedding"
]
},
"query": {
"neural": {
"passage_embedding": {
"query_text": "Hi world",
"k": 100
}
}
}
}
```
{% include copy-curl.html %}

## Viewing workflow resources

The workflow you created provisioned all the necessary resources for semantic search. To view the provisioned resources, call the [Get Workflow Status API]({{site.url}}{{site.baseurl}}/automating-configurations/api/get-workflow-status/) and provide the `workflowID` for your workflow:

```json
GET /_plugins/_flow_framework/workflow/8xL8bowB8y25Tqfenm50/_status
```
{% include copy-curl.html %}

## Supported workflow templates

| Template name | Description | Required parameters | Defaults |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Global: "Default values"?

kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved
| `bedrock_titan_embedding_model_deploy` | Creates and deploys an Amazon Bedrock embedding model (by default, `titan-embed-text-v1`).| `create_connector.credential.access_key`, `create_connector.credential.secret_key`, `create_connector.credential.session_token` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/bedrock-titan-embedding-defaults.json)|
| `bedrock_titan_multimodal_model_deploy` | Creates and deploys an Amazon Bedrock multimodal embedding model (by default, `titan-embed-image-v1`). | `create_connector.credential.access_key`, `create_connector.credential.secret_key`, `create_connector.credential.session_token` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/bedrock-titan-multimodal-defaults.json). |
| `cohere_embedding_model_deploy`| Creates and deploys a Cohere embedding model (by default, `embed-english-v3.0`). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-embedding-defaults.json) |
| `cohere_chat_model_deploy` | Creates and deploys a Cohere chat model (by default, Cohere Command). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-chat-defaults.json) |
| `open_ai_embedding_model_deploy` | Creates and deploys an OpenAI embedding model (by default, `text-embedding-ada-002`). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/openai-embedding-defaults.json) |
| `openai_chat_model_deploy` | Creates and deploys an OpenAI chat model (by default, `gpt-3.5-turbo`). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/openai-chat-defaults.json) |
| `local_neural_sparse_search_bi_encoder` | Configures [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/): <br> - Deploys a pretrained sparse encoding model<br> - Creates an ingest pipeline with a sparse encoding processor <br> - Creates a sample index to use for sparse search, specifying the newly created pipeline as default pipeline | None |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/local-sparse-search-biencoder-defaults.json) |
kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved
| `semantic_search` | Configures [semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/): <br> - Creates an ingest pipeline with a `text_embedding` processor and a k-NN index <br> You must provide a model ID of the text embedding model to use. | `create_ingest_pipeline.model_id` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/semantic-search-defaults.json) |
kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved
| `semantic_search_with_query_enricher` | Configures [semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/) similarly to the `semantic_search` template. Adds a [`query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) search processor that sets a default model ID is defaulted for neural queries. You must provide a model ID of the text embedding model to use. | `create_ingest_pipeline.model_id` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/semantic-search-query-enricher-defaults.json) |
kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following "search processor that sets a default model ID is defaulted for neural queries".

| `semantic_search_with_cohere_embedding` | Configures [semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/) and deploys a Cohere embedding model. You must provide the API key for the Cohere model. | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-embedding-semantic-search-defaults.json) |
| `semantic_search_with_cohere_embedding_query_enricher` | Configures [semantic search]({{site.url}}{{site.baseurl}}/search-plugins/semantic-search/) and deploys a Cohere embedding model. Adds a [`query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) search processor that sets a default model ID is defaulted for neural queries. You must provide the API key for the Cohere model. | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/cohere-embedding-semantic-search-with-query-enricher-defaults.json) |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved
| `multimodal_search` | Configures an ingest pipeline with a `text_image_embedding` processor and a k-NN index for [multimodal search]({{site.url}}{{site.baseurl}}/search-plugins/multimodal-search/). You must provide a model ID of the multimodal embedding model to use. | `create_ingest_pipeline.model_id` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/multi-modal-search-defaults.json) |
kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved
| `multimodal_search_with_bedrock_titan` | Deploys an Amazon Bedrock multimodal model and configures an ingest pipeline with a `text_image_embedding` processor and a k-NN index for [multimodal search]({{site.url}}{{site.baseurl}}/search-plugins/multimodal-search/). You must provide your AWS credentials. | `create_connector.credential.access_key`, `create_connector.credential.secret_key`, `create_connector.credential.session_token` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/multimodal-search-bedrock-titan-defaults.json) |
| `hybrid_search` | Configures [hybrid search]({{site.url}}{{site.baseurl}}/search-plugins/hybrid-search/): <br> - Creates an ingest pipeline, a k-NN index and a search pipeline with a `normalization_processor`. You must provide a model ID of the text embedding model to use. | `create_ingest_pipeline.model_id` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/hybrid-search-defaults.json) |
kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved
| `conversational_search_with_llm_deploy` | Deploys an LLM model (by default, Cohere Chat) and configures a search pipeline with a `retrieval_augmented_generation` processor for [conversational search]({{site.url}}{{site.baseurl}}/search-plugins/conversational-search/). | `create_connector.credential.key` |[Defaults](https://github.com/opensearch-project/flow-framework/blob/2.13/src/main/resources/defaults/conversational-search-defaults.json) |
kolchfa-aws marked this conversation as resolved.
Show resolved Hide resolved


Loading