diff --git a/docs/tutorials/aws/images/semantic_search/semantic_search_remote_model_Integration_1.png b/docs/tutorials/aws/images/semantic_search/semantic_search_remote_model_Integration_1.png new file mode 100644 index 0000000000..5873e54f97 Binary files /dev/null and b/docs/tutorials/aws/images/semantic_search/semantic_search_remote_model_Integration_1.png differ diff --git a/docs/tutorials/aws/images/semantic_search/semantic_search_remote_model_Integration_2.png b/docs/tutorials/aws/images/semantic_search/semantic_search_remote_model_Integration_2.png new file mode 100644 index 0000000000..dd7cda449c Binary files /dev/null and b/docs/tutorials/aws/images/semantic_search/semantic_search_remote_model_Integration_2.png differ diff --git a/docs/tutorials/aws/images/semantic_search/semantic_search_remote_model_Integration_3.png b/docs/tutorials/aws/images/semantic_search/semantic_search_remote_model_Integration_3.png new file mode 100644 index 0000000000..ce1c9d356e Binary files /dev/null and b/docs/tutorials/aws/images/semantic_search/semantic_search_remote_model_Integration_3.png differ diff --git a/docs/tutorials/aws/semantic_search_with_CFN_template_for_Sagemaker.md b/docs/tutorials/aws/semantic_search_with_CFN_template_for_Sagemaker.md new file mode 100644 index 0000000000..bf6bd395a2 --- /dev/null +++ b/docs/tutorials/aws/semantic_search_with_CFN_template_for_Sagemaker.md @@ -0,0 +1,240 @@ +# Topic + +This doc describes how to build semantic search in Amazon-managed OpenSearch service with [AWS CloudFormation](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/cfn-template.html) and SageMaker. +If you are not using Amazon OpenSearch, refer to [sagemaker_connector_blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/sagemaker_connector_blueprint.md) and [OpenSearch semantic search](https://opensearch.org/docs/latest/search-plugins/semantic-search/). + +The CloudFormation integration automates the manual process described in the [semantic_search_with_sagemaker_embedding_model tutorial](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/tutorials/aws/semantic_search_with_sagemaker_embedding_model.md). + +The CloudFormation template creates an IAM role and then uses a Lambda function to create an AI connector and model. + +Make sure your SageMaker model inputs follow the format that the [default pre-processing function](https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/#preprocessing-function) requires. The model input must be an array of strings. +``` +["hello world", "how are you"] +``` +Additionally, make sure the model output follows the format that the [default post-processing function](https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/#post-processing-function) requires. The model output must be an array of arrays, where each array corresponds to the embedding of an input string. +``` +[ + [ + -0.048237994, + -0.07612697, + ... + ], + [ + 0.32621247, + 0.02328475, + ... + ] +] +``` + +If your model input/output is not the same as the required default, you can build your own pre/post-processing function using a [Painless script](https://opensearch.org/docs/latest/api-reference/script-apis/exec-script/). 
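+
+If you go that route, you can dry-run a script against sample parameters with the Execute Painless Script API linked above before wiring it into a connector. A minimal sketch, assuming a toy script and sample `text_docs` input:
+```
+POST /_scripts/painless/_execute
+{
+  "script": {
+    "source": "return 'input was: ' + params.text_docs[0];",
+    "params": {
+      "text_docs": ["hello world"]
+    }
+  }
+}
+```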
+
+For example, the input of the Amazon Bedrock Titan embedding model ([blueprint](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/bedrock_connector_titan_embedding_blueprint.md#2-create-connector-for-amazon-bedrock)) is
+```
+{ "inputText": "your_input_text" }
+```
+The Neural Search plugin sends the model input to ml-commons as follows:
+```
+{ "text_docs": [ "your_input_text1", "your_input_text2"] }
+```
+Thus, you need to build a pre-processing function to transform `text_docs` into `inputText`:
+```
+"pre_process_function": """
+    StringBuilder builder = new StringBuilder();
+    builder.append("\"");
+    String first = params.text_docs[0]; // Get the first doc; ml-commons will iterate over all docs
+    builder.append(first);
+    builder.append("\"");
+    def parameters = "{" + "\"inputText\":" + builder + "}"; // This is the Bedrock Titan embedding model input
+    return "{" + "\"parameters\":" + parameters + "}";"""
+```
+
+The default Amazon Bedrock Titan embedding model output has the following format:
+```
+{
+    "embedding": <float_array>
+}
+```
+However, the Neural Search plugin expects the following format:
+```
+{
+    "name": "sentence_embedding",
+    "data_type": "FLOAT32",
+    "shape": [ <embedding_size> ],
+    "data": <float_array>
+}
+```
+Similarly, you need to build a post-processing function to transform the Bedrock Titan embedding model output into the format that the Neural Search plugin requires:
+
+```
+"post_process_function": """
+    def name = "sentence_embedding";
+    def dataType = "FLOAT32";
+    if (params.embedding == null || params.embedding.length == 0) {
+        return params.message;
+    }
+    def shape = [params.embedding.length];
+    def json = "{" +
+               "\"name\":\"" + name + "\"," +
+               "\"data_type\":\"" + dataType + "\"," +
+               "\"shape\":" + shape + "," +
+               "\"data\":" + params.embedding +
+               "}";
+    return json;
+"""
+```
+
+Note: Replace the placeholders that start with the prefix `your_` with your own values.
+
+# Steps
+
+## 0. Create an OpenSearch cluster
+
+Go to the AWS OpenSearch console UI and create an OpenSearch domain.
+
+Note the domain ARN; you'll use it in the next step.
+
+## 1. Map backend role
+
+The AWS OpenSearch Integration CloudFormation template uses a Lambda function to create an AI connector with an IAM role. You need to
+map the IAM role to `ml_full_access` to grant it the required permissions.
+Refer to [semantic_search_with_sagemaker_embedding_model#map-backend-role](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/tutorials/aws/semantic_search_with_sagemaker_embedding_model.md#22-map-backend-role).
+
+You can find the IAM role in the `Lambda Invoke OpenSearch ML Commons Role Name` field in the CloudFormation template (see the screenshot in step 2.1).
+
+The default IAM role is `LambdaInvokeOpenSearchMLCommonsRole`, so you need to map the `arn:aws:iam::your_aws_account_id:role/LambdaInvokeOpenSearchMLCommonsRole` backend role to `ml_full_access`.
+
+For a quick start, you can also map all roles to `ml_full_access` using the wildcard `arn:aws:iam::your_aws_account_id:role/*`.
+
+Because `all_access` has more permissions than `ml_full_access`, it's also fine to map the backend role to `all_access`.
+
+
+## 2. Run CloudFormation template
+
+You can find the CloudFormation template integration in the AWS OpenSearch console.
+
+![Alt text](images/semantic_search/semantic_search_remote_model_Integration_1.png)
+
+For all options below, you can find the OpenSearch AI connector and model IDs in the CloudFormation stack's `Outputs` tab once the stack completes.
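+
+If you prefer the CLI, you can read the same outputs once the stack finishes. A minimal sketch, assuming your stack is named `your_stack_name`:
+```
+aws cloudformation describe-stacks \
+  --stack-name your_stack_name \
+  --query "Stacks[0].Outputs" \
+  --output table
+```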
+
+If anything fails, you can find the logs in the CloudWatch console by searching `Log Groups` for the CloudFormation stack name.
+
+### 2.1 Option 1: Deploy pretrained model to SageMaker
+
+You can deploy a pretrained Huggingface sentence-transformer embedding model from the [DJL](https://djl.ai/) model repo.
+
+Fill out the following fields as described. Keep the default values for all fields not mentioned below:
+
+1. You must fill in your `Amazon OpenSearch Endpoint`.
+2. You can use the default `Sagemaker Configuration` for a quick start and change these values if necessary. For all supported SageMaker instance types, see [SageMaker pricing](https://aws.amazon.com/sagemaker/pricing/).
+3. You must leave `SageMaker Endpoint Url` empty. If you enter a URL in this field, the template will not deploy the model to SageMaker or create a new inference endpoint.
+4. You can leave the `Custom Image` field empty. The default is `djl-inference:0.22.1-cpu-full`. For all available images, see [this document](https://docs.aws.amazon.com/deep-learning-containers/latest/devguide/deep-learning-containers-images.html).
+5. You must leave `Custom Model Data Url` empty.
+6. The default value of `Custom Model Environment` is `djl://ai.djl.huggingface.pytorch/sentence-transformers/all-MiniLM-L6-v2`. For all supported models, see the [Appendix](#appendix).
+
+![Alt text](images/semantic_search/semantic_search_remote_model_Integration_2.png)
+
+
+### 2.2 Option 2: Create model with your existing SageMaker inference endpoint
+
+If you already have a SageMaker inference endpoint, you can create a remote model directly using that endpoint (see the verification sketch after this list).
+
+Fill out the following fields as described. Keep the default values for all fields not mentioned below:
+1. You must fill in your `Amazon OpenSearch Endpoint`.
+2. You must fill in your `SageMaker Endpoint Url`.
+3. You must leave `Custom Image`, `Custom Model Data Url`, and `Custom Model Environment` empty.
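+
+Before pointing the template at an existing endpoint, it's worth confirming that the endpoint accepts the default input/output format described at the top of this doc. A minimal sketch with the AWS CLI (the endpoint name is a placeholder; AWS CLI v2 needs the `--cli-binary-format` flag to send a raw JSON body):
+```
+aws sagemaker-runtime invoke-endpoint \
+  --endpoint-name your_sagemaker_endpoint_name \
+  --content-type application/json \
+  --cli-binary-format raw-in-base64-out \
+  --body '["hello world", "how are you"]' \
+  output.json
+cat output.json  # expect an array of arrays of floats
+```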
+ +![Alt text](images/semantic_search/semantic_search_remote_model_Integration_3.png) + + +# Appendix +## Huggingface sentence-transformer embedding models available in DJL model repo +``` +djl://ai.djl.huggingface.pytorch/sentence-transformers/LaBSE/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/all-MiniLM-L12-v1/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/all-MiniLM-L12-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/all-MiniLM-L6-v1/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/all-MiniLM-L6-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/all-distilroberta-v1/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/all-mpnet-base-v1/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/all-mpnet-base-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/all-roberta-large-v1/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/allenai-specter/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/bert-base-nli-cls-token/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/bert-base-nli-max-tokens/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/bert-base-nli-mean-tokens/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/bert-base-nli-stsb-mean-tokens/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/bert-base-wikipedia-sections-mean-tokens/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/bert-large-nli-cls-token/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/bert-large-nli-max-tokens/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/bert-large-nli-mean-tokens/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/bert-large-nli-stsb-mean-tokens/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/clip-ViT-B-32-multilingual-v1/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/distilbert-base-nli-mean-tokens/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/distilbert-base-nli-stsb-mean-tokens/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/distilbert-base-nli-stsb-quora-ranking/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/distilbert-multilingual-nli-stsb-quora-ranking/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/distiluse-base-multilingual-cased-v1/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/facebook-dpr-ctx_encoder-multiset-base/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/facebook-dpr-ctx_encoder-single-nq-base/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/facebook-dpr-question_encoder-multiset-base/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/facebook-dpr-question_encoder-single-nq-base/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-MiniLM-L-12-v3/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-MiniLM-L-6-v3/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-MiniLM-L12-cos-v5/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-MiniLM-L6-cos-v5/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-bert-base-dot-v5/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-bert-co-condensor/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-distilbert-base-dot-prod-v3/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-distilbert-base-tas-b/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-distilbert-base-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-distilbert-base-v3/ 
+djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-distilbert-base-v4/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-distilbert-cos-v5/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-distilbert-dot-v5/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-distilbert-multilingual-en-de-v2-tmp-lng-aligned/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-distilbert-multilingual-en-de-v2-tmp-trained-scratch/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-distilroberta-base-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-roberta-base-ance-firstp/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-roberta-base-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/msmarco-roberta-base-v3/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/multi-qa-MiniLM-L6-cos-v1/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/multi-qa-MiniLM-L6-dot-v1/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/multi-qa-distilbert-cos-v1/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/multi-qa-distilbert-dot-v1/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/nli-bert-base/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/nli-bert-large-max-pooling/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/nli-distilbert-base/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/nli-distilroberta-base-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/nli-roberta-base-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/nli-roberta-large/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/nq-distilbert-base-v1/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/paraphrase-MiniLM-L12-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/paraphrase-MiniLM-L3-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/paraphrase-MiniLM-L6-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/paraphrase-TinyBERT-L6-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/paraphrase-albert-base-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/paraphrase-albert-small-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/paraphrase-distilroberta-base-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/paraphrase-multilingual-mpnet-base-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/paraphrase-xlm-r-multilingual-v1/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/quora-distilbert-base/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/quora-distilbert-multilingual/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/roberta-base-nli-mean-tokens/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/roberta-base-nli-stsb-mean-tokens/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/roberta-large-nli-mean-tokens/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/roberta-large-nli-stsb-mean-tokens/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/stsb-bert-base/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/stsb-bert-large/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/stsb-distilbert-base/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/stsb-distilroberta-base-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/stsb-roberta-base-v2/ +djl://ai.djl.huggingface.pytorch/sentence-transformers/stsb-roberta-base/ 
+djl://ai.djl.huggingface.pytorch/sentence-transformers/stsb-roberta-large/
+djl://ai.djl.huggingface.pytorch/sentence-transformers/stsb-xlm-r-multilingual/
+djl://ai.djl.huggingface.pytorch/sentence-transformers/use-cmlm-multilingual/
+djl://ai.djl.huggingface.pytorch/sentence-transformers/xlm-r-100langs-bert-base-nli-stsb-mean-tokens/
+djl://ai.djl.huggingface.pytorch/sentence-transformers/xlm-r-bert-base-nli-stsb-mean-tokens/
+djl://ai.djl.huggingface.pytorch/sentence-transformers/xlm-r-distilroberta-base-paraphrase-v1/
+```
\ No newline at end of file
diff --git a/docs/tutorials/aws/semantic_search_with_bedrock_cohere_embedding_model.md b/docs/tutorials/aws/semantic_search_with_bedrock_cohere_embedding_model.md
new file mode 100644
index 0000000000..03353115ae
--- /dev/null
+++ b/docs/tutorials/aws/semantic_search_with_bedrock_cohere_embedding_model.md
@@ -0,0 +1,356 @@
+# Topic
+
+> The easiest way to set up an embedding model on your Amazon OpenSearch cluster is to use [AWS CloudFormation](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/cfn-template.html).
+
+> This tutorial explains the detailed steps if you want to configure everything manually.
+
+> Bedrock has [quota limits](https://docs.aws.amazon.com/bedrock/latest/userguide/quotas.html). You can purchase [Provisioned Throughput](https://docs.aws.amazon.com/bedrock/latest/userguide/prov-throughput.html) to increase them.
+
+This doc introduces how to build semantic search in Amazon managed OpenSearch with the [Bedrock Cohere embedding model](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-embed.html).
+If you are not using Amazon OpenSearch, you can refer to [bedrock_connector_cohere_cohere.embed-english-v3_blueprint](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/bedrock_connector_cohere_cohere.embed-english-v3_blueprint.md).
+
+Note: Replace the placeholders prefixed with `your_` with your own values.
+
+# Steps
+
+## 0. Create OpenSearch cluster
+
+Go to the AWS OpenSearch console UI and create an OpenSearch domain.
+
+Copy the domain ARN; you'll use it in later steps.
+
+## 1. Create IAM role to invoke Bedrock model
+To invoke the Bedrock model, you need to create an IAM role with the proper permissions.
+This IAM role will be configured in the connector, and the connector will use it to invoke the Bedrock model.
+
+Go to the IAM console and create an IAM role `my_invoke_bedrock_cohere_role` with:
+
+- Custom trust policy:
+```
+{
+    "Version": "2012-10-17",
+    "Statement": [
+        {
+            "Effect": "Allow",
+            "Principal": {
+                "Service": "es.amazonaws.com"
+            },
+            "Action": "sts:AssumeRole"
+        }
+    ]
+}
+```
+- Permission
+```
+{
+    "Version": "2012-10-17",
+    "Statement": [
+        {
+            "Action": [
+                "bedrock:InvokeModel"
+            ],
+            "Effect": "Allow",
+            "Resource": "arn:aws:bedrock:*::foundation-model/cohere.embed-english-v3"
+        }
+    ]
+}
+```
+
+If you need multilingual support, use the multilingual model `cohere.embed-multilingual-v3`.
+
+Copy the role ARN; you'll use it in later steps.
+
+## 2. Configure IAM role in OpenSearch
+
+### 2.1 Create IAM role for signing create connector request
+
+Generate a new IAM role specifically for signing your create connector request.
+
+
+Create IAM role `my_create_bedrock_cohere_connector_role` with
+- Custom trust policy.
Note: `your_iam_user_arn` is the IAM user that will run `aws sts assume-role` in step 3.1.
+```
+{
+    "Version": "2012-10-17",
+    "Statement": [
+        {
+            "Effect": "Allow",
+            "Principal": {
+                "AWS": "your_iam_user_arn"
+            },
+            "Action": "sts:AssumeRole"
+        }
+    ]
+}
+```
+- Permission
+```
+{
+    "Version": "2012-10-17",
+    "Statement": [
+        {
+            "Effect": "Allow",
+            "Action": "iam:PassRole",
+            "Resource": "your_iam_role_arn_created_in_step1"
+        },
+        {
+            "Effect": "Allow",
+            "Action": "es:ESHttpPost",
+            "Resource": "your_opensearch_domain_arn_created_in_step0"
+        }
+    ]
+}
+```
+
+Copy this role ARN; you'll use it in later steps.
+
+### 2.2 Map backend role
+
+1. Log in to OpenSearch Dashboards and navigate to the "Security" page in the left-hand menu.
+2. On the Security page, click "Roles", then find the "ml_full_access" role and click it.
+3. On the "ml_full_access" role detail page, click "Mapped users", then click "Manage mapping". Paste the IAM role ARN created in step 2.1 into the backend roles field.
+Click "Map". The IAM role is now configured in your OpenSearch cluster.
+
+![Alt text](images/semantic_search/mapping_iam_role_arn.png)
+
+## 3. Create Connector
+
+For more details, see [connectors](https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/connectors/).
+
+
+### 3.1 Get temporary credentials for the role created in step 2.1:
+```
+aws sts assume-role --role-arn your_iam_role_arn_created_in_step2.1 --role-session-name your_session_name
+```
+
+Configure the temporary credentials in `~/.aws/credentials` like this:
+
+```
+[default]
+AWS_ACCESS_KEY_ID=your_access_key_of_role_created_in_step2.1
+AWS_SECRET_ACCESS_KEY=your_secret_key_of_role_created_in_step2.1
+AWS_SESSION_TOKEN=your_session_token_of_role_created_in_step2.1
+```
+
+### 3.2 Create connector
+
+Run the following Python code with the temporary credentials configured in `~/.aws/credentials`.
+
+See the [Cohere blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/cohere_connector_embedding_blueprint.md) for more details.
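+
+The script below signs the request with AWS SigV4 via `requests_aws4auth`. If the dependencies aren't installed yet (package names inferred from the imports below), install them first:
+```
+pip install boto3 requests requests-aws4auth
+```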
+
+```
+import boto3
+import requests
+from requests_aws4auth import AWS4Auth
+
+host = 'your_amazon_opensearch_domain_endpoint_created_in_step0'  # e.g. 'https://your-domain-name.your-region.es.amazonaws.com'
+region = 'your_amazon_opensearch_domain_region'
+service = 'es'
+
+# Sign the request with the temporary credentials configured in ~/.aws/credentials
+credentials = boto3.Session().get_credentials()
+awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)
+
+path = '/_plugins/_ml/connectors/_create'
+url = host + path
+
+payload = {
+  "name": "Amazon Bedrock Cohere Connector: embedding v3",
+  "description": "The connector to Bedrock Cohere embedding model",
+  "version": 1,
+  "protocol": "aws_sigv4",
+  "parameters": {
+    "region": "your_bedrock_model_region",
+    "service_name": "bedrock",
+    "input_type": "search_document",
+    "truncate": "END"
+  },
+  "credential": {
+    "roleArn": "your_iam_role_arn_created_in_step1"
+  },
+  "actions": [
+    {
+      "action_type": "predict",
+      "method": "POST",
+      "url": "https://bedrock-runtime.your_bedrock_model_region.amazonaws.com/model/cohere.embed-english-v3/invoke",
+      "headers": {
+        "content-type": "application/json",
+        "x-amz-content-sha256": "required"
+      },
+      "request_body": "{ \"texts\": ${parameters.texts}, \"truncate\": \"${parameters.truncate}\", \"input_type\": \"${parameters.input_type}\" }",
+      "pre_process_function": "connector.pre_process.cohere.embedding",
+      "post_process_function": "connector.post_process.cohere.embedding"
+    }
+  ]
+}
+
+headers = {"Content-Type": "application/json"}
+
+r = requests.post(url, auth=awsauth, json=payload, headers=headers)
+print(r.text)
+```
+The script outputs a connector ID.
+
+Sample output:
+```
+{"connector_id":"1p0u8o0BWbTmLN9F2Y7m"}
+```
+
+Copy the connector ID; you'll use it in later steps.
+
+## 4. Create model and test
+
+Log in to OpenSearch Dashboards, open DevTools, and run the following requests.
+
+1. Create model group
+```
+POST /_plugins/_ml/model_groups/_register
+{
+    "name": "Bedrock_embedding_model",
+    "description": "Test model group for bedrock embedding model"
+}
+```
+Sample output:
+```
+{
+  "model_group_id": "050q8o0BWbTmLN9Foo4f",
+  "status": "CREATED"
+}
+```
+
+2. Register model
+
+```
+POST /_plugins/_ml/models/_register
+{
+  "name": "Bedrock Cohere embedding model v3",
+  "function_name": "remote",
+  "description": "test embedding model",
+  "model_group_id": "050q8o0BWbTmLN9Foo4f",
+  "connector_id": "1p0u8o0BWbTmLN9F2Y7m"
+}
+```
+Sample output:
+```
+{
+  "task_id": "TRUr8o0BTaDH9c7tSRfx",
+  "status": "CREATED",
+  "model_id": "VRUu8o0BTaDH9c7t9xet"
+}
+```
+
+3. Deploy model
+```
+POST /_plugins/_ml/models/VRUu8o0BTaDH9c7t9xet/_deploy
+```
+Sample output:
+```
+{
+  "task_id": "1J0r8o0BWbTmLN9FjY6I",
+  "task_type": "DEPLOY_MODEL",
+  "status": "COMPLETED"
+}
+```
+4. Predict
+```
+POST /_plugins/_ml/models/VRUu8o0BTaDH9c7t9xet/_predict
+{
+  "parameters": {
+    "texts": ["hello world"]
+  }
+}
+```
+Sample response:
+```
+{
+  "inference_results": [
+    {
+      "output": [
+        {
+          "name": "sentence_embedding",
+          "data_type": "FLOAT32",
+          "shape": [
+            1024
+          ],
+          "data": [
+            -0.02973938,
+            -0.023651123,
+            -0.06021118,
+            ...]
+        }
+      ],
+      "status_code": 200
+    }
+  ]
+}
+```
+
+## 5. Semantic search
+
+### 5.1 Create ingest pipeline
+For more details, see [ingest pipelines](https://opensearch.org/docs/latest/ingest-pipelines/).
+
+```
+PUT /_ingest/pipeline/my_bedrock_cohere_embedding_pipeline
+{
+    "description": "text embedding pipeline",
+    "processors": [
+        {
+            "text_embedding": {
+                "model_id": "your_bedrock_embedding_model_id_created_in_step4",
+                "field_map": {
+                    "text": "text_knn"
+                }
+            }
+        }
+    ]
+}
+```
+### 5.2 Create k-NN index
+For more details, see [k-NN index](https://opensearch.org/docs/latest/search-plugins/knn/knn-index/).
+
+You should customize your k-NN index for better performance.
+```
+PUT my_index
+{
+  "settings": {
+    "index": {
+      "knn.space_type": "cosinesimil",
+      "default_pipeline": "my_bedrock_cohere_embedding_pipeline",
+      "knn": "true"
+    }
+  },
+  "mappings": {
+    "properties": {
+      "text_knn": {
+        "type": "knn_vector",
+        "dimension": 1024
+      }
+    }
+  }
+}
+```
### 5.3 Ingest test data
+```
+POST /my_index/_doc/1000001
+{
+    "text": "hello world."
+}
+```
+### 5.4 Search
+For more details, see [neural search](https://opensearch.org/docs/latest/search-plugins/neural-search/).
+```
+POST /my_index/_search
+{
+  "query": {
+    "neural": {
+      "text_knn": {
+        "query_text": "hello",
+        "model_id": "your_bedrock_embedding_model_id_created_in_step4",
+        "k": 100
+      }
+    }
+  },
+  "size": 1,
+  "_source": ["text"]
+}
+```
\ No newline at end of file
diff --git a/docs/tutorials/aws/semantic_search_with_bedrock_titan_embedding_model.md b/docs/tutorials/aws/semantic_search_with_bedrock_titan_embedding_model.md
index 5e2085141b..40d7b27eb5 100644
--- a/docs/tutorials/aws/semantic_search_with_bedrock_titan_embedding_model.md
+++ b/docs/tutorials/aws/semantic_search_with_bedrock_titan_embedding_model.md
@@ -3,6 +3,8 @@
 > The easiest way for setting up embedding model on your Amazon OpenSearch cluster is using [AWS CloudFormation](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/cfn-template.html)
 
 > This tutorial explains detail steps if you want to configure everything manually.
+
+> Bedrock has [quota limits](https://docs.aws.amazon.com/bedrock/latest/userguide/quotas.html). You can purchase [Provisioned Throughput](https://docs.aws.amazon.com/bedrock/latest/userguide/prov-throughput.html) to increase them.
 
 This doc introduces how to build semantic search in Amazon managed OpenSearch with [Bedrock Titan embedding model](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html).
 If you are not using Amazon OpenSearch, you can refer to [bedrock_connector_titan_embedding_blueprint](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/bedrock_connector_titan_embedding_blueprint.md).