diff --git a/_ml-commons-plugin/api/index.md b/_ml-commons-plugin/api/index.md
index ec4cf12492..65171b163f 100644
--- a/_ml-commons-plugin/api/index.md
+++ b/_ml-commons-plugin/api/index.md
@@ -21,6 +21,5 @@ ML Commons supports the following APIs:
 - [Controller APIs]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/controller-apis/index/)
 - [Execute Algorithm API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/execute-algorithm/)
 - [Tasks APIs]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/index/)
-- [Train and Predict APIs]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/index/)
 - [Profile API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/profile/)
 - [Stats API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/stats/)
diff --git a/_ml-commons-plugin/api/model-apis/batch-predict.md b/_ml-commons-plugin/api/model-apis/batch-predict.md
new file mode 100644
index 0000000000..b32fbb108d
--- /dev/null
+++ b/_ml-commons-plugin/api/model-apis/batch-predict.md
@@ -0,0 +1,167 @@
+---
+layout: default
+title:  Batch predict
+parent: Model APIs
+grand_parent: ML Commons APIs
+nav_order: 65
+---
+
+# Batch predict
+
+This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/2488).
+{: .warning}
+
+ML Commons can perform inference on large datasets in an offline asynchronous mode using a model deployed on external model servers. To use the Batch Predict API, you must provide the `model_id` for an externally hosted model. Amazon SageMaker, Cohere, and OpenAI are currently the only verified external servers that support this API.
+
+For information about user access for this API, see [Model access control considerations]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/#model-access-control-considerations).
+
+For information about externally hosted models, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/).
+
+For instructions on how to set up batch inference and connector blueprints, see the following:
+
+- [Amazon SageMaker batch predict connector blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_sagemaker_connector_blueprint.md)
+
+- [OpenAI batch predict connector blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_openAI_connector_blueprint.md)
+
+## Path and HTTP methods
+
+```json
+POST /_plugins/_ml/models/<model_id>/_batch_predict
+```
+
+## Prerequisites
+
+Before using the Batch Predict API, you need to create a connector to the externally hosted model. For example, to create a connector to an OpenAI `text-embedding-ada-002` model, send the following request:
+
+```json
+POST /_plugins/_ml/connectors/_create
+{
+  "name": "OpenAI Embedding model",
+  "description": "OpenAI embedding model for testing offline batch",
+  "version": "1",
+  "protocol": "http",
+  "parameters": {
+    "model": "text-embedding-ada-002",
+    "input_file_id": "<your input file id in OpenAI>",
+    "endpoint": "/v1/embeddings"
+  },
+  "credential": {
+    "openAI_key": "<your openAI key>"
+  },
+  "actions": [
+    {
+      "action_type": "predict",
+      "method": "POST",
+      "url": "https://api.openai.com/v1/embeddings",
+      "headers": {
+        "Authorization": "Bearer ${credential.openAI_key}"
+      },
+      "request_body": "{ \"input\": ${parameters.input}, \"model\": \"${parameters.model}\" }",
+      "pre_process_function": "connector.pre_process.openai.embedding",
+      "post_process_function": "connector.post_process.openai.embedding"
+    },
+    {
+      "action_type": "batch_predict",
+      "method": "POST",
+      "url": "https://api.openai.com/v1/batches",
+      "headers": {
+        "Authorization": "Bearer ${credential.openAI_key}"
+      },
+      "request_body": "{ \"input_file_id\": \"${parameters.input_file_id}\", \"endpoint\": \"${parameters.endpoint}\", \"completion_window\": \"24h\" }"
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+The response contains a connector ID that you'll use in the next steps:
+
+```json
+{
+  "connector_id": "XU5UiokBpXT9icfOM0vt"
+}
+```
+
+Next, register an externally hosted model and provide the connector ID of the created connector:
+
+```json
+POST /_plugins/_ml/models/_register?deploy=true
+{
+  "name": "OpenAI model for realtime embedding and offline batch inference",
+  "function_name": "remote",
+  "description": "OpenAI text embedding model",
+  "connector_id": "XU5UiokBpXT9icfOM0vt"
+}
+```
+{% include copy-curl.html %}
+
+The response contains the task ID for the register operation:
+
+```json
+{
+  "task_id": "rMormY8B8aiZvtEZIO_j",
+  "status": "CREATED",
+  "model_id": "lyjxwZABNrAVdFa9zrcZ"
+}
+```
+
+To check the status of the operation, provide the task ID to the [Tasks API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/get-task/). Once the registration is complete, the task `state` changes to `COMPLETED`.
+
+#### Example request
+
+Once you have completed the prerequisite steps, you can call the Batch Predict API. The parameters in the batch predict request override those defined in the connector:
+
+```json
+POST /_plugins/_ml/models/lyjxwZABNrAVdFa9zrcZ/_batch_predict
+{
+  "parameters": {
+    "model": "text-embedding-3-large"
+  }
+}
+```
+{% include copy-curl.html %}
+
+#### Example response
+
+```json
+{
+  "inference_results": [
+    {
+      "output": [
+        {
+          "name": "response",
+          "dataAsMap": {
+            "id": "batch_<your file id>",
+            "object": "batch",
+            "endpoint": "/v1/embeddings",
+            "errors": null,
+            "input_file_id": "file-<your input file id>",
+            "completion_window": "24h",
+            "status": "validating",
+            "output_file_id": null,
+            "error_file_id": null,
+            "created_at": 1722037257,
+            "in_progress_at": null,
+            "expires_at": 1722123657,
+            "finalizing_at": null,
+            "completed_at": null,
+            "failed_at": null,
+            "expired_at": null,
+            "cancelling_at": null,
+            "cancelled_at": null,
+            "request_counts": {
+              "total": 0,
+              "completed": 0,
+              "failed": 0
+            },
+            "metadata": null
+          }
+        }
+      ],
+      "status_code": 200
+    }
+  ]
+}
+```
+
+For the definition of each field in the result, see [OpenAI Batch API](https://platform.openai.com/docs/guides/batch). Once the batch inference is complete, retrieve the batch using the `id` returned in the response and download the output by calling the [OpenAI Files API](https://platform.openai.com/docs/api-reference/files) with the file ID specified in the batch's `output_file_id` field.
\ No newline at end of file
diff --git a/_ml-commons-plugin/api/model-apis/index.md b/_ml-commons-plugin/api/model-apis/index.md
index 444da1fe70..9cf992d54b 100644
--- a/_ml-commons-plugin/api/model-apis/index.md
+++ b/_ml-commons-plugin/api/model-apis/index.md
@@ -9,16 +9,40 @@ has_toc: false
 
 # Model APIs
 
-ML Commons supports the following model-level APIs:
-
-- [Register model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/)
-- [Deploy model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/)
-- [Get model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/get-model/)
-- [Search model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/search-model/)
-- [Update model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/update-model/)
-- [Undeploy model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/undeploy-model/)
-- [Delete model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/delete-model/)
-- [Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/) (invokes a model)
+ML Commons supports the following model-level CRUD APIs:
+
+- [Register Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/)
+- [Deploy Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/)
+- [Get Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/get-model/)
+- [Search Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/search-model/)
+- [Update Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/update-model/)
+- [Undeploy Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/undeploy-model/)
+- [Delete Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/delete-model/)
+
+# Predict APIs
+
+Predict APIs are used to invoke machine learning (ML) models. ML Commons supports the following Predict APIs:
+
+- [Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/)
+- [Batch Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/batch-predict/) (experimental)
+
+# Train API
+
+The ML Commons Train API lets you train ML algorithms synchronously and asynchronously:
+
+- [Train]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/train/)
+
+To train a model through the API, three inputs are required:
+
+- Algorithm name: Must be a [FunctionName](https://github.com/opensearch-project/ml-commons/blob/1.3/common/src/main/java/org/opensearch/ml/common/parameter/FunctionName.java). This determines what algorithm the ML model runs. To add a new function, see [How To Add a New Function](https://github.com/opensearch-project/ml-commons/blob/main/docs/how-to-add-new-function.md).
+- Model hyperparameters: Adjust these parameters to improve model accuracy.  
+- Input data: The data used to train the ML model or to make predictions. You can provide input data in two ways: by querying an index or by using a data frame.
+
+# Train and Predict API
+
+The Train and Predict API lets you train and invoke the model using the same dataset:
+
+- [Train and Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/train-and-predict/)
 
 ## Model access control considerations
 
diff --git a/_ml-commons-plugin/api/train-predict/index.md b/_ml-commons-plugin/api/train-predict/index.md
deleted file mode 100644
index 8486b4beb9..0000000000
--- a/_ml-commons-plugin/api/train-predict/index.md
+++ /dev/null
@@ -1,24 +0,0 @@
----
-layout: default
-title: Train and Predict APIs
-parent: ML Commons APIs
-has_children: true
-has_toc: false
-nav_order: 30
----
-
-# Train and Predict APIs
-
-The ML Commons API lets you train machine learning (ML) algorithms synchronously and asynchronously, make predictions with that trained model, and train and predict with the same dataset.
-
-To train tasks through the API, three inputs are required: 
-
-- Algorithm name: Must be one of a [FunctionName](https://github.com/opensearch-project/ml-commons/blob/1.3/common/src/main/java/org/opensearch/ml/common/parameter/FunctionName.java). This determines what algorithm the ML Engine runs. To add a new function, see [How To Add a New Function](https://github.com/opensearch-project/ml-commons/blob/main/docs/how-to-add-new-function.md).
-- Model hyperparameters: Adjust these parameters to improve model accuracy.  
-- Input data: The data that trains the ML model, or applies the ML models to predictions. You can input data in two ways, query against your index or use a data frame.
-
-ML Commons supports the following Train and Predict APIs:
-
-- [Train]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/train/)
-- [Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/)
-- [Train and Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/train-and-predict/)
diff --git a/_ml-commons-plugin/api/train-predict/predict.md b/_ml-commons-plugin/api/train-predict/predict.md
index 299c957122..ea0938da36 100644
--- a/_ml-commons-plugin/api/train-predict/predict.md
+++ b/_ml-commons-plugin/api/train-predict/predict.md
@@ -1,9 +1,9 @@
 ---
 layout: default
 title: Predict
-parent: Train and Predict APIs
+parent: Model APIs
 grand_parent: ML Commons APIs
-nav_order: 20
+nav_order: 60
 ---
 
 # Predict
diff --git a/_ml-commons-plugin/api/train-predict/train-and-predict.md b/_ml-commons-plugin/api/train-predict/train-and-predict.md
index 1df0e5e3be..f8f8f7893a 100644
--- a/_ml-commons-plugin/api/train-predict/train-and-predict.md
+++ b/_ml-commons-plugin/api/train-predict/train-and-predict.md
@@ -1,9 +1,9 @@
 ---
 layout: default
 title: Train and predict 
-parent: Train and Predict APIs
+parent: Model APIs
 grand_parent: ML Commons APIs
-nav_order: 10
+nav_order: 70
 ---
 
 ## Train and predict
diff --git a/_ml-commons-plugin/api/train-predict/train.md b/_ml-commons-plugin/api/train-predict/train.md
index 8de486198d..80cbf8abdb 100644
--- a/_ml-commons-plugin/api/train-predict/train.md
+++ b/_ml-commons-plugin/api/train-predict/train.md
@@ -1,9 +1,9 @@
 ---
 layout: default
 title: Train 
-parent: Train and Predict APIs
+parent: Model APIs
 grand_parent: ML Commons APIs
-nav_order: 10
+nav_order: 50
 ---
 
 # Train