
Ml commons batch inference #7899

Merged
1 change: 0 additions & 1 deletion _ml-commons-plugin/api/index.md
@@ -21,6 +21,5 @@ ML Commons supports the following APIs:
- [Controller APIs]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/controller-apis/index/)
- [Execute Algorithm API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/execute-algorithm/)
- [Tasks APIs]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/index/)
- [Train and Predict APIs]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/index/)
- [Profile API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/profile/)
- [Stats API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/stats/)
167 changes: 167 additions & 0 deletions _ml-commons-plugin/api/model-apis/batch-predict.md
@@ -0,0 +1,167 @@
---
layout: default
title: Batch predict
parent: Model APIs
grand_parent: ML Commons APIs
nav_order: 65
---

# Batch predict

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated [GitHub issue](https://github.com/opensearch-project/ml-commons/issues/2488).
{: .warning}

ML Commons can perform inference on large datasets in an offline asynchronous mode using a model deployed on external model servers. To use the Batch Predict API, you must provide the `model_id` for an externally hosted model. Amazon SageMaker, Cohere, and OpenAI are currently the only verified external servers that support this API.

For information about user access for this API, see [Model access control considerations]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/index/#model-access-control-considerations).

For information about externally hosted models, see [Connecting to externally hosted models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/remote-models/index/).

Collaborator: Does the information below need to be in the form of a bulleted list, or could it just be a sentence?

Collaborator: Not necessarily but if we add more connectors, I think it's better if it's a list.

For instructions on how to set up batch inference and for connector blueprints, see the following:

- [Amazon SageMaker batch predict connector blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_sagemaker_connector_blueprint.md)

- [OpenAI batch predict connector blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/batch_inference_openAI_connector_blueprint.md)

## Path and HTTP methods

```json
POST /_plugins/_ml/models/<model_id>/_batch_predict
```

## Prerequisites

Before using the Batch Predict API, you need to create a connector to the externally hosted model. For example, to create a connector to an OpenAI `text-embedding-ada-002` model, send the following request:

```json
POST /_plugins/_ml/connectors/_create
{
  "name": "OpenAI Embedding model",
  "description": "OpenAI embedding model for testing offline batch",
  "version": "1",
  "protocol": "http",
  "parameters": {
    "model": "text-embedding-ada-002",
    "input_file_id": "<your input file id in OpenAI>",
    "endpoint": "/v1/embeddings"
  },
  "credential": {
    "openAI_key": "<your openAI key>"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://api.openai.com/v1/embeddings",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"input\": ${parameters.input}, \"model\": \"${parameters.model}\" }",
      "pre_process_function": "connector.pre_process.openai.embedding",
      "post_process_function": "connector.post_process.openai.embedding"
    },
    {
      "action_type": "batch_predict",
      "method": "POST",
      "url": "https://api.openai.com/v1/batches",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"input_file_id\": \"${parameters.input_file_id}\", \"endpoint\": \"${parameters.endpoint}\", \"completion_window\": \"24h\" }"
    }
  ]
}
```
{% include copy-curl.html %}

The response contains a connector ID that you'll use in the next steps:

```json
{
"connector_id": "XU5UiokBpXT9icfOM0vt"
}
```

Next, register an externally hosted model and provide the connector ID of the created connector:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
"name": "OpenAI model for realtime embedding and offline batch inference",
"function_name": "remote",
"description": "OpenAI text embedding model",
"connector_id": "XU5UiokBpXT9icfOM0vt"
}
```
{% include copy-curl.html %}

The response contains the task ID for the register operation:

```json
{
"task_id": "rMormY8B8aiZvtEZIO_j",
"status": "CREATED",
"model_id": "lyjxwZABNrAVdFa9zrcZ"
}
```

To check the status of the operation, provide the task ID to the [Tasks API]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/tasks-apis/get-task/). Once the registration is complete, the task `state` changes to `COMPLETED`.
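
For example, to check the registration task created above, call the Tasks API with the returned task ID:

```json
GET /_plugins/_ml/tasks/rMormY8B8aiZvtEZIO_j
```
{% include copy-curl.html %}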

#### Example request

Once you have completed the prerequisite steps, you can call the Batch Predict API. The parameters in the batch predict request override those defined in the connector:

```json
POST /_plugins/_ml/models/lyjxwZABNrAVdFa9zrcZ/_batch_predict
{
  "parameters": {
    "model": "text-embedding-3-large"
  }
}
```
{% include copy-curl.html %}

#### Example response

```json
{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "id": "batch_<your file id>",
            "object": "batch",
            "endpoint": "/v1/embeddings",
            "errors": null,
            "input_file_id": "file-<your input file id>",
            "completion_window": "24h",
            "status": "validating",
            "output_file_id": null,
            "error_file_id": null,
            "created_at": 1722037257,
            "in_progress_at": null,
            "expires_at": 1722123657,
            "finalizing_at": null,
            "completed_at": null,
            "failed_at": null,
            "expired_at": null,
            "cancelling_at": null,
            "cancelled_at": null,
            "request_counts": {
              "total": 0,
              "completed": 0,
              "failed": 0
            },
            "metadata": null
          }
        }
      ],
      "status_code": 200
    }
  ]
}
```

For the definition of each field in the result, see [OpenAI Batch API](https://platform.openai.com/docs/guides/batch). Once the batch inference is complete, you can download the output by calling the [OpenAI Files API](https://platform.openai.com/docs/api-reference/files) and providing the file name specified in the `id` field of the response.
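
Assuming the batch has finished and OpenAI has populated the `output_file_id` field for the batch, one way to retrieve the output is a direct call to the OpenAI Files API content endpoint (made outside of OpenSearch, with the same `Authorization: Bearer <your openAI key>` header used in the connector). This is a sketch rather than an ML Commons call:

```json
GET https://api.openai.com/v1/files/<output_file_id>/content
```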
44 changes: 34 additions & 10 deletions _ml-commons-plugin/api/model-apis/index.md
@@ -9,16 +9,40 @@

# Model APIs

ML Commons supports the following model-level APIs:

- [Register model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/)
- [Deploy model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/)
- [Get model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/get-model/)
- [Search model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/search-model/)
- [Update model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/update-model/)
- [Undeploy model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/undeploy-model/)
- [Delete model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/delete-model/)
- [Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/) (invokes a model)
ML Commons supports the following model-level CRUD APIs:

- [Register Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/register-model/)
- [Deploy Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/deploy-model/)
- [Get Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/get-model/)
- [Search Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/search-model/)
- [Update Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/update-model/)
- [Undeploy Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/undeploy-model/)
- [Delete Model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/delete-model/)

# Predict APIs

Predict APIs are used to invoke machine learning (ML) models. ML Commons supports the following Predict APIs:
Collaborator: Is the second instance of "Predict" intentionally capitalized?

Collaborator: Yes, capitalized since it's the API name.


- [Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/predict/)
- [Batch Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/model-apis/batch-predict/) (experimental)
Collaborator: "Batch Predict" is not capitalized in the title or H1 of the preceding file.

Collaborator: Right. Normally, we imply the operation in the H1 and left nav title but this is the actual API name so I capitalized. Alternatively, I can change everything to sentence case.

Collaborator: Fine as is.

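For reference, invoking an externally hosted model with the Predict API has the following general form. This sketch reuses the illustrative model ID registered earlier in this PR and assumes the OpenAI embedding connector shown above, which expects an `input` parameter; the exact request body depends on the model's connector:

```json
POST /_plugins/_ml/models/lyjxwZABNrAVdFa9zrcZ/_predict
{
  "parameters": {
    "input": ["What is the meaning of life?"]
  }
}
```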

# Train API

The ML Commons Train API lets you train ML algorithms synchronously and asynchronously:

- [Train]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/train/)

To run a training task through the API, three inputs are required, as shown in the example following this list:

- Algorithm name: Must be a [FunctionName](https://github.com/opensearch-project/ml-commons/blob/1.3/common/src/main/java/org/opensearch/ml/common/parameter/FunctionName.java). This determines what algorithm the ML model runs. To add a new function, see [How To Add a New Function](https://github.com/opensearch-project/ml-commons/blob/main/docs/how-to-add-new-function.md).
- Model hyperparameters: Adjust these parameters to improve model accuracy.
- Input data: The data used to train the ML model or to run predictions against. You can provide input data in two ways: by querying against your index or by using a data frame.
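
For example, a synchronous k-means training request that reads training data from an index might look like the following sketch; the index name, source fields, and hyperparameter values are illustrative:

```json
POST /_plugins/_ml/_train/kmeans
{
  "parameters": {
    "centroids": 3,
    "iterations": 10,
    "distance_type": "COSINE"
  },
  "input_query": {
    "_source": ["petal_length_in_cm", "petal_width_in_cm"],
    "size": 10000
  },
  "input_index": ["iris_data"]
}
```

Appending `?async=true` to the request path runs the training as an asynchronous task whose status you can check with the Tasks API.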

# Train and Predict API

Check failure on line 41 in _ml-commons-plugin/api/model-apis/index.md (GitHub Actions / style-job): [vale] reported by reviewdog 🐶 [OpenSearch.HeadingCapitalization] 'Train and Predict API' is a heading and should be in sentence case.

The Train and Predict API lets you train and invoke the model using the same dataset:

- [Train and Predict]({{site.url}}{{site.baseurl}}/ml-commons-plugin/api/train-predict/train-and-predict/)
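
For example, a k-means train-and-predict request, reusing the illustrative index and hyperparameters from the training sketch above, might look like the following:

```json
POST /_plugins/_ml/_train_predict/kmeans
{
  "parameters": {
    "centroids": 2,
    "iterations": 10,
    "distance_type": "COSINE"
  },
  "input_query": {
    "_source": ["petal_length_in_cm", "petal_width_in_cm"],
    "size": 10000
  },
  "input_index": ["iris_data"]
}
```

The response includes the predicted cluster assignment for each input row.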

## Model access control considerations

24 changes: 0 additions & 24 deletions _ml-commons-plugin/api/train-predict/index.md

This file was deleted.

4 changes: 2 additions & 2 deletions _ml-commons-plugin/api/train-predict/predict.md
@@ -1,9 +1,9 @@
---
layout: default
title: Predict
parent: Train and Predict APIs
parent: Model APIs
grand_parent: ML Commons APIs
nav_order: 20
nav_order: 60
---

# Predict
4 changes: 2 additions & 2 deletions _ml-commons-plugin/api/train-predict/train-and-predict.md
@@ -1,9 +1,9 @@
---
layout: default
title: Train and predict
parent: Train and Predict APIs
parent: Model APIs
grand_parent: ML Commons APIs
nav_order: 10
nav_order: 70
---

## Train and predict
4 changes: 2 additions & 2 deletions _ml-commons-plugin/api/train-predict/train.md
@@ -1,9 +1,9 @@
---
layout: default
title: Train
parent: Train and Predict APIs
parent: Model APIs
grand_parent: ML Commons APIs
nav_order: 10
nav_order: 50
---

# Train