DJL Serving Adapters Management API

Note that this API is experimental and is subject to change.

DJL Serving provides a set of APIs that allow users to manage adapters at runtime:

Register an adapter
Update an adapter
Describe an adapter's status
Unregister an adapter
List registered adapters

This is an extension of the Management API and can be accessed in the same way.

Adapter Management APIs

Register an adapter

POST /models/{model_name}/adapters

name: The adapter name.
src: The adapter source. It currently must be a file path; an ID or URL may be supported in the future, depending on the model handler.
preload (optional): Whether to preload the adapter during initialization. Defaults to true.
pin (optional): Whether to pin the adapter. Defaults to false. If enabled, the adapter is preloaded and pinned during initialization. This keeps latency-sensitive adapters in GPU memory so that they are not evicted.
All additional arguments are treated as model-specific options and are passed to the model during adapter registration.

curl -X POST "http://localhost:8080/models/adaptecho/adapters?name=a1&src=/opt/ml/model/adapters/a1"

{
  "status": "Adapter a1 registered"
}
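
Optional parameters such as preload and pin, and any model-specific options, travel as extra query parameters on the same POST request. The following is a minimal sketch of assembling such a URL. The names register_adapter_url and BASE_URL are hypothetical, introduced only for illustration, and are not part of DJL Serving; the sketch also assumes that no value needs URL encoding.

```shell
# Assumed server address, matching the examples in this document.
BASE_URL="http://localhost:8080"

# Hypothetical helper: prints the registration URL for an adapter.
# Usage: register_adapter_url MODEL NAME SRC [key=value ...]
register_adapter_url() (
  model="$1"; name="$2"; src="$3"
  shift 3
  url="${BASE_URL}/models/${model}/adapters?name=${name}&src=${src}"
  # Remaining arguments are key=value pairs: preload, pin,
  # or model-specific options. They are appended unencoded.
  for opt in "$@"; do
    url="${url}&${opt}"
  done
  printf '%s\n' "$url"
)

# Print the URL that registers adapter a1 as pinned.
register_adapter_url adaptecho a1 /opt/ml/model/adapters/a1 pin=true

# To send the request:
# curl -X POST "$(register_adapter_url adaptecho a1 /opt/ml/model/adapters/a1 pin=true)"
```

Keeping the URL construction separate from the curl call makes it easy to inspect the request before sending it.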

Update an adapter

POST /models/{model_name}/adapters/{adapter_name}/update

preload (optional): Whether to preload the adapter during initialization.
pin (optional): Whether to pin the adapter. A pinned LoRA adapter stays in GPU memory and is not evicted from the LRUCache, which helps latency-sensitive adapters.
All additional arguments are treated as model-specific options and are passed to the model when the adapter is updated.

curl -X POST "http://localhost:8080/models/adaptecho/adapters/a1/update?pin=true"

{
  "status": "Adapter a1 updated"
}

Describe adapter

GET /models/{model_name}/adapters/{adapter_name}

Use the Describe Adapter API to get the status of an adapter:

curl http://localhost:8080/models/adaptecho/adapters/a1

[
  {
    "name": "a1",
    "src": "/opt/ml/model/adapters/a1",
    "pin": false
  }
]

Unregister an adapter

DELETE /models/{model_name}/adapters/{adapter_name}

Use the Unregister Adapter API to free up system resources:

curl -X DELETE http://localhost:8080/models/adaptecho/adapters/a1

{
  "status": "Adapter a1 unregistered"
}

List adapters

GET /models/{model_name}/adapters

limit (optional): The maximum number of items to return, passed as a query parameter. Defaults to 100.
next_page_token (optional): The token used to query for the next page, passed as a query parameter. This value is returned by a previous API call.

Use the Adapters API to query the currently registered adapters:

curl "http://localhost:8080/models/adaptecho/adapters"

This API supports pagination:

curl "http://localhost:8080/models/adaptecho/adapters?limit=5&next_page_token=0"

{
  "adapters": [
    {
      "name": "a1",
      "src": "/opt/ml/model/adapters/a1",
      "pin": false
    }
  ]
}
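
Because both limit and next_page_token are optional, a client only needs to append the ones it actually sets. The sketch below builds the list URL accordingly. The names list_adapters_url and BASE_URL are hypothetical and used only for illustration. How the next token appears in the response is not shown in this document, so the sketch takes it as a plain argument.

```shell
# Assumed server address, matching the examples in this document.
BASE_URL="http://localhost:8080"

# Hypothetical helper: prints the list URL for a model's adapters.
# Usage: list_adapters_url MODEL [LIMIT] [NEXT_PAGE_TOKEN]
list_adapters_url() (
  model="$1"; limit="${2:-}"; token="${3:-}"
  url="${BASE_URL}/models/${model}/adapters"
  sep="?"
  # Append each optional parameter only when a value was supplied.
  if [ -n "$limit" ]; then
    url="${url}${sep}limit=${limit}"
    sep="&"
  fi
  if [ -n "$token" ]; then
    url="${url}${sep}next_page_token=${token}"
  fi
  printf '%s\n' "$url"
)

list_adapters_url adaptecho        # first page with the default limit
list_adapters_url adaptecho 5 0    # same request as the paginated example

# To send the request:
# curl "$(list_adapters_url adaptecho 5 0)"
```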

Advanced

For the single-model use case, the /models/{model_name} API prefix can be omitted, resulting in requests such as GET /adapters.
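
A script that should work in both setups can choose the path based on whether a model name is given. This is a minimal sketch; adapters_path is a hypothetical helper name, not part of DJL Serving.

```shell
# Hypothetical helper: prints the adapters path, with the
# /models/{model_name} prefix only when a model name is given.
adapters_path() (
  model="${1:-}"
  if [ -n "$model" ]; then
    printf '/models/%s/adapters\n' "$model"
  else
    printf '/adapters\n'
  fi
)

adapters_path adaptecho   # prints /models/adaptecho/adapters
adapters_path ""          # prints /adapters

# To list adapters in the single-model case:
# curl "http://localhost:8080$(adapters_path "")"
```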
