Multilora docs (#22865)
natke authored Nov 19, 2024
1 parent c142e6f commit 37ec77f
Showing 6 changed files with 449 additions and 2 deletions.
75 changes: 74 additions & 1 deletion docs/genai/api/c.md
@@ -280,7 +280,7 @@ OGA_EXPORT OgaResult* OGA_API_CALL OgaGeneratorParamsSetInputSequences(OgaGenera

### Set model input

Set an additional model input, aside from the input_ids. For example additional inputs for LoRA adapters.
Set an additional model input, aside from the input_ids.

### Parameters

@@ -433,6 +433,79 @@ More details on the current runtime options can be found [here](https://github.c
```c
OGA_EXPORT void OGA_API_CALL OgaGenerator_SetRuntimeOption(OgaGenerator* generator, const char* key, const char* value);
```

## Adapter API

This API is used to load and switch fine-tuned adapters, such as LoRA adapters.

### Create adapters

Creates the object that manages the adapters. This object is used to load all the model adapters and is responsible for reference counting the loaded adapters.

```c
OGA_EXPORT OgaResult* OGA_API_CALL OgaCreateAdapters(const OgaModel* model, OgaAdapters** out);
```

#### Parameters

* `model`: The previously created `OgaModel`.

#### Results

* `out`: Receives a pointer to the newly created `OgaAdapters` object.

### Load adapter

Loads the model adapter from the given file path and registers it under the given adapter name.

```c
OGA_EXPORT OgaResult* OGA_API_CALL OgaLoadAdapter(OgaAdapters* adapters, const char* adapter_file_path, const char* adapter_name);
```

#### Parameters

* `adapters`: The `OgaAdapters` object into which to load the adapter.
* `adapter_file_path`: The file path of the adapter to load.
* `adapter_name`: A unique identifier for the adapter, used to refer to it in later calls.

#### Return value

`OgaResult` containing an error message if the adapter failed to load.

### Unload adapter

Unloads the adapter with the given identifier from the set of previously loaded adapters. If the adapter is not found, or if it cannot be unloaded because it is in use, an error is returned.

```c
OGA_EXPORT OgaResult* OGA_API_CALL OgaUnloadAdapter(OgaAdapters* adapters, const char* adapter_name);
```

#### Parameters

* `adapters`: The `OgaAdapters` object from which to unload the adapter.
* `adapter_name`: The name of the adapter to unload.

#### Return value

`OgaResult` containing an error message if the adapter failed to unload. This can occur if the method is called with an adapter that is not currently loaded, or with one that has been marked active by an `OgaGenerator` that is still in use.
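
The unload rules above can be pictured with a small toy model. This is a plain Python sketch, not the library's implementation: `ToyAdapters`, `acquire`, and `release` are hypothetical names invented for illustration, and the only assumption is that each loaded adapter tracks how many generators currently have it active.

```python
class ToyAdapters:
    """Toy registry that mimics the documented unload rules.

    Hypothetical illustration only; not the real OgaAdapters implementation.
    """

    def __init__(self):
        # Maps adapter name -> number of generators that have it active.
        self._active_count = {}

    def load(self, name):
        # Reloading an existing name is not covered by the text above;
        # this toy simply keeps the existing entry.
        self._active_count.setdefault(name, 0)

    def acquire(self, name):
        # Stand-in for a generator marking the adapter as active.
        if name not in self._active_count:
            raise KeyError(f"adapter {name!r} has not been loaded")
        self._active_count[name] += 1

    def release(self, name):
        # Stand-in for that generator going away.
        self._active_count[name] -= 1

    def unload(self, name):
        # Both documented failure cases: unknown adapter, or adapter in use.
        if name not in self._active_count:
            raise KeyError(f"adapter {name!r} is not loaded")
        if self._active_count[name] > 0:
            raise RuntimeError(f"adapter {name!r} is still in use")
        del self._active_count[name]
```

In this toy, `unload` succeeds only after every `acquire` has been matched by a `release`, which mirrors the "still in use" error described above.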

### Set active adapter

Sets the adapter with the given adapter name as active for the given `OgaGenerator` object.

```c
OGA_EXPORT OgaResult* OGA_API_CALL OgaSetActiveAdapter(OgaGenerator* generator, OgaAdapters* adapters, const char* adapter_name);
```

#### Parameters

* `generator`: The `OgaGenerator` object on which to set the active adapter.
* `adapters`: The `OgaAdapters` object that manages the model adapters.
* `adapter_name`: The name of the adapter to set as active.

#### Return value

`OgaResult` containing an error message if the adapter failed to be set as active. This can occur if the method is called with an adapter that has not been previously loaded.

## Enums and structs
86 changes: 86 additions & 0 deletions docs/genai/api/csharp.md
@@ -155,6 +155,33 @@ public void GenerateNextToken()
```csharp
public ReadOnlySpan<int> GetSequence(ulong index)
```

### Set active adapter

Sets the active adapter on this Generator instance.

```csharp
using var model = new Model(modelPath);
using var genParams = new GeneratorParams(model);
using var generator = new Generator(model, genParams);
using var adapters = new Adapters(model);
string adapterName = "...";

generator.SetActiveAdapter(adapters, adapterName);
```

#### Parameters

* `adapters`: the previously created `Adapters` object
* `adapterName`: the name of the adapter to activate

#### Return value

`void`

#### Exception

Throws on error.

## Sequences class

### Num sequences member
@@ -169,3 +196,62 @@ public ulong NumSequences { get { return _numSequences; } }
```csharp
public ReadOnlySpan<int> this[ulong sequenceIndex]
```

## Adapters class

This API is used to load and switch fine-tuned adapters, such as LoRA adapters.

### Constructor

Constructs an instance of the `Adapters` class.

```csharp
using var model = new Model(modelPath);

using var adapters = new Adapters(model);
```

#### Parameters

* `model`: a previously constructed `Model` instance

### Load Adapter method

Loads an adapter file from disk.

```csharp
string adapterPath = "..."; // path to the adapter file on disk
string adapterName = "...";

adapters.LoadAdapter(adapterPath, adapterName);
```

#### Parameters

* `adapterPath`: the path to the adapter file on disk
* `adapterName`: a string identifier used to refer to the adapter in subsequent methods

#### Return value

`void`

### Unload Adapter method

Unloads a previously loaded adapter from memory.

```csharp
adapters.UnloadAdapter(adapterName);
```

#### Parameters

* `adapterName`: the name of the adapter to unload

#### Return value

`void`

#### Exception

Throws an exception on error.


4 changes: 4 additions & 0 deletions docs/genai/api/java.md
@@ -610,3 +610,7 @@ public int[] getSequence(long sequenceIndex)

The sequence as an array of integers.


## Adapter class

_Coming very soon!_
55 changes: 54 additions & 1 deletion docs/genai/api/python.md
@@ -316,4 +316,57 @@ Returns
```python
onnxruntime_genai.Generator.get_sequence(index: int) -> numpy.ndarray[numpy.int32]
```

- `index`: (Required) The index of the sequence in the batch to return

## Adapters class

### Create

Create an Adapters object, using a model that has been loaded.

```python
import onnxruntime_genai as og

model = ...  # a previously loaded og.Model
adapters = og.Adapters(model)
```

#### Parameters

* `model`: the model that the adapters will be used with

#### Return value

An `Adapters` object

### Load

Load an adapter from disk into an `Adapters` object in memory.

```python
onnxruntime_genai.Adapters.load(file: str, name: str) -> None
```

#### Parameters

* `file`: the location on disk from which to load the adapter
* `name`: the name used to refer to the adapter in later calls

#### Return value

None

### Set active adapter

Sets the active adapter on a `Generator` object.

```python
onnxruntime_genai.Generator.set_active_adapter(adapters: onnxruntime_genai.Adapters, adapter: str) -> None
```

#### Parameters

* `adapters`: the `Adapters` object into which the identified adapter has been loaded
* `adapter`: the name of the adapter to set as active

#### Return value

None
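
The three calls above are typically used together. The following is a minimal sketch rather than an official sample: it assumes the Python method names are `Adapters.load` and `Generator.set_active_adapter`, and the helper `generator_with_adapter` and its parameter names are hypothetical. The `onnxruntime_genai` module is passed in as `og` so the sketch can be read and dry-run without the package installed.

```python
def generator_with_adapter(og, model_path, adapter_path, adapter_name):
    """Create a generator and activate one adapter on it.

    `og` is the imported `onnxruntime_genai` module (assumption: passed in
    by the caller). All other arguments are caller-supplied placeholders.
    """
    model = og.Model(model_path)

    # One Adapters object manages every adapter loaded for this model.
    adapters = og.Adapters(model)
    adapters.load(adapter_path, adapter_name)

    params = og.GeneratorParams(model)
    generator = og.Generator(model, params)

    # The adapter must already be loaded before it can be made active.
    generator.set_active_adapter(adapters, adapter_name)

    # Return the adapters object as well so the caller keeps a reference to it.
    return generator, adapters
```

With the package installed, a call might look like `generator_with_adapter(og, model_path, adapter_path, adapter_name)` after `import onnxruntime_genai as og`, where the path and name arguments are supplied by the caller.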
66 changes: 66 additions & 0 deletions docs/genai/reference/adapter.md
@@ -0,0 +1,66 @@
---
title: Adapter file spec
description: Specification for the .onnx_adapter file format
has_children: false
parent: Reference
grand_parent: Generate API (Preview)
nav_order: 2
---

# Adapter file specification


## File format

The adapter file format is FlatBuffers.

## File extension

The file extension is `.onnx_adapter`.

## Schema

Link to live [schema definition](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/lora/adapter_format/adapter_schema.fbs).

The schema definition is as follows:

```
File:=
format_version := integer
adapter_version := integer
model_version := integer
[parameter := Parameter]
```

```
Parameter:=
name := string
dimensions := [int64]
data_type := TensorDataType
[data := uint8]
```
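
As a rough illustration of how the `Parameter` fields relate to each other, the sketch below models one parameter in plain Python and checks that the byte length of `data` matches the shape. It does not parse a real `.onnx_adapter` file. It assumes that `data` holds the tensor elements packed contiguously, which the schema excerpt does not state, and the helper names `element_count` and `has_expected_size` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Parameter:
    """In-memory mirror of the Parameter table (illustrative only)."""
    name: str
    dimensions: list  # int64 values, one per tensor axis
    data_type: int    # TensorDataType code
    data: bytes       # raw tensor bytes


def element_count(param):
    # Product of all dimensions; an empty list yields 1 (a scalar).
    return math.prod(param.dimensions)


def has_expected_size(param, bytes_per_element):
    # Assumption: elements are densely packed with no padding.
    return len(param.data) == element_count(param) * bytes_per_element
```

For example, a `FLOAT` parameter with dimensions `[2, 3]` would be expected to carry 24 bytes under this assumption, since each of its 6 elements occupies 4 bytes.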

```
TensorDataType:=
UNDEFINED = 0 |
FLOAT = 1 |
UINT8 = 2 |
INT8 = 3 |
UINT16 = 4 |
INT16 = 5 |
INT32 = 6 |
INT64 = 7 |
STRING = 8 |
BOOL = 9 |
FLOAT16 = 10 |
DOUBLE = 11 |
UINT32 = 12 |
UINT64 = 13 |
COMPLEX64 = 14 |
COMPLEX128 = 15 |
BFLOAT16 = 16 |
FLOAT8E4M3FN = 17 |
FLOAT8E4M3FNUZ = 18 |
FLOAT8E5M2 = 19 |
FLOAT8E5M2FNUZ = 20
```
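
When inspecting adapter files from Python, the `TensorDataType` codes above can be mirrored as an enum so that integer codes print as readable names. This is a convenience sketch, not part of any published package; the `describe` helper is a hypothetical name.

```python
from enum import IntEnum


class TensorDataType(IntEnum):
    """Mirror of the TensorDataType codes listed in the schema above."""
    UNDEFINED = 0
    FLOAT = 1
    UINT8 = 2
    INT8 = 3
    UINT16 = 4
    INT16 = 5
    INT32 = 6
    INT64 = 7
    STRING = 8
    BOOL = 9
    FLOAT16 = 10
    DOUBLE = 11
    UINT32 = 12
    UINT64 = 13
    COMPLEX64 = 14
    COMPLEX128 = 15
    BFLOAT16 = 16
    FLOAT8E4M3FN = 17
    FLOAT8E4M3FNUZ = 18
    FLOAT8E5M2 = 19
    FLOAT8E5M2FNUZ = 20


def describe(code):
    # Map an integer code to its name, tolerating codes not listed above.
    try:
        return TensorDataType(code).name
    except ValueError:
        return f"unknown({code})"
```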