Stable Diffusion 1.5 and Stable Diffusion XL Tutorials #82

Merged: 86 commits, Mar 2, 2024
deefc37
initial stable diffusion - work in progress
nnshah1 Feb 26, 2024
d907813
updating with batching enabled
nnshah1 Feb 27, 2024
92b49ea
update client, model
nnshah1 Feb 27, 2024
2140339
updated to create a backend
nnshah1 Feb 27, 2024
141010c
updated for xl and 1.5 seperated configs
nnshah1 Feb 27, 2024
887f086
updated with version info
nnshah1 Feb 27, 2024
3619b76
updating to read steps from file and prevent parallel engine build
Feb 28, 2024
968a0f8
remove multi gpu instance capability - not supported
Feb 28, 2024
41db857
adding client and base deployment
nnshah1 Feb 28, 2024
ee0125b
update docker ignore for onnx directories
Feb 29, 2024
2dd3cbd
update with arg to build models
Feb 29, 2024
1bea74e
updating with basic client
nnshah1 Feb 29, 2024
b4096e6
updated with logging and dynamic batching settings
nnshah1 Feb 29, 2024
a73e57e
updates for making file saving optional and testing client
nnshah1 Feb 29, 2024
f289784
Adding documentation
nnshah1 Feb 29, 2024
4a38b6c
updated
nnshah1 Feb 29, 2024
66e5e2d
updates
nnshah1 Feb 29, 2024
16abc4b
updates
nnshah1 Feb 29, 2024
2588c78
updates
nnshah1 Feb 29, 2024
e60a920
update relative link
nnshah1 Feb 29, 2024
556bd9f
updates
nnshah1 Feb 29, 2024
e898c7b
updated
nnshah1 Feb 29, 2024
66b4c97
removing unused imports
nnshah1 Feb 29, 2024
1c319d6
update with timeout
nnshah1 Feb 29, 2024
d7b2880
fix typo
nnshah1 Feb 29, 2024
1926c2f
fix typo
nnshah1 Feb 29, 2024
f96c073
updates
nnshah1 Feb 29, 2024
516b5e2
Update Popular_Models_Guide/StableDiffusion/docs/model_configuration.md
nnshah1 Feb 29, 2024
3334a46
Update Popular_Models_Guide/StableDiffusion/docs/model_configuration.md
nnshah1 Feb 29, 2024
d277c55
Update Popular_Models_Guide/StableDiffusion/run.sh
nnshah1 Feb 29, 2024
de633d4
Update Popular_Models_Guide/StableDiffusion/README.md
nnshah1 Feb 29, 2024
fc39e0d
remove unused imports
nnshah1 Feb 29, 2024
3f4f37f
Update Popular_Models_Guide/StableDiffusion/README.md
nnshah1 Feb 29, 2024
3969235
Update Popular_Models_Guide/StableDiffusion/README.md
nnshah1 Feb 29, 2024
44cbe6a
Update Popular_Models_Guide/StableDiffusion/README.md
nnshah1 Feb 29, 2024
378d685
Update Popular_Models_Guide/StableDiffusion/README.md
nnshah1 Feb 29, 2024
a1d40d0
Update Popular_Models_Guide/StableDiffusion/README.md
nnshah1 Feb 29, 2024
153e87f
Update Popular_Models_Guide/StableDiffusion/backend/diffusion/model.py
nnshah1 Feb 29, 2024
10b0d1a
Update Popular_Models_Guide/StableDiffusion/client.py
nnshah1 Feb 29, 2024
3055749
Update Popular_Models_Guide/StableDiffusion/run.sh
nnshah1 Feb 29, 2024
160599e
Merge branch 'nnshah1-stable-diffusion' of https://github.com/triton-…
nnshah1 Feb 29, 2024
9c305d6
updated with copyright
nnshah1 Feb 29, 2024
85a2b42
update copyright
nnshah1 Feb 29, 2024
a57e59b
added missing scripts
nnshah1 Mar 1, 2024
6e275a6
Update Popular_Models_Guide/StableDiffusion/README.md
nnshah1 Mar 1, 2024
71c07aa
Update Popular_Models_Guide/StableDiffusion/scripts/build_models.sh
nnshah1 Mar 1, 2024
ec2e1be
Update Popular_Models_Guide/StableDiffusion/backend/diffusion/model.py
nnshah1 Mar 1, 2024
394d97a
Update Popular_Models_Guide/StableDiffusion/client.py
nnshah1 Mar 1, 2024
64a4a61
Update Popular_Models_Guide/StableDiffusion/diffusion-models/stable_d…
nnshah1 Mar 1, 2024
bca243b
fixed indexing in batches
nnshah1 Mar 1, 2024
2e82bd7
Merge branch 'nnshah1-stable-diffusion' of https://github.com/triton-…
nnshah1 Mar 1, 2024
0b2f8dc
updated with branch in anticipation of cherry pick
nnshah1 Mar 1, 2024
7f28092
Update Popular_Models_Guide/StableDiffusion/README.md
nnshah1 Mar 1, 2024
a6da3a1
updated language
nnshah1 Mar 1, 2024
de26e85
updated with new line
nnshah1 Mar 1, 2024
83d6188
clean up of docker ignore
nnshah1 Mar 1, 2024
c069424
update to test url reference
nnshah1 Mar 1, 2024
6cb73b3
updated
nnshah1 Mar 1, 2024
3ca7e13
update with reference test
nnshah1 Mar 1, 2024
3446345
updated
nnshah1 Mar 1, 2024
00c2fc7
Update Popular_Models_Guide/StableDiffusion/diffusion-models/stable_d…
nnshah1 Mar 1, 2024
1323588
remove comments
nnshah1 Mar 1, 2024
f5f110c
Merge branch 'nnshah1-stable-diffusion' of https://github.com/triton-…
nnshah1 Mar 1, 2024
6d5010d
remove client application reference
nnshah1 Mar 1, 2024
8937d77
updated
nnshah1 Mar 1, 2024
332d930
removing scripts replaced by popular model guide
nnshah1 Mar 1, 2024
a1c49a4
updated for latest 24.01
nnshah1 Mar 1, 2024
9e0fb84
updated for 24.01
nnshah1 Mar 1, 2024
20bb6fd
updated with known limitations
nnshah1 Mar 1, 2024
fbb7f22
updated
nnshah1 Mar 1, 2024
01778d4
tweak
nnshah1 Mar 1, 2024
e75d4f9
updated
nnshah1 Mar 1, 2024
35a2ad7
updated
nnshah1 Mar 1, 2024
d0dc0c0
updated with reference to refinder model in example
nnshah1 Mar 1, 2024
d9654ee
Update Triton_Inference_Server_Python_API/README.md
nnshah1 Mar 1, 2024
f3971a3
updated misspelling
nnshah1 Mar 1, 2024
45032d3
Merge branch 'nnshah1-stable-diffusion' of https://github.com/triton-…
nnshah1 Mar 1, 2024
5a8ff84
updated references
nnshah1 Mar 1, 2024
0f9704e
updating links
nnshah1 Mar 1, 2024
d269e37
updating
nnshah1 Mar 1, 2024
8b55dd7
updated
nnshah1 Mar 1, 2024
023fff6
Update README.md
nnshah1 Mar 1, 2024
b8b712b
remove reference to specific release
nnshah1 Mar 1, 2024
2fdbf3b
Merge branch 'nnshah1-stable-diffusion' of https://github.com/triton-…
nnshah1 Mar 1, 2024
9354288
updated to remove reference
nnshah1 Mar 1, 2024
106d948
Update Popular_Models_Guide/StableDiffusion/README.md
nnshah1 Mar 2, 2024
10 changes: 7 additions & 3 deletions .gitignore
```diff
@@ -1,8 +1,12 @@
 # Pretrained Models
-*.onnx
+**/*.onnx
+**/onnx/*.opt
+**/*.bin
+**/*.plan
+**/pytorch_model
 
 # Python Stuff
-__pycache__
+**/__pycache__
 
 # Downloaded Assets
-downloads
+**/downloads
```
300 changes: 300 additions & 0 deletions Popular_Models_Guide/StableDiffusion/README.md
<!--
# Copyright 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
# * Neither the name of NVIDIA CORPORATION nor the names of its
# contributors may be used to endorse or promote products derived
# from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-->

# Deploying Stable Diffusion Models with Triton and TensorRT

This example demonstrates how to deploy Stable Diffusion models in
Triton by leveraging the [TensorRT demo](https://github.com/NVIDIA/TensorRT/tree/release/9.2/demo/Diffusion)
pipeline and utilities.

Using the TensorRT demo as a base, this example provides a reusable
[python-based backend](https://github.com/triton-inference-server/backend/blob/main/docs/python_based_backends.md), [`/backend/diffusion/model.py`](backend/diffusion/model.py),
suitable for deploying multiple versions and configurations of
Diffusion models.

For more information on Stable Diffusion, please visit
[stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) and
[stable-diffusion-xl](https://huggingface.co/docs/diffusers/en/using-diffusers/sdxl). For
more information on the TensorRT implementation, see the [TensorRT demo](https://github.com/NVIDIA/TensorRT/tree/release/9.2/demo/Diffusion).

> [!Note]
> This example is given as sample code and should be reviewed before use in production settings.

| [Requirements](#requirements) | [Building Server Image](#building-the-triton-inference-server-image) | [Stable Diffusion v1.5](#building-and-running-stable-diffusion-v-15) | [Stable Diffusion XL](#building-and-running-stable-diffusion-xl) | [Sending an Inference Request](#sending-an-inference-request) | [Model Configuration](docs/model_configuration.md) | [Sample Client](#sample-client) | [Known Issues and Limitations](#known-issues-and-limitations) |

## Requirements

The following instructions require a Linux system with Docker
installed. For CUDA support, make sure your CUDA driver meets the
requirements in the "NVIDIA Driver" section of the [Deep Learning
Framework support matrix](https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html).
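
As a quick sanity check before building (a hedged sketch, not part of the tutorial's scripts), you can confirm that an NVIDIA driver is visible on the host:

```shell
# Print the installed NVIDIA driver version, if any.
# nvidia-smi ships with the NVIDIA driver; the query flags below are
# standard nvidia-smi options.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=driver_version --format=csv,noheader
else
    echo "nvidia-smi not found; install the NVIDIA driver first"
fi
```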

## Building the Triton Inference Server Image

The example is designed based on the
`nvcr.io/nvidia/tritonserver:24.01-py3` docker image and [TensorRT OSS v9.2.0](https://github.com/NVIDIA/TensorRT/releases/tag/v9.2.0).

A set of convenience scripts is provided to create a docker image
based on `nvcr.io/nvidia/tritonserver:24.01-py3` with the
dependencies for the TensorRT Stable Diffusion demo installed.

### Triton Inference Server + TensorRT OSS

#### Clone Repository
```bash
git clone https://github.com/triton-inference-server/tutorials.git -b r24.02 --single-branch
cd tutorials/Popular_Models_Guide/StableDiffusion
```

#### Build `tritonserver:r24.01-diffusion` Image
```bash
./build.sh
```

#### Included Models

The `default` build includes model configuration files located in the
`/diffusion-models` folder. Example configurations are provided for
[`stable_diffusion_1_5`](diffusion-models/stable_diffusion_1_5) and
[`stable_diffusion_xl`](diffusion-models/stable_diffusion_xl).

Model artifacts and engine files are not included in the image but are
built into a volume mounted directory as a separate step.
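
The provided configurations follow Triton's standard `config.pbtxt` format. As a rough, hypothetical illustration only (the field values and tensor names below are assumptions, not the shipped configuration), a configuration for this python-based backend has the general shape:

```
name: "stable_diffusion_1_5"
backend: "diffusion"        # assumed to match the /backend/diffusion directory
max_batch_size: 1
input [
  {
    name: "prompt"          # assumed input tensor name
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
output [
  {
    name: "generated_image" # assumed output tensor name
    data_type: TYPE_FP32
    dims: [ -1, -1, -1 ]
  }
]
```

See [Model Configuration](docs/model_configuration.md) for the actual options used in this example.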

## Building and Running Stable Diffusion v 1.5

### Start `tritonserver:r24.01-diffusion` Container

The following command starts a container and volume mounts the current
directory as `workspace`.

```bash
./run.sh
```

### Build Stable Diffusion v 1.5 Engine

```bash
./scripts/build_models.sh --model stable_diffusion_1_5
```

#### Expected Output
```
diffusion-models
|-- stable_diffusion_1_5
| |-- 1
| | |-- 1.5-engine-batch-size-1
| | |-- 1.5-onnx
| | |-- 1.5-pytorch_model
| `-- config.pbtxt

```

### Start a Server Instance

> [!Note]
> We use `EXPLICIT` model control mode for demonstration purposes to
> control which stable diffusion version is loaded. For production
> deployments please refer to [Secure Deployment Considerations][secure_guide]
> for more information on the risks associated with `EXPLICIT` mode.

[secure_guide]: https://github.com/triton-inference-server/server/blob/main/docs/customization_guide/deploy.md

```bash
tritonserver --model-repository diffusion-models --model-control-mode explicit --load-model stable_diffusion_1_5
```

#### Expected Output
```
<SNIP>
I0229 20:15:52.125050 749 server.cc:676]
+----------------------+---------+--------+
| Model | Version | Status |
+----------------------+---------+--------+
| stable_diffusion_1_5 | 1 | READY |
+----------------------+---------+--------+

<SNIP>
```
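
Once the server reports `READY`, you can optionally confirm the model over Triton's standard HTTP readiness endpoint (a sketch assuming the default HTTP port 8000):

```shell
# Triton's KServe-style readiness endpoint; the request succeeds once the
# model is loaded and fails otherwise.
if curl -sf localhost:8000/v2/models/stable_diffusion_1_5/ready; then
    echo "stable_diffusion_1_5 is ready"
else
    echo "stable_diffusion_1_5 is not ready (is the server running?)"
fi
```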

## Building and Running Stable Diffusion XL

### Start `tritonserver:r24.01-diffusion` Container

The following command starts a container and volume mounts the current
directory as `workspace`.

```bash
./run.sh
```

### Build Stable Diffusion XL Engine

```bash
./scripts/build_models.sh --model stable_diffusion_xl
```

#### Expected Output
```
diffusion-models
|-- stable_diffusion_xl
| |-- 1
| | |-- xl-1.0-engine-batch-size-1
| | |-- xl-1.0-onnx
| | `-- xl-1.0-pytorch_model
| `-- config.pbtxt
```

### Start a Server Instance

> [!Note]
> We use `EXPLICIT` model control mode for demonstration purposes to
> control which stable diffusion version is loaded. For production
> deployments please refer to [Secure Deployment Considerations][secure_guide]
> for more information on the risks associated with `EXPLICIT` mode.


```bash
tritonserver --model-repository diffusion-models --model-control-mode explicit --load-model stable_diffusion_xl
```

#### Expected Output
```
<SNIP>
I0229 20:22:22.912465 1440 server.cc:676]
+---------------------+---------+--------+
| Model | Version | Status |
+---------------------+---------+--------+
| stable_diffusion_xl | 1 | READY |
+---------------------+---------+--------+

<SNIP>
```

## Sending an Inference Request

We've provided a sample [client](client.py) application to make
sending and receiving requests simpler.

### Start `tritonserver:r24.01-diffusion` Container

In a separate terminal from the server, start a new container.

The following command starts a container and volume mounts the current
directory as `workspace`.

```bash
./run.sh
```


### Send Prompt to Stable Diffusion 1.5

```bash
python3 client.py --model stable_diffusion_1_5 --prompt "butterfly in new york, 4k, realistic" --save-image
```

#### Example Output

```
Client: 0 Throughput: 0.7201335361144658 Avg. Latency: 1.3677194118499756
Throughput: 0.7163933558221957 Total Time: 1.395881175994873
```

If `--save-image` is given, output images are saved as JPEG files:

`client_0_generated_image_0.jpg`

![sample_generated_image](./docs/client_0_generated_image_0_1_5.jpg)
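
The reported figures follow from simple timing arithmetic. A minimal sketch of the calculation (not client.py's actual code; the per-request durations below are made-up placeholders):

```python
# Per-request wall-clock durations in seconds (placeholder values).
latencies = [1.40, 1.35, 1.37]

total_time = sum(latencies)               # requests are sent back to back
avg_latency = total_time / len(latencies)
throughput = len(latencies) / total_time  # requests per second

print(f"Throughput: {throughput:.4f} Avg. Latency: {avg_latency:.4f}")
# → Throughput: 0.7282 Avg. Latency: 1.3733
```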


### Send Prompt to Stable Diffusion XL

```bash
python3 client.py --model stable_diffusion_xl --prompt "butterfly in new york, 4k, realistic" --save-image
```

#### Example Output

```
Client: 0 Throughput: 0.1825067711674996 Avg. Latency: 5.465569257736206
Throughput: 0.18224859609447058 Total Time: 5.487010717391968
```

If `--save-image` is given, output images are saved as JPEG files:

`client_0_generated_image_0.jpg`

![sample_generated_image](./docs/client_0_generated_image_0_xl.jpg)


## Sample Client

The sample [client](client.py) application lets users quickly
test the diffusion models under different concurrency scenarios. For a
full list and description of the client application's options, use:

```
python3 client.py --help
```

### Sending Concurrent Requests

To increase load and concurrency, use the `--clients` and
`--requests` options to control the number of client processes and the
number of requests sent by each client.

#### Example: Ten Clients Sending Ten Requests Each

The following command launches ten clients, each sending ten
requests. Each client is an independent process that sends its
requests sequentially, in parallel with the other nine clients.

```bash
python3 client.py --model stable_diffusion_xl --requests 10 --clients 10
```
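
The clients/requests pattern can be sketched with the standard library: each client is a separate process issuing its requests one after another, while the processes run in parallel. The stubbed `send_request` below stands in for the actual inference call (an illustration of the pattern, not client.py itself):

```python
import multiprocessing
import time


def send_request(client_id: int, request_id: int) -> None:
    """Stand-in for one inference request (replace with a real client call)."""
    time.sleep(0.01)


def run_client(client_id: int, num_requests: int) -> None:
    # Each client sends its requests sequentially and reports its own rate.
    start = time.time()
    for request_id in range(num_requests):
        send_request(client_id, request_id)
    elapsed = time.time() - start
    print(f"Client: {client_id} Throughput: {num_requests / elapsed:.4f}")


if __name__ == "__main__":
    clients, requests_per_client = 10, 10
    processes = [
        multiprocessing.Process(target=run_client, args=(i, requests_per_client))
        for i in range(clients)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
```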

## Known Issues and Limitations

1. When shutting down the server, an invalid memory operation occurs:

```
free(): invalid pointer
```

> [!Note]
> This error is also seen in standalone applications outside of the Triton Inference Server,
> and we believe it is due to an interaction between imported Python modules.

2. The diffusion backend doesn't support using a refiner model.

