Add link to iterative scheduling tutorial (#94)
* Add link to iterative scheduling tutorial

* Review comments
Tabrizian authored May 24, 2024
1 parent 30dea5c commit b3759c8
Showing 3 changed files with 6 additions and 3 deletions.
4 changes: 2 additions & 2 deletions Conceptual_Guide/Part_6-building_complex_pipelines/README.md
@@ -28,8 +28,8 @@

# Building Complex Pipelines: Stable Diffusion

- | Navigate to | [Part 5: Building Model Ensembles](../Part_5-Model_Ensembles/) | [Documentation: BLS](https://github.com/triton-inference-server/python_backend#business-logic-scripting) |
- | ------------ | --------------- | --------------- |
+ | Navigate to | [Part 5: Building Model Ensembles](../Part_5-Model_Ensembles/) | [Part 7: Iterative Scheduling Tutorial](./Part_7-iterative_scheduling) | [Documentation: BLS](https://github.com/triton-inference-server/python_backend#business-logic-scripting) |
+ | ------------ | --------------- | --------------- | --------------- |

**Watch [this explainer video](https://youtu.be/JgP2WgNIq_w), which discusses the pipeline, before proceeding with the example.** This example focuses on showcasing two of Triton Inference Server's features:
* Using multiple frameworks in the same inference pipeline. Refer to [this list of available backends](https://github.com/triton-inference-server/backend#where-can-i-find-all-the-backends-that-are-available-for-triton) for more information about supported frameworks. A minimal BLS sketch follows below.
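Part 6 is built around Business Logic Scripting (BLS), which lets a Python-backend model call other models that Triton is serving from ordinary Python code. Below is a minimal, illustrative sketch of such a call; the model name `text_encoder` and the tensor names `input_ids`/`embeddings` are placeholders, not taken from the Stable Diffusion example:

```python
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    """Sketch of a python_backend model that uses BLS to call another model."""

    def execute(self, requests):
        responses = []
        for request in requests:
            # Tensor received by this model; the name is a placeholder.
            input_ids = pb_utils.get_input_tensor_by_name(request, "input_ids")

            # BLS call: build and execute an inference request against
            # another deployed model ("text_encoder" is hypothetical).
            bls_request = pb_utils.InferenceRequest(
                model_name="text_encoder",
                requested_output_names=["embeddings"],
                inputs=[input_ids],
            )
            bls_response = bls_request.exec()
            if bls_response.has_error():
                raise pb_utils.TritonModelException(
                    bls_response.error().message())

            # Forward the intermediate result as this model's output.
            embeddings = pb_utils.get_output_tensor_by_name(
                bls_response, "embeddings")
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[embeddings]))
        return responses
```

Because the call happens in plain Python, the pipeline can branch, loop, or choose models dynamically, which is the control flow that static ensembles cannot express.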
3 changes: 3 additions & 0 deletions Conceptual_Guide/Part_7-iterative_scheduling/README.md
@@ -28,6 +28,9 @@

# Deploying a GPT-2 Model using Python Backend and Iterative Scheduling

+ | Navigate to | [Part 6: Building Complex Pipelines: Stable Diffusion](../Part_6-building_complex_pipelines) | [Documentation: Iterative Scheduling](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/model_configuration.html#iterative-sequences) |
+ | ------------ | --------------- | --------------- |
+
In this tutorial, we will deploy a GPT-2 model using the Python backend and
demonstrate the
[iterative scheduling](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/model_configuration.html#iterative-sequences)
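For context on the feature linked above: iterative scheduling is enabled through the model configuration rather than code. A minimal `config.pbtxt` sketch, assuming a hypothetical Python-backend model named `gpt2` (all values are illustrative, not copied from the tutorial):

```
name: "gpt2"
backend: "python"
max_batch_size: 8

# Iterative sequences let the scheduler re-enqueue an in-flight request
# after each decoding step instead of holding its slot for the whole
# generation.
sequence_batching {
  iterative_sequences: true
}

# Streaming one response per generated token requires the decoupled
# transaction policy.
model_transaction_policy {
  decoupled: true
}

# Input/output tensor definitions omitted for brevity.
```

With this in place, new requests can be batched together with the per-step continuations of requests that are already generating.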
2 changes: 1 addition & 1 deletion Conceptual_Guide/README.md
@@ -39,4 +39,4 @@ Conceptual guides have been designed as an onboarding experience to Triton Infer
* [Part 4: Accelerating Models](Part_4-inference_acceleration/): Another path towards achieving higher throughput is to accelerate the underlying models. This guide covers SDKs and tools which can be used to accelerate the models.
* [Part 5: Building Model Ensembles](./Part_5-Model_Ensembles/): Models are rarely used standalone. This guide covers how to build a deep learning inference pipeline.
* [Part 6: Using the BLS API to build complex pipelines](Part_6-building_complex_pipelines/): There are often scenarios where a pipeline requires control flow. Learn how to work with complex pipelines whose models are deployed on different backends.
-
+ * [Part 7: Iterative Scheduling Tutorial](./Part_7-iterative_scheduling): Shows how to use the Triton iterative scheduler with a GPT-2 model using HuggingFace Transformers.
