updated to include gpu dependency and quantization packages #22904

Merged
merged 4 commits into from
Nov 20, 2024
Merged
Changes from 3 commits
15 changes: 13 additions & 2 deletions src/routes/blogs/olive-quant-ft/+page.svx
@@ -41,11 +41,15 @@ Also, as part of answering the question of when to quantize we'll show how the f

To answer our question on the right sequencing of quantization and fine-tuning, we leveraged Olive (ONNX Live), an advanced model optimization toolkit designed to streamline the process of optimizing AI models for deployment with the ONNX Runtime.

> **Note**: Both quantization and fine-tuning need to run on an Nvidia A10 or A100 GPU machine.

### 1. 💾 Install Olive

We installed the [Olive CLI](../blogs/olive-cli) using `pip`:

-<pre><code>pip install olive-ai[quantize,finetuning]
+<pre><code>pip install olive-ai[finetune]
+pip install autoawq
+pip install auto-gptq
</code></pre>
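A side note on the extras syntax: some shells (zsh, for example) treat square brackets as glob patterns, so the extras may need quoting. A minimal sketch:

<pre><code># Quote the extras on shells that glob square brackets (e.g. zsh)
pip install "olive-ai[finetune]"
pip install autoawq auto-gptq
</code></pre>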

### 2. 🗜️ Quantize
@@ -71,7 +75,14 @@ olive quantize \
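The quantization commands themselves are collapsed in this view. As a rough sketch of what an Olive CLI quantization call can look like for an AWQ run on a Hugging Face model, where the model id, output path, and flag values are illustrative assumptions rather than the collapsed lines from the post:

<pre><code># Illustrative sketch; model id and paths are assumptions, check `olive quantize --help`
olive quantize \
    --model_name_or_path meta-llama/Llama-3.1-8B-Instruct \
    --algorithm awq \
    --output_path models/llama/awq \
    --log_level 1
</code></pre>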

### 3. 🎚️ Fine-tune

-We fine-tune *the quantized models* using the following Olive commands:
+We fine-tune *the quantized models* using the [tiny codes](https://huggingface.co/datasets/nampdn-ai/tiny-codes) dataset from Hugging Face. This is a gated dataset,
+so you'll need to [request access](https://huggingface.co/docs/hub/main/datasets-gated). Once access has been granted, log in to Hugging Face with
+your [access token](https://huggingface.co/docs/hub/security-tokens):

<pre><code>huggingface-cli login --token TOKEN
</code></pre>

Olive can fine-tune using the following commands:

<pre><code># Finetune AWQ model
olive finetune \
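The remainder of the fine-tuning commands is collapsed in this diff. As a hedged sketch of what a full `olive finetune` invocation over the tiny codes dataset can look like, where the method, paths, and step count are illustrative assumptions rather than the exact values from the post:

<pre><code># Illustrative sketch; flag values are assumptions, check `olive finetune --help`
olive finetune \
    --method lora \
    --model_name_or_path models/llama/awq \
    --data_name nampdn-ai/tiny-codes \
    --max_steps 150 \
    --output_path models/llama/awq-ft \
    --log_level 1
</code></pre>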