From e70cfa2dfdf19ea2b050eef302b592d8a3f68b84 Mon Sep 17 00:00:00 2001 From: Samuel Kemp Date: Wed, 20 Nov 2024 10:52:09 -0600 Subject: [PATCH 1/4] updated to include gpu dependency and quantization packages --- src/routes/blogs/olive-quant-ft/+page.svx | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/src/routes/blogs/olive-quant-ft/+page.svx b/src/routes/blogs/olive-quant-ft/+page.svx index 1f922bc7c0aa6..c0373740945c9 100644 --- a/src/routes/blogs/olive-quant-ft/+page.svx +++ b/src/routes/blogs/olive-quant-ft/+page.svx @@ -41,11 +41,15 @@ Also, as part of answering the question of when to quantize we'll show how the f To answer our question on the right sequencing of quantization and fine-tuning we leveraged Olive (ONNX Live) - an advanced model optimization toolkit designed to streamline the process of optimizing AI models for deployment with the ONNX runtime. +> **Note**: Both quantization and fine-tuning need to run on an Nvidia A10 or A100 GPU machine. + ### 1. 💾 Install Olive We installed the [Olive CLI](../blogs/olive-cli) using `pip`:
pip install olive-ai[quantize,finetuning]
+pip install autoawq
+pip install auto-gptq
 
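
Before installing, it's worth confirming that a supported GPU is actually visible on the machine. A minimal check with `nvidia-smi` (assuming the NVIDIA driver is already installed) looks like this:

# Sanity check: list the visible NVIDIA GPU(s) and their memory
# (an A10 or A100 should appear here; requires the NVIDIA driver)
nvidia-smi --query-gpu=name,memory.total --format=csv
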
### 2. 🗜️ Quantize From 2c4d6f6c4636a05bd7d23f0c31460883296f17ae Mon Sep 17 00:00:00 2001 From: Samuel Kemp Date: Wed, 20 Nov 2024 11:27:48 -0600 Subject: [PATCH 2/4] Update +page.svx fixed typo --- src/routes/blogs/olive-quant-ft/+page.svx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/routes/blogs/olive-quant-ft/+page.svx b/src/routes/blogs/olive-quant-ft/+page.svx index c0373740945c9..074e44029b779 100644 --- a/src/routes/blogs/olive-quant-ft/+page.svx +++ b/src/routes/blogs/olive-quant-ft/+page.svx @@ -47,7 +47,7 @@ To answer our question on the right sequencing of quantization and fine-tuning w We installed the [Olive CLI](../blogs/olive-cli) using `pip`: -
pip install olive-ai[quantize,finetuning]
+
pip install olive-ai[finetune]
 pip install autoawq
 pip install auto-gptq
 
From 7ccd6f05f2fc7aa3e36b9b4faee35fb7fe18c745 Mon Sep 17 00:00:00 2001 From: Samuel Kemp Date: Wed, 20 Nov 2024 12:27:37 -0600 Subject: [PATCH 3/4] Update +page.svx updated the need for access to the dataset --- src/routes/blogs/olive-quant-ft/+page.svx | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/src/routes/blogs/olive-quant-ft/+page.svx b/src/routes/blogs/olive-quant-ft/+page.svx index 074e44029b779..abba85bcd61ef 100644 --- a/src/routes/blogs/olive-quant-ft/+page.svx +++ b/src/routes/blogs/olive-quant-ft/+page.svx @@ -75,7 +75,14 @@ olive quantize \ ### 3. 🎚️ Fine-tune -We fine-tune *the quantized models* using the following Olive commands: +We fine-tune *the quantized models* using the [tiny codes](https://huggingface.co/datasets/nampdn-ai/tiny-codes) dataset from Hugging Face. This is a gated dataset +and you'll need to [request access](https://huggingface.co/docs/hub/main/datasets-gated). Once access has been granted, you should log in to Hugging Face with +your [access token](https://huggingface.co/docs/hub/security-tokens): + +
huggingface-cli login --token TOKEN
+
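+
+If you want to confirm the token was registered before launching fine-tuning, a quick `huggingface-cli whoami` should print the account that was granted access to the dataset:
+
+

+# Optional check: verify the cached Hugging Face credentials
+huggingface-cli whoami
+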
+ +Olive can then fine-tune using the following commands:
# Finetune AWQ model
 olive finetune \

From 39d531e47c8c26250962e313533498419f94c299 Mon Sep 17 00:00:00 2001
From: Samuel Kemp 
Date: Wed, 20 Nov 2024 12:38:46 -0600
Subject: [PATCH 4/4] Update +page.svx

fixed typo on model paths
---
 src/routes/blogs/olive-quant-ft/+page.svx | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/routes/blogs/olive-quant-ft/+page.svx b/src/routes/blogs/olive-quant-ft/+page.svx
index abba85bcd61ef..3363b24c4ec7e 100644
--- a/src/routes/blogs/olive-quant-ft/+page.svx
+++ b/src/routes/blogs/olive-quant-ft/+page.svx
@@ -119,8 +119,8 @@ We ran a [perplexity metrics](https://huggingface.co/docs/transformers/perplexit
 
 
input_model:
   type: HfModel
-  model_path: models/phi-awq-pt/model
-  adapter_path: models/phi-awq-pt/adapter
+  model_path: models/phi-awq-ft/model
+  adapter_path: models/phi-awq-ft/adapter
 systems:
   local_system:
     type: LocalSystem
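 As a rough sketch of how a config like the one above is used, the workflow can be launched with `olive run`, assuming the YAML has been saved locally (the filename below is just a placeholder):

# Launch the evaluation workflow defined in the patched config
# (perplexity-config.yaml is a placeholder name for the file edited above)
olive run --config perplexity-config.yaml
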