diff --git a/notebooks/en/_toctree.yml b/notebooks/en/_toctree.yml index 566d049a..7b33993b 100644 --- a/notebooks/en/_toctree.yml +++ b/notebooks/en/_toctree.yml @@ -16,3 +16,5 @@ title: Advanced RAG on HuggingFace documentation using LangChain - local: rag_evaluation title: RAG Evaluation + - local: prompt_tuning_peft + title: Prompt tuning with PEFT diff --git a/notebooks/en/index.md b/notebooks/en/index.md index 4e1ec0b3..f2eed41c 100644 --- a/notebooks/en/index.md +++ b/notebooks/en/index.md @@ -7,6 +7,7 @@ applications and solving various machine learning tasks using open-source tools Check out the recently added notebooks: +- [Prompt Tuning with PEFT Library](prompt_tuning_peft) - [Migrating from OpenAI to Open LLMs Using TGI's Messages API](tgi_messages_api_demo) - [Automatic Embeddings with TEI through Inference Endpoints](automatic_embedding_tei_inference_endpoints) - [Simple RAG for GitHub issues using Hugging Face Zephyr and LangChain](rag_zephyr_langchain) diff --git a/notebooks/en/prompt_tuning_peft.ipynb b/notebooks/en/prompt_tuning_peft.ipynb new file mode 100644 index 00000000..2ae63c4d --- /dev/null +++ b/notebooks/en/prompt_tuning_peft.ipynb @@ -0,0 +1,1022 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "6fba2d42-ed99-4a03-8033-d479ce24d5dd", + "showTitle": false, + "title": "" + }, + "id": "2vkOvTEsVaTA" + }, + "source": [ + "# Prompt Tuning With PEFT.\n", + "_Authored by: [Pere Martra](https://github.com/peremartra)_\n", + "\n", + "\n", + "In this notebook we introduce how to apply prompt tuning with the PEFT library to a pre-trained model.\n", + "\n", + "For a complete list of models compatible with PEFT, refer to their [documentation](https://huggingface.co/docs/peft/main/en/index#supported-methods).\n", + "\n", + "A short sample of models available to be trained with PEFT includes Bloom, Llama, GPT-J, GPT-2, BERT, and more. Hugging Face is working hard to add more models to the library.\n", + "\n", + "## Brief introduction to Prompt Tuning.\n", + "It’s an Additive Fine-Tuning technique for models. This means that we WILL NOT MODIFY ANY WEIGHTS OF THE ORIGINAL MODEL. You might be wondering, how are we going to perform Fine-Tuning then? Well, we will train additional layers that are added to the model. That’s why it’s called an Additive technique.\n", + "\n", + "Considering it’s an Additive technique and its name is Prompt-Tuning, it seems clear that the layers we’re going to add and train are related to the prompt.\n", + "\n", + "![Prompt_Tuning_Diagram](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/Martra_Figure_5_Prompt_Tuning.jpg)\n", + "\n", + "We are creating a type of superprompt by enabling a model to enhance a portion of the prompt with its acquired knowledge. However, that particular section of the prompt cannot be translated into natural language.
**It's as if we've mastered expressing ourselves in embeddings and generating highly effective prompts.**\n", + "\n", + "In each training cycle, the only weights that can be modified to minimize the loss function are those integrated into the prompt.\n", + "\n", + "The primary consequence of this technique is that the number of parameters to train is genuinely small. However, we encounter a second, perhaps more significant consequence, namely that, **since we do not modify the weights of the pretrained model, it does not alter its behavior or forget any information it has previously learned.**\n", + "\n", + "The training is faster and more cost-effective. Moreover, we can train various models, and at inference time we only need to load one foundational model along with the new, smaller trained models, because the weights of the original model have not been altered.\n", + "\n", + "## What are we going to do in the notebook?\n", + "We are going to train two different models using two datasets, both starting from the same pre-trained model from the Bloom family. One model will be trained with a dataset of prompts, while the other will use a dataset of inspirational sentences. We will compare the results for the same question from both models before and after training.\n", + "\n", + "Additionally, we'll explore how to load both models with only one copy of the foundational model in memory.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "tZhdbTh-VaTA" + }, + "source": [ + "## Loading the PEFT Library\n", + "This library contains the Hugging Face implementation of various Fine-Tuning techniques, including Prompt Tuning." + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "d16bf5ec-888b-4c76-a655-193fd4cc8a36", + "showTitle": false, + "title": "" + }, + "id": "JechhJhhVaTA" + }, + "outputs": [], + "source": [ + "!pip install -q peft==0.8.2" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": { + "id": "6CRxq5Z2WJ7C" + }, + "outputs": [], + "source": [ + "!pip install -q datasets==2.14.5" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "GGbh426RVaTB" + }, + "source": [ + "From the transformers library, we import the necessary classes to instantiate the model and the tokenizer." + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "31738463-c9b0-431d-869e-1735e1e2f5c7", + "showTitle": false, + "title": "" + }, + "id": "KWOEt-yOVaTB" + }, + "outputs": [], + "source": [ + "from transformers import AutoModelForCausalLM, AutoTokenizer" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6qYsnwjSVaTC" + }, + "source": [ + "### Loading the model and the tokenizer.\n", + "\n", + "Bloom is one of the smallest and most capable models available for training with the PEFT library using Prompt Tuning. You can choose any model from the Bloom family, and I encourage you to try at least two of them to observe the differences.\n", + "\n", + "I'm opting for the smallest one to minimize training time and avoid memory issues in Colab."
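, + "\n", + "If you're unsure which Bloom checkpoint fits your hardware, you can compare candidates before downloading the full weights. A quick, optional sketch (nothing later in the notebook depends on it; it only reads each model's config, and the two checkpoint names are the ones mentioned above):\n", + "\n", + "```python\n", + "from transformers import AutoConfig\n", + "\n", + "for name in [\"bigscience/bloomz-560m\", \"bigscience/bloom-1b1\"]:\n", + "    config = AutoConfig.from_pretrained(name)\n", + "    # hidden_size and n_layer give a rough idea of model size;\n", + "    # exact parameter counts require loading the model itself.\n", + "    print(name, \"hidden_size:\", config.hidden_size, \"layers:\", config.n_layer)\n", + "```"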
+ ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": { + "id": "MnqIhv2UVaTC" + }, + "outputs": [], + "source": [ + "model_name = \"bigscience/bloomz-560m\"\n", + "#model_name=\"bigscience/bloom-1b1\"\n", + "NUM_VIRTUAL_TOKENS = 4\n", + "NUM_EPOCHS = 6" + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": { + "id": "fSMu3qRsVaTC" + }, + "outputs": [], + "source": [ + "tokenizer = AutoTokenizer.from_pretrained(model_name)\n", + "foundational_model = AutoModelForCausalLM.from_pretrained(\n", + " model_name,\n", + " trust_remote_code=True\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8W2fWhOnVaTC" + }, + "source": [ + "## Inference with the pre-trained Bloom model\n", + "If you want to achieve more varied and original generations, uncomment the temperature, top_p, and do_sample parameters in *model.generate* below.\n", + "\n", + "With the default configuration, the model's responses remain consistent across calls." + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": { + "id": "47j2D3WWVaTC" + }, + "outputs": [], + "source": [ + "#Returns the outputs generated by the given model for the given inputs.\n", + "def get_outputs(model, inputs, max_new_tokens=100):\n", + " outputs = model.generate(\n", + " input_ids=inputs[\"input_ids\"],\n", + " attention_mask=inputs[\"attention_mask\"],\n", + " max_new_tokens=max_new_tokens,\n", + " #temperature=0.2,\n", + " #top_p=0.95,\n", + " #do_sample=True,\n", + " repetition_penalty=1.5, #Avoid repetition.\n", + " early_stopping=True, #The model can stop before reaching the maximum length.\n", + " eos_token_id=tokenizer.eos_token_id\n", + " )\n", + " return outputs" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "ca4d203a-5152-4947-ab34-cfd0b40a102a", + "showTitle": false, + "title": "" + }, + "id": "kRLSfuo2VaTC" + }, + "source": [ + "As we want to have two different trained models, I will create two distinct prompts.\n", + "\n", + "The first model will be trained with a dataset containing prompts, and the second one with a dataset of motivational sentences.\n", + "\n", + "The first model will receive the prompt \"I want you to act as a motivational coach.\" and the second model will receive \"There are two nice things that should matter to you:\"\n", + "\n", + "But first, I'm going to collect some results from the model without Fine-Tuning." + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "1d4c80a9-4edd-4fcd-aef0-996f4da5cc02", + "showTitle": false, + "title": "" + }, + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "QvStaT7cVaTC", + "outputId": "ab34b3cd-a849-4dff-b36d-bf25c9f55ce1" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "[\"I want you to act as a motivational coach. Don't be afraid of being challenged.\"]\n" + ] + } + ], + "source": [ + "input_prompt = tokenizer(\"I want you to act as a motivational coach.
\", return_tensors=\"pt\")\n", + "foundational_outputs_prompt = get_outputs(foundational_model, input_prompt, max_new_tokens=50)\n", + "\n", + "print(tokenizer.batch_decode(foundational_outputs_prompt, skip_special_tokens=True))" + ] + }, + { + "cell_type": "code", + "execution_count": 69, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "1Xhm3jZMVaTD", + "outputId": "305f0137-6a02-4e43-9c9d-2b4ecd377937" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "['There are two nice things that should matter to you: the price and quality of your product.']\n" + ] + } + ], + "source": [ + "input_sentences = tokenizer(\"There are two nice things that should matter to you:\", return_tensors=\"pt\")\n", + "foundational_outputs_sentence = get_outputs(foundational_model, input_sentences, max_new_tokens=50)\n", + "\n", + "print(tokenizer.batch_decode(foundational_outputs_sentence, skip_special_tokens=True))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "f438d43b-6b9f-445e-9df4-60ea09640764", + "showTitle": false, + "title": "" + }, + "id": "OGbJTbRnVaTD" + }, + "source": [ + "Both answers are more or less correct. Any of the Bloom models is pre-trained and can generate sentences accurately and sensibly. Let's see if, after training, the responses are either equal or more accurately generated.\n", + "\n", + "## Preparing the Datasets\n", + "The Datasets useds are:\n", + "* https://huggingface.co/datasets/fka/awesome-chatgpt-prompts\n", + "* https://huggingface.co/datasets/Abirate/english_quotes\n" + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "metadata": { + "id": "RD8H_LLaVaTD" + }, + "outputs": [], + "source": [ + "import os\n", + "#os.environ[\"TOKENIZERS_PARALLELISM\"] = \"false\"" + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "2ed62b41-e3fa-4a41-a0a9-59f35a6904f9", + "showTitle": false, + "title": "" + }, + "id": "xmAp_o4PVaTD" + }, + "outputs": [], + "source": [ + "from datasets import load_dataset\n", + "\n", + "dataset_prompt = \"fka/awesome-chatgpt-prompts\"\n", + "\n", + "#Create the Dataset to create prompts.\n", + "data_prompt = load_dataset(dataset_prompt)\n", + "data_prompt = data_prompt.map(lambda samples: tokenizer(samples[\"prompt\"]), batched=True)\n", + "train_sample_prompt = data_prompt[\"train\"].select(range(50))\n" + ] + }, + { + "cell_type": "code", + "source": [ + "display(train_sample_prompt)" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 86 + }, + "id": "jNlOpGbqBgcu", + "outputId": "3f8106b2-948b-4a7b-cf78-bd3fcc2f0338" + }, + "execution_count": 51, + "outputs": [ + { + "output_type": "display_data", + "data": { + "text/plain": [ + "Dataset({\n", + " features: ['act', 'prompt', 'input_ids', 'attention_mask'],\n", + " num_rows: 50\n", + "})" + ] + }, + "metadata": {} + } + ] + }, + { + "cell_type": "code", + "execution_count": 52, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "dZcOaE5CU658", + "outputId": "fb8f5081-012b-4c37-ee1f-3aef2d0f54a7" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "{'act': ['Linux Terminal'], 'prompt': ['I want you to act as a linux terminal. I will type commands and you will reply with what the terminal should show. 
I want you to only reply with the terminal output inside one unique code block, and nothing else. do not write explanations. do not type commands unless I instruct you to do so. when i need to tell you something in english, i will do so by putting text inside curly brackets {like this}. my first command is pwd'], 'input_ids': [[44, 4026, 1152, 427, 1769, 661, 267, 104105, 28434, 17, 473, 2152, 4105, 49123, 530, 1152, 2152, 57502, 1002, 3595, 368, 28434, 3403, 6460, 17, 473, 4026, 1152, 427, 3804, 57502, 1002, 368, 28434, 10014, 14652, 2592, 19826, 4400, 10973, 15, 530, 16915, 4384, 17, 727, 1130, 11602, 184637, 17, 727, 1130, 4105, 49123, 35262, 473, 32247, 1152, 427, 727, 1427, 17, 3262, 707, 3423, 427, 13485, 1152, 7747, 361, 170205, 15, 707, 2152, 727, 1427, 1331, 55385, 5484, 14652, 6291, 999, 117805, 731, 29726, 1119, 96, 17, 2670, 3968, 9361, 632, 269, 42512]], 'attention_mask': [[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]}\n" + ] + } + ], + "source": [ + "print(train_sample_prompt[:1])" + ] + }, + { + "cell_type": "code", + "execution_count": 53, + "metadata": { + "id": "WeM66LmEVaTD" + }, + "outputs": [], + "source": [ + "dataset_sentences = load_dataset(\"Abirate/english_quotes\")\n", + "\n", + "data_sentences = dataset_sentences.map(lambda samples: tokenizer(samples[\"quote\"]), batched=True)\n", + "train_sample_sentences = data_sentences[\"train\"].select(range(25))\n", + "train_sample_sentences = train_sample_sentences.remove_columns(['author', 'tags'])" + ] + }, + { + "cell_type": "code", + "source": [ + "display(train_sample_sentences)" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 86 + }, + "id": "zUSG_M_nBp_E", + "outputId": "faf36464-de24-4512-aace-c1ff8713c1d4" + }, + "execution_count": 54, + "outputs": [ + { + "output_type": "display_data", + "data": { + "text/plain": [ + "Dataset({\n", + " features: ['quote', 'input_ids', 'attention_mask'],\n", + " num_rows: 25\n", + "})" + ] + }, + "metadata": {} + } + ] + }, + { + "cell_type": "markdown", + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "b97381d4-5fe2-49d0-be5d-2fe3421edc5c", + "showTitle": false, + "title": "" + }, + "id": "0-5mv1ZpVaTD" + }, + "source": [ + "## Fine-Tuning. 
\n", + "\n", + "### PEFT configurations\n", + "\n", + "\n", + "API docs:\n", + "https://huggingface.co/docs/peft/main/en/package_reference/tuners#peft.PromptTuningConfig\n", + "\n", + "We can use the same configuration for both models to be trained.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 55, + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "6df8e1f1-be9e-42db-b4a4-6af7cd351004", + "showTitle": false, + "title": "" + }, + "id": "sOg1Yh-oVaTD" + }, + "outputs": [], + "source": [ + "from peft import get_peft_model, PromptTuningConfig, TaskType, PromptTuningInit\n", + "\n", + "generation_config = PromptTuningConfig(\n", + " task_type=TaskType.CAUSAL_LM, #This type indicates the model will generate text.\n", + " prompt_tuning_init=PromptTuningInit.RANDOM, #The added virtual tokens are initializad with random numbers\n", + " num_virtual_tokens=NUM_VIRTUAL_TOKENS, #Number of virtual tokens to be added and trained.\n", + " tokenizer_name_or_path=model_name #The pre-trained model.\n", + ")\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "an9KBtB1VaTD" + }, + "source": [ + "### Creating two Prompt Tuning Models.\n", + "We will create two identical prompt tuning models using the same pre-trained model and the same config." + ] + }, + { + "cell_type": "code", + "execution_count": 56, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "c_D8oDQZVaTD", + "outputId": "6b46ca98-3f60-49c1-dab2-91259d6387af" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "trainable params: 4,096 || all params: 559,218,688 || trainable%: 0.0007324504863471229\n", + "None\n" + ] + } + ], + "source": [ + "peft_model_prompt = get_peft_model(foundational_model, generation_config)\n", + "print(peft_model_prompt.print_trainable_parameters())" + ] + }, + { + "cell_type": "code", + "execution_count": 57, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "IktYfj68VaTE", + "outputId": "28fe03b7-4490-43ba-b913-4633e269737a" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "trainable params: 4,096 || all params: 559,218,688 || trainable%: 0.0007324504863471229\n", + "None\n" + ] + } + ], + "source": [ + "peft_model_sentences = get_peft_model(foundational_model, generation_config)\n", + "print(peft_model_sentences.print_trainable_parameters())" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "cff5bc33-8cfb-4144-8962-9c54362a7faa", + "showTitle": false, + "title": "" + }, + "id": "i6WhJSUwVaTE" + }, + "source": [ + "**That's amazing: did you see the reduction in trainable parameters? We are going to train a 0.001% of the paramaters available.**\n", + "\n", + "Now we are going to create the training arguments, and we will use the same configuration in both trainings." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 58, + "metadata": { + "id": "SJoznfzjVaTE" + }, + "outputs": [], + "source": [ + "from transformers import TrainingArguments\n", + "def create_training_arguments(path, learning_rate=0.0035, epochs=6):\n", + " training_args = TrainingArguments(\n", + " output_dir=path, # Where the model predictions and checkpoints will be written\n", + " use_cpu=True, # This is necessary for CPU clusters.\n", + " auto_find_batch_size=True, # Find a suitable batch size that will fit into memory automatically\n", + " learning_rate=learning_rate, # Higher learning rate than full Fine-Tuning\n", + " num_train_epochs=epochs\n", + " )\n", + " return training_args" + ] + }, + { + "cell_type": "code", + "execution_count": 59, + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "54b78a8f-81f0-44c0-b0bc-dcb14891715f", + "showTitle": false, + "title": "" + }, + "id": "cb1j50DSVaTE" + }, + "outputs": [], + "source": [ + "\n", + "import os\n", + "\n", + "working_dir = \"./\"\n", + "\n", + "#It's best to store the models in separate folders.\n", + "#Create the names of the directories where the models will be stored.\n", + "output_directory_prompt = os.path.join(working_dir, \"peft_outputs_prompt\")\n", + "output_directory_sentences = os.path.join(working_dir, \"peft_outputs_sentences\")\n", + "\n", + "#Create the directories if they don't exist.\n", + "if not os.path.exists(working_dir):\n", + " os.mkdir(working_dir)\n", + "if not os.path.exists(output_directory_prompt):\n", + " os.mkdir(output_directory_prompt)\n", + "if not os.path.exists(output_directory_sentences):\n", + " os.mkdir(output_directory_sentences)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "OC5IhO9mVaTE" + }, + "source": [ + "We need to indicate the output directory for each model when creating its TrainingArguments." + ] + }, + { + "cell_type": "code", + "execution_count": 60, + "metadata": { + "id": "D4v4RSSeVaTE" + }, + "outputs": [], + "source": [ + "training_args_prompt = create_training_arguments(output_directory_prompt, 0.003, NUM_EPOCHS)\n", + "training_args_sentences = create_training_arguments(output_directory_sentences, 0.003, NUM_EPOCHS)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "c593deb6-5626-4fd9-89c2-2329e2f9b6e0", + "showTitle": false, + "title": "" + }, + "id": "GdMfjk5RVaTE" + }, + "source": [ + "## Train\n", + "\n", + "We will create one Trainer object for each model to train. 
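Before creating the Trainers, it's worth a quick look at the data collator we'll pass to them.\n", + "\n", + "With `mlm=False`, `DataCollatorForLanguageModeling` pads each batch and copies `input_ids` into `labels` (padding positions set to -100), which is exactly what causal language modeling needs. Here's an optional, hedged sketch of that behavior; it only reuses the `tokenizer` and `train_sample_prompt` objects from earlier cells:\n", + "\n", + "```python\n", + "from transformers import DataCollatorForLanguageModeling\n", + "\n", + "collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)\n", + "# Keep only the tokenized fields; the collator can't handle the raw text columns.\n", + "examples = [{k: train_sample_prompt[i][k] for k in (\"input_ids\", \"attention_mask\")} for i in range(2)]\n", + "batch = collator(examples)\n", + "print(batch[\"input_ids\"].shape, batch[\"labels\"].shape)  # labels mirror the padded inputs\n", + "```\n", + "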
" + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": { + "id": "uVAfNdEIVaTE" + }, + "outputs": [], + "source": [ + "from transformers import Trainer, DataCollatorForLanguageModeling\n", + "def create_trainer(model, training_args, train_dataset):\n", + " trainer = Trainer(\n", + " model=model, # We pass in the PEFT version of the foundation model, bloomz-560M\n", + " args=training_args, #The args for the training.\n", + " train_dataset=train_dataset, #The dataset used to tyrain the model.\n", + " data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False) # mlm=False indicates not to use masked language modeling\n", + " )\n", + " return trainer\n" + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "32e43bcf-23b2-46aa-9cf0-455b83ef4f38", + "showTitle": false, + "title": "" + }, + "colab": { + "base_uri": "https://localhost:8080/", + "height": 127 + }, + "id": "1Sz9BeFZVaTF", + "outputId": "1b698470-209e-4001-fcbe-6fa8a2ac8707" + }, + "outputs": [ + { + "output_type": "display_data", + "data": { + "text/plain": [ + "" + ], + "text/html": [ + "\n", + "
\n", + " \n", + " \n", + " [42/42 11:23, Epoch 6/6]\n", + "
\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
StepTraining Loss

" + ] + }, + "metadata": {} + }, + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "TrainOutput(global_step=42, training_loss=3.5800417945498513, metrics={'train_runtime': 703.2941, 'train_samples_per_second': 0.427, 'train_steps_per_second': 0.06, 'total_flos': 60957279240192.0, 'train_loss': 3.5800417945498513, 'epoch': 6.0})" + ] + }, + "metadata": {}, + "execution_count": 62 + } + ], + "source": [ + "#Training first model.\n", + "trainer_prompt = create_trainer(peft_model_prompt, training_args_prompt, train_sample_prompt)\n", + "trainer_prompt.train()" + ] + }, + { + "cell_type": "code", + "execution_count": 63, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 127 + }, + "id": "afTotMckVaTF", + "outputId": "15bed85d-17f5-4a49-d8d5-bae35e68d294" + }, + "outputs": [ + { + "output_type": "display_data", + "data": { + "text/plain": [ + "" + ], + "text/html": [ + "\n", + "

\n", + " \n", + " \n", + " [24/24 03:29, Epoch 6/6]\n", + "
\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
StepTraining Loss

" + ] + }, + "metadata": {} + }, + { + "output_type": "execute_result", + "data": { + "text/plain": [ + "TrainOutput(global_step=24, training_loss=4.4278310139973955, metrics={'train_runtime': 219.765, 'train_samples_per_second': 0.683, 'train_steps_per_second': 0.109, 'total_flos': 17825006936064.0, 'train_loss': 4.4278310139973955, 'epoch': 6.0})" + ] + }, + "metadata": {}, + "execution_count": 63 + } + ], + "source": [ + "#Training second model.\n", + "trainer_sentences = create_trainer(peft_model_sentences, training_args_sentences, train_sample_sentences)\n", + "trainer_sentences.train()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "z2Zsww_2VaTF" + }, + "source": [ + "In less than 10 minutes (CPU time in a M1 Pro) we trained 2 different models, with two different missions with a same foundational model as a base." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "5a6c8daf-8248-458a-9f6f-14865b4fbd2e", + "showTitle": false, + "title": "" + }, + "id": "s5k10HwoVaTG" + }, + "source": [ + "## Save models\n", + "We are going to save the models. These models are ready to be used, as long as we have the pre-trained model from which they were created in memory." + ] + }, + { + "cell_type": "code", + "execution_count": 64, + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "409df5ce-e496-46d7-be2c-202a463cdc80", + "showTitle": false, + "title": "" + }, + "id": "E3dn3PeMVaTG" + }, + "outputs": [], + "source": [ + "trainer_prompt.model.save_pretrained(output_directory_prompt)\n", + "trainer_sentences.model.save_pretrained(output_directory_sentences)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "fb14e3fd-bbf6-4d56-92c2-51bfe08de72a", + "showTitle": false, + "title": "" + }, + "id": "rkUKpDDWVaTG" + }, + "source": [ + "## Inference\n", + "\n", + "You can load the model from the path that you have saved to before, and ask the model to generate text based on our input before!" + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "cc48af16-c117-4019-a31a-ce1c93cd21d4", + "showTitle": false, + "title": "" + }, + "id": "dlqXXN8oVaTG" + }, + "outputs": [], + "source": [ + "from peft import PeftModel\n", + "\n", + "loaded_model_prompt = PeftModel.from_pretrained(foundational_model,\n", + " output_directory_prompt,\n", + " #device_map='auto',\n", + " is_trainable=False)" + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": { + "application/vnd.databricks.v1+cell": { + "cellMetadata": {}, + "inputWidgets": {}, + "nuid": "6b44524b-2ac5-4e74-81e6-c406d4414e42", + "showTitle": false, + "title": "" + }, + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "-4jd3zCGVaTG", + "outputId": "b55454f1-f1ed-444c-b107-698778406e6e" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "['I want you to act as a motivational coach. 
You will be helping students learn how they can improve their performance in the classroom and at school.']\n" + ] + } + ], + "source": [ + "loaded_model_prompt_outputs = get_outputs(loaded_model_prompt, input_prompt)\n", + "print(tokenizer.batch_decode(loaded_model_prompt_outputs, skip_special_tokens=True))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SHbeFTXjVaTG" + }, + "source": [ + "If we compare both answers, something has changed.\n", + "* ***Pretrained Model:*** *I want you to act as a motivational coach. Don't be afraid of being challenged.*\n", + "* ***Fine-Tuned Model:*** *I want you to act as a motivational coach. You will be helping students learn how they can improve their performance in the classroom and at school.*\n", + "\n", + "We have to keep in mind that we only trained the model for a few minutes, but that was enough to obtain a response closer to what we were looking for." + ] + }, + { + "cell_type": "code", + "execution_count": 67, + "metadata": { + "id": "MuwAsq3uVaTG" + }, + "outputs": [], + "source": [ + "loaded_model_prompt.load_adapter(output_directory_sentences, adapter_name=\"quotes\")\n", + "loaded_model_prompt.set_adapter(\"quotes\")" + ] + }, + { + "cell_type": "code", + "execution_count": 70, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "IQm--PWSVaTH", + "outputId": "3e814a6a-a380-4f2c-f887-6852a9f51002" + }, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "['There are two nice things that should matter to you: the weather and your health.']\n" + ] + } + ], + "source": [ + "loaded_model_sentences_outputs = get_outputs(loaded_model_prompt, input_sentences)\n", + "print(tokenizer.batch_decode(loaded_model_sentences_outputs, skip_special_tokens=True))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UnR8y9gwVaTH" + }, + "source": [ + "With the second model we have a similar result.\n", + "* **Pretrained Model:** *There are two nice things that should matter to you: the price and quality of your product.*\n", + "* **Fine-Tuned Model:** *There are two nice things that should matter to you: the weather and your health.*\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "B6TUjNtGVaTH" + }, + "source": [ + "# Conclusion\n", + "Prompt Tuning is an amazing technique that can save us hours of training and a significant amount of money. In this notebook, we trained two models in just a few minutes, and we can keep both in memory, serving different clients.\n", + "\n", + "If you want to try different combinations and models, the notebook is ready to use another model from the Bloom family.\n", + "\n", + "You can change the number of training epochs, the number of virtual tokens, and the model in the third cell. However, there are many more configurations to experiment with. If you're looking for a good exercise, try replacing the random initialization of the virtual tokens with a fixed value.\n", + "\n", + "*The responses of the Fine-Tuned models may vary every time we train them. 
I've pasted the results of one of my training runs, but the actual results may differ.*" + ] + } + ], + "metadata": { + "application/vnd.databricks.v1+notebook": { + "dashboards": [], + "language": "python", + "notebookMetadata": { + "pythonIndentUnit": 2 + }, + "notebookName": "LLM 02 - Prompt Tuning with PEFT", + "widgets": {} + }, + "colab": { + "machine_shape": "hm", + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.4" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +}