Skip to content

Peterpetu/Large-Language-Model-Fine-Tuning-with-QLora

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

QLora Fine-Tuning for Large Language Models

Introduction

This project is exploration in the field of Natural Language Processing (NLP), focusing on the fine-tuning of a pre-trained language model using the Quantized Low Rank Adapter (QLoRA) technique. QLoRA represents a recent innovation in NLP, offering a efficient methodology to adapt and enhance large-scale language models while prioritizing computational and memory optimization. The project integrates QLoRA into an existing Finnish open-source language model, previously not optimized for instruction-based tasks. By employing low-rank matrix approximation and weight quantization, the project adapts the model for instruction finetuning, enhancing its task-specific performance capabilities. As an experimental prototype, this project serves as a scalable foundation, demonstrating the potential for evolving into a full-fledged finetuning project with larger parameter sizes. The primary objective is to showcase the effectiveness of QLoRA in strategically refining a language model for specific instructions, providing valuable insights into balancing model adaptability, computational efficiency, and performance integrity in the rapidly evolving field of large-scale NLP models.

QLoRA Fine-Tuning Process

The QLoRA fine-tuning process in this project is a detailed sequence of steps, crafted to augment a pre-trained language model's capabilities with a focus on efficient computational resource management.

Steps in the QLoRA Fine-Tuning Process:

  1. Model and Tokenizer Initialization: This phase involves initializing a pre-trained model and its tokenizer from Hugging Face's model hub. This baseline model possesses an inherent understanding of language structures and patterns.

  2. Adaptation for QLoRA: The model is configured for quantized training, a hallmark of QLoRA, which reduces the model's size and memory usage without significantly impacting learning capacity. Simultaneously, parameters not involved in fine-tuning are frozen to decrease computational load.

  3. Low Rank Adapter Integration: This crucial phase involves fine-tuning parameters such as rank, alpha (lora_alpha), target modules, dropout rate, and bias for the adapter, aligning them with the specific NLP task. The adapter is then integrated into the model for targeted learning.

  4. Optimized Training and Evaluation: The reduced model size facilitates efficient training. Performance metrics are closely monitored, and post-training, the model undergoes a thorough evaluation against predefined benchmarks to verify the fine-tuning's effectiveness.

Dataset Preparation and Analysis

The dataset_tokenization_analysis.py script is integral to analyzing the dataset's tokenization process. This analysis ensures optimal dataset compatibility with the model's training process.

Model Analysis and Testing

  1. Fine-Tuned Model Analysis: The finetuned_model_analysis.py script compares the original and fine-tuned models.
  2. Model Prompt Testing: The model_prompt_test.py script tests the model's language generation capabilities with various prompts.

Workflow Summary

The project encompasses:

  • Data preprocessing (dataset_tokenization_analysis.py)
  • Model training (train_model.py)
  • Model analysis (finetuned_model_analysis.py)
  • Model testing (model_prompt_test.py)

About

Quantized Low Rank Adaptor Fine-Tuning for Large Language Model

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages