
[WIP] [GenAI] Lora Finetune #7288

Open
LittleLittleCloud opened this issue Nov 6, 2024 · 0 comments
Labels
enhancement (New feature or request) · untriaged (New issue has not been triaged)

Comments

@LittleLittleCloud
Contributor

LittleLittleCloud commented Nov 6, 2024

LoRA fine-tuning is an adapter-based technique for fine-tuning an LLM. It changes the model architecture by adding small, learnable LoRA layers to the transformer. During fine-tuning, only the LoRA weights are updated while the base LLM weights stay frozen, so it requires much less GPU memory compared to full-layer fine-tuning. Based on this table, fine-tuning a 7B model in 16-bit precision takes about 16 GB of memory, which fits on an RTX 3090, 4080, or 4090. A wider range of GPUs can handle a 3.8B LLM like Phi-3.5-mini.
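To make the memory argument concrete, here is a minimal numpy sketch of the LoRA idea: the frozen base weight `W` is augmented by a low-rank update `B @ A`, and only `A` and `B` would receive gradients. The dimensions (a 4096×4096 projection, rank 8, alpha 16) are illustrative assumptions, not values from this issue:

```python
import numpy as np

# Illustrative dimensions (assumptions): one 4096x4096 projection, LoRA rank r = 8.
d, r, alpha = 4096, 8, 16

rng = np.random.default_rng(0)
W = rng.standard_normal((d, d)).astype(np.float32)  # frozen base weight
A = rng.standard_normal((r, d)).astype(np.float32)  # trainable LoRA "down" matrix
B = np.zeros((d, r), dtype=np.float32)              # trainable LoRA "up" matrix, zero-init

# Effective weight during fine-tuning: only A and B are trainable.
W_effective = W + (alpha / r) * (B @ A)

full_params = W.size           # 16,777,216
lora_params = A.size + B.size  # 65,536
print(f"trainable fraction: {lora_params / full_params:.4%}")
```

With these numbers, the adapter trains under 0.4% of the layer's parameters, which is why optimizer state and gradient memory shrink so dramatically.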

API design (wip)

Package: Microsoft.ML.GenAI.Lora

```csharp
interface ICausalLMLoraPipeline {} // pipeline for loading causal LM + lora layers

class LoraConfiguration {} // lora configuration
```
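Since the C# API is still WIP, here is a hypothetical numpy sketch of what a LoRA-wrapped linear layer does under the hood: the forward pass adds a scaled low-rank path to the frozen base projection, and after training the adapter can be merged back into a single weight. All names (`LoraLinear`, `merge`) and hyperparameters are illustrative assumptions, not the proposed `Microsoft.ML.GenAI.Lora` surface:

```python
import numpy as np

class LoraLinear:
    """Minimal LoRA-adapted linear layer (hypothetical sketch, numpy-only)."""

    def __init__(self, weight: np.ndarray, r: int = 8, alpha: float = 16.0, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.weight = weight  # frozen, shape (out_features, in_features)
        self.scale = alpha / r
        out_f, in_f = weight.shape
        self.A = rng.standard_normal((r, in_f)).astype(weight.dtype) * 0.01  # trainable
        self.B = np.zeros((out_f, r), dtype=weight.dtype)                    # trainable

    def forward(self, x: np.ndarray) -> np.ndarray:
        # Base path uses the frozen weight; the adapter path adds the low-rank update.
        return x @ self.weight.T + self.scale * ((x @ self.A.T) @ self.B.T)

    def merge(self) -> np.ndarray:
        # After fine-tuning, the adapter folds back into one weight for inference.
        return self.weight + self.scale * (self.B @ self.A)

base = np.random.default_rng(1).standard_normal((6, 4)).astype(np.float32)
layer = LoraLinear(base)
x = np.ones((2, 4), dtype=np.float32)
y_adapter = layer.forward(x)          # base path + adapter path
y_merged = x @ layer.merge().T        # single merged weight, same result
```

A pipeline like `ICausalLMLoraPipeline` would presumably apply this wrapping to selected transformer projections, with `LoraConfiguration` carrying choices like rank, alpha, and target modules.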