mLoRA

An Efficient "Factory" to Build Multiple LoRA Adapters

mLoRA (a.k.a. Multi-LoRA Fine-Tune) is an open-source framework for efficient fine-tuning of multiple Large Language Models (LLMs) using LoRA and its variants. Key features of mLoRA include:

Concurrent fine-tuning of multiple LoRA adapters.
Shared base model among multiple LoRA adapters.
Efficient pipeline parallelism algorithm.
Support for multiple LoRA variant algorithms and various base models.
Support for multiple reinforcement learning preference alignment algorithms.
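
To make the first two features concrete, here is a minimal pure-Python sketch of several LoRA adapters sharing one frozen base weight. It follows the standard LoRA formulation (base output plus a scaled low-rank update) and is not mLoRA's actual implementation; every name in it (LoRAAdapter, forward, the task names) is hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

class LoRAAdapter:
    """Hypothetical adapter holding a low-rank update scaled by alpha / r."""

    def __init__(self, d_in, d_out, r, alpha, rng):
        self.A = rand_matrix(d_in, r, rng)           # down-projection, random init
        self.B = [[0.0] * d_out for _ in range(r)]   # up-projection, zero init
        self.scaling = alpha / r

    def delta(self, x):
        """Low-rank contribution: scaling * (x @ A @ B)."""
        update = matmul(matmul(x, self.A), self.B)
        return [[v * self.scaling for v in row] for row in update]

def forward(batches, base_weight, adapters):
    """Run each adapter's batch through the single shared base weight."""
    outputs = {}
    for name, x in batches.items():
        base_out = matmul(x, base_weight)  # frozen weight, reused by every adapter
        outputs[name] = add(base_out, adapters[name].delta(x))
    return outputs

rng = random.Random(0)
d_in, d_out = 4, 3
base_weight = rand_matrix(d_in, d_out, rng)  # stored once, never copied per adapter
adapters = {
    "task_a": LoRAAdapter(d_in, d_out, r=2, alpha=4, rng=rng),
    "task_b": LoRAAdapter(d_in, d_out, r=2, alpha=4, rng=rng),
}
batches = {
    "task_a": rand_matrix(2, d_in, rng),
    "task_b": rand_matrix(1, d_in, rng),
}
outputs = forward(batches, base_weight, adapters)
```

Because each B starts at zero, every adapter initially reproduces the base model's output; training would then change only A and B for that adapter while the base weight stays untouched.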

Quickstart

First, clone this repository and install the dependencies:

# Clone Repository
git clone https://github.com/TUDB-Labs/mLoRA
cd mLoRA
# Install requirements (requires Python >= 3.12)
pip install .

The mlora.py script is the starting point for batch fine-tuning LoRA adapters:

python mlora.py \
  --base_model TinyLlama/TinyLlama-1.1B-Chat-v0.4 \
  --config demo/lora/lora_case_1.yaml
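
If you prefer to launch the same command from a Python script, a minimal sketch could look like the following. It only assembles the two flags shown above; the helper name build_mlora_command is hypothetical, and the subprocess call is left commented out because it assumes you run it from the repository root.

```python
import shlex
import sys

def build_mlora_command(base_model: str, config: str) -> list[str]:
    """Assemble the argument list for mlora.py using the flags shown above."""
    return [
        sys.executable, "mlora.py",
        "--base_model", base_model,
        "--config", config,
    ]

cmd = build_mlora_command(
    "TinyLlama/TinyLlama-1.1B-Chat-v0.4",
    "demo/lora/lora_case_1.yaml",
)
print(shlex.join(cmd))

# To actually start fine-tuning (assumes the current directory is the mLoRA checkout):
# import subprocess
# subprocess.run(cmd, check=True)
```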

The demo folder contains the adapter configurations, including examples that use different LoRA variants and reinforcement learning preference alignment algorithms.

For detailed usage information, use the --help option:

python mlora.py --help

Deploy as service

mLoRA can be deployed as a service that continuously receives user requests and performs fine-tuning tasks.

# Install the deployment requirements
pip install .[deploy]
# Start the server
python mlora_server.py \
  --base_model /data/TinyLlama-1.1B-Chat-v1.0/ \
  --root /tmp/mlora

For detailed usage information, use the --help option:

python mlora_server.py --help

Once the service is deployed, use mlora_cli.py to interact with the server.

python mlora_cli.py

Why you should use mLoRA

Using mLoRA can save significant computational and memory resources when training multiple adapters simultaneously.
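
A rough back-of-the-envelope calculation shows where the memory saving comes from. The numbers below are illustrative assumptions, not measurements from this project: a 7-billion-parameter base model in fp32, four adapters, and rank-8 LoRA applied to four 4096x4096 projections in each of 32 layers. Only weights are counted; activations, gradients, and optimizer state are ignored.

```python
GIB = 2**30

def weights_gib(num_params: int, bytes_per_param: int = 4) -> float:
    """Memory needed to store the weights, in GiB (fp32 = 4 bytes per parameter)."""
    return num_params * bytes_per_param / GIB

def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """A rank-r adapter adds a (d_in x r) and an (r x d_out) matrix."""
    return rank * (d_in + d_out)

# Assumptions for illustration only.
base_params = 7_000_000_000
num_adapters = 4
adapter_params = 32 * 4 * lora_param_count(4096, 4096, rank=8)

# One full base model copy per adapter versus a single shared base model.
separate = num_adapters * weights_gib(base_params + adapter_params)
shared = weights_gib(base_params + num_adapters * adapter_params)

print(f"parameters per adapter: {adapter_params:,}")
print(f"separate base copies:   {separate:.1f} GiB")
print(f"shared base model:      {shared:.1f} GiB")
```

Under these assumptions each adapter is a tiny fraction of the base model, so sharing the base weights makes the weight memory nearly independent of the number of adapters.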

High performance on consumer hardware

We fine-tuned multiple LoRA adapters on four A6000 graphics cards in fp32 precision, without checkpointing or any quantization techniques:

Model	mLoRA (tokens/s)	PEFT-LoRA with FSDP (tokens/s)	PEFT-LoRA with TP (tokens/s)
llama-2-7b (fp32)	2364	1750	1500
llama-2-13b (fp32)	1280	OOM	875
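
The relative speedups implied by the table can be computed directly. The short sketch below uses only the figures above and treats the OOM entry as missing; the function name is hypothetical.

```python
# Throughput in tokens/s, copied from the table above (None marks the OOM entry).
throughput = {
    "llama-2-7b": {"mLoRA": 2364, "PEFT-LoRA with FSDP": 1750, "PEFT-LoRA with TP": 1500},
    "llama-2-13b": {"mLoRA": 1280, "PEFT-LoRA with FSDP": None, "PEFT-LoRA with TP": 875},
}

def speedup_over_baselines(row: dict, ours: str = "mLoRA") -> dict:
    """Ratio of our throughput to each baseline that produced a result."""
    return {
        name: row[ours] / value
        for name, value in row.items()
        if name != ours and value is not None
    }

speedups = {model: speedup_over_baselines(row) for model, row in throughput.items()}

for model, ratios in speedups.items():
    for baseline, ratio in ratios.items():
        print(f"{model}: {ratio:.2f}x vs {baseline}")
```

For these runs, mLoRA is about 1.35x faster than FSDP and 1.58x faster than TP on llama-2-7b, and about 1.46x faster than TP on llama-2-13b.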

Supported models

	Model
✓	LLaMA

Supported LoRA variants

	Variant
✓	QLoRA
✓	LoRA+
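
As background on one of these variants: the core idea of LoRA+ is to train the two low-rank matrices with different learning rates, with the B matrix using a larger rate than the A matrix. The sketch below illustrates that idea with optimizer-style parameter groups (a list of dicts with "params" and "lr" keys, the shape PyTorch optimizers accept). It is not how mLoRA configures LoRA+; the parameter names, the "lora_B" naming convention, and the ratio of 16 are assumptions for illustration.

```python
def loraplus_param_groups(named_params, base_lr: float, lr_ratio: float):
    """Split parameters into two groups so B matrices train with a larger rate.

    Assumes (hypothetically) that B-matrix names contain the substring "lora_B".
    """
    group_a, group_b = [], []
    for name, param in named_params:
        if "lora_B" in name:
            group_b.append(param)
        else:
            group_a.append(param)
    return [
        {"params": group_a, "lr": base_lr},
        {"params": group_b, "lr": base_lr * lr_ratio},
    ]

# Placeholder objects stand in for real tensors in this sketch.
named_params = [
    ("layers.0.q_proj.lora_A", "A0"),
    ("layers.0.q_proj.lora_B", "B0"),
    ("layers.1.q_proj.lora_A", "A1"),
    ("layers.1.q_proj.lora_B", "B1"),
]
groups = loraplus_param_groups(named_params, base_lr=1e-4, lr_ratio=16.0)
for group in groups:
    print(group["lr"], group["params"])
```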

Supported preference alignment algorithms

	Algorithm
✓	DPO
✓	CPO
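
For readers unfamiliar with DPO, the following sketch computes the standard DPO loss for a single preference pair from sequence log-probabilities. It illustrates the published objective, not mLoRA's code; the function and argument names and the default beta of 0.1 are assumptions. CPO is not shown here.

```python
import math

def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response under the
    trainable policy or the frozen reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) is the same as softplus(-margin).
    return softplus(-margin)

# When the policy equals the reference, the margin is zero and the loss is log(2).
at_init = dpo_loss(-12.0, -15.0, -12.0, -15.0)
# Raising the chosen response's log-probability lowers the loss.
improved = dpo_loss(-10.0, -15.0, -12.0, -15.0)
print(at_init, improved)
```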

Document

Help Document [TODO]
Design Document

Contributing

We welcome contributions to improve this repository! Please review the contribution guidelines before submitting pull requests or issues.

Fork the repository. Create a new branch for your feature or fix. Submit a pull request with a detailed explanation of your changes.

You can use the pre-commit hook to check your code before each commit:

# Install requirements
pip install .[ci_test]
ln -s ../../.github/workflows/pre-commit .git/hooks/pre-commit

Alternatively, run the script directly to check your code:

.github/workflows/pre-commit

Citation

Please cite this repository if you use its code.

@misc{m-LoRA,
  author = {Zhengmao, Ye\textsuperscript{*} and Dengchun, Li\textsuperscript{*} and Jingqi, Tian and Tingfeng, Lan and Yanbo, Liang and Yexi, Jiang and Jie, Zuo and Hui, Lu and Lei, Duan and Mingjie, Tang},
  title = {m-LoRA: Efficient LLM Model Fine-tune and Inference via Multi-Lora Optimization},
  year = {2023},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/TUDB-Labs/mLoRA}},
  note={\textsuperscript{*}: these authors contributed equally to this work.}
}

Copyright

This project is licensed under the Apache 2.0 License.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
