diff --git a/README.md b/README.md
index a2f1a647..386e4df5 100644
--- a/README.md
+++ b/README.md
@@ -117,6 +117,103 @@ mlora_cli
 # and enjoy it!!
 ```
+
+## Step-by-step
+
+### Step 1. Download the mlora image and install the mlora_cli
+```bash
+docker pull yezhengmaolove/mlora:latest
+pip install mlora-cli
+```
+[![asciicast](https://asciinema.org/a/TfYrIsXgfZeMxPRrzOkWZ7T4b.svg)](https://asciinema.org/a/TfYrIsXgfZeMxPRrzOkWZ7T4b)
+
+### Step 2. Start the mlora server with Docker
+```bash
+# first, we create a cache dir on the host to cache some files
+mkdir ~/cache
+# second, we manually download the model weights from Hugging Face
+mkdir ~/model && cd ~/model
+git clone https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0
+# we map port 8000 used by the mlora server to port 1288 on the host machine.
+# the BASE_MODEL environment variable indicates the path of the base model used by mlora.
+# the STORAGE_DIR environment variable indicates the path where datasets and lora adapters are stored.
+# we use the script /opt/deploy.sh in the container to start the server.
+docker run -itd --runtime nvidia --gpus all \
+    -v ~/cache:/cache \
+    -v ~/model:/model \
+    -p 1288:8000 \
+    --name mlora_server \
+    -e "BASE_MODEL=/model/TinyLlama-1.1B-Chat-v1.0" \
+    -e "STORAGE_DIR=/cache" \
+    yezhengmaolove/mlora:latest /bin/bash /opt/deploy.sh
+```
+[![asciicast](https://asciinema.org/a/LrLH0jU176NQNfawHpCITaLGx.svg)](https://asciinema.org/a/LrLH0jU176NQNfawHpCITaLGx)
+
+### Step 3. Use the mlora_cli tool to connect to the mlora server
+We point mlora_cli at the server http://127.0.0.1:1288 (the http protocol must be used):
+```bash
+(mLoRA) set port 1288
+(mLoRA) set host http://127.0.0.1
+```
+[![asciicast](https://asciinema.org/a/GN1NBc2MEN8GrmcIasmIMDjNa.svg)](https://asciinema.org/a/GN1NBc2MEN8GrmcIasmIMDjNa)
+
+### Step 4. Upload a data file for training
+We use the Stanford Alpaca dataset as a demo; the data looks like this:
+```json
+[{"instruction": "...", "input": "...", "output": "..."}, {...}]
+```
+```bash
+(mLoRA) file upload
+? file type: train data
+? name: alpaca
+? file path: /home/yezhengmao/alpaca-lora/alpaca_data.json
+```
+[![asciicast](https://asciinema.org/a/KN41mnlMShZWDs3dIrd64L4nS.svg)](https://asciinema.org/a/KN41mnlMShZWDs3dIrd64L4nS)
+
+### Step 5. Upload a template that provides a structured format for generating prompts
+The template is a YAML file written with the Jinja2 templating language; see the demo/prompt.yaml file.
+
+The data file you upload can be regarded as an array whose elements are dictionaries; each element is treated as one data point when the template is rendered.
+```bash
+(mLoRA) file upload
+? file type: prompt template
+? name: simple_prompt
+? file path: /home/yezhengmao/mLoRA/demo/prompt.yaml
+```
+[![asciicast](https://asciinema.org/a/SFY8H0K4DppVvqmQCVuOuThoz.svg)](https://asciinema.org/a/SFY8H0K4DppVvqmQCVuOuThoz)
+
+### Step 6. Create a dataset
+We now create a dataset, which consists of a data file, a template, and the corresponding prompter.
+We can use the `dataset showcase` command to check whether the prompts are generated correctly (see the sketch after this step for what rendering a data point means).
+```bash
+(mLoRA) dataset create
+? name: alpaca_dataset
+? train data file: alpaca
+? prompt template file: simple_prompt
+? prompter: instruction
+? data preprocessing: default
+(mLoRA) dataset showcase
+? dataset name: alpaca_dataset
+```
+[![asciicast](https://asciinema.org/a/mxpwo6gWihjEEsJ0dfcXG98cM.svg)](https://asciinema.org/a/mxpwo6gWihjEEsJ0dfcXG98cM)
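+
+To make "rendering a data point" concrete, here is a minimal, purely illustrative sketch: it pushes one Alpaca-style data point (the keys from Step 4) through a hypothetical Jinja2 template. This is not mLoRA's actual prompter and not the real layout of demo/prompt.yaml; it only shows the kind of text you should expect to see from `dataset showcase`.
+```python
+# illustrative sketch only: requires the jinja2 package (pip install jinja2);
+# the real template lives in demo/prompt.yaml and may use a different layout.
+from jinja2 import Template
+
+# a hypothetical prompt template over the Alpaca-style keys
+template = Template(
+    "### Instruction:\n{{ instruction }}\n\n"
+    "### Input:\n{{ input }}\n\n"
+    "### Response:\n{{ output }}"
+)
+
+# one element of the uploaded JSON array == one data point
+data_point = {
+    "instruction": "Translate the input to French.",
+    "input": "Hello",
+    "output": "Bonjour",
+}
+
+print(template.render(**data_point))
+```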
+
+### Step 7. Create an adapter
+Now we can use the `adapter create` command to create an adapter for training.
+
+[![asciicast](https://asciinema.org/a/Wf4PHfoGC0PCcciGHOXCkX0xj.svg)](https://asciinema.org/a/Wf4PHfoGC0PCcciGHOXCkX0xj)
+
+### Step 8. Submit the task to train!
+Finally, we can submit the task to train our adapter using the defined dataset.
+NOTE: you can continuously submit or terminate training tasks.
+Use `adapter ls` or `task ls` to check the tasks' status.
+
+[![asciicast](https://asciinema.org/a/vr8f1XtA0CBULGIP81w2bakkE.svg)](https://asciinema.org/a/vr8f1XtA0CBULGIP81w2bakkE)
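+
+While tasks are running, you can also follow the server-side output from the host by tailing the container logs with standard Docker commands (this assumes the deploy script from Step 2 keeps the server in the container's foreground, so its output lands in the container log):
+```bash
+# follow the mlora server output; mlora_server is the container name from Step 2
+docker logs -f mlora_server
+```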
+
 
 ## Why you should use mLoRA
 
 Using mLoRA can save significant computational and memory resources when training multiple adapters simultaneously.