update llama.cpp docs
tybalex committed Jul 2, 2024
1 parent 1b03efd commit 76b1488
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions docs/docs/inference/llamacpp.mdx
@@ -52,6 +52,14 @@ For example:
wget https://huggingface.co/rubra-ai/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/rubra-meta-llama-3-8b-instruct.Q8_0.gguf
```

:::info
For large multi-part model files, such as [rubra-meta-llama-3-70b-instruct_Q6_K-0000*-of-00003.gguf](https://huggingface.co/rubra-ai/Meta-Llama-3-70B-Instruct-GGUF/tree/main), use the following command to merge them before proceeding to the next step:
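```
./llama-gguf-split --merge rubra-meta-llama-3-70b-instruct_Q6_K-00001-of-00003.gguf rubra-meta-llama-3-70b-instruct_Q6_K.gguf
```
Pass only the first part as input: `llama-gguf-split --merge` takes one input file and one output file, and it locates the remaining parts from the first one. The command merges all parts into a single GGUF file, `rubra-meta-llama-3-70b-instruct_Q6_K.gguf`.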
:::
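
Before merging, it can help to confirm that every part finished downloading. The sketch below is a hypothetical pre-check, not part of the rubra or llama.cpp tooling: `missing_parts` is an illustrative helper name, the part count of 3 and the `-0000N-of-00003.gguf` naming are taken from the example above, and it assumes `llama-gguf-split` sits in the current directory and accepts the first part as its input.

```bash
#!/usr/bin/env bash
# Hypothetical pre-check: make sure every part is present before merging.
# prefix and total are taken from the example file names above.
prefix="rubra-meta-llama-3-70b-instruct_Q6_K"
total=3

# Print each missing part and return non-zero if any part is absent.
# (missing_parts is an illustrative helper, not a llama.cpp command.)
missing_parts() {
  local prefix="$1" total="$2" missing=0 i part
  for i in $(seq 1 "$total"); do
    part=$(printf '%s-%05d-of-%05d.gguf' "$prefix" "$i" "$total")
    if [ ! -f "$part" ]; then
      echo "missing: $part"
      missing=1
    fi
  done
  return "$missing"
}

if missing_parts "$prefix" "$total"; then
  # Assumption: the tool takes the first part as input and finds the rest itself.
  first=$(printf '%s-%05d-of-%05d.gguf' "$prefix" 1 "$total")
  ./llama-gguf-split --merge "$first" "${prefix}.gguf"
else
  echo "Download the missing parts before merging." >&2
fi
```

If any part is absent, the script lists it and skips the merge, so a partial download does not lead to a failed or incomplete merge attempt.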

### 5. Start the OpenAI Compatible Server

```bash