From 779a72a9661fb7e1a1de362c0fd8e641d1b04e14 Mon Sep 17 00:00:00 2001
From: Matt Rubens
Date: Sat, 22 Feb 2025 19:32:02 -0500
Subject: [PATCH] Dedupe local model instructions

---
 docs/advanced-usage/local-models.md | 106 ++--------------------------
 1 file changed, 5 insertions(+), 101 deletions(-)

diff --git a/docs/advanced-usage/local-models.md b/docs/advanced-usage/local-models.md
index 06ee06c4..11370eb5 100644
--- a/docs/advanced-usage/local-models.md
+++ b/docs/advanced-usage/local-models.md
@@ -21,113 +21,17 @@ Roo Code currently supports two main local model providers:
 1. **Ollama:** A popular open-source tool for running large language models locally. It supports a wide range of models.
 2. **LM Studio:** A user-friendly desktop application that simplifies the process of downloading, configuring, and running local models. It also provides a local server that emulates the OpenAI API.
 
-## Setting Up Ollama
+## Setting Up Local Models
 
-1. **Download and Install Ollama:** Download the Ollama installer for your operating system from the [Ollama website](https://ollama.com/). Follow the installation instructions. Make sure Ollama is running
+For detailed setup instructions, see:
 
-   ```bash
-   ollama serve
-   ```
+* [Setting up Ollama](../providers/ollama)
+* [Setting up LM Studio](../providers/lmstudio)
 
-2. **Download a Model:** Ollama supports many different models. You can find a list of available models on the [Ollama website](https://ollama.com/library). Some recommended models for coding tasks include:
-
-   * `codellama:7b-code` (good starting point, smaller)
-   * `codellama:13b-code` (better quality, larger)
-   * `codellama:34b-code` (even better quality, very large)
-   * `qwen2.5-coder:32b`
-   * `mistralai/Mistral-7B-Instruct-v0.1` (good general-purpose model)
-   * `deepseek-coder:6.7b-base` (good for coding tasks)
-   * `llama3:8b-instruct-q5_1` (good for general tasks)
-
-   To download a model, open your terminal and run:
-
-   ```bash
-   ollama pull
-   ```
-
-   For example:
-
-   ```bash
-   ollama pull qwen2.5-coder:32b
-   ```
-
-3. **Configure the Model:** by default, Ollama uses a context window size of 2048 tokens, which is too small for Roo Code requests. You need to have at least 12k to get decent results, ideally
-   32k. To configure a model, you actually need to set its parameters and save a copy of it.
-
-   ##### Using Ollama runtime
-   Load the model (we will use `qwen2.5-coder:32b` as an example):
-
-   ```bash
-   ollama run qwen2.5-coder:32b
-   ```
-
-   Change context size parameter:
-
-   ```bash
-   /set parameter num_ctx 32768
-   ```
-
-   Save the model with a new name:
-
-   ```bash
-   /save your_model_name
-   ```
-   ##### Using Ollama command line
-   Alternatively, you can write all your settings into a text file and generate the model in the command-line.
-
-   Create a text file with model settings, and save it (~/qwen2.5-coder-32k.txt). Here we've only used the `num_ctx` parameter, but you could include more parameters on the next line using the `PARAMETER name value` syntax.
-
-   ```text
-   FROM qwen2.5-coder:32b
-   # sets the context window size to 32768, this controls how many tokens the LLM can use as context to generate the next token
-   PARAMETER num_ctx 32768
-   ```
-   Change directory to the `.ollama/models` directory. On most Macs, thats `~/.ollama/models` by default (`%HOMEPATH%\.ollama` on Windows).
-
-   ```bash
-   cd ~/.ollama/models
-   ```
-
-   Create your model from the settings text file you created.
-   The syntax is `ollama create (name of the model you want to see) -f (text file with settings)`
-
-   ```bash
-   ollama create qwen2.5-coder-32k -f ~/qwen2.5-coder-32k.txt
-   ```
-
-
-4. **Configure Roo Code:**
-   * Open the Roo Code sidebar ( icon).
-   * Click the settings gear icon ().
-   * Select "ollama" as the API Provider.
-   * Enter the Model name from the previous step (e.g., `your_model_name`) or choose it from the radio button list that should appear below `Model ID` if Ollama is currently running.
-   * (Optional) You can configure the base URL if you're running Ollama on a different machine. The default is `http://localhost:11434`.
-   * (Optional) Configure Model context size in Advanced settings, so Roo Code knows how to manage its sliding window.
-
-## Setting Up LM Studio
-
-1. **Download and Install LM Studio:** Download LM Studio from the [LM Studio website](https://lmstudio.ai/).
-2. **Download a Model:** Use the LM Studio interface to search for and download a model. Some recommended models include those listed above for Ollama. Look for models in the GGUF format.
-3. **Start the Local Server:**
-   * In LM Studio, click the **"Local Server"** tab (the icon looks like `<->`).
-   * Select your downloaded model.
-   * Click **"Start Server"**.
-4. **Configure Roo Code:**
-   * Open the Roo Code sidebar ( icon).
-   * Click the settings gear icon ().
-   * Select "lmstudio" as the API Provider.
-   * Enter the Model ID. This should be the name of the model file you loaded in LM Studio (e.g., `codellama-7b.Q4_0.gguf`). LM Studio shows a list of "Currently loaded models" in its UI.
-   * (Optional) You can configure the base URL if you're running LM Studio on a different machine. The default is `http://localhost:1234`.
+Both providers offer similar capabilities but with different user interfaces and workflows. Ollama provides more control through its command-line interface, while LM Studio offers a more user-friendly graphical interface.
 
 ## Troubleshooting
 
-* **"Please check the LM Studio developer logs to debug what went wrong":** This error usually indicates a problem with the model or its configuration in LM Studio. Try the following:
-    * Make sure the LM Studio local server is running and that the correct model is loaded.
-    * Check the LM Studio logs for any error messages.
-    * Try restarting the LM Studio server.
-    * Ensure your chosen model is compatible with Roo Code. Some very small models may not work well.
-    * Some models may require a larger context length.
-
 * **"No connection could be made because the target machine actively refused it":** This usually means that the Ollama or LM Studio server isn't running, or is running on a different port/address than Roo Code is configured to use. Double-check the Base URL setting.
 
 * **Slow Response Times:** Local models can be slower than cloud-based models, especially on less powerful hardware. If performance is an issue, try using a smaller model.