I put together this flexible microservices setup to run any Ollama-compatible model locally. This stack combines Ollama with Open WebUI to create a complete local AI solution. While it's pre-configured for DeepSeek r1, you can easily swap in any model you prefer!
This repo gives you a reusable template with two separate microservices:
- Ollama container: Handles all the AI model stuff (runs whatever model you choose)
- Open WebUI container: Gives you a clean chat interface to talk to the model
They communicate with each other but run independently - proper microservice architecture!
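For reference, a two-service setup like this usually looks something like the sketch below. Treat it as illustrative only: the service names, volume name, image tag, and environment variable values here are assumptions, and the repo's actual docker-compose.yml may differ.

```yaml
# Illustrative sketch only - not necessarily the repo's exact docker-compose.yml
services:
  ollama:
    build: ./ollama                     # assumed: custom image that pre-pulls the chosen model
    volumes:
      - ollama-data:/root/.ollama       # persists downloaded models between restarts
    # no ports: the API (11434) stays on Docker's internal network

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434   # reach Ollama by its compose service name
    ports:
      - "8080:8080"                     # the chat UI you open in your browser
    depends_on:
      - ollama

volumes:
  ollama-data:
```

The key detail is `OLLAMA_BASE_URL`: inside the compose network, containers reach each other by service name, so the UI talks to `http://ollama:11434` without that port ever being published to your host.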
Get Docker running
You'll need Docker installed → Get Docker here
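A quick way to confirm that both Docker and the Compose plugin are installed and on your PATH:

```bash
docker --version
docker compose version
```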
Grab this repo
```bash
git clone <repository-url>
cd local-llm-stack
```
Choose your model (optional)
The default is set to DeepSeek r1, but you can easily change it:
- Edit the Dockerfile in the ollama directory to replace "deepseek-r1" with any model from Ollama's library
- Examples: llama3, mistral, phi3, codellama, etc. (you can also pull extra models into the running stack; see the commands below)
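If you'd rather not rebuild the image, you can pull models straight into the running Ollama container instead. This assumes the service is named `ollama` in the compose file; adjust if yours differs.

```bash
# Pull an extra model into the running Ollama container
docker compose exec ollama ollama pull llama3

# List the models Ollama currently has available
docker compose exec ollama ollama list
```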
Fire it up!
```bash
docker compose up -d
```
The first run will take a while (maybe grab a coffee?) as it downloads the model.
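If you want to watch that download happen, follow the Ollama container's logs (again assuming the service is named `ollama`):

```bash
docker compose logs -f ollama
```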
Start chatting
Just open http://localhost:8080 in your browser
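If the page doesn't load straight away, the containers may still be starting up. A quick reachability check from a terminal:

```bash
# Prints an HTTP status code (e.g. 200) once Open WebUI is serving
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080
```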
What's actually happening:
- The Ollama container boots up and automatically downloads your chosen model if needed
- The Open WebUI connects to Ollama's API (they talk to each other through Docker's internal network)
- Everything runs locally - your data stays on your machine
```
┌────────────────────────────────────────────────────────────┐
│                        Your Machine                        │
│                                                            │
│  ┌──────────────────────────────────────────────────────┐  │
│  │                  Docker Environment                  │  │
│  │                                                      │  │
│  │  ┌─────────────────┐         ┌─────────────────┐     │  │
│  │  │                 │         │                 │     │  │
│  │  │  Ollama Engine  │◄────────┤   Open WebUI    │     │  │
│  │  │    Container    │         │    Container    │     │  │
│  │  │                 │         │                 │     │  │
│  │  └────────┬────────┘         └────────▲────────┘     │  │
│  │           │                           │              │  │
│  │  ┌────────▼────────┐                  │              │  │
│  │  │                 │                  │              │  │
│  │  │   Ollama Data   │                  │              │  │
│  │  │     Volume      │                  │              │  │
│  │  │                 │                  │              │  │
│  │  └─────────────────┘                  │              │  │
│  │                                       │              │  │
│  └───────────────────────────────────────┼──────────────┘  │
│                                          │                 │
│  ┌─────────────────┐                     │                 │
│  │                 │                     │                 │
│  │   Web Browser   │◄────────────────────┘                 │
│  │                 │                                       │
│  └─────────────────┘                                       │
│                                                            │
└────────────────────────────────────────────────────────────┘
               │                           │
               ▼                           ▼
       Port 11434 (API)             Port 8080 (UI)
   (Not exposed externally)         (User access)
```
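As the diagram shows, only the UI port is published to your host. If you ever want to call Ollama's API directly from your machine (for scripts or other tools), you could add a port mapping to the Ollama service in the compose file, for example (service name assumed):

```yaml
services:
  ollama:
    ports:
      - "11434:11434"   # optional: publish Ollama's API on the host
```

After restarting with `docker compose up -d`, `curl http://localhost:11434/api/tags` should list the installed models.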
When you need to check on things:
See what's running
```bash
docker compose ps
```

Check the logs (helpful for debugging)

```bash
docker compose logs -f
```

Shut it down

```bash
docker compose down
```

Remove all data (this deletes the model volume too)

```bash
docker compose down -v
```

A few things to keep in mind:
- Larger models like DeepSeek r1 require decent hardware
- If you're on a laptop, expect your fans to spin up
- First responses might be slow as the model warms up
- Try smaller models like Phi-3 mini if you need faster responses on modest hardware
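To see how hard the containers are actually working while a model is loaded, Docker's built-in stats view helps:

```bash
# Live CPU/memory usage per container (add --no-stream for a one-shot snapshot)
docker stats
```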
This project builds upon these amazing open source projects:
- Ollama - For running LLMs locally
- Open WebUI - For the intuitive chat interface