Commit ea52e43

readme update

1 parent 77f1a20 commit ea52e43

File tree

2 files changed: +480 -199 lines changed


readme.md

Lines changed: 50 additions & 199 deletions
````diff
@@ -49,7 +49,7 @@ With the OTW-Viewer, all generated documents (converted Markdown files, lexicon
 - 📱 **No-Code Gradio Interface**: Drag-&-drop upload with live terminal and complete pipeline control
 - 🌐 **Multi-Format Export**: LoRA, Merged (both for transformers, vLLM, etc.), GGUF in Q_8 with quantizations for local deployment (OpenWebUI/LM-Studio)
 - 🔍 **VLM Integration**: Vision-Language-Models for automatic image descriptions in documents
-- **Universal API Support**: Works with OpenAI, OpenRouter, Ollama, LM Studio, and any OpenAI-compatible API
+- **Runpod Integration**: Scalable cloud GPU support for cost-effective training
 
 ***
 
````

````diff
@@ -60,20 +60,13 @@ With the OTW-Viewer, all generated documents (converted Markdown files, lexicon
 **Hardware:**
 - **Linux system recommended** (Ubuntu 22.04 LTS or similar)
 - **At least 100 GB free storage space**
-- **For Training: NVIDIA GPU with at least 20 GB VRAM** (depending on the model being trained)
+- **NVIDIA GPU with at least 20 GB VRAM** (depending on the model being trained)
 - RTX 4090/A6000/A100 recommended
 - For smaller models: RTX 3090/4080 (16GB) possible
-- **For Dataset Generation Only: No GPU required** (can use cloud APIs)
-- **CUDA 12.8+ and cuDNN** (only if using local GPU)
+- **CUDA 12.8+ and cuDNN installed**
 
 **Accounts:**
 - **HuggingFace Account** with Access Token (Read + optional Write)
-- **API Access** (choose one):
-  - OpenAI API Key
-  - OpenRouter API Key
-  - Ollama (local installation)
-  - LM Studio (local installation)
-  - Any OpenAI-compatible API endpoint
 
 ### HuggingFace Token Setup
 
````
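The hardware requirements in this hunk (at least 20 GB VRAM and 100 GB free disk) can be sanity-checked before installing. A minimal pre-flight sketch, not part of OpenTuneWeaver itself: the helper names are illustrative, and the VRAM probe assumes PyTorch is available (it degrades to 0 GB otherwise).

```python
# Pre-flight check mirroring the README's minimum requirements (illustrative sketch).
import shutil

MIN_VRAM_GB = 20   # "NVIDIA GPU with at least 20 GB VRAM"
MIN_DISK_GB = 100  # "At least 100 GB free storage space"

def meets_requirements(vram_gb: float, free_disk_gb: float) -> bool:
    """Return True if the machine satisfies the minimum training requirements."""
    return vram_gb >= MIN_VRAM_GB and free_disk_gb >= MIN_DISK_GB

def probe() -> tuple[float, float]:
    """Best-effort probe of local VRAM and free disk (VRAM via torch, if installed)."""
    free_disk_gb = shutil.disk_usage("/").free / 1e9
    try:
        import torch  # optional dependency; only needed for the VRAM probe
        vram_gb = (torch.cuda.get_device_properties(0).total_memory / 1e9
                   if torch.cuda.is_available() else 0.0)
    except ImportError:
        vram_gb = 0.0
    return vram_gb, free_disk_gb

if __name__ == "__main__":
    vram, disk = probe()
    print(f"VRAM: {vram:.1f} GB, free disk: {disk:.1f} GB, "
          f"sufficient: {meets_requirements(vram, disk)}")
```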

````diff
@@ -82,222 +75,78 @@ With the OTW-Viewer, all generated documents (converted Markdown files, lexicon
 3. Create a new token with **Read** permission (and **Write** for model upload)
 4. Note down the token for installation
 
-### Universal Installation (NEW - Works with any API)
-
-OpenTuneWeaver now supports **any OpenAI-compatible API** for dataset generation. Choose your preferred installation method:
-
-#### Quick Installation with Direct Script
-
-```bash
-# Download and run the universal setup script
-wget https://raw.githubusercontent.com/ProfEngel/OpenTuneWeaver/main/setup_universal.sh
-chmod +x setup_universal.sh
-
-# Configure your API (choose one):
-
-# Option 1: For OpenAI
-export OPENAI_API_TYPE=openai
-export OPENAI_API_BASE=https://api.openai.com/v1
-export OPENAI_API_KEY=sk-your-key-here
-export OPENAI_MODEL_NAME=gpt-4
-
-# Option 2: For OpenRouter
-export OPENAI_API_TYPE=openrouter
-export OPENAI_API_BASE=https://openrouter.ai/api/v1
-export OPENAI_API_KEY=your-openrouter-key
-export OPENAI_MODEL_NAME=meta-llama/llama-3.2-3b-instruct
-
-# Option 3: For local Ollama (default)
-export OPENAI_API_TYPE=ollama
-export OPENAI_API_BASE=http://localhost:11434/v1
-export OPENAI_MODEL_NAME=gemma3:12b-it-qat # VLM model for image description
-
-# Option 4: For LM Studio
-export OPENAI_API_TYPE=lmstudio
-export OPENAI_API_BASE=http://localhost:1234/v1
-export OPENAI_MODEL_NAME=your-loaded-model
-
-# Run the installation
-./setup_universal.sh
-```
-
-#### Installation with Virtual Environment (Recommended)
-
-```bash
-# Create and activate virtual environment
-python3 -m venv opentuneweaver-env
-source opentuneweaver-env/bin/activate
-
-# Clone repository
-git clone https://github.com/ProfEngel/OpenTuneWeaver.git
-cd OpenTuneWeaver
-
-# Install dependencies
-pip install --upgrade pip
-pip install -r requirements.txt
-
-# Configure API (see options above)
-export OPENAI_API_TYPE=openai # or your preferred API
-export OPENAI_API_BASE=https://api.openai.com/v1
-export OPENAI_API_KEY=your-api-key
-export OPENAI_MODEL_NAME=gpt-4
-
-# Run setup
-./setup_universal.sh
-```
-
-#### Installation with Conda
-
-```bash
-# Create conda environment
-conda create -n opentuneweaver python=3.11
-conda activate opentuneweaver
-
-# Clone repository
-git clone https://github.com/ProfEngel/OpenTuneWeaver.git
-cd OpenTuneWeaver
-
-# Install dependencies
-pip install -r requirements.txt
-
-# Install unsloth (for training)
-pip install --upgrade --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth-zoo.git
-
-# Configure API (see options above)
-export OPENAI_API_TYPE=your-api-type
-export OPENAI_API_BASE=your-api-base-url
-export OPENAI_API_KEY=your-api-key
-export OPENAI_MODEL_NAME=your-model
-
-# Run setup
-./setup_universal.sh
-```
-
-#### Docker Installation (Recommended for Production) (not tested yet)
-
-```bash
-# Clone repository
-git clone https://github.com/ProfEngel/OpenTuneWeaver.git
-cd OpenTuneWeaver
-
-# Copy and configure environment
-cp .env.example .env
-# Edit .env with your API settings
-
-# Build and run with Docker Compose
-docker-compose up -d
-
-# Access at http://localhost:8080
-```
-
-### Runpod Installation (For Simple Online-GPU Training)
+### Quick Start with Runpod (Recommended)
 
 **Runpod Template:**
 ```
+
 runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04
 Disk Volume: 100 GB
 Pod Volume: 100 GB
 Open Ports: 8080,11434
+
 ```
 
 **Installation:**
-```bash
+```
+
 cd /workspace
 git clone https://github.com/ProfEngel/OpenTuneWeaver.git
-cd OpenTuneWeaver
-
-# For Runpod with Ollama (local inference)
+cp OpenTuneWeaver/setup_runpod_direct.sh .
+chmod +x setup_runpod_direct.sh
 ./setup_runpod_direct.sh
 
-# OR for Runpod with external API
-export OPENAI_API_TYPE=openai
-export OPENAI_API_BASE=https://api.openai.com/v1
-export OPENAI_API_KEY=your-key
-export OPENAI_MODEL_NAME=gpt-4
-./setup_universal.sh
 ```
 
-### API Configuration Examples
+**After installation:**
 
-#### Using OpenAI GPT-4
-```bash
-export OPENAI_API_TYPE=openai
-export OPENAI_API_BASE=https://api.openai.com/v1
-export OPENAI_API_KEY=sk-...your-key...
-export OPENAI_MODEL_NAME=gpt-5-mini # or gpt-4
-```
+Wait until the installation is done, then press y to start the UI. The UI starts on http://yourIP:8080
 
-#### Using OpenRouter
-```bash
-export OPENAI_API_TYPE=openrouter
-export OPENAI_API_BASE=https://openrouter.ai/api/v1
-export OPENAI_API_KEY=your-openrouter-key
-export OPENAI_MODEL_NAME=meta-llama/llama-3.2-3b-instruct
-# Other models: claude-3-opus, mistral-large, etc.
-```
+On Runpod, access the UI via the Runpod web interface on port 8080.
 
-#### Using Local Ollama
-```bash
-# First install Ollama
-curl -fsSL https://ollama.com/install.sh | sh
-ollama pull gemma3:12b-it-qat
+### Alternative Installation Methods
 
-# Configure OpenTuneWeaver
-export OPENAI_API_TYPE=ollama
-export OPENAI_API_BASE=http://localhost:11434/v1
-export OPENAI_MODEL_NAME=gemma3:12b-it-qat
+**Docker Installation:** *(Coming Soon)*
 ```
 
-#### Using LM Studio
-```bash
-# Start LM Studio and load a model
-# Then configure:
-export OPENAI_API_TYPE=lmstudio
-export OPENAI_API_BASE=http://localhost:1234/v1
-export OPENAI_MODEL_NAME=your-loaded-model
-```
+docker run -d -p 7860:7860 --gpus all -v opentuneweaver:/app/data --name opentuneweaver opentuneweaver/opentuneweaver:latest
 
-#### Using Custom API Endpoint
-```bash
-export OPENAI_API_TYPE=custom
-export OPENAI_API_BASE=https://your-api-endpoint.com/v1
-export OPENAI_API_KEY=your-api-key
-export OPENAI_MODEL_NAME=your-model-name
 ```
 
-### Starting OpenTuneWeaver
+**Conda Installation:**
+```
 
-After installation, start the application:
+conda create -n opentuneweaver python=3.11
+conda activate opentuneweaver
+apt-get update && apt-get upgrade -y
+git clone https://github.com/ProfEngel/OpenTuneWeaver.git
+cp OpenTuneWeaver/setup_runpod_direct.sh .
+chmod +x setup_runpod_direct.sh
 
-```bash
-# Direct start
-./start_otw.sh
+# Install unsloth_zoo directly from GitHub
+pip install --upgrade --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth-zoo.git
 
-# Or with custom port
-export SERVER_PORT=7860
-./start_otw.sh
+# Then run the setup script
+./setup_runpod_direct.sh
 
-# Access the UI
-# Local: http://localhost:8080
-# Remote: http://your-server-ip:8080
 ```
 
-### Troubleshooting
+**Virtual Environment:**
+```
 
-If you encounter issues:
+python3.11 -m venv opentuneweaver-env
+source opentuneweaver-env/bin/activate
+apt-get update && apt-get upgrade -y
+git clone https://github.com/ProfEngel/OpenTuneWeaver.git
+cp OpenTuneWeaver/setup_runpod_direct.sh .
+chmod +x setup_runpod_direct.sh
 
-```bash
-# Check installation
-./debug_otw.sh
+# Install unsloth_zoo directly from GitHub
+pip install --upgrade --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth-zoo.git
 
-# View logs
-tail -f logs/pipeline.log
+# Then run the setup script
+./setup_runpod_direct.sh
 
-# Test API connection
-curl -X POST $OPENAI_API_BASE/chat/completions \
-  -H "Authorization: Bearer $OPENAI_API_KEY" \
-  -H "Content-Type: application/json" \
-  -d '{"model": "'$OPENAI_MODEL_NAME'", "messages": [{"role": "user", "content": "Test"}]}'
 ```
 
 ***
````

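This hunk removes the curl-based "Test API connection" snippet along with the other API examples. If you still need to verify an OpenAI-compatible endpoint, the same request can be built in plain Python. A standard-library sketch: the environment variable names follow the removed examples, and the default values are placeholders, not OpenTuneWeaver defaults.

```python
# Build the same POST the removed curl example sent to an OpenAI-compatible API.
import json
import os
import urllib.request

def build_chat_request(base_url: str, api_key: str, model: str,
                       prompt: str) -> urllib.request.Request:
    """Construct a chat-completions request against an OpenAI-compatible endpoint."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    req = build_chat_request(
        os.environ.get("OPENAI_API_BASE", "http://localhost:11434/v1"),
        os.environ.get("OPENAI_API_KEY", ""),
        os.environ.get("OPENAI_MODEL_NAME", "gemma3:12b-it-qat"),
        "Test",
    )
    # Send with urllib.request.urlopen(req) once the endpoint is reachable.
    print(req.full_url)
```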
````diff
@@ -367,15 +216,17 @@ OpenTuneWeaver would not be possible without these excellent open-source frameworks
 
 If you use OpenTuneWeaver in your research, please cite our paper:
 
-```bibtex
+```
+
 @article{opentuneweaver2024,
-title={OpenTuneWeaver: Semantically-structured, Curatable LLM Fine-tuning Pipeline for Research and Education},
-author={Engel, Prof. Dr. Mathias},
-journal={arXiv preprint},
-year={2024},
-institution={Hochschule für Wirtschaft und Umwelt Nürtingen-Geislingen},
-note={Funded by MWK Baden-Württemberg and Stifterverband Deutschland}
+  title={OpenTuneWeaver: Semantically-structured, Curatable LLM Fine-tuning Pipeline for Research and Education},
+  author={Engel, Prof. Dr. Mathias},
+  journal={arXiv preprint},
+  year={2024},
+  institution={Hochschule für Wirtschaft und Umwelt Nürtingen-Geislingen},
+  note={Funded by MWK Baden-Württemberg and Stifterverband Deutschland}
 }
+
 ```
 
 **Paper available:**
````

````diff
@@ -427,4 +278,4 @@ Semantically-structured, curatable all-in-one LLM fine-tuning pipeline
 
 ### Topics
 
-`llm` `finetuning` `ai` `machine-learning` `nlp` `semantic-chunking` `lora` `qlora` `pdf-processing` `qa-generation` `benchmarking` `gradio` `huggingface` `educational-ai` `research-tools`
+`llm` `finetuning` `ai` `machine-learning` `nlp` `semantic-chunking` `lora` `qlora` `pdf-processing` `qa-generation` `benchmarking` `gradio` `huggingface` `educational-ai` `research-tools`
````
