85 changes: 85 additions & 0 deletions .env.example
@@ -0,0 +1,85 @@
# =============================================================================
# LLM Council - Environment Configuration
# =============================================================================
# Copy this file to .env and configure your desired providers
# You can mix providers freely - use cloud, local, or both!

# =============================================================================
# Provider Selection
# =============================================================================

# Default provider when model identifier has no prefix
# Options: "openrouter", "ollama", "lmstudio"
# Example: If DEFAULT_PROVIDER=openrouter, then "gpt-4" means "openrouter:gpt-4"
DEFAULT_PROVIDER=openrouter

# =============================================================================
# OpenRouter Configuration (Cloud Models)
# =============================================================================

# Get your API key from: https://openrouter.ai/keys
# Required if using any OpenRouter models (e.g., GPT-4, Claude, Gemini)
OPENROUTER_API_KEY=your_openrouter_api_key_here

# =============================================================================
# Ollama Configuration (Local Models)
# =============================================================================

# Ollama server URL
# Default: http://localhost:11434
# Only change if running Ollama on a different host/port
OLLAMA_BASE_URL=http://localhost:11434

# Installation: https://ollama.ai/
# Download models: ollama pull llama2, ollama pull mistral, etc.

# =============================================================================
# LMStudio Configuration (Local Models)
# =============================================================================

# LMStudio server URL (OpenAI-compatible API)
# Default: http://localhost:1234/v1/chat/completions
# Only change if using a different port
LMSTUDIO_BASE_URL=http://localhost:1234/v1/chat/completions

# Installation: https://lmstudio.ai/
# Make sure to start the local server in LMStudio before using

# =============================================================================
# Usage Examples
# =============================================================================

# Example 1: Cloud-only setup (OpenRouter)
# ----------------------------------------
# DEFAULT_PROVIDER=openrouter
# OPENROUTER_API_KEY=sk-or-v1-...
#
# In config.py:
# COUNCIL_MODELS = ["openai/gpt-4", "anthropic/claude-3-sonnet", "google/gemini-pro"]

# Example 2: Local-only setup (Ollama)
# ------------------------------------
# DEFAULT_PROVIDER=ollama
# OLLAMA_BASE_URL=http://localhost:11434
#
# In config.py:
# COUNCIL_MODELS = ["ollama:llama2", "ollama:mistral", "ollama:codellama"]

# Example 3: Mixed setup (Cloud + Local)
# --------------------------------------
# DEFAULT_PROVIDER=openrouter
# OPENROUTER_API_KEY=sk-or-v1-...
# OLLAMA_BASE_URL=http://localhost:11434
#
# In config.py:
# COUNCIL_MODELS = ["ollama:llama2", "openrouter:gpt-4", "lmstudio:mistral"]
# CHAIRMAN_MODEL = "openrouter:gpt-4" # Use cloud for synthesis

# Example 4: Privacy-focused (100% Local)
# ---------------------------------------
# DEFAULT_PROVIDER=ollama
# OLLAMA_BASE_URL=http://localhost:11434
#
# In config.py:
# COUNCIL_MODELS = ["llama2", "mistral", "codellama"] # Uses DEFAULT_PROVIDER
# CHAIRMAN_MODEL = "mistral"
6 changes: 5 additions & 1 deletion .gitignore
@@ -18,4 +18,8 @@ data/
# Frontend
frontend/node_modules/
frontend/dist/
frontend/.vite/
frontend/.vite/

# IDE
.idea/
.vscode/
141 changes: 127 additions & 14 deletions CLAUDE.md
@@ -8,31 +8,95 @@ LLM Council is a 3-stage deliberation system where multiple LLMs collaborativel

## Architecture

### Provider Abstraction System (NEW!)

**Multi-Provider Support**
LLM Council now supports three providers out of the box:
- **OpenRouter**: Cloud models (GPT-4, Claude, Gemini, etc.)
- **Ollama**: Local models (Llama2, Mistral, CodeLlama, etc.)
- **LMStudio**: Local models via OpenAI-compatible API

**Model Identifier Format**
Models use a flexible identifier format:
- **Prefixed**: `provider:model` (e.g., `ollama:llama2`, `openrouter:gpt-4`)
- **Simple**: `model` (uses `DEFAULT_PROVIDER` from config)

This enables:
- **Cloud-only**: All models via OpenRouter (requires API key, pay-per-use)
- **Local-only**: All models via Ollama/LMStudio (100% private, free)
- **Mixed mode**: Combine providers (e.g., local council + cloud chairman)

**Provider Architecture (`backend/providers/`)**

`base.py` - Abstract base class
- Defines `Provider` interface with `query()` and `query_batch()` methods
- All providers inherit from this base class
- Enforces consistent API across providers
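
A minimal sketch of what that interface might look like (method names follow the bullets above; the exact signatures and return shape are assumptions):

```python
# Hypothetical sketch of the Provider base class -- signatures are assumptions.
from abc import ABC, abstractmethod
from typing import Optional


class Provider(ABC):
    """Common interface every provider implements."""

    @abstractmethod
    async def query(self, model: str, messages: list[dict]) -> Optional[dict]:
        """Query one model; return {'content': ...} or None on failure."""

    @abstractmethod
    async def query_batch(self, model: str, message_sets: list[list[dict]]) -> list[Optional[dict]]:
        """Run several message lists against the same model in parallel."""
```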

`openrouter.py` - OpenRouterProvider
- Cloud provider using OpenRouter API
- Requires API key from environment
- Supports all OpenRouter models (GPT, Claude, Gemini, etc.)

`ollama.py` - OllamaProvider
- Local provider using Ollama's native API
- Uses ollama library's `AsyncClient.chat()` method
- Default URL: http://localhost:11434
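
Since the provider is described as wrapping the ollama library's `AsyncClient.chat()`, the underlying call looks roughly like this standalone sketch (host and message shape per the ollama Python client; the surrounding provider code is not shown):

```python
# Standalone sketch of the AsyncClient.chat() call OllamaProvider is described as using.
import asyncio
from ollama import AsyncClient


async def main() -> None:
    client = AsyncClient(host="http://localhost:11434")
    response = await client.chat(
        model="llama2",
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )
    print(response["message"]["content"])


asyncio.run(main())
```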

`lmstudio.py` - LMStudioProvider
- Local provider using OpenAI-compatible API
- No authentication required
- Default URL: http://localhost:1234/v1/chat/completions
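
Because the endpoint is OpenAI-compatible and unauthenticated, a plain HTTP request is enough to exercise it; this sketch uses httpx (the actual provider may use a different HTTP client):

```python
# Minimal OpenAI-compatible chat request against a local LMStudio server.
import asyncio
import httpx


async def main() -> None:
    payload = {
        "model": "mistral",  # whichever model is currently loaded in LMStudio
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    }
    async with httpx.AsyncClient(timeout=120) as client:
        resp = await client.post("http://localhost:1234/v1/chat/completions", json=payload)
        resp.raise_for_status()
        print(resp.json()["choices"][0]["message"]["content"])


asyncio.run(main())
```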

`factory.py` - ProviderFactory
- Central routing system for model requests
- Parses model identifiers and routes to appropriate provider
- Handles parallel queries across multiple providers
- Caches provider instances for efficiency
- Validates provider configuration on initialization

**How It Works**
1. `council.py` creates ProviderFactory singleton with config
2. Model identifiers like `["ollama:llama2", "openrouter:gpt-4"]` are passed to factory
3. Factory parses each identifier: `("ollama", "llama2")`, `("openrouter", "gpt-4")`
4. Factory routes to appropriate provider and executes query
5. Results returned with original model identifier as key
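
A sketch of the identifier parsing in step 3 (the function name and the fallback to `DEFAULT_PROVIDER` are assumptions based on the description above):

```python
# Hypothetical parsing of "provider:model" identifiers with a DEFAULT_PROVIDER fallback.
DEFAULT_PROVIDER = "openrouter"  # normally read from the environment


def parse_model_identifier(identifier: str) -> tuple[str, str]:
    """Split 'provider:model' into (provider, model); unprefixed names use the default."""
    if ":" in identifier:
        provider, model = identifier.split(":", 1)
        return provider, model
    return DEFAULT_PROVIDER, identifier


assert parse_model_identifier("ollama:llama2") == ("ollama", "llama2")
assert parse_model_identifier("openrouter:gpt-4") == ("openrouter", "gpt-4")
assert parse_model_identifier("gpt-4") == ("openrouter", "gpt-4")
```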

**Configuration**
See `.env.example` for complete setup instructions. Key variables:
- `DEFAULT_PROVIDER`: Provider for unprefixed model names
- `OPENROUTER_API_KEY`: Required for OpenRouter models
- `OLLAMA_BASE_URL`: Ollama server URL (default: localhost:11434)
- `LMSTUDIO_BASE_URL`: LMStudio server URL (default: http://localhost:1234/v1/chat/completions)

### Backend Structure (`backend/`)

**`config.py`**
- Contains `COUNCIL_MODELS` (list of OpenRouter model identifiers)
- Contains `COUNCIL_MODELS` (list of model identifiers, any provider)
- Contains `CHAIRMAN_MODEL` (model that synthesizes final answer)
- Uses environment variable `OPENROUTER_API_KEY` from `.env`
- Contains `CONVERSATION_TITLE_MODEL` (fast model for title generation)
- Contains provider configuration (URLs, API keys via environment)
- Backend runs on **port 8001** (NOT 8000 - user had another app on 8000)
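
A condensed sketch of how these pieces might sit together in `config.py` (the use of python-dotenv and the specific model values are assumptions, not the real configuration):

```python
# Illustrative shape of backend/config.py -- values here are examples only.
import os
from dotenv import load_dotenv

load_dotenv()

DEFAULT_PROVIDER = os.getenv("DEFAULT_PROVIDER", "openrouter")
OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY", "")
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
LMSTUDIO_BASE_URL = os.getenv("LMSTUDIO_BASE_URL", "http://localhost:1234/v1/chat/completions")

COUNCIL_MODELS = ["ollama:llama2", "openrouter:openai/gpt-4"]
CHAIRMAN_MODEL = "openrouter:openai/gpt-4"
CONVERSATION_TITLE_MODEL = "ollama:llama2"
```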

**`openrouter.py`**
- `query_model()`: Single async model query
- `query_models_parallel()`: Parallel queries using `asyncio.gather()`
- Returns dict with 'content' and optional 'reasoning_details'
- Graceful degradation: returns None on failure, continues with successful responses
- **Model identifier examples**:
  - Cloud: `"openai/gpt-4"`, `"anthropic/claude-sonnet-4"`
  - Local: `"ollama:llama2"`, `"lmstudio:mistral"`
  - Mixed: `["ollama:llama2", "openrouter:gpt-4"]`
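
`query_models_parallel()` above fans one prompt out with `asyncio.gather()` and degrades gracefully; a sketch of that pattern (the actual provider call is stubbed out):

```python
# Fan out one prompt to many models; failed models yield None instead of aborting the batch.
import asyncio
from typing import Optional


async def query_model(model: str, prompt: str) -> Optional[str]:
    try:
        # ...actual provider call would go here...
        return f"response from {model}"
    except Exception:
        return None  # graceful degradation: continue with whichever models succeeded


async def query_models_parallel(models: list[str], prompt: str) -> dict[str, Optional[str]]:
    results = await asyncio.gather(*(query_model(m, prompt) for m in models))
    return dict(zip(models, results))


print(asyncio.run(query_models_parallel(["ollama:llama2", "openrouter:gpt-4"], "Hi")))
```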

**`council.py`** - The Core Logic
- `stage1_collect_responses()`: Parallel queries to all council models
- `get_factory()`: Creates/returns ProviderFactory singleton with config
- `stage1_collect_responses()`: Parallel queries to all council models via factory
- `stage2_collect_rankings()`:
  - Anonymizes responses as "Response A, B, C, etc."
  - Creates `label_to_model` mapping for de-anonymization
  - Prompts models to evaluate and rank (with strict format requirements)
  - Returns tuple: (rankings_list, label_to_model_dict)
  - Each ranking includes both raw text and `parsed_ranking` list
  - Uses same council models via factory (supports mixed providers)
- `stage3_synthesize_final()`: Chairman synthesizes from all responses + rankings
- `parse_ranking_from_text()`: Extracts "FINAL RANKING:" section, handles both numbered lists and plain format
- `calculate_aggregate_rankings()`: Computes average rank position across all peer evaluations
- `generate_conversation_title()`: Fast title generation for conversations
- All model queries go through ProviderFactory for automatic routing
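
As an illustration of the aggregate-ranking step described above, averaging each anonymized response's position across all peer rankings might look like this (the data shapes are assumptions):

```python
# Hypothetical aggregate ranking: lower average position = better.
from collections import defaultdict


def calculate_aggregate_rankings(parsed_rankings: list[list[str]]) -> list[tuple[str, float]]:
    """parsed_rankings holds one ordered list of labels ('Response A', ...) per evaluator."""
    positions: dict[str, list[int]] = defaultdict(list)
    for ranking in parsed_rankings:
        for position, label in enumerate(ranking, start=1):
            positions[label].append(position)
    averages = {label: sum(p) / len(p) for label, p in positions.items()}
    return sorted(averages.items(), key=lambda item: item[1])


print(calculate_aggregate_rankings([
    ["Response B", "Response A", "Response C"],
    ["Response A", "Response B", "Response C"],
]))
# -> [('Response B', 1.5), ('Response A', 1.5), ('Response C', 3.0)]
```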

**`storage.py`**
- JSON-based conversation storage in `data/conversations/`
@@ -143,24 +207,73 @@ Models are hardcoded in `backend/config.py`. Chairman can be same or different f

## Testing Notes

Use `test_openrouter.py` to verify API connectivity and test different model identifiers before adding to council. The script tests both streaming and non-streaming modes.
**Provider Testing**
Before configuring models in production:

1. **OpenRouter**: Verify API key and model availability
- Check models at https://openrouter.ai/models
- Test with simple query first
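
A simple test query against OpenRouter's OpenAI-compatible endpoint might look like this (the model name is just an example):

```python
# Quick OpenRouter connectivity check.
import os
import httpx

resp = httpx.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "openai/gpt-4o-mini",  # any model listed at openrouter.ai/models
        "messages": [{"role": "user", "content": "Reply with the single word: ok"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```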

2. **Ollama**: Ensure server is running and models are pulled
```bash
ollama serve # Start server
ollama pull llama2 # Download model
ollama list # Verify installed models
curl http://localhost:11434/api/tags # Check API
```

3. **LMStudio**: Start local server and load model
- Open LMStudio → Developer → Start Server
- Verify server at http://localhost:1234
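
A quick way to verify the server is responding is to list the loaded models via the OpenAI-compatible `/v1/models` route (path assumed from the OpenAI API convention):

```python
# Check that LMStudio's local server is reachable and see which models it exposes.
import httpx

resp = httpx.get("http://localhost:1234/v1/models", timeout=10)
resp.raise_for_status()
print([m["id"] for m in resp.json()["data"]])
```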

**Factory Validation**
The ProviderFactory has built-in validation:
```python
from backend.providers import ProviderFactory
factory = ProviderFactory(...)
print(factory.validate_all()) # Check all providers
print(factory.get_available_providers()) # List working providers
```

**Testing Mixed Mode**
Set up a test configuration with mixed providers:
```python
COUNCIL_MODELS = [
"ollama:llama2", # Local
"openrouter:gpt-4o", # Cloud
]
```
Verify all providers respond correctly before production use.

## Data Flow Summary

```
User Query
Stage 1: Parallel queries → [individual responses]
ProviderFactory initialization (singleton)
├─ OpenRouterProvider (if configured)
├─ OllamaProvider (if configured)
└─ LMStudioProvider (if configured)
Stage 1: Parse model identifiers → Route to providers → Parallel queries
→ [individual responses with original model IDs]
Stage 2: Anonymize → Parallel ranking queries → [evaluations + parsed rankings]
Stage 2: Anonymize → Route to providers → Parallel ranking queries
→ [evaluations + parsed rankings]
Aggregate Rankings Calculation → [sorted by avg position]
Stage 3: Chairman synthesis with full context
Stage 3: Route chairman model to provider → Synthesis with full context
→ [final answer]
Return: {stage1, stage2, stage3, metadata}
Frontend: Display with tabs + validation UI
```

The entire flow is async/parallel where possible to minimize latency.
**Key Points:**
- All model queries go through ProviderFactory
- Factory automatically routes based on model identifier prefix
- Parallel execution happens both within and across providers
- Each stage can use models from different providers (mixed mode)
- The entire flow is async/parallel where possible to minimize latency