Merged
Changes from 2 commits
4 changes: 2 additions & 2 deletions AI-Travel-Agent/requirements.txt
@@ -2,10 +2,10 @@ amadeus==12.0.0
 ipykernel==7.0.1
 jupyter==1.1.1
 langchain==1.0.2
-langchain-community==0.3.27
+langchain-community==0.4.1
 streamlit==1.51.0
 huggingface-hub==1.1.4
 pydantic==2.12.3
 wikipedia==1.4.0
 google-search-results==2.4.2
-hf-xet==1.2.0
+hf-xet==1.2.0
8 changes: 4 additions & 4 deletions LLM/rag/requirements.txt
@@ -1,12 +1,12 @@
-langchain==0.3.24
+langchain==0.3.27
 langchain-community==0.3.27
-langchain-core==0.3.56
+langchain-core==0.3.72
 langchain-huggingface==0.1.2
-huggingface-hub==0.29.1
+huggingface-hub>=0.30.0
 sentence-transformers==3.4.1
 chromadb==0.6.3
 transformers==4.53.1
 pypdf==6.1.3
 torch==2.7.1
 langchain-chroma==0.2.2
-beautifulsoup4==4.13.3
+beautifulsoup4==4.13.3
150 changes: 150 additions & 0 deletions LLM/src/README.md
@@ -0,0 +1,150 @@
# RAG Chat with Ollama

A Streamlit-based Retrieval-Augmented Generation (RAG) chat application powered by Ollama on Intel® Core™ Ultra Processors.

## Overview

This application demonstrates a RAG (Retrieval-Augmented Generation) system that allows you to chat with documents using Ollama's language models. Upload your documents, and the system will create embeddings and enable semantic search to provide context-aware responses.
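
As a rough illustration of that flow (and not the app's actual implementation in `st_rag_chat.py`), the sketch below indexes a few text chunks into ChromaDB with Sentence Transformers embeddings and then answers a question with an Ollama model, the same stack listed under Technical Stack below. The embedding model, collection name, and prompt wording are illustrative assumptions.

```python
# Minimal sketch of the index -> retrieve -> generate loop (illustrative only).
import chromadb
import ollama
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")      # assumed embedding model
client = chromadb.PersistentClient(path="./chroma_db")  # persistent local store
collection = client.get_or_create_collection("docs")    # hypothetical collection name


def index_chunks(chunks: list[str]) -> None:
    """Embed text chunks and store them in the vector database."""
    embeddings = embedder.encode(chunks).tolist()
    collection.add(
        ids=[f"chunk-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=embeddings,
    )


def answer(question: str, model: str = "llama3.2") -> str:
    """Retrieve the most relevant chunks and ask the model to answer from them."""
    query_emb = embedder.encode([question]).tolist()
    hits = collection.query(query_embeddings=query_emb, n_results=3)
    context = "\n\n".join(hits["documents"][0])
    response = ollama.chat(
        model=model,
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response["message"]["content"]
```

In the app itself, the chunks come from the files you upload, and the metadata stored alongside each chunk is what enables the source attribution described under Features.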

## Prerequisites

- Windows 11 or Ubuntu 20.04+
- Intel® Core™ Ultra Processors or Intel Arc™ Graphics
- 16GB+ RAM recommended

## Setup

### 1. Install Ollama

Download and install Ollama:
```powershell
winget install Ollama.Ollama
```

Or download from [https://ollama.com/download](https://ollama.com/download)

### 2. Building Ollama with GPU Support (Vulkan)

For advanced users who want to build Ollama from source with Vulkan GPU acceleration on Windows:

**a. Install Vulkan SDK**
- Download from: [https://vulkan.lunarg.com/sdk/home](https://vulkan.lunarg.com/sdk/home)

**b. Install TDM-GCC**
- Download from: [https://github.com/jmeubank/tdm-gcc/releases/tag/v10.3.0-tdm64-2](https://github.com/jmeubank/tdm-gcc/releases/tag/v10.3.0-tdm64-2)

**c. Install Go SDK**
- Download Go v1.24.9: [https://go.dev/dl/go1.24.9.windows-amd64.msi](https://go.dev/dl/go1.24.9.windows-amd64.msi)

**d. Build Ollama with Vulkan**
```powershell
# Set environment variables (PowerShell syntax)
$env:CGO_ENABLED = "1"
$env:CGO_CFLAGS = "-IC:\VulkanSDK\1.4.321.1\Include"

# Build with CMake
cmake -B build
cmake --build build --config Release -j14

# Build Go binary
go build

# Run Ollama server (Terminal 1)
go run . serve

# Test with a model (Terminal 2)
ollama run gemma3:270m
```

**Note:** This is for advanced users. The pre-built Ollama installation works fine for most users.

### 3. Pull Language Models

Pull the models you want to use:
```bash
ollama pull llama3.2
ollama pull qwen2.5
ollama pull mistral
```

### 4. Install Python Dependencies

Using pip:
```bash
pip install streamlit ollama chromadb sentence-transformers pypdf
```

Using uv (recommended):
```bash
uv pip install streamlit ollama chromadb sentence-transformers pypdf
```

## Running the Application

### 1. Start Ollama Server

If not already running:
```bash
ollama serve
```

### 2. Run the Streamlit App

```bash
# Using Python directly
streamlit run st_rag_chat.py

# Or using uv
uv run streamlit run st_rag_chat.py
```

### 3. Access the App

Open your browser and navigate to:
```
http://localhost:8501
```

## Usage

1. **Upload Documents**: Use the sidebar to upload PDF or text files
2. **Select Model**: Choose your preferred Ollama model from the dropdown
3. **Process Documents**: Click "Process Documents" to create embeddings
4. **Chat**: Ask questions about your documents in the chat interface
5. **View Sources**: See which document sections were used to answer your questions
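
Step 4 maps naturally onto Streamlit's chat primitives. The snippet below is an illustrative sketch of a single streamed chat turn against Ollama, not the app's actual code; in `st_rag_chat.py` the messages would additionally be augmented with retrieved document context (see the sketch in the Overview) before being sent to the model.

```python
# Illustrative Streamlit chat turn with streamed Ollama output
# (a sketch only; retrieval/augmentation is omitted here).
import ollama
import streamlit as st

if "history" not in st.session_state:
    st.session_state.history = []  # list of {"role": ..., "content": ...} dicts

# Replay earlier turns so the conversation survives Streamlit reruns.
for msg in st.session_state.history:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

if prompt := st.chat_input("Ask a question about your documents"):
    st.session_state.history.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    # In the real app the messages would also carry retrieved context chunks.
    stream = ollama.chat(model="llama3.2",
                         messages=st.session_state.history,
                         stream=True)
    with st.chat_message("assistant"):
        reply = st.write_stream(chunk["message"]["content"] for chunk in stream)
    st.session_state.history.append({"role": "assistant", "content": reply})
```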

## Features

- 📄 **Multi-format Support**: Upload PDF and text documents
- 🤖 **Model Selection**: Choose from available Ollama models
- 🔍 **Semantic Search**: Find relevant context using vector embeddings
- 💬 **Context-Aware Chat**: Get answers based on your documents
- 📚 **Source Attribution**: See which parts of documents were used
- 💾 **Persistent Storage**: ChromaDB vector database for efficient retrieval

## Troubleshooting

**Ollama Connection Error:**
- Ensure Ollama is running: `ollama serve`
- Check if models are installed: `ollama list`
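- If both look fine but the app still cannot connect, query the server's REST API directly (assuming Ollama's default port, 11434):
  ```bash
  # Should return a JSON object listing the locally installed models
  curl http://localhost:11434/api/tags
  ```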

**Memory Issues:**
- Use smaller models like `llama3.2:1b` or `qwen2.5:3b`
- Reduce the number of documents processed at once

**Slow Performance:**
- Ensure GPU drivers are up to date
- Use GPU-accelerated Ollama build (Vulkan)
- Try smaller, faster models

## Technical Stack

- **Ollama**: Local LLM runtime
- **Streamlit**: Web interface
- **ChromaDB**: Vector database
- **Sentence Transformers**: Text embeddings
- **Intel Hardware**: Optimized for Intel Core™ Ultra Processors

## License

This project is licensed under the MIT License. See [LICENSE](../LICENSE) for details.
3 changes: 2 additions & 1 deletion LLM/src/st_ollama.py
@@ -13,7 +13,8 @@ def load_models():
         list: A list of model names if successful, otherwise an empty list.
     """
     try:
-        model_list = [model["name"] for model in ollama.list()["models"]]
+        response = ollama.list()
+        model_list = [model.model for model in response.models]
         return model_list
     except Exception as e:
         st.error(f"Error loading models: {e}")