4 changes: 2 additions & 2 deletions AI-Travel-Agent/requirements.txt
@@ -2,10 +2,10 @@ amadeus==12.0.0
 ipykernel==7.0.1
 jupyter==1.1.1
 langchain==1.0.2
-langchain-community==0.3.27
+langchain-community==0.4.1
 streamlit==1.51.0
 huggingface-hub==1.1.4
 pydantic==2.12.3
 wikipedia==1.4.0
 google-search-results==2.4.2
-hf-xet==1.2.0
+hf-xet==1.2.0
8 changes: 4 additions & 4 deletions LLM/rag/requirements.txt
@@ -1,12 +1,12 @@
-langchain==0.3.24
+langchain==0.3.27
 langchain-community==0.3.27
-langchain-core==0.3.56
+langchain-core==0.3.72
 langchain-huggingface==0.1.2
-huggingface-hub==0.29.1
+huggingface-hub>=0.30.0
 sentence-transformers==3.4.1
 chromadb==0.6.3
 transformers==4.53.1
 pypdf==6.1.3
 torch==2.7.1
 langchain-chroma==0.2.2
-beautifulsoup4==4.13.3
+beautifulsoup4==4.13.3
150 changes: 150 additions & 0 deletions LLM/src/README.md
@@ -0,0 +1,150 @@
# RAG Chat with Ollama

A Streamlit-based Retrieval-Augmented Generation (RAG) chat application powered by Ollama on Intel® Core™ Ultra Processors.

## Overview

This application demonstrates a RAG (Retrieval-Augmented Generation) system that allows you to chat with documents using Ollama's language models. Upload your documents, and the system will create embeddings and enable semantic search to provide context-aware responses.
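
The ingestion side of that pipeline can be sketched roughly as follows. This is an illustrative outline, not the actual `st_rag_chat.py` code; the embedding model, collection name, and storage path are assumptions:

```python
# Illustrative ingestion sketch (assumed names; the real app may differ).
import chromadb
from pypdf import PdfReader
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")        # assumed embedding model
client = chromadb.PersistentClient(path="./chroma_db")    # assumed persistence path
collection = client.get_or_create_collection("docs")

def ingest_pdf(path: str, chunk_size: int = 1000) -> None:
    """Split a PDF into fixed-size chunks, embed them, and store them in ChromaDB."""
    text = "".join(page.extract_text() or "" for page in PdfReader(path).pages)
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    collection.add(
        ids=[f"{path}-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=embedder.encode(chunks).tolist(),
    )
```

At query time, the stored embeddings are searched for the chunks most similar to the question, and those chunks are passed to the model as context (see Usage below).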

## Prerequisites

- Windows 11 or Ubuntu 20.04+
- Intel® Core™ Ultra Processors or Intel Arc™ Graphics
- 16GB+ RAM recommended

## Setup

### 1. Install Ollama

Download and install Ollama:
```powershell
winget install Ollama.Ollama
```

Or download from [https://ollama.com/download](https://ollama.com/download)

### 2. Building Ollama with GPU Support (Vulkan)

For advanced users who want to build Ollama from source with Vulkan GPU acceleration on Windows:

**a. Install Vulkan SDK**
- Download from: [https://vulkan.lunarg.com/sdk/home](https://vulkan.lunarg.com/sdk/home)

**b. Install TDM-GCC**
- Download from: [https://github.com/jmeubank/tdm-gcc/releases/tag/v10.3.0-tdm64-2](https://github.com/jmeubank/tdm-gcc/releases/tag/v10.3.0-tdm64-2)

**c. Install Go SDK**
- Download Go v1.24.9: [https://go.dev/dl/go1.24.9.windows-amd64.msi](https://go.dev/dl/go1.24.9.windows-amd64.msi)

**d. Build Ollama with Vulkan**
```powershell
# Set environment variables
$env:CGO_ENABLED = "1"
$env:CGO_CFLAGS = "-IC:\VulkanSDK\1.4.321.1\Include"

# Build with CMake
cmake -B build
cmake --build build --config Release -j14

# Build Go binary
go build

# Run Ollama server (Terminal 1)
go run . serve

# Test with a model (Terminal 2)
ollama run gemma3:270m
```

**Note:** This is for advanced users. The pre-built Ollama installation works fine for most users.

### 3. Pull Language Models

Pull the models you want to use:
```bash
ollama pull llama3.2
ollama pull qwen2.5
ollama pull mistral
```

### 4. Install Python Dependencies

Using pip:
```bash
pip install streamlit ollama chromadb sentence-transformers pypdf
```

Using uv (recommended):
```bash
uv pip install streamlit ollama chromadb sentence-transformers pypdf
```

## Running the Application

### 1. Start Ollama Server

If not already running:
```bash
ollama serve
```

### 2. Run the Streamlit App

```bash
# Using Python directly
streamlit run st_rag_chat.py

# Or using uv
uv run streamlit run st_rag_chat.py
```

### 3. Access the App

Open your browser and navigate to:
```
http://localhost:8501
```

## Usage

1. **Upload Documents**: Use the sidebar to upload PDF or text files
2. **Select Model**: Choose your preferred Ollama model from the dropdown
3. **Process Documents**: Click "Process Documents" to create embeddings
4. **Chat**: Ask questions about your documents in the chat interface (see the sketch after this list for the retrieval flow behind this step)
5. **View Sources**: See which document sections were used to answer your questions
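
The chat step performs retrieval and then generation. A minimal sketch of that flow, assuming the same ChromaDB collection and embedding model as in the Overview sketch (names and prompt wording are illustrative, and attribute-style access assumes `ollama-python` >= 0.4):

```python
# Illustrative retrieval + generation sketch (assumed names; the real app may differ).
import chromadb
import ollama
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
collection = chromadb.PersistentClient(path="./chroma_db").get_or_create_collection("docs")

def answer(question: str, model: str = "llama3.2", top_k: int = 4) -> str:
    """Embed the question, fetch the most similar chunks, and answer from that context."""
    results = collection.query(
        query_embeddings=embedder.encode([question]).tolist(),
        n_results=top_k,
    )
    context = "\n\n".join(results["documents"][0])
    response = ollama.chat(
        model=model,
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.message.content  # attribute access assumes ollama-python >= 0.4
```

The retrieved chunks (and their ids) can also be surfaced in the UI for source attribution, which is what step 5 shows.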

## Features

- 📄 **Multi-format Support**: Upload PDF and text documents
- 🤖 **Model Selection**: Choose from available Ollama models
- 🔍 **Semantic Search**: Find relevant context using vector embeddings
- 💬 **Context-Aware Chat**: Get answers based on your documents
- 📚 **Source Attribution**: See which parts of documents were used
- 💾 **Persistent Storage**: ChromaDB vector database for efficient retrieval

## Troubleshooting

**Ollama Connection Error:**
- Ensure Ollama is running: `ollama serve`
- Check if models are installed: `ollama list`

**Memory Issues:**
- Use smaller models like `llama3.2:1b` or `qwen2.5:3b`
- Reduce the number of documents processed at once

**Slow Performance:**
- Ensure GPU drivers are up to date
- Use GPU-accelerated Ollama build (Vulkan)
- Try smaller, faster models

## Technical Stack

- **Ollama**: Local LLM runtime
- **Streamlit**: Web interface
- **ChromaDB**: Vector database
- **Sentence Transformers**: Text embeddings
- **Intel Hardware**: Optimized for Intel Core™ Ultra Processors

## License

This project is licensed under the MIT License.
3 changes: 2 additions & 1 deletion LLM/src/st_ollama.py
@@ -13,7 +13,8 @@ def load_models():
         list: A list of model names if successful, otherwise an empty list.
     """
     try:
-        model_list = [model["name"] for model in ollama.list()["models"]]
+        response = ollama.list()
+        model_list = [model.model for model in response.models]
         return model_list
     except Exception as e:
         st.error(f"Error loading models: {e}")