Course Materials RAG System

A Retrieval-Augmented Generation (RAG) system with sequential tool calling, designed to answer complex questions about course materials by combining semantic search with analysis from a large language model.

🚀 Key Features

🧠 Advanced AI Capabilities

  • Sequential Tool Calling: AI can make up to 2 tool calls in sequence for complex, multi-part queries
  • Multi-Provider AI Support: DeepSeek API (default) and Anthropic Claude
  • Intelligent Tool Selection: Automatic choice between content search and course outline tools
  • Complex Query Handling: Course comparisons, multi-part questions, and deep analysis

🔍 Smart Search System

  • Dual Search Tools:
    • Course outline tool for structure and lesson lists
    • Content search tool for specific topics and details
  • Semantic Vector Search: ChromaDB-powered similarity matching
  • Course Resolution: Fuzzy matching for course names
  • Contextual Filtering: Lesson-specific and course-specific searches
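
To make the course resolution idea concrete, here is a minimal sketch using Python's standard-library `difflib`. It is an illustration only: the project may well resolve names through vector search in ChromaDB instead, and both `resolve_course_name` and the sample catalog are hypothetical.

```python
from difflib import get_close_matches
from typing import Optional


def resolve_course_name(query: str, titles: list[str], cutoff: float = 0.4) -> Optional[str]:
    """Return the known course title closest to `query`, or None if nothing is close."""
    # Compare case-insensitively, but return the title in its original casing.
    lowered = {title.lower(): title for title in titles}
    matches = get_close_matches(query.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


# Hypothetical catalog; real titles would come from the indexed documents.
catalog = ["MCP", "Advanced Retrieval", "Computer Vision"]
print(resolve_course_name("advanced retreival", catalog))
```

A character-level matcher like this tolerates typos but not paraphrases, which is one reason a semantic lookup may be the better fit for course names.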

🌐 User Experience

  • Comprehensive Answers: Multi-dimensional responses combining different information sources
  • Real-time Processing: 10-second timeout protection with graceful degradation
  • Interactive Web Interface: Modern responsive design with markdown rendering
  • Source Transparency: Complete citation tracking for all information

Overview

This full-stack web application enables users to query course materials and receive intelligent, context-aware responses. The system can handle simple questions with single searches or complex queries requiring multiple tool calls and information synthesis.

Example Complex Queries:

  • "Compare MCP course structure with Advanced Retrieval course content"
  • "Give me the MCP course outline, then find specific details about lesson 3"
  • "What does the Computer Vision course teach about neural networks, and how does it compare to other courses?"

Prerequisites

  • Python 3.13 or higher
  • uv (Python package manager) or pip
  • AI API key (DeepSeek or Anthropic Claude)
  • For Windows: use Git Bash (bundled with Git for Windows) to run the application commands

Installation

Option 1: Using uv (Recommended)

  1. Install uv (if not already installed)

    curl -LsSf https://astral.sh/uv/install.sh | sh
  2. Install Python dependencies

    uv sync

Option 2: Using pip

pip install chromadb==1.0.15 anthropic==0.58.2 "openai>=1.35.0" sentence-transformers==5.0.0 fastapi==0.116.1 uvicorn==0.35.0 python-multipart==0.0.20 python-dotenv==1.1.1

Configuration

Create a .env file in the root directory with your preferred AI provider:

DeepSeek API (Default - Recommended)

# AI Provider Selection
AI_PROVIDER=deepseek

# DeepSeek API Configuration  
DEEPSEEK_API_KEY=your_deepseek_key_here
DEEPSEEK_MODEL=deepseek-chat
DEEPSEEK_BASE_URL=https://api.deepseek.com/v1

Anthropic Claude (Alternative)

# AI Provider Selection
AI_PROVIDER=anthropic

# Anthropic API Configuration
ANTHROPIC_API_KEY=your_anthropic_key_here
ANTHROPIC_MODEL=claude-sonnet-4-20250514
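
As a rough illustration of how these variables might be consumed, the sketch below reads them from a mapping and fails early when the key is missing. The function name `load_provider_config`, the returned dictionary shape, and the fallback defaults are assumptions, not the project's actual code.

```python
import os
from collections.abc import Mapping


def load_provider_config(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Select provider settings using the variable names from the .env examples."""
    provider = env.get("AI_PROVIDER", "deepseek").lower()
    if provider == "deepseek":
        config = {
            "provider": provider,
            "api_key": env.get("DEEPSEEK_API_KEY", ""),
            # Fallback values mirror the sample .env; treating them as defaults is an assumption.
            "model": env.get("DEEPSEEK_MODEL", "deepseek-chat"),
            "base_url": env.get("DEEPSEEK_BASE_URL", "https://api.deepseek.com/v1"),
        }
    elif provider == "anthropic":
        config = {
            "provider": provider,
            "api_key": env.get("ANTHROPIC_API_KEY", ""),
            "model": env.get("ANTHROPIC_MODEL", "claude-sonnet-4-20250514"),
        }
    else:
        raise ValueError(f"Unsupported AI_PROVIDER: {provider!r}")
    if not config["api_key"]:
        raise ValueError(f"Missing API key for provider {provider!r}")
    return config
```

In an application this would typically run after `load_dotenv()` from python-dotenv (already in the dependency list) has copied the `.env` values into the process environment.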

Running the Application

Quick Start

Use the provided shell script:

chmod +x run.sh
./run.sh

Manual Start

cd backend
python -m uvicorn app:app --reload --port 8000

The application will be available at:

  • Web Interface: http://localhost:8000
  • API Documentation: http://localhost:8000/docs
  • ReDoc Documentation: http://localhost:8000/redoc

Usage Examples

Simple Queries (Single Tool Call)

What is the MCP course about?
How many lessons are in the Advanced Retrieval course?

Complex Queries (Sequential Tool Calls)

Give me the MCP course outline, then explain what lesson 3 covers
Compare the structure of MCP and Computer Vision courses
What programming concepts are taught across all courses?

Testing Sequential Tool Calls

Test the advanced sequential calling feature:

# Test via API
curl -X POST http://127.0.0.1:8000/api/query \
-H "Content-Type: application/json" \
-d '{"query": "Give me the MCP course outline, then tell me what lesson 3 covers"}'

# Test via Python
cd backend
python test_manual.py
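
For scripted checks, the same request as the curl command can be sent from Python using only the standard library. This is a sketch under assumptions: `build_query_request` and `ask` are hypothetical helpers (not the contents of test_manual.py), and the response is assumed to be a JSON object.

```python
import json
import urllib.request
from typing import Optional

API_URL = "http://127.0.0.1:8000/api/query"


def build_query_request(query: str, session_id: Optional[str] = None) -> urllib.request.Request:
    """Build the POST request that the curl example sends."""
    payload = {"query": query}
    if session_id is not None:
        payload["session_id"] = session_id
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str, timeout: float = 30.0) -> dict:
    """Send the query to a running server and return the decoded JSON body."""
    with urllib.request.urlopen(build_query_request(query), timeout=timeout) as response:
        return json.load(response)
```

With the server running, `ask("Give me the MCP course outline, then tell me what lesson 3 covers")` would exercise the sequential path.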

Architecture

Sequential Tool Calling System

The sequential tool calling architecture lets the AI decide, per query, whether a single tool call is enough or whether a second call that builds on the first result is needed:

User Query → AI Analysis → Tool Selection → Tool Execution → Result Integration
                ↓              ↓              ↓              ↓
             Complex?    → Tool 1 Call  → Tool 2 Call → Final Response

Key Components:

  • ToolCallSession: Manages state across multiple tool calls (max 2 rounds, 10s timeout)
  • AI Generator: Routes to DeepSeek/Anthropic with sequential call support
  • Tool Manager: Coordinates between course outline and content search tools
  • Vector Store: ChromaDB with dual collections (course metadata + content chunks)
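
The state that a session object needs to track can be sketched as below. This is a hypothetical stand-in that shares only its name and limits (2 rounds, 10 seconds) with the real `ToolCallSession`; the field and method names are assumptions.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ToolCallSession:
    """Hypothetical sketch of per-query state across tool-call rounds."""

    max_rounds: int = 2
    timeout_seconds: float = 10.0
    started_at: float = field(default_factory=time.monotonic)
    results: list[str] = field(default_factory=list)

    def record(self, tool_output: str) -> None:
        """Store the output of one completed tool-call round."""
        self.results.append(tool_output)

    def timed_out(self) -> bool:
        """True once the time budget for this session is used up."""
        return time.monotonic() - self.started_at >= self.timeout_seconds

    def can_continue(self) -> bool:
        """True while both the round limit and the time budget allow another call."""
        return len(self.results) < self.max_rounds and not self.timed_out()
```

Using `time.monotonic()` rather than wall-clock time keeps the budget unaffected by system clock adjustments.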

Data Flow

  1. Single Tool Scenario: Simple query → AI calls one tool → Response
  2. Sequential Tool Scenario: Complex query → AI calls Tool 1 → Reviews result → AI calls Tool 2 → Integrated response

Performance Features

  • Timeout Protection: 10-second maximum processing time
  • Graceful Degradation: Falls back to available results if later tools fail
  • Smart Termination: Stops when no more tool calls are needed or a limit is reached
  • Error Recovery: Continues processing even if individual tool calls fail
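
How these four behaviours could fit into one control loop is sketched below. It is an assumption-laden illustration, not the code in ai_generator.py: `run_tool_rounds` and the `Planner` callback are hypothetical, and the deadline is only checked between rounds, so it does not interrupt a call already in flight.

```python
import time
from typing import Callable, Optional

# A planner looks at the results so far and returns the next tool call,
# or None when it has enough information to answer.
ToolCall = Callable[[], str]
Planner = Callable[[list[str]], Optional[ToolCall]]


def run_tool_rounds(plan_next: Planner, max_rounds: int = 2, timeout_seconds: float = 10.0) -> list[str]:
    """Run up to `max_rounds` tool calls, keeping whatever results are available."""
    deadline = time.monotonic() + timeout_seconds
    results: list[str] = []
    for _ in range(max_rounds):
        if time.monotonic() >= deadline:
            break  # Timeout protection: answer from what has been gathered.
        tool_call = plan_next(results)
        if tool_call is None:
            break  # Smart termination: no further tool call is needed.
        try:
            results.append(tool_call())
        except Exception as error:
            # Error recovery: record the failure and let processing continue.
            results.append(f"[tool failed: {error}]")
    return results
```

Because earlier results stay in the list when a later call fails, the final answer can still be built from them, which is the graceful degradation described above.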

API Endpoints

  • POST /api/query - Main query endpoint with session management
    {
      "query": "Your question here",
      "session_id": "optional-session-id"
    }
  • GET /api/courses - Course statistics and metadata
  • GET /docs - Interactive API documentation
  • GET /redoc - Alternative API documentation
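
The request body for POST /api/query can be checked with a few lines of standard-library Python. This sketch is an assumption: a FastAPI application would normally declare a Pydantic model for this, and the names `QueryRequest` and `parse_query_request` are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueryRequest:
    """Mirror of the documented request body: a query plus an optional session id."""

    query: str
    session_id: Optional[str] = None


def parse_query_request(body: str) -> QueryRequest:
    """Decode and validate a JSON request body for the query endpoint."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("body must be a JSON object")
    query = data.get("query")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("'query' must be a non-empty string")
    session_id = data.get("session_id")
    if session_id is not None and not isinstance(session_id, str):
        raise ValueError("'session_id' must be a string when present")
    return QueryRequest(query=query, session_id=session_id)
```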

Troubleshooting

Common Issues

Sequential Tool Calls Not Working

  • Verify AI provider is properly configured in .env
  • Check API key validity and rate limits
  • Test with simple queries first, then complex ones

Slow Response Times

  • Normal for sequential calls: 6-15 seconds for complex queries
  • Single tool calls: 2-6 seconds
  • Check network connection and API response times

Import or Module Errors

# Ensure all dependencies are installed
pip install -r requirements.txt  # if available
# or
uv sync

API Key Issues

# Verify .env file exists in root directory
cat .env

# Test API connectivity
curl -X POST http://127.0.0.1:8000/api/query \
-H "Content-Type: application/json" \
-d '{"query": "test"}'

Database Issues

# Reset vector database if needed
rm -rf backend/chroma_db/
# Restart application to rebuild from docs/ folder

Development

Adding New Course Materials

  1. Place .txt files in the docs/ directory
  2. Restart the application
  3. New courses will be automatically processed and indexed
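
The discovery step at startup presumably amounts to scanning docs/ for text files. A minimal sketch of that scan, assuming a flat directory and a hypothetical helper name `find_course_files`:

```python
from pathlib import Path


def find_course_files(docs_dir: str = "docs") -> list[Path]:
    """Return the .txt files in `docs_dir`, sorted by path; empty if the folder is missing."""
    root = Path(docs_dir)
    if not root.is_dir():
        return []
    # Only the top level is scanned; nested folders are ignored in this sketch.
    return sorted(path for path in root.glob("*.txt") if path.is_file())
```

Running it before a restart is a quick way to confirm that a new file has the expected extension and location.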

Testing Framework

  • Unit Tests: backend/tests/test_sequential_tools.py
  • Manual Testing: backend/test_manual.py
  • API Testing: Use provided curl commands or test scripts

Code Structure

backend/
├── ai_generator.py   # Sequential tool calling logic
├── search_tools.py   # Course outline & content search tools
├── vector_store.py   # ChromaDB integration
├── rag_system.py     # Main orchestrator
├── app.py            # FastAPI application
└── tests/            # Test suites

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Implement changes with tests
  4. Update documentation as needed
  5. Submit a pull request

License

This project is intended for educational and research purposes.
