Agentic RAG with Knowledge Graph

Agentic knowledge retrieval, redefined: an AI agent system that combines traditional RAG (vector search) with knowledge graph capabilities to analyze and provide insights about big tech companies and their AI initiatives. The system uses PostgreSQL with pgvector for semantic search and Neo4j with Graphiti for temporal knowledge graphs. The goal is to create Agentic RAG at its finest.

Built with:

  • Pydantic AI for the AI Agent Framework
  • Graphiti for the Knowledge Graph
  • Postgres with PGVector for the Vector Database
  • Neo4j for the Knowledge Graph Engine (Graphiti connects to this)
  • FastAPI for the Agent API
  • Claude Code for the AI Coding Assistant (See CLAUDE.md, PLANNING.md, and TASK.md)

Overview

This system includes three main components:

  1. Document Ingestion Pipeline: Processes markdown documents using semantic chunking and builds both vector embeddings and knowledge graph relationships
  2. AI Agent Interface: A conversational agent powered by Pydantic AI that can search across both vector database and knowledge graph
  3. Streaming API: FastAPI backend with real-time streaming responses and comprehensive search capabilities

Prerequisites

  • Python 3.11 or higher
  • PostgreSQL database (such as Neon)
  • Neo4j database (for knowledge graph)
  • LLM Provider API key (OpenAI, Ollama, Gemini, etc.)

Installation

1. Set up a virtual environment

# Create and activate virtual environment
python -m venv venv       # python3 on Linux
source venv/bin/activate  # On Linux/macOS
# or
venv\Scripts\activate     # On Windows

2. Install dependencies

pip install -r requirements.txt

3. Set up required tables in Postgres

Execute the SQL in sql/schema.sql to create all necessary tables, indexes, and functions.

Be sure to change the embedding dimensions on lines 31, 67, and 100 based on your embedding model. OpenAI's text-embedding-3-small is 1536 and nomic-embed-text from Ollama is 768 dimensions, for reference.

Note that this script will drop all tables before creating/recreating!
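
If you prefer to apply the schema from Python instead of psql, here is a minimal sketch assuming asyncpg is installed (psql or any other Postgres client works just as well):

import asyncio
import os

import asyncpg

async def apply_schema():
    # Connect with the same DATABASE_URL the application uses
    conn = await asyncpg.connect(os.environ["DATABASE_URL"])
    try:
        # Reminder: schema.sql drops and recreates all tables
        await conn.execute(open("sql/schema.sql").read())
    finally:
        await conn.close()

asyncio.run(apply_schema())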

4. Set up Neo4j

You have a couple of easy options for setting up Neo4j:

Option A: Using Local-AI-Packaged (Simplified setup - Recommended)

  1. Clone the repository: git clone https://github.com/coleam00/local-ai-packaged
  2. Follow the installation instructions to set up Neo4j through the package
  3. Note the username and password you set in its .env file; the URI will be bolt://localhost:7687

Option B: Using Neo4j Desktop

  1. Download and install Neo4j Desktop
  2. Create a new project and add a local DBMS
  3. Start the DBMS and set a password
  4. Note the connection details (URI, username, password)

5. Configure environment variables

Create a .env file in the project root:

# Database Configuration (example Neon connection string)
DATABASE_URL=postgresql://username:[email protected]/neondb

# Neo4j Configuration  
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your_password

# LLM Provider Configuration (choose one)
LLM_PROVIDER=openai
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_KEY=sk-your-api-key
LLM_CHOICE=gpt-4.1-mini

# Embedding Configuration
EMBEDDING_PROVIDER=openai
EMBEDDING_BASE_URL=https://api.openai.com/v1
EMBEDDING_API_KEY=sk-your-api-key
EMBEDDING_MODEL=text-embedding-3-small

# Ingestion Configuration
INGESTION_LLM_CHOICE=gpt-4.1-nano  # Faster model for processing

# Application Configuration
APP_ENV=development
LOG_LEVEL=INFO
APP_PORT=8058

For other LLM providers:

# Ollama (Local)
LLM_PROVIDER=ollama
LLM_BASE_URL=http://localhost:11434/v1
LLM_API_KEY=ollama
LLM_CHOICE=qwen2.5:14b-instruct

# OpenRouter
LLM_PROVIDER=openrouter
LLM_BASE_URL=https://openrouter.ai/api/v1
LLM_API_KEY=your-openrouter-key
LLM_CHOICE=anthropic/claude-3-5-sonnet

# Gemini
LLM_PROVIDER=gemini
LLM_BASE_URL=https://generativelanguage.googleapis.com/v1beta
LLM_API_KEY=your-gemini-key
LLM_CHOICE=gemini-2.5-flash

Quick Start

1. Prepare Your Documents

Add your markdown documents to the documents/ folder:

mkdir -p documents
# Add your markdown files about tech companies, AI research, etc.
# Example: documents/google_ai_initiatives.md
#          documents/microsoft_openai_partnership.md

Converting PDFs to Markdown

The system includes a built-in PDF to Markdown converter using Docling for advanced PDF understanding:

# Convert all PDFs from the pdf/ folder (default behavior)
python convert_pdf.py

# Convert a single PDF file
python convert_pdf.py document.pdf

# Convert a PDF from URL (e.g., arXiv papers)
python convert_pdf.py https://arxiv.org/pdf/2408.09869.pdf

# Convert multiple specific PDFs
python convert_pdf.py file1.pdf file2.pdf file3.pdf

# Convert all PDFs from a custom folder
python convert_pdf.py --folder my_pdfs/

# Specify custom output directory
python convert_pdf.py document.pdf -o custom_output_folder

# Verbose output for debugging
python convert_pdf.py -v

Quick Start: Simply place your PDF files in the pdf/ folder and run python convert_pdf.py to convert them all!

The converter will automatically:

  • Download PDFs from URLs if needed
  • Extract text, tables, and structure using Docling's advanced PDF understanding
  • Convert to clean Markdown format
  • Save files in the documents/ folder (or specified output directory)

You can also use the programmatic API:

from ingestion.pdf_converter import PDFConverter

converter = PDFConverter(output_dir="documents")
output_path = converter.convert_to_markdown("document.pdf")
print(f"Converted to: {output_path}")

Note: For a comprehensive example with extensive content, you can copy the provided big_tech_docs folder:

cp -r big_tech_docs/* documents/

This includes 21 detailed documents about major tech companies and their AI initiatives. Be aware that processing all these files into the knowledge graph will take significant time (potentially 30+ minutes) due to the computational complexity of entity extraction and relationship building.

2. Run Document Ingestion

Important: You must run ingestion first to populate the databases before the agent can provide meaningful responses.

# Basic ingestion with semantic chunking
python -m ingestion.ingest

# Clean existing data and re-ingest everything
python -m ingestion.ingest --clean

# Custom settings for faster processing (no knowledge graph)
python -m ingestion.ingest --chunk-size 800 --no-semantic --verbose

The ingestion process will:

  • Parse and semantically chunk your documents
  • Generate embeddings for vector search
  • Extract entities and relationships for the knowledge graph
  • Store everything in PostgreSQL and Neo4j

Note that this can take a while: building the knowledge graph is very computationally expensive!

3. Configure Agent Behavior (Optional)

Before running the API server, you can customize when the agent uses different tools by modifying the system prompt in agent/prompts.py. The system prompt controls:

  • When to use vector search vs knowledge graph search
  • How to combine results from different sources
  • The agent's reasoning strategy for tool selection
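
For example, a targeted edit might look like the following (the constant name is illustrative; check agent/prompts.py for the actual one):

# agent/prompts.py (illustrative sketch -- the real constant name may differ)
SYSTEM_PROMPT = """You are an assistant analyzing big tech companies and their AI initiatives.

Tool selection rules:
- Use vector_search for questions about topics, strategies, and document content.
- Use graph_search for questions about relationships between companies, people, and technologies.
- Use hybrid_search when a question needs both semantic context and relationships.

Always cite the sources returned by your tools."""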

4. Start the API Server (Terminal 1)

# Start the FastAPI server
python -m agent.api

# Server will be available at http://localhost:8058

5. Use the Command Line Interface (Terminal 2)

The CLI provides an interactive way to chat with the agent and see which tools it uses for each query.

# Start the CLI in a separate terminal from the API (connects to default API at http://localhost:8058)
python cli.py

# Specify the API URL explicitly
python cli.py --url http://localhost:8058

# Connect to a specific port
python cli.py --port 8080

CLI Features

  • Real-time streaming responses - See the agent's response as it's generated
  • Tool usage visibility - Understand which tools the agent used:
    • vector_search - Semantic similarity search
    • graph_search - Knowledge graph queries
    • hybrid_search - Combined search approach
  • Session management - Maintains conversation context
  • Color-coded output - Easy to read responses and tool information

Example CLI Session

🤖 Agentic RAG with Knowledge Graph CLI
============================================================
Connected to: http://localhost:8058

You: What are Microsoft's AI initiatives?

🤖 Assistant:
Microsoft has several major AI initiatives including...

🛠 Tools Used:
  1. vector_search (query='Microsoft AI initiatives', limit=10)
  2. graph_search (query='Microsoft AI projects')

────────────────────────────────────────────────────────────

You: How is Microsoft connected to OpenAI?

🤖 Assistant:
Microsoft has a significant strategic partnership with OpenAI...

🛠 Tools Used:
  1. hybrid_search (query='Microsoft OpenAI partnership', limit=10)
  2. get_entity_relationships (entity='Microsoft')

CLI Commands

  • help - Show available commands
  • health - Check API connection status
  • clear - Clear current session
  • exit or quit - Exit the CLI

6. Test the System

Health Check

curl http://localhost:8058/health

Chat with the Agent (Non-streaming)

curl -X POST "http://localhost:8058/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What are Google'\''s main AI initiatives?"
  }'

Streaming Chat

curl -X POST "http://localhost:8058/chat/stream" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Compare Microsoft and Google'\''s AI strategies",
  }'
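
To consume the stream from Python instead of curl, here is a minimal client sketch using httpx (an assumption; the exact shape of each SSE payload depends on the API implementation):

import httpx

# Stream the agent's response as Server-Sent Events
with httpx.stream(
    "POST",
    "http://localhost:8058/chat/stream",
    json={"message": "Compare Microsoft and Google's AI strategies"},
    timeout=None,
) as response:
    for line in response.iter_lines():
        if line.startswith("data: "):
            print(line[len("data: "):])  # raw event payload; parse as JSON if needed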

How It Works

The Power of Hybrid RAG + Knowledge Graph

This system combines the best of both worlds:

Vector Database (PostgreSQL + pgvector):

  • Semantic similarity search across document chunks
  • Fast retrieval of contextually relevant information
  • Excellent for finding documents about similar topics

Knowledge Graph (Neo4j + Graphiti):

  • Temporal relationships between entities (companies, people, technologies)
  • Graph traversal for discovering connections
  • Perfect for understanding partnerships, acquisitions, and evolution over time

Intelligent Agent:

  • Automatically chooses the best search strategy
  • Combines results from both databases
  • Provides context-aware responses with source citations
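
Conceptually, the agent's tools are registered with Pydantic AI so the model itself decides which search to run. A simplified sketch, not the project's actual code (dependency types and tool bodies are illustrative):

from dataclasses import dataclass

from pydantic_ai import Agent, RunContext

@dataclass
class Deps:
    db_pool: object       # PostgreSQL/pgvector connection pool
    graph_client: object  # Graphiti client backed by Neo4j

agent = Agent("openai:gpt-4.1-mini", deps_type=Deps)

@agent.tool
async def vector_search(ctx: RunContext[Deps], query: str, limit: int = 10) -> list[str]:
    """Semantic similarity search over document chunks."""
    ...  # embed the query and run a pgvector similarity search via ctx.deps.db_pool

@agent.tool
async def graph_search(ctx: RunContext[Deps], query: str) -> list[str]:
    """Fact and relationship search over the knowledge graph."""
    ...  # delegate to the Graphiti client in ctx.deps.graph_client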

Example Queries

The system excels at queries that benefit from both semantic search and relationship understanding:

  • Semantic Questions: "What AI research is Google working on?"

    • Uses vector search to find relevant document chunks about Google's AI research
  • Relationship Questions: "How are Microsoft and OpenAI connected?"

    • Uses knowledge graph to traverse relationships and partnerships
  • Temporal Questions: "Show me the timeline of Meta's AI announcements"

    • Leverages Graphiti's temporal capabilities to track changes over time
  • Complex Analysis: "Compare the AI strategies of FAANG companies"

    • Combines vector search for strategy documents with graph traversal for competitive analysis
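
To illustrate the temporal angle behind the last two query types, Graphiti returns facts as graph edges with validity intervals. A rough sketch using graphiti-core, where exact method signatures may vary by version:

import asyncio

from graphiti_core import Graphiti

async def main():
    graphiti = Graphiti("bolt://localhost:7687", "neo4j", "your_password")
    try:
        # Each result is an edge whose fact carries valid_at/invalid_at timestamps
        results = await graphiti.search("Meta AI announcements")
        for edge in results:
            print(edge.fact)
    finally:
        await graphiti.close()

asyncio.run(main())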

Why This Architecture Works So Well

  1. Complementary Strengths: Vector search finds semantically similar content while knowledge graphs reveal hidden connections

  2. Temporal Intelligence: Graphiti tracks how facts change over time, perfect for the rapidly evolving AI landscape

  3. Flexible LLM Support: Switch between OpenAI, Ollama, OpenRouter, or Gemini based on your needs

  4. Production Ready: Comprehensive testing, error handling, and monitoring

API Documentation

Visit http://localhost:8058/docs for interactive API documentation once the server is running.

Key Features

  • Hybrid Search: Seamlessly combines vector similarity and graph traversal
  • Temporal Knowledge: Tracks how information changes over time
  • Streaming Responses: Real-time AI responses with Server-Sent Events
  • PDF Conversion: Built-in PDF to Markdown converter using Docling for advanced PDF understanding
  • Flexible Providers: Support for multiple LLM and embedding providers
  • Semantic Chunking: Intelligent document splitting using LLM analysis
  • Production Ready: Comprehensive testing, logging, and error handling

Project Structure

agentic-rag-knowledge-graph/
β”œβ”€β”€ agent/                  # AI agent and API
β”‚   β”œβ”€β”€ agent.py           # Main Pydantic AI agent
β”‚   β”œβ”€β”€ api.py             # FastAPI application
β”‚   β”œβ”€β”€ providers.py       # LLM provider abstraction
β”‚   └── models.py          # Data models
β”œβ”€β”€ ingestion/             # Document processing
β”‚   β”œβ”€β”€ ingest.py         # Main ingestion pipeline
β”‚   β”œβ”€β”€ chunker.py        # Semantic chunking
β”‚   β”œβ”€β”€ embedder.py       # Embedding generation
β”‚   └── pdf_converter.py  # PDF to Markdown conversion
β”œβ”€β”€ convert_pdf.py         # CLI tool for PDF conversion
β”œβ”€β”€ pdf_to_markdown.py     # Standalone PDF conversion script
β”œβ”€β”€ cli.py                 # Interactive CLI for the agent
β”œβ”€β”€ sql/                   # Database schema
β”œβ”€β”€ documents/             # Your markdown files
└── tests/                # Comprehensive test suite

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=agent --cov=ingestion --cov-report=html

# Run specific test categories
pytest tests/agent/
pytest tests/ingestion/

Troubleshooting

Common Issues

Database Connection: Ensure your DATABASE_URL is correct and the database is accessible

# Test your connection
psql -d "$DATABASE_URL" -c "SELECT 1;"

Neo4j Connection: Verify your Neo4j instance is running and credentials are correct

# Check if Neo4j's HTTP interface is accessible (adjust credentials as needed)
curl -u neo4j:your_password http://localhost:7474/

No Results from Agent: Make sure you've run the ingestion pipeline first

python -m ingestion.ingest --verbose

LLM API Issues: Check your API key and provider configuration in .env
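
A quick way to isolate provider problems is to call the endpoint directly with the openai package, which works with any OpenAI-compatible base URL like the ones configured above:

import os

from openai import OpenAI

# Reuses the same variables as .env; export them or load them with python-dotenv first
client = OpenAI(
    base_url=os.environ["LLM_BASE_URL"],
    api_key=os.environ["LLM_API_KEY"],
)
response = client.chat.completions.create(
    model=os.environ["LLM_CHOICE"],
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)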


Built with ❤️ using Pydantic AI, FastAPI, PostgreSQL, and Neo4j.
