234 changes: 234 additions & 0 deletions .github/INTEGRATION_TEST_SETUP.md
@@ -0,0 +1,234 @@
# GitHub Copilot Setup Steps for LightRAG Integration Testing

This document describes the steps needed to set up and run the LightRAG integration tests locally or in CI/CD.

## Prerequisites

- Python 3.10 or higher
- Docker and Docker Compose
- Git

## Local Setup Steps

### 1. Clone the Repository

```bash
git clone https://github.com/netbrah/LightRAG.git
cd LightRAG
```

### 2. Set Up Python Virtual Environment

```bash
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
```

### 3. Install Python Dependencies

```bash
pip install --upgrade pip
pip install -e ".[api]"
pip install pytest pytest-asyncio httpx
```

### 4. Start Docker Services

The integration tests require three services:
- **Redis**: For KV and document status storage
- **Neo4j**: For graph storage
- **Milvus**: For vector storage

```bash
cd tests
docker-compose -f docker-compose.integration.yml up -d
```

### 5. Wait for Services to Be Ready

```bash
# Wait for Redis
until docker exec lightrag-test-redis redis-cli ping | grep -q PONG; do sleep 2; done

# Wait for Neo4j (may take up to 2 minutes)
until docker exec lightrag-test-neo4j cypher-shell -u neo4j -p testpassword123 "RETURN 1" 2>/dev/null | grep -q "1"; do sleep 5; done

# Wait for Milvus (may take up to 3 minutes)
until curl -s http://localhost:9091/healthz | grep -q "OK"; do sleep 5; done
```

### 6. Start Mock OpenAI Server

The mock server simulates OpenAI API responses for testing without requiring actual API keys.

```bash
cd tests
python mock_openai_server.py --host 127.0.0.1 --port 8000 &
MOCK_PID=$!

# Wait for it to be ready
until curl -s http://127.0.0.1:8000/health | grep -q "healthy"; do sleep 1; done
```

### 7. Prepare Test Environment

```bash
cd tests
cp .env.integration .env
mkdir -p test_inputs test_rag_storage
```

### 8. Start LightRAG Server

```bash
cd tests
lightrag-server &
LIGHTRAG_PID=$!

# Wait for it to be ready
until curl -s http://localhost:9621/health | grep -q "status"; do sleep 2; done
```

### 9. Run Integration Tests

```bash
cd tests
python integration_test.py
```

### 10. Cleanup

```bash
# Stop servers (run these cleanup commands from the tests/ directory)
kill $LIGHTRAG_PID
kill $MOCK_PID

# Stop Docker services
docker-compose -f docker-compose.integration.yml down -v

# Remove test artifacts
rm -rf test_inputs test_rag_storage .env
```

## Service Configuration Details

### Redis Configuration
- **Port**: 6379
- **Container**: lightrag-test-redis
- **Purpose**: KV storage and document status tracking
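
To confirm that indexing actually wrote data, the Redis keyspace can be inspected directly. Key naming is internal to LightRAG, so treat this only as a sanity check rather than part of the test suite:

```bash
# List a sample of keys written by the KV and doc-status storage
docker exec lightrag-test-redis redis-cli --scan --pattern '*' | head -n 20
```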

### Neo4j Configuration
- **HTTP Port**: 7474
- **Bolt Port**: 7687
- **Container**: lightrag-test-neo4j
- **Credentials**: neo4j/testpassword123
- **Purpose**: Graph knowledge base storage
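
After documents have been indexed, the graph contents can be spot-checked from the command line using the same `cypher-shell` invocation as the readiness check:

```bash
# Count nodes and relationships created by the indexing step
docker exec lightrag-test-neo4j cypher-shell -u neo4j -p testpassword123 \
  "MATCH (n) RETURN count(n) AS nodes"
docker exec lightrag-test-neo4j cypher-shell -u neo4j -p testpassword123 \
  "MATCH ()-[r]->() RETURN count(r) AS relationships"
```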

### Milvus Configuration
- **API Port**: 19530
- **Health Port**: 9091
- **Container**: lightrag-test-milvus
- **Database**: lightrag_test
- **Purpose**: Vector embeddings storage

### Mock OpenAI Server Configuration
- **Port**: 8000
- **Endpoints**:
  - `/v1/chat/completions` - Mock LLM responses
  - `/v1/embeddings` - Mock embedding generation
  - `/health` - Health check
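
Both mock endpoints can be probed by hand while the server is running. The payloads below assume the mock accepts standard OpenAI-style request bodies and ignores the model name; adjust them if `mock_openai_server.py` expects something different:

```bash
# Mock chat completion (model name is a placeholder, likely ignored by the mock)
curl -s http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "mock-llm", "messages": [{"role": "user", "content": "ping"}]}'

# Mock embedding request
curl -s http://127.0.0.1:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "mock-embedding", "input": "ping"}'
```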

### LightRAG Server Configuration
- **Port**: 9621
- **Configuration**: tests/.env.integration
- **Storage Backends**:
  - KV: RedisKVStorage
  - Doc Status: RedisDocStatusStorage
  - Vector: MilvusVectorDBStorage
  - Graph: Neo4JStorage
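
A plausible excerpt from `tests/.env.integration` that selects these backends is shown below. The variable names are assumed to follow the convention of LightRAG's `env.example`; check the actual file for the authoritative spelling:

```bash
# Storage backend selection (assumed variable names)
LIGHTRAG_KV_STORAGE=RedisKVStorage
LIGHTRAG_DOC_STATUS_STORAGE=RedisDocStatusStorage
LIGHTRAG_VECTOR_STORAGE=MilvusVectorDBStorage
LIGHTRAG_GRAPH_STORAGE=Neo4JStorage
```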

## CI/CD Integration

The integration tests are automatically run on every commit via GitHub Actions. See `.github/workflows/integration-test.yml` for the workflow configuration.

### Workflow Triggers
- Pushes to any branch
- Pull requests against any branch
- Manual workflow dispatch

### Workflow Steps
1. Checkout code
2. Set up Python environment
3. Install dependencies
4. Start Docker services (Redis, Neo4j, Milvus)
5. Wait for all services to be healthy
6. Start Mock OpenAI server
7. Configure test environment
8. Start LightRAG server
9. Run integration tests
10. Collect logs on failure
11. Cleanup all resources

## Test Coverage

The integration tests validate:

1. **Health Check**: Server availability and basic functionality
2. **Document Indexing**:
   - File upload (C++ source files)
   - Text insertion
   - Multiple file formats
3. **Query Operations** (see the example request after this list):
   - Naive mode
   - Local mode
   - Global mode
   - Hybrid mode
4. **Structured Data Retrieval**:
   - Entity extraction
   - Relationship mapping
   - Chunk retrieval
5. **Graph Operations**:
   - Graph data retrieval
   - Node and edge counting
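
For a quick manual check of the query modes, the server can be called directly once it is up. The request shape below is a sketch based on the LightRAG query API; verify the exact schema against the server's OpenAPI docs (FastAPI's `/docs` endpoint at `http://localhost:9621/docs`, if enabled):

```bash
# Example query in hybrid mode (swap "hybrid" for naive, local, or global)
curl -s http://localhost:9621/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What does the Calculator class do?", "mode": "hybrid"}'
```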

## Sample Test Repository

The tests use a sample C++ repository located at `tests/sample_cpp_repo/`:
- **Files**: calculator.h, calculator.cpp, utils.h, utils.cpp, main.cpp
- **Purpose**: Demonstrates code indexing and querying capabilities
- **Content**: Simple calculator implementation with documentation
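
As an illustration, one of the sample files can be uploaded by hand instead of going through `integration_test.py`. The endpoint path and form field name below are assumptions about the LightRAG document API; adjust them if the route differs:

```bash
# Hypothetical manual upload of a sample source file (run from the tests/ directory)
curl -s -X POST http://localhost:9621/documents/upload \
  -F "file=@sample_cpp_repo/calculator.cpp"
```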

## Troubleshooting

### Services Not Starting
- Check Docker is running: `docker ps`
- Check port availability: `lsof -i :6379,7687,19530,8000,9621`
- Review Docker logs: `docker-compose -f tests/docker-compose.integration.yml logs`

### Mock Server Issues
- Verify port 8000 is available
- Check mock server logs
- Test health endpoint: `curl http://127.0.0.1:8000/health`

### LightRAG Server Issues
- Check environment file: `tests/.env`
- Review server logs: `cat tests/lightrag.log*`
- Verify storage connections

### Test Failures
- Ensure all services are healthy before running tests
- Check network connectivity between services
- Review test output for specific error messages

## Environment Variables

Key environment variables used in integration tests:

- `LIGHTRAG_API_URL`: LightRAG server URL (default: http://localhost:9621)
- `LLM_BINDING_HOST`: Mock OpenAI server URL (default: http://127.0.0.1:8000)
- `EMBEDDING_BINDING_HOST`: Mock embedding server URL (default: http://127.0.0.1:8000)
- `REDIS_URI`: Redis connection string
- `NEO4J_URI`: Neo4j connection string
- `MILVUS_URI`: Milvus connection string

All configurations are defined in `tests/.env.integration`.
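
For orientation, an excerpt of such a file might look like the following. The connection values are taken from the service settings above, but the exact variable names (particularly for the Neo4j credentials and the Milvus database) are assumptions; `tests/.env.integration` remains the source of truth:

```bash
# LLM and embedding endpoints point at the mock OpenAI server
LLM_BINDING_HOST=http://127.0.0.1:8000
EMBEDDING_BINDING_HOST=http://127.0.0.1:8000

# Storage connections (variable names assumed; values from the services above)
REDIS_URI=redis://localhost:6379
NEO4J_URI=neo4j://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=testpassword123
MILVUS_URI=http://localhost:19530
MILVUS_DB_NAME=lightrag_test
```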
164 changes: 164 additions & 0 deletions .github/workflows/integration-test.yml
@@ -0,0 +1,164 @@
name: Integration Tests

on:
  push:
  pull_request:
  workflow_dispatch:

jobs:
  integration-test:
    name: Full Integration Test
    runs-on: ubuntu-latest
    timeout-minutes: 30

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Python 3.11
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Cache pip packages
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-integration-${{ hashFiles('**/pyproject.toml') }}
          restore-keys: |
            ${{ runner.os }}-pip-integration-
            ${{ runner.os }}-pip-

      - name: Install Python dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e ".[api]"
          pip install pytest pytest-asyncio httpx

      - name: Create minimal frontend stub for testing
        run: |
          mkdir -p lightrag/api/webui
          echo '<!DOCTYPE html><html><head><title>LightRAG Test</title></head><body><h1>Integration Test Mode</h1></body></html>' > lightrag/api/webui/index.html
          echo "Created minimal frontend stub for integration testing"

      - name: Start Docker services (Redis, Neo4j, Milvus)
        run: |
          cd tests
          docker compose -f docker-compose.integration.yml up -d
          echo "Waiting for services to be ready..."

      - name: Wait for Redis
        run: |
          echo "Waiting for Redis to be ready..."
          timeout 60 bash -c 'until docker exec lightrag-test-redis redis-cli ping | grep -q PONG; do sleep 2; done'
          echo "✅ Redis is ready"

      - name: Wait for Neo4j
        run: |
          echo "Waiting for Neo4j to be ready..."
          timeout 120 bash -c 'until docker exec lightrag-test-neo4j cypher-shell -u neo4j -p testpassword123 "RETURN 1" 2>/dev/null | grep -q "1"; do sleep 5; done'
          echo "✅ Neo4j is ready"

      - name: Wait for Milvus
        run: |
          echo "Waiting for Milvus to be ready..."
          timeout 180 bash -c 'until curl -s http://localhost:9091/healthz | grep -q "OK"; do sleep 5; done'
          echo "✅ Milvus is ready"

      - name: Verify services are running
        run: |
          docker ps
          echo "Testing service connectivity..."
          docker exec lightrag-test-redis redis-cli ping
          docker exec lightrag-test-neo4j cypher-shell -u neo4j -p testpassword123 "RETURN 1"
          curl -s http://localhost:9091/healthz

      - name: Start Mock OpenAI Server
        run: |
          echo "Starting Mock OpenAI Server..."
          cd tests
          python mock_openai_server.py --host 127.0.0.1 --port 8000 &
          MOCK_PID=$!
          echo "MOCK_SERVER_PID=${MOCK_PID}" >> $GITHUB_ENV

          # Wait for mock server to be ready
          echo "Waiting for mock server to be ready..."
          timeout 30 bash -c 'until curl -s http://127.0.0.1:8000/health | grep -q "healthy"; do sleep 1; done'
          echo "✅ Mock OpenAI Server is ready (PID: ${MOCK_PID})"

      - name: Prepare test environment
        run: |
          cd tests
          cp .env.integration .env
          mkdir -p test_inputs test_rag_storage
          echo "Environment prepared for testing"

      - name: Start LightRAG Server
        run: |
          cd tests
          echo "Starting LightRAG Server..."
          lightrag-server &
          LIGHTRAG_PID=$!
          echo "LIGHTRAG_SERVER_PID=${LIGHTRAG_PID}" >> $GITHUB_ENV

          # Wait for LightRAG server to be ready
          echo "Waiting for LightRAG server to be ready..."
          timeout 60 bash -c 'until curl -s http://localhost:9621/health | grep -q "status"; do sleep 2; done'
          echo "✅ LightRAG Server is ready (PID: ${LIGHTRAG_PID})"

      - name: Run Integration Tests
        run: |
          cd tests
          python integration_test.py
        env:
          LIGHTRAG_API_URL: http://localhost:9621

      - name: Collect logs on failure
        if: failure()
        run: |
          echo "=== LightRAG Server Logs ==="
          cat tests/lightrag.log* 2>/dev/null || echo "No LightRAG logs found"

          echo "=== Docker Service Logs ==="
          docker compose -f tests/docker-compose.integration.yml logs

      - name: Stop LightRAG Server
        if: always()
        run: |
          if [ ! -z "$LIGHTRAG_SERVER_PID" ]; then
            echo "Stopping LightRAG Server (PID: $LIGHTRAG_SERVER_PID)..."
            kill $LIGHTRAG_SERVER_PID 2>/dev/null || true
            sleep 2
          fi

      - name: Stop Mock OpenAI Server
        if: always()
        run: |
          if [ ! -z "$MOCK_SERVER_PID" ]; then
            echo "Stopping Mock OpenAI Server (PID: $MOCK_SERVER_PID)..."
            kill $MOCK_SERVER_PID 2>/dev/null || true
          fi

      - name: Stop Docker services
        if: always()
        run: |
          cd tests
          docker compose -f docker-compose.integration.yml down -v
          echo "Docker services stopped and volumes removed"

      - name: Cleanup test artifacts
        if: always()
        run: |
          cd tests
          rm -rf test_inputs test_rag_storage .env
          echo "Test artifacts cleaned up"

      - name: Upload test artifacts
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: integration-test-artifacts
          path: |
            tests/lightrag.log*
            tests/test_rag_storage/
          retention-days: 7