diff --git a/.github/workflows/deploy-codeengine.yml b/.github/workflows/deploy-codeengine.yml
new file mode 100644
index 00000000..435a95aa
--- /dev/null
+++ b/.github/workflows/deploy-codeengine.yml
@@ -0,0 +1,56 @@
+name: Deploy to IBM Cloud Code Engine
+
+on:
+  workflow_dispatch:
+    inputs:
+      IBM_CLOUD_REGION:
+        description: 'IBM Cloud region'
+        required: true
+        default: 'us-south'
+      IBM_CLOUD_RESOURCE_GROUP:
+        description: 'IBM Cloud resource group'
+        required: true
+        default: 'Default'
+      CE_PROJECT_NAME:
+        description: 'Code Engine project name'
+        required: true
+      CE_APP_NAME:
+        description: 'Code Engine app name'
+        required: true
+      DOCKER_IMAGE:
+        description: 'Docker image name'
+        required: true
+
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v3
+
+      - name: Install IBM Cloud CLI
+        run: |
+          curl -fsSL https://clis.cloud.ibm.com/install/linux | sh
+          ibmcloud --version
+          ibmcloud plugin install -f code-engine
+
+      - name: Build and push Docker image
+        env:
+          IBM_CLOUD_API_KEY: ${{ secrets.IBM_CLOUD_API_KEY }}
+          IBM_CLOUD_REGION: ${{ github.event.inputs.IBM_CLOUD_REGION }}
+          DOCKER_IMAGE: ${{ github.event.inputs.DOCKER_IMAGE }}
+        run: |
+          ibmcloud login --apikey "$IBM_CLOUD_API_KEY" -r "$IBM_CLOUD_REGION"
+          ibmcloud cr login
+          docker build -t "$DOCKER_IMAGE" -f backend/Dockerfile.codeengine backend
+          docker push "$DOCKER_IMAGE"
+
+      - name: Deploy to Code Engine
+        env:
+          IBM_CLOUD_API_KEY: ${{ secrets.IBM_CLOUD_API_KEY }}
+          IBM_CLOUD_REGION: ${{ github.event.inputs.IBM_CLOUD_REGION }}
+          IBM_CLOUD_RESOURCE_GROUP: ${{ github.event.inputs.IBM_CLOUD_RESOURCE_GROUP }}
+          CE_PROJECT_NAME: ${{ github.event.inputs.CE_PROJECT_NAME }}
+          CE_APP_NAME: ${{ github.event.inputs.CE_APP_NAME }}
+          DOCKER_IMAGE: ${{ github.event.inputs.DOCKER_IMAGE }}
+        run: ./scripts/deploy_codeengine.sh
diff --git a/GEMINI.md b/GEMINI.md
new file mode 100644
index 00000000..88676ef7
--- /dev/null
+++ b/GEMINI.md
@@ -0,0 +1,68 @@
+# RAG Modulo Agentic Development - Gemini
+
+This document outlines the development process for Gemini, an AI agent, working on the RAG Modulo project.
+
+## 🎯 Current Mission: Agentic RAG Platform Development
+
+**Priority:** Enhance the RAG platform with new features, fix bugs, and improve performance.
+
+## 🧠 Development Philosophy
+
+- **Understand First**: Before making any changes, thoroughly understand the codebase, architecture, and existing conventions.
+- **Plan Thoughtfully**: Create a clear and concise plan before implementing any changes.
+- **Implement Systematically**: Execute the plan in a structured manner, with regular verification and testing.
+- **Test Rigorously**: Ensure all changes are covered by tests and that all tests pass.
+- **Document Clearly**: Update documentation to reflect any changes made to the codebase.
+
+## 📋 Project Context Essentials
+
+- **Architecture**: Python FastAPI backend + React frontend.
+- **Focus**: Transform basic RAG into an agentic AI platform with agent orchestration.
+- **Tech Stack**: Python, FastAPI, React, Docker, and a variety of vector databases.
+- **Quality Standards**: >90% test coverage, clean code, and comprehensive documentation.
+
+## 🚀 Development Workflow
+
+### **Phase 1: Research**
+- Understand the codebase structure and dependencies.
+- Validate assumptions before proceeding.
+- Use context compaction to focus on key insights.
+
+### **Phase 2: Planning**
+- Create precise, detailed implementation plans.
+- Outline exact files to edit and verification steps.
+- Compress findings into actionable implementation steps.
+
+### **Phase 3: Implementation**
+- Execute plans systematically with verification.
+- Compact and update context after each stage.
+- Maintain high human engagement for quality.
+
+## 🤖 Agent Development Instructions
+
+### **Quality Gates (Must Follow)**
+- **Pre-Commit**: Always run `make pre-commit-run` and tests before committing.
+- **Test Coverage**: Add comprehensive tests for new features (>90% coverage).
+- **Code Patterns**: Follow existing patterns in `backend/` and `frontend/`.
+- **Branch Strategy**: Create feature branches for each issue (`feature/issue-XXX`).
+- **Commit Messages**: Descriptive commits following conventional format.
+
+### **Technology Stack Commands**
+- **Python**: `poetry run <command>` for all Python operations.
+- **Frontend**: `npm run dev` for React development.
+- **Testing**: `make test-unit-fast`, `make test-integration`.
+- **Linting**: `make lint`, `make fix-all`.
+
+### **Docker Compose Commands (V2 Required)**
+- **Local Development**: `docker compose -f docker-compose.dev.yml up -d`
+- **Build Development**: `docker compose -f docker-compose.dev.yml build backend`
+- **Production Testing**: `make run-ghcr` (uses pre-built GHCR images)
+- **Stop Services**: `docker compose -f docker-compose.dev.yml down`
+
+## ✅ Success Criteria
+
+- All tests pass.
+- Code follows project style.
+- Security guidelines followed.
+- Documentation updated.
+- Issues properly implemented.
diff --git a/backend/Dockerfile.codeengine b/backend/Dockerfile.codeengine
new file mode 100644
index 00000000..ee1b6de4
--- /dev/null
+++ b/backend/Dockerfile.codeengine
@@ -0,0 +1,39 @@
+# Use a slim Python image as a base
+FROM python:3.11-slim as builder
+
+# Set the working directory
+WORKDIR /app
+
+# Install poetry
+RUN pip install poetry
+
+# Copy only the dependency files to leverage Docker cache
+COPY pyproject.toml poetry.lock ./
+
+# Install dependencies into a virtual environment
+RUN poetry config virtualenvs.in-project true && \
+    poetry install --only main --no-root
+
+# Copy the rest of the application code
+COPY . .
+
+# Final stage
+FROM python:3.11-slim
+
+# Set the working directory
+WORKDIR /app
+
+# Copy the virtual environment from the builder stage
+COPY --from=builder /app/.venv ./.venv
+
+# Copy the application code from the builder stage
+COPY --from=builder /app/ .
+
+# Activate the virtual environment
+ENV PATH="/app/.venv/bin:$PATH"
+
+# Expose the port the app runs on
+EXPOSE 8000
+
+# Run the application
+CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
diff --git a/backend/core/config.py b/backend/core/config.py
index c9200167..aad91447 100644
--- a/backend/core/config.py
+++ b/backend/core/config.py
@@ -5,9 +5,10 @@
 from functools import lru_cache
 from typing import Annotated
 
-from pydantic import field_validator
+from pydantic import computed_field, field_validator
 from pydantic.fields import Field
 from pydantic_settings import BaseSettings, SettingsConfigDict
+from sqlalchemy import URL
 
 from core.logging_utils import get_logger
 
@@ -266,10 +267,31 @@ class Settings(BaseSettings):
     oidc_token_url: Annotated[str | None, Field(default=None, alias="OIDC_TOKEN_URL")]
     oidc_userinfo_endpoint: Annotated[str | None, Field(default=None, alias="OIDC_USERINFO_ENDPOINT")]
     oidc_introspection_endpoint: Annotated[str | None, Field(default=None, alias="OIDC_INTROSPECTION_ENDPOINT")]
+    ibm_cloud_api_key: Annotated[str | None, Field(default=None, alias="IBM_CLOUD_API_KEY")]
 
     # JWT settings
     jwt_algorithm: Annotated[str, Field(default="HS256", alias="JWT_ALGORITHM")]
 
+    @computed_field  # type: ignore[misc]
+    @property
+    def database_url(self) -> URL:
+        """Construct database URL from components."""
+        host = self.collectiondb_host
+        # In a test environment, if the host is localhost, it's likely a local dev setup
+        # where the test container is running on a Docker network.
+        # The service name in docker-compose is 'postgres', so we switch to that.
+        if self.testing and host == "localhost":
+            host = os.environ.get("DB_HOST", "postgres")
+
+        return URL.create(
+            drivername="postgresql",
+            username=self.collectiondb_user,
+            password=self.collectiondb_pass,
+            host=host,
+            port=self.collectiondb_port,
+            database=self.collectiondb_name,
+        )
+
     # RBAC settings
     rbac_mapping: Annotated[
         dict[str, dict[str, list[str]]],
diff --git a/backend/rag_solution/doc_utils.py b/backend/rag_solution/doc_utils.py
index f081ecce..a08aa8ab 100644
--- a/backend/rag_solution/doc_utils.py
+++ b/backend/rag_solution/doc_utils.py
@@ -8,6 +8,7 @@
 import os
 import uuid
 
+from core.config import get_settings
 from vectordbs.data_types import Document, DocumentChunk, DocumentChunkMetadata, DocumentMetadata, Source
 
 
@@ -30,6 +31,7 @@ def _get_embeddings_for_doc_utils(text: str | list[str]) -> list[list[float]]:
         Exception: If other unexpected errors occur
     """
     # Import here to avoid circular imports
+    from core.config import get_settings
     from core.custom_exceptions import LLMProviderError  # pylint: disable=import-outside-toplevel
     from sqlalchemy.exc import SQLAlchemyError  # pylint: disable=import-outside-toplevel
 
@@ -37,7 +39,8 @@ def _get_embeddings_for_doc_utils(text: str | list[str]) -> list[list[float]]:
     from rag_solution.generation.providers.factory import LLMProviderFactory  # pylint: disable=import-outside-toplevel
 
     # Create session and get embeddings in one clean flow
-    session_factory = create_session_factory()
+    settings = get_settings()
+    session_factory = create_session_factory(settings)
     db = session_factory()
 
     try:
diff --git a/backend/rag_solution/file_management/database.py b/backend/rag_solution/file_management/database.py
index 46a21573..210ed36b 100644
--- a/backend/rag_solution/file_management/database.py
+++ b/backend/rag_solution/file_management/database.py
@@ -1,10 +1,10 @@
-# backend/rag_solution/file_management/database.py
+"""Database management for the RAG Modulo application."""
 
 import logging
 import os
 from collections.abc import Generator
 
 from core.config import Settings, get_settings
-from sqlalchemy import URL, create_engine
+from sqlalchemy import create_engine
 from sqlalchemy.exc import SQLAlchemyError
 from sqlalchemy.orm import Session, declarative_base, sessionmaker
@@ -16,38 +16,12 @@
 if not os.environ.get("PYTEST_CURRENT_TEST"):
     logger.info("Database module is being imported")
 
+# Get settings once at module level
+settings = get_settings()
 
-# Initialize database components with dependency injection
-def create_database_url(settings: Settings | None = None) -> URL:
-    """Create database URL from settings."""
-    if settings is None:
-        settings = get_settings()
-
-    host = os.environ.get("DB_HOST", settings.collectiondb_host)
-    # When running in Docker test container, use "postgres" as host
-    # When running locally, use "localhost" as host
-    if os.environ.get("PYTEST_CURRENT_TEST") and host == "localhost":
-        host = "postgres"
-
-    database_url = URL.create(
-        drivername="postgresql",
-        username=settings.collectiondb_user,
-        password=settings.collectiondb_pass,
-        host=host,  # Use the adjusted host
-        port=settings.collectiondb_port,
-        database=settings.collectiondb_name,
-    )
-
-    if not os.environ.get("PYTEST_CURRENT_TEST"):
-        logger.debug(f"Database URL: {database_url}")
-
-    return database_url
-
-
-# Create database components using default settings
+# Create database components using settings
 # This maintains backward compatibility while enabling dependency injection
-_default_database_url = create_database_url()
-engine = create_engine(_default_database_url, echo=not bool(os.environ.get("PYTEST_CURRENT_TEST")))
+engine = create_engine(settings.database_url, echo=not bool(os.environ.get("PYTEST_CURRENT_TEST")))
 SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
 Base = declarative_base()
@@ -55,14 +29,10 @@ def create_database_url(settings: Settings | None = None) -> URL:
     logger.info("Base has been created")
 
 
-def create_session_factory(settings: Settings | None = None) -> sessionmaker[Session]:
+def create_session_factory(db_settings: Settings) -> sessionmaker:
     """Create a sessionmaker with injected settings for dependency injection."""
-    if settings is None:
-        settings = get_settings()
-
-    database_url = create_database_url(settings)
-    engine = create_engine(database_url, echo=not bool(os.environ.get("PYTEST_CURRENT_TEST")))
-    return sessionmaker(autocommit=False, autoflush=False, bind=engine)
+    db_engine = create_engine(db_settings.database_url, echo=not bool(os.environ.get("PYTEST_CURRENT_TEST")))
+    return sessionmaker(autocommit=False, autoflush=False, bind=db_engine)
 
 
 def get_db() -> Generator[Session, None, None]:
@@ -83,11 +53,11 @@
         logger.info("Creating a new database session.")
         yield db
     except SQLAlchemyError as e:
-        logger.error(f"A database error occurred: {e}", exc_info=True)
+        logger.error("A database error occurred: %s", e, exc_info=True)
         db.rollback()
         raise
     except Exception as e:
-        logger.error(f"An unexpected error occurred: {e}", exc_info=True)
+        logger.error("An unexpected error occurred: %s", e, exc_info=True)
         raise
     finally:
         db.close()
diff --git a/backend/tests/unit/test_core_config.py b/backend/tests/unit/test_core_config.py
index df5d0fa0..5305227f 100644
--- a/backend/tests/unit/test_core_config.py
+++ b/backend/tests/unit/test_core_config.py
@@ -7,7 +7,7 @@
 from unittest.mock import patch
 
 import pytest
 
-from core.config import Settings
+from backend.core.config import Settings
 
 
 @pytest.mark.unit
@@ -80,15 +80,25 @@ def test_cot_token_budget_multiplier_type(self) -> None:
         assert isinstance(settings.cot_token_budget_multiplier, float)
         assert settings.cot_token_budget_multiplier > 0.0
 
-    def test_cot_integration_with_existing_settings(self, integration_settings: Any) -> None:
-        """Test CoT settings integrate properly with existing configuration."""
-        settings = integration_settings
-
-        # Verify existing settings still work
-        assert hasattr(settings, "jwt_secret_key")
-        assert hasattr(settings, "rag_llm")
+
+@pytest.mark.unit
+class TestDatabaseUrlConfiguration:
+    """Test database_url computed property in Settings."""
 
-        # Verify CoT settings are available
-        assert hasattr(settings, "cot_max_reasoning_depth")
-        assert hasattr(settings, "cot_reasoning_strategy")
-        assert hasattr(settings, "cot_token_budget_multiplier")
+    def test_database_url_construction(self) -> None:
+        """Test that the database_url is constructed correctly from default settings."""
+        settings = Settings()  # type: ignore[call-arg]
+        expected_url = (
+            f"postgresql://{settings.collectiondb_user}:{settings.collectiondb_pass}@"
+            f"{settings.collectiondb_host}:{settings.collectiondb_port}/{settings.collectiondb_name}"
+        )
+        assert str(settings.database_url) == expected_url
+
+    def test_database_url_testing_environment(self) -> None:
+        """Test that the database_url switches host in a testing environment."""
+        settings = Settings(testing=True)  # type: ignore[call-arg]
+        expected_url = (
+            f"postgresql://{settings.collectiondb_user}:{settings.collectiondb_pass}@"
+            f"postgres:{settings.collectiondb_port}/{settings.collectiondb_name}"
+        )
+        assert str(settings.database_url) == expected_url
diff --git a/backend/tests/unit/test_podcast_service_unit.py b/backend/tests/unit/test_podcast_service_unit.py
index 2a42b35c..d163b35f 100644
--- a/backend/tests/unit/test_podcast_service_unit.py
+++ b/backend/tests/unit/test_podcast_service_unit.py
@@ -12,7 +12,7 @@
 import pytest
 from sqlalchemy.ext.asyncio import AsyncSession
 
-from rag_solution.schemas.podcast_schema import (
+from backend.rag_solution.schemas.podcast_schema import (
     AudioFormat,
     PodcastDuration,
     PodcastGenerationInput,
@@ -21,9 +21,9 @@
     VoiceGender,
     VoiceSettings,
 )
-from rag_solution.services.collection_service import CollectionService
-from rag_solution.services.podcast_service import PodcastService
-from rag_solution.services.search_service import SearchService
+from backend.rag_solution.services.collection_service import CollectionService
+from backend.rag_solution.services.podcast_service import PodcastService
+from backend.rag_solution.services.search_service import SearchService
 
 
 @pytest.mark.unit
@@ -103,13 +103,12 @@ async def test_generate_podcast_creates_record(self, mock_service: PodcastServic
         # Mock document count validation
         mock_service.collection_service.count_documents = AsyncMock(return_value=10)  # type: ignore[attr-defined]
 
-        # Mock active podcast count check
-        mock_service.repository.count_active_for_user = AsyncMock(return_value=0)  # type: ignore[method-assign]
+        with patch.object(mock_service.repository, "count_active_for_user", new=AsyncMock(return_value=0)), patch.object(
+            mock_service.repository, "create", new=AsyncMock(return_value=mock_podcast)
+        ) as mock_create:
+            background_tasks = Mock()
+            background_tasks.add_task = Mock()
 
-        background_tasks = Mock()
-        background_tasks.add_task = Mock()
-
-        with patch.object(mock_service.repository, "create", new=AsyncMock(return_value=mock_podcast)) as mock_create:
             result = await mock_service.generate_podcast(podcast_input, background_tasks)
 
             assert result is not None
@@ -138,12 +137,13 @@ async def test_get_podcast_returns_output(self, mock_service: PodcastService) ->
         mock_podcast = Mock()
         mock_podcast.user_id = user_id
 
-        with patch.object(mock_service.repository, "get_by_id", new=AsyncMock(return_value=mock_podcast)) as mock_get:
-            with patch.object(mock_service.repository, "to_schema", return_value=mock_output):
-                result = await mock_service.get_podcast(podcast_id, user_id)
+        with patch.object(mock_service.repository, "get_by_id", new=AsyncMock(return_value=mock_podcast)) as mock_get, patch.object(
+            mock_service.repository, "to_schema", return_value=mock_output
+        ):
+            result = await mock_service.get_podcast(podcast_id, user_id)
 
-                assert result == mock_output
-                mock_get.assert_called_once_with(podcast_id)
+            assert result == mock_output
+            mock_get.assert_called_once_with(podcast_id)
 
     @pytest.mark.asyncio
     async def test_list_user_podcasts(self, mock_service: PodcastService) -> None:
@@ -165,12 +165,13 @@ async def test_delete_podcast(self, mock_service: PodcastService) -> None:
         podcast_id = uuid4()
         user_id = uuid4()
 
-        with patch.object(mock_service.repository, "get_by_id", new=AsyncMock(return_value=Mock(user_id=user_id))):
-            with patch.object(mock_service.repository, "delete", new=AsyncMock(return_value=True)) as mock_delete:
-                result = await mock_service.delete_podcast(podcast_id, user_id)
+        with patch.object(
+            mock_service.repository, "get_by_id", new=AsyncMock(return_value=Mock(user_id=user_id))
+        ), patch.object(mock_service.repository, "delete", new=AsyncMock(return_value=True)) as mock_delete:
+            result = await mock_service.delete_podcast(podcast_id, user_id)
 
-                assert result is True
-                mock_delete.assert_called_once_with(podcast_id)
+            assert result is True
+            mock_delete.assert_called_once_with(podcast_id)
 
 
 @pytest.mark.unit
@@ -191,7 +192,7 @@ def mock_service(self) -> PodcastService:
         )
 
     @pytest.mark.asyncio
-    async def test_validate_podcast_input(self, mock_service: PodcastService) -> None:
+    async def test_validate_podcast_input(self) -> None:
         """Unit: Validates podcast input schema."""
         podcast_input = PodcastGenerationInput(
             user_id=uuid4(),
@@ -247,9 +248,7 @@ async def test_retrieve_content_uses_description_in_query(self, mock_service: Po
         assert description in search_input.question
 
     @pytest.mark.asyncio
-    async def test_retrieve_content_uses_generic_query_without_description(
-        self, mock_service: PodcastService
-    ) -> None:
+    async def test_retrieve_content_uses_generic_query_without_description(self, mock_service: PodcastService) -> None:
         """Unit: _retrieve_content uses generic query if no description."""
         podcast_input = PodcastGenerationInput(
             user_id=uuid4(),
@@ -267,7 +266,7 @@ async def test_retrieve_content_uses_generic_query_without_description(
         assert "Provide a comprehensive overview" in search_input.question
 
     @pytest.mark.asyncio
-    @patch("rag_solution.services.podcast_service.LLMProviderFactory")
+    @patch("backend.rag_solution.services.podcast_service.LLMProviderFactory")
     async def test_generate_script_uses_description_in_prompt(
         self, mock_llm_factory: Mock, mock_service: PodcastService
     ) -> None:
@@ -290,7 +289,7 @@ async def test_generate_script_uses_description_in_prompt(
         assert f"Topic/Focus: {description}" in prompt
 
     @pytest.mark.asyncio
-    @patch("rag_solution.services.podcast_service.LLMProviderFactory")
+    @patch("backend.rag_solution.services.podcast_service.LLMProviderFactory")
     async def test_generate_script_uses_generic_topic_without_description(
         self, mock_llm_factory: Mock, mock_service: PodcastService
     ) -> None:
diff --git a/docs/deployment/code_engine.md b/docs/deployment/code_engine.md
new file mode 100644
index 00000000..ecdee382
--- /dev/null
+++ b/docs/deployment/code_engine.md
@@ -0,0 +1,39 @@
+# Deploying to IBM Cloud Code Engine
+
+This document provides instructions on how to deploy the RAG Modulo application to IBM Cloud Code Engine, a serverless platform that runs containers.
+
+## Prerequisites
+
+Before you begin, you will need:
+
+* An [IBM Cloud account](https://cloud.ibm.com/registration).
+* The [IBM Cloud CLI](https://cloud.ibm.com/docs/cli/cli) installed on your local machine.
+
+## GitHub Secrets Setup
+
+To deploy to IBM Cloud Code Engine, you need to configure the following secret in your GitHub repository:
+
+* `IBM_CLOUD_API_KEY`: Your IBM Cloud API key. You can create one [here](https://cloud.ibm.com/iam/apikeys).
+
+To add a secret to your GitHub repository, go to **Settings > Secrets and variables > Actions** and click **New repository secret**.
+
+## Deployment Steps
+
+1. Go to the **Actions** tab in your GitHub repository.
+2. Under **Workflows**, select **Deploy to IBM Cloud Code Engine**.
+3. Click **Run workflow**.
+4. Fill in the required input parameters:
+    * **IBM Cloud region**: The IBM Cloud region where you want to deploy the application (e.g., `us-south`).
+    * **IBM Cloud resource group**: The IBM Cloud resource group to use (e.g., `Default`).
+    * **Code Engine project name**: The name of the Code Engine project to create or use.
+    * **Code Engine app name**: The name of the application to create in Code Engine.
+    * **Docker image name**: The name of the Docker image to build and push. This should be in the format `us.icr.io/<namespace>/<image-name>`, where `<namespace>` is your container registry namespace and `<image-name>` is the name of the image.
+5. Click **Run workflow**.
+
+## Vector Database
+
+Please note that the default vector database for this deployment is Elasticsearch. If you want to use a different vector database, you will need to change the `VECTOR_DB` setting in `backend/core/config.py` or override the corresponding environment variable.
+
+### Elasticsearch Embedding Dimensions
+
+The embedding dimension for Elasticsearch is now configurable via the `EMBEDDING_DIM` setting. However, there is no automatic migration path for existing Elasticsearch indices. If you have an existing index and you change the embedding dimension, you will need to re-index your data.
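
Note on the deploy step: the workflow's final step calls `./scripts/deploy_codeengine.sh`, which is not included in this diff. The sketch below is a minimal, hypothetical version of such a script, assuming the standard `ibmcloud` CLI with the `code-engine` (`ce`) plugin and the environment variables exported by the workflow; the actual script in the repository may differ.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of scripts/deploy_codeengine.sh (not part of this diff).
# Assumes IBM_CLOUD_API_KEY, IBM_CLOUD_REGION, IBM_CLOUD_RESOURCE_GROUP,
# CE_PROJECT_NAME, CE_APP_NAME and DOCKER_IMAGE are exported by the workflow step.
set -euo pipefail

# Authenticate and target the requested region and resource group.
ibmcloud login --apikey "$IBM_CLOUD_API_KEY" -r "$IBM_CLOUD_REGION" -g "$IBM_CLOUD_RESOURCE_GROUP"

# Select the Code Engine project, creating it first if it does not exist yet.
if ! ibmcloud ce project select --name "$CE_PROJECT_NAME" 2>/dev/null; then
  ibmcloud ce project create --name "$CE_PROJECT_NAME"
  ibmcloud ce project select --name "$CE_PROJECT_NAME"
fi

# Create the app on the first deployment, otherwise roll out the new image.
if ibmcloud ce app get --name "$CE_APP_NAME" >/dev/null 2>&1; then
  ibmcloud ce app update --name "$CE_APP_NAME" --image "$DOCKER_IMAGE"
else
  ibmcloud ce app create --name "$CE_APP_NAME" --image "$DOCKER_IMAGE" --port 8000
fi
```

If the image is pushed to a private IBM Container Registry namespace, the app also needs a Code Engine registry access secret so it can pull from `us.icr.io`; see `ibmcloud ce registry create` in the IBM Cloud documentation.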