
LCORE-383: Removing attachment metadata at conversation end point and streamlining o…#983

Open
sriroopar wants to merge 1 commit into lightspeed-core:main from sriroopar:conversations_attachment_metadata

Conversation


@sriroopar sriroopar commented Jan 12, 2026

…utput.

Description

New fields are added to the conversation metadata to better represent attachment-related information and to enable richer insights from conversation history.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
    Cursor
  • Generated by: (e.g., tool name and version; N/A if not used)
    Claude Code (minimal use)

Related Tickets & Documents

  • Related Issue #
  • Closes: LCORE-383

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
    A query with an attachment was sent to /conversations and the chat_history struct was verified manually.

  • How were the fix/results from this change verified? Please provide relevant screenshots or results.
    A query with an attachment was sent to /conversations and the chat_history struct was verified manually, then re-verified through the Swagger UI.

Summary by CodeRabbit

  • New Features
    • Added support for attachments in conversations. Attachments are now stored and cached across different storage backends, allowing users to include files alongside their messages and query results.


openshift-ci bot commented Jan 12, 2026

Hi @sriroopar. Thanks for your PR.

I'm waiting for a lightspeed-core member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


coderabbitai bot commented Jan 12, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

The changes extend the caching and message infrastructure to support attachments. A new attachments field is added to the CacheEntry data model, serialized and deserialized in both SQLite and PostgreSQL cache implementations, included in the Message response model, and passed through the conversation endpoints and query utility.

Changes

  • Cache Data Model (src/models/cache_entry.py): Added an optional attachments: Optional[list["Attachment"]] field with TYPE_CHECKING imports to avoid runtime circular dependencies.
  • Cache Implementations (src/cache/sqlite_cache.py, src/cache/postgres_cache.py): Added an attachments column to the schema; extended get() to deserialize attachments JSON into Attachment objects; extended insert_or_append() to serialize attachments to JSON. Both include error logging for serialization/deserialization failures.
  • Response Models (src/models/responses.py): Added an optional attachments: Optional[list["Attachment"]] field to the Message class with a TYPE_CHECKING import.
  • Endpoints & Utilities (src/app/endpoints/conversations_v2.py, src/utils/query.py): Extended message construction and cache entry creation to include the attachments field from request/cache data.
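The TYPE_CHECKING pattern mentioned for the cache and response models can be sketched as follows. This is a stdlib dataclass stand-in for the PR's actual Pydantic model; the models.requests import path is taken from the review, everything else is illustrative:

```python
from __future__ import annotations  # annotations stay strings at runtime

from dataclasses import dataclass
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Imported only for static type checkers; skipped at runtime,
    # which avoids a circular import between the cache model and
    # the request models.
    from models.requests import Attachment


@dataclass
class CacheEntry:
    """Illustrative stand-in for the Pydantic model in the PR."""

    query: str
    response: str
    attachments: Optional[list["Attachment"]] = None
```

Because annotations are deferred, the Attachment name never needs to exist at runtime, yet type checkers still validate it.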

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • tisnik
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Title check ⚠️ Warning: The PR title is truncated, ending with "and streamlining o…", which appears to be cut off mid-word. Resolution: provide the complete, untruncated PR title that clearly describes the changes made in this pull request.

✅ Passed checks (2 passed)

  • Description check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Docstring coverage ✅ Passed: Docstring coverage is 100.00%, which meets the required threshold of 80.00%.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cache/sqlite_cache.py (1)

47-62: Add schema migration to support existing databases with the new attachments column.

The CREATE TABLE IF NOT EXISTS statement won't add the new attachments column to existing tables. When existing databases execute the SELECT_CONVERSATION_HISTORY_STATEMENT (line 80) or INSERT_CONVERSATION_HISTORY_STATEMENT (line 88), they will fail with a "no such column" error.

Add an ALTER TABLE cache ADD COLUMN attachments text statement in the initialize_cache() method after the CREATE TABLE statements, guarded by a try/except (SQLite's ALTER TABLE does not support ADD COLUMN IF NOT EXISTS), to handle schema evolution for existing deployments.

🧹 Nitpick comments (1)
src/cache/sqlite_cache.py (1)

234-258: LGTM!

The deserialization logic correctly follows the established pattern used for referenced_documents. Error handling gracefully degrades to None on failure with appropriate warning logs.

Minor style note: Line 245 uses implicit string concatenation ("Failed to deserialize attachments for " "conversation %s: %s"). This is valid Python but could use a single string for consistency with the rest of the codebase.
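The style nitpick above boils down to a two-line comparison (message text taken from the review; both forms produce the same string):

```python
# Implicit string concatenation, as flagged in the review:
implicit = "Failed to deserialize attachments for " "conversation %s: %s"
# Equivalent single literal, consistent with the rest of the codebase:
single = "Failed to deserialize attachments for conversation %s: %s"
assert implicit == single
```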

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2bc4bdb and 48b6aa0.

📒 Files selected for processing (6)
  • src/app/endpoints/conversations_v2.py
  • src/app/endpoints/query.py
  • src/cache/sqlite_cache.py
  • src/models/cache_entry.py
  • src/models/responses.py
  • src/utils/endpoints.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends
Use Llama Stack imports: from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = logging.getLogger(__name__) pattern for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types, using typing_extensions.Self for model validators
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional parameters
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use ABC for abstract base classes with @abstractmethod decorators
Complete type annotations for all class attributes
Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with sections: Args, Returns, Raises, Attributes
Run uv run make format to auto-format code with black and ruff before completion
Run uv run make verify to run all linters (black, pyl...

Files:

  • src/utils/endpoints.py
  • src/models/cache_entry.py
  • src/models/responses.py
  • src/app/endpoints/query.py
  • src/app/endpoints/conversations_v2.py
  • src/cache/sqlite_cache.py
src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Use @field_validator and @model_validator for custom validation in Pydantic models
Extend BaseModel for data Pydantic models
Use @model_validator and @field_validator for Pydantic model validation

Files:

  • src/models/cache_entry.py
  • src/models/responses.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/query.py
  • src/app/endpoints/conversations_v2.py
🧠 Learnings (1)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

  • src/models/cache_entry.py
  • src/models/responses.py
🧬 Code graph analysis (2)
src/models/cache_entry.py (1)
src/models/requests.py (1)
  • Attachment (16-70)
src/cache/sqlite_cache.py (1)
src/models/requests.py (1)
  • Attachment (16-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: server mode / ci
🔇 Additional comments (11)
src/app/endpoints/conversations_v2.py (1)

243-250: LGTM! Attachments handling follows existing patterns.

The implementation correctly:

  1. Uses explicit type annotation for the dict
  2. Conditionally adds attachments only when present
  3. Uses model_dump(mode="json") consistently with how referenced_documents is serialized (lines 255-257)
src/app/endpoints/query.py (1)

378-387: LGTM! Attachments properly propagated to cache entry.

The attachments=query_request.attachments or None pattern correctly handles the case where attachments might be an empty list, converting it to None for consistency with the CacheEntry model's optional field definition.
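The "or None" normalization described above can be sketched as follows (the function name is illustrative, not the PR's actual code):

```python
def normalize_attachments(attachments):
    """Collapse an empty list (or None) to None, so the optional
    CacheEntry field stays None when there is nothing to store."""
    return attachments or None
```

An empty list is falsy in Python, so normalize_attachments([]) yields None, matching how referenced_documents and the other optional fields behave.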

src/models/cache_entry.py (2)

6-6: LGTM! Import follows project conventions.

Absolute import from models.requests follows the coding guidelines.


18-28: LGTM! Field definition follows best practices.

The new attachments field:

  • Uses modern union syntax list[Attachment] | None as per guidelines
  • Has proper default value for optional field
  • Is documented in the class docstring
src/utils/endpoints.py (1)

801-810: LGTM! Streaming path matches non-streaming implementation.

The attachments handling in cleanup_after_streaming mirrors the implementation in query.py (line 386), ensuring consistent behavior between streaming and non-streaming query responses.

src/models/responses.py (2)

759-782: LGTM! Documentation accurately reflects new attachments capability.

The updated description and examples correctly demonstrate:

  • Attachments are optional within user messages
  • The attachment structure matches the Attachment model (attachment_type, content_type, content)
  • Examples are realistic and helpful for API consumers

784-823: LGTM! Model config examples are comprehensive.

The examples properly show:

  1. A conversation turn with attachments (first entry)
  2. A conversation turn without attachments (second entry)

This demonstrates both use cases clearly for API documentation.

src/cache/sqlite_cache.py (4)

12-12: LGTM!

Import follows the absolute import pattern specified in the coding guidelines and correctly references the Attachment model from models.requests.


79-84: LGTM!

The SELECT statement correctly adds the attachments column, and the column index (7) aligns with the deserialization code below.


86-90: LGTM!

The INSERT statement correctly includes the attachments column with matching placeholder count (11 columns, 11 placeholders).


304-331: LGTM with a note on error handling.

The serialization logic correctly follows the established pattern for referenced_documents.

Note: If serialization fails, attachments_json remains None and the data is silently lost. This matches the existing behavior for referenced_documents, but consider whether attachment data loss should be logged at a higher level (e.g., logger.error) or propagate an exception depending on business requirements.


openshift-ci bot commented Jan 12, 2026

@sriroopar: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@sriroopar
Author

/retest


openshift-ci bot commented Jan 12, 2026

@sriroopar: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

Details

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jrobertboos
Contributor

Hi @sriroopar, could you please fix the black and lint errors?

  • You can fix the black errors with make format.
  • The lint errors will each need to be resolved individually.

Before you commit, make sure you run make verify and that no errors are returned.

Thanks!

@sriroopar sriroopar force-pushed the conversations_attachment_metadata branch from 48b6aa0 to 32af4ec on January 13, 2026 at 16:55.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cache/sqlite_cache.py (1)

345-358: Critical bug: INSERT statement parameter mismatch causes runtime error.

The INSERT statement expects 13 parameters (line 96), but only 10 values are provided; the serialization and insertion of tool_calls, tool_results, and attachments are missing. This will raise sqlite3.ProgrammingError: Incorrect number of bindings supplied.

🐛 Proposed fix - serialize and include missing fields
         referenced_documents_json = None
         if cache_entry.referenced_documents:
             try:
                 docs_as_dicts = [
                     doc.model_dump(mode="json")
                     for doc in cache_entry.referenced_documents
                 ]
                 referenced_documents_json = json.dumps(docs_as_dicts)
             except (TypeError, ValueError) as e:
                 logger.warning(
                     "Failed to serialize referenced_documents for "
                     "conversation %s: %s",
                     conversation_id,
                     e,
                 )
 
+        tool_calls_json = None
+        if cache_entry.tool_calls:
+            try:
+                tool_calls_json = json.dumps(
+                    [tc.model_dump(mode="json") for tc in cache_entry.tool_calls]
+                )
+            except (TypeError, ValueError) as e:
+                logger.warning(
+                    "Failed to serialize tool_calls for conversation %s: %s",
+                    conversation_id,
+                    e,
+                )
+
+        tool_results_json = None
+        if cache_entry.tool_results:
+            try:
+                tool_results_json = json.dumps(
+                    [tr.model_dump(mode="json") for tr in cache_entry.tool_results]
+                )
+            except (TypeError, ValueError) as e:
+                logger.warning(
+                    "Failed to serialize tool_results for conversation %s: %s",
+                    conversation_id,
+                    e,
+                )
+
+        attachments_json = None
+        if cache_entry.attachments:
+            try:
+                attachments_json = json.dumps(
+                    [att.model_dump(mode="json") for att in cache_entry.attachments]
+                )
+            except (TypeError, ValueError) as e:
+                logger.warning(
+                    "Failed to serialize attachments for conversation %s: %s",
+                    conversation_id,
+                    e,
+                )
+
         cursor.execute(
             self.INSERT_CONVERSATION_HISTORY_STATEMENT,
             (
                 user_id,
                 conversation_id,
                 current_time,
                 cache_entry.started_at,
                 cache_entry.completed_at,
                 cache_entry.query,
                 cache_entry.response,
                 cache_entry.provider,
                 cache_entry.model,
                 referenced_documents_json,
+                tool_calls_json,
+                tool_results_json,
+                attachments_json,
             ),
         )
🤖 Fix all issues with AI agents
In @src/cache/sqlite_cache.py:
- Around lines 274-287: The code reads attachments from the wrong tuple index, conversation_entry[7], which holds tool_calls. Update the attachments deserialization to use the correct index, conversation_entry[9], so that attachments_json_str points to the attachments column; keep the existing json.loads and Attachment.model_validate logic intact. Locate the attachments handling block that defines attachments_json_str and attachments_obj and change the index from 7 to 9.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 48b6aa0 and 32af4ec.

📒 Files selected for processing (6)
  • src/app/endpoints/conversations_v2.py
  • src/app/endpoints/query.py
  • src/cache/sqlite_cache.py
  • src/models/cache_entry.py
  • src/models/responses.py
  • src/utils/endpoints.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/models/responses.py
  • src/models/cache_entry.py
  • src/app/endpoints/conversations_v2.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

(Same **/*.py path-based guidelines as in the first review above.)

Files:

  • src/app/endpoints/query.py
  • src/utils/endpoints.py
  • src/cache/sqlite_cache.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/query.py
🧬 Code graph analysis (1)
src/cache/sqlite_cache.py (1)
src/models/requests.py (1)
  • Attachment (16-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: server mode / ci
🔇 Additional comments (2)
src/app/endpoints/query.py (1)

377-388: LGTM!

The attachments field is correctly added to the CacheEntry construction, using the or None pattern consistent with other optional fields like referenced_documents, tool_calls, and tool_results.

src/utils/endpoints.py (1)

801-812: LGTM!

The attachments field is correctly added to the CacheEntry construction in the streaming cleanup path, maintaining consistency with the non-streaming query endpoint implementation.

@sriroopar sriroopar force-pushed the conversations_attachment_metadata branch from 32af4ec to d5254da on January 13, 2026 at 17:00.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @src/cache/sqlite_cache.py:
- Around lines 92-97: INSERT_CONVERSATION_HISTORY_STATEMENT declares 13 columns, but the call to cursor.execute only supplies 10 values, causing a binding error. Update the insert call (the cursor.execute in the conversation-history insertion routine) to serialize and include tool_calls, tool_results, and attachments (e.g., json.dumps on those objects, or "null"/"[]" as appropriate) so the argument list has 13 items and its order matches the INSERT column order: user_id, conversation_id, created_at, started_at, completed_at, query, response, provider, model, referenced_documents, tool_calls, tool_results, attachments.
🧹 Nitpick comments (1)
src/cache/sqlite_cache.py (1)

50-67: Database migration required for existing deployments.

The CREATE TABLE IF NOT EXISTS statement won't add the new attachments column to existing databases. Existing deployments will encounter sqlite3.OperationalError: table cache has no column named attachments when inserting or selecting.

Consider adding an ALTER TABLE migration or handling the schema upgrade gracefully.

💡 Example migration approach
# Add to initialize_cache() after CREATE_CACHE_TABLE execution
try:
    cursor.execute("ALTER TABLE cache ADD COLUMN attachments text")
    logger.info("Added attachments column to cache table")
except sqlite3.OperationalError:
    # Column already exists
    pass
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 32af4ec and d5254da.

📒 Files selected for processing (6)
  • src/app/endpoints/conversations_v2.py
  • src/app/endpoints/query.py
  • src/cache/sqlite_cache.py
  • src/models/cache_entry.py
  • src/models/responses.py
  • src/utils/endpoints.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/utils/endpoints.py
  • src/app/endpoints/conversations_v2.py
  • src/models/responses.py
  • src/app/endpoints/query.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

(Same **/*.py path-based guidelines as in the first review above.)

Files:

  • src/models/cache_entry.py
  • src/cache/sqlite_cache.py
src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Use @field_validator and @model_validator for custom validation in Pydantic models
Extend BaseModel for data Pydantic models
Use @model_validator and @field_validator for Pydantic model validation

Files:

  • src/models/cache_entry.py
🧠 Learnings (1)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

  • src/models/cache_entry.py
🧬 Code graph analysis (2)
src/models/cache_entry.py (1)
src/models/requests.py (1)
  • Attachment (16-70)
src/cache/sqlite_cache.py (1)
src/models/requests.py (1)
  • Attachment (16-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: server mode / ci
🔇 Additional comments (2)
src/models/cache_entry.py (1)

7-7: LGTM!

The new attachments field is well-integrated:

  • Uses absolute import as per coding guidelines.
  • Follows the same pattern as other optional list fields (tool_calls, tool_results).
  • Modern union syntax list[Attachment] | None aligns with the coding guidelines.
  • Docstring is updated appropriately.

Also applies to: 23-23, 35-35

src/cache/sqlite_cache.py (1)

274-299: LGTM!

The deserialization logic for attachments follows the established pattern used for referenced_documents, tool_calls, and tool_results:

  • Consistent error handling with json.JSONDecodeError and ValueError.
  • Appropriate warning logs on failure.
  • Uses Attachment.model_validate for proper Pydantic deserialization.
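A minimal sketch of the deserialization pattern this comment describes, using a stdlib dataclass in place of the Pydantic Attachment model (the real code calls Attachment.model_validate(item); the function name and field set here are illustrative):

```python
from __future__ import annotations

import json
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class Attachment:
    """Stdlib stand-in for the Pydantic model in models/requests.py."""

    attachment_type: str
    content_type: str
    content: str


def deserialize_attachments(raw, conversation_id):
    """Parse the attachments JSON column; degrade gracefully to None."""
    if not raw:
        return None
    try:
        # Real code: Attachment.model_validate(item) for each item.
        return [Attachment(**item) for item in json.loads(raw)]
    except (json.JSONDecodeError, TypeError, ValueError) as e:
        logger.warning(
            "Failed to deserialize attachments for conversation %s: %s",
            conversation_id,
            e,
        )
        return None
```

Malformed JSON or unexpected keys produce a warning log and None, mirroring the graceful degradation the review praises for referenced_documents.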

@sriroopar sriroopar force-pushed the conversations_attachment_metadata branch from d5254da to 334ada9 on January 13, 2026 at 17:10.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cache/postgres_cache.py (1)

51-68: Add database migration for the new attachments column in existing PostgreSQL deployments.

The CREATE TABLE IF NOT EXISTS statement only creates the table on first run. Existing PostgreSQL deployments that already have the cache table will not automatically receive the new attachments column, causing INSERT and SELECT statements to fail at runtime.

Provide a migration path for existing deployments—either an ALTER TABLE statement with a CREATE TABLE IF NOT EXISTS fallback, or documentation instructing users to manually add the column before upgrading.
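Unlike SQLite, PostgreSQL (9.6+) does support ADD COLUMN IF NOT EXISTS, so the migration path suggested above can be a single idempotent statement run during cache initialization. A hedged sketch, with illustrative names:

```python
# Idempotent on PostgreSQL 9.6+: a no-op if the column already exists.
MIGRATE_CACHE_TABLE_STATEMENT = (
    "ALTER TABLE cache ADD COLUMN IF NOT EXISTS attachments text"
)


def migrate_cache_table(cursor):
    """Apply the attachments-column migration; safe to run on every start."""
    cursor.execute(MIGRATE_CACHE_TABLE_STATEMENT)
```

Running this after CREATE TABLE IF NOT EXISTS covers both fresh installs and existing deployments without a separate migration tool.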

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5254da and 334ada9.

📒 Files selected for processing (7)
  • src/app/endpoints/conversations_v2.py
  • src/app/endpoints/query.py
  • src/cache/postgres_cache.py
  • src/cache/sqlite_cache.py
  • src/models/cache_entry.py
  • src/models/responses.py
  • src/utils/endpoints.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/app/endpoints/query.py
  • src/utils/endpoints.py
  • src/cache/sqlite_cache.py
  • src/models/cache_entry.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends
Use Llama Stack imports: from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = logging.getLogger(__name__) pattern for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types, using typing_extensions.Self for model validators
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional parameters
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use ABC for abstract base classes with @abstractmethod decorators
Complete type annotations for all class attributes
Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with sections: Args, Returns, Raises, Attributes
Run uv run make format to auto-format code with black and ruff before completion
Run uv run make verify to run all linters (black, pyl...

Files:

  • src/app/endpoints/conversations_v2.py
  • src/models/responses.py
  • src/cache/postgres_cache.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/conversations_v2.py
src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Use @field_validator and @model_validator for custom validation in Pydantic models
Extend BaseModel for data Pydantic models
Use @model_validator and @field_validator for Pydantic model validation

Files:

  • src/models/responses.py
🧠 Learnings (1)
📚 Learning: 2026-01-12T10:58:40.230Z
Learnt from: blublinsky
Repo: lightspeed-core/lightspeed-stack PR: 972
File: src/models/config.py:459-513
Timestamp: 2026-01-12T10:58:40.230Z
Learning: In lightspeed-core/lightspeed-stack, for Python files under src/models, when a user claims a fix is done but the issue persists, verify the current code state before accepting the fix. Steps: review the diff, fetch the latest changes, run relevant tests, reproduce the issue, search the codebase for lingering references to the original problem, confirm the fix is applied and not undone by subsequent commits, and validate with local checks to ensure the issue is resolved.

Applied to files:

  • src/models/responses.py
🧬 Code graph analysis (1)
src/cache/postgres_cache.py (1)
src/models/requests.py (1)
  • Attachment (16-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: server mode / ci
🔇 Additional comments (7)
src/models/responses.py (2)

761-784: LGTM! Documentation and field example updated correctly for attachment support.

The description and example accurately reflect that user messages can now include attachments, and the example structure (attachment_type, content_type, content) matches the Attachment model defined in src/models/requests.py.


793-823: LGTM! Model config examples properly demonstrate attachment-containing messages.

The nested example in model_config correctly shows the new chat history format with attachments in user messages, providing clear OpenAPI documentation for API consumers.

src/cache/postgres_cache.py (4)

85-91: Verify SELECT column index alignment for attachments field.

The SELECT statement retrieves 10 columns (query, response, provider, model, started_at, completed_at, referenced_documents, tool_calls, tool_results, attachments). The code at line 308 accesses conversation_entry[9] for attachments, which is correct (0-indexed).


93-98: LGTM! INSERT statement correctly includes attachments column.

The INSERT statement has 13 columns and 13 placeholders (%s), and the attachments column is properly added at the end.


307-334: LGTM! Attachments deserialization follows established pattern.

The deserialization logic for attachments mirrors the existing patterns for referenced_documents, tool_calls, and tool_results - using Attachment.model_validate() with appropriate error handling and warning logs on failure.


405-436: LGTM! Attachments serialization follows established pattern.

The serialization logic for attachments correctly uses model_dump(mode="json") and json.dumps(), consistent with the handling of other JSONB fields. The parameter is properly passed to the cursor execute call.

src/app/endpoints/conversations_v2.py (1)

241-265: LGTM! Attachments handling in transform_chat_message is well-implemented.

The implementation:

  1. Adds proper type annotation for user_message
  2. Conditionally includes attachments only when present (avoids sending null/empty arrays)
  3. Uses model_dump(mode="json") consistent with the referenced_documents serialization pattern

This aligns well with the API response schema updates in responses.py.

@sriroopar

Hey @jrobertboos, thank you for your comment; I fixed the lint errors. Is it sufficient that the attachment metadata is represented at the conversation endpoint, or do we want it retained in the DB as a new column? Just wanted to confirm the scope of the attachment data. Ty!

@jrobertboos

I am not 100% sure, but I think it should just be represented at the conversation endpoint so that we are not messing with the DB.

@tisnik wdyt?

@sriroopar sriroopar force-pushed the conversations_attachment_metadata branch 2 times, most recently from 261fdf4 to df699f0 Compare January 15, 2026 14:54
@sriroopar sriroopar force-pushed the conversations_attachment_metadata branch from df699f0 to fb2772f Compare February 18, 2026 05:24

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (4)
src/cache/sqlite_cache.py (2)

49-66: Schema migration required for existing deployments.

CREATE TABLE IF NOT EXISTS silently skips creation when the table already exists. Any deployment that upgrades with an existing cache table (without the attachments column) will receive OperationalError: no such column: attachments on the first SELECT or INSERT. Consider adding an ALTER TABLE migration step in initialize_cache() to add the column when missing, or providing a standalone migration script.

# Example migration guard in initialize_cache():
try:
    cursor.execute("ALTER TABLE cache ADD COLUMN attachments text")
except sqlite3.OperationalError:
    # Column already exists; safe to ignore
    pass
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cache/sqlite_cache.py` around lines 49 - 66, The CREATE_CACHE_TABLE uses
IF NOT EXISTS so existing DBs may lack the new attachments column which causes
OperationalError; update initialize_cache() to run an ALTER TABLE cache ADD
COLUMN attachments text migration (using the same DB cursor/connection used for
CREATE_CACHE_TABLE) before any selects/inserts, and catch
sqlite3.OperationalError (or check PRAGMA table_info('cache')) so the code
ignores the error if the column already exists; ensure this migration runs once
during initialization so subsequent calls to functions that reference the
attachments column succeed.
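
An alternative to catching OperationalError is the PRAGMA table_info check the prompt mentions. A minimal, idempotent sketch, with the table and column names taken from the comment above and the helper name hypothetical:

```python
import sqlite3


def ensure_attachments_column(connection: sqlite3.Connection) -> None:
    """Add the attachments column when an older schema lacks it (idempotent)."""
    columns = {row[1] for row in connection.execute("PRAGMA table_info(cache)")}
    if "attachments" not in columns:
        connection.execute("ALTER TABLE cache ADD COLUMN attachments text")
        connection.commit()


# Demo against an in-memory DB with a pre-upgrade schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (conversation_id text, query text)")
ensure_attachments_column(conn)
ensure_attachments_column(conn)  # second call is a no-op
print([row[1] for row in conn.execute("PRAGMA table_info(cache)")])
```

Because the column check and the ALTER TABLE both run inside initialize_cache(), later SELECT/INSERT statements that reference `attachments` succeed on both fresh and upgraded databases.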

311-312: Move Attachment import to module level.

Importing inside the method body means the module is re-imported on every get() call. There is no circular-import risk (models.requests does not import from cache.*), so this can be a standard top-level import.

♻️ Proposed refactor
+from models.requests import Attachment
 from models.cache_entry import CacheEntry
 from models.config import SQLiteDatabaseConfiguration
 from models.responses import ConversationData

Then in get():

-            # Parse attachments back into Attachment objects
-            attachments_json_str = conversation_entry[9]
-            attachments_obj = None
-            if attachments_json_str:
-                try:
-                    from models.requests import Attachment
-
-                    attachments_data = json.loads(attachments_json_str)
+            # Parse attachments back into Attachment objects
+            attachments_json_str = conversation_entry[9]
+            attachments_obj = None
+            if attachments_json_str:
+                try:
+                    attachments_data = json.loads(attachments_json_str)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cache/sqlite_cache.py` around lines 311 - 312, The Attachment class is
imported inside the get() method causing repeated imports; move the import to
module level by adding "from models.requests import Attachment" at the top of
src/cache/sqlite_cache.py and remove the in-method import inside the get()
function so get() uses the already-imported Attachment symbol.
src/cache/postgres_cache.py (2)

50-67: Schema migration required for existing PostgreSQL deployments.

Same concern as sqlite_cache.py: CREATE TABLE IF NOT EXISTS is a no-op when the table exists. Existing databases without the attachments column will fail with psycopg2.errors.UndefinedColumn on the first query. Provide an ALTER TABLE cache ADD COLUMN IF NOT EXISTS attachments jsonb; migration step in initialize_cache() or a separate migration script.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cache/postgres_cache.py` around lines 50 - 67, The cache schema SQL
(CREATE_CACHE_TABLE) only creates the table if missing but doesn't add new
columns to existing tables, causing UndefinedColumn errors for the attachments
column; update initialize_cache() (or add a migration step run during startup)
to execute an ALTER TABLE cache ADD COLUMN IF NOT EXISTS attachments jsonb; (and
any other new jsonb columns like tool_results/tool_calls if needed) after
running CREATE_CACHE_TABLE so existing PostgreSQL deployments get the new column
without manual migration; ensure this runs using the same DB connection logic in
initialize_cache() and handle/log any errors.

326-342: Move Attachment import to module level (mirrors the same issue in sqlite_cache.py).

♻️ Proposed refactor
+from models.requests import Attachment
 from models.cache_entry import CacheEntry
 from models.config import PostgreSQLDatabaseConfiguration
 from models.responses import ConversationData

Then remove the inline import from get():

-                try:
-                    from models.requests import Attachment
-
-                    attachments_obj = [
+                try:
+                    attachments_obj = [
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cache/postgres_cache.py` around lines 326 - 342, The inline import of
Attachment inside the get() block should be moved to the module level: add "from
models.requests import Attachment" to the top-level imports and remove the
inline "from models.requests import Attachment" from the block that parses
attachments (the code referencing attachments_data, attachments_obj and
conversation_entry). Keep the existing try/except and logging behavior but
reference the module-level Attachment when calling Attachment.model_validate to
deserialize attachments.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/models/responses.py`:
- Around line 16-18: The runtime PydanticUndefinedAnnotation occurs because
Attachment is only imported inside the TYPE_CHECKING block so it's not present
when Message.model_json_schema() is called; remove the if TYPE_CHECKING guard
and import Attachment at module level in models.responses, then change any
string forward-reference annotations (e.g., "Attachment") to the actual
Attachment type on the Message (or related) field so Pydantic can resolve it at
runtime (this fixes the error raised when calling Message.model_json_schema()).

---

Nitpick comments:
In `@src/cache/postgres_cache.py`:
- Around line 50-67: The cache schema SQL (CREATE_CACHE_TABLE) only creates the
table if missing but doesn't add new columns to existing tables, causing
UndefinedColumn errors for the attachments column; update initialize_cache() (or
add a migration step run during startup) to execute an ALTER TABLE cache ADD
COLUMN IF NOT EXISTS attachments jsonb; (and any other new jsonb columns like
tool_results/tool_calls if needed) after running CREATE_CACHE_TABLE so existing
PostgreSQL deployments get the new column without manual migration; ensure this
runs using the same DB connection logic in initialize_cache() and handle/log any
errors.
- Around line 326-342: The inline import of Attachment inside the get() block
should be moved to the module level: add "from models.requests import
Attachment" to the top-level imports and remove the inline "from models.requests
import Attachment" from the block that parses attachments (the code referencing
attachments_data, attachments_obj and conversation_entry). Keep the existing
try/except and logging behavior but reference the module-level Attachment when
calling Attachment.model_validate to deserialize attachments.

In `@src/cache/sqlite_cache.py`:
- Around line 49-66: The CREATE_CACHE_TABLE uses IF NOT EXISTS so existing DBs
may lack the new attachments column which causes OperationalError; update
initialize_cache() to run an ALTER TABLE cache ADD COLUMN attachments text
migration (using the same DB cursor/connection used for CREATE_CACHE_TABLE)
before any selects/inserts, and catch sqlite3.OperationalError (or check PRAGMA
table_info('cache')) so the code ignores the error if the column already exists;
ensure this migration runs once during initialization so subsequent calls to
functions that reference the attachments column succeed.
- Around line 311-312: The Attachment class is imported inside the get() method
causing repeated imports; move the import to module level by adding "from
models.requests import Attachment" at the top of src/cache/sqlite_cache.py and
remove the in-method import inside the get() function so get() uses the
already-imported Attachment symbol.

Comment on lines +16 to +18
if TYPE_CHECKING:
from models.requests import Attachment


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify models/requests.py does NOT import from models/responses.py (circular import check)
echo "=== Imports in models/requests.py ==="
rg -n "from models.responses\|import models.responses" --type py

# Verify no model_rebuild() call exists for Message
echo ""
echo "=== model_rebuild calls ==="
rg -n "model_rebuild" --type py src/

Repository: lightspeed-core/lightspeed-stack

Length of output: 142


🏁 Script executed:

#!/bin/bash
echo "=== Lines 1-30 of src/models/responses.py ==="
head -30 src/models/responses.py | cat -n

echo ""
echo "=== Lines 825-835 of src/models/responses.py ==="
sed -n '825,835p' src/models/responses.py | cat -n

echo ""
echo "=== Check if Attachment is used in responses.py ==="
rg -n "Attachment" src/models/responses.py

Repository: lightspeed-core/lightspeed-stack

Length of output: 2073


🏁 Script executed:

#!/bin/bash
echo "=== All TYPE_CHECKING imports in responses.py ==="
rg -n "if TYPE_CHECKING" src/models/responses.py -A 2

echo ""
echo "=== Verify Message class definition ==="
rg -n "^class Message" src/models/responses.py -A 5

Repository: lightspeed-core/lightspeed-stack

Length of output: 437


Remove TYPE_CHECKING and directly import Attachment to fix runtime PydanticUndefinedAnnotation error.

Attachment is imported only inside if TYPE_CHECKING:, so it is never in models.responses's global namespace at runtime. Pydantic v2 resolves the string forward-reference "Attachment" by looking it up in the defining module's __dict__, so calling Message.model_json_schema() (which FastAPI does at startup to generate OpenAPI docs) will raise PydanticUndefinedAnnotation.

Verification confirms no circular import exists; models/requests.py does not import from models/responses.py. The fix is to directly import Attachment at module level and update the field annotation:

Fix
-from typing import Any, ClassVar, Literal, Optional, Union, TYPE_CHECKING
+from typing import Any, ClassVar, Literal, Optional, Union

 from fastapi import status
 from pydantic import BaseModel, Field
 from pydantic_core import SchemaError

 from constants import MEDIA_TYPE_EVENT_STREAM
 from models.config import Action, Configuration
+from models.requests import Attachment
 from quota.quota_exceed_error import QuotaExceedError
 from utils.types import RAGChunk, ReferencedDocument, ToolCallSummary, ToolResultSummary

-if TYPE_CHECKING:
-    from models.requests import Attachment
-

Update line 829:

-    attachments: Optional[list["Attachment"]] = Field(
+    attachments: Optional[list[Attachment]] = Field(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/models/responses.py` around lines 16 - 18, The runtime
PydanticUndefinedAnnotation occurs because Attachment is only imported inside
the TYPE_CHECKING block so it's not present when Message.model_json_schema() is
called; remove the if TYPE_CHECKING guard and import Attachment at module level
in models.responses, then change any string forward-reference annotations (e.g.,
"Attachment") to the actual Attachment type on the Message (or related) field so
Pydantic can resolve it at runtime (this fixes the error raised when calling
Message.model_json_schema()).
