backend: Add better support for file content parsing with Python Interpreter #805
Codecov Report
Attention: Patch coverage is
@@            Coverage Diff             @@
##             main     #805      +/-   ##
==========================================
- Coverage   79.92%   79.82%   -0.10%
==========================================
  Files         242      243       +1
  Lines       10305    10346      +41
==========================================
+ Hits         8236     8259      +23
- Misses       2069     2087      +18
Looks good to me. Just a few comments not related to the changes in this PR:
validate_file method has unused params index and ctx but these params are not passed here and here
Looks good to me
I tried various ways to get the Python Interpreter to work with files, including sharing Docker volumes between the backend and terrarium services, only to find in https://github.com/cohere-ai/cohere-terrarium?tab=readme-ov-file#sandbox-design that filesystem access is not supported by the Python sandbox. This workaround instead tries to force the tool instructions to use the read file tools and pass the file content directly to the interpreter.
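As a rough illustration of "pass content directly", the sketch below embeds already-read file text as a string literal in the code that gets executed, so the executed code never touches the filesystem. This is not the code from this PR: build_sandbox_code, the sample CSV, and the local exec call (standing in for a request to the sandbox) are all assumptions made for the example.

```python
import csv
import io


def build_sandbox_code(file_content: str, analysis_code: str) -> str:
    """Hypothetical helper: prepend the file text as a literal to the analysis code."""
    # repr() yields a valid Python string literal, including escaped newlines.
    return f"file_content = {file_content!r}\n{analysis_code}"


# Text that a read file tool would have returned (sample data, assumed).
csv_text = "name,score\nada,3\ngrace,5\n"

# Analysis code parses the in-memory string instead of opening a file.
analysis = (
    "import csv, io\n"
    "rows = list(csv.DictReader(io.StringIO(file_content)))\n"
    "total = sum(int(r['score']) for r in rows)\n"
)

code = build_sandbox_code(csv_text, analysis)

# exec() stands in for the sandbox here; the real service is called remotely.
namespace = {}
exec(code, namespace)
print(namespace["total"])  # prints 8
```

The trade-off of this approach is that the whole file content travels inside the generated code, so it suits small text files better than large binary ones.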
AI Description
This PR introduces several changes to the codebase, primarily focused on file handling and tool configuration.
Summary
The PR makes changes to the file handling system, adding new functions to read different file formats and updating existing ones. It also renames a tool and modifies its description, ensuring consistent naming across the codebase. Additionally, it removes references to Langchain, a tool for building applications with language models, and updates the default model for chat requests.
Changes
- Adds exec-terrarium, which executes a command in the cohere-toolkit-terrarium-1 container as the root user.
- Removes the src/backend/data directory, which was used to sync uploaded files.
- Renames the TIMEOUT variable to TIMEOUT_SECONDS and updates its value to 60. This change affects the timeout value used in the asyncio.wait_for function.
- Renames read_document to read_file.
- Updates the ToolName class to clarify the usage of the Python interpreter without internet access and provide guidelines for file handling.
- Removes the user_id field from the BaseChatRequest class, which was previously used to store the conversation under a specific user.
- Adds the user_id field to the ConversationFilePublic class, allowing for user-specific file handling.
- Removes the read_excel, read_docx, and read_parquet functions and adds new functions with the same names. These new functions have updated argument names and return types.
- Changes the NAME attribute of the ReadFileTool class from read_document to read_file.
- Removes the LangchainPythonInterpreterToolInput class and the langchain_call and to_langchain_tool methods.
- Adds the user_id field to the $ConversationFilePublic type.
- Adds the user_id field to the ConversationFilePublic type.
- Renames the TOOL_READ_DOCUMENT_ID constant to TOOL_READ_FILE_ID.
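For the TIMEOUT_SECONDS item in the list above, here is a minimal sketch of how a 60-second limit is typically applied with asyncio.wait_for. Only the constant name, its value, and the use of asyncio.wait_for come from the description; run_with_timeout and slow_call are hypothetical names, and returning None on timeout is an assumption, not necessarily what the backend does.

```python
import asyncio

TIMEOUT_SECONDS = 60  # name and value as stated in the PR description


async def run_with_timeout(coro, timeout=TIMEOUT_SECONDS):
    """Hypothetical wrapper: await coro, giving up after `timeout` seconds."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for cancels the awaited task before raising.
        return None


async def slow_call(delay, value):
    """Hypothetical stand-in for a request to the interpreter service."""
    await asyncio.sleep(delay)
    return value


# Finishes well within the default limit.
fast = asyncio.run(run_with_timeout(slow_call(0.01, "done")))

# A short limit is used here only to trigger the timeout path quickly.
timed_out = asyncio.run(run_with_timeout(slow_call(0.5, "late"), timeout=0.05))
```

Naming the constant with its unit makes the value passed to asyncio.wait_for unambiguous, since that function interprets its timeout argument in seconds.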