feat: convert image uploads to base64 multimodal format #610

Merged
EKKOLearnAI merged 3 commits into main from feat/image-upload-base64
May 10, 2026

@EKKOLearnAI
Owner

Summary

  • Image uploads are now read from disk, converted to base64 data URLs, and sent as input_image parts to the /v1/responses API
  • Previously, images were replaced with [Image: path] text placeholders, so the upstream gateway never received the actual image data
  • Input is wrapped in [{role: "user", content: [...]}] format for correct gateway parsing
  • History messages extract text only, so no base64 is stored in the DB or in conversation_history
  • File attachments remain as text mentions
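
The conversion described in the summary can be sketched as follows. This is a minimal illustration, not the code from this PR: the `ContentBlock` and `InputPart` types, the `MIME_BY_EXT` table, the `buildInput` helper, and the `[File: path]` mention format are all assumptions, and the part shapes (`input_text`, `input_image` with a data URL in `image_url`) assume the gateway mirrors the OpenAI-style `/v1/responses` input format.

```typescript
import { readFileSync } from "node:fs";
import { extname } from "node:path";

// Hypothetical block shapes; the PR does not show the real types.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image"; path: string }
  | { type: "file"; path: string };

// Assumed part shapes, modelled on the OpenAI-style Responses API input.
type InputPart =
  | { type: "input_text"; text: string }
  | { type: "input_image"; image_url: string };

// Assumed extension-to-MIME table; extend as needed.
const MIME_BY_EXT: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".gif": "image/gif",
  ".webp": "image/webp",
};

// Encode raw bytes as a base64 data URL.
function toDataUrl(bytes: Uint8Array, mime: string): string {
  return `data:${mime};base64,${Buffer.from(bytes).toString("base64")}`;
}

type FileReader = (path: string) => Uint8Array;
const readFromDisk: FileReader = (path) => readFileSync(path);

// Map content blocks to multimodal parts. Images become input_image parts;
// files stay as text mentions (the mention format here is an assumption).
function convertBlocks(blocks: ContentBlock[], readFile: FileReader = readFromDisk): InputPart[] {
  return blocks.map((block): InputPart => {
    switch (block.type) {
      case "text":
        return { type: "input_text", text: block.text };
      case "image": {
        const mime = MIME_BY_EXT[extname(block.path).toLowerCase()] ?? "application/octet-stream";
        return { type: "input_image", image_url: toDataUrl(readFile(block.path), mime) };
      }
      case "file":
        return { type: "input_text", text: `[File: ${block.path}]` };
    }
  });
}

// Wrap the parts in the [{role: "user", content: [...]}] envelope.
function buildInput(blocks: ContentBlock[], readFile: FileReader = readFromDisk) {
  return [{ role: "user" as const, content: convertBlocks(blocks, readFile) }];
}
```

Passing the file reader as a parameter is a design choice for this sketch only: it lets the conversion be exercised without touching the disk.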

Test plan

  • Send a text message — verify normal response
  • Send an image — verify upstream receives and processes the image
  • Send image + text together — verify both are sent correctly
  • Check the DB to confirm that stored messages don't contain base64 data
  • Verify conversation_history with images doesn't cause context explosion
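
For the last two checks, one way to reason about the "text only" history rule is sketched below. The `StoredPart` type and both helper names are hypothetical and do not come from this PR; the sketch only shows dropping image parts before persisting a message and detecting a leaked base64 data URL in stored text.

```typescript
// Hypothetical part shape, matching the multimodal format described above.
type StoredPart =
  | { type: "input_text"; text: string }
  | { type: "input_image"; image_url: string };

// Keep only the text parts so no base64 payload reaches the DB or history.
function historyText(parts: StoredPart[]): string {
  return parts
    .filter((p): p is Extract<StoredPart, { type: "input_text" }> => p.type === "input_text")
    .map((p) => p.text)
    .join("\n");
}

// Heuristic check for a base64 data URL embedded in a stored string.
function containsBase64DataUrl(value: string): boolean {
  return /data:[\w.+-]+\/[\w.+-]+;base64,/.test(value);
}
```

A check along these lines could be run over stored message rows during manual testing; it is a heuristic and only detects payloads that use the `data:<mime>;base64,` prefix.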

🤖 Generated with Claude Code

EKKOLearnAI and others added 3 commits May 10, 2026 19:14
fix: lower context compression message threshold from 200 to 150

Reduce the message count threshold that triggers LLM-based context
compression to avoid excessively long histories before compression kicks in.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

feat: convert image uploads to base64 multimodal format for API

Images sent by users are now read from disk, converted to base64 data
URLs, and sent as input_image parts in the /v1/responses API request
instead of being replaced with text placeholders. File attachments remain
as text mentions.

- convertContentBlocks returns multimodal array instead of plain text
- Input is wrapped in [{role:"user", content:[...]}] format for gateway
- History conversion extracts text only (no base64 in conversation_history)
- Add debug logging for request input preview

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

chore: remove debug console.log from chat-run-socket

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@EKKOLearnAI EKKOLearnAI merged commit 377fa41 into main May 10, 2026
1 check passed
mysoul12138 pushed a commit to mysoul12138/hermes-web-ui that referenced this pull request May 10, 2026

* fix: lower context compression message threshold from 200 to 150

Reduce the message count threshold that triggers LLM-based context
compression to avoid excessively long histories before compression kicks in.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat: convert image uploads to base64 multimodal format for API

Images sent by users are now read from disk, converted to base64 data
URLs, and sent as input_image parts in the /v1/responses API request
instead of being replaced with text placeholders. File attachments remain
as text mentions.

- convertContentBlocks returns multimodal array instead of plain text
- Input is wrapped in [{role:"user", content:[...]}] format for gateway
- History conversion extracts text only (no base64 in conversation_history)
- Add debug logging for request input preview

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: remove debug console.log from chat-run-socket

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>