-
Notifications
You must be signed in to change notification settings - Fork 48
Open
zhaog100/trashclaw
#1Labels
Description
Bounty: 20 RTC
Task
If the LLM backend supports vision (Llava, Qwen-VL, etc.), allow TrashClaw to send images as part of the conversation.
Requirements
- New tool:
view_image— reads an image file and includes it in the next LLM request as a base64-encoded image - New slash command:
/screenshot— takes a screenshot and includes it (using system tools) - Auto-detect if model supports vision (check
/v1/modelsresponse for multimodal flag) - Graceful fallback: if no vision support, tell the user
- No external dependencies (use stdlib
base64module)
Notes
The OpenAI chat completions API supports image content via {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}} in message content arrays.
Wallet: Drop your wallet name when you submit.
Reactions are currently unavailable