Problem
Currently, 9Router does not transmit images to upstream providers, even when using vision-capable models like Claude Sonnet 4.5.
Test Case
I tested sending an image via the OpenAI-compatible API format:
```bash
curl -X POST http://192.168.11.233:20128/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-xxx" \
  -d '{
    "model": "kr/claude-sonnet-4.5",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What do you see in this image?"},
          {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
        ]
      }
    ]
  }'
```
Expected: Claude describes the image.
Actual: Claude responds "I don't see any image in your message", indicating the image block was dropped before reaching the provider.
Impact
This prevents using 9Router with:
- Claude Sonnet/Opus (vision support)
- GPT-4V/GPT-5 (vision support)
- Gemini (multimodal)
- Any other vision-capable models
Use Case
I'm using 9Router with OpenClaw (personal AI assistant) and need vision support for analyzing screenshots, diagrams, and images sent via Telegram/WhatsApp.
Proposed Solution
- Detect `image_url` content blocks in the OpenAI format
- Translate to the provider-specific format:
  - Claude: convert to Anthropic's vision format with `source.type: "base64"`
  - OpenAI: pass through as-is
  - Gemini: convert to the `inline_data` format
- Preserve image data through the routing pipeline
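A minimal sketch of the translation step described above, assuming base64 data URLs as in the test case. Function names and the parsing helper are illustrative, not existing 9Router internals; the target shapes follow the public Anthropic and Gemini API formats:

```python
import re

# Matches a base64 data URL, e.g. "data:image/png;base64,iVBOR..."
_DATA_URL = re.compile(r"data:(?P<media_type>image/[\w.+-]+);base64,(?P<data>.+)", re.DOTALL)

def parse_data_url(url):
    """Split a base64 data URL into (media_type, base64_data)."""
    m = _DATA_URL.match(url)
    if not m:
        raise ValueError("only base64 data URLs are handled in this sketch")
    return m.group("media_type"), m.group("data")

def to_anthropic(part):
    """OpenAI `image_url` content block -> Anthropic vision block."""
    media_type, data = parse_data_url(part["image_url"]["url"])
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }

def to_gemini(part):
    """OpenAI `image_url` content block -> Gemini `inline_data` part."""
    media_type, data = parse_data_url(part["image_url"]["url"])
    return {"inline_data": {"mime_type": media_type, "data": data}}
```

OpenAI-bound requests would skip this step entirely and forward the block unchanged.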
Environment
- 9Router version: 0.2.99
- Model tested: kr/claude-sonnet-4.5 (via Kiro provider)
- Client: OpenClaw via custom provider config
Additional Context
The `/v1/models` endpoint doesn't expose model capabilities (text vs. multimodal), which makes it hard for clients to know which models support images.
It would also be great to add:
- Model capability metadata in the `/v1/models` response
- Documentation on multimodal support per provider
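One possible shape for the capability metadata, as a sketch only; the `capabilities` field and its keys are hypothetical, not an existing 9Router schema:

```json
{
  "id": "kr/claude-sonnet-4.5",
  "object": "model",
  "capabilities": {
    "vision": true,
    "input_modalities": ["text", "image"]
  }
}
```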
Thanks for this amazing project! 🚀