feat: add ImageBlock vision support for multimodal chat #6
Implemented image vision support in the Anthropic provider, enabling ChatRequest
messages containing ImageBlock to be converted to Anthropic's image content block
format. This is the second provider implementation (after Gemini) proving the
ImageBlock protocol works across multiple multimodal APIs with different formats.
Changes:
- Extended _convert_messages() to handle ImageBlock → Anthropic format conversion
- Added comprehensive test suite (5 tests matching Gemini pattern)
- Included test asset (Macbeth stage production photo)
- Updated dev dependencies (pytest, pytest-asyncio, ruff)
Test Results: All 20 tests passing (5 new image tests + 15 existing, no regressions)
Anthropic uses content blocks with base64 source format:
{"type": "image", "source": {"type": "base64", "media_type": "...", "data": "..."}}
This differs from Gemini's inline_data format but both work seamlessly from the
same ImageBlock protocol - proving the abstraction is solid and provider-agnostic.
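To make the format difference concrete, here is a minimal sketch of the two conversions from one block. The `ImageBlock` dataclass and its `media_type` and `data` fields are assumptions for illustration, not amplifier-core's actual definition, and `to_anthropic` / `to_gemini` are hypothetical helper names. The Anthropic shape is the one quoted above; the Gemini shape assumes the `inline_data` part with `mime_type` and `data` keys.

```python
import base64
from dataclasses import dataclass


@dataclass
class ImageBlock:
    """Hypothetical stand-in for the protocol's ImageBlock (field names assumed)."""

    media_type: str  # MIME type, e.g. "image/jpeg"
    data: str  # base64-encoded image bytes


def to_anthropic(block: ImageBlock) -> dict:
    """Build an Anthropic image content block with a base64 source."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": block.media_type,
            "data": block.data,
        },
    }


def to_gemini(block: ImageBlock) -> dict:
    """Build a Gemini-style inline_data part (assumed key names)."""
    return {"inline_data": {"mime_type": block.media_type, "data": block.data}}


# Placeholder bytes: the converters only pass data through, they do not decode it.
block = ImageBlock("image/png", base64.b64encode(b"placeholder bytes").decode("ascii"))
anthropic_part = to_anthropic(block)
gemini_part = to_gemini(block)
```

Both converters are pure passthroughs: neither inspects nor re-encodes the payload, so the same block yields two payloads that differ only in their envelope.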
🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)
Co-Authored-By: Amplifier <[email protected]>
bkrabach
left a comment
Review: ImageBlock Vision Support
Thanks for adding multimodal support! The image conversion implementation is clean and follows our patterns well. However, there are dependency management issues that need to be fixed before merge.
Required Changes
1. Remove Local Path Source (pyproject.toml:27-28)
# REMOVE THIS SECTION
[tool.uv.sources]
amplifier-core = { path = "../amplifier-core" }

Modules must treat amplifier-core as a peer dependency, not a local path reference. This breaks:
- Package installation for end users
- CI/CD pipelines
- Anyone who clones the repo standalone
Reference: The OpenAI provider does NOT have this section - that's the correct pattern.
2. Remove Unnecessary dependency-groups Section (pyproject.toml:43-49)
# REMOVE THIS SECTION
[dependency-groups]
dev = [
"amplifier-core",
"pytest>=9.0.2",
"pytest-asyncio>=1.3.0",
"ruff>=0.14.10",
]

The OpenAI reference module doesn't have this section. Including it:
- Implies amplifier-core is a dev dependency (it's actually a peer dependency)
- Creates inconsistency between modules
Recommendation (not blocking)
3. Consider Smaller Test Image
tests/assets/macbeth-witches-trio.jpg is 3.4 MB - excessive for unit tests that only verify conversion structure. Consider replacing with a small synthetic image (~10-50 KB).
What Looks Good
- ✅ Image conversion logic is clean and direct
- ✅ Follows existing patterns in the codebase
- ✅ Good test coverage
- ✅ Proper warning logging for unsupported types
- ✅ Security review passed (pure passthrough pattern)
Please fix the two pyproject.toml issues and this is ready to merge!
Removed [tool.uv.sources] section with local path reference to amplifier-core and [dependency-groups] section that incorrectly treated amplifier-core as a dev dependency.

These changes align with the OpenAI reference module pattern, treating amplifier-core as a peer dependency.

Addresses reviewer feedback from PR microsoft#6.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)
Co-Authored-By: Amplifier <[email protected]>
Addressed PR Feedback
Thanks for the thorough review @bkrabach! I've addressed both required changes:
✅ Fixed Issues
1. Removed local path source (pyproject.toml:27-28)
2. Removed dependency-groups section (pyproject.toml:43-49)
Changes
The updated Commit: 34af2a5
Re: Test Image Size
Good point about the 3.4 MB test image. I can address this in a follow-up if you'd like, but didn't want to mix concerns in this PR since the primary goal was validating the ImageBlock protocol support.
Ready for re-review! Let me know if you need anything else.
Please resolve merge conflicts with uv.lock, then ready to merge.
Merged main branch into pr-6 to resolve conflicts:
- Resolved uv.lock conflict by regenerating with uv lock
- Updated README.md, __init__.py, and pyproject.toml from main

This addresses reviewer feedback to resolve merge conflicts.

🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier)
Co-Authored-By: Amplifier <[email protected]>
@bkrabach The uv.lock merge conflicts have been resolved as requested.
Summary
Implementation Details
The implementation handles the conversion from Amplifier's protocol ImageBlock format to Anthropic's vision API format in _convert_messages:
API Format Differences
Anthropic uses a different vision format compared to Gemini:
Both formats are cleanly supported by the ImageBlock protocol abstraction.
Testing
All 20 tests passing: