feat(tools): add read_media tool for image/video/audio processing by YvanJiang · Pull Request #1228 · agentscope-ai/CoPaw

YvanJiang · 2026-03-11T06:16:43Z

Summary

Add a new `read_media` tool for reading and processing image, video, and audio files.

Features

Support local paths, file:// URLs, and http(s):// URLs
Automatic compression for images (Pillow) and videos (FFmpeg)
File format validation using magic numbers
Returns appropriate Block types (ImageBlock, VideoBlock, AudioBlock)
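
The magic-number validation listed above can be illustrated with a short sketch. This is not the PR's implementation: `sniff_media_kind` and both tables are hypothetical names, and only a handful of well-known signatures are covered. Note that an Ogg container may also carry video, and MP3 files without an ID3 tag are not detected here.

```python
from typing import Optional

# Fixed byte prefixes for a few common formats (illustrative subset).
_PREFIXES = [
    (b"\x89PNG\r\n\x1a\n", "image"),   # PNG
    (b"\xff\xd8\xff", "image"),        # JPEG
    (b"GIF87a", "image"),              # GIF
    (b"GIF89a", "image"),              # GIF
    (b"BM", "image"),                  # BMP
    (b"\x1a\x45\xdf\xa3", "video"),    # EBML container (MKV / WEBM)
    (b"FLV", "video"),                 # FLV
    (b"ID3", "audio"),                 # MP3 with an ID3v2 tag
    (b"OggS", "audio"),                # Ogg (treated as audio here)
    (b"fLaC", "audio"),                # FLAC
]

# RIFF files carry their form type in bytes 8..12.
_RIFF_FORMS = {b"WEBP": "image", b"AVI ": "video", b"WAVE": "audio"}


def sniff_media_kind(header: bytes) -> Optional[str]:
    """Guess 'image', 'video', or 'audio' from the first bytes of a file."""
    for prefix, kind in _PREFIXES:
        if header.startswith(prefix):
            return kind
    if len(header) >= 12 and header[:4] == b"RIFF":
        return _RIFF_FORMS.get(header[8:12])
    # ISO base media files (MP4, MOV, M4A) have 'ftyp' at offset 4,
    # followed by a four-byte brand.
    if len(header) >= 12 and header[4:8] == b"ftyp":
        return "audio" if header[8:12] == b"M4A " else "video"
    return None
```

Checking content rather than the file extension means a renamed file is still classified by what it actually contains.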

Supported Formats

Images: PNG, JPG, GIF, WEBP, BMP
Videos: MP4, AVI, MOV, MKV, WEBM, FLV, WMV
Audio: MP3, WAV, AAC, OGG, M4A, FLAC, WMA

Usage

```python
import asyncio
from copaw.agents.tools import read_media
# read_media is async, so outside async code it needs an event loop
result = asyncio.run(read_media("/path/to/image.png"))
```

Add a new async tool that can read and process media files from:
- Local file paths
- file:// URLs
- http(s):// URLs

Features:
- Image support (PNG, JPG, GIF, WEBP, BMP) with compression
- Video support (MP4, AVI, MOV, etc.) with frame extraction
- Audio support (MP3, WAV, AAC, etc.)
- File format validation via magic numbers
- Maximum file size: 20MB before compression
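
As a rough illustration of the three input kinds and the size cap described in the commit message, the sketch below sorts a source string into "local" or "remote" and checks a byte count against the limit. `classify_source`, `within_size_limit`, and `MAX_BYTES` are hypothetical names, not taken from the PR, and reading "20MB" as 20 MiB is an assumption.

```python
from typing import Tuple
from urllib.parse import urlparse
from urllib.request import url2pathname

# Assumption: the PR's "20MB" is interpreted as 20 MiB.
MAX_BYTES = 20 * 1024 * 1024


def classify_source(source: str) -> Tuple[str, str]:
    """Return ('remote', url) or ('local', filesystem_path)."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        return "remote", source
    if parsed.scheme == "file":
        # Convert the URL path component into an OS-style path.
        return "local", url2pathname(parsed.path)
    # Anything else is treated as a plain local path.
    return "local", source


def within_size_limit(num_bytes: int, limit: int = MAX_BYTES) -> bool:
    """True if the pre-compression size is within the cap."""
    return 0 <= num_bytes <= limit
```

A real tool would also need to download remote files and handle errors; this only shows the dispatch step.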


Leirunlin · 2026-03-16T06:17:51Z

Hi @YvanJiang,

This is a useful feature, but base64 encoding consumes too much context. Additionally, this PR introduces dependencies on Pillow and FFmpeg, adding extra complexity for users.

I’ve submitted an update in #1526 with a more lightweight approach to image reading for multi-modal models. Feel free to share any suggestions there. For audio and video support, new PRs are welcome.

BTW, more discussion is welcome in #1230 if you'd like to follow up there.
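
To put the base64 concern raised in this comment in numbers: base64 emits 4 output characters for every 3 input bytes, so the encoded payload is about a third larger than the file. The sketch below computes that size; `base64_size` is a hypothetical helper, and how many model tokens the characters become depends on the tokenizer, which is not estimated here.

```python
import base64


def base64_size(num_bytes: int) -> int:
    """Length of the padded base64 encoding of num_bytes raw bytes."""
    return 4 * ((num_bytes + 2) // 3)


# Cross-check the formula against the standard library on a small payload.
sample = b"\x00" * 1000
assert len(base64.b64encode(sample)) == base64_size(1000)

# A file at a 20 MiB cap would encode to roughly 26.7 MiB of text.
encoded_at_cap = base64_size(20 * 1024 * 1024)
```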

github-actions bot added the `first-time-contributor` label (PR created by a first-time contributor) on Mar 11, 2026

YvanJiang mentioned this pull request Mar 11, 2026

feat: add 4 new features for media reading, auth, and Feishu integration #1063

Closed

YvanJiang wants to merge 1 commit into agentscope-ai:main from YvanJiang:feature/media-reading-tool
