[Feature Request] All workers could accept multi-modal information #956

@Wendong-Fan

Description

Motivation

Currently, only the multi-modal agent has the toolkit needed to read multi-modal information. Since many LLMs now support vision, it would be better for worker agents to natively accept image paths from decomposed subtasks.
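
To make the request concrete, here is a rough sketch of the idea, not a proposed implementation. All names (`SubTask`, `attach_image_paths`, `build_worker_message`) and the message layout are hypothetical and do not correspond to the project's actual API; the sketch only assumes that image paths appear as plain text inside a subtask description.

```python
import re
from dataclasses import dataclass, field

# Hypothetical pattern: any whitespace-free token ending in a common image extension.
IMAGE_PATH_RE = re.compile(r"\S+\.(?:png|jpe?g|gif|webp|bmp)\b", re.IGNORECASE)


@dataclass
class SubTask:
    """Hypothetical stand-in for a decomposed subtask."""
    content: str
    image_paths: list[str] = field(default_factory=list)


def attach_image_paths(content: str) -> SubTask:
    """Collect image paths mentioned in a subtask description."""
    return SubTask(content=content, image_paths=IMAGE_PATH_RE.findall(content))


def build_worker_message(task: SubTask, model_supports_vision: bool) -> dict:
    """Build the payload a worker would pass to its model.

    Images are attached only when the worker's model supports vision;
    otherwise the worker falls back to the text alone.
    """
    message = {"text": task.content, "images": []}
    if model_supports_vision:
        message["images"] = list(task.image_paths)
    return message


task = attach_image_paths(
    "Describe the chart in ./data/chart.png and compare it with report.JPG."
)
print(build_worker_message(task, model_supports_vision=True))
```

With something along these lines, any worker whose model supports vision could receive the images directly, without routing the subtask through a dedicated multi-modal agent.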

Solution

No response

Alternatives

No response

Additional context

No response
