-
Yes, what you're trying to do (making an agent analyze the visual styling of a PDF using a multimodal LLM) is totally viable, especially with frontier models like Claude, GPT-4o, and Gemini that support image inputs. Please check out multimodal compatibility here: https://docs.agno.com/models/compatibility#multimodal-support Agno doesn't currently have an out-of-the-box example of this flow in the cookbook, but it's absolutely possible to build using a custom tool. Making custom tools with Agno is easy; docs: https://docs.agno.com/tools/tools So the steps would look something like this (see the sketch below):
1. Render the PDF page(s) to an image, e.g. with PyMuPDF or pdf2image.
2. Pass that image to the agent as multimodal input (or have a custom tool produce it).
3. Let the model analyze the styling in its context and return feedback.
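A minimal sketch of that flow, assuming PyMuPDF (`fitz`) for the rendering step and Agno's `Agent`, `Image`, and `Claude` classes; the model id, file name, and prompts are placeholders, not an official example:

```python
import fitz  # PyMuPDF: renders PDF pages to raster images

from agno.agent import Agent
from agno.media import Image
from agno.models.anthropic import Claude


def pdf_page_to_png(pdf_path: str, page_number: int = 0) -> bytes:
    """Render one page of a PDF to PNG bytes."""
    doc = fitz.open(pdf_path)
    pix = doc[page_number].get_pixmap(dpi=150)  # 150 dpi is plenty for a style review
    return pix.tobytes("png")


agent = Agent(
    model=Claude(id="claude-3-5-sonnet-20241022"),
    instructions="You review ad creatives and give concrete styling feedback.",
)

png_bytes = pdf_page_to_png("ad_campaign.pdf")
agent.print_response(
    "Describe the visual styling of this ad (layout, typography, color) and suggest improvements.",
    images=[Image(content=png_bytes)],
)
```

Any multimodal model from the compatibility table above should slot in the same way; only the `model=` line changes.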
Let me know if that answers your question.
-
Any update?
-
Hi, I am also having the same problem; #2927 says an update is coming soon. I'm very pleased that the Agno framework exists and works like magic. This feature would be a great add-on for me. Thanks.
-
As part of a project, I want an agent to make style-based recommendations based on what a PDF for an ad campaign "looks like". This is part of a larger agent that works on ads, so I would want it to get a screenshot/image of the PDF using a tool call specifically, then analyze it in the LLM's context and give feedback.
Most modern frontier models are multimodal, so they should be able to do something like this. I know Anthropic supports this in their API, along with others.
Is there a way to do this in Agno? I have tried scouring the cookbook for examples and tried to look through the codebase, but couldn't find anything, and figured people in the community would have some experience with something like this. Am I missing something simple?
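For the "screenshot via a tool call" part specifically, a custom Agno tool can do the rendering; Agno accepts plain Python functions in `tools=[...]`. A hypothetical sketch (`screenshot_pdf`, the model id, and the PyMuPDF rendering are assumptions, not an official Agno recipe):

```python
import fitz  # PyMuPDF

from agno.agent import Agent
from agno.models.anthropic import Claude


def screenshot_pdf(pdf_path: str, page_number: int = 0) -> str:
    """Render one page of a PDF to a PNG file and return its path."""
    doc = fitz.open(pdf_path)
    pix = doc[page_number].get_pixmap(dpi=150)
    out_path = f"page_{page_number}.png"
    pix.save(out_path)  # write the rendered page to disk
    return out_path


agent = Agent(
    model=Claude(id="claude-3-5-sonnet-20241022"),
    tools=[screenshot_pdf],  # plain functions are valid Agno tools
)
```

Note the tool returns a file path as text; to get the actual pixels into the model's context, pass the rendered image back in via `images=[...]` on a run (as in the sketch above), since support for tools returning media varies across Agno versions.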