MCP Image Display Issue in ChatGPT
Issue Overview
Throughout 2025, developers building agents with the OpenAI Agents SDK and MCP (Model Context Protocol) encountered a frustrating limitation: images returned by MCP tools were not displayed inline in ChatGPT. Despite the MCP specification supporting image content blocks (base64-encoded images with MIME types), ChatGPT's interface failed to render them properly.
The issue manifested in several ways:
- Empty or "no result" responses: ChatGPT reported that the tool returned nothing
- Massive token consumption (~100k tokens): base64 data was treated as text and fed to the model
- Raw object strings, e.g., "<mcp.server.fastmcp.utilities.types.Image object at 0x...>"
- Generic errors: "I wasn't able to display the image"
This was particularly frustrating because other MCP-compatible clients (Claude Desktop, Cursor IDE) displayed the same images correctly, confirming the issue was specific to ChatGPT's implementation.
Reports of the Issue
May 2025: OpenAI Developer Forum (Agents SDK)
A developer reported that their MCP server returned an ImageContent block, but ChatGPT treated it as text. The assistant consumed approximately 100,000 tokens attempting to process the raw base64 string, then failed to display anything[1]. The user noted:
"The problem is that the assistant is treating this as text... instead it is interpreting this as a very long string."
No solution was provided at the time, highlighting that the Agents SDK integration was not handling image outputs.
September 2025: OpenAI Forum Bug Report
Another user opened a bug thread titled "Image responses from MCP tools do not work." They provided the exact JSON payload their MCP server returned:
{
  "content": [{
    "type": "image",
    "data": "<base64>...",
    "mimeType": "image/png"
  }]
}
This matched the MCP specification exactly. Yet ChatGPT kept replying with "empty result" or claimed it couldn't see the image[2]. The image was only 18KB, well under any reasonable size limit.
October 2025: Reddit r/mcp Discussion
A Reddit thread asked: "Are image responses in MCP tools supported by ChatGPT/Claude?" The poster showed their MCP JSON response with a properly formatted image content block. ChatGPT responded that it "cannot see the image"[3].
Notably, another user confirmed that Cursor IDE and Claude Desktop displayed the same images correctly[4], proving the issue was ChatGPT-specific.
Claude also had a separate issue: it threw a 400 error claiming "base64 data does not match image/png" — a strict MIME validation problem on Anthropic's side.
October 2025: FastMCP GitHub Issue
A report on the FastMCP project described that returning Image objects from tools produced the string representation of the Python object instead of an inline image:
<mcp.server.fastmcp.utilities.types.Image object at 0x7f...>
This was caused by a developer pitfall: nesting media objects inside dictionaries or other containers. FastMCP only converts Image/Audio/File objects to MCP content blocks when returned at the top level or in a simple list[5][6].
October 2025: Cursor GitHub Issue
Users reported that images from MCP servers were "not getting rendered in the chat even though the MCP server is providing the image as output"[7]. This was later fixed in Cursor, but highlighted the broader ecosystem challenge.
Additional Reports
- Claude Desktop temporarily hid MCP results but restored image display in a subsequent update[8]
- ChatGPT metadata issues — some users reported ChatGPT stopped reading MCP tool metadata entirely[9]
Cause and Technical Explanation
ChatGPT's Internal Handling
The root cause was ChatGPT's incomplete implementation of MCP image content blocks. The MCP specification defines how to return images as part of tool results:
{
  "type": "image",
  "data": "base64-encoded-data",
  "mimeType": "image/png",
  "annotations": {
    "audience": ["user"],
    "priority": 0.9
  }
}
Per the ImageContent schema, the required fields are:
- type: Must be "image"
- data: Base64-encoded image data
- mimeType: MIME type of the image (e.g., image/png, image/jpeg)
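As a rough illustration of the schema requirements above, the following Python sketch checks a content block for the three required fields and for decodable base64. The function name `validate_image_block` is hypothetical and not part of any MCP SDK, and the `image/*` prefix check is an assumption rather than a rule stated in the schema.

```python
import base64
import binascii

REQUIRED_FIELDS = ("type", "data", "mimeType")


def validate_image_block(block: dict) -> list:
    """Return a list of problems found in an MCP image content block.

    Hypothetical helper for illustration; an empty list means the block
    passed these basic checks.
    """
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in block]
    if problems:
        return problems
    if block["type"] != "image":
        problems.append('type must be "image"')
    # Assumption: image MIME types start with "image/".
    if not str(block["mimeType"]).startswith("image/"):
        problems.append("mimeType should be an image/* type")
    try:
        # validate=True rejects characters outside the base64 alphabet.
        base64.b64decode(block["data"], validate=True)
    except (binascii.Error, ValueError, TypeError):
        problems.append("data is not valid base64")
    return problems
```

Running a server's output through a check like this before blaming the client can help separate malformed payloads from client-side rendering problems.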
The expected behavior:
- Tool returns image content block
- UI renders image inline
- Model receives metadata (not raw data)
What actually happened in ChatGPT:
- The image data was either ignored entirely (empty result) or passed to the model as text (the ~100k-token explosion)
- The UI never attempted to render the image
The FunctionCallOutputPayload structure that carried tool results did not properly serialize image content blocks, so the image data was lost or misrouted before reaching the UI. This is the same structure that the later fix in the Codex repository changed[12].
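The expected behavior listed above (image to the UI, metadata to the model) can be sketched in Python as follows. This is a simplified model under stated assumptions, not ChatGPT's or any SDK's actual code: the function name `split_for_ui_and_model` and the placeholder text format are hypothetical.

```python
import base64


def split_for_ui_and_model(content: list) -> tuple:
    """Split tool-result content blocks into (ui_blocks, model_blocks).

    Image blocks go to the UI untouched. The model gets a short text
    placeholder describing the image instead of the raw base64 string.
    """
    ui_blocks, model_blocks = [], []
    for block in content:
        ui_blocks.append(block)
        if block.get("type") == "image":
            # Decoded size in bytes (base64 encodes 3 bytes as 4 characters).
            size = len(base64.b64decode(block["data"]))
            model_blocks.append({
                "type": "text",
                "text": f"[image: {block['mimeType']}, {size} bytes]",
            })
        else:
            model_blocks.append(block)
    return ui_blocks, model_blocks


# Usage: an 18 KB payload, the size mentioned in the September bug report.
payload = base64.b64encode(bytes(18 * 1024)).decode()
ui, model = split_for_ui_and_model(
    [{"type": "image", "data": payload, "mimeType": "image/png"}]
)
print(len(payload))      # 24576 base64 characters
print(model[0]["text"])  # [image: image/png, 18432 bytes]
```

Even a small image expands to tens of thousands of characters once base64-encoded, which is why treating the payload as plain text is so costly for the model's context.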
FastMCP Serialization Pitfall
A separate but related issue affected FastMCP users. The framework only converts helper classes (Image, Audio, File) to MCP content blocks when returned directly or in a top-level list/tuple[6]:
# ✅ Works: returned at the top level (data is the raw image bytes, format is the image type)
return Image(data=image_bytes, format="png")
# ✅ Works: returned in a list
return [Image(...), "Caption text"]
# ❌ Fails: nested in a dict
return {"image": Image(...), "caption": "..."}
When nested in a dictionary, the object is not converted and appears as <Image object at 0x...>. This was intentional behavior: the maintainers clarified that it is a design limitation and updated the documentation to warn developers[6].
MIME Validation Issues
Claude's 400 error ("base64 data does not match image/png") indicated strict MIME validation. ChatGPT didn't even throw such errors — it simply ignored the content, suggesting it never attempted to decode the base64 at all.
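One plausible way such a mismatch check could work is to compare the declared mimeType against the file signature ("magic bytes") of the decoded data. The Python sketch below is an assumption about the kind of validation involved, not Anthropic's actual implementation, and the helper names `sniff_mime` and `mime_matches` are hypothetical.

```python
import base64
from typing import Optional

# Leading signature bytes for a few common image formats.
SIGNATURES = {
    "image/png": (b"\x89PNG\r\n\x1a\n",),
    "image/jpeg": (b"\xff\xd8\xff",),
    "image/gif": (b"GIF87a", b"GIF89a"),
}


def sniff_mime(data_b64: str) -> Optional[str]:
    """Guess the MIME type from the decoded bytes, or None if unknown."""
    raw = base64.b64decode(data_b64)
    for mime, prefixes in SIGNATURES.items():
        if raw.startswith(prefixes):
            return mime
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if raw[:4] == b"RIFF" and raw[8:12] == b"WEBP":
        return "image/webp"
    return None


def mime_matches(block: dict) -> bool:
    """True when the declared mimeType agrees with the sniffed type."""
    return sniff_mime(block["data"]) == block["mimeType"]
```

A server that labels JPEG bytes as image/png would fail this kind of check, so running it on the server side before returning a block is a cheap way to rule out that class of error.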
Workarounds and User Solutions
1. Return Images in Supported Formats
Ensure you're using the correct types to trigger MCP content blocks:
- OpenAI Agents SDK (Python): Use the ToolOutputImage type[10]
- FastMCP (Python): Use fastmcp.utilities.types.Image and return it directly or in a list
- MCP SDK (Node.js/TypeScript): Use the ImageContent type from @modelcontextprotocol/sdk:
import type { ImageContent, CallToolResult } from "@modelcontextprotocol/sdk/types.js";

// Return image content block
const result: CallToolResult = {
  content: [
    {
      type: "image",
      data: base64EncodedData,
      mimeType: "image/png"
    } as ImageContent
  ]
};
The SDK also provides TextContent, ResourceLink, and other content types for tool results.
This doesn't fix the ChatGPT UI, but ensures other clients can render the image and prepares for future fixes.
2. Avoid Nesting Media in Structs
Do not wrap images in dictionaries or complex JSON. If your tool returns mixed data (text + image), split into separate returns or use MCP's ability to carry both structured data and content blocks separately.
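A minimal Python sketch of this restructuring, assuming a tool that currently builds a dict mixing text and media. `MediaStub` is a hypothetical stand-in for a helper class such as FastMCP's Image, and `flatten_result` is an illustrative name, not a library function. It only handles media stored directly as dict values, not deeper nesting.

```python
import json
from dataclasses import dataclass


@dataclass
class MediaStub:
    """Hypothetical stand-in for a media helper such as FastMCP's Image."""
    data: bytes
    format: str


def flatten_result(result: dict) -> list:
    """Move media objects out of a dict into a flat top-level list.

    Non-media values are kept together as a single JSON string placed
    first, followed by the media objects in their original order.
    """
    media = [v for v in result.values() if isinstance(v, MediaStub)]
    rest = {k: v for k, v in result.items() if not isinstance(v, MediaStub)}
    items = []
    if rest:
        items.append(json.dumps(rest))
    items.extend(media)
    return items
```

With a real framework, the tool would return the flattened list so that each media object sits at the top level, where the conversion to content blocks is described as working.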
3. Manual Handling via API
If using the Responses API directly (not ChatGPT UI), parse the response JSON for image blocks:
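import base64

# `response` is assumed to be the parsed tool-result JSON (a dict with a "content" list)
for item in response["content"]:
    if item["type"] == "image":
        image_data = base64.b64decode(item["data"])
        # Save or display the image
This bypasses ChatGPT's UI entirely and lets you build custom front-ends.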
4. Use External Image Links
Have the tool upload the image to cloud storage and return a URL instead. Less elegant, but gets the image to the user. The assistant can output the link or use browsing capabilities to preview it.
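The link-based workaround might look like the Python sketch below. The storage backend is deliberately left abstract: `upload` is a hypothetical callable you would implement against your own storage service, and `image_to_link_block` is an illustrative name rather than part of any SDK.

```python
import hashlib
from typing import Callable

# File extensions for a few common image MIME types.
EXTENSIONS = {"image/png": "png", "image/jpeg": "jpg", "image/gif": "gif"}


def image_to_link_block(
    raw: bytes,
    mime_type: str,
    upload: Callable[[str, bytes, str], str],
) -> dict:
    """Upload image bytes and return an MCP text block containing the URL.

    `upload(key, raw, mime_type)` is a hypothetical callable that stores
    the bytes and returns a URL the user can open.
    """
    ext = EXTENSIONS.get(mime_type, "bin")
    # Content-addressed key: identical images map to the same object name.
    key = f"{hashlib.sha256(raw).hexdigest()[:16]}.{ext}"
    url = upload(key, raw, mime_type)
    return {"type": "text", "text": f"Image available at: {url}"}
```

Returning a plain text block keeps the result small for the model and avoids depending on the client's image rendering, at the cost of hosting the file somewhere the user can reach.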
5. Leverage Alternative Clients
For development and testing, use MCP-compatible clients that display images correctly:
- Cursor IDE — confirmed working[4]
- Claude Desktop — fixed in later updates[8]
- Cline 3.13+ — added MCP image support[11]
6. Build Custom UI Components
For production applications, implement your own image rendering. Intercept the tool response, detect image content blocks, and display them in your UI rather than relying on ChatGPT's interface.
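As a sketch of the interception step, the Python function below turns a list of content blocks into HTML, embedding images as data: URIs. The function name `render_blocks_to_html` and the choice of plain HTML output are assumptions for illustration. A real front-end would likely use its own component system and should also restrict which MIME types it accepts.

```python
import html


def render_blocks_to_html(content: list) -> str:
    """Render MCP text and image content blocks as simple HTML.

    Block types other than "text" and "image" are skipped in this sketch.
    """
    parts = []
    for block in content:
        kind = block.get("type")
        if kind == "image":
            # A data: URI lets the browser decode the base64 payload itself.
            src = f"data:{block['mimeType']};base64,{block['data']}"
            parts.append(f'<img src="{html.escape(src, quote=True)}" alt="tool image">')
        elif kind == "text":
            parts.append(f"<p>{html.escape(block['text'])}</p>")
    return "\n".join(parts)
```

Escaping both the text and the attribute value matters here because tool output is untrusted input from the UI's point of view.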
Resolution Status
OpenAI's Fix (October 2025)
OpenAI addressed the issue in a late-October 2025 update. A pull request titled "[MCP] Render MCP tool call result images to the model" was merged into the Codex repository[12].
The PR description noted:
"Previously, the image content was lost on the way to the model/UI... implements a fix so that image outputs are serialized in the proper array format and reach the model (and UI) as images."
The fix modified how FunctionCallOutputPayload handles image content, ensuring it's properly serialized and routed to both the model and UI.
FastMCP Documentation Update (October 2025)
FastMCP closed the loop by updating their documentation to clarify the container limitation[6]. They marked the behavior as intentional — developers must return images at the top level.
Broader Multimodal Support
OpenAI announced broader multimodal support around the same time:
- Realtime API (August 2025) added support for remote MCP servers and image inputs[13]
- ChatGPT Release Notes (November 2025) mentioned "More images in answers"[14]
Current Status (November 2025)
ChatGPT's developer mode and the Agents SDK are now expected to support inline image responses from MCP tools, provided you are using the updated SDK and models. However:
- Some users still report mixed results with custom GPTs
- The fix may be rolling out gradually
- Reliable inline images from user-defined tools are not universally confirmed
Recommendation: Test with the latest SDK version and follow best practices for returning images. If issues persist, use workarounds while monitoring for updates.
Conclusion
The MCP image inline display issue highlighted the challenges of integrating rich content into ChatGPT's agent responses. Throughout 2025, developers experimented with having ChatGPT agents return charts, plots, or other images, only to be frustrated by blank or garbled outputs.
The problem stemmed not from the MCP standard (which defines how to do this) but from ChatGPT's incomplete adoption of that standard in its UI and response handling. Images would either be dropped or mishandled, limiting use-cases like data visualization or image-based analysis.
From a technical perspective, the fix was straightforward — the ChatGPT client needed to properly recognize image content blocks and render them inline (as it does with DALL·E or Code Interpreter). OpenAI's PR #5600 addressed this, though full rollout may still be in progress.
For developers building agents that return images:
- Follow best practices (correct types, not nested)
- Test with alternative clients to verify your server works
- Implement workarounds for production use
- Monitor OpenAI updates for full support confirmation
The issue is on the path to resolution — identified, documented, with fixes merged. Until the day we see the release note "ChatGPT now displays images from custom tools inline," creative workarounds remain necessary.
Sources
1. Image response from an MCP server with agents sdk. OpenAI Developer Forum (May 2025)
2. Image responses from MCP tools do not work. OpenAI Developer Forum Bug Report (Sept 2025)
3. Are image responses in MCP tools supported by ChatGPT/Claude? Reddit r/mcp (Oct 2025)
4. Cursor/Claude confirmation. Reddit comment confirming other clients work
5. Image objects not serializing properly in MCP tool responses. FastMCP GitHub Issue
6. Document container limitations for Image/Audio/File objects. FastMCP PR #2118
7. Image is not getting rendered in the chat. Cursor GitHub Issue #3365
8. Claude Desktop won't show MCP (image) response. Reddit r/mcp
9. Chat GPT App - not reading the mcp tool's metadata anymore. OpenAI Forum
10. Returning images or files from function tools. OpenAI Agents SDK Docs
11. Cline 3.13: MCP image support. Reddit r/CLine
12. [MCP] Render MCP tool call result images to the model. OpenAI Codex PR #5600
13. Introducing gpt-realtime and Realtime API updates. OpenAI Blog (Aug 2025)
14. ChatGPT Release Notes. OpenAI Help Center
15. MCP Specification: Tools, Image Content. Model Context Protocol (2025-11-25)
16. MCP Specification: ImageContent Schema. Model Context Protocol Schema Reference
17. Image Responses from MCP tool calls. OpenAI Codex Issue #4819
18. FastMCP Tools Documentation. FastMCP Docs