fix(openai): recognize Responses API input_text/input_image/input_file content parts #1902
Mai0313 wants to merge 1 commit into pydantic:main
…e content parts
Previously `_convert_content_part` (latest semconv) only handled the Chat
Completions content types (`text`, `output_text`, `image_url`, `input_audio`)
and `input_to_events` (legacy semconv) only handled `output_text`. As a
result, valid Responses API content parts — `input_text`, `input_image`,
`input_file` per the official OpenAI docs — fell through to `gen_ai.unknown`
events / a generic dict, making traces for any non-trivial Responses API
call (image input, file input, multi-part user messages) very noisy.
Add the three missing types:
* `input_text` → TextPart (latest) / `gen_ai.{role}.message` (legacy)
* `input_image` → UriPart with modality=image. Note that `image_url` here
is a flat string (URL or data URI), not the nested `{url: ...}` dict
used by Chat Completions.
* `input_file` → UriPart with modality=document when `file_url` or
`file_data` is present; falls through to a generic dict for the
`file_id`-only case (no URI to point at).
Closes pydantic#1901
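The mapping described in this commit message can be sketched as a small standalone function. This is an illustration only, not the code in the PR: `convert_input_part` is a hypothetical name, and plain dicts (with assumed field names `content`, `uri`, `modality`) stand in for the `TextPart` and `UriPart` types.

```python
from typing import Any


def convert_input_part(part: dict[str, Any]) -> dict[str, Any]:
    """Map one Responses API input content part to a semconv-style dict.

    Hypothetical helper: plain dicts stand in for the TextPart / UriPart types.
    """
    part_type = part.get('type')
    if part_type == 'input_text':
        return {'type': 'text', 'content': part.get('text') or ''}
    if part_type == 'input_image':
        # Responses API: image_url is a flat string, not a nested {'url': ...} dict.
        return {'type': 'uri', 'uri': part.get('image_url') or '', 'modality': 'image'}
    if part_type == 'input_file':
        uri = part.get('file_url') or part.get('file_data')
        if uri:
            return {'type': 'uri', 'uri': uri, 'modality': 'document'}
    # file_id-only files and unrecognized types pass through as a generic dict.
    return dict(part)
```

With this sketch, a `file_id`-only `input_file` part is returned unchanged, which matches the fall-through behaviour described above.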
Pull request overview
This PR fixes OpenAI Responses API instrumentation so standard input content parts are normalized correctly instead of being recorded as unknown events. It fits into the ongoing work to align the OpenAI integration with newer semantic-convention-based message handling while preserving the legacy event path where still needed.
Changes:
- Added `input_text`, `input_image`, and `input_file` handling in `_convert_content_part` for the latest semconv path.
- Added legacy `input_to_events` support for `input_text` so user text inputs are no longer emitted as `gen_ai.unknown`.
- Added focused unit tests covering Responses input text, image, file, and legacy `input_text` event conversion.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `logfire/_internal/integrations/llm_providers/openai.py` | Extends Responses API content-part parsing for latest semconv output and legacy event generation. |
| `tests/otel_integrations/test_openai.py` | Adds regression tests for the newly recognized Responses API input part types. |
Codecov Report❌ Patch coverage is
        url = part.get('image_url', {}).get('url', '')
        return UriPart(type='uri', uri=url, modality='image')
    elif part_type == 'input_image':
        # Responses API: image_url is a flat string (URL or data URI),
https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-input-messages.json specifies that blob parts with the actual bytes should be used instead of data URIs
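One way to follow this suggestion is to split a base64 data URI into its MIME type and payload and emit a blob-style part, keeping the URI form only for real URLs. The sketch below is not from the PR: `image_url_to_part` is a hypothetical helper, and the blob field names (`mime_type`, `content`) are assumptions based on the linked schema that should be checked against it.

```python
import base64
from typing import Any


def image_url_to_part(image_url: str) -> dict[str, Any]:
    """Return a blob-style part for a base64 data URI, else a uri-style part.

    Hypothetical helper; the dict keys are assumed, not taken from logfire.
    """
    if image_url.startswith('data:'):
        header, sep, payload = image_url[len('data:'):].partition(',')
        if sep and header.endswith(';base64'):
            # The media type is everything before the first ';'.
            # RFC 2397 defaults to text/plain when it is omitted.
            mime_type = header.split(';', 1)[0] or 'text/plain'
            try:
                # Only validate here; the payload is kept base64-encoded.
                base64.b64decode(payload, validate=True)
            except ValueError:
                pass  # Malformed payload: fall back to the uri form below.
            else:
                return {
                    'type': 'blob',
                    'mime_type': mime_type,
                    'modality': 'image',
                    'content': payload,
                }
    return {'type': 'uri', 'uri': image_url, 'modality': 'image'}
```

Whether the blob content should be stored as base64 text or as raw bytes depends on how the part is serialized; this sketch keeps the base64 string.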
        uri = part.get('file_url') or part.get('file_data')
        if uri:
            return UriPart(type='uri', uri=uri, modality='document')
        return {**part, 'type': part_type}
There's a FilePart in https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-input-messages.json; it should be added and used here.
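A minimal sketch of what this suggestion could look like, assuming a file-style part carries `type='file'` and a `file_id` field as in the linked schema. `convert_input_file` is a hypothetical name, plain dicts stand in for the typed parts, and `modality='document'` simply mirrors the value the PR uses for `UriPart`.

```python
from typing import Any


def convert_input_file(part: dict[str, Any]) -> dict[str, Any]:
    """Map a Responses API input_file part, using a file-style part for file_id.

    Hypothetical helper; field names are assumptions, not logfire's API.
    """
    uri = part.get('file_url') or part.get('file_data')
    if uri:
        # Same behaviour as the diff above: a URL or data string becomes a uri part.
        return {'type': 'uri', 'uri': uri, 'modality': 'document'}
    file_id = part.get('file_id')
    if file_id:
        # Instead of falling through to a generic dict, reference the uploaded file.
        return {'type': 'file', 'file_id': file_id, 'modality': 'document'}
    # Nothing usable: keep the original part as a generic dict.
    return dict(part)
```

The order of the checks (URI first, then `file_id`) is a design choice made here to preserve the existing behaviour; the review comment does not specify it.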
Summary
Fixes #1901.
`logfire.instrument_openai()` currently produces `gen_ai.unknown` events for the standard Responses API content types (`input_text`, `input_image`, `input_file`) because `_convert_content_part` (latest semconv path) and `input_to_events` (legacy semconv path) only know about Chat Completions types (`text`, `output_text`, `image_url`, `input_audio`). Any non-trivial Responses API call (multi-part user message, image input, file input) ends up with most of its content as `gen_ai.unknown`, even though the payload is exactly what the OpenAI docs prescribe.

This PR teaches the latest semconv path all three missing types and the legacy path `input_text`.
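For the legacy path, the effect can be illustrated with a simplified stand-in for `input_to_events`. The function name `message_to_events` and the exact event dict keys (`event.name`, `role`, `content`, `data`) are assumptions made for this sketch, not logfire's actual implementation.

```python
from typing import Any

# Text-bearing part types the sketch recognizes; input_text is the new one.
TEXT_PART_TYPES = {'output_text', 'input_text'}


def message_to_events(message: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn one input message into legacy-style events (illustrative only)."""
    role = message.get('role', 'user')
    event_name = f'gen_ai.{role}.message'
    content = message.get('content')
    if isinstance(content, str):
        return [{'event.name': event_name, 'role': role, 'content': content}]
    events: list[dict[str, Any]] = []
    for part in content or []:
        if isinstance(part, dict) and part.get('type') in TEXT_PART_TYPES:
            events.append({'event.name': event_name, 'role': role, 'content': part.get('text', '')})
        else:
            # Media parts still have no legacy event shape, so they stay unknown.
            events.append({'event.name': 'gen_ai.unknown', 'role': role, 'data': part})
    return events
```

In this sketch a multi-part user message with one `input_text` part and one `input_image` part yields one `gen_ai.user.message` event and one `gen_ai.unknown` event.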
Mapping
* `input_text` → `TextPart` (latest) / `gen_ai.{role}.message` (legacy). Handled like `output_text`, just on the input side.
* `input_image` → `UriPart(modality='image')`. `image_url` here is a flat string (URL or data URI), distinct from Chat Completions where it's a nested `{url: ...}` dict.
* `input_file` → `UriPart(modality='document')` when `file_url` or `file_data` is present. `file_id`-only inputs fall through to a generic dict, since there's no URI to point at and no obvious semconv mapping for an opaque ID.

Test plan
* `_convert_content_part_or_parts` and the `input_text` branch in `input_to_events`.
* `tests/otel_integrations/test_openai.py` passes (65 tests).
* `tests/otel_integrations/test_openai_agents.py` and `tests/otel_integrations/test_litellm.py` (which share `input_to_events`) pass.
* `ruff check` + `ruff format --check` clean.

I deliberately kept the legacy `input_to_events` change to just `input_text` (mirroring how `output_text` is the only recognized non-string content there). Adding image/file support to the legacy semconv path would require designing a new event shape for media, and that path is on its way out per #1586. Happy to extend if you'd prefer.

Related: #1476, #1586, #1769.