Feature: Instrument OpenAI's new Responses API #2782 #2787
base: main
Conversation
👍 Looks good to me! Reviewed everything up to c8858fa in 2 minutes and 47 seconds
More details
- Looked at 206 lines of code in 2 files
- Skipped 0 files when reviewing
- Skipped posting 14 drafted comments based on config settings
1. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py:247
   - Draft comment: New instrumentation for Responses.retrieve has been added. Ensure this new span is consistent with other spans (naming, attributes) and doesn’t conflict with existing ones.
   - Reason this comment was not posted: Confidence changes required: 0% <= threshold 50%.
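   For illustration, a minimal sketch of what a consistent Responses.retrieve span could look like, using the wrapt-style wrapper signature common in this package; the span name, attribute keys, and tracer plumbing here are assumptions, not the PR's actual code.

   ```python
   # Hypothetical sketch only -- span name and attribute keys are assumptions.
   from opentelemetry import trace
   from opentelemetry.trace import SpanKind

   tracer = trace.get_tracer(__name__)

   def responses_retrieve_wrapper(wrapped, instance, args, kwargs):
       # Keep the naming scheme aligned with the package's other OpenAI spans.
       with tracer.start_as_current_span("openai.response", kind=SpanKind.CLIENT) as span:
           response = wrapped(*args, **kwargs)
           span.set_attribute("gen_ai.system", "openai")
           response_id = getattr(response, "id", None)
           if response_id:
               span.set_attribute("gen_ai.response.id", response_id)
           return response
   ```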
2. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py:27
   - Draft comment: Avoid using 'assert' to enforce response type in production; consider handling unexpected types gracefully (e.g. with error logging) because asserts may be disabled.
   - Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%.
     The @dont_throw decorator implies this code is already designed to handle errors gracefully. The assert here appears to be used for type checking during development. The suggestion might make the code more defensive, but could also hide bugs that should be caught early. The assert helps catch integration issues with the OpenAI API.
     The comment makes a valid point about asserts being disabled in production. However, it may be missing the context of the @dont_throw decorator and the value of failing fast during development.
     Given the @dont_throw decorator and that this is an API integration, the assert serves a useful purpose in development while production errors would still be caught by the decorator.
     The comment should be deleted. The current error handling approach with @dont_throw and assert for type checking is a reasonable pattern for this API integration code.
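   For reference, a small sketch of the logging-based alternative the draft comment describes; the function name is hypothetical and the LegacyAPIResponse import path is the OpenAI v1 SDK's, not code taken from the PR.

   ```python
   # Sketch of the suggested alternative -- not the code actually in the PR.
   import logging

   from openai._legacy_response import LegacyAPIResponse

   logger = logging.getLogger(__name__)

   def parse_response(response):
       # Instead of `assert isinstance(response, LegacyAPIResponse)`, which is a
       # no-op under `python -O`, log unexpected types and return them unchanged.
       if isinstance(response, LegacyAPIResponse):
           return response.parse()
       logger.debug("Unexpected response type %s; skipping parse", type(response))
       return response
   ```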
3. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py:124
   - Draft comment: Avoid using 'id' as a variable name as it shadows the built-in function; rename to something like 'thread_id'.
   - Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py:185
   - Draft comment: Accessing content[0] without a length check may cause an IndexError if the content list is empty; consider adding a defensive check.
   - Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%.
     This is tricky because we don't have the OpenAI API documentation to confirm the structure of message content. However, since this is production instrumentation code for OpenAI's official API, and the code assumes a specific structure (content[0].text.value), it's likely that this structure is guaranteed by the API. The code is also using .get() for other accesses, showing awareness of null safety, so the direct array access was likely intentional.
     I could be wrong about the API guarantees - perhaps there are edge cases where a message could have empty content. The code would be more defensive with the check.
     However, if empty content is actually possible, it would likely indicate a bug in the OpenAI API itself, as messages are supposed to have content. Adding a null check would hide an API violation that should be caught.
     The direct array access appears intentional and matches expected API guarantees. Adding null checks would mask potential API bugs rather than improve safety.
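   What the defensive variant would look like, as a sketch; the helper name and the message shape (content[0].text.value, as described above) are assumptions rather than the PR's code.

   ```python
   # Illustrative only -- mirrors the content[0].text.value access defensively.
   def first_message_text(message) -> str:
       content = getattr(message, "content", None) or []
       if not content:
           return ""
       text = getattr(content[0], "text", None)
       return text.value if text is not None else ""
   ```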
5. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py:98
   - Draft comment: Consider using isinstance() for type-checking (for LegacyAPIResponse) to account for possible subclass instances, rather than a direct type comparison.
   - Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
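   The distinction the draft comment draws, in a small sketch; the helper is hypothetical, and the import path is the OpenAI v1 SDK's.

   ```python
   from openai._legacy_response import LegacyAPIResponse

   def is_legacy_response(response) -> bool:
       # `type(response) == LegacyAPIResponse` rejects subclasses;
       # isinstance() accepts them, which is usually safer for SDK objects.
       return isinstance(response, LegacyAPIResponse)
   ```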
6. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py:229
   - Draft comment: Variable 'i' is used for prompt indexing here while messages_list_wrapper uses 'prompt_index'; for consistency and clarity, consider renaming it.
   - Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py:247
   - Draft comment: New wrapper for Responses.retrieve added. Ensure the 'openai.resources.responses' module exists across supported versions and add tests for it.
   - Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%.
     The try/except block already handles the case where the module doesn't exist. The comment is essentially asking for verification and testing, which violates our rules about not asking for verification or making speculative comments. The code structure already handles potential module absence gracefully.
     Maybe there's value in explicitly testing this new functionality? The module existence check could be important for compatibility.
     The code already handles missing modules gracefully, and asking for tests is a process request rather than a code quality issue. We should trust the author to follow testing practices.
     Delete the comment as it primarily asks for verification and testing, which violates our rules about not making speculative comments or asking for process confirmations.
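   The guard pattern the reasoning above refers to, sketched with wrapt; the module and method strings follow the comment's wording, and the wrapper is assumed to be defined as in the PR, so treat the whole block as illustrative.

   ```python
   # Sketch of guarding an optional module before wrapping -- illustrative only.
   from wrapt import wrap_function_wrapper

   try:
       wrap_function_wrapper(
           "openai.resources.responses",
           "Responses.retrieve",
           responses_retrieve_wrapper,  # assumed to be defined as in the PR
       )
   except (ModuleNotFoundError, AttributeError):
       # Older openai releases don't ship the Responses API; skip instrumenting it.
       pass
   ```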
8. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py:32
   - Draft comment: Avoid using assert for type-checking in production code. Consider using an explicit isinstance check with graceful error handling or logging to prevent issues when assertions are disabled.
   - Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%.
     The code is in a function decorated with @dont_throw, which likely means exceptions are already being handled at a higher level. The assert is being used as a development-time check rather than a runtime validation. The suggestion would change the behavior by forcing an exception even with assertions disabled, which may not be the intended behavior. The current assert usage seems intentional for development-time validation.
     The comment raises a valid point about assert statements being disabled in optimized Python. However, I may be missing security implications of relying on assert for type checking.
     Given the @dont_throw decorator and the apparent intention for development-time validation only, the current assert usage appears to be an acceptable pattern here. The suggested change would fundamentally alter the runtime behavior.
     The comment should be deleted as the current assert usage appears intentional for development-time validation, and changing it would alter the intended behavior of the code.
9. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py:104
   - Draft comment: Ensure that 'parsed_response.thread_id' is always present and consistent with the key used in runs_create_wrapper. Consider adding explicit checks or handling for missing thread_id.
   - Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%.
     The comment asks to "ensure thread_id is present" but the code already handles this safely by checking if thread_id exists in runs before using it. The Run type from OpenAI's SDK likely guarantees thread_id is present. The code is defensive by checking the dictionary access. This seems like an unnecessary speculative concern.
     I could be wrong about the Run type guarantees. There could be edge cases where thread_id is None or empty string.
     Even if thread_id had edge cases, the existing dictionary check safely handles them by skipping the update if the key isn't found. No data corruption or crashes would occur.
     Delete this comment. The code already safely handles thread_id access through dictionary checking, and the comment is speculative about potential issues without clear evidence of a problem.
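   The "check before use" pattern the reasoning describes, as a sketch; the runs dict, the record_run helper, and the stored fields are assumptions, not the PR's actual state handling.

   ```python
   # Illustrative sketch: update run state only for threads already tracked.
   runs: dict = {}

   def record_run(parsed_response):
       thread_id = getattr(parsed_response, "thread_id", None)
       # Skip silently when thread_id is missing or the thread was never
       # registered, so instrumentation never raises into the caller.
       if thread_id and thread_id in runs:
           runs[thread_id]["run_id"] = parsed_response.id
   ```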
10. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py:124
    - Draft comment: Avoid using 'id' as a variable name since it shadows the built-in. Rename it (e.g., to 'thread_id') for clarity.
    - Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
11. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py:142
    - Draft comment: The use of separate counters 'prompt_index' and 'completion_index' in messages_list_wrapper improves clarity. Verify that treating messages with roles 'user' or 'system' as prompts (and others as completions) is the intended behavior.
    - Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%.
      This comment is asking the PR author to verify their intention, which violates the rule against asking for confirmation of intended behavior. It does not provide a specific code suggestion or point out a specific issue that needs addressing.
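    A sketch of the prompt/completion split the draft comment describes; only the role rule (user/system as prompts, everything else as completions) comes from the comment, while the function shape and names are assumptions.

    ```python
    # Illustrative: user/system messages count as prompts, the rest as completions.
    def split_messages(messages):
        prompts, completions = [], []
        prompt_index, completion_index = 0, 0
        for message in messages:
            if message.get("role") in ("user", "system"):
                prompts.append((prompt_index, message))
                prompt_index += 1
            else:
                completions.append((completion_index, message))
                completion_index += 1
        return prompts, completions
    ```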
12. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py:229
    - Draft comment: In runs_create_and_stream_wrapper, consider using a more descriptive variable name (e.g., 'prompt_index' instead of 'i') for consistency with other wrappers.
    - Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
13. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/assistant_wrappers.py:88
    - Draft comment: In runs_create_wrapper, 'run_id' from response_dict is stored. Ensure this attribute is used consistently in downstream instrumentation or logging for effective traceability.
    - Reason this comment was not posted: Comment did not seem useful. Confidence is useful = 0% <= threshold 50%.
      The comment is asking the author to ensure consistent usage of 'run_id' in downstream processes. This falls under the category of asking the author to ensure something is done, which violates the rules.
14. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py:123
    - Draft comment: Typographical error: 'he size of returned vector' should be corrected to 'the size of returned vector'.
    - Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
Workflow ID: wflow_M6Zh7uFkU8LmBBUX
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
Thanks @tongyu0924, can you add tests for this?
feat(instrumentation): ... or fix(instrumentation): ...

Important

Add responses_retrieve_wrapper for OpenAI response retrieval and enhance tracing in assistant_wrappers.py.

- responses_retrieve_wrapper to handle Responses.retrieve in __init__.py and assistant_wrappers.py.
- messages_list_wrapper to include LLM_SYSTEM and LLM_USAGE_PROMPT_TOKENS attributes.
- runs_create_wrapper to store run_id in runs dictionary.
- responses_retrieve_wrapper.
- messages_list_wrapper to handle message roles and content more precisely.
- messages_list_wrapper for clarity.

This description was created by Ellipsis for c8858fa. It will automatically update as commits are pushed.
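As a minimal sketch of attaching the two attributes named in the summary above: it uses the SpanAttributes constants from the opentelemetry-semantic-conventions-ai package this repo builds on, but the helper name and the shape of the usage dict are assumptions, not the PR's messages_list_wrapper code.

```python
# Illustrative helper -- not the PR's actual messages_list_wrapper code.
from opentelemetry.semconv_ai import SpanAttributes

def set_prompt_usage_attributes(span, usage: dict):
    span.set_attribute(SpanAttributes.LLM_SYSTEM, "openai")
    prompt_tokens = (usage or {}).get("prompt_tokens")
    if prompt_tokens is not None:
        span.set_attribute(SpanAttributes.LLM_USAGE_PROMPT_TOKENS, prompt_tokens)
```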