Skip to content

Conversation

@codeflash-ai
Copy link

@codeflash-ai codeflash-ai bot commented Oct 30, 2025

📄 10% (0.10x) speedup for to_custom_raw_response_wrapper in src/openai/_response.py

⏱️ Runtime : 98.3 microseconds 89.6 microseconds (best of 564 runs)

📝 Explanation and details

The optimization eliminates unnecessary dictionary copying and casting operations when handling the extra_headers parameter.

Key Changes:

  1. Removed expensive dictionary unpacking: The original code used {**(cast(Any, kwargs.get("extra_headers")) or {})} which always creates a new dictionary via unpacking, even when extra_headers is already a valid dict.

  2. Added conditional copying logic: The optimized version checks if extra_headers is None (creates empty dict), or if it's not already a dict type (converts to dict), otherwise uses the existing dict directly.

  3. Eliminated unnecessary cast operation: Removed the cast(Any, ...) wrapper which added overhead without functional benefit.

Why This Is Faster:

  • Dictionary unpacking ({**dict}) is computationally expensive as it iterates through all key-value pairs to create a new dictionary
  • The optimized version only performs copying when absolutely necessary (when extra_headers is None or not a dict)
  • Direct dictionary access and modification is much faster than creation + unpacking

Performance Characteristics:
The optimization shows consistent 7-13% speedups across all test cases, with particularly strong performance when:

  • extra_headers is already a dict (most common case) - avoids unnecessary copying
  • Large numbers of headers are present - reduces O(n) copying operations
  • Multiple wrapper calls are made - eliminates repeated unnecessary allocations

The 9% overall speedup comes from reducing the most common code path from "always copy" to "copy only when needed."

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 29 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Generated Regression Tests and Runtime
import functools
from typing import Any, Callable, TypeVar, cast

# imports
import pytest  # used for our unit tests
from openai._response import to_custom_raw_response_wrapper
from typing_extensions import ParamSpec

# --- Constants and Dummy Classes for Testing ---
RAW_RESPONSE_HEADER = "X-OpenAI-Raw-Response"
OVERRIDE_CAST_TO_HEADER = "X-OpenAI-Override-Cast-To"

class APIResponse:
    """Dummy base class for APIResponse."""
    def __init__(self, data: Any):
        self.data = data

class BinaryAPIResponse(APIResponse):
    """Dummy concrete APIResponse for bytes."""
    pass

class JSONAPIResponse(APIResponse):
    """Dummy concrete APIResponse for dict."""
    pass

P = ParamSpec("P")
_APIResponseT = TypeVar("_APIResponseT", bound="APIResponse")
from openai._response import to_custom_raw_response_wrapper

# --- Unit Tests ---

# Basic Test Cases

def test_basic_adds_headers_and_returns_response():
    """Test that the wrapper adds the correct headers and returns the response class."""
    def dummy_api_method(*args, **kwargs):
        # Should receive extra_headers with the correct keys
        extra_headers = kwargs.get("extra_headers")
        # Return the correct type
        return BinaryAPIResponse(b"data")
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, BinaryAPIResponse); wrapped = codeflash_output # 4.01μs -> 3.71μs (7.95% faster)
    result = wrapped()

def test_basic_with_args_and_kwargs():
    """Test that positional and keyword arguments are passed through correctly."""
    def dummy_api_method(a, b, c=None, extra_headers=None):
        return JSONAPIResponse({"ok": True})
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, JSONAPIResponse); wrapped = codeflash_output # 3.60μs -> 3.21μs (12.2% faster)
    result = wrapped(1, 2, c=3)

def test_basic_preserves_func_metadata():
    """Test that functools.wraps preserves the original function's metadata."""
    def dummy_api_method():
        """This is a docstring."""
        return BinaryAPIResponse(b"foo")
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, BinaryAPIResponse); wrapped = codeflash_output # 3.44μs -> 3.11μs (10.3% faster)

# Edge Test Cases

def test_extra_headers_are_merged_not_overwritten():
    """Test that existing extra_headers are merged, not replaced."""
    def dummy_api_method(extra_headers=None):
        return JSONAPIResponse({"merged": True})
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, JSONAPIResponse); wrapped = codeflash_output # 3.34μs -> 2.97μs (12.5% faster)
    result = wrapped(extra_headers={"existing": "value"})

def test_extra_headers_none():
    """Test that extra_headers=None is handled gracefully."""
    def dummy_api_method(extra_headers=None):
        return BinaryAPIResponse(b"none")
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, BinaryAPIResponse); wrapped = codeflash_output # 3.27μs -> 2.93μs (11.4% faster)
    result = wrapped(extra_headers=None)

def test_extra_headers_empty_dict():
    """Test that extra_headers={} is handled gracefully."""
    def dummy_api_method(extra_headers=None):
        return JSONAPIResponse({"empty": True})
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, JSONAPIResponse); wrapped = codeflash_output # 3.35μs -> 3.03μs (10.6% faster)
    result = wrapped(extra_headers={})

def test_response_cls_is_used_in_header():
    """Test that the response_cls is passed as value to OVERRIDE_CAST_TO_HEADER."""
    def dummy_api_method(extra_headers=None):
        return BinaryAPIResponse(b"header")
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, BinaryAPIResponse); wrapped = codeflash_output # 3.35μs -> 2.98μs (12.7% faster)
    result = wrapped()

def test_kwargs_not_mutated_outside():
    """Test that kwargs passed to wrapped are not mutated outside the function."""
    def dummy_api_method(extra_headers=None):
        # Mutate extra_headers inside
        extra_headers["test"] = "mutated"
        return JSONAPIResponse({"ok": True})
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, JSONAPIResponse); wrapped = codeflash_output # 3.22μs -> 3.03μs (6.54% faster)
    headers = {"original": "value"}
    result = wrapped(extra_headers=headers)

def test_func_returns_wrong_type_raises():
    """Test that if func returns wrong type, the cast is not enforced, but type may not match."""
    class NotAPIResponse:
        pass
    def dummy_api_method(extra_headers=None):
        return NotAPIResponse()
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, BinaryAPIResponse); wrapped = codeflash_output # 3.54μs -> 3.18μs (11.3% faster)
    result = wrapped()

def test_func_raises_exception_propagates():
    """Test that exceptions in the wrapped function propagate."""
    def dummy_api_method(extra_headers=None):
        raise RuntimeError("fail")
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, BinaryAPIResponse); wrapped = codeflash_output # 3.29μs -> 2.96μs (11.1% faster)
    with pytest.raises(RuntimeError):
        wrapped()

# Large Scale Test Cases

def test_large_scale_many_headers():
    """Test with a large number of extra_headers to ensure merging is efficient."""
    def dummy_api_method(extra_headers=None):
        # Should contain all original headers plus the two new ones
        for i in range(1000):
            pass
        return JSONAPIResponse({"large": True})
    headers = {f"key{i}": f"value{i}" for i in range(1000)}
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, JSONAPIResponse); wrapped = codeflash_output # 3.66μs -> 3.42μs (6.80% faster)
    result = wrapped(extra_headers=headers)

def test_large_scale_many_args_kwargs():
    """Test passing many positional and keyword arguments through the wrapper."""
    def dummy_api_method(*args, **kwargs):
        for i in range(1000):
            pass
        extra_headers = kwargs["extra_headers"]
        return BinaryAPIResponse(b"many")
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, BinaryAPIResponse); wrapped = codeflash_output # 3.41μs -> 3.03μs (12.5% faster)
    kw = {f"k{i}": i for i in range(1000)}
    result = wrapped(*range(1000), **kw)

def test_large_scale_return_large_response():
    """Test that a large APIResponse object is returned correctly."""
    large_data = b"x" * 1000
    def dummy_api_method(extra_headers=None):
        return BinaryAPIResponse(large_data)
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, BinaryAPIResponse); wrapped = codeflash_output # 3.64μs -> 3.53μs (3.32% faster)
    result = wrapped()

def test_large_scale_multiple_calls_consistency():
    """Test that multiple calls to the wrapper remain deterministic and do not leak state."""
    def dummy_api_method(extra_headers=None):
        return JSONAPIResponse({"call": True})
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_method, JSONAPIResponse); wrapped = codeflash_output # 3.28μs -> 3.04μs (8.07% faster)
    for _ in range(100):
        result = wrapped()
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import functools
# function to test
from typing import Any, Callable, TypeVar, cast

# imports
import pytest  # used for our unit tests
from openai._response import to_custom_raw_response_wrapper
from typing_extensions import ParamSpec

# Simulate the constants and APIResponse class as they would be in openai/_response.py
RAW_RESPONSE_HEADER = "X-OpenAI-RAW-RESPONSE"
OVERRIDE_CAST_TO_HEADER = "X-OpenAI-OVERRIDE-CAST-TO"

P = ParamSpec("P")
_APIResponseT = TypeVar("_APIResponseT", bound="APIResponse[Any]")

class APIResponse:
    """Dummy base class for APIResponse."""
    def __init__(self, data: Any, headers: dict[str, Any]):
        self.data = data
        self.headers = headers

class BinaryAPIResponse(APIResponse):
    pass

class JsonAPIResponse(APIResponse):
    pass
from openai._response import to_custom_raw_response_wrapper

# ---- UNIT TESTS ----

# Basic Test Cases

def dummy_api_func(data, extra_headers=None):
    """Simulates a basic API function returning a BinaryAPIResponse."""
    return BinaryAPIResponse(data, extra_headers)

def dummy_api_func_json(data, extra_headers=None):
    """Simulates a basic API function returning a JsonAPIResponse."""
    return JsonAPIResponse(data, extra_headers)

def test_basic_binary_response():
    """Test that the wrapper injects correct headers and returns BinaryAPIResponse."""
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_func, BinaryAPIResponse); wrapped = codeflash_output # 3.52μs -> 3.12μs (12.9% faster)
    resp = wrapped("hello world")

def test_basic_json_response():
    """Test that the wrapper injects correct headers and returns JsonAPIResponse."""
    codeflash_output = to_custom_raw_response_wrapper(dummy_api_func_json, JsonAPIResponse); wrapped = codeflash_output # 3.36μs -> 2.99μs (12.2% faster)
    resp = wrapped({"foo": "bar"})

def test_preserves_other_kwargs():
    """Test that other kwargs are preserved and extra_headers is merged."""
    def func(data, foo=None, extra_headers=None):
        return BinaryAPIResponse(data, extra_headers)
    codeflash_output = to_custom_raw_response_wrapper(func, BinaryAPIResponse); wrapped = codeflash_output # 3.40μs -> 3.12μs (9.14% faster)
    resp = wrapped("data", foo="baz")

def test_merges_existing_extra_headers():
    """Test that the wrapper merges with existing extra_headers."""
    def func(data, extra_headers=None):
        return BinaryAPIResponse(data, extra_headers)
    codeflash_output = to_custom_raw_response_wrapper(func, BinaryAPIResponse); wrapped = codeflash_output # 3.41μs -> 3.04μs (12.1% faster)
    resp = wrapped("data", extra_headers={"existing": "value"})

# Edge Test Cases

def test_empty_extra_headers():
    """Test with empty extra_headers passed in."""
    def func(data, extra_headers=None):
        return BinaryAPIResponse(data, extra_headers)
    codeflash_output = to_custom_raw_response_wrapper(func, BinaryAPIResponse); wrapped = codeflash_output # 3.34μs -> 3.02μs (10.7% faster)
    resp = wrapped("data", extra_headers={})

def test_none_extra_headers():
    """Test with None for extra_headers."""
    def func(data, extra_headers=None):
        return BinaryAPIResponse(data, extra_headers)
    codeflash_output = to_custom_raw_response_wrapper(func, BinaryAPIResponse); wrapped = codeflash_output # 3.30μs -> 3.00μs (10.1% faster)
    resp = wrapped("data", extra_headers=None)

def test_overwrites_existing_raw_response_header():
    """Test that the wrapper overwrites RAW_RESPONSE_HEADER if it already exists."""
    def func(data, extra_headers=None):
        return BinaryAPIResponse(data, extra_headers)
    codeflash_output = to_custom_raw_response_wrapper(func, BinaryAPIResponse); wrapped = codeflash_output # 3.26μs -> 3.04μs (7.37% faster)
    resp = wrapped("data", extra_headers={RAW_RESPONSE_HEADER: "not-raw"})

def test_overwrites_existing_override_cast_to_header():
    """Test that the wrapper overwrites OVERRIDE_CAST_TO_HEADER if it already exists."""
    def func(data, extra_headers=None):
        return BinaryAPIResponse(data, extra_headers)
    codeflash_output = to_custom_raw_response_wrapper(func, BinaryAPIResponse); wrapped = codeflash_output # 3.29μs -> 2.98μs (10.3% faster)
    resp = wrapped("data", extra_headers={OVERRIDE_CAST_TO_HEADER: "not-a-class"})

def test_kwargs_are_passed_through():
    """Test that arbitrary kwargs are passed through."""
    def func(data, foo=None, bar=None, extra_headers=None):
        return BinaryAPIResponse(data, extra_headers)
    codeflash_output = to_custom_raw_response_wrapper(func, BinaryAPIResponse); wrapped = codeflash_output # 3.22μs -> 2.94μs (9.55% faster)
    resp = wrapped("data", foo="foo", bar="bar")

def test_positional_args():
    """Test that positional arguments are supported."""
    def func(a, b, extra_headers=None):
        return BinaryAPIResponse((a, b), extra_headers)
    codeflash_output = to_custom_raw_response_wrapper(func, BinaryAPIResponse); wrapped = codeflash_output # 3.25μs -> 2.89μs (12.2% faster)
    resp = wrapped(1, 2)

def test_return_type_casting():
    """Test that the return type is cast to the correct response class."""
    def func(data, extra_headers=None):
        # Return a base APIResponse but should be cast to BinaryAPIResponse
        return APIResponse(data, extra_headers)
    codeflash_output = to_custom_raw_response_wrapper(func, BinaryAPIResponse); wrapped = codeflash_output # 3.25μs -> 2.95μs (10.1% faster)
    resp = wrapped("data")

# Large Scale Test Cases

def test_large_extra_headers():
    """Test with a large number of extra_headers."""
    def func(data, extra_headers=None):
        # Should have all the injected headers plus the originals
        for i in range(500):
            pass
        return BinaryAPIResponse(data, extra_headers)
    large_headers = {f"key{i}": f"value{i}" for i in range(500)}
    codeflash_output = to_custom_raw_response_wrapper(func, BinaryAPIResponse); wrapped = codeflash_output # 3.44μs -> 3.21μs (6.91% faster)
    resp = wrapped("data", extra_headers=large_headers)
    for i in range(500):
        pass

def test_large_data_payload():
    """Test with a large data payload."""
    large_data = "x" * 1000  # 1000 characters
    def func(data, extra_headers=None):
        return BinaryAPIResponse(data, extra_headers)
    codeflash_output = to_custom_raw_response_wrapper(func, BinaryAPIResponse); wrapped = codeflash_output # 3.29μs -> 3.06μs (7.35% faster)
    resp = wrapped(large_data)

def test_large_number_of_calls():
    """Test the wrapper under repeated calls for performance and determinism."""
    def func(data, extra_headers=None):
        return BinaryAPIResponse(data, extra_headers)
    codeflash_output = to_custom_raw_response_wrapper(func, BinaryAPIResponse); wrapped = codeflash_output # 3.12μs -> 2.93μs (6.66% faster)
    for i in range(1000):
        resp = wrapped(str(i))

def test_large_kwargs():
    """Test passing a large number of kwargs."""
    def func(data, extra_headers=None, **kwargs):
        for i in range(500):
            pass
        return BinaryAPIResponse(data, extra_headers)
    large_kwargs = {f"foo{i}": f"bar{i}" for i in range(500)}
    codeflash_output = to_custom_raw_response_wrapper(func, BinaryAPIResponse); wrapped = codeflash_output # 3.41μs -> 3.17μs (7.46% faster)
    resp = wrapped("data", **large_kwargs)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-to_custom_raw_response_wrapper-mhcx7bnf and push.

Codeflash Static Badge

The optimization eliminates unnecessary dictionary copying and casting operations when handling the `extra_headers` parameter. 

**Key Changes:**
1. **Removed expensive dictionary unpacking**: The original code used `{**(cast(Any, kwargs.get("extra_headers")) or {})}` which always creates a new dictionary via unpacking, even when `extra_headers` is already a valid dict.

2. **Added conditional copying logic**: The optimized version checks if `extra_headers` is None (creates empty dict), or if it's not already a dict type (converts to dict), otherwise uses the existing dict directly.

3. **Eliminated unnecessary cast operation**: Removed the `cast(Any, ...)` wrapper which added overhead without functional benefit.

**Why This Is Faster:**
- Dictionary unpacking (`{**dict}`) is computationally expensive as it iterates through all key-value pairs to create a new dictionary
- The optimized version only performs copying when absolutely necessary (when `extra_headers` is None or not a dict)
- Direct dictionary access and modification is much faster than creation + unpacking

**Performance Characteristics:**
The optimization shows consistent 7-13% speedups across all test cases, with particularly strong performance when:
- `extra_headers` is already a dict (most common case) - avoids unnecessary copying
- Large numbers of headers are present - reduces O(n) copying operations
- Multiple wrapper calls are made - eliminates repeated unnecessary allocations

The 9% overall speedup comes from reducing the most common code path from "always copy" to "copy only when needed."
@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 30, 2025 04:24
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant