@codeflash-ai codeflash-ai bot commented Oct 30, 2025

📄 19% (0.19x) speedup for main in src/openai/cli/_cli.py

⏱️ Runtime : 1.67 seconds → 1.40 seconds (best of 5 runs)

📝 Explanation and details

The optimized code achieves a 19% speedup by improving the proxy handling logic, which is the primary bottleneck when multiple proxies are present.

Key optimization: Efficient proxy deduplication
The original code checked for duplicate proxies by tracking them in the proxies dict during iteration, raising an error immediately when duplicates were found. The optimized version uses a two-pass approach:

  1. First pass (reversed): Collects only the last proxy of each protocol type
  2. Second pass (forward): Counts duplicates to maintain the same error behavior
  3. Final pass: Builds the actual proxy transport objects only for unique protocols
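
The three passes above can be sketched as follows. This is a hypothetical, self-contained reconstruction: the real code in src/openai/cli/_cli.py builds `httpx.HTTPTransport` objects and raises its own `CLIError`; here a cheap placeholder stands in for the expensive transport construction.

```python
from collections import Counter
from urllib.parse import urlparse


class CLIError(Exception):
    """Stand-in for the CLI's error type."""


def expensive_transport(proxy_url: str) -> str:
    # Placeholder for httpx.HTTPTransport(proxy=...), the costly
    # construction that the optimization defers.
    return f"transport({proxy_url})"


def build_proxy_mounts(proxies: list[str]) -> dict[str, str]:
    # Pass 1 (reversed): remember only the last proxy seen per scheme.
    last_by_scheme: dict[str, str] = {}
    for proxy in reversed(proxies):
        last_by_scheme.setdefault(urlparse(proxy).scheme, proxy)

    # Pass 2 (forward): count per-scheme occurrences so duplicates
    # still raise the same error the original code raised.
    counts = Counter(urlparse(proxy).scheme for proxy in proxies)
    for scheme, n in counts.items():
        if n > 1:
            raise CLIError(f"cannot use more than one {scheme} proxy")

    # Pass 3: only now build the expensive transports, one per
    # unique scheme.
    return {
        f"{scheme}://": expensive_transport(proxy)
        for scheme, proxy in last_by_scheme.items()
    }
```

Because both string-level passes complete before any transport is built, duplicate-heavy inputs fail fast without paying the construction cost.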

Why this is faster:

  • Reduced expensive operations: The original code created httpx.HTTPTransport objects even for proxies that would be rejected due to duplicates. The optimized version only creates transport objects for proxies that will actually be used.
  • Better cache locality: The optimized version processes all proxy strings first (lightweight string operations), then does the heavy HTTP transport creation in a single batch.
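
For contrast, the original single-pass shape described above does the expensive construction inside the loop, so transports for earlier proxies are built before a later duplicate triggers the error. Again a self-contained hypothetical sketch, with a placeholder for `httpx.HTTPTransport`:

```python
class CLIError(Exception):
    """Stand-in for the CLI's error type."""


def expensive_transport(proxy_url: str) -> str:
    # Placeholder for httpx.HTTPTransport construction.
    return f"transport({proxy_url})"


def build_proxy_mounts_original(proxies: list[str]) -> dict[str, str]:
    mounts: dict[str, str] = {}
    for proxy in proxies:
        mount = "https://" if proxy.startswith("https") else "http://"
        if mount in mounts:
            # Raised only after transports for earlier proxies were
            # already (needlessly) constructed.
            raise CLIError(f"cannot use more than one {mount} proxy")
        # Expensive work happens per iteration, before duplicates are known.
        mounts[mount] = expensive_transport(proxy)
    return mounts
```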

Test case performance analysis:

  • Massive gains on duplicate proxy cases: Tests with multiple HTTP/HTTPS proxies show 800-900% speedups because the optimized version avoids creating expensive transport objects for duplicates
  • Minimal impact on normal cases: Single proxy or no-proxy scenarios show negligible performance differences (0-1% variation), confirming the optimization doesn't hurt common usage patterns
  • Scales well: Large-scale tests with 500+ proxies demonstrate the optimization's effectiveness grows with input size

The optimization maintains identical behavior and error messages while significantly reducing computational overhead in proxy-heavy scenarios.

Correctness verification report:

Test                           Status
⚙️ Existing Unit Tests         🔘 None Found
🌀 Generated Regression Tests  39 Passed
⏪ Replay Tests                🔘 None Found
🔎 Concolic Coverage Tests     🔘 None Found
📊 Tests Coverage              77.8%
🌀 Generated Regression Tests and Runtime
import argparse
import sys
import types

# imports
import pytest
from openai.cli._cli import main

# --- Mocks and helpers for dependencies ---

# Mock openai module
class OpenAIMock:
    def __init__(self):
        self.organization = None
        self.api_key = None
        self.base_url = None
        self.api_type = None
        self.azure_endpoint = None
        self.api_version = None
        self.azure_ad_token = None
        self.http_client = None

openai = OpenAIMock()
__version__ = "1.2.3"

class ClientMock:
    def __init__(self, mounts=None, http2=None):
        self.closed = False
        self.mounts = mounts
        self.http2 = http2
    def close(self):
        self.closed = True

# Mock pydantic module
class ValidationError(Exception):
    pass

pydantic = types.SimpleNamespace(ValidationError=ValidationError)

# Mock CLIError and APIError
class CLIError(Exception):
    pass

class APIError(Exception):
    pass

# --- Arguments model and model_parse ---
class Arguments:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
        # Set defaults for all possible args
        self.verbosity = kwargs.get("verbosity", 0)
        self.proxy = kwargs.get("proxy", None)
        self.organization = kwargs.get("organization", None)
        self.api_key = kwargs.get("api_key", None)
        self.api_base = kwargs.get("api_base", None)
        self.api_type = kwargs.get("api_type", None)
        self.api_version = kwargs.get("api_version", None)
        self.azure_endpoint = kwargs.get("azure_endpoint", None)
        self.azure_ad_token = kwargs.get("azure_ad_token", None)
        self.args_model = kwargs.get("args_model", None)
        self.allow_unknown_args = kwargs.get("allow_unknown_args", True)
        self.func = kwargs.get("func", None)

def model_parse(cls, dct):
    return cls(**dct)

# --- _tools.register_commands stub ---
class _tools:
    @staticmethod
    def register_commands(sub_tools, subparsers):
        pass

# --- register_commands stub ---
def register_commands(sub_api):
    # Add dummy command for testing
    def dummy_command(args=None):
        sys.stdout.write("dummy command executed\n")
    sub_api.set_defaults(func=dummy_command)

def _build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description=None, prog="openai")
    parser.add_argument(
        "-v",
        "--verbose",
        action="count",
        dest="verbosity",
        default=0,
        help="Set verbosity.",
    )
    parser.add_argument("-b", "--api-base", help="What API base url to use.")
    parser.add_argument("-k", "--api-key", help="What API key to use.")
    parser.add_argument("-p", "--proxy", nargs="+", help="What proxy to use.")
    parser.add_argument(
        "-o",
        "--organization",
        help="Which organization to run as (will use your default organization if not specified)",
    )
    parser.add_argument(
        "-t",
        "--api-type",
        type=str,
        choices=("openai", "azure"),
        help="The backend API to call, must be `openai` or `azure`",
    )
    parser.add_argument(
        "--api-version",
        help="The Azure API version, e.g. 'https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#rest-api-versioning'",
    )
    parser.add_argument(
        "--azure-endpoint",
        help="The Azure endpoint, e.g. 'https://endpoint.openai.azure.com'",
    )
    parser.add_argument(
        "--azure-ad-token",
        help="A token from Azure Active Directory, https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id",
    )
    parser.add_argument(
        "-V",
        "--version",
        action="version",
        version="%(prog)s " + __version__,
    )

    def help() -> None:
        parser.print_help()

    parser.set_defaults(func=help)

    subparsers = parser.add_subparsers()
    sub_api = subparsers.add_parser("api", help="Direct API calls")
    register_commands(sub_api)
    sub_tools = subparsers.add_parser("tools", help="Client side tools for convenience")
    _tools.register_commands(sub_tools, subparsers)
    return parser

# --- _parse_args implementation ---
def _parse_args(parser: argparse.ArgumentParser) -> tuple[argparse.Namespace, Arguments, list[str]]:
    if "--" in sys.argv:
        idx = sys.argv.index("--")
        known_args = sys.argv[1:idx]
        unknown_args = sys.argv[idx:]
    else:
        known_args = sys.argv[1:]
        unknown_args = []

    parsed, remaining_unknown = parser.parse_known_args(known_args)
    remaining_unknown.extend(unknown_args)
    args = model_parse(Arguments, vars(parsed))
    if not args.allow_unknown_args:
        parser.parse_args()
    return parsed, args, remaining_unknown
from openai.cli._cli import main

# --- BASIC TEST CASES ---

def test_main_help(capsys):
    """Test: Running with no arguments should print help and return 0."""
    sys.argv = ["openai"]
    codeflash_output = main(); result = codeflash_output
    out, err = capsys.readouterr()

def test_main_version(capsys):
    """Test: Running with --version prints version and exits."""
    sys.argv = ["openai", "--version"]
    with pytest.raises(SystemExit):
        main()
    out, err = capsys.readouterr()

def test_main_api_key_and_base(monkeypatch):
    """Test: Setting api_key and api_base updates openai state."""
    sys.argv = ["openai", "-k", "sk-123", "-b", "https://api.example.com"]
    codeflash_output = main(); result = codeflash_output # 43.3ms -> 43.6ms (0.632% slower)

def test_main_organization(monkeypatch):
    """Test: Setting organization updates openai.organization."""
    sys.argv = ["openai", "-o", "org-test"]
    codeflash_output = main(); result = codeflash_output # 43.5ms -> 43.6ms (0.233% slower)

def test_main_api_type(monkeypatch):
    """Test: Setting api_type to azure updates openai.api_type."""
    sys.argv = ["openai", "-t", "azure"]
    codeflash_output = main(); result = codeflash_output # 43.2ms -> 43.5ms (0.631% slower)

def test_main_azure_fields(monkeypatch):
    """Test: Setting Azure fields updates openai state."""
    sys.argv = [
        "openai", "--azure-endpoint", "https://endpoint.azure.com",
        "--api-version", "2023-01-01", "--azure-ad-token", "token123"
    ]
    codeflash_output = main(); result = codeflash_output # 43.2ms -> 43.5ms (0.627% slower)

def test_main_proxy_http(monkeypatch):
    """Test: Setting HTTP proxy configures http_client."""
    sys.argv = ["openai", "-p", "http://proxy.example.com"]
    codeflash_output = main(); result = codeflash_output # 80.8ms -> 81.3ms (0.646% slower)

def test_main_proxy_https(monkeypatch):
    """Test: Setting HTTPS proxy configures http_client."""
    sys.argv = ["openai", "-p", "https://secureproxy.example.com"]
    codeflash_output = main(); result = codeflash_output # 82.0ms -> 82.3ms (0.422% slower)

def test_main_api_subcommand(capsys):
    """Test: Running 'api' subcommand executes dummy command."""
    sys.argv = ["openai", "api"]
    codeflash_output = main(); result = codeflash_output
    out, err = capsys.readouterr()

def test_main_tools_subcommand(capsys):
    """Test: Running 'tools' subcommand prints help (no commands registered)."""
    sys.argv = ["openai", "tools"]
    codeflash_output = main(); result = codeflash_output
    out, err = capsys.readouterr()

def test_main_verbosity_warning(capsys):
    """Test: Setting verbosity prints warning to stderr."""
    sys.argv = ["openai", "-v"]
    codeflash_output = main(); result = codeflash_output
    out, err = capsys.readouterr()

# --- EDGE TEST CASES ---

def test_main_multiple_http_proxies(monkeypatch):
    """Test: Multiple HTTP proxies should raise CLIError and return 1."""
    sys.argv = ["openai", "-p", "http://proxy1.com", "http://proxy2.com"]
    codeflash_output = main(); result = codeflash_output # 42.3ms -> 4.21ms (903% faster)

def test_main_multiple_https_proxies(monkeypatch):
    """Test: Multiple HTTPS proxies should raise CLIError and return 1."""
    sys.argv = ["openai", "-p", "https://proxy1.com", "https://proxy2.com"]
    codeflash_output = main(); result = codeflash_output # 42.2ms -> 4.06ms (939% faster)






def test_main_no_args(monkeypatch):
    """Test: Running with no arguments should not error."""
    sys.argv = ["openai"]
    codeflash_output = main(); result = codeflash_output # 44.2ms -> 44.0ms (0.382% faster)

def test_main_invalid_api_type(monkeypatch):
    """Test: Invalid api_type should cause argparse error."""
    sys.argv = ["openai", "-t", "invalid"]
    with pytest.raises(SystemExit):
        main()

def test_main_func_args_model(monkeypatch):
    """Test: If args_model is set, func is called with model_parse result."""
    called = {}
    def func(arg):
        called["called"] = True
        called["arg"] = arg
    sys.argv = ["openai", "api"]
    parser = _build_parser()
    parsed, args, unknown = _parse_args(parser)
    args.args_model = Arguments
    args.func = func
    monkeypatch.setattr(__name__ + "._parse_args", lambda parser: (parsed, args, unknown))
    codeflash_output = main(); result = codeflash_output # 43.6ms -> 43.7ms (0.271% slower)

# --- LARGE SCALE TEST CASES ---

def test_main_large_number_of_proxies(monkeypatch):
    """Test: Many proxies, but only one per protocol allowed."""
    # Should succeed with one http and one https
    sys.argv = ["openai", "-p", "http://proxy.com", "https://secure.com"]
    codeflash_output = main(); result = codeflash_output # 118ms -> 118ms (0.030% slower)

def test_main_large_number_of_args(monkeypatch):
    """Test: Many arguments should be handled efficiently."""
    pass  # body elided in the generated output
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import argparse
import sys
import types

# imports
import pytest
from openai.cli._cli import main


class DummyArguments:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
        # Default values for all possible args
        self.verbosity = kwargs.get("verbosity", 0)
        self.proxy = kwargs.get("proxy", None)
        self.organization = kwargs.get("organization", None)
        self.api_key = kwargs.get("api_key", None)
        self.api_base = kwargs.get("api_base", None)
        self.api_type = kwargs.get("api_type", None)
        self.azure_endpoint = kwargs.get("azure_endpoint", None)
        self.api_version = kwargs.get("api_version", None)
        self.azure_ad_token = kwargs.get("azure_ad_token", None)
        self.args_model = kwargs.get("args_model", None)
        self.allow_unknown_args = kwargs.get("allow_unknown_args", True)

class DummyAPIError(Exception): pass
class DummyCLIError(Exception): pass
class DummyValidationError(Exception): pass

def dummy_display_error(err):
    dummy_display_error.last_error = err

def dummy_register_commands(parser):
    parser.func = lambda: setattr(dummy_register_commands, "called", True)

def dummy_tools_register_commands(sub_tools, subparsers):
    sub_tools.func = lambda: setattr(dummy_tools_register_commands, "called", True)

def dummy_model_parse(cls, dct):
    return cls(**dct)

class DummyOpenAI:
    def __init__(self):
        self.organization = None
        self.api_key = None
        self.base_url = None
        self.api_type = None
        self.azure_endpoint = None
        self.api_version = None
        self.azure_ad_token = None
        self.http_client = None

openai = DummyOpenAI()
pydantic = types.SimpleNamespace(ValidationError=DummyValidationError)
APIError = DummyAPIError
CLIError = DummyCLIError
display_error = dummy_display_error
model_parse = dummy_model_parse

def _parse_args(parser: argparse.ArgumentParser):
    # argparse by default will strip out the `--` but we want to keep it for unknown arguments
    if "--" in sys.argv:
        idx = sys.argv.index("--")
        known_args = sys.argv[1:idx]
        unknown_args = sys.argv[idx:]
    else:
        known_args = sys.argv[1:]
        unknown_args = []
    parsed, remaining_unknown = parser.parse_known_args(known_args)
    remaining_unknown.extend(unknown_args)
    args = model_parse(DummyArguments, vars(parsed))
    if not args.allow_unknown_args:
        parser.parse_args()
    return parsed, args, remaining_unknown
from openai.cli._cli import main

# --- Basic Test Cases ---

def test_main_basic_no_args(monkeypatch):
    """Test main() with no arguments - should call help and return 0."""
    sys.argv = ["openai"]
    codeflash_output = main(); ret = codeflash_output # 45.4ms -> 45.4ms (0.048% slower)

def test_main_basic_api_key(monkeypatch):
    """Test main() with api-key argument."""
    sys.argv = ["openai", "-k", "testkey"]
    codeflash_output = main(); ret = codeflash_output # 43.5ms -> 43.5ms (0.108% slower)

def test_main_basic_organization(monkeypatch):
    """Test main() with organization argument."""
    sys.argv = ["openai", "-o", "org123"]
    codeflash_output = main(); ret = codeflash_output # 43.2ms -> 43.4ms (0.436% slower)

def test_main_basic_api_base(monkeypatch):
    """Test main() with api-base argument."""
    sys.argv = ["openai", "-b", "https://api.example.com"]
    codeflash_output = main(); ret = codeflash_output # 43.4ms -> 43.5ms (0.142% slower)

def test_main_basic_api_type_openai(monkeypatch):
    """Test main() with api-type 'openai'."""
    sys.argv = ["openai", "-t", "openai"]
    codeflash_output = main(); ret = codeflash_output # 43.4ms -> 43.5ms (0.364% slower)

def test_main_basic_api_type_azure(monkeypatch):
    """Test main() with api-type 'azure'."""
    sys.argv = ["openai", "-t", "azure"]
    codeflash_output = main(); ret = codeflash_output # 43.1ms -> 43.6ms (0.958% slower)

def test_main_basic_azure_endpoint(monkeypatch):
    """Test main() with azure-endpoint argument."""
    sys.argv = ["openai", "--azure-endpoint", "https://endpoint.openai.azure.com"]
    codeflash_output = main(); ret = codeflash_output # 43.3ms -> 43.5ms (0.488% slower)

def test_main_basic_api_version(monkeypatch):
    """Test main() with api-version argument."""
    sys.argv = ["openai", "--api-version", "2023-06-01-preview"]
    codeflash_output = main(); ret = codeflash_output # 43.2ms -> 43.5ms (0.720% slower)

def test_main_basic_azure_ad_token(monkeypatch):
    """Test main() with azure-ad-token argument."""
    sys.argv = ["openai", "--azure-ad-token", "token123"]
    codeflash_output = main(); ret = codeflash_output # 43.5ms -> 43.4ms (0.205% faster)

def test_main_basic_proxy_http(monkeypatch):
    """Test main() with http proxy."""
    sys.argv = ["openai", "-p", "http://proxy.example.com"]
    codeflash_output = main(); ret = codeflash_output # 81.2ms -> 81.4ms (0.311% slower)

def test_main_basic_proxy_https(monkeypatch):
    """Test main() with https proxy."""
    sys.argv = ["openai", "-p", "https://proxy.example.com"]
    codeflash_output = main(); ret = codeflash_output # 81.6ms -> 82.3ms (0.831% slower)

def test_main_basic_subcommand_api(monkeypatch):
    """Test main() with subcommand 'api'."""
    sys.argv = ["openai", "api"]
    codeflash_output = main(); ret = codeflash_output # 44.6ms -> 44.7ms (0.127% slower)

def test_main_basic_subcommand_tools(monkeypatch):
    """Test main() with subcommand 'tools'."""
    sys.argv = ["openai", "tools"]
    codeflash_output = main(); ret = codeflash_output # 43.4ms -> 43.7ms (0.620% slower)

# --- Edge Test Cases ---

def test_main_edge_multiple_http_proxies(monkeypatch):
    """Test main() with multiple http proxies (should raise CLIError)."""
    sys.argv = ["openai", "-p", "http://proxy1.com", "http://proxy2.com"]
    codeflash_output = main(); ret = codeflash_output # 42.4ms -> 4.18ms (913% faster)

def test_main_edge_multiple_https_proxies(monkeypatch):
    """Test main() with multiple https proxies (should raise CLIError)."""
    sys.argv = ["openai", "-p", "https://proxy1.com", "https://proxy2.com"]
    codeflash_output = main(); ret = codeflash_output # 42.1ms -> 4.05ms (938% faster)






def test_main_edge_verbosity(monkeypatch):
    """Test main() with verbosity argument."""
    sys.argv = ["openai", "-v"]
    codeflash_output = main(); ret = codeflash_output # 43.8ms -> 43.8ms (0.075% faster)

def test_main_edge_max_verbosity(monkeypatch):
    """Test main() with maximum verbosity."""
    sys.argv = ["openai", "-vvv"]
    codeflash_output = main(); ret = codeflash_output # 43.6ms -> 43.9ms (0.737% slower)



def test_main_large_scale_many_proxies(monkeypatch):
    """Test main() with 500 proxies (should only allow one per protocol)."""
    proxies = [f"http://proxy{i}.com" for i in range(500)]
    sys.argv = ["openai", "-p"] + proxies
    codeflash_output = main(); ret = codeflash_output # 42.8ms -> 4.70ms (810% faster)

def test_main_large_scale_many_https_proxies(monkeypatch):
    """Test main() with 500 https proxies (should only allow one per protocol)."""
    proxies = [f"https://proxy{i}.com" for i in range(500)]
    sys.argv = ["openai", "-p"] + proxies
    codeflash_output = main(); ret = codeflash_output # 42.5ms -> 4.64ms (816% faster)


def test_main_large_scale_many_verbosity(monkeypatch):
    """Test main() with maximum allowed verbosity (1000)."""
    sys.argv = ["openai"] + ["-v"] * 1000
    codeflash_output = main(); ret = codeflash_output # 96.4ms -> 96.1ms (0.241% faster)

def test_main_large_scale_proxy_mix(monkeypatch):
    """Test main() with 999 http and 1 https proxy (should raise CLIError for http)."""
    proxies = [f"http://proxy{i}.com" for i in range(999)] + ["https://proxy999.com"]
    sys.argv = ["openai", "-p"] + proxies
    codeflash_output = main(); ret = codeflash_output # 43.5ms -> 5.19ms (737% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-main-mhcypr1j` and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 30, 2025 05:07
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 30, 2025