@codeflash-ai codeflash-ai bot commented Oct 30, 2025

📄 162% (1.62x) speedup for get_base_setup in wandb/sdk/launch/builder/build.py

⏱️ Runtime : 5.09 milliseconds → 1.94 milliseconds (best of 77 runs)

📝 Explanation and details

The optimized code achieves a 161% speedup through three key micro-optimizations:

1. Eliminate redundant string splitting: The original code calls py_version.split(".") every time, but the optimized version splits once and reuses the result (py_ver_split), reducing string processing overhead.

2. Use tuples instead of lists for static data: Changed python_packages = [...] to python_packages = (...). Tuples are more memory-efficient and faster to create than lists when the data doesn't need to be modified.

3. Streamline conditional assignment: Replaced the if-else block for python_base_image with a single conditional expression, reducing branching overhead.

Performance impact by test case:

  • Accelerator setups see the biggest gains (324-450% faster) because they hit the most expensive code paths in the original version
  • CPU-only setups show modest improvements (2-7% faster) as they avoid the heavy logging overhead
  • Large-scale tests benefit significantly (229-386% faster) where the micro-optimizations compound across many iterations

The optimizations are particularly effective for workloads with many Docker setup calls or when using accelerator base images, which is common in ML deployment pipelines.
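The relative cost of the patterns involved can be checked with a quick `timeit` sketch. This is illustrative only and does not reproduce the benchmark above; the snippets and iteration count are arbitrary choices, not part of the report.

```python
# Illustrative micro-benchmark of the two cheapest patterns the
# optimization relies on: splitting once vs. twice, and building a
# list literal vs. a tuple literal of constants.
import timeit

# Split the version string twice vs. once per "call".
repeated = timeit.timeit(
    'a = "3.10".split("."); b = "3.10".split(".")', number=100_000
)
single = timeit.timeit(
    'parts = "3.10".split("."); a = parts; b = parts', number=100_000
)

# List literal is rebuilt each time; a tuple of constants is folded
# into a single constant load by CPython's compiler.
list_lit = timeit.timeit('["python3-dev", "gcc", "make"]', number=100_000)
tuple_lit = timeit.timeit('("python3-dev", "gcc", "make")', number=100_000)

print(f"split twice: {repeated:.4f}s  split once: {single:.4f}s")
print(f"list literal: {list_lit:.4f}s  tuple literal: {tuple_lit:.4f}s")
```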

Correctness verification report:

Test                            Status
⚙️ Existing Unit Tests          🔘 None Found
🌀 Generated Regression Tests   2231 Passed
⏪ Replay Tests                 🔘 None Found
🔎 Concolic Coverage Tests      🔘 None Found
📊 Tests Coverage               100.0%
🌀 Generated Regression Tests and Runtime
import pytest
from wandb.sdk.launch.builder.build import get_base_setup


# Mocks for external dependencies
class DummyLaunchProject:
    """A minimal mock of wandb.sdk.launch._project_spec.LaunchProject."""
    def __init__(self, accelerator_base_image=None):
        self.accelerator_base_image = accelerator_base_image

# Minimal Dockerfile template strings for testing
ACCELERATOR_SETUP_TEMPLATE = (
    "FROM {accelerator_base_image}\n"
    "RUN apt-get update && apt-get install -y {python_packages}\n"
    "ENV PYTHON_VERSION={py_version}\n"
)
PYTHON_SETUP_TEMPLATE = (
    "FROM {py_base_image}\n"
    "RUN apt-get update && apt-get install -y python3-dev gcc\n"
)

# unit tests

# ------------- Basic Test Cases -------------

def test_cpu_base_setup_python_3_10():
    """Test CPU base setup for python 3.10 (minor < 12, uses -buster)."""
    lp = DummyLaunchProject()
    codeflash_output = get_base_setup(lp, "3.10", "3"); result = codeflash_output # 3.19μs -> 3.11μs (2.60% faster)

def test_cpu_base_setup_python_3_12():
    """Test CPU base setup for python 3.12 (minor >= 12, uses -bookworm)."""
    lp = DummyLaunchProject()
    codeflash_output = get_base_setup(lp, "3.12", "3"); result = codeflash_output # 3.06μs -> 3.10μs (1.35% slower)

def test_accelerator_base_setup_python_3_9():
    """Test accelerator base setup for python 3.9 (minor < 12, uses custom image)."""
    lp = DummyLaunchProject(accelerator_base_image="nvidia/cuda:11.3.1-cudnn8-runtime-ubuntu20.04")
    codeflash_output = get_base_setup(lp, "3.9", "3"); result = codeflash_output # 26.8μs -> 4.96μs (441% faster)

def test_accelerator_base_setup_python_3_12():
    """Test accelerator base setup for python 3.12 (minor >= 12, uses custom image)."""
    lp = DummyLaunchProject(accelerator_base_image="my-accelerator-image:latest")
    codeflash_output = get_base_setup(lp, "3.12", "3"); result = codeflash_output # 26.7μs -> 4.98μs (435% faster)

# ------------- Edge Test Cases -------------

def test_py_version_with_single_digit_minor():
    """Test py_version with a single-digit minor version."""
    lp = DummyLaunchProject()
    codeflash_output = get_base_setup(lp, "3.7", "3"); result = codeflash_output # 3.21μs -> 3.05μs (5.35% faster)

def test_py_version_with_double_digit_minor():
    """Test py_version with a double-digit minor version."""
    lp = DummyLaunchProject()
    codeflash_output = get_base_setup(lp, "3.15", "3"); result = codeflash_output # 3.14μs -> 3.13μs (0.192% faster)

def test_py_version_with_leading_zero_minor():
    """Test py_version with a leading zero in the minor version."""
    lp = DummyLaunchProject()
    codeflash_output = get_base_setup(lp, "3.09", "3"); result = codeflash_output # 3.20μs -> 3.17μs (0.882% faster)


def test_py_version_with_invalid_minor():
    """Test py_version with invalid minor version (should raise ValueError)."""
    lp = DummyLaunchProject()
    with pytest.raises(ValueError):
        get_base_setup(lp, "3.x", "3") # 4.67μs -> 4.06μs (15.0% faster)

def test_accelerator_base_image_empty_string():
    """Test accelerator_base_image as empty string (should fallback to CPU)."""
    lp = DummyLaunchProject(accelerator_base_image="")
    codeflash_output = get_base_setup(lp, "3.10", "3"); result = codeflash_output # 3.24μs -> 3.09μs (4.95% faster)

def test_accelerator_base_image_none():
    """Test accelerator_base_image as None (should fallback to CPU)."""
    lp = DummyLaunchProject(accelerator_base_image=None)
    codeflash_output = get_base_setup(lp, "3.10", "3"); result = codeflash_output # 3.17μs -> 3.08μs (2.86% faster)

def test_py_major_unused():
    """Test that py_major argument is accepted but not used in output."""
    lp = DummyLaunchProject()
    codeflash_output = get_base_setup(lp, "3.10", "2"); result = codeflash_output # 3.21μs -> 3.09μs (3.82% faster)


def test_py_version_with_long_patch():
    """Test py_version with patch version (should ignore patch)."""
    lp = DummyLaunchProject()
    codeflash_output = get_base_setup(lp, "3.11.5", "3"); result = codeflash_output # 3.45μs -> 3.17μs (8.87% faster)


def test_py_version_with_extra_components():
    """Test py_version with extra components (should ignore extras)."""
    lp = DummyLaunchProject()
    codeflash_output = get_base_setup(lp, "3.11.5.2", "3"); result = codeflash_output # 3.49μs -> 3.13μs (11.5% faster)

def test_accelerator_base_image_special_chars():
    """Test accelerator_base_image with special characters."""
    lp = DummyLaunchProject(accelerator_base_image="accel@image:!v1")
    codeflash_output = get_base_setup(lp, "3.10", "3"); result = codeflash_output # 27.5μs -> 4.99μs (450% faster)

# ------------- Large Scale Test Cases -------------

def test_many_different_python_versions():
    """Test function with a range of python versions from 3.0 to 3.20."""
    lp = DummyLaunchProject()
    for minor in range(0, 21):  # 3.0 to 3.20
        py_version = f"3.{minor}"
        codeflash_output = get_base_setup(lp, py_version, "3"); result = codeflash_output # 16.5μs -> 15.9μs (3.45% faster)
        if minor < 12:
            pass
        else:
            pass

def test_many_accelerator_images():
    """Test function with many different accelerator images."""
    for i in range(100):
        img = f"custom-accel-image-{i}:latest"
        lp = DummyLaunchProject(accelerator_base_image=img)
        codeflash_output = get_base_setup(lp, "3.10", "3"); result = codeflash_output # 598μs -> 141μs (324% faster)

def test_large_scale_cpu_and_accelerator_mix():
    """Test function with a mix of CPU and accelerator setups for many versions."""
    for minor in range(0, 21):
        py_version = f"3.{minor}"
        # CPU case
        lp_cpu = DummyLaunchProject()
        codeflash_output = get_base_setup(lp_cpu, py_version, "3"); result_cpu = codeflash_output # 18.0μs -> 16.5μs (9.38% faster)
        if minor < 12:
            pass
        else:
            pass
        # Accelerator case
        lp_acc = DummyLaunchProject(accelerator_base_image=f"accel-img-{minor}")
        codeflash_output = get_base_setup(lp_acc, py_version, "3"); result_acc = codeflash_output # 155μs -> 32.1μs (386% faster)

def test_performance_large_number_of_calls():
    """Test performance with a large number of calls (not measuring time, just ensuring no error)."""
    lp = DummyLaunchProject()
    for i in range(1000):
        minor = i % 21
        py_version = f"3.{minor}"
        codeflash_output = get_base_setup(lp, py_version, "3"); result = codeflash_output # 612μs -> 591μs (3.48% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest
from wandb.sdk.launch.builder.build import get_base_setup

# Mocks for the templates used in get_base_setup
ACCELERATOR_SETUP_TEMPLATE = (
    "FROM {accelerator_base_image}\n"
    "RUN apt-get update && apt-get install -y \\\n{python_packages}\n"
    "ENV PYTHON_VERSION={py_version}\n"
)
PYTHON_SETUP_TEMPLATE = (
    "FROM {py_base_image}\n"
    "RUN apt-get update && apt-get install -y python3-dev gcc\n"
)

# Minimal LaunchProject mock class
class LaunchProject:
    def __init__(self, accelerator_base_image=None):
        self.accelerator_base_image = accelerator_base_image

# ----------- UNIT TESTS ------------

# 1. Basic Test Cases

def test_cpu_base_setup_python_lt_3_12():
    # Test CPU base setup for python version < 3.12
    proj = LaunchProject()
    codeflash_output = get_base_setup(proj, "3.10", "3"); result = codeflash_output # 3.25μs -> 3.03μs (7.16% faster)

def test_cpu_base_setup_python_ge_3_12():
    # Test CPU base setup for python version >= 3.12
    proj = LaunchProject()
    codeflash_output = get_base_setup(proj, "3.12", "3"); result = codeflash_output # 2.99μs -> 3.07μs (2.64% slower)

def test_accelerator_base_setup_python_lt_3_12():
    # Test accelerator base setup for python version < 3.12
    proj = LaunchProject(accelerator_base_image="nvidia/cuda:11.3.1-cudnn8-runtime-ubuntu20.04")
    codeflash_output = get_base_setup(proj, "3.9", "3"); result = codeflash_output # 26.8μs -> 4.96μs (441% faster)

def test_accelerator_base_setup_python_ge_3_12():
    # Test accelerator base setup for python version >= 3.12
    proj = LaunchProject(accelerator_base_image="custom/image:latest")
    codeflash_output = get_base_setup(proj, "3.12", "3"); result = codeflash_output # 26.9μs -> 5.02μs (436% faster)

# 2. Edge Test Cases

def test_py_version_with_single_digit_minor():
    # Edge case: py_version with single-digit minor (e.g., 3.7)
    proj = LaunchProject()
    codeflash_output = get_base_setup(proj, "3.7", "3"); result = codeflash_output # 3.07μs -> 3.06μs (0.491% faster)

def test_py_version_with_leading_zero_minor():
    # Edge case: py_version with leading zero in minor (e.g., 3.07)
    proj = LaunchProject()
    codeflash_output = get_base_setup(proj, "3.07", "3"); result = codeflash_output # 3.14μs -> 3.11μs (0.932% faster)

def test_py_version_with_large_minor():
    # Edge case: py_version with minor > 12 (e.g., 3.15)
    proj = LaunchProject()
    codeflash_output = get_base_setup(proj, "3.15", "3"); result = codeflash_output # 3.14μs -> 3.11μs (0.836% faster)

def test_py_version_with_non_integer_minor():
    # Edge case: py_version with non-integer minor (should raise ValueError)
    proj = LaunchProject()
    with pytest.raises(ValueError):
        get_base_setup(proj, "3.x", "3") # 4.55μs -> 3.91μs (16.4% faster)

def test_py_version_with_missing_minor():
    # Edge case: py_version with missing minor (should raise IndexError)
    proj = LaunchProject()
    with pytest.raises(IndexError):
        get_base_setup(proj, "3", "3") # 1.09μs -> 1.15μs (4.45% slower)

def test_accelerator_base_image_empty_string():
    # Edge case: accelerator_base_image is empty string (should treat as False)
    proj = LaunchProject(accelerator_base_image="")
    codeflash_output = get_base_setup(proj, "3.10", "3"); result = codeflash_output # 3.19μs -> 3.14μs (1.50% faster)

def test_accelerator_base_image_none():
    # Edge case: accelerator_base_image is None (should use CPU setup)
    proj = LaunchProject(accelerator_base_image=None)
    codeflash_output = get_base_setup(proj, "3.10", "3"); result = codeflash_output # 3.17μs -> 3.15μs (0.571% faster)

def test_py_version_with_extra_patch():
    # Edge case: py_version with patch version (e.g., 3.10.2)
    proj = LaunchProject()
    codeflash_output = get_base_setup(proj, "3.10.2", "3"); result = codeflash_output # 3.25μs -> 3.22μs (0.807% faster)


def test_py_major_unused():
    # Edge case: py_major is not used in function, but test with different values
    proj = LaunchProject()
    codeflash_output = get_base_setup(proj, "3.10", "2"); result = codeflash_output # 3.23μs -> 3.08μs (5.11% faster)

# 3. Large Scale Test Cases

def test_many_python_versions_cpu():
    # Large scale: test with many python versions for CPU setup
    proj = LaunchProject()
    for minor in range(0, 20):  # 3.0 to 3.19
        py_version = f"3.{minor}"
        codeflash_output = get_base_setup(proj, py_version, "3"); result = codeflash_output # 15.7μs -> 15.2μs (3.74% faster)
        if minor < 12:
            pass
        else:
            pass

def test_many_python_versions_accelerator():
    # Large scale: test with many python versions for accelerator setup
    proj = LaunchProject(accelerator_base_image="accel/base:latest")
    for minor in range(0, 20):  # 3.0 to 3.19
        py_version = f"3.{minor}"
        codeflash_output = get_base_setup(proj, py_version, "3"); result = codeflash_output # 147μs -> 32.8μs (350% faster)

def test_large_accelerator_base_image_string():
    # Large scale: test with a very long accelerator_base_image string
    long_image = "repo/" + "x" * 950 + ":tag"
    proj = LaunchProject(accelerator_base_image=long_image)
    codeflash_output = get_base_setup(proj, "3.12", "3"); result = codeflash_output # 27.5μs -> 5.46μs (405% faster)

def test_large_number_of_launch_projects():
    # Large scale: test with many LaunchProject objects
    projects = [LaunchProject(accelerator_base_image=None if i % 2 == 0 else "accel/base:latest") for i in range(1000)]
    for i, proj in enumerate(projects):
        py_version = f"3.{i % 20}"
        codeflash_output = get_base_setup(proj, py_version, "3"); result = codeflash_output # 3.29ms -> 1.00ms (229% faster)
        if proj.accelerator_base_image:
            pass
        else:
            if int(py_version.split(".")[1]) < 12:
                pass
            else:
                pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-get_base_setup-mhdong86 and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 30, 2025 17:13
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 30, 2025