@codeflash-ai codeflash-ai bot commented Oct 31, 2025

📄 67% (0.67x) speedup for has_none_primitives_in_list in nvflare/tool/job/config/config_indexer.py

⏱️ Runtime : 892 microseconds → 533 microseconds (best of 257 runs)
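
The headline percentage is consistent with the two reported runtimes if speedup is computed as (old - new) / new. That formula is an assumption about how the figure is derived, not something the report states; per-test percentages can differ slightly from this calculation because the displayed runtimes are rounded.

```python
# Runtimes copied from the report above, in microseconds.
old_us = 892
new_us = 533

# Assumed definition: time saved relative to the optimized runtime.
speedup = (old_us - new_us) / new_us
print(f"{speedup:.2f}x, {speedup:.0%}")  # 0.67x, 67%
```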

📝 Explanation and details

The optimization achieves a 67% speedup through two key changes:

1. Efficient Type Checking in is_primitive():

  • Before: Chained several isinstance() calls with or (isinstance(value, int) or isinstance(value, float) or ...)
  • After: Makes a single isinstance(value, _PRIMITIVE_TYPES) call against a predefined tuple of types
  • Why faster: isinstance() accepts a tuple of types and walks it inside its C implementation, so one call replaces several Python-level function calls and boolean evaluations
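
A minimal sketch of the two styles follows. It is a reconstruction for illustration, not the actual nvflare source: the function names and the contents of `_PRIMITIVE_TYPES` are assumptions, chosen so that int, float, str, bool and None count as primitive, as the generated tests expect.

```python
# Hypothetical reconstruction; the real nvflare code may differ.
_PRIMITIVE_TYPES = (int, float, str, bool)  # assumed contents of the tuple


def is_primitive_chained(value):
    # "Before" style: one isinstance() call per type, joined with `or`.
    return (
        isinstance(value, int)
        or isinstance(value, float)
        or isinstance(value, str)
        or isinstance(value, bool)
        or value is None
    )


def is_primitive_tuple(value):
    # "After" style: identity check for None, then a single isinstance() call.
    return value is None or isinstance(value, _PRIMITIVE_TYPES)


# Both styles classify these sample values identically.
samples = [1, 2.5, "a", True, None, [], {}, set(), (), b"bytes"]
agree = all(is_primitive_chained(v) == is_primitive_tuple(v) for v in samples)
print(agree)  # True
```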

2. Early Termination Loop in has_none_primitives_in_list():

  • Before: Used any(not is_primitive(x) for x in values), which builds a generator object and resumes it once per element
  • After: Uses an explicit for loop that returns True at the first non-primitive found
  • Why faster: Both versions stop at the first non-primitive, because any() short-circuits; the explicit loop saves the cost of creating and resuming the generator and of the any() call itself
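
The two loop styles can be sketched as below. The helper `is_primitive` and both function names are assumptions made for this illustration; the helper also records every value it inspects, so the number of elements each version examines can be compared directly.

```python
calls = []  # every value passed to is_primitive() is recorded here


def is_primitive(value):
    # Assumed definition, matching the behaviour the generated tests exercise.
    calls.append(value)
    return value is None or isinstance(value, (int, float, str, bool))


def has_non_primitives_any(values):
    # "Before" style: generator expression consumed by any().
    return any(not is_primitive(x) for x in values)


def has_non_primitives_loop(values):
    # "After" style: explicit loop with an immediate return.
    for x in values:
        if not is_primitive(x):
            return True
    return False


data = [{}] + list(range(999))  # non-primitive at index 0

calls.clear()
result_any = has_non_primitives_any(data)
inspected_any = len(calls)

calls.clear()
result_loop = has_non_primitives_loop(data)
inspected_loop = len(calls)

print(result_any, result_loop, inspected_any, inspected_loop)  # True True 1 1
```

In this sketch both versions inspect a single element, so any runtime difference on such input comes from per-call overhead rather than from the number of elements visited.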

Performance Benefits by Test Case:

  • Early non-primitive detection: When a non-primitive appears at or near the start of a 1,000-element list, the optimized version is about 80-95% faster; only a few elements are inspected, so the gain comes from lower fixed overhead per call
  • All primitives: When a large list of integers must be scanned in full or nearly in full, the cheaper type check still gives roughly a 20-30% speedup
  • None-heavy lists: Lists with many None values see the largest improvements (up to 329% faster), which the report attributes to the optimized None check (value is None)

The relative gain is largest when the function returns after inspecting only a few elements, and every generated test, including the primitive-only ones, ran faster with the optimized code.
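
One way to check these trends locally is a small `timeit` comparison. The two functions below are assumed shapes of the original and optimized code, written for this sketch rather than copied from nvflare, and absolute timings will vary by machine and Python version.

```python
import timeit

_PRIMITIVE_TYPES = (int, float, str, bool)  # assumed contents


def has_non_primitives_before(values):
    # Assumed "before" shape: chained isinstance() calls inside any().
    return any(
        not (
            isinstance(x, int)
            or isinstance(x, float)
            or isinstance(x, str)
            or isinstance(x, bool)
            or x is None
        )
        for x in values
    )


def has_non_primitives_after(values):
    # Assumed "after" shape: explicit loop with a tuple-based isinstance().
    for x in values:
        if x is not None and not isinstance(x, _PRIMITIVE_TYPES):
            return True
    return False


cases = {
    "non-primitive first": [{}] + list(range(999)),
    "all ints": list(range(1000)),
    "all None": [None] * 1000,
}

results = {}
for name, data in cases.items():
    # Both versions must agree before their timings are compared.
    assert has_non_primitives_before(data) == has_non_primitives_after(data)
    timings = []
    for fn in (has_non_primitives_before, has_non_primitives_after):
        # Best of 3 repeats of 200 calls each, converted to microseconds per call.
        best = min(timeit.repeat(lambda: fn(data), number=200, repeat=3))
        timings.append(best / 200 * 1e6)
    results[name] = tuple(timings)
    print(f"{name}: {timings[0]:.2f} us -> {timings[1]:.2f} us")
```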

Correctness verification report:

Test | Status
⚙️ Existing Unit Tests | 🔘 None Found
🌀 Generated Regression Tests | 82 Passed
⏪ Replay Tests | 🔘 None Found
🔎 Concolic Coverage Tests | 🔘 None Found
📊 Tests Coverage | 100.0%

🌀 Generated Regression Tests and Runtime
from typing import List

# imports
import pytest  # used for our unit tests
from nvflare.tool.job.config.config_indexer import has_none_primitives_in_list

# unit tests

# 1. Basic Test Cases

def test_all_primitives_int():
    # List of ints only
    codeflash_output = not has_none_primitives_in_list([1, 2, 3, 4]) # 1.33μs -> 865ns (54.1% faster)

def test_all_primitives_float():
    # List of floats only
    codeflash_output = not has_none_primitives_in_list([1.0, 2.5, 3.14]) # 1.34μs -> 858ns (56.4% faster)

def test_all_primitives_str():
    # List of strings only
    codeflash_output = not has_none_primitives_in_list(["a", "b", "c"]) # 1.38μs -> 884ns (55.8% faster)

def test_all_primitives_bool():
    # List of bools only
    codeflash_output = not has_none_primitives_in_list([True, False, True]) # 1.27μs -> 809ns (56.9% faster)

def test_all_primitives_none():
    # List of None only
    codeflash_output = not has_none_primitives_in_list([None, None]) # 1.48μs -> 671ns (120% faster)

def test_mixed_primitives():
    # List of mixed primitive types
    codeflash_output = not has_none_primitives_in_list([1, 2.0, "x", True, None]) # 1.79μs -> 1.11μs (61.7% faster)

def test_one_non_primitive_dict():
    # List with one non-primitive (dict)
    codeflash_output = has_none_primitives_in_list([1, 2, {}, 3]) # 1.78μs -> 963ns (85.3% faster)

def test_one_non_primitive_list():
    # List with one non-primitive (list)
    codeflash_output = has_none_primitives_in_list([1, [], 2]) # 1.54μs -> 838ns (83.3% faster)

def test_one_non_primitive_set():
    # List with one non-primitive (set)
    codeflash_output = has_none_primitives_in_list([1, set(), 2]) # 1.47μs -> 808ns (82.4% faster)

def test_one_non_primitive_tuple():
    # List with one non-primitive (tuple)
    codeflash_output = has_none_primitives_in_list([1, (), 2]) # 1.67μs -> 959ns (73.6% faster)

def test_multiple_non_primitives():
    # List with several non-primitives
    codeflash_output = has_none_primitives_in_list([{}, [], set(), ()]) # 1.31μs -> 719ns (82.8% faster)

def test_mixed_primitives_and_non_primitives():
    # List with both primitives and non-primitives
    codeflash_output = has_none_primitives_in_list([1, "a", [], 2.0]) # 1.64μs -> 975ns (68.6% faster)

# 2. Edge Test Cases

def test_empty_list():
    # Empty list should return False (no non-primitives)
    codeflash_output = not has_none_primitives_in_list([]) # 802ns -> 379ns (112% faster)

def test_nested_list():
    # List containing another list (non-primitive)
    codeflash_output = has_none_primitives_in_list([1, [2, 3], 4]) # 1.54μs -> 843ns (83.0% faster)

def test_nested_dict():
    # List containing a dict (non-primitive)
    codeflash_output = has_none_primitives_in_list([1, {"a": 1}, 2]) # 1.44μs -> 870ns (65.9% faster)

def test_object_instance():
    # List containing an object instance (non-primitive)
    class Dummy: pass
    codeflash_output = has_none_primitives_in_list([1, Dummy()]) # 1.72μs -> 1.03μs (67.5% faster)

def test_function_in_list():
    # List containing a function (non-primitive)
    def f(): pass
    codeflash_output = has_none_primitives_in_list([1, f]) # 1.49μs -> 852ns (74.6% faster)

def test_bytes_in_list():
    # List containing bytes (not primitive by our definition)
    codeflash_output = has_none_primitives_in_list([1, b"bytes"]) # 1.46μs -> 832ns (75.2% faster)

def test_bytearray_in_list():
    # List containing bytearray (not primitive by our definition)
    codeflash_output = has_none_primitives_in_list([1, bytearray(b"bytes")]) # 1.50μs -> 832ns (79.7% faster)

def test_none_and_non_primitive():
    # List containing None and a non-primitive
    codeflash_output = has_none_primitives_in_list([None, {}]) # 1.59μs -> 819ns (93.7% faster)

def test_bool_and_non_primitive():
    # List containing bool and a non-primitive
    codeflash_output = has_none_primitives_in_list([True, []]) # 1.53μs -> 914ns (67.1% faster)

def test_str_and_non_primitive():
    # List containing str and a non-primitive
    codeflash_output = has_none_primitives_in_list(["hello", set()]) # 1.60μs -> 892ns (79.7% faster)

def test_int_and_non_primitive():
    # List containing int and a non-primitive
    codeflash_output = has_none_primitives_in_list([42, (1, 2)]) # 1.73μs -> 927ns (86.1% faster)

def test_float_and_non_primitive():
    # List containing float and a non-primitive
    codeflash_output = has_none_primitives_in_list([3.14, {"pi": 3.14}]) # 1.54μs -> 867ns (78.0% faster)

def test_only_non_primitives():
    # List with only non-primitives
    codeflash_output = has_none_primitives_in_list([[], {}, set(), ()]) # 1.28μs -> 687ns (86.9% faster)

def test_only_primitives():
    # List with only primitives
    codeflash_output = not has_none_primitives_in_list([1, "a", 2.0, True, None]) # 1.88μs -> 1.16μs (62.0% faster)

def test_empty_dict_in_list():
    # List with an empty dict (non-primitive)
    codeflash_output = has_none_primitives_in_list([1, {}]) # 1.46μs -> 872ns (66.9% faster)

def test_empty_list_in_list():
    # List with an empty list (non-primitive)
    codeflash_output = has_none_primitives_in_list([1, []]) # 1.43μs -> 833ns (71.2% faster)

def test_empty_set_in_list():
    # List with an empty set (non-primitive)
    codeflash_output = has_none_primitives_in_list([1, set()]) # 1.44μs -> 830ns (73.5% faster)

def test_empty_tuple_in_list():
    # List with an empty tuple (non-primitive)
    codeflash_output = has_none_primitives_in_list([1, ()]) # 1.62μs -> 920ns (76.5% faster)

def test_single_element_list_primitive():
    # Single primitive element
    codeflash_output = not has_none_primitives_in_list([1]) # 1.12μs -> 645ns (73.0% faster)

def test_single_element_list_non_primitive():
    # Single non-primitive element
    codeflash_output = has_none_primitives_in_list([{}]) # 1.29μs -> 721ns (78.5% faster)

def test_large_int_value():
    # Very large int value (still primitive)
    codeflash_output = not has_none_primitives_in_list([10**100]) # 1.15μs -> 628ns (83.8% faster)

def test_large_float_value():
    # Very large float value (still primitive)
    codeflash_output = not has_none_primitives_in_list([1e100]) # 1.18μs -> 652ns (80.5% faster)

def test_empty_string():
    # Empty string is primitive
    codeflash_output = not has_none_primitives_in_list([""]) # 1.20μs -> 658ns (81.9% faster)

def test_falsey_values():
    # Falsey values that are primitives
    codeflash_output = not has_none_primitives_in_list([0, 0.0, "", False, None]) # 1.81μs -> 1.20μs (51.1% faster)

def test_falsey_non_primitive():
    # Falsey non-primitive (empty list, dict, set, tuple)
    codeflash_output = has_none_primitives_in_list([[], {}, set(), ()]) # 1.43μs -> 739ns (94.0% faster)

# 3. Large Scale Test Cases

def test_large_list_all_primitives():
    # Large list of primitives (should return False)
    large_list = [i for i in range(1000)]
    codeflash_output = not has_none_primitives_in_list(large_list) # 57.3μs -> 47.1μs (21.7% faster)

def test_large_list_one_non_primitive_at_start():
    # Large list, non-primitive at start
    large_list = [{}] + [i for i in range(999)]
    codeflash_output = has_none_primitives_in_list(large_list) # 1.47μs -> 758ns (93.7% faster)

def test_large_list_one_non_primitive_at_end():
    # Large list, non-primitive at end
    large_list = [i for i in range(999)] + [{}]
    codeflash_output = has_none_primitives_in_list(large_list) # 56.5μs -> 47.5μs (18.8% faster)

def test_large_list_one_non_primitive_in_middle():
    # Large list, non-primitive in middle
    large_list = [i for i in range(499)] + [[]] + [i for i in range(500, 999)]
    codeflash_output = has_none_primitives_in_list(large_list) # 29.1μs -> 23.9μs (21.5% faster)

def test_large_list_all_non_primitives():
    # Large list, all non-primitives
    large_list = [{} for _ in range(1000)]
    codeflash_output = has_none_primitives_in_list(large_list) # 1.49μs -> 773ns (92.6% faster)

def test_large_list_mixed_primitives_and_non_primitives():
    # Large list, mix of primitives and non-primitives
    large_list = [i if i % 2 == 0 else {} for i in range(1000)]
    codeflash_output = has_none_primitives_in_list(large_list) # 1.69μs -> 878ns (92.1% faster)

def test_large_list_all_primitives_with_none():
    # Large list of primitives, including None
    large_list = [None if i % 2 == 0 else i for i in range(1000)]
    codeflash_output = not has_none_primitives_in_list(large_list) # 99.0μs -> 40.1μs (147% faster)

def test_large_list_all_primitives_with_falsey():
    # Large list of primitives, including falsey values
    large_list = [0, 0.0, "", False, None] * 200
    codeflash_output = not has_none_primitives_in_list(large_list) # 90.4μs -> 51.3μs (76.4% faster)

def test_large_list_with_various_non_primitives():
    # Large list with various non-primitives appended at the end
    large_list = [i for i in range(995)] + [[], {}, set(), (), lambda x: x]
    codeflash_output = has_none_primitives_in_list(large_list) # 57.9μs -> 47.4μs (22.1% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import List

# imports
import pytest  # used for our unit tests
from nvflare.tool.job.config.config_indexer import has_none_primitives_in_list

# unit tests

# 1. Basic Test Cases

def test_all_ints():
    # All ints are primitives, should return False
    codeflash_output = not has_none_primitives_in_list([1, 2, 3]) # 1.42μs -> 878ns (62.0% faster)

def test_all_floats():
    # All floats are primitives, should return False
    codeflash_output = not has_none_primitives_in_list([1.1, 2.2, 3.3]) # 1.47μs -> 868ns (68.8% faster)

def test_all_strings():
    # All strings are primitives, should return False
    codeflash_output = not has_none_primitives_in_list(["a", "b", "c"]) # 1.40μs -> 908ns (54.4% faster)

def test_all_bools():
    # All bools are primitives, should return False
    codeflash_output = not has_none_primitives_in_list([True, False, True]) # 1.34μs -> 860ns (56.0% faster)

def test_all_none():
    # None is considered primitive, should return False
    codeflash_output = not has_none_primitives_in_list([None, None]) # 1.56μs -> 719ns (118% faster)

def test_mixed_primitives():
    # Mix of all primitives, should return False
    codeflash_output = not has_none_primitives_in_list([1, 2.0, "a", True, None]) # 1.84μs -> 1.11μs (66.4% faster)

def test_one_non_primitive():
    # List contains a dict (not primitive), should return True
    codeflash_output = has_none_primitives_in_list([1, 2, {}, 3]) # 1.74μs -> 936ns (86.1% faster)

def test_multiple_non_primitives():
    # List contains several non-primitives (list, tuple, dict), should return True
    codeflash_output = has_none_primitives_in_list([1, [], (1,2), {}, "x"]) # 1.55μs -> 890ns (74.0% faster)

def test_non_primitive_first():
    # Non-primitive is first, should return True
    codeflash_output = has_none_primitives_in_list([[], 1, 2, 3]) # 1.32μs -> 705ns (86.7% faster)

def test_non_primitive_last():
    # Non-primitive is last, should return True
    codeflash_output = has_none_primitives_in_list([1, 2, 3, set()]) # 1.66μs -> 984ns (68.5% faster)

def test_nested_list():
    # List contains a nested list, which is not primitive, should return True
    codeflash_output = has_none_primitives_in_list([1, [2, 3], 4]) # 1.51μs -> 806ns (87.3% faster)

def test_empty_list():
    # Empty list, should return False (no non-primitives present)
    codeflash_output = not has_none_primitives_in_list([]) # 815ns -> 365ns (123% faster)

# 2. Edge Test Cases

def test_single_primitive():
    # Single primitive element, should return False
    codeflash_output = not has_none_primitives_in_list([42]) # 1.16μs -> 655ns (77.7% faster)

def test_single_non_primitive():
    # Single non-primitive element, should return True
    codeflash_output = has_none_primitives_in_list([[]]) # 1.33μs -> 741ns (79.6% faster)

def test_very_large_int():
    # Very large integer, still primitive, should return False
    codeflash_output = not has_none_primitives_in_list([10**100]) # 1.13μs -> 659ns (71.8% faster)

def test_zero_and_false_and_empty_string():
    # 0, False, and "" are all primitives, should return False
    codeflash_output = not has_none_primitives_in_list([0, False, ""]) # 1.48μs -> 988ns (50.1% faster)

def test_object_instance():
    # User-defined object is not primitive, should return True
    class Foo: pass
    codeflash_output = has_none_primitives_in_list([Foo()]) # 1.60μs -> 909ns (76.1% faster)

def test_function_object():
    # Function object is not primitive, should return True
    def bar(): pass
    codeflash_output = has_none_primitives_in_list([bar]) # 1.39μs -> 719ns (92.9% faster)

def test_bytes_and_bytearray():
    # bytes and bytearray are not considered primitives, should return True
    codeflash_output = has_none_primitives_in_list([b'abc', bytearray(b'def')]) # 1.25μs -> 705ns (77.4% faster)

def test_tuple_of_primitives():
    # tuple is not primitive, even if it contains only primitives, should return True
    codeflash_output = has_none_primitives_in_list([(1, 2, 3)]) # 1.44μs -> 835ns (72.1% faster)

def test_set_of_primitives():
    # set is not primitive, should return True
    codeflash_output = has_none_primitives_in_list([{1, 2, 3}]) # 1.29μs -> 706ns (83.1% faster)

def test_dict_of_primitives():
    # dict is not primitive, should return True
    codeflash_output = has_none_primitives_in_list([{"a": 1}]) # 1.27μs -> 723ns (75.9% faster)

def test_nested_non_primitive_in_list():
    # The inner list is itself a top-level element and is not primitive,
    # so the function returns True; the first call stores the negated result (False)
    codeflash_output = not has_none_primitives_in_list([[1, 2, 3]]) # 1.28μs -> 708ns (81.2% faster)
    codeflash_output = has_none_primitives_in_list([[1, 2, 3]]) # 700ns -> 340ns (106% faster)

def test_bool_vs_int():
    # bool is subclass of int, both are primitives, should return False
    codeflash_output = not has_none_primitives_in_list([True, False, 0, 1]) # 1.59μs -> 942ns (68.5% faster)

def test_none_and_non_primitive():
    # None and a non-primitive, should return True
    codeflash_output = has_none_primitives_in_list([None, object()]) # 1.75μs -> 932ns (87.4% faster)

def test_nan_and_inf():
    # float('nan') and float('inf') are floats (primitive), should return False
    codeflash_output = not has_none_primitives_in_list([float('nan'), float('inf')]) # 1.35μs -> 758ns (77.7% faster)



def test_custom_class_with_str_repr():
    # Custom class with __str__ is not primitive, should return True
    class Bar:
        def __str__(self): return "Bar"
    codeflash_output = has_none_primitives_in_list([Bar()]) # 1.59μs -> 871ns (82.4% faster)

# 3. Large Scale Test Cases

def test_large_all_primitives():
    # Large list of primitives, should return False
    big_list = [i for i in range(1000)]
    codeflash_output = not has_none_primitives_in_list(big_list) # 58.2μs -> 47.3μs (23.1% faster)

def test_large_one_non_primitive_at_start():
    # Large list with non-primitive at start, should return True
    big_list = [{}] + [i for i in range(999)]
    codeflash_output = has_none_primitives_in_list(big_list) # 1.36μs -> 757ns (79.9% faster)

def test_large_one_non_primitive_at_end():
    # Large list with non-primitive at end, should return True
    big_list = [i for i in range(999)] + [set()]
    codeflash_output = has_none_primitives_in_list(big_list) # 56.3μs -> 47.4μs (18.7% faster)

def test_large_one_non_primitive_in_middle():
    # Large list with non-primitive in the middle, should return True
    big_list = [i for i in range(500)] + [[1,2]] + [i for i in range(501,1000)]
    codeflash_output = has_none_primitives_in_list(big_list) # 30.2μs -> 23.4μs (29.0% faster)

def test_large_all_non_primitives():
    # Large list of non-primitives, should return True
    big_list = [[i] for i in range(1000)]
    codeflash_output = has_none_primitives_in_list(big_list) # 1.49μs -> 777ns (92.3% faster)

def test_large_mixed_primitives_and_non_primitives():
    # Large list, every 100th element is a non-primitive
    big_list = []
    for i in range(1000):
        if i % 100 == 0:
            big_list.append({})
        else:
            big_list.append(i)
    codeflash_output = has_none_primitives_in_list(big_list) # 1.36μs -> 756ns (79.9% faster)

def test_large_all_none():
    # Large list of None, should return False
    big_list = [None for _ in range(1000)]
    codeflash_output = not has_none_primitives_in_list(big_list) # 139μs -> 32.6μs (329% faster)

def test_large_mixed_types():
    # Large list with a repeating mix of all primitive types, should return False
    big_list = []
    for i in range(250):
        big_list.extend([i, float(i), str(i), bool(i % 2), None])
    codeflash_output = not has_none_primitives_in_list(big_list) # 111μs -> 65.7μs (70.4% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-has_none_primitives_in_list-mhe3ktm5 and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 31, 2025 00:10
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 31, 2025