⚡️ Speed up function check_disk_space by 16%
#2
📄 16% (0.16x) speedup for `check_disk_space` in `ultralytics/utils/downloads.py`
⏱️ Runtime: 3.46 milliseconds → 2.99 milliseconds (best of 129 runs)
📝 Explanation and details
The optimized code achieves a 16% performance improvement through several micro-optimizations that reduce Python overhead and attribute lookups:
Key optimizations:

- Eliminated generator/tuple unpacking overhead: Instead of using `(x / gib for x in shutil.disk_usage(path))`, which creates a generator and then unpacks it, the code now directly indexes the tuple returned by `shutil.disk_usage()`. This saves the generator creation cost and tuple unpacking overhead.
- Reduced attribute lookups: The `Content-Length` header is retrieved once into a local variable (`data_header`) before processing, rather than calling `r.headers.get()` twice in the original code's arithmetic expression.
- Pre-calculated required space: The value `data * sf` is computed once and stored in `required_space`, eliminating repeated multiplication operations in both the comparison and error message formatting.
- Optimized None handling: Added an explicit `None` check for the `Content-Length` header to avoid an unnecessary `int(0)` conversion when the header is missing.

Performance impact: The line profiler shows the most significant gains in the disk usage processing (lines that went from 5.49ms to 2.71ms combined) and header processing sections. The optimizations are particularly effective for scenarios with:
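
The key optimizations listed above can be illustrated with a minimal sketch. This is not the code from the PR: the helper names (`scale_with_generator`, `scale_with_indexing`, `required_gib`, `has_enough_space`) are hypothetical, and the header lookup is modeled with a plain mapping instead of a real HTTP response so the example runs offline. Only `data_header`, `required_space`, `sf`, and the `gib` scaling come from the description.

```python
import shutil

GIB = 1 << 30  # bytes per GiB


def scale_with_generator(usage):
    """Pattern described as the original: build a generator, then unpack it."""
    total, used, free = (x / GIB for x in usage)
    return total, used, free


def scale_with_indexing(usage):
    """Pattern described as the optimization: index the tuple directly."""
    return usage[0] / GIB, usage[1] / GIB, usage[2] / GIB


def required_gib(headers, sf=1.5):
    """Read Content-Length once, skip int() when it is missing, scale once."""
    data_header = headers.get("Content-Length")  # single lookup
    data = int(data_header) / GIB if data_header is not None else 0.0
    required_space = data * sf  # computed once, reusable in any message
    return required_space


def has_enough_space(headers, path=".", sf=1.5):
    """Hypothetical wrapper: compare required space against free space in GiB."""
    usage = shutil.disk_usage(path)  # named tuple (total, used, free)
    free = usage[2] / GIB
    return required_gib(headers, sf) < free
```

Both scaling helpers return the same values for the same `shutil.disk_usage()` result; the indexed version simply avoids creating and consuming a generator. Whether that difference is measurable on a given machine would need to be checked with a profiler.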
Hot path relevance: Since `check_disk_space` is called from `safe_download()` before every file download in the Ultralytics framework, these micro-optimizations will accumulate meaningful time savings across multiple downloads, model loading, and dataset preparation workflows where the function may be invoked hundreds of times.

✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-check_disk_space-mi5v6ldg` and push.