feat: ✨ Add rectangular inference (`rect`) support and speed optimizations for pre-processors and post-processors
#59
Conversation
Signed-off-by: Onuralp SEZER <onuralp@ultralytics.com>
…ments

- Added `#[allow(clippy::struct_excessive_bools)]` to `InferenceConfig` to suppress excessive-bool warnings.
- Removed unnecessary logging-initialization code in `init_logging`.
- Suppressed unnecessary-wraps lint in the `main` function.
- Enhanced `YOLOModel` with additional Clippy lints for better code quality.
- Optimized image processing in `YOLOModel` by reducing unnecessary allocations and improving data handling.
- Refactored post-processing to use zero-copy techniques and SIMD for faster detection extraction.
- Introduced a new zero-copy preprocessing function to minimize memory allocations during image processing.
- Improved letterbox resizing and bilinear interpolation with SIMD optimizations and LRU caching for X-coordinate lookups.
- Cleaned up deprecated code and comments for better readability and maintainability.

Signed-off-by: Onuralp SEZER <onuralp@ultralytics.com>
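The "LRU caching for X-coordinate lookups" mentioned in this commit can be sketched in plain Rust. This is an illustrative stand-in, not the PR's actual code: it shows only the precomputed X-coordinate table idea for bilinear resizing (one horizontal pass over a grayscale row), while the real implementation additionally uses the `wide` crate for SIMD and the `lru` crate to reuse tables across frames of the same size.

```rust
// Sketch: precomputed X-coordinate LUT for bilinear resizing.
// For each destination column we compute the two source columns and an
// 8-bit fixed-point blend weight ONCE, then reuse the table for every row,
// avoiding per-pixel float math in the hot loop.

#[derive(Clone, Copy)]
struct XLut {
    x0: usize, // left source column
    x1: usize, // right source column
    w1: u32,   // weight of x1 in 8-bit fixed point (0..=255)
}

fn build_x_lut(src_w: usize, dst_w: usize) -> Vec<XLut> {
    let scale = src_w as f32 / dst_w as f32;
    (0..dst_w)
        .map(|dx| {
            // Pixel-center mapping, clamped to the valid source range.
            let sx = ((dx as f32 + 0.5) * scale - 0.5).max(0.0);
            let x0 = (sx as usize).min(src_w - 1);
            let x1 = (x0 + 1).min(src_w - 1);
            let w1 = ((sx - x0 as f32) * 256.0) as u32;
            XLut { x0, x1, w1 }
        })
        .collect()
}

/// Resize one grayscale row using the precomputed LUT (horizontal pass only).
fn resize_row(src: &[u8], lut: &[XLut]) -> Vec<u8> {
    lut.iter()
        .map(|e| {
            let a = src[e.x0] as u32;
            let b = src[e.x1] as u32;
            ((a * (256 - e.w1) + b * e.w1) >> 8) as u8
        })
        .collect()
}

fn main() {
    let src = [0u8, 100, 200, 255];
    let lut = build_x_lut(src.len(), 8); // built once, reused per row
    let row = resize_row(&src, &lut);
    assert_eq!(row.len(), 8);
    // Upscaling a monotone ramp stays monotone.
    assert!(row.windows(2).all(|w| w[0] <= w[1]));
    println!("{:?}", row);
}
```

The payoff is that the per-row inner loop becomes integer-only table lookups, which is also what makes it amenable to SIMD vectorization.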
…tests Signed-off-by: Onuralp SEZER <onuralp@ultralytics.com>
…ze, adjust IoU threshold, and improve download image path handling Signed-off-by: Onuralp SEZER <onuralp@ultralytics.com>
…rgo.toml Signed-off-by: Onuralp SEZER <onuralp@ultralytics.com>
…onfig example Signed-off-by: Onuralp SEZER <onuralp@ultralytics.com>
Merged! Huge thanks @onuralpszr for pushing Inference 0.0.8 forward with a seriously thoughtful set of performance + UX upgrades, and to @picsalex for the valuable contributions and collaboration.
This PR embodies that idea: fewer copies, fewer allocations, smarter defaults, and smoother Linux deployment, resulting in faster end-to-end CPU inference, better throughput with rect + batch, and a CLI that feels more consistent with Ultralytics. Appreciate the craftsmanship and attention to real-world usability; this is a big win for everyone building with Ultralytics Inference.
This pull request introduces several significant improvements and new features to the Ultralytics inference Rust library, focusing on enhanced performance, usability, and expanded functionality. Major highlights include support for rectangular and batch inference, improved hardware acceleration options, expanded CLI arguments, and optimizations for preprocessing and post-processing. The documentation and example outputs have also been updated to reflect these changes.
New Features and CLI Enhancements:
- Added rectangular inference (`--rect`) and batch inference (`--batch`) support, both enabled/configurable via CLI and passed through the inference pipeline. [1] [2] [3]

Performance and Preprocessing Optimizations:
- Adopted the `wide` crate for SIMD and introduced an LRU cache for preprocessing LUTs for faster image handling. [1] [2]
- Updated `.cargo/config.toml` to simplify shared library loading.

Batch Processing and Pipeline Improvements:
Documentation and Example Updates:
- Updated `README.md` with new CLI options, example commands, output samples, and a detailed breakdown of the codebase structure and dependencies. [1] [2] [3] [4] [5]

Codebase and Dependency Updates:
- Bumped the crate version to `0.0.8` and added new dependencies (`wide`, `lru`) for preprocessing and caching. [1] [2]

These changes collectively make the library faster, more flexible, and easier to use for a wider range of inference scenarios.
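As a concrete illustration of what rectangular inference buys, here is a minimal sketch of the stride-aligned shape computation it relies on: instead of padding every image to a square, the longer side is scaled to the target size and the shorter side is rounded up only to the nearest multiple of the model stride. The function name `rect_shape` and its exact rounding rules are illustrative assumptions, not the crate's API.

```rust
/// Sketch: compute a stride-aligned "rect" input shape.
/// Only valid for ONNX models with dynamic input shapes; fixed-shape
/// models must keep the full square letterbox.
fn rect_shape(img_w: u32, img_h: u32, imgsz: u32, stride: u32) -> (u32, u32) {
    // Scale so the longer side becomes `imgsz`.
    let r = imgsz as f32 / img_w.max(img_h) as f32;
    let w = (img_w as f32 * r).round() as u32;
    let h = (img_h as f32 * r).round() as u32;
    // Round each side UP to a multiple of the stride so the backbone's
    // downsampling stages divide evenly.
    let align = |v: u32| ((v + stride - 1) / stride) * stride;
    (align(w), align(h))
}

fn main() {
    // A 1280x720 frame at imgsz=640, stride=32 becomes 640x384 instead of
    // a fully padded 640x640: ~40% fewer pixels to preprocess and infer.
    assert_eq!(rect_shape(1280, 720, 640, 32), (640, 384));
    // Square inputs are unchanged.
    assert_eq!(rect_shape(640, 640, 640, 32), (640, 640));
    println!("{:?}", rect_shape(1280, 720, 640, 32));
}
```

This is also why the PR enables `--rect` by default only when the model supports dynamic shapes: the throughput win comes entirely from shrinking the padded dimension.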
New Features and CLI Enhancements
- Added rectangular inference (`--rect`), batch inference (`--batch`), and expanded CLI options for IoU, max detections, and device selection. Updated CLI help and examples accordingly. [1] [2] [3] [4] [5] [6] [7] [8]

Performance and Pipeline Improvements
- Added SIMD preprocessing (`wide`), an LRU cache for LUTs (`lru`), and improved release-build optimization with "fat" LTO. [1] [2] [3]

Hardware Acceleration and Platform Support
Documentation and Example Updates
- Updated `README.md` with new CLI options, example outputs, features checklist, and detailed module/dependency breakdowns. [1] [2] [3] [4] [5]

Codebase and Dependency Updates
- Bumped the crate version to `0.0.8`, added new dependencies, and clarified module structure in documentation. [1] [2] [3]

🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
🌟 Summary
Ultralytics Inference `0.0.8` boosts performance and usability with SIMD-accelerated pre/post-processing, rectangular + batch inference, improved CLI defaults, and smoother Linux/device/runtime behavior 🚀

📊 Key Changes
- SIMD preprocessing (`wide`) plus an LRU-cached LUT (`lru`) to reduce repeated work.
- Rectangular inference (`--rect`) added and enabled by default: dynamically adjusts input shapes to reduce padding (when the ONNX model supports dynamic shapes) for better throughput/latency.
- Batch inference (`--batch`): batch size is now configurable, and the CLI pipeline was updated accordingly.
- Default `--iou` changed to `0.7` (was `0.45`) 🎛️
- Default `--max-det` is `300` (and builder/docs updated) 🧾
- `--save` and `--verbose` now show defaults more explicitly.
- Linux builds set the RPATH to `$ORIGIN` so binaries can find `libonnxruntime*.so` placed beside the executable (no need to set `LD_LIBRARY_PATH`) 📦

🎯 Purpose & Impact
- Smoother Linux deployment thanks to the `$ORIGIN` RPATH.
- New defaults (`iou=0.7`, `max_det=300`, `rect=true`) align behavior with Ultralytics Python, reducing surprises when switching environments.

📋 Skipped 1 file (lock files, generated, images, etc.)
Cargo.lock
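For readers wanting to reproduce the `$ORIGIN` RPATH behavior described above, a minimal `.cargo/config.toml` along these lines is one way to do it. This is an illustrative sketch; the PR's actual config, target list, and flags may differ:

```toml
# Embed an $ORIGIN RPATH in Linux binaries so the dynamic linker searches
# the executable's own directory for shared libraries at runtime. With
# this, libonnxruntime*.so can simply be shipped next to the binary,
# with no LD_LIBRARY_PATH required.
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-args=-Wl,-rpath,$ORIGIN"]
```

The `$ORIGIN` token is expanded by the ELF dynamic linker at load time to the directory containing the executable, which is what makes "drop the .so beside the binary" deployments work.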