Conversation

@onuralpszr (Member) commented Jan 10, 2026

This pull request introduces several significant improvements and new features to the Ultralytics inference Rust library, focusing on enhanced performance, usability, and expanded functionality. Major highlights include support for rectangular and batch inference, improved hardware acceleration options, expanded CLI arguments, and optimizations for preprocessing and post-processing. The documentation and example outputs have also been updated to reflect these changes.

New Features and CLI Enhancements:

  • Added rectangular inference (--rect) and batch inference (--batch) support, both enabled/configurable via CLI and passed through the inference pipeline. [1] [2] [3]
  • Increased default IoU threshold for NMS to 0.7, raised default max detections to 300, and exposed these as CLI arguments. [1] [2] [3] [4]
  • Expanded device selection to include more hardware acceleration options (CUDA, TensorRT, CoreML, OpenVINO, XNNPACK), and improved CLI help/examples.
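
As a rough illustration of the new options, they might surface in an inference config like the sketch below (all field names, types, and the struct itself are illustrative assumptions, not the crate's actual API; only the default values of 0.7 and 300 come from the PR):

```rust
// Illustrative config mirroring the new CLI flags. Names are assumptions.
#[derive(Debug, Clone)]
struct InferenceConfig {
    rect: bool,      // --rect: rectangular (non-square) inference
    batch: usize,    // --batch: images per forward pass
    iou: f32,        // --iou: NMS IoU threshold (new default 0.7)
    max_det: usize,  // --max-det: detection cap (new default 300)
    device: String,  // --device: cpu | cuda | tensorrt | coreml | openvino | xnnpack
}

impl Default for InferenceConfig {
    fn default() -> Self {
        Self {
            rect: false,
            batch: 1,
            iou: 0.7,
            max_det: 300,
            device: "cpu".into(),
        }
    }
}

fn main() {
    let cfg = InferenceConfig::default();
    println!("iou={} max_det={}", cfg.iou, cfg.max_det);
}
```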

Performance and Preprocessing Optimizations:

  • Added SIMD-accelerated preprocessing via the wide crate and introduced an LRU cache for preprocessing LUTs for faster image handling. [1] [2]
  • Switched to "fat" LTO in release builds for improved optimization.
  • On Linux, configured RPATH in .cargo/config.toml to simplify shared library loading.
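
The LUT-caching idea can be sketched as follows: for a given (source width, destination width) pair, the source-x coordinates used by horizontal resampling never change, so they are computed once and reused. The PR uses the `lru` crate plus SIMD via `wide`; this stdlib `HashMap` stand-in (with nearest-neighbour lookup rather than bilinear weights) only illustrates the caching concept:

```rust
use std::collections::HashMap;

// Stand-in for the LRU-cached preprocessing LUTs described above.
struct LutCache {
    cache: HashMap<(u32, u32), Vec<u32>>,
}

impl LutCache {
    fn new() -> Self {
        Self { cache: HashMap::new() }
    }

    /// Lookup table mapping each destination column to a source column
    /// (nearest-neighbour shown for brevity; the library computes
    /// bilinear weights). Computed once per (src_w, dst_w) pair.
    fn x_lut(&mut self, src_w: u32, dst_w: u32) -> &Vec<u32> {
        self.cache.entry((src_w, dst_w)).or_insert_with(|| {
            let scale = src_w as f32 / dst_w as f32;
            (0..dst_w)
                .map(|x| (((x as f32 + 0.5) * scale) as u32).min(src_w - 1))
                .collect()
        })
    }
}

fn main() {
    let mut cache = LutCache::new();
    let lut = cache.x_lut(1280, 640); // computed once…
    println!("first entries: {:?}", &lut[..4]);
    let _again = cache.x_lut(1280, 640); // …then served from the cache
}
```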

Batch Processing and Pipeline Improvements:

  • Implemented a pipelined, multi-threaded batch processing system using bounded channels between frame decoding and inference, improving throughput and responsiveness.
  • Centralized batch management in the prediction pipeline for more efficient processing.
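
The pipelined design can be sketched with a bounded (`sync`) channel, which applies backpressure so decoding never runs unboundedly ahead of inference. The frame type, channel capacity, and the "inference" step below are all illustrative stand-ins, not the crate's actual code:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

// Decoder thread feeds frames through a bounded channel; the consumer
// groups them into batches (the batched forward pass is simulated).
fn run_pipeline(frames: usize, batch_size: usize) -> usize {
    let (tx, rx) = sync_channel::<Vec<u8>>(batch_size * 2);

    let decoder = thread::spawn(move || {
        for i in 0..frames {
            let frame = vec![i as u8; 8]; // stand-in for real decoding
            if tx.send(frame).is_err() {
                break; // inference side hung up
            }
        }
        // tx drops here, closing the channel and ending the loop below.
    });

    let mut batch = Vec::with_capacity(batch_size);
    let mut processed = 0;
    for frame in rx {
        batch.push(frame);
        if batch.len() == batch_size {
            processed += batch.len(); // stand-in for one batched forward pass
            batch.clear();
        }
    }
    processed += batch.len(); // flush the final partial batch
    decoder.join().unwrap();
    processed
}

fn main() {
    println!("processed {} frames", run_pipeline(10, 4));
}
```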

Documentation and Example Updates:

  • Updated README.md with new CLI options, example commands, output samples, and a detailed breakdown of the codebase structure and dependencies. [1] [2] [3] [4] [5]
  • Revised example output to reflect new defaults, improved speed, and updated versioning.
  • Added new features to the "Features" checklist and clarified in-progress items.

Codebase and Dependency Updates:

  • Bumped crate version to 0.0.8 and added new dependencies (wide, lru) for preprocessing and caching. [1] [2]
  • Expanded and clarified module structure in documentation, highlighting new modules for batch processing, device management, annotation, I/O, and logging.

These changes collectively make the library faster, more flexible, and easier to use for a wider range of inference scenarios.


…ments

- Added `#[allow(clippy::struct_excessive_bools)]` to `InferenceConfig` to suppress excessive bool warnings.
- Removed unnecessary logging initialization code in `init_logging`.
- Suppressed the `clippy::unnecessary_wraps` lint in the `main` function.
- Enhanced `YOLOModel` with additional Clippy lints for better code quality.
- Optimized image processing in `YOLOModel` by reducing unnecessary allocations and improving data handling.
- Refactored post-processing to use zero-copy techniques and SIMD for faster detection extraction.
- Introduced a new zero-copy preprocessing function to minimize memory allocations during image processing.
- Improved letterbox resizing and bilinear interpolation with SIMD optimizations and LRU caching for X coordinate lookups.
- Cleaned up deprecated code and comments for better readability and maintainability.
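
The rectangular letterbox step mentioned above can be sketched as follows. Under rectangular inference, the scaled image is padded only up to the next stride multiple per side rather than to a full square. The function name, rounding choices, and the stride value are assumptions; the real implementation also computes padding offsets and bilinear weights:

```rust
// Compute the padded output dimensions for rectangular letterbox resize.
fn letterbox_dims(src_w: u32, src_h: u32, target: u32, stride: u32) -> (u32, u32) {
    // Uniform scale that fits the longer side into the target size.
    let scale = (target as f32 / src_w as f32).min(target as f32 / src_h as f32);
    let new_w = (src_w as f32 * scale).round() as u32;
    let new_h = (src_h as f32 * scale).round() as u32;
    // Rectangular mode: pad each dimension only to the next stride multiple.
    let pad_to = |x: u32| (x + stride - 1) / stride * stride;
    (pad_to(new_w), pad_to(new_h))
}

fn main() {
    // A 1280x720 frame at target 640, stride 32: scaled to 640x360,
    // then padded to 640x384 instead of a full 640x640 square.
    println!("{:?}", letterbox_dims(1280, 720, 640, 32));
}
```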

Signed-off-by: Onuralp SEZER <[email protected]>
@onuralpszr onuralpszr requested a review from picsalex January 10, 2026 15:05
@onuralpszr changed the title from "feat: ✨ Add rectangular inference support rect" to "feat: ✨ Add rectangular inference support rect and speed optimization for pre-processors and post-processors" Jan 10, 2026
codecov bot commented Jan 10, 2026

Codecov Report

❌ Patch coverage is 67.71218% with 175 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Missing lines |
|---|---|---|
| src/postprocessing.rs | 61.70% | 90 ⚠️ |
| src/preprocessing.rs | 76.96% | 47 ⚠️ |
| src/model.rs | 65.11% | 15 ⚠️ |
| src/cli/predict.rs | 64.51% | 11 ⚠️ |
| src/source.rs | 0.00% | 6 ⚠️ |
| src/download.rs | 72.72% | 3 ⚠️ |
| src/main.rs | 0.00% | 3 ⚠️ |
