
Conversation

@onuralpszr onuralpszr commented Dec 31, 2025

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Adds Docker build artifacts and a Rust-based inference HTTP server (with Swagger UI) plus a containerized CLI for running Ultralytics inference.

📊 Key Changes

  • Added a new .dockerignore to reduce Docker build context and exclude artifacts, models, tests, and docs.
  • Introduced docker/Dockerfile-cli to build and ship a rootless runtime image for the ultralytics-inference CLI (downloads a default yolo11n.onnx).
  • Introduced docker/Dockerfile-server and a new Rust crate under docker/server/ to run an Axum-based inference server.
  • Implemented server endpoints: / (root), /health, /info, and POST /predict (multipart upload with conf and max_det query params), plus Swagger UI at /swagger-ui. An example request is sketched below.
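
For quick manual checks once the server container is running, a request against the predict endpoint could look like the sketch below; host, port, file name, and parameter values are placeholders, while the image form field and the conf/max_det query parameters follow the endpoint description above.

# Placeholder host/port and image file; adjust to the actual container mapping
curl -X POST "http://localhost:8080/predict?conf=0.25&max_det=100" \
  -F "image=@bus.jpg"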

🎯 Purpose & Impact

  • Makes it easier to deploy inference as a containerized service or run a containerized CLI 📦
  • Provides an OpenAPI/Swagger UI for quick API exploration and integration 🧭
  • Improves security posture by running containers as a non-root user 🔐
  • Enables consistent, reproducible builds for users who want a turnkey inference server 🚀

@UltralyticsAssistant UltralyticsAssistant added the dependencies (Dependency-related topics) and enhancement (New feature or request) labels on Dec 31, 2025
@UltralyticsAssistant

👋 Hello @onuralpszr, thank you for submitting an ultralytics/inference 🚀 PR! This is an automated message to help with review; an Ultralytics engineer will assist soon. A few quick checks to ensure smooth integration ✨

  • Define a Purpose: Clearly explain the purpose of your fix or feature in your PR description, and link to any relevant issues. Ensure your commit messages are clear, concise, and adhere to the project's conventions.
  • Synchronize with Source: Confirm your PR is synchronized with the ultralytics/inference main branch. If it's behind, update it by clicking the 'Update branch' button or by running git pull and git merge main locally.
  • Ensure CI Checks Pass: Verify all Ultralytics Continuous Integration (CI) checks are passing. If any checks fail, please address the issues.
  • Update Documentation: Update the relevant documentation for any new or modified features.
  • Add Tests: If applicable, include or update tests to cover your changes, and confirm that all tests are passing.
  • Sign the CLA: Please ensure you have signed our Contributor License Agreement if this is your first Ultralytics PR by writing "I have read the CLA Document and I sign the CLA" in a new message.
  • Minimize Changes: Limit your changes to the minimum necessary for your bug fix or feature addition. "It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is." — Bruce Lee

For more guidance, please refer to our Contributing Guide. Don't hesitate to leave a comment if you have any questions. Thank you for contributing to Ultralytics! 🚀


@UltralyticsAssistant UltralyticsAssistant left a comment


🔍 PR Review

Made with ❤️ by Ultralytics Actions

Solid addition overall: Docker artifacts are clear, the server routes + Swagger UI integration look coherent, and non-root runtime is a good security baseline. Main issues to address are server robustness (multipart error handling and avoiding a potential panic on empty results) and Docker build reliability (ensure CA certs in the server builder stage and avoid fragile wildcard copies for ONNX Runtime shared libs).

💬 Posted 6 inline comments

}
};

let result = &results[0];


CRITICAL: let result = &results[0]; will panic if the model returns an empty results vec (e.g., unexpected input, internal model error, or library behavior changes). This should be handled gracefully (return 500 with an ErrorResponse) to avoid crashing the server process.
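
A minimal sketch of the suggested guard, using axum's StatusCode and Json; the ErrorResponse shape (a message field) and the handler's (StatusCode, Json<ErrorResponse>) error variant are assumptions to adjust to the actual types:

// Sketch: avoid indexing into a possibly empty results vec.
// ErrorResponse { message } and the error tuple shape are assumptions.
let Some(result) = results.first() else {
    return Err((
        StatusCode::INTERNAL_SERVER_ERROR,
        Json(ErrorResponse {
            message: "Model returned no results".to_string(),
        }),
    ));
};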

COPY --from=builder /usr/src/app/target/release/ultralytics-inference /usr/local/bin/ultralytics-inference

# Copy ONNX Runtime shared libraries
COPY --from=builder /usr/src/app/target/release/libonnxruntime*.so* /usr/lib/


⚠️ HIGH: COPY --from=builder .../target/release/libonnxruntime*.so* /usr/lib/ assumes ONNX Runtime shared libs are present under target/release/. If the build produces no matching files (feature changes, different linking strategy, etc.), the Docker build will fail at this step. Consider making the presence of these libs explicit (e.g., copying from a known output dir/artifact) or ensuring the build always emits them.
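
One possible shape, sketched with assumed paths: stage the libraries into a fixed directory in the builder stage so a missing .so fails the build there with a clear error, and the runtime COPY uses a deterministic source path.

# Builder stage: collect the ONNX Runtime libs into a known location (paths are assumptions)
RUN mkdir -p /opt/onnxruntime-libs \
    && cp /usr/src/app/target/release/libonnxruntime*.so* /opt/onnxruntime-libs/

# Runtime stage: deterministic copy
COPY --from=builder /opt/onnxruntime-libs/ /usr/lib/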

FROM ubuntu:questing-20251029 AS builder

# Install build dependencies + Rust (image-only)
RUN apt-get update && apt-get install -y \


⚠️ HIGH: Builder stage installs curl but not ca-certificates. rustup download uses TLS and may fail in minimal images without CA roots. Add ca-certificates to the builder stage deps to make the build more robust.
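
A minimal sketch of the fix; the package list here is trimmed to the relevant entries, with the rest of the original build dependencies unchanged:

# Builder deps: ca-certificates alongside curl so the rustup TLS download has CA roots
RUN apt-get update && apt-get install -y \
    curl \
    ca-certificates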

COPY --from=builder /usr/src/app/docker/server/target/release/ultralytics-inference-server /usr/local/bin/ultralytics-inference-server

# Copy ONNX Runtime shared libraries
COPY --from=builder /usr/src/app/docker/server/target/release/libonnxruntime*.so* /usr/lib/


⚠️ HIGH: Same concern as the CLI image: copying libonnxruntime*.so* from docker/server/target/release/ will fail the Docker build if no files match. It would be safer to copy from a deterministic location or ensure the server build step always produces these shared libs.

let max_det = params.max_det.unwrap_or(300);

// Extract image from multipart form
while let Ok(Some(field)) = multipart.next_field().await {


⚠️ HIGH: while let Ok(Some(field)) = multipart.next_field().await silently treats multipart parsing errors as end-of-stream, which can misreport a malformed request as "Missing 'image' field" (400) instead of a 400/500 with the actual error. Consider explicitly handling Err(e) from next_field() and returning an appropriate error response.
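
A sketch of the explicit handling, assuming the same ErrorResponse/StatusCode error shape used elsewhere in the handler; the existing per-field logic moves inside the loop unchanged:

// Sketch: distinguish genuine end-of-stream from a multipart parse error.
loop {
    let field = match multipart.next_field().await {
        Ok(Some(field)) => field,
        Ok(None) => break, // real end of the multipart body
        Err(e) => {
            return Err((
                StatusCode::BAD_REQUEST,
                Json(ErrorResponse {
                    message: format!("Invalid multipart request: {e}"),
                }),
            ));
        }
    };
    // ... existing handling of the "image" field ...
}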

};

// Run inference
let mut model = state.model.lock().await;


💡 MEDIUM: The model mutex is held across the entire inference + response construction path. This serializes all requests and also increases tail latency under load. If YOLOModel supports concurrent inference, consider narrowing the critical section to only the model call (cloning or copying any data you need from result before unlocking), or using a pool of models (or sharded models) if multi-request throughput is intended.
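
A sketch of the narrowed critical section; the predict call and its arguments are illustrative names rather than the crate's actual API, and the point is only that the guard is dropped before the response is built:

// Sketch: scope the lock to the inference call only (names are illustrative).
let results = {
    let mut model = state.model.lock().await;
    model.predict(&image_bytes, conf, max_det)
}; // MutexGuard dropped here, before response construction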


codecov bot commented Dec 31, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@onuralpszr onuralpszr marked this pull request as draft January 3, 2026 02:45
