feat: ✨ Add Dockerfiles and configuration for inference server and CLI #48
Conversation
Signed-off-by: Onuralp SEZER <[email protected]>
👋 Hello @onuralpszr, thank you for submitting a pull request! For more guidance, please refer to our Contributing Guide. Don't hesitate to leave a comment if you have any questions. Thank you for contributing to Ultralytics! 🚀
UltralyticsAssistant left a comment
🔍 PR Review
Made with ❤️ by Ultralytics Actions
Solid addition overall: Docker artifacts are clear, the server routes + Swagger UI integration look coherent, and non-root runtime is a good security baseline. Main issues to address are server robustness (multipart error handling and avoiding a potential panic on empty results) and Docker build reliability (ensure CA certs in the server builder stage and avoid fragile wildcard copies for ONNX Runtime shared libs).
💬 Posted 6 inline comments
```rust
    }
};

let result = &results[0];
```
❗ CRITICAL: `let result = &results[0];` will panic if the model returns an empty results vec (e.g., unexpected input, internal model error, or library behavior changes). This should be handled gracefully (return 500 with an `ErrorResponse`) to avoid crashing the server process.
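The suggested guard can be sketched with std-only Rust; `Detection` and `AppError` below are hypothetical stand-ins for the server's actual result and error types:

```rust
// Hedged sketch: guard against an empty results vec instead of indexing into it.
// `Detection` and `AppError` are stand-ins for the crate's real types.
#[allow(dead_code)]
#[derive(Debug)]
struct Detection {
    class: String,
    conf: f32,
}

#[derive(Debug, PartialEq)]
enum AppError {
    EmptyResults, // would map to a 500 + ErrorResponse in the handler
}

fn first_result(results: &[Detection]) -> Result<&Detection, AppError> {
    // `slice::first` returns Option<&T>, so an empty vec becomes an Err
    // instead of a panic that takes down the server process.
    results.first().ok_or(AppError::EmptyResults)
}

fn main() {
    let empty: Vec<Detection> = Vec::new();
    assert!(matches!(first_result(&empty), Err(AppError::EmptyResults)));

    let results = vec![Detection { class: "person".to_string(), conf: 0.9 }];
    assert!(first_result(&results).is_ok());
    println!("ok");
}
```

In the handler, the `Err` arm would be converted into the existing `ErrorResponse` JSON body rather than propagating a panic.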
```dockerfile
COPY --from=builder /usr/src/app/target/release/ultralytics-inference /usr/local/bin/ultralytics-inference

# Copy ONNX Runtime shared libraries
COPY --from=builder /usr/src/app/target/release/libonnxruntime*.so* /usr/lib/
```
`COPY --from=builder .../target/release/libonnxruntime*.so* /usr/lib/` assumes ONNX Runtime shared libs are present under `target/release/`. If the build produces no matching files (feature changes, different linking strategy, etc.), the Docker build will fail at this step. Consider making the presence of these libs explicit (e.g., copying from a known output dir/artifact) or ensuring the build always emits them.
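One way to make this deterministic is to stage the libs into a fixed directory inside the builder, so a missing-lib failure surfaces at a clear `RUN` step rather than at a wildcard `COPY`. This is a sketch only; the staging directory name is an assumption:

```dockerfile
# Sketch: collect the ONNX Runtime libs into a known directory in the builder.
# If the build produced no matching files, this RUN step fails with a clear
# cp error instead of an opaque COPY failure later.
RUN mkdir -p /opt/onnxruntime-libs && \
    cp /usr/src/app/target/release/libonnxruntime*.so* /opt/onnxruntime-libs/

# ...later, in the runtime stage, the source is now deterministic:
COPY --from=builder /opt/onnxruntime-libs/ /usr/lib/
```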
```dockerfile
FROM ubuntu:questing-20251029 AS builder

# Install build dependencies + Rust (image-only)
RUN apt-get update && apt-get install -y \
```
The builder stage installs `curl` but not `ca-certificates`. The rustup download uses TLS and may fail in minimal images without CA roots. Add `ca-certificates` to the builder stage deps to make the build more robust.
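A sketch of the suggested fix; the package list beyond `curl` and `ca-certificates` is truncated in the diff above, so `build-essential` here is a placeholder for whatever the Dockerfile actually installs:

```dockerfile
# Sketch: ca-certificates provides the CA roots rustup's TLS download needs
RUN apt-get update && apt-get install -y --no-install-recommends \
        curl \
        ca-certificates \
        build-essential \
    && rm -rf /var/lib/apt/lists/*
```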
```dockerfile
COPY --from=builder /usr/src/app/docker/server/target/release/ultralytics-inference-server /usr/local/bin/ultralytics-inference-server

# Copy ONNX Runtime shared libraries
COPY --from=builder /usr/src/app/docker/server/target/release/libonnxruntime*.so* /usr/lib/
```
`libonnxruntime*.so*` from `docker/server/target/release/` will fail the Docker build if no files match. It would be safer to copy from a deterministic location or ensure the server build step always produces these shared libs.
```rust
let max_det = params.max_det.unwrap_or(300);

// Extract image from multipart form
while let Ok(Some(field)) = multipart.next_field().await {
```
`while let Ok(Some(field)) = multipart.next_field().await` silently treats multipart parsing errors as end-of-stream, which can misreport a malformed request as "Missing 'image' field" (400) instead of a 400/500 with the actual error. Consider explicitly handling `Err(e)` from `next_field()` and returning an appropriate error response.
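The control-flow fix can be sketched without axum by modeling the async multipart stream as an iterator of `Result<Option<_>, _>` values; the `Outcome` variants stand in for the handler's actual HTTP responses:

```rust
// Sketch of the suggested control flow: distinguish "no more fields" from
// "parse error" instead of folding both into loop exit.
// Field names (String) and errors (String) stand in for axum's multipart types.
#[derive(Debug, PartialEq)]
enum Outcome {
    Found(String),      // the "image" field was located
    Missing,            // genuine end of stream, no "image" field -> 400
    BadRequest(String), // malformed multipart part -> 400/500 with real error
}

fn find_image<I>(mut fields: I) -> Outcome
where
    I: Iterator<Item = Result<Option<String>, String>>,
{
    loop {
        // Iterator exhaustion is treated like a clean end-of-stream.
        match fields.next().unwrap_or(Ok(None)) {
            Ok(Some(name)) if name == "image" => return Outcome::Found(name),
            Ok(Some(_)) => continue,             // some other field, keep looking
            Ok(None) => return Outcome::Missing, // real end of stream
            // A parse error is reported as such, not masked as "missing field".
            Err(e) => return Outcome::BadRequest(e),
        }
    }
}

fn main() {
    let ok = vec![Ok(Some("image".to_string()))];
    assert_eq!(find_image(ok.into_iter()), Outcome::Found("image".into()));

    let err = vec![Err("invalid multipart boundary".to_string())];
    assert_eq!(
        find_image(err.into_iter()),
        Outcome::BadRequest("invalid multipart boundary".into())
    );
    println!("ok");
}
```

In the real handler, the same shape applies with `match multipart.next_field().await { Ok(Some(field)) => ..., Ok(None) => ..., Err(e) => ... }` inside the loop.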
```rust
};

// Run inference
let mut model = state.model.lock().await;
```
💡 MEDIUM: The model mutex is held across the entire inference + response construction path. This serializes all requests and also increases tail latency under load. If `YOLOModel` supports concurrent inference, consider narrowing the critical section to only the model call (cloning or copying the data you need from `result` before unlocking), or using a pool of models if multi-request throughput is intended.
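The narrowed critical section can be sketched with std types (the real server would use `tokio::sync::Mutex` and `.lock().await`); `Model` and `InferenceResult` are hypothetical stand-ins:

```rust
use std::sync::{Arc, Mutex};

// Hypothetical stand-ins for the server's model and result types.
#[derive(Clone, Debug, PartialEq)]
struct InferenceResult {
    detections: usize,
}

struct Model;

impl Model {
    fn infer(&mut self, _image: &[u8]) -> InferenceResult {
        InferenceResult { detections: 1 }
    }
}

// Narrow the critical section: lock only for the model call, take ownership
// of the result, and release the lock before building the HTTP response.
fn predict(model: &Arc<Mutex<Model>>, image: &[u8]) -> InferenceResult {
    let result = {
        let mut guard = model.lock().unwrap();
        guard.infer(image)
        // guard drops here, at the end of this inner scope
    };
    // ...response serialization happens after the lock is released...
    result
}

fn main() {
    let model = Arc::new(Mutex::new(Model));
    assert_eq!(predict(&model, b"bytes"), InferenceResult { detections: 1 });
    println!("ok");
}
```

The inner-scope pattern keeps the lock duration proportional to inference time only, instead of inference plus JSON construction.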
Codecov Report: ✅ All modified and coverable lines are covered by tests.
🛠️ PR Summary
Made with ❤️ by Ultralytics Actions
🌟 Summary
Adds Docker build artifacts and a Rust-based inference HTTP server (with Swagger UI) plus a containerized CLI for running Ultralytics inference.
📊 Key Changes
- Adds `.dockerignore` to reduce Docker build context and exclude artifacts, models, tests, and docs.
- Adds `docker/Dockerfile-cli` to build and ship a rootless runtime image for the `ultralytics-inference` CLI (downloads a default `yolo11n.onnx`).
- Adds `docker/Dockerfile-server` and a new Rust crate under `docker/server/` to run an Axum-based inference server.
- Exposes routes `/` (root), `/health`, `/info`, and `POST /predict` (multipart upload with `conf` and `max_det` query params), plus Swagger UI at `/swagger-ui`.

🎯 Purpose & Impact