[Draft] docker: use pytorch install from hsdk 3.10 base image #1305
base: main
Conversation
Signed-off-by: Alexis Girault <[email protected]>
Walkthrough
Removes the multi-stage Docker build pattern from the Torch installation workflow. Simplifies GPU/CUDA-specific wheel selection logic by introducing conditional INDEX_URL computation.
Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks: ✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 2
🧹 Nitpick comments (1)
applications/object_detection_torch/Dockerfile (1)
39-40: Consider providing default values for build arguments. Both `GPU_TYPE` and `CUDA_MAJOR` lack defaults, which will cause the build to fail if these arguments are not explicitly provided. Consider adding sensible defaults (e.g., `ARG GPU_TYPE=dgpu` and `ARG CUDA_MAJOR=12`) to maintain backward compatibility with existing build workflows.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
applications/object_detection_torch/Dockerfile (2 hunks)
🔇 Additional comments (2)
applications/object_detection_torch/Dockerfile (2)
18-19: Note: AI summary inconsistency. The AI summary states that `ARG BASE_IMAGE` was removed, but it's still present on line 18. Only the stage name (`as base`) was removed from the FROM instruction, which is correct for eliminating the multi-stage build pattern.
34-37: Document the expected BASE_IMAGE or add build-time validation. The ldconfig setup at lines 34-37 depends on PyTorch being available in BASE_IMAGE via `import torch`. While the Holoscan container image on NGC includes all dependencies with expected versions (including Torch), the Dockerfile has no default BASE_IMAGE value and doesn't document which image should be used. Consider adding a comment or validation to clarify that BASE_IMAGE must be set to an official Holoscan SDK container image (e.g., `nvcr.io/nvidia/clara-holoscan:v3.10.0-devel-cuda12.2-cudnn8.9-ubuntu22.04`), or add a fallback with error messaging if the torch import fails.
```dockerfile
RUN INDEX_URL=""; \
    if [ "${GPU_TYPE}" = "igpu" ]; then \
        INDEX_URL="https://pypi.jetson-ai-lab.io/jp6/cu126"; \
    elif [ "${CUDA_MAJOR}" = "12" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu129"; \
    elif [ "${CUDA_MAJOR}" = "13" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu130"; \
    fi \
    && python3 -m pip install torchvision --no-cache-dir --index-url ${INDEX_URL}
```
Handle the case when INDEX_URL is empty.
If GPU_TYPE is not "igpu" and CUDA_MAJOR is neither "12" nor "13", INDEX_URL will remain an empty string. The subsequent pip install with --index-url "" may fail or exhibit unpredictable behavior.
Apply this diff to add a fallback or validation:
```diff
 ARG GPU_TYPE
 ARG CUDA_MAJOR
 RUN INDEX_URL=""; \
     if [ "${GPU_TYPE}" = "igpu" ]; then \
         INDEX_URL="https://pypi.jetson-ai-lab.io/jp6/cu126"; \
     elif [ "${CUDA_MAJOR}" = "12" ]; then \
         INDEX_URL="https://download.pytorch.org/whl/cu129"; \
     elif [ "${CUDA_MAJOR}" = "13" ]; then \
         INDEX_URL="https://download.pytorch.org/whl/cu130"; \
+    else \
+        echo "Error: Unsupported GPU_TYPE (${GPU_TYPE}) or CUDA_MAJOR (${CUDA_MAJOR})" && exit 1; \
     fi \
     && python3 -m pip install torchvision --no-cache-dir --index-url ${INDEX_URL}
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```dockerfile
RUN INDEX_URL=""; \
    if [ "${GPU_TYPE}" = "igpu" ]; then \
        INDEX_URL="https://pypi.jetson-ai-lab.io/jp6/cu126"; \
    elif [ "${CUDA_MAJOR}" = "12" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu129"; \
    elif [ "${CUDA_MAJOR}" = "13" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu130"; \
    else \
        echo "Error: Unsupported GPU_TYPE (${GPU_TYPE}) or CUDA_MAJOR (${CUDA_MAJOR})" && exit 1; \
    fi \
    && python3 -m pip install torchvision --no-cache-dir --index-url ${INDEX_URL}
```
🤖 Prompt for AI Agents
In applications/object_detection_torch/Dockerfile around lines 41 to 49,
INDEX_URL can be left empty which causes pip to be invoked with --index-url "";
change the logic to provide a safe fallback or omit the flag when INDEX_URL is
empty: set INDEX_URL to the default PyPI index (https://pypi.org/simple) if no
GPU/CUDA-specific URL is selected, or modify the final pip install invocation to
conditionally include --index-url only when INDEX_URL is non-empty (ensuring
proper quoting of the variable).
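The second option in the prompt — passing `--index-url` only when a value was actually selected — can be sketched in POSIX shell like this (illustrative only; the variable names mirror the Dockerfile, but this snippet is not part of the PR):

```shell
# Sketch: build the pip arguments and append --index-url only when a
# GPU/CUDA-specific index was selected. Printing instead of installing
# keeps the example side-effect free.
INDEX_URL=""   # assume the GPU_TYPE/CUDA_MAJOR checks left it empty

PIP_ARGS="torchvision --no-cache-dir"
if [ -n "${INDEX_URL}" ]; then
    PIP_ARGS="${PIP_ARGS} --index-url ${INDEX_URL}"
fi
echo "python3 -m pip install ${PIP_ARGS}"
# prints: python3 -m pip install torchvision --no-cache-dir
```

With an empty `INDEX_URL`, pip would then fall back to the default PyPI index instead of receiving a malformed `--index-url ""`.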
```dockerfile
        INDEX_URL="https://pypi.jetson-ai-lab.io/jp6/cu126"; \
    elif [ "${CUDA_MAJOR}" = "12" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu129"; \
    elif [ "${CUDA_MAJOR}" = "13" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu130"; \
```
Use cu126 for CUDA 12 instead of cu129 to ensure broader compatibility.
The current mapping of cu129 for CUDA 12 is problematic because cu129 is specifically for CUDA 12.9, while cu126 is compatible with LibCUDA 12.0 and above. Mapping CUDA_MAJOR="12" to cu129 unconditionally will fail for CUDA 12.0 through 12.8. PyTorch versions 2.7.x support cu118, cu126, and cu128, but not cu129, making cu129 unsuitable as a blanket CUDA 12 wheel. Consider using cu126 as the baseline for CUDA 12.x or implement more granular minor version checks to select the appropriate wheel variant (cu126, cu128, or cu129).
The cu130 mapping for CUDA 13 is correct.
🤖 Prompt for AI Agents
In applications/object_detection_torch/Dockerfile around lines 43 to 47, the
mapping for CUDA_MAJOR="12" currently points to the cu129 wheel which will fail
for CUDA 12.0–12.8; change the mapping so CUDA_MAJOR="12" uses the cu126 index
URL (or implement minor-version detection to select cu126/cu128/cu129 as
appropriate) — update the INDEX_URL assignment for the "12" branch to
"https://download.pytorch.org/whl/cu126" (or add logic to inspect CUDA minor
version and choose cu126/cu128/cu129 accordingly).
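The minor-version detection mentioned in the comment and the prompt could look roughly like this — a sketch under assumed cutoffs (cu129 for ≥ 12.9, cu128 for 12.8, cu126 otherwise) that should be checked against the wheel variants PyTorch actually publishes:

```shell
# Sketch: map a CUDA "major.minor" version string to a PyTorch wheel index.
# The cutoffs below are illustrative assumptions, not taken from the PR.
select_index_url() {
    cuda_version="$1"              # e.g. "12.6"
    major="${cuda_version%%.*}"    # text before the first dot
    minor="${cuda_version#*.}"     # text after the first dot
    if [ "${major}" = "13" ]; then
        echo "https://download.pytorch.org/whl/cu130"
    elif [ "${major}" = "12" ]; then
        if [ "${minor}" -ge 9 ]; then
            echo "https://download.pytorch.org/whl/cu129"
        elif [ "${minor}" -ge 8 ]; then
            echo "https://download.pytorch.org/whl/cu128"
        else
            echo "https://download.pytorch.org/whl/cu126"
        fi
    else
        echo "Error: unsupported CUDA version '${cuda_version}'" >&2
        return 1
    fi
}

select_index_url "12.6"
# prints: https://download.pytorch.org/whl/cu126
```

In the Dockerfile this would require passing the full CUDA version (e.g. a `CUDA_VERSION` build arg) rather than only `CUDA_MAJOR`.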
Greptile Overview
Greptile Summary
Simplified PyTorch installation by relying on the hsdk 3.10 base image's pre-installed PyTorch and only installing torchvision. Removed the multi-stage build and the complex version-specific PyTorch installation logic.
Key changes:
Issues found:
Confidence Score: 2/5
Important Files Changed
File Analysis
Sequence Diagram

```mermaid
sequenceDiagram
    participant Base as BASE_IMAGE (hsdk 3.10)
    participant Docker as Dockerfile Build
    participant Holohub as Holohub Setup
    participant PyTorch as PyTorch (pre-installed)
    participant Torch as Torchvision Install
    Note over Base: PyTorch assumed pre-installed
    Docker->>Base: FROM ${BASE_IMAGE}
    Docker->>Holohub: COPY holohub scripts
    Docker->>Holohub: Run holohub setup
    Note over Docker,PyTorch: Lines 35-37: Configure libtorch paths
    Docker->>PyTorch: import torch (get __path__)
    PyTorch-->>Docker: Returns torch installation path
    Docker->>Docker: Configure ldconfig with torch lib path
    Note over Docker,Torch: Lines 41-49: Install torchvision
    Docker->>Docker: Determine INDEX_URL based on GPU_TYPE/CUDA_MAJOR
    Docker->>Torch: pip install torchvision from INDEX_URL
    Torch-->>Docker: Installation complete
    Note over Docker: Build complete
```
1 file reviewed, 2 comments
```dockerfile
RUN echo $(python3 -c "import torch; print(torch.__path__[0])")/lib > /etc/ld.so.conf.d/libtorch.conf \
    && ldconfig \
    && ldconfig -p | grep -q "libtorch.so"
```
logic: this code assumes PyTorch is already installed in the base image, but torchvision installation happens later (line 41-49). If PyTorch isn't pre-installed or if there's a dependency between the two, this will fail with an ImportError.
```dockerfile
# Verify PyTorch is available in base image
RUN python3 -c "import torch; print(f'PyTorch {torch.__version__} found')"

# Set up Holoscan SDK container libtorch to be found with ldconfig for app C++ build and runtime
RUN echo $(python3 -c "import torch; print(torch.__path__[0])")/lib > /etc/ld.so.conf.d/libtorch.conf \
    && ldconfig \
    && ldconfig -p | grep -q "libtorch.so"
```
Is PyTorch guaranteed to be pre-installed in the hsdk 3.10 base image, and are there any version compatibility requirements between PyTorch and torchvision?
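One way to answer the first half of that question at build time, rather than by convention, is an explicit probe before the ldconfig step (a sketch, not part of the PR; it only checks for the module's presence, so it also runs in environments where torch is absent):

```shell
# Sketch: detect whether the 'torch' module is importable in the image.
# A real Dockerfile guard would fail the build instead of just reporting.
if python3 -c "import importlib.util, sys; sys.exit(0 if importlib.util.find_spec('torch') else 1)" 2>/dev/null; then
    TORCH_PRESENT=yes
else
    TORCH_PRESENT=no
fi
echo "torch present: ${TORCH_PRESENT}"
```

The torch/torchvision version pairing would still need to be checked against the compatibility matrix that the torchvision project publishes.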
```dockerfile
RUN INDEX_URL=""; \
    if [ "${GPU_TYPE}" = "igpu" ]; then \
        INDEX_URL="https://pypi.jetson-ai-lab.io/jp6/cu126"; \
    elif [ "${CUDA_MAJOR}" = "12" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu129"; \
    elif [ "${CUDA_MAJOR}" = "13" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu130"; \
    fi \
    && python3 -m pip install torchvision --no-cache-dir --index-url ${INDEX_URL}
```
logic: if INDEX_URL is empty (e.g., GPU_TYPE is not "igpu" and CUDA_MAJOR is not "12" or "13"), pip will attempt to install from the default PyPI index, which may not have the correct CUDA version of torchvision. Consider adding an else clause or validation.
```dockerfile
ARG GPU_TYPE
ARG CUDA_MAJOR
RUN INDEX_URL=""; \
    if [ "${GPU_TYPE}" = "igpu" ]; then \
        INDEX_URL="https://pypi.jetson-ai-lab.io/jp6/cu126"; \
    elif [ "${CUDA_MAJOR}" = "12" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu129"; \
    elif [ "${CUDA_MAJOR}" = "13" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu130"; \
    else \
        echo "Error: Unsupported GPU_TYPE=${GPU_TYPE} or CUDA_MAJOR=${CUDA_MAJOR}"; \
        exit 1; \
    fi \
    && python3 -m pip install torchvision --no-cache-dir --index-url ${INDEX_URL}
```
Summary by CodeRabbit