
Conversation


@agirault agirault commented Dec 16, 2025

Summary by CodeRabbit

  • Chores
    • Simplified the Docker build process for the object detection application, streamlining GPU and CUDA dependency management to improve build efficiency and reduce configuration complexity.


@agirault agirault requested a review from tbirdso December 16, 2025 05:19
@agirault agirault self-assigned this Dec 16, 2025

coderabbitai bot commented Dec 16, 2025

Walkthrough

Removes multi-stage Docker build pattern from Torch installation workflow. Simplifies GPU/CUDA-specific wheel selection logic by introducing conditional INDEX_URL computation. Introduces ARG CUDA_MAJOR parameter and reintroduces ARG GPU_TYPE. Consolidates Torch setup and torchvision installation into streamlined conditional RUN blocks.

Changes

Cohort / File(s) Change Summary
Dockerfile multi-stage build refactor
applications/object_detection_torch/Dockerfile
Removes multi-stage scaffolding (removes "as base" stage). Replaces complex conditional Torch wheel selection logic with simplified INDEX_URL-based approach. Adds ARG CUDA_MAJOR, reintroduces ARG GPU_TYPE. Consolidates torchvision installation into new RUN block. Retains existing libtorch ldconfig configuration.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Verify the new INDEX_URL conditional logic correctly handles all GPU_TYPE and CUDA_MAJOR combinations
  • Confirm torchvision installation from computed index URL produces expected artifacts
  • Ensure libtorch ldconfig configuration remains functional after refactoring
  • Validate that removal of multi-stage build does not inadvertently affect downstream image dependencies
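The first checklist item can be spot-checked by lifting the Dockerfile's branch logic into a plain shell helper and exercising it per combination (a sketch; `resolve_index_url` is a name invented here, the URLs are copied from the Dockerfile under review):

```shell
# Mirror of the Dockerfile's INDEX_URL selection, runnable outside Docker.
resolve_index_url() {
    GPU_TYPE="$1"; CUDA_MAJOR="$2"; INDEX_URL=""
    if [ "${GPU_TYPE}" = "igpu" ]; then
        INDEX_URL="https://pypi.jetson-ai-lab.io/jp6/cu126"
    elif [ "${CUDA_MAJOR}" = "12" ]; then
        INDEX_URL="https://download.pytorch.org/whl/cu129"
    elif [ "${CUDA_MAJOR}" = "13" ]; then
        INDEX_URL="https://download.pytorch.org/whl/cu130"
    fi
    echo "${INDEX_URL}"
}

resolve_index_url igpu 12   # → https://pypi.jetson-ai-lab.io/jp6/cu126
resolve_index_url dgpu 13   # → https://download.pytorch.org/whl/cu130
resolve_index_url dgpu 11   # → empty string: this combination is unhandled
```

The empty result for the last call is the gap the review comments below flag.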

Pre-merge checks

✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: simplifying the Dockerfile to use PyTorch from the hsdk 3.10 base image instead of a multi-stage build.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
applications/object_detection_torch/Dockerfile (1)

39-40: Consider providing default values for build arguments.

Both GPU_TYPE and CUDA_MAJOR lack defaults, which will cause the build to fail if these arguments are not explicitly provided. Consider adding sensible defaults (e.g., ARG GPU_TYPE=dgpu and ARG CUDA_MAJOR=12) to maintain backward compatibility with existing build workflows.
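In shell terms, the proposed defaulting amounts to parameter expansion with fallbacks (dgpu and 12 are the values suggested in this comment, not values the Dockerfile currently sets):

```shell
# Emulate `ARG GPU_TYPE=dgpu` / `ARG CUDA_MAJOR=12` defaulting:
# the fallback applies only when the variable is unset or empty.
GPU_TYPE="${GPU_TYPE:-dgpu}"
CUDA_MAJOR="${CUDA_MAJOR:-12}"
echo "building for GPU_TYPE=${GPU_TYPE} CUDA_MAJOR=${CUDA_MAJOR}"
```

With docker build, the same effect comes from declaring the default on the ARG line itself; an explicit --build-arg GPU_TYPE=igpu still overrides it.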

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9dc2d30 and 9909828.

📒 Files selected for processing (1)
  • applications/object_detection_torch/Dockerfile (2 hunks)
🔇 Additional comments (2)
applications/object_detection_torch/Dockerfile (2)

18-19: Note: AI summary inconsistency.

The AI summary states that ARG BASE_IMAGE was removed, but it's still present on line 18. Only the stage name (as base) was removed from the FROM instruction, which is correct for eliminating the multi-stage build pattern.


34-37: Document the expected BASE_IMAGE or add build-time validation.

The ldconfig setup at lines 34-37 depends on PyTorch being available in BASE_IMAGE via import torch. While the Holoscan container image on NGC includes all dependencies with expected versions (including Torch), the Dockerfile has no default BASE_IMAGE value and doesn't document which image should be used. Consider adding a comment or validation to clarify that BASE_IMAGE must be set to an official Holoscan SDK container image (e.g., nvcr.io/nvidia/clara-holoscan:v3.10.0-devel-cuda12.2-cudnn8.9-ubuntu22.04), or add a fallback with error messaging if torch import fails.
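One lightweight way to make the dependency explicit is a build-time guard that fails with a readable message when the expected module is missing (a sketch; `require_py_module` is a hypothetical helper, not part of the PR):

```shell
# Hypothetical guard: verify a Python module is importable before later
# steps (such as the libtorch ldconfig setup) silently depend on it.
require_py_module() {
    python3 -c 'import importlib.util, sys; sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1" \
        || { echo "ERROR: Python module '$1' not found; set BASE_IMAGE to an image that provides it" >&2; return 1; }
}

# In the Dockerfile this would run as e.g.: RUN require_py_module torch
require_py_module json && echo "json is available"   # → json is available
```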

Comment on lines +41 to +49
RUN INDEX_URL=""; \
    if [ "${GPU_TYPE}" = "igpu" ]; then \
        INDEX_URL="https://pypi.jetson-ai-lab.io/jp6/cu126"; \
    elif [ "${CUDA_MAJOR}" = "12" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu129"; \
    elif [ "${CUDA_MAJOR}" = "13" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu130"; \
    fi \
    && python3 -m pip install torchvision --no-cache-dir --index-url ${INDEX_URL}

⚠️ Potential issue | 🔴 Critical

Handle the case when INDEX_URL is empty.

If GPU_TYPE is not "igpu" and CUDA_MAJOR is neither "12" nor "13", INDEX_URL will remain an empty string. The subsequent pip install with --index-url "" may fail or exhibit unpredictable behavior.

Apply this diff to add a fallback or validation:

 ARG GPU_TYPE
 ARG CUDA_MAJOR
 RUN INDEX_URL=""; \
     if [ "${GPU_TYPE}" = "igpu" ]; then \
         INDEX_URL="https://pypi.jetson-ai-lab.io/jp6/cu126"; \
     elif [ "${CUDA_MAJOR}" = "12" ]; then \
         INDEX_URL="https://download.pytorch.org/whl/cu129"; \
     elif [ "${CUDA_MAJOR}" = "13" ]; then \
         INDEX_URL="https://download.pytorch.org/whl/cu130"; \
+    else \
+        echo "Error: Unsupported GPU_TYPE (${GPU_TYPE}) or CUDA_MAJOR (${CUDA_MAJOR})" && exit 1; \
     fi \
     && python3 -m pip install torchvision --no-cache-dir --index-url ${INDEX_URL}
🤖 Prompt for AI Agents
In applications/object_detection_torch/Dockerfile around lines 41 to 49,
INDEX_URL can be left empty which causes pip to be invoked with --index-url "";
change the logic to provide a safe fallback or omit the flag when INDEX_URL is
empty: set INDEX_URL to the default PyPI index (https://pypi.org/simple) if no
GPU/CUDA-specific URL is selected, or modify the final pip install invocation to
conditionally include --index-url only when INDEX_URL is non-empty (ensuring
proper quoting of the variable).
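The second option from the prompt, passing --index-url only when a URL was actually selected, can be sketched like this (`build_pip_cmd` is a name invented here; the echo stands in for the real pip invocation):

```shell
# Build the pip command conditionally: an empty INDEX_URL means
# "use the default PyPI index", never `--index-url ""`.
build_pip_cmd() {
    INDEX_URL="$1"
    if [ -n "${INDEX_URL}" ]; then
        echo "python3 -m pip install torchvision --no-cache-dir --index-url ${INDEX_URL}"
    else
        echo "python3 -m pip install torchvision --no-cache-dir"
    fi
}

build_pip_cmd "https://download.pytorch.org/whl/cu130"
build_pip_cmd ""   # falls back to the default index, no dangling flag
```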

Comment on lines +43 to +47
INDEX_URL="https://pypi.jetson-ai-lab.io/jp6/cu126"; \
elif [ "${CUDA_MAJOR}" = "12" ]; then \
INDEX_URL="https://download.pytorch.org/whl/cu129"; \
elif [ "${CUDA_MAJOR}" = "13" ]; then \
INDEX_URL="https://download.pytorch.org/whl/cu130"; \

⚠️ Potential issue | 🟡 Minor

Use cu126 for CUDA 12 instead of cu129 to ensure broader compatibility.

The current mapping of cu129 for CUDA 12 is problematic because cu129 is specifically for CUDA 12.9, while cu126 is compatible with LibCUDA 12.0 and above. Mapping CUDA_MAJOR="12" to cu129 unconditionally will fail for CUDA 12.0 through 12.8. PyTorch versions 2.7.x support cu118, cu126, and cu128, but not cu129, making cu129 unsuitable as a blanket CUDA 12 wheel. Consider using cu126 as the baseline for CUDA 12.x or implement more granular minor version checks to select the appropriate wheel variant (cu126, cu128, or cu129).

The cu130 mapping for CUDA 13 is correct.

🤖 Prompt for AI Agents
In applications/object_detection_torch/Dockerfile around lines 43 to 47, the
mapping for CUDA_MAJOR="12" currently points to the cu129 wheel which will fail
for CUDA 12.0–12.8; change the mapping so CUDA_MAJOR="12" uses the cu126 index
URL (or implement minor-version detection to select cu126/cu128/cu129 as
appropriate) — update the INDEX_URL assignment for the "12" branch to
"https://download.pytorch.org/whl/cu126" (or add logic to inspect CUDA minor
version and choose cu126/cu128/cu129 accordingly).
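The granular option can be sketched by parsing the full CUDA version rather than only the major (`select_cuda_wheel` is a name invented here; the 12.x cut-over points follow this comment's cu126/cu128 discussion and should be checked against the wheels PyTorch actually publishes):

```shell
# Pick a wheel variant from a full CUDA version string such as "12.4".
select_cuda_wheel() {
    CUDA_VERSION="$1"
    MAJOR="${CUDA_VERSION%%.*}"
    MINOR="${CUDA_VERSION#*.}"
    if [ "${MAJOR}" = "13" ]; then
        echo "cu130"
    elif [ "${MAJOR}" = "12" ] && [ "${MINOR}" -ge 8 ]; then
        echo "cu128"
    else
        echo "cu126"   # baseline: compatible with CUDA 12.0 and above
    fi
}

select_cuda_wheel 12.4   # → cu126
select_cuda_wheel 12.9   # → cu128
select_cuda_wheel 13.0   # → cu130
```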


greptile-apps bot commented Dec 16, 2025

Greptile Overview

Greptile Summary

Simplified PyTorch installation by relying on the hsdk 3.10 base image's pre-installed PyTorch and only installing torchvision. Removed the multi-stage build and complex version-specific PyTorch installation logic.

Key changes:

  • Removed multi-stage Dockerfile (base → torch stages merged into single stage)
  • Removed explicit PyTorch installation logic (now assumes PyTorch is pre-installed in base image)
  • Simplified to only install torchvision with appropriate CUDA-version-specific index URLs
  • Changed iGPU PyTorch wheel source from cu129 to cu126
  • Removed PyTorch installation validation check (libtorch_cuda.so)
  • Removed /opt/libtorch symlink setup

Issues found:

  • Critical: the code imports PyTorch (lines 35-37) before torchvision is installed (lines 41-49), which could fail if PyTorch isn't actually in the base image
  • No fallback handling when GPU_TYPE/CUDA_MAJOR don't match expected values: pip will use the default index, which may install incompatible versions

Confidence Score: 2/5

  • This PR has critical assumptions that may not hold, potentially breaking the build
  • The PR simplifies the Dockerfile significantly but introduces two critical issues: (1) it assumes PyTorch is pre-installed in the base image without verification, and attempts to use it before torchvision is installed; (2) it lacks proper error handling for unsupported GPU_TYPE/CUDA_MAJOR combinations. While the simplification is good in principle, these issues could cause build failures in production.
  • The single file applications/object_detection_torch/Dockerfile requires attention for the logical ordering issue and missing error handling

Important Files Changed

File Analysis

Filename Score Overview
applications/object_detection_torch/Dockerfile 2/5 Simplified PyTorch installation by relying on base image, but has logical ordering issues with torch import before torchvision install and missing fallback handling

Sequence Diagram

sequenceDiagram
    participant Base as BASE_IMAGE (hsdk 3.10)
    participant Docker as Dockerfile Build
    participant Holohub as Holohub Setup
    participant PyTorch as PyTorch (pre-installed)
    participant Torch as Torchvision Install
    
    Note over Base: PyTorch assumed pre-installed
    Docker->>Base: FROM ${BASE_IMAGE}
    Docker->>Holohub: COPY holohub scripts
    Docker->>Holohub: Run holohub setup
    
    Note over Docker,PyTorch: Lines 35-37: Configure libtorch paths
    Docker->>PyTorch: import torch (get __path__)
    PyTorch-->>Docker: Returns torch installation path
    Docker->>Docker: Configure ldconfig with torch lib path
    
    Note over Docker,Torch: Lines 41-49: Install torchvision
    Docker->>Docker: Determine INDEX_URL based on GPU_TYPE/CUDA_MAJOR
    Docker->>Torch: pip install torchvision from INDEX_URL
    Torch-->>Docker: Installation complete
    
    Note over Docker: Build complete


@greptile-apps greptile-apps bot left a comment


1 file reviewed, 2 comments


Comment on lines 35 to 37
RUN echo $(python3 -c "import torch; print(torch.__path__[0])")/lib > /etc/ld.so.conf.d/libtorch.conf \
    && ldconfig \
    && ldconfig -p | grep -q "libtorch.so"

logic: this code assumes PyTorch is already installed in the base image, but torchvision installation happens later (line 41-49). If PyTorch isn't pre-installed or if there's a dependency between the two, this will fail with an ImportError.

Suggested change
RUN echo $(python3 -c "import torch; print(torch.__path__[0])")/lib > /etc/ld.so.conf.d/libtorch.conf \
    && ldconfig \
    && ldconfig -p | grep -q "libtorch.so"
# Verify PyTorch is available in base image
RUN python3 -c "import torch; print(f'PyTorch {torch.__version__} found')"
# Set up Holoscan SDK container libtorch to be found with ldconfig for app C++ build and runtime
RUN echo $(python3 -c "import torch; print(torch.__path__[0])")/lib > /etc/ld.so.conf.d/libtorch.conf \
    && ldconfig \
    && ldconfig -p | grep -q "libtorch.so"

Is PyTorch guaranteed to be pre-installed in the hsdk 3.10 base image, and are there any version compatibility requirements between PyTorch and torchvision?

Comment on lines +41 to +49
RUN INDEX_URL=""; \
    if [ "${GPU_TYPE}" = "igpu" ]; then \
        INDEX_URL="https://pypi.jetson-ai-lab.io/jp6/cu126"; \
    elif [ "${CUDA_MAJOR}" = "12" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu129"; \
    elif [ "${CUDA_MAJOR}" = "13" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu130"; \
    fi \
    && python3 -m pip install torchvision --no-cache-dir --index-url ${INDEX_URL}

logic: if INDEX_URL is empty (e.g., GPU_TYPE is not "igpu" and CUDA_MAJOR is not "12" or "13"), pip will attempt to install from the default PyPI index, which may not have the correct CUDA version of torchvision. Consider adding an else clause or validation.

Suggested change
RUN INDEX_URL=""; \
    if [ "${GPU_TYPE}" = "igpu" ]; then \
        INDEX_URL="https://pypi.jetson-ai-lab.io/jp6/cu126"; \
    elif [ "${CUDA_MAJOR}" = "12" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu129"; \
    elif [ "${CUDA_MAJOR}" = "13" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu130"; \
    fi \
    && python3 -m pip install torchvision --no-cache-dir --index-url ${INDEX_URL}
ARG GPU_TYPE
ARG CUDA_MAJOR
RUN INDEX_URL=""; \
    if [ "${GPU_TYPE}" = "igpu" ]; then \
        INDEX_URL="https://pypi.jetson-ai-lab.io/jp6/cu126"; \
    elif [ "${CUDA_MAJOR}" = "12" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu129"; \
    elif [ "${CUDA_MAJOR}" = "13" ]; then \
        INDEX_URL="https://download.pytorch.org/whl/cu130"; \
    else \
        echo "Error: Unsupported GPU_TYPE=${GPU_TYPE} or CUDA_MAJOR=${CUDA_MAJOR}"; \
        exit 1; \
    fi \
    && python3 -m pip install torchvision --no-cache-dir --index-url ${INDEX_URL}
