diff --git a/README.md b/README.md
index cd50e88..6a5c435 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,10 @@
 # navigation2.ai
 Nav2 Integrations, Contact Points, and Demos using AI
+
+
+## nav2_depth_estimation_ai
+
+A ROS 2 composable perception pipeline that integrates a camera driver, image preprocessing, AI-based depth estimation with Depth Anything V3, and point cloud projection into a modular component container for use in navigation.
+
+More info: [nav2_depth_estimation_ai README](nav2_depth_estimation_ai/README.md)
+
diff --git a/nav2_depth_estimation_ai/CMakeLists.txt b/nav2_depth_estimation_ai/CMakeLists.txt
new file mode 100644
index 0000000..9056dc8
--- /dev/null
+++ b/nav2_depth_estimation_ai/CMakeLists.txt
@@ -0,0 +1,12 @@
+cmake_minimum_required(VERSION 3.8)
+project(nav2_depth_estimation_ai)
+
+find_package(ament_cmake REQUIRED)
+
+install(DIRECTORY
+  launch
+  params
+  DESTINATION share/${PROJECT_NAME}
+)
+
+ament_package()
\ No newline at end of file
diff --git a/nav2_depth_estimation_ai/README.md b/nav2_depth_estimation_ai/README.md
new file mode 100644
index 0000000..2343f9a
--- /dev/null
+++ b/nav2_depth_estimation_ai/README.md
@@ -0,0 +1,293 @@
+# nav2_depth_estimation_ai
+
+This package provides a **perception pipeline** that uses AI-based depth estimation with Depth Anything V3 to infer depth from RGB images for navigation and mobility tasks.
+
+The pipeline is designed to be **modular and configurable**, allowing users to swap components such as image sources, pre- and post-processing nodes, and depth estimation models by modifying the launch file or configuration file.
+
+All components run as **ROS 2 composable nodes** inside a single container for efficient intra-process communication.
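The last hop of this chain, depth image to point cloud, is pinhole back-projection. The standalone numpy sketch below illustrates the math that `depth_image_proc` performs internally; the intrinsics and image size here are made-up illustrative values, not parameters of this package:

```python
# Illustrative back-projection of a metric depth image into 3D points.
# This mirrors what depth_image_proc does; intrinsics here are made up.
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Convert an HxW metric depth image into an Nx3 array of XYZ points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # per-pixel coordinates
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop invalid (zero-depth) pixels

depth = np.zeros((4, 4), dtype=np.float32)
depth[2, 3] = 2.0  # a single valid pixel at (u=3, v=2), 2 m away
print(depth_to_points(depth, fx=500.0, fy=500.0, cx=2.0, cy=2.0))
# the single surviving point is (x, y, z) = (0.004, 0.0, 2.0)
```

Every pixel with a nonzero depth becomes one point, which is why the depth node's downsample factor directly controls cloud size.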
+
+## Pipeline Architecture
+
+```mermaid
+graph LR
+
+Source[Image Source] -->|Image Topic| Preprocess[Image Preprocessing]
+
+Preprocess -->|Processed Image| Depth[Depth Estimator]
+
+Depth -->|Depth Image| Projection[PointCloud Projection]
+
+Projection -->|PointCloud| Nav2Costmap((Nav2Costmap))
+```
+
+The pipeline performs the following transformations:
+
+1. RGB images are captured from an image source.
+2. Optional preprocessing (crop/resize/decimation) is applied.
+3. A depth estimation model generates a depth map.
+4. The depth map is converted into a **3D point cloud**.
+
+The resulting point cloud can be used by **Nav2 perception pipelines, mapping systems, or obstacle detection modules**.
+
+## Demo
+
+Perception pipeline that generates depth maps and point clouds from RGB input.
+
+https://github.com/user-attachments/assets/12ce2808-099f-4718-b8c1-1de120bb601a
+
+## Dependencies
+
+### Core Dependencies
+The following packages are required for the basic pipeline:
+
+- `image_proc` – Used for image preprocessing operations.
+- `depth_image_proc` – Used to project depth images into point clouds.
+
+### Example Dependencies
+The pipeline can be configured with different nodes. A typical setup may include:
+
+- `usb_cam` as the **RGB image source**
+- [depth_anything_v3](https://github.com/ika-rwth-aachen/ros2-depth-anything-v3-trt) as the **depth estimation model**
+
+To build `depth_anything_v3` from source, follow these instructions:
+
+```bash
+cd ~/ros2_ws/src
+
+git clone https://github.com/ika-rwth-aachen/ros2-depth-anything-v3-trt.git
+
+cd ..
+
+rosdep install --from-paths src --ignore-src -r -y
+
+# From your ROS 2 workspace
+colcon build --packages-select depth_anything_v3 --cmake-args -DCMAKE_BUILD_TYPE=Release
+
+source install/setup.bash
+```
+
+## Model Preparation
+1. Obtain the ONNX model (two options):
+   A. Download the ONNX file from [Huggingface](https://huggingface.co/TillBeemelmanns/Depth-Anything-V3-ONNX)
+   B. Generate ONNX following the instructions [here](https://github.com/ika-rwth-aachen/ros2-depth-anything-v3-trt/blob/main/onnx/README.md)
+2. Update the configuration: modify `params/nav2_depth_ai_params.yaml` with the correct model path
+
+```yaml
+depth_anything_v3:
+  ros__parameters:
+    # Model configuration
+    onnx_path: "$(find-pkg-share depth_anything_v3)/models/DA3METRIC-LARGE.onnx"
+    precision: "fp16"  # fp16 or fp32
+```
+
+3. (Optional) Generate a TensorRT engine for optimized inference (if using the TensorRT backend):
+
+```bash
+./src/ros2-depth-anything-v3-trt/generate_engines.sh
+```
+
+For the upstream build instructions, see:
+
+https://github.com/ika-rwth-aachen/ros2-depth-anything-v3-trt/tree/main#building
+
+---
+
+### Image Source
+
+Defines the node responsible for providing the **input image stream** to the perception pipeline.
+
+Example configuration:
+
+```yaml
+image_source:
+  type: rgb
+  package: usb_cam
+  plugin: usb_cam::UsbCamNode
+  parameters:
+    video_device: /dev/video0
+    image_width: 640
+    image_height: 480
+    pixel_format: mjpeg2rgb
+    frame_rate: 30.0
+  topics:
+    output_topic: /image_raw
+    camera_info_topic: /camera_info
+```
+
+Any camera with a ROS 2 driver may be used instead.
+
+---
+
+### Image Preprocessing
+
+Image preprocessing can be enabled to crop, decimate, or resize the image before depth estimation.
+
+```yaml
+usb_cam:
+  ros__parameters:
+    video_device: /dev/video0
+    image_width: 640
+    image_height: 480
+    pixel_format: mjpeg2rgb
+    frame_rate: 30.0
+
+crop_decimate:
+  ros__parameters:
+    x_offset: 0
+    y_offset: 0
+    width: 640
+    height: 480
+    decimation_x: 1
+    decimation_y: 1
+
+resize:
+  ros__parameters:
+    width: 504
+    height: 280
+```
+
+Preprocessing nodes used:
+
+* `image_proc::CropDecimateNode`
+* `image_proc::ResizeNode`
+
+However, other preprocessing nodes can easily be added to the pipeline if necessary.
+
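Note that resizing (or decimating) an image also changes the effective camera intrinsics, which is why the resize node republishes a correspondingly scaled `camera_info` alongside the resized image. The scaling itself can be sketched in plain Python; the focal lengths and principal point below are illustrative values, while the sizes match the 640x480 to 504x280 resize above:

```python
# Illustrative sketch: pinhole intrinsics scale linearly with image resolution.
# fx/fy/cx/cy values are made up; the sizes match the resize example above.
def scale_intrinsics(fx, fy, cx, cy, old_size, new_size):
    """Scale pinhole intrinsics from old (width, height) to new (width, height)."""
    ow, oh = old_size
    nw, nh = new_size
    # Horizontal terms scale by nw/ow, vertical terms by nh/oh.
    return fx * nw / ow, fy * nh / oh, cx * nw / ow, cy * nh / oh

print(scale_intrinsics(600.0, 600.0, 320.0, 240.0,
                       old_size=(640, 480), new_size=(504, 280)))
# -> (472.5, 350.0, 252.0, 140.0)
```

Mismatched intrinsics after preprocessing are a common cause of distorted point clouds, which is why the downstream nodes consume `/resize/camera_info` rather than the raw camera's.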
+---
+
+### Depth Estimator
+
+If the input type is **RGB**, a depth estimation model is used to generate a depth image from the incoming RGB frames.
+
+Example configuration:
+
+```yaml
+depth_anything_v3:
+  ros__parameters:
+    # Model configuration
+    onnx_path: "$(find-pkg-share depth_anything_v3)/models/DA3METRIC-LARGE.onnx"
+    precision: "fp16"  # fp16 or fp32
+
+    # Debug configuration
+    enable_debug: true
+    debug_colormap: "JET"  # JET, HOT, COOL, SPRING, SUMMER, AUTUMN, WINTER, BONE, GRAY, HSV, PARULA, PLASMA, INFERNO, VIRIDIS, MAGMA, CIVIDIS
+    debug_filepath: "/tmp/depth_anything_v3_debug/"
+    write_colormap: false
+    debug_colormap_min_depth: 0.0  # Minimum depth value for colormap visualization
+    debug_colormap_max_depth: 50.0  # Maximum depth value for colormap visualization
+    sky_threshold: 0.3  # Threshold for sky classification (lower = more sky)
+    sky_depth_cap: 200.0  # Maximum depth value to fill sky regions
+
+    # Point cloud downsampling (1 = no downsampling, 10 = every 10th point)
+    point_cloud_downsample_factor: 2
+
+    # Point cloud colorization with RGB from input image
+    colorize_point_cloud: true  # Set to true to publish RGB point cloud instead of XYZ only
+```
+
+---
+
+## Running the Pipeline
+
+Launch the pipeline:
+
+```bash
+ros2 launch nav2_depth_estimation_ai perception_pipeline.launch.py
+```
+
+All nodes run inside a **ComposableNodeContainer**.
+
+---
+
+## Output Topics
+
+| Topic                          | Description                        |
+| ------------------------------ | ---------------------------------- |
+| `/pipeline/image_raw`          | Raw image from camera              |
+| `/pipeline/image_preprocessed` | Preprocessed image                 |
+| `/pipeline/depth`              | Depth image generated by the model |
+| `/pipeline/points`             | Generated 3D point cloud           |
+
+
+## Nav2 Costmap Integration
+
+The generated point cloud can be integrated into Nav2's costmap layer for obstacle detection and navigation.
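Whether a point is allowed to mark an obstacle depends on the height and range limits given to the observation source; points outside those bounds are simply ignored. A simplified, standalone sketch of that filtering (not Nav2 source code; the range check here uses planar XY distance as an approximation):

```python
# Simplified sketch of costmap observation filtering (not Nav2 source code).
import numpy as np

def marking_points(points, min_h, max_h, min_range, max_range):
    """Keep points whose height and planar range fall inside the bounds."""
    z = points[:, 2]
    rng = np.hypot(points[:, 0], points[:, 1])  # planar distance from sensor
    keep = (z >= min_h) & (z <= max_h) & (rng >= min_range) & (rng <= max_range)
    return points[keep]

pts = np.array([
    [1.0, 0.0, 0.50],  # obstacle 1 m ahead, 0.5 m high -> kept
    [2.0, 0.0, 0.05],  # ground return below min_obstacle_height -> dropped
    [9.0, 0.0, 1.00],  # beyond obstacle_max_range (8 m) -> dropped
])
print(marking_points(pts, min_h=0.2, max_h=2.0, min_range=0.0, max_range=8.0))
# only the first point survives
```

Raising `min_obstacle_height` suppresses ground returns from noisy monocular depth, at the cost of missing low obstacles.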
Below is an example configuration for the local costmap to use the point cloud from the pipeline:
+
+```yaml
+local_costmap:
+  local_costmap:
+    ros__parameters:
+      update_frequency: 5.0
+      publish_frequency: 2.0
+      global_frame: odom
+      robot_base_frame: base_link
+      rolling_window: true
+      width: 3
+      height: 3
+      resolution: 0.05
+      robot_radius: 0.15
+      plugins: ["voxel_layer", "inflation_layer"]
+      inflation_layer:
+        plugin: "nav2_costmap_2d::InflationLayer"
+        inflation_radius: 0.5
+        cost_scaling_factor: 5.0
+      voxel_layer:
+        plugin: "nav2_costmap_2d::VoxelLayer"
+        enabled: true
+        publish_voxel_map: true
+        origin_z: 0.0
+        z_resolution: 0.05
+        z_voxels: 16
+        max_obstacle_height: 2.0
+        mark_threshold: 0
+        observation_sources: pointcloud
+        pointcloud:
+          topic: /pipeline/points
+          max_obstacle_height: 2.0
+          min_obstacle_height: 0.2
+          obstacle_max_range: 8.0
+          obstacle_min_range: 0.0
+          raytrace_max_range: 6.0
+          raytrace_min_range: 0.0
+          clearing: True
+          marking: True
+          data_type: "PointCloud2"
+```
+
+Users can adjust costmap parameters such as `obstacle_max_range`, `obstacle_min_range`, and `max_obstacle_height` in the Nav2 configuration based on their specific robot and environment requirements.
+
+---
+
+## Troubleshooting
+
+### 1. Depth estimator dependency mismatch
+
+If you are using `depth_anything_v3`, ensure that the dependency versions match those required by the package.
+
+Refer to the official repository for tested dependencies:
+
+https://github.com/ika-rwth-aachen/ros2-depth-anything-v3-trt#dependencies
+
+Version mismatches (e.g., TensorRT, CUDA) may prevent the depth estimator from loading or running correctly.
+
+---
+
+### 2. Point cloud not visualizing
+
+If the pipeline is running and point cloud messages are being published but no data appears in RViz or visualization tools, check the `camera_info` topic.
+
+You can verify it using:
+
+```bash
+ros2 topic echo /pipeline/camera_info
+```
+
+If the intrinsic camera parameters are all zeros, `depth_image_proc` will not be able to correctly project the depth image into a point cloud.
+
+Ensure that:
+- The camera is properly calibrated
+- A valid `camera_info` message is being published
+
+---
+
+## Note to Future Contributors
+
+If any changes are made to the pipeline architecture, configuration structure, or node interfaces, please update the README and documentation accordingly to keep them consistent with the implementation.
+
diff --git a/nav2_depth_estimation_ai/launch/perception_pipeline.launch.py b/nav2_depth_estimation_ai/launch/perception_pipeline.launch.py
new file mode 100644
index 0000000..eb6ff53
--- /dev/null
+++ b/nav2_depth_estimation_ai/launch/perception_pipeline.launch.py
@@ -0,0 +1,306 @@
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from launch import LaunchDescription
+from launch_ros.actions import LoadComposableNodes, Node
+from launch_ros.descriptions import ComposableNode, ParameterFile
+from launch.conditions import IfCondition
+from ament_index_python.packages import get_package_share_directory
+from launch.substitutions import LaunchConfiguration, PathJoinSubstitution, PythonExpression
+from launch.actions import DeclareLaunchArgument, GroupAction, SetEnvironmentVariable
+from nav2_common.launch import RewrittenYaml
+
+
+def generate_launch_description() -> LaunchDescription:
+    # Get the launch directory
+    nav2_depth_ai_dir = get_package_share_directory("nav2_depth_estimation_ai")
+
+    # Create the launch configuration variables
+    container_name = LaunchConfiguration("container_name")
+    params_file = LaunchConfiguration("params_file")
+    use_intra_process_comms = LaunchConfiguration("use_intra_process_comms")
+    use_usb_cam = LaunchConfiguration("use_usb_cam")
+    use_depth_anything = LaunchConfiguration("use_depth_anything")
+    use_composition = LaunchConfiguration("use_composition")
+    use_respawn = LaunchConfiguration("use_respawn")
+    log_level = LaunchConfiguration("log_level")
+
+    configured_params = ParameterFile(
+        RewrittenYaml(
+            source_file=params_file,
+            param_rewrites={},
+        ),
+        allow_substs=True,
+    )
+
+    stdout_linebuf_envvar = SetEnvironmentVariable(
+        "RCUTILS_LOGGING_BUFFERED_STREAM", "1"
+    )
+
+    declare_params_file_cmd = DeclareLaunchArgument(
+        "params_file",
+        default_value=PathJoinSubstitution(
+            [nav2_depth_ai_dir, "params", "nav2_depth_ai_params.yaml"]
+        ),
+        description="Full path to the ROS 2 parameters file to use for all launched nodes",
+    )
+
+    declare_use_usb_cam_cmd = DeclareLaunchArgument(
+        "use_usb_cam",
+        default_value="True",
+        description="Whether to use the usb_cam driver",
+    )
+
+    declare_use_depth_anything_cmd = DeclareLaunchArgument(
+        "use_depth_anything",
+        default_value="True",
+        description="Whether to use Depth Anything",
+    )
+
+    declare_use_composition_cmd = DeclareLaunchArgument(
+        "use_composition",
+        default_value="True",
+        description="Whether to use composed bringup",
+    )
+
+    declare_use_intra_process_comms_cmd = DeclareLaunchArgument(
+        "use_intra_process_comms",
+        default_value="True",
+        description="Whether to use intra-process communications",
+    )
+
+    declare_container_name_cmd = DeclareLaunchArgument(
+        "container_name",
+        default_value="nav2_depth_ai_container",
+        description="The name of the container that nodes will load into when using composition",
+    )
+
+    declare_use_respawn_cmd = DeclareLaunchArgument(
+        "use_respawn",
+        default_value="False",
+        description="Whether to respawn if a node crashes. Applied when composition is disabled.",
+    )
+
+    declare_log_level_cmd = DeclareLaunchArgument(
+        "log_level", default_value="info", description="Log level"
+    )
+
+    load_nodes = GroupAction(
+        condition=IfCondition(PythonExpression(["not ", use_composition])),
+        actions=[
+            # Replace with your sensor driver as you see fit (e.g. realsense)
+            Node(
+                package="usb_cam",
+                name="usb_cam",
+                executable="usb_cam_exe",
+                output="screen",
+                respawn=use_respawn,
+                parameters=[configured_params],
+                # remappings=[
+                #     ("/camera_info", "/pipeline/camera_info"),
+                #     ("/image_raw", "/pipeline/image_raw"),
+                # ],
+                arguments=["--ros-args", "--log-level", log_level],
+                condition=IfCondition(use_usb_cam),
+            ),
+            Node(
+                package="image_proc",
+                name="crop_decimate",
+                executable="crop_decimate_node",
+                output="screen",
+                respawn=use_respawn,
+                parameters=[configured_params],
+                remappings=[
+                    ("/in/camera_info", "/camera_info"),
+                    ("/in/image_raw", "/image_raw"),
+                    ("/out/camera_info", "/image_crop_decimate/camera_info"),
+                    ("/out/image_raw", "/image_crop_decimate/image"),
+                ],
+                arguments=["--ros-args", "--log-level", log_level],
+            ),
+            Node(
+                package="image_proc",
+                name="resize",
+                executable="resize_node",
+                output="screen",
+                respawn=use_respawn,
+                parameters=[configured_params],
+                remappings=[
+                    ("/image/camera_info", "/image_crop_decimate/camera_info"),
+                    ("/image/image_raw", "/image_crop_decimate/image"),
+                ],
+                arguments=["--ros-args", "--log-level", log_level],
+            ),
+            Node(
+                package="depth_anything_v3",
+                name="depth_anything_v3",
+                executable="depth_anything_v3_exe",
+                output="screen",
+                respawn=use_respawn,
+                parameters=[configured_params],
+                remappings=[
+                    ("~/input/camera_info", "/resize/camera_info"),
+                    ("~/input/image", "/resize/image_raw"),
+                    ("~/output/depth_image", "depth_image"),
+                ],
+                arguments=["--ros-args", "--log-level", log_level],
+                condition=IfCondition(use_depth_anything),
+            ),
+            Node(
+                package="depth_image_proc",
+                name="pointcloud",
+                executable="point_cloud_xyzrgb_node",
+                output="screen",
+                respawn=use_respawn,
+                parameters=[configured_params],
+                remappings=[
+                    ("rgb/camera_info", "/resize/camera_info"),
+                    ("rgb/image_rect_color", "/resize/image_raw"),
+                    ("depth_registered/image_rect", "depth_image"),
+                    ("points", "/pipeline/points"),
+                ],
+                arguments=["--ros-args", "--log-level", log_level],
+            ),
+        ],
+    )
+
+    load_composable_nodes = GroupAction(
+        condition=IfCondition(use_composition),
+        actions=[
+            LoadComposableNodes(
+                target_container=container_name,
+                composable_node_descriptions=[
+                    # Replace with your sensor driver as you see fit (e.g. realsense)
+                    ComposableNode(
+                        package="usb_cam",
+                        plugin="usb_cam::UsbCamNode",
+                        name="usb_cam",
+                        parameters=[configured_params],
+                        # remappings=[
+                        #     ("/camera_info", "/pipeline/camera_info"),
+                        #     ("/image_raw", "/pipeline/image_raw"),
+                        # ],
+                        extra_arguments=[
+                            {"use_intra_process_comms": use_intra_process_comms}
+                        ],
+                        condition=IfCondition(use_usb_cam),
+                    ),
+                    ComposableNode(
+                        package="image_proc",
+                        plugin="image_proc::CropDecimateNode",
+                        name="crop_decimate",
+                        parameters=[configured_params],
+                        remappings=[
+                            ("/in/camera_info", "/camera_info"),
+                            ("/in/image_raw", "/image_raw"),
+                            ("/out/camera_info", "/image_crop_decimate/camera_info"),
+                            ("/out/image_raw", "/image_crop_decimate/image"),
+                        ],
+                        extra_arguments=[
+                            {"use_intra_process_comms": use_intra_process_comms}
+                        ],
+                    ),
+                    ComposableNode(
+                        package="image_proc",
+                        plugin="image_proc::ResizeNode",
+                        name="resize",
+                        parameters=[configured_params],
+                        remappings=[
+                            ("/image/camera_info", "/image_crop_decimate/camera_info"),
+                            ("/image/image_raw", "/image_crop_decimate/image"),
+                            # NOTE: this camera_info topic is not remapped
+                            # (
+                            #     "/resize/camera_info",
+                            #     "/pipeline/camera_info_preprocessed",
+                            # ),
+                            # ("/resize/image_raw", "/pipeline/image_preprocessed"),
+                            # (
+                            #     "/resize/image_raw/compressed",
+                            #     "/pipeline/image_preprocessed/compressed",
+                            # ),
+                        ],
+                        extra_arguments=[
+                            {"use_intra_process_comms": use_intra_process_comms}
+                        ],
+                    ),
+                    ComposableNode(
+                        package="depth_anything_v3",
+                        plugin="depth_anything_v3::DepthAnythingV3Node",
+                        name="depth_anything_v3",
+                        parameters=[configured_params],
+                        remappings=[
+                            ("~/input/camera_info", "/resize/camera_info"),
+                            ("~/input/image", "/resize/image_raw/compressed"),
+                            ("~/output/depth_image", "depth_image"),
+                        ],
+                        extra_arguments=[
+                            {"use_intra_process_comms": use_intra_process_comms}
+                        ],
+                        condition=IfCondition(use_depth_anything),
+                    ),
+                    ComposableNode(
+                        package="depth_image_proc",
+                        plugin="depth_image_proc::PointCloudXyzrgbNode",
+                        name="pointcloud",
+                        parameters=[configured_params],
+                        remappings=[
+                            ("rgb/camera_info", "/resize/camera_info"),
+                            ("rgb/image_rect_color", "/resize/image_raw"),
+                            ("depth_registered/image_rect", "depth_image"),
+                            ("points", "/pipeline/points"),
+                        ],
+                        extra_arguments=[
+                            {"use_intra_process_comms": use_intra_process_comms}
+                        ],
+                    ),
+                ],
+            )
+        ],
+    )
+
+    container = Node(
+        package="rclcpp_components",
+        executable="component_container",
+        name=container_name,
+        output="screen",
+        parameters=[{"use_intra_process_comms": use_intra_process_comms}],
+        arguments=["--ros-args", "--log-level", log_level],
+        condition=IfCondition(use_composition),
+    )
+
+    # Create the launch description and populate
+    ld = LaunchDescription()
+
+    # Set environment variables
+    ld.add_action(stdout_linebuf_envvar)
+
+    # Declare the launch options
+    ld.add_action(declare_params_file_cmd)
+    ld.add_action(declare_use_usb_cam_cmd)
+    ld.add_action(declare_use_depth_anything_cmd)
+    ld.add_action(declare_use_composition_cmd)
+    ld.add_action(declare_use_intra_process_comms_cmd)
+    ld.add_action(declare_container_name_cmd)
+    ld.add_action(declare_use_respawn_cmd)
+    ld.add_action(declare_log_level_cmd)
+
+    # Add the actions to launch
+    ld.add_action(load_nodes)
+    ld.add_action(container)
+    ld.add_action(load_composable_nodes)
+
+    return ld
diff --git a/nav2_depth_estimation_ai/package.xml b/nav2_depth_estimation_ai/package.xml
new file mode 100644
index 0000000..182b475
--- /dev/null
+++ b/nav2_depth_estimation_ai/package.xml
@@ -0,0 +1,26 @@
+<?xml version="1.0"?>
+<?xml-model href="http://download.ros.org/schema/package_format3.xsd" schematypens="http://www.w3.org/2001/XMLSchema"?>
+<package format="3">
+  <name>nav2_depth_estimation_ai</name>
+  <version>0.1.0</version>
+  <description>A ROS 2 composable perception pipeline that integrates a camera driver,
+    image preprocessing, depth estimation, and point cloud projection into a modular component
+    container for navigation use.</description>
+  <maintainer>Vidyadharan</maintainer>
+  <maintainer>Steve Macenski</maintainer>
+  <license>Apache-2.0</license>
+
+  <buildtool_depend>ament_cmake</buildtool_depend>
+
+  <exec_depend>depth_image_proc</exec_depend>
+  <exec_depend>depth_anything_v3</exec_depend>
+  <exec_depend>image_proc</exec_depend>
+  <exec_depend>usb_cam</exec_depend>
+
+  <test_depend>ament_lint_auto</test_depend>
+  <test_depend>ament_lint_common</test_depend>
+
+  <export>
+    <build_type>ament_cmake</build_type>
+  </export>
+</package>
\ No newline at end of file
diff --git a/nav2_depth_estimation_ai/params/nav2_depth_ai_params.yaml b/nav2_depth_estimation_ai/params/nav2_depth_ai_params.yaml
new file mode 100644
index 0000000..f52bda0
--- /dev/null
+++ b/nav2_depth_estimation_ai/params/nav2_depth_ai_params.yaml
@@ -0,0 +1,54 @@
+usb_cam:
+  ros__parameters:
+    video_device: /dev/video0
+    image_width: 640
+    image_height: 480
+    pixel_format: mjpeg2rgb
+    frame_rate: 30.0
+
+crop_decimate:
+  ros__parameters:
+    x_offset: 0
+    y_offset: 0
+    width: 640
+    height: 480
+    decimation_x: 1
+    decimation_y: 1
+
+resize:
+  ros__parameters:
+    width: 504
+    height: 280
+
+
+depth_anything_v3:
+  ros__parameters:
+    # Model configuration
+    onnx_path: "$(find-pkg-share depth_anything_v3)/models/DA3METRIC-LARGE.onnx"
+    precision: "fp16"  # fp16 or fp32
+
+    # Debug configuration
+    enable_debug: false
+    debug_colormap: "JET"  # JET, HOT, COOL, SPRING, SUMMER, AUTUMN, WINTER, BONE, GRAY, HSV, PARULA, PLASMA, INFERNO, VIRIDIS, MAGMA, CIVIDIS
+    debug_filepath: "/tmp/depth_anything_v3_debug/"
+    write_colormap: false
+    debug_colormap_min_depth: 0.0  # Minimum depth value for colormap visualization
+    debug_colormap_max_depth: 50.0  # Maximum depth value for colormap visualization
+    sky_threshold: 0.3  # Threshold for sky classification (lower = more sky)
+    sky_depth_cap: 200.0  # Maximum depth value to fill sky regions
+
+    # Point cloud downsampling (1 = no downsampling, 10 = every 10th point)
+    point_cloud_downsample_factor: 2
+
+    # Point cloud colorization with RGB from input image
+    colorize_point_cloud: true  # Set to true to publish RGB point cloud instead of XYZ only
+
+
+pointcloud:
+  ros__parameters:
+    image_transport: "raw"  # "raw" or "compressed"
+    depth_image_transport: "raw"  # "raw" or "compressed"
+    queue_size: 10
+    invalid_depth: 0.0  # Depth value to use for invalid points (e.g., sky)
+    colorize: true  # Set to true to publish RGB point cloud instead of XYZ only
+    exact_sync: true  # Set to true to use exact sync, false for approximate synchronization
\ No newline at end of file