
Add nav2 perception pipeline (RGB → depth → point cloud)#1

Open
Vidyadharan98 wants to merge 22 commits into ros-navigation:main from Vidyadharan98:feature/nav2-perception-pipelines

Conversation

@Vidyadharan98

Summary

This PR adds the nav2_perception_pipelines package, which provides a configurable perception pipeline that converts RGB images into depth maps and point clouds.

The pipeline is modular and configurable, allowing users to swap components such as image sources and depth estimation models through a YAML configuration file.

All components run as ROS 2 composable nodes inside a single container with intra-process communication enabled for improved efficiency.


Key Features

  • End-to-end pipeline:
    image source → preprocessing → depth estimation → point cloud projection
  • Configurable image source and depth estimator to allow swapping with alternative implementations
  • Preprocessing parameters fully configurable via YAML
  • Launch file composes all nodes into a single component container with intra-process communication enabled
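For reference, the depth → point cloud projection step of such a pipeline typically applies the pinhole camera model to back-project each depth pixel into 3D using the camera intrinsics from `CameraInfo`. A minimal plain-Python sketch (hypothetical function, for illustration only — not the actual `nav2_perception_pipelines` code):

```python
# Back-project a depth map into 3D points with the pinhole camera model.
# fx, fy, cx, cy are the camera intrinsics (from CameraInfo); depth is in meters.
# Hypothetical illustration -- the real node operates on sensor_msgs images.

def depth_to_points(depth, fx, fy, cx, cy):
    """depth: 2D list of meters (rows of pixels). Returns list of (X, Y, Z)."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:               # skip invalid / missing depth readings
                continue
            x = (u - cx) * z / fx      # optical-frame X (right)
            y = (v - cy) * z / fy      # optical-frame Y (down)
            points.append((x, y, z))
    return points

# Example: a 2x2 depth image, principal point (0.5, 0.5), focal length 1.0.
# The zero-depth pixel is dropped, leaving three valid points.
pts = depth_to_points([[1.0, 2.0], [0.0, 4.0]], 1.0, 1.0, 0.5, 0.5)
```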

Demo

nav2_perception_pipe_demo_1.mp4

Issue reference

Implements the perception pipeline discussed in ros-navigation/navigation2#5536

…d config

Signed-off-by: Vidyadharan98 <vidyadharan98@gmail.com>
Member

@SteveMacenski SteveMacenski left a comment


So this generally looks good, but where's the example of it integrated into Nav2 (configuration file, tutorial launch file, etc.) and a video of it all working? This is a good first step to get it going; now it's about actually showing it in use in an example configuration, launch file, and short video.

Overall, a good first run!

Comment thread nav2_depth_estimation_ai/package.xml
Comment thread nav2_perception_pipelines/package.xml Outdated
Comment thread nav2_depth_estimation_ai/package.xml
Comment thread nav2_perception_pipelines/docs/nav2_perception_pipe_demo_1.mp4 Outdated
Comment thread README.md Outdated
Comment thread nav2_perception_pipelines/README.md Outdated
Comment thread nav2_perception_pipelines/README.md Outdated
Comment thread nav2_perception_pipelines/README.md Outdated
Comment thread nav2_depth_estimation_ai/README.md
Comment thread nav2_perception_pipelines/README.md Outdated
@SteveMacenski
Member

@sachinkum0009 @Vidyadharan98 Any update? I'd LOVE to have this in before Lyrical!

@Vidyadharan98
Author

@sachinkum0009 @Vidyadharan98 Any update? I'd LOVE to have this in before Lyrical!

Hello @SteveMacenski , I have made some updates but have yet to complete them. I've been occupied with some personal commitments lately 😅, but we will be able to complete this before Lyrical. Will update soon. Thank you.

@sachinkum0009

Hi, @SteveMacenski @Vidyadharan98

I have set up the TurtleBot3 with a camera and tried to run the basic implementation:
RGB → Depth → Point cloud → Voxel Costmap Layer (Nav2)

I have attached RViz screenshots showing the RGB and depth images, along with the point cloud and the Nav2 voxel-layer costmap.

Screenshot 2026-04-20 at 14 05 18 Screenshot 2026-04-20 at 14 03 59 tb3_setup

@SteveMacenski
Member

@sachinkum0009 any interest in helping drive this to the finish line? There are a few open comments, and then obviously using your videos and images. Thanks for the images (and hopefully a video navigating using it)!

@Vidyadharan98
Author

Hello @sachinkum0009 , thank you for sharing the demo 💙.

Hello @SteveMacenski , apologies for the delay. Could you please give me until April 27? I’d like to complete it by then.

@SteveMacenski
Member

OK!

@sachinkum0009

@sachinkum0009 any interest in helping drive this to the finish line? There are a few open comments, and then obviously using your videos and images. Thanks for the images (and hopefully a video navigating using it)!

Thanks, yes. I will do the navigation test today and record some videos.

@Vidyadharan98 Can you please invite me to your repo? I will take care of the comments and push this over the finish line. 😄

@sachinkum0009

Hi @SteveMacenski @Vidyadharan98

Please find the video attached of TB3 Navigation2.

tb3_nav2_dep_any.mp4

There are two issues I want to investigate:

  1. The point cloud data flickers a little; it could be because of the camera's auto focus, exposure, or white balance. (Will investigate in the coming days.)
  2. I am thinking of post-processing the point cloud to remove the ground plane and reduce noise using point cloud filters.

Let me know what your suggestions are for this.
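For the noise-removal idea, one common approach is a statistical outlier filter: drop points whose mean distance to their k nearest neighbors is well above the cloud-wide average. A rough pure-Python sketch of the idea (brute-force O(n²), illustration only — in practice something like PCL's `StatisticalOutlierRemoval` would be used on full-resolution clouds):

```python
import math

def statistical_outlier_filter(points, k=2, std_ratio=1.0):
    """Keep points whose mean k-NN distance is within mean + std_ratio * stddev.
    points: list of (x, y, z) tuples. Brute force -- a sketch, not production code."""
    mean_knn = []
    for p in points:
        # Sorted distances to every other point; average the k nearest.
        ds = sorted(math.dist(p, q) for q in points if q is not p)
        mean_knn.append(sum(ds[:k]) / k)

    mu = sum(mean_knn) / len(mean_knn)
    sigma = math.sqrt(sum((d - mu) ** 2 for d in mean_knn) / len(mean_knn))
    thresh = mu + std_ratio * sigma
    return [p for p, d in zip(points, mean_knn) if d <= thresh]

# A tight cluster plus one far-away noise point; the noise point is dropped.
cloud = [(0, 0, 0), (0.1, 0, 0), (0, 0.1, 0), (0.1, 0.1, 0), (5, 5, 5)]
filtered = statistical_outlier_filter(cloud, k=2, std_ratio=1.0)
```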

@SteveMacenski
Member

I think those are good suggestions! Maybe not removing the ground plane (the minimum height filter in the voxel layer should take care of that; maybe it just needs to be increased?), but removing noise may be good.
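For context, the voxel layer's height filtering is driven by per-observation-source parameters like these. A hedged example of typical `nav2_costmap_2d` voxel-layer settings (not this PR's actual file; topic name and values are assumptions that depend on the robot):

```yaml
local_costmap:
  local_costmap:
    ros__parameters:
      plugins: ["voxel_layer", "inflation_layer"]
      voxel_layer:
        plugin: "nav2_costmap_2d::VoxelLayer"
        enabled: true
        observation_sources: pointcloud
        pointcloud:
          topic: /points             # point cloud from the depth pipeline (assumed name)
          data_type: "PointCloud2"
          min_obstacle_height: 0.05  # raise this to reject ground-plane points
          max_obstacle_height: 2.0
          marking: true
          clearing: true
```

Raising `min_obstacle_height` is the knob Steve refers to above for filtering out the ground plane without an extra processing node.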

- End-to-end pipeline: image source → preprocessing → depth estimation →
  point cloud projection
- Image source and depth estimator are configurable to allow swapping
  with alternative implementations
- Preprocessing parameters are fully configurable via YAML
- Launch file composes all nodes into a single component container with
  intra-process communication enabled

Signed-off-by: Vidyadharan98 <vidyadharan98@gmail.com>
@Vidyadharan98 Vidyadharan98 force-pushed the feature/nav2-perception-pipelines branch from 1f27d65 to 9705f0a Compare April 26, 2026 03:38
@Vidyadharan98
Author

Hello @SteveMacenski ,
I’ve pushed my changes, but there are still a few comments that need to be addressed.

Hello @sachinkum0009 ,
Thank you for sharing the navigation demo video. I’ve sent you a collaboration invite—please check and accept it when you have a moment.

Thank you for your understanding!

Signed-off-by: Vidyadharan98 <vidyadharan98@gmail.com>
@Vidyadharan98 Vidyadharan98 force-pushed the feature/nav2-perception-pipelines branch from e3f9aac to 0f839ca Compare April 26, 2026 04:52
sachinkum0009 and others added 3 commits April 27, 2026 22:23
- params launch file for all nodes
- CMakelist updated to include params folder

Co-authored-by: Copilot <copilot@github.com>
Signed-off-by: Sachin Kumar <sachinkum123567@gmail.com>
Comment thread nav2_depth_estimation_ai/launch/perception_pipeline_launch.py Outdated
Comment thread nav2_depth_estimation_ai/launch/perception_pipeline_launch.py Outdated
Comment thread nav2_depth_estimation_ai/README.md Outdated
@SteveMacenski
Member

Check out the still open comments - there are a few :-) I think the launch file needs to be redone from scratch using standard ROS 2 launch formatting, with the tutorial README and configuration YAML then updated accordingly.

@sachinkum0009

sachinkum0009 commented Apr 28, 2026

To Do

  • Add license
  • Add launch file to use costmap
  • Update readme with instructions for costmap
  • Add PC filter to remove noise

Signed-off-by: Sachin Kumar <sachinkum123567@gmail.com>
- removed commented code

Signed-off-by: Sachin Kumar <sachinkum123567@gmail.com>
Comment thread nav2_depth_estimation_ai/launch/perception_pipeline.launch.py
Comment thread nav2_depth_estimation_ai/launch/perception_pipeline.launch.py
Comment thread nav2_depth_estimation_ai/params/nav2_depth_ai_params.yaml Outdated
Comment thread nav2_depth_estimation_ai/README.md Outdated
Comment thread nav2_depth_estimation_ai/README.md Outdated
Comment on lines +87 to +116
### Image Source

Defines the node responsible for providing the **input image stream** to the perception pipeline.

Example configuration:

```yaml
image_source:
  type: rgb
  package: usb_cam
  plugin: usb_cam::UsbCamNode
  parameters:
    video_device: /dev/video0
    image_width: 640
    image_height: 480
    pixel_format: mjpeg2rgb
    frame_rate: 30.0
  topics:
    output_topic: /image_raw
    camera_info_topic: /camera_info
```

| Parameter | Description |
| -------------------------- | -------------------------------------------------------------- |
| `type` | Specifies the input image type used by the pipeline. Supported types: `rgb` or `depth` |
| `package` | ROS 2 package that provides the image source node. |
| `plugin` | Fully qualified composable node plugin used to start the node. |
| `parameters` | Configuration parameters passed to the image source node. |
| `topics.output_topic` | Topic where the node publishes the image stream. |
| `topics.camera_info_topic` | Topic where the node publishes camera calibration information. |
Member


Old I think?


Yes, I don't think we need these explanations for these params now, as they are self-explanatory. Should I remove them?

Member


I don't think we still use this old launch file method; are there "package", "plugin", and "type" parameters?


No, these params were from the old YAML file.

Member

@SteveMacenski SteveMacenski Apr 29, 2026


I don't understand... The launch file now uses a consistent node, why does the README tutorial still have things like the following:


image_source:
  type: rgb
  package: usb_cam
  plugin: usb_cam::UsbCamNode
 
...

Comment thread nav2_depth_estimation_ai/README.md
@SteveMacenski
Member

Thanks @sachinkum0009 - looks better already :-) Let me know on the costmap parts, but nit-pick tweaks at this point from my review only.

sachinkum0009 and others added 8 commits April 28, 2026 20:59
Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>
Signed-off-by: Sachin Kumar <sachinkum123567@gmail.com>
Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>
Signed-off-by: Sachin Kumar <sachinkum123567@gmail.com>
Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>
Signed-off-by: Sachin Kumar <sachinkum123567@gmail.com>
- fixed the typo for remapping

Signed-off-by: Sachin Kumar <sachinkum123567@gmail.com>
Signed-off-by: Sachin Kumar <sachinkum123567@gmail.com>
Signed-off-by: Sachin Kumar <sachinkum123567@gmail.com>
Signed-off-by: Sachin Kumar <sachinkum123567@gmail.com>
- nav2 params updated with voxel costmap layer for tb3
- rviz config added to visualize the pointcloud
- exec dependencies added for the
- removed config dir and added rviz dir in cmakelists
- readme updated with instructions for costmap layer

Signed-off-by: Sachin Kumar <sachinkum123567@gmail.com>
@sachinkum0009

Hi @SteveMacenski

Most of the things have been fixed. I would like to ask for another review with suggestions.

  1. I will add and test the point cloud filter to remove noise tomorrow.
  2. I plan to test a different camera to see if the flickering still happens.

Also, I would like to ask which license should be added to the launch files?

Thanks

Comment thread nav2_depth_estimation_ai/launch/nav2_bringup.launch.py Outdated
Comment thread nav2_depth_estimation_ai/params/nav2_params_waffle_pi.yaml Outdated
Comment thread nav2_depth_estimation_ai/rviz/nav2_pipeline.rviz Outdated
Comment thread nav2_depth_estimation_ai/README.md Outdated
Comment thread nav2_depth_estimation_ai/README.md
@SteveMacenski
Member

Also, would like to ask which License should be added to the launch files?

Apache 2.0 in general, unless you have a reason to do otherwise; my preference is to be consistent with other Nav2 code.

sachinkum0009 and others added 2 commits May 3, 2026 11:43
Co-authored-by: Steve Macenski <stevenmacenski@gmail.com>
Signed-off-by: Sachin Kumar <sachinkum123567@gmail.com>
- Removed the launch file for the waffle as it doesn't fit
- Updated package and cmakelist
- Updated readme for users to integrate with their nav2 requirements.

Signed-off-by: Sachin Kumar <sachinkum123567@gmail.com>
@sachinkum0009

Hi @SteveMacenski

Thanks for the review. I have pushed the changes.

Member

@SteveMacenski SteveMacenski left a comment


More or less looks good to me. Do you think you could record a video where there isn't so much costmap noise in the scene before navigating (i.e. clearing the costmap from the previous test)? There's some noise in there I'd love to not have so that we can see the navigation more clearly without distracting noise.

I think with the few comments below, this is good to merge.

The next steps here would be to convert the README into a Nav2 tutorial and probably expand a little on the explanations. Imagine you're starting without knowing anything; explaining each of the technology items, how we put them together, and the steps. Maybe explain a bit more on the model / you can use any RGB camera / why we might want to do some pre- or post-processing / some of the key costmap configurations for this / etc.


This package provides a **perception pipeline** that uses AI-based depth estimation (DepthAnything V3) to generate depth from RGB images for use in navigation and mobility tasks.

The pipeline is designed to be **modular and configurable**, allowing users to swap components such as image sources and depth estimation models using a YAML configuration file.
Member


Suggested change
The pipeline is designed to be **modular and configurable**, allowing users to swap components such as image sources and depth estimation models using a YAML configuration file.
The pipeline is designed to be **modular and configurable**, allowing users to swap components such as image sources, pre- and post-processing nodes, and depth estimation models by modifying the launch file or configuration file.

@SteveMacenski
Member

This is still open #1 (comment). I think some of these config parameters are from the old version that aren't used anymore

@sachinkum0009

Thanks again for the review. I will update the old config parameters and apply the filter to remove the point cloud noise so it can be used to create a map.
