
Architecture Overview

Mohsin Mirza edited this page Mar 31, 2026 · 3 revisions


The perception pipeline is built around a central dispatcher that orchestrates a set of isolated ROS 2 Lifecycle Nodes, each responsible for one detection capability. An external brain (robot controller) communicates with the dispatcher via a single ROS 2 Action interface.


High-Level Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Brain / Robot Controller                 │
│                    ros2 run perception brain_client <task>      │
└──────────────────────────────┬──────────────────────────────────┘
                               │  ROS 2 Action (RunVision)
                               ▼
┌─────────────────────────────────────────────────────────────────┐
│               Perception Dispatcher (vision_manager_node)       │
│         Manages lifecycle state of all downstream nodes         │
└──┬──────────┬──────────┬──────────┬──────────┬──────────┬──────┘
   │          │          │          │          │          │
   ▼          ▼          ▼          ▼          ▼          ▼
Car Obj   Screw     Screwdriver  Subdoor  Table Ht  Place Obj
Pipeline  Pipeline  Pipeline     Node     Node      Node

The 6 Vision Tasks

As shown in the diagram above, the pipeline exposes six distinct vision tasks:

| # | Task | Nodes Involved | Output |
|---|------|----------------|--------|
| 1 | Car Object Detection | YOLO Car Node + Pose Node | `PoseStamped` on `/car/pose` |
| 2 | Place Object | Place Object Node | `PoseStamped` on `/perception/target_place_pose` |
| 3 | Subdoor Poses | Subdoor Pose Estimator | `MarkerArray` on `/body_markers` |
| 4 | Screw Detection | YOLO Screw Node + Pose Node | `PoseStamped` on `/screw/pose` |
| 5 | Screwdriver Detection | YOLO-OBB Node + Pose-OBB Node | `PoseStamped` on `/object_pose_screwdriver` |
| 6 | Table Height | Table Height RANSAC Node | `PoseStamped` on `/perception/table_pose` |
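The task-to-output mapping in the table above can be captured as a small lookup table. This is an illustrative sketch only: the task-name strings are assumptions for this example and may not match the exact strings accepted by the `RunVision` action.

```python
# Illustrative mapping of vision tasks to their result topic and message
# type, taken from the task table above. The task-name keys are assumed
# for this sketch; the real RunVision goal field may use different strings.
TASK_OUTPUTS = {
    "car_object_detection":  ("/car/pose", "PoseStamped"),
    "place_object":          ("/perception/target_place_pose", "PoseStamped"),
    "subdoor_poses":         ("/body_markers", "MarkerArray"),
    "screw_detection":       ("/screw/pose", "PoseStamped"),
    "screwdriver_detection": ("/object_pose_screwdriver", "PoseStamped"),
    "table_height":          ("/perception/table_pose", "PoseStamped"),
}

def output_topic(task: str) -> str:
    """Return the topic a given task publishes its result on."""
    topic, _msg_type = TASK_OUTPUTS[task]
    return topic
```

A brain-side client could use such a table to know which topic to subscribe to after sending a goal to the dispatcher.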

Node Inventory

Dispatcher

| Node | Executable | Role |
|------|------------|------|
| `perception_pipeline_node` | `vision_manager_node` | Central action server & lifecycle manager |

Lifecycle Nodes (Managed)

| Node Name | Executable | Task |
|-----------|------------|------|
| `yolo_car_node` | `action_yolo_node` | YOLO detection for car objects |
| `cropper_car_node` | `action_pose_node` | 3D pose from car detections |
| `yolo_screw_node` | `action_yolo_node` | YOLO detection for screws |
| `cropper_screw_node` | `action_pose_node` | 3D pose from screw detections |
| `obb_detector_node` | `obb_node` | OBB detection for screwdriver |
| `point_obb_cloud_cropper_node` | `pose_obb_node` | 3D pose from OBB detections |
| `subdoor_pose_estimator` | `subdoor_node` | Body/door pose estimation |
| `table_height_estimator` | `table_height` | RANSAC table height estimation |
| `place_object_node` | `place_object_node` | Safe placement zone detection |
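Since the dispatcher decides which managed nodes to bring up for each request, its routing can be sketched as a task-to-nodes map. This is a sketch under the assumption that each task activates exactly the nodes listed in the tables above; the task-name keys are illustrative, not taken from the action definition.

```python
# Assumed routing from vision task to the managed lifecycle nodes the
# dispatcher must activate for it, per the node inventory above.
# Task-name keys are illustrative placeholders.
TASK_NODES = {
    "car_object_detection":  ["yolo_car_node", "cropper_car_node"],
    "screw_detection":       ["yolo_screw_node", "cropper_screw_node"],
    "screwdriver_detection": ["obb_detector_node", "point_obb_cloud_cropper_node"],
    "subdoor_poses":         ["subdoor_pose_estimator"],
    "table_height":          ["table_height_estimator"],
    "place_object":          ["place_object_node"],
}

def nodes_for(task: str) -> list[str]:
    """Nodes the dispatcher would activate for a task ([] if unknown)."""
    return TASK_NODES.get(task, [])
```

Note that detection tasks (car, screw, screwdriver) each pair a 2D detector node with a 3D pose node, while the remaining tasks run a single node.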

Lifecycle State Machine

Each managed node follows the standard ROS 2 lifecycle:

Unconfigured ──configure──▶ Inactive ──activate──▶ Active
      ▲                        │   ▲                  │
      └────────cleanup─────────┘   └───deactivate─────┘

The dispatcher controls these transitions automatically based on the requested task. See Lifecycle Node System for details on eager vs lazy loading.
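The primary-state transitions above can be modeled as a tiny state machine. This is a minimal sketch of the standard ROS 2 lifecycle (primary states only; `shutdown` and the error/transition states are omitted), not the dispatcher's actual implementation.

```python
# Minimal model of the ROS 2 lifecycle transitions shown in the diagram.
# Only the four primary-state transitions are modeled; shutdown and
# error handling are omitted for brevity.
TRANSITIONS = {
    ("unconfigured", "configure"):  "inactive",
    ("inactive",     "activate"):   "active",
    ("active",       "deactivate"): "inactive",
    ("inactive",     "cleanup"):    "unconfigured",
}

class LifecycleNodeModel:
    def __init__(self) -> None:
        self.state = "unconfigured"

    def trigger(self, transition: str) -> str:
        """Apply a transition, returning the new state or raising if invalid."""
        key = (self.state, transition)
        if key not in TRANSITIONS:
            raise ValueError(f"invalid transition {transition!r} from {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state
```

The dispatcher effectively walks each managed node through `configure` → `activate` to serve a task, then `deactivate` (and possibly `cleanup`) when the task finishes.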


Package Structure

perception/
├── r4s/src/
│   ├── my_robot_interfaces/     # Custom ROS 2 action definition (RunVision.action)
│   └── perception/
│       ├── launch/              # Launch files
│       ├── models/              # YOLO model weights (~68MB)
│       └── perception/          # Python nodes
│           ├── action_vision_manager.py       # Dispatcher
│           ├── action_yolo_object_detection.py
│           ├── action_obb_object_detection.py
│           ├── action_pose_pca.py
│           ├── action_obb_pose.py
│           ├── table_height_ransac.py
│           ├── subdoor_pose_estimator.py
│           ├── place_object.py
│           └── brain_client.py
└── requirements.txt
