Learned Scooping

1. Motivation

We have designed a model-based scooping method via motion control with a minimalist hardware design: a two-fingered parallel-jaw gripper with a fixed-length finger and a variable-length thumb. When executed in a bin scenario, it requires instance segmentation using Mask R-CNN and pose estimation using Open3D 0.7.0.0. Moreover, the model analyzes a single object on a flat surface and cannot capture the complex interactions in a 3-D environment. For a heterogeneous cluster of unseen objects, the previous model-based method is difficult to apply. Thus, we design a supervised hierarchical learning framework that predicts the parameters of the scooping action directly from the RGB-D image of the bin scenario. Here are some video clips of the experiments.

2. Our learning framework

There are five parameters to be predicted: the finger position 𝑝, the horizontal distance between the two fingers 𝑑, and the ZYX Euler angle representation of the gripper orientation: yaw 𝛼, pitch 𝛽, and roll 𝛾. We design a hierarchical three-tier learning method whose input is the RGB-D image of the bin scenario. Tier 1 predicts the finger position 𝑝 and the yaw 𝛼. Tier 2 predicts the distance 𝑑. Tier 3 predicts the remaining two parameters, pitch 𝛽 and roll 𝛾. The structure is illustrated in the framework figure.
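To make the data flow concrete, here is a minimal sketch of how the three tiers compose at inference time. The function names and signatures are placeholders for the trained networks, not the repository's actual API.

```python
# Minimal sketch of the three-tier prediction pipeline at inference time.
# tier1/tier2/tier3 are placeholders for the trained networks, not the repo's API.
def predict_scoop_params(rgbd_heightmap, tier1, tier2, tier3):
    """Predict the five scooping parameters from an RGB-D heightmap."""
    # Tier 1: finger position p (a pixel location) and gripper yaw alpha.
    p, alpha = tier1(rgbd_heightmap)

    # Tier 2: horizontal finger-thumb distance d, conditioned on p and alpha.
    d = tier2(rgbd_heightmap, p, alpha)

    # Tier 3: remaining orientation angles, pitch beta and roll gamma.
    beta, gamma = tier3(rgbd_heightmap, p, alpha, d)

    return {"p": p, "alpha": alpha, "d": d, "beta": beta, "gamma": gamma}
```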

3. Prerequisites

3.1 Hardware

3.2 Software

This implementation requires the following dependencies (tested on Ubuntu 16.04 LTS):

4. Making Heightmap and Annotating Software

1. Make Heightmap

We use a RealSense L515 camera to capture an RGB image and a depth image, and combine them to build the RGB-D heightmap. A heightmap is an RGB-D image obtained from a 3-D point cloud that describes the 3-D information of the bin scenario: each pixel corresponds linearly to a horizontal position in the world frame and stores a value indicating the height from the bin bottom.

python utils/heightmap.py

Here is the set of heightmaps describing the cluster of Go stones, domino blocks, and acrylic boards: image set
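As a rough illustration of the projection described above, the sketch below builds an RGB-D heightmap from a colored point cloud. The workspace limits, resolution, and function name are assumptions made for this example; the actual implementation is in utils/heightmap.py.

```python
# Sketch: orthographic projection of a colored point cloud into an RGB-D heightmap.
# Workspace limits and resolution are placeholder values, not those of the repo.
import numpy as np

def make_heightmap(points, colors, workspace, resolution=0.002):
    """points: (N, 3) XYZ in the world frame; colors: (N, 3) RGB in [0, 1];
    workspace: ((x_min, x_max), (y_min, y_max), z_bottom) of the bin."""
    (x_min, x_max), (y_min, y_max), z_bottom = workspace
    w = int(round((x_max - x_min) / resolution))
    h = int(round((y_max - y_min) / resolution))
    color_map = np.zeros((h, w, 3), dtype=np.float32)
    height_map = np.zeros((h, w), dtype=np.float32)

    # Each pixel is in linear relation to its horizontal position in the world frame.
    cols = ((points[:, 0] - x_min) / resolution).astype(int)
    rows = ((points[:, 1] - y_min) / resolution).astype(int)
    valid = (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)

    # The height channel stores the height from the bin bottom; keep the highest
    # point that falls into each pixel.
    heights = points[:, 2] - z_bottom
    for r, c, z, rgb in zip(rows[valid], cols[valid], heights[valid], colors[valid]):
        if z > height_map[r, c]:
            height_map[r, c] = z
            color_map[r, c] = rgb
    return color_map, height_map
```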

2. Annotating Software

We also provide annotation software to label the data.

  • learned_scooping/annotating_software/label_Tier1.py is for Tier 1: pixels that should be the target finger position are labeled green, and pixels that should not are labeled red. You can choose the shape and size of the brush.
  • learned_scooping/annotating_software/label_Tier2.py is for Tier 2: the target thumb position is labeled given the target finger position.
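For reference, here is a minimal sketch of how the green/red labels produced for Tier 1 could be converted into positive/negative pixel masks for training. The color thresholds and file format are assumptions, not the documented output of the labeling tool.

```python
# Sketch: converting a green/red Tier-1 annotation image into training masks.
# The exact colors and format saved by annotating_software/label_Tier1.py are
# assumptions here.
import cv2
import numpy as np

def annotation_to_masks(label_path):
    label = cv2.imread(label_path)                 # OpenCV loads images as BGR
    b, g, r = label[:, :, 0], label[:, :, 1], label[:, :, 2]
    positive = (g > 200) & (r < 100) & (b < 100)   # green: valid finger position
    negative = (r > 200) & (g < 100) & (b < 100)   # red: invalid finger position
    return positive.astype(np.uint8), negative.astype(np.uint8)
```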

5. Training the Network

  • Train Tier 1: learned_scooping/training_program/training_tier1.ipynb
  • Train Tier 2: learned_scooping/training_program/training_tier2.ipynb
  • Train Tier 3: learned_scooping/training_program/training_tier3.ipynb

Note: Jupyter Notebook is required to run the training notebooks.

6. Network Parameters

7. Test on a Real UR10 Robot

Please run the following program: learned_scooping/test_on_real_robot.py

Before running the program, download the network parameters and save them to a suitable location. Then, update the corresponding paths in the code (line 40, line 42, and line 44).
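For illustration only, the edit amounts to pointing three path variables at the downloaded weights. The variable names below are hypothetical and the .pth checkpoint format is an assumption; the actual assignments are at lines 40, 42, and 44 of test_on_real_robot.py.

```python
# Hypothetical example of the three path edits in test_on_real_robot.py.
# Variable names and the .pth extension are assumptions, not the file's real code.
TIER1_WEIGHTS = "/home/user/scooping_weights/tier1.pth"   # line 40
TIER2_WEIGHTS = "/home/user/scooping_weights/tier2.pth"   # line 42
TIER3_WEIGHTS = "/home/user/scooping_weights/tier3.pth"   # line 44
```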

The method can be successfully applied to clusters of Go stones, domino blocks, acrylic boards, and key-shaped 3D-printed models.

Maintenance

For any technical issues, please contact: Tierui He ([email protected]).
