A framework for robots to learn simple human tasks from a single video.
Our solution comprises a 3-step pipeline for enabling robots to mimic human tasks through video:
- Feed the robot videos of a human performing a basic task
- Provide input through the UI to identify important visual features
- The robot mimics the task (e.g. cleaning, packing, cutting), accounting for different configurations
Currently, the code is broken down into a few main components:
- `cv/`: all visual preprocessing logic (hand tracking & object detection)
- `third_party/`: external packages, including Git submodules
- `simulation/`: MuJoCo testbed
Requirements:
- Python 3
- MuJoCo
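Installation:
- Run `git clone --recurse-submodules <repo-url>` and navigate to the cloned folder
- (Optional) Create a virtual environment with `python -m venv venv` and activate it with `venv\Scripts\activate` (Windows) or `source venv/bin/activate` (Linux/macOS)
- Install the dependencies with `pip install -r requirements.txt`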
See each subfolder for a more detailed README on how to execute the code in that module.
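If the repository was cloned without `--recurse-submodules`, the submodule folders exist but are empty. The following is a minimal sketch of a check for that case; the helper name `find_empty_submodules` is hypothetical and not part of this repository, and it assumes the submodules live directly under `third_party/`.

```python
from pathlib import Path


def find_empty_submodules(third_party_dir):
    """Return the names of subdirectories that contain no entries at all.

    An empty directory under third_party/ usually means the submodule
    was never initialised. (Hypothetical helper, not part of the repo.)
    """
    root = Path(third_party_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} does not exist")
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and not any(child.iterdir())
    )


if __name__ == "__main__":
    try:
        empty = find_empty_submodules("third_party")
    except FileNotFoundError as err:
        print(f"Run this from the repository root: {err}")
    else:
        if empty:
            print("Empty submodule folders:", ", ".join(empty))
            print("Try: git submodule update --init --recursive")
        else:
            print("All third_party/ folders are populated.")
```

Run it from the repository root. If any folder is reported as empty, `git submodule update --init --recursive` fetches the missing submodules.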
- Make sure all Detic submodules are up to date. Change the classifier path in `Detic/predict.py` and `Detic/detic/modeling/utils.py` to `mimic/third_party/Detic/datasets/metadata/lvis_v1_clip_a+cname.npy`
- Download the Detic model and place it in `DETIC_ROOT/models/`
  - We used the last model under Cross Dataset Evaluation here: https://github.com/facebookresearch/Detic/blob/main/docs/MODEL_ZOO.md
  - Pretty sure it is LVIS-trained
  - Maybe we should check out the box-supervised models (we have not yet looked into what that means)
- Change the path to the Detic weights and YAML file in `tracker.py`
- For Segment-and-Track-Anything:
  - Run `bash script/install.sh` and `bash script/download_ckpt.sh`
  - Download the two models and put them in `Segment-and-Track-Anything/ckpt/`:
    - Get the SwinB-DeAOTL model from https://github.com/yoxu515/aot-benchmark/blob/main/MODEL_ZOO.md
    - Download SAM ViT-Huge; also download SAM ViT-Large and SAM ViT-Base if you want to test different variants
  - Change the paths to the weights files in `Segment-and-Track-Anything/model_args.py`, and change the path to Grounding DINO in `detector.py`
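Because the setup above involves several hand-edited paths, a quick existence check can catch a misplaced file before a long model load fails. This is only a sketch: `missing_paths` and the `EXPECTED` table are hypothetical, `DETIC_ROOT` is assumed to be `third_party/Detic`, and the location of Segment-and-Track-Anything under `third_party/` is also an assumption. Adjust the entries to match your checkout.

```python
from pathlib import Path

# Assumed locations, relative to the repository root. Only the classifier
# file name and the models/ and ckpt/ folder names come from the notes above.
DETIC_ROOT = Path("third_party/Detic")
EXPECTED = {
    "Detic classifier file": DETIC_ROOT / "datasets/metadata/lvis_v1_clip_a+cname.npy",
    "Detic models dir": DETIC_ROOT / "models",
    "Segment-and-Track-Anything ckpt dir": Path("third_party/Segment-and-Track-Anything/ckpt"),
}


def missing_paths(expected, base="."):
    """Return the {label: path} entries of `expected` that do not exist under `base`."""
    base = Path(base)
    return {label: path for label, path in expected.items() if not (base / path).exists()}


if __name__ == "__main__":
    missing = missing_paths(EXPECTED)
    if not missing:
        print("All expected paths exist.")
    for label, path in missing.items():
        print(f"MISSING {label}: {path}")
```

The check only confirms that the files and folders are present. It does not verify that `tracker.py`, `model_args.py`, or `detector.py` actually point at them.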
Even after completing all of the steps above, I still hit an error in `third_party/Detic/detic/modeling/backbone/timm.py`. Commenting out the last import in that file fixed it.
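The note above does not say which import fails or why. If the failure is an `ImportError` from a dependency that is not actually needed at runtime, one alternative to commenting the line out is to guard the import so it degrades to `None`. The sketch below shows that general pattern only; `optional_import` is a hypothetical helper, and whether it suits `timm.py` depends on how the imported name is used there.

```python
import importlib


def optional_import(module_name):
    """Import `module_name`, returning None instead of raising if it is unavailable.

    Hypothetical helper illustrating a guarded import; it is not part of Detic.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Usage: code that needs the module checks for None before using it.
json_module = optional_import("json")
if json_module is not None:
    print(json_module.dumps({"loaded": True}))
```

Code that later uses the guarded module has to handle the `None` case explicitly, which makes the missing dependency visible instead of silently removing it.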
TODO:
- Clean up the code above and push it in a form that Arsh can use
- Run Detic + SAM on the query image: `detic_sam_init()` in tracker
- Use DINO matching to match the object mask from step 1 with the masks from step 2 (Arsh)
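The matching step in the last item above is not implemented yet, and the notes do not define it. One common reading, assumed here, is that each mask is summarised by a feature embedding (for example from a DINO backbone) and the query mask is matched to the candidate whose embedding has the highest cosine similarity. The sketch below covers only that comparison: the embeddings are assumed to be precomputed, and `cosine_similarity` and `best_match` are hypothetical names, written in plain Python for clarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query_embedding, candidate_embeddings):
    """Return (index, similarity) of the candidate most similar to the query."""
    if not candidate_embeddings:
        raise ValueError("no candidate embeddings to match against")
    scores = [cosine_similarity(query_embedding, c) for c in candidate_embeddings]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


# Toy example with made-up 3-D embeddings (real features are much larger).
query = [1.0, 0.0, 0.5]
candidates = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.4], [-1.0, 0.0, 0.0]]
index, similarity = best_match(query, candidates)
print(index)  # → 1
```

In practice the same comparison would be done on tensors from the feature extractor, and a minimum similarity threshold may be worth adding so that a query with no good candidate is not forced onto the nearest one.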