This repo is the official implementation of NeurIPS 2023 paper, Demand-driven Navigation
- README
- Instruction Dataset
- Trajectory Dataset
- Pre-generated Dataset
- Utils Code
- Training
- Testing
Warning: I am currently refactoring my code. Although all code committed to the repository has been tested, there may still be some minor issues. More comments will be continuously added to the code to improve its readability. Feel free to ask any questions.
For all dataset and pretrained models, the download link is Googledrive and Onedrive(recommend).
For Chinese, we provide 百度网盘.
Please see dataset.
We provide the raw trajectory data. Please move them to dataset and then unzip them. The following is the structure of the files in the raw_trajectory_dataset.zip
package. bc_{train,val}_check.json
are the metadata of trajectory dataset.
┌bc
│ ├train
│ │ └house_{idx}
│ │ └path_{idx}
│ │ └{idx}.jpg
│ └val
│ └house_{idx}
│ └path_{idx}
│ │ └{idx}.jpg
├bc_train_check.json
┕bc_val_check.json
In order to speed up the training, we use DETR model to segment the image in advance and get the corresponding CLIP-Visual-Feature. It takes
python generate_pre_data.py --mode=pre_traj_crop --dataset_mode=train --top_k=16
python generate_pre_data.py --mode=pre_traj_crop --dataset_mode=val --top_k=16
python generate_pre_data.py --mode=merge_pre_crop_json
We have provided the pre-generated dataset in the Materials Download
.
To train the Attribute Module, prepare the following files in the dataset: instruction_{train,val}_check.json
, LGO_features.json
, instruction_bert_features_check.json
Then run:
python train_attribute_features.py --epoch=5000
Finally, select the model with the lowest loss on the validation set, named attribute_model2.pt
.
To train the navigation policy, prepare the following files in the dataset: bc_train_{0,1,2,3,4}_pre.h5
, bc_{train,val}_check.json
, in the pretrained_model: attribute_model2.pt
, mae_pretrain_model.pth
Then run
python main.py --epoch=30 --mode=train_DDN --workers=32 --dataset_mode=train --device=cuda:0
First, we need to select the model using validation set.
python eval.py --mode=eval_DDN --eval_path=$path_to_saved_model$ --dataset_mode=val --device=cuda:0 --workers=32
Then we select the model with the highest accuracy on the validation set, assuming its index is
python eval.py --mode=test_DDN --eval_path=$path_to_saved_model$ --dataset_mode=$train,test$ --seen_instruction=$0,1$ --device=cuda:0 --epoch=500 --eval_ckpt=$idx$
For the parameter dataset_mode
, 'train' represents 'seen_scene', while 'test' represents 'unseen_scene'. Just choose one of them during the test.
For the parameter seen_instruction
, '1' represents 'seen_instruction', while '0' represents 'unseen_scene'. Just choose one of them during the test.
Note: if you run AI2Thor in a headless machine, xvfb
is highly recommended. Here is an example.
xvfb-run -a python eval.py --mode=test_DDN --eval_path=$path_to_saved_model$ --dataset_mode=train --seen_instruction=1 --device=cuda:0 --epoch=500 --eval_ckpt=15
If you have any suggestion or questions, please feel free to contact us: