To run the experiments with NTTUD.v1 and NTTUD.v2 from "Neural Twins Talk and Alternative Calculations", refer to the ZoDIAC repo. Run those experiments in the same environment you use for the ZoDIAC experiments. You will also need to modify the config file for each run accordingly.
We provide training and evaluation for both the NTT model (Zohourianshahzadi & Kalita, IEEE HCCAI 2020) and the NBT model (Jiasen Lu et al. IEEE CVPR 2018).
First, download the repository, then download data.zip and tools.zip and unzip them inside the project directory.
This repository contains a Dockerfile for setting up a Docker container for the COCO experiments (Karpathy's / robust / novel splits) on GPU. To build the Docker image, execute the following command from the project root:
docker build -t ntt .
Before running the container, you need to have the COCO dataset downloaded and stored somewhere in your filesystem. To do this, go to the data folder and copy coco_2014.sh into a directory that is not inside the project directory. For instance, from the data folder in bash:
cd ../..
mv neuraltwinstalk/data/coco_2014.sh .
Then run coco_2014.sh with bash:
sh coco_2014.sh
Declare two environment variables:
- `$COCO_I`: path to a directory with image sub-directories such as `train2014`, `val2014`, `test2015`, etc.
- `$COCO_A`: path to a directory with annotation files such as `instances_train2014.json`, `captions_train2014.json`, etc.
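As a quick sanity check before launching the container, a small script like the one below can verify that the two directories have the expected layout. The function name and the exact list of expected entries are illustrative assumptions, not part of the repo:

```python
import os

# Entries the container mounts are expected to contain (illustrative list).
EXPECTED_IMAGE_DIRS = ["train2014", "val2014", "test2015"]
EXPECTED_ANNOTATION_FILES = ["instances_train2014.json", "captions_train2014.json"]

def missing_coco_entries(images_dir, annotations_dir):
    """Return a list of expected entries that are absent from the two dirs."""
    missing = []
    for d in EXPECTED_IMAGE_DIRS:
        if not os.path.isdir(os.path.join(images_dir, d)):
            missing.append(os.path.join(images_dir, d))
    for f in EXPECTED_ANNOTATION_FILES:
        if not os.path.isfile(os.path.join(annotations_dir, f)):
            missing.append(os.path.join(annotations_dir, f))
    return missing

if __name__ == "__main__":
    # Uses $COCO_I and $COCO_A as declared above.
    problems = missing_coco_entries(os.environ.get("COCO_I", ""),
                                    os.environ.get("COCO_A", ""))
    print("OK" if not problems else "Missing: " + ", ".join(problems))
```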
These directories will be attached as volumes for the Neural Twins Talk container to use. Install nvidia-docker and execute this command to run the freshly built Docker image:
nvidia-docker run --name ntt_container -it \
-v $COCO_I:/workspace/neuraltwinstalk/data/coco/images \
-v $COCO_A:/workspace/neuraltwinstalk/data/coco/annotations \
--shm-size 16G -p 8888:8888 ntt /bin/bash
Ideally, a shared memory size (`--shm-size`) of 16 GB should be enough; tune it according to your requirements and machine specifications.
Saved Checkpoints: All checkpoints will be saved in /workspace/neuraltwinstalk/save. The container exposes port 8888, which can be used to host TensorBoard visualizations. From outside the container, execute the following to copy your checkpoints from the container into your local filesystem:
docker container cp ntt_container:/workspace/neuraltwinstalk/save /path/to/local/filesystem/save
Skip directly to the Training and Evaluation section to execute the specified commands within the container.
- Python 3.7
- PyTorch: pytorch:0.4-cuda9-cudnn7-devel
- Other requirements are handled by the Dockerfile.
All data is prepared and ready inside data.zip.
Next, go to the prepro folder in bash and execute the following command (this downloads the version of Stanford CoreNLP we need):
sh download_scnlp.sh
Pre-trained models will be available here soon. Stay tuned.
First, modify the config file cfgs/normal_coco_res101.yml with the correct file path.
python main.py --path_opt cfgs/normal_coco_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --mGPUs True --glove_6B_300 True
Train the model or download the pre-trained model, then extract the tar archive and put it under save/.
python main.py --path_opt cfgs/normal_coco_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --inference_only True --beam_size 3 --start_from save/normal_coco_1024_adam --mGPUs True --glove_6B_300 True
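The --beam_size 3 flag above controls beam search at decoding time: instead of greedily picking one word per step, the decoder keeps the 3 highest-scoring partial captions and extends each of them. A minimal, repo-independent sketch of the idea, using a toy bigram model that is purely illustrative:

```python
import math

def beam_search(next_logprobs, beam_size=3, max_len=2):
    """Keep the `beam_size` best partial sequences at every step.

    `next_logprobs(prefix)` returns {token: log-probability} for the
    next token given the prefix generated so far.
    """
    beams = [((), 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = [
            (seq + (tok,), score + lp)
            for seq, score in beams
            for tok, lp in next_logprobs(seq).items()
        ]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams

# Toy bigram "language model": next-token distribution depends on the last token.
TABLE = {
    None: {"a": math.log(0.6), "b": math.log(0.4)},
    "a":  {"a": math.log(0.3), "b": math.log(0.7)},
    "b":  {"a": math.log(0.5), "b": math.log(0.5)},
}

def toy_model(prefix):
    return TABLE[prefix[-1] if prefix else None]

best = beam_search(toy_model, beam_size=3)[0]
print(best[0])  # ('a', 'b') — the highest-probability length-2 sequence here
```

Greedy decoding (beam size 1) would also find ("a", "b") in this toy case, but with longer sequences a larger beam can recover captions whose first word is not the locally best choice.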
Modify the config file cfgs/normal_flickr_res101.yml with the correct file path.
python main.py --path_opt cfgs/normal_flickr_res101.yml --batch_size 80 --cuda True --num_workers 20 --max_epoch 30 --mGPUs True --glove_6B_300 True
Train the model or download the pre-trained model, then extract the tar archive and put it under save/.
python main.py --path_opt cfgs/normal_flickr_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --inference_only True --beam_size 3 --start_from save/normal_flickr30k_1024_adam --mGPUs True --glove_6B_300 True
Modify the config file cfgs/robust_coco.yml with the correct file path.
python main.py --path_opt cfgs/robust_coco.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --mGPUs True --glove_6B_300 True
Train the model or download the pre-trained model, then extract the tar archive and put it under save/.
python main.py --path_opt cfgs/robust_coco.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --inference_only True --beam_size 3 --start_from save/robust_coco_1024 --mGPUs True --glove_6B_300 True
python main.py --path_opt cfgs/noc_coco_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --mGPUs True --glove_6B_300 True
Train the model or download the pre-trained model, then extract the tar archive and put it under save/.
python main.py --path_opt cfgs/noc_coco_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --inference_only True --beam_size 3 --start_from save/noc_coco_1024_adam --mGPUs True --glove_6B_300 True
All data is prepared and ready inside data.zip.
Next, go to the prepro folder in bash and execute the following command (this downloads the version of Stanford CoreNLP we need):
sh download_scnlp.sh
Pre-trained models will be available here soon. Stay tuned.
First, modify the config file cfgs/normal_coco_res101.yml with the correct file path.
python main.py --path_opt cfgs/normal_coco_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --mGPUs True --glove_6B_300 True --att_model topdown
Train the model or download the pre-trained model, then extract the tar archive and put it under save/.
python main.py --path_opt cfgs/normal_coco_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --inference_only True --beam_size 3 --start_from save/normal_coco_1024_adam --mGPUs True --glove_6B_300 True --att_model topdown
Modify the config file cfgs/normal_flickr_res101.yml with the correct file path.
python main.py --path_opt cfgs/normal_flickr_res101.yml --batch_size 80 --cuda True --num_workers 20 --max_epoch 30 --mGPUs True --glove_6B_300 True --att_model topdown
Train the model or download the pre-trained model, then extract the tar archive and put it under save/.
python main.py --path_opt cfgs/normal_flickr_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --inference_only True --beam_size 3 --start_from save/normal_flickr30k_1024_adam --mGPUs True --glove_6B_300 True --att_model topdown
Modify the config file cfgs/robust_coco.yml with the correct file path.
python main.py --path_opt cfgs/robust_coco.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --mGPUs True --glove_6B_300 True --att_model topdown
Train the model or download the pre-trained model, then extract the tar archive and put it under save/.
python main.py --path_opt cfgs/robust_coco.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --inference_only True --beam_size 3 --start_from save/robust_coco_1024 --mGPUs True --glove_6B_300 True --att_model topdown
python main.py --path_opt cfgs/noc_coco_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --mGPUs True --glove_6B_300 True --att_model topdown
Train the model or download the pre-trained model, then extract the tar archive and put it under save/.
python main.py --path_opt cfgs/noc_coco_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --inference_only True --beam_size 3 --start_from save/noc_coco_1024_adam --mGPUs True --glove_6B_300 True --att_model topdown
For multiple-GPU training, simply add --mGPUs True to the command when training the model.
For Karpathy's split on COCO you can run the following:
python demo.py --path_opt cfgs/normal_coco_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --inference_only True --beam_size 3 --start_from save/normal_coco_1024_adam4 --mGPUs True --glove_6B_300 True
For other splits, replace main.py with demo.py in the evaluation commands.
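Since the demo commands differ only in the config file and checkpoint path, they can be assembled with a small helper. The helper below is a hypothetical convenience, not code from the repo; the split-to-config mapping is our reading of the config files named above:

```python
# Hypothetical helper: builds the demo command for a split, following the
# command pattern shown in this README. Not part of the repo itself.
CONFIGS = {
    "karpathy": "cfgs/normal_coco_res101.yml",
    "robust": "cfgs/robust_coco.yml",
    "novel": "cfgs/noc_coco_res101.yml",
}

def demo_command(split, checkpoint, beam_size=3):
    return (
        f"python demo.py --path_opt {CONFIGS[split]} --batch_size 20 "
        f"--cuda True --num_workers 20 --inference_only True "
        f"--beam_size {beam_size} --start_from {checkpoint} "
        f"--mGPUs True --glove_6B_300 True"
    )

print(demo_command("robust", "save/robust_coco_1024"))
```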
We also provide the ability to train with BERT embeddings; the original embedding used in the experiments in the paper is GloVe 6B 300d. Feel free to train the models with BERT embeddings as well: simply replace --glove_6B_300 with --bert_base_768 in the commands, as in the following:
python main.py --path_opt cfgs/normal_coco_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --mGPUs True --bert_base_768 True
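At the embedding level, the main difference between the two flags is dimensionality: GloVe 6B vectors are 300-d, while BERT-base hidden states are 768-d, so the model's word-embedding dimension changes accordingly. The flag-to-dimension table below is our reading of the flag names, not code from the repo:

```python
# Illustrative only: embedding dimensionality implied by each flag.
EMBEDDING_DIMS = {
    "--glove_6B_300": 300,   # GloVe trained on 6B tokens, 300-d vectors
    "--bert_base_768": 768,  # BERT-base hidden size
}

def embedding_dim(flag):
    """Look up the word-embedding dimension selected by a flag."""
    return EMBEDDING_DIMS[flag]

print(embedding_dim("--bert_base_768"))  # 768
```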
We thank Jiasen Lu et al. for the NBT repo and Ruotian Luo for his self-critical.pytorch repo.
If you use this code in your project, please cite the following paper.
@INPROCEEDINGS{9230394,
author={Z. {Zohourianshahzadi} and J. K. {Kalita}},
booktitle={2020 IEEE International Conference on Humanized Computing and Communication with Artificial Intelligence (HCCAI)},
title={Neural Twins Talk},
year={2020},
pages={17-24},
doi={10.1109/HCCAI49649.2020.00009}}