
Dataset Setup

You should prepare the following datasets before running the experiments.

If you only want to run experiments on one specific dataset, you can focus on the setup for that task alone.

VQA-v2

  • Image Features

We use grid features extracted by a pretrained ResNeXt-152 model based on grid-feats-vqa, with each image represented as a dynamic number (at most 608) of 2048-D features. We first pad each feature map to a 32 × 32 grid and then pool it with a 2 × 2 kernel and a stride of 2 to obtain our 16 × 16 features; applying the same pooling once more yields the smaller 8 × 8 features. The features for each image are saved as a .npy file. We only provide our extracted 8 × 8 features here; you can download them from OneDrive or BaiduYun with code igr6. The download contains three files, train2014.zip, val2014.zip, and test2015.zip, corresponding to the features of the train/val/test images of VQA-v2, respectively.
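For reference, here is a minimal sketch of this padding-and-pooling step. It assumes each .npy stores an (H, W, 2048) spatial grid with H, W ≤ 32 and that average pooling is used; both details, as well as the file name, are illustrative assumptions rather than confirmed facts about this repo.

# Sketch of the 32 x 32 padding + 2 x 2, stride-2 pooling described above.
# Assumptions: the .npy holds an (H, W, 2048) grid with H, W <= 32, and the
# pooling is average pooling. The file name is illustrative.
import numpy as np
import torch
import torch.nn.functional as F

grid = np.load("COCO_train2014_000000000009.jpg.npy")     # (H, W, 2048), assumed layout
x = torch.from_numpy(grid).permute(2, 0, 1).unsqueeze(0)  # (1, 2048, H, W)

# Zero-pad the spatial grid on the right/bottom to 32 x 32.
h, w = x.shape[-2:]
x = F.pad(x, (0, 32 - w, 0, 32 - h))                      # (1, 2048, 32, 32)

x16 = F.avg_pool2d(x, kernel_size=2, stride=2)            # (1, 2048, 16, 16)
x8 = F.avg_pool2d(x16, kernel_size=2, stride=2)           # (1, 2048, 8, 8)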

All the image feature files should be unzipped into the data/vqa/feats folder, giving the following data structure:

|-- data
	|-- vqa
	|  |-- feats
	|  |  |-- train2014
	|  |  |  |-- COCO_train2014_...jpg.npy
	|  |  |  |-- ...
	|  |  |-- val2014
	|  |  |  |-- COCO_val2014_...jpg.npy
	|  |  |  |-- ...
	|  |  |-- test2015
	|  |  |  |-- COCO_test2015_...jpg.npy
	|  |  |  |-- ...
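After unzipping, you can sanity-check that a feature file loads (the concrete file name below is illustrative, and the exact array shape stored in the .npy is an assumption):

import numpy as np

feat = np.load("data/vqa/feats/train2014/COCO_train2014_000000000009.jpg.npy")
print(feat.shape)  # should correspond to the provided 8 x 8 grid of 2048-D features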

Extract the Features by Yourself

If you want to train TRAR on 16 × 16 features, you can extract them yourself by following these steps:

  1. Clone our extension of the grid-feats-vqa repo:
$ git clone https://github.com/rentainhe/TRAR-Feature-Extraction.git
  2. Check the TRAR_Feature_Extraction tutorial for more details.
  • QA Annotations

Download all the annotation JSON files for VQA-v2 from OneDrive or BaiduYun with code 6fb6.

You can also augment the training samples with VQA pairs from Visual Genome, following openvqa; the VG question and annotation files can be downloaded directly from OneDrive or BaiduYun.

All the QA annotation files should be unzipped into the data/vqa/raw folder, giving the following data structure:

|-- data
	|-- vqa
	|  |-- raw
	|  |  |-- v2_OpenEnded_mscoco_train2014_questions.json
	|  |  |-- v2_OpenEnded_mscoco_val2014_questions.json
	|  |  |-- v2_OpenEnded_mscoco_test2015_questions.json
	|  |  |-- v2_OpenEnded_mscoco_test-dev2015_questions.json
	|  |  |-- v2_mscoco_train2014_annotations.json
	|  |  |-- v2_mscoco_val2014_annotations.json
	|  |  |-- VG_questions.json
	|  |  |-- VG_annotations.json
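The unzipped files follow the official VQA-v2 JSON layout. A small sketch of pairing questions with their annotations by question_id (field names are from the official VQA-v2 release):

import json

with open("data/vqa/raw/v2_OpenEnded_mscoco_train2014_questions.json") as f:
    questions = json.load(f)["questions"]      # each entry: image_id, question, question_id
with open("data/vqa/raw/v2_mscoco_train2014_annotations.json") as f:
    annotations = json.load(f)["annotations"]  # each entry: question_id, answers, multiple_choice_answer, ...

ans_by_qid = {a["question_id"]: a for a in annotations}
q = questions[0]
print(q["question"], "->", ans_by_qid[q["question_id"]]["multiple_choice_answer"])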

CLEVR

We build the CLEVR dataset following openvqa.

  • Images, Questions and Scene Graphs

Download the full CLEVR v1.0 dataset from the official site, including all the splits needed for training, validation, and testing.

All the image files, question files, and scene graphs should be unzipped into the data/clevr/raw folder, giving the following data structure:

|-- data
	|-- clevr
	|  |-- raw
	|  |  |-- images
	|  |  |  |-- train
	|  |  |  |  |-- CLEVR_train_000000.png
	|  |  |  |  |-- ...
	|  |  |  |  |-- CLEVR_train_069999.png
	|  |  |  |-- val
	|  |  |  |  |-- CLEVR_val_000000.png
	|  |  |  |  |-- ...
	|  |  |  |  |-- CLEVR_val_014999.png
	|  |  |  |-- test
	|  |  |  |  |-- CLEVR_test_000000.png
	|  |  |  |  |-- ...
	|  |  |  |  |-- CLEVR_test_014999.png
	|  |  |-- questions
	|  |  |  |-- CLEVR_train_questions.json
	|  |  |  |-- CLEVR_val_questions.json
	|  |  |  |-- CLEVR_test_questions.json
	|  |  |-- scenes
	|  |  |  |-- CLEVR_train_scenes.json
	|  |  |  |-- CLEVR_val_scenes.json
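The question files follow the official CLEVR v1.0 format. A small loading sketch (field names are from the official CLEVR release; note that the test split ships without answers):

import json

with open("data/clevr/raw/questions/CLEVR_train_questions.json") as f:
    questions = json.load(f)["questions"]

q = questions[0]
print(q["image_filename"], q["question"], q["answer"])  # test questions omit the "answer" field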
  • Image Features

Following previous work, we extract image features using a pretrained ResNet-101 model and generate one .npz feature file per image (matching the directory layout below).

$ cd data/clevr
$ python clevr_extract_feat.py --mode=all --gpu=0
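As a rough illustration of what this step does, the sketch below runs a pretrained ResNet-101 on one image and saves the feature map. The layer choice (conv4, giving a 1024 × 14 × 14 map, following the common CLEVR convention), the 224 × 224 resize, and the .npz key name are all assumptions about the script, not confirmed details:

# Hedged sketch of per-image ResNet-101 feature extraction (illustrative only;
# the real logic lives in clevr_extract_feat.py).
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

resnet = models.resnet101(pretrained=True).eval()
backbone = torch.nn.Sequential(*list(resnet.children())[:-3])  # up to layer3 (conv4)

preprocess = T.Compose([
    T.Resize((224, 224)),  # assumed input size
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = preprocess(Image.open("data/clevr/raw/images/train/CLEVR_train_000000.png").convert("RGB"))
with torch.no_grad():
    feat = backbone(img.unsqueeze(0)).squeeze(0).numpy()  # (1024, 14, 14)
np.savez("data/clevr/feats/train/0.npz", x=feat)  # key name "x" is an assumption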

All the processed feature files should be placed in the data/clevr/feats folder, giving the following data structure:

|-- data
	|-- clevr
	|  |-- feats
	|  |  |-- train
	|  |  |  |-- 1.npz
	|  |  |  |-- ...
	|  |  |-- val
	|  |  |  |-- 1.npz
	|  |  |  |-- ...
	|  |  |-- test
	|  |  |  |-- 1.npz
	|  |  |  |-- ...
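To check one processed file (the key stored inside the .npz is an assumption; inspect npz.files if it differs):

import numpy as np

npz = np.load("data/clevr/feats/train/1.npz")
print(npz.files)          # names of the stored arrays
feat = npz[npz.files[0]]
print(feat.shape)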

FAQs

Q: Running clevr_extract_feat.py raises ImportError: cannot import name 'imread'.

A: Make sure Pillow is installed first. If the error persists, downgrade scipy: scipy.misc.imread was removed in SciPy 1.3.0, so use an earlier version such as 1.2.1.

$ pip install Pillow
$ pip install scipy==1.2.1
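Alternatively, instead of pinning scipy, you can replace the imread import in the script with a Pillow-based drop-in (a workaround sketch, not part of the repo):

# scipy.misc.imread was removed in SciPy 1.3.0; this mimics its common usage.
import numpy as np
from PIL import Image

def imread(path, mode="RGB"):
    # Returns an H x W x C uint8 array, like scipy.misc.imread(path, mode=mode).
    return np.array(Image.open(path).convert(mode))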