With major contributions from versatran01.
Data downloader and converter for the DeepMind GQN dataset (https://github.com/deepmind/gqn-datasets), for use with libraries other than TensorFlow.
Don't hesitate to make a pull request.
Dependencies
You need to install:
Download the tfrecord dataset
If you want to download the entire dataset:
gsutil -m cp -R gs://gqn-dataset/<dataset> .
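If you would rather drive the download from Python, the command above can be assembled and run with `subprocess`. This is only a minimal sketch: the helper names `build_gsutil_command` and `download_dataset` are hypothetical and not part of this repository, the dataset names are the ones listed in the size table of this README, and it assumes `gsutil` is installed and on your `PATH`.

```python
import subprocess

# Dataset names as listed in the size table of this README.
KNOWN_DATASETS = (
    "jaco",
    "mazes",
    "rooms_free_camera_no_object_rotations",
    "rooms_free_camera_with_object_rotations",
    "rooms_ring_camera",
    "shepard_metzler_5_parts",
    "shepard_metzler_7_parts",
)


def build_gsutil_command(dataset, destination="."):
    """Return the gsutil argument list that copies one full dataset.

    Hypothetical helper: it only mirrors the shell command shown above.
    """
    if dataset not in KNOWN_DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return ["gsutil", "-m", "cp", "-R", f"gs://gqn-dataset/{dataset}", destination]


def download_dataset(dataset, destination="."):
    """Run the copy; raises CalledProcessError if gsutil exits with an error."""
    subprocess.run(build_gsutil_command(dataset, destination), check=True)
```

Calling `download_dataset("shepard_metzler_5_parts")` would then run the same copy as the shell command, and a mistyped dataset name fails early instead of starting a transfer.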
If you want to download only a proportion of the dataset:
python download_gqn.py <dataset> <proportion>
Convert the raw dataset
Command line options:
usage: convert2file.py [-h] [-b BATCH_SIZE] [-n FIRST_N] [-m MODE]
                       base_dir dataset

Convert gqn tfrecords to gzip files.

positional arguments:
  base_dir              base directory of gqn dataset
  dataset               datasets to convert, eg. shepard_metzler_5_parts

optional arguments:
  -h, --help            show this help message and exit
  -b BATCH_SIZE, --batch-size BATCH_SIZE
                        number of sequences in each output file
  -n FIRST_N, --first-n FIRST_N
                        convert only the first n tfrecords if given
  -m MODE, --mode MODE  whether to convert train or test
Convert all records, with all their sequences, in the shepard_metzler_5_parts (sm5) train set (400 records, 2000 sequences each):
python convert2file.py ~/gqn_dataset shepard_metzler_5_parts
Convert the first 20 records of the shepard_metzler_5_parts (sm5) test set with a batch size of 128:
python convert2file.py ~/gqn_dataset shepard_metzler_5_parts -n 20 -b 128 -m test
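To get a feel for what `-b` does to the output, the arithmetic can be sketched in a few lines. This assumes that the sequences of each record are split independently into files of at most `BATCH_SIZE` sequences, with a smaller final file for the remainder; the actual grouping in `convert2file.py` may differ. `count_output_files` is a hypothetical helper, and the 2000 sequences per record is the train figure quoted above, used here for illustration only.

```python
import math


def count_output_files(sequences_per_record, batch_size):
    """Return (number of files, size of the last file) for one record.

    Assumption: each record is batched on its own, and the final file
    holds whatever sequences are left over.
    """
    if sequences_per_record <= 0 or batch_size <= 0:
        raise ValueError("both arguments must be positive")
    n_files = math.ceil(sequences_per_record / batch_size)
    last_file_size = sequences_per_record - (n_files - 1) * batch_size
    return n_files, last_file_size


print(count_output_files(2000, 128))  # (16, 80)
```

Under that assumption, a batch size of 128 would turn one 2000-sequence record into 15 full files and one file of 80 sequences.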
Size of the datasets:
Names | Sizes |
---|---|
total | 1.45 TB |
jaco | 198.97 GB |
mazes | 136.23 GB |
rooms_free_camera_no_object_rotations | 255.75 GB |
rooms_free_camera_with_object_rotations | 598.75 GB |
rooms_ring_camera | 250.89 GB |
shepard_metzler_5_parts | 21.09 GB |
shepard_metzler_7_parts | 23.68 GB |
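
Before downloading, it can help to estimate how much disk space a selection will take. The sketch below simply copies the per-dataset sizes from the table and sums them. `estimate_size_gb` is a hypothetical helper, and scaling linearly by a proportion assumes that the files within a dataset are of roughly equal size, which is not verified here.

```python
# Per-dataset sizes in gigabytes, copied from the table above.
SIZES_GB = {
    "jaco": 198.97,
    "mazes": 136.23,
    "rooms_free_camera_no_object_rotations": 255.75,
    "rooms_free_camera_with_object_rotations": 598.75,
    "rooms_ring_camera": 250.89,
    "shepard_metzler_5_parts": 21.09,
    "shepard_metzler_7_parts": 23.68,
}


def estimate_size_gb(datasets, proportion=1.0):
    """Estimate the download size in gigabytes for the given dataset names.

    Assumption: a proportion p of a dataset takes about p times its full size.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    return proportion * sum(SIZES_GB[name] for name in datasets)


total_gb = estimate_size_gb(SIZES_GB)
print(f"{total_gb:.2f}")         # 1485.36
print(f"{total_gb / 1024:.2f}")  # 1.45
```

Dividing the sum by 1024 gives about 1.45, which matches the total row of the table.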