Bridging the Gap: Enhancing the Utility of Synthetic Data via Post-Processing Techniques (BMVC 2023)

The official TensorFlow implementation of the BMVC 2023 paper: Bridging the Gap: Enhancing the Utility of Synthetic Data via Post-Processing Techniques.

Results

Classification Accuracy Score (CAS) of classifiers trained only on generated data. The GaFi pipeline is compared with the previous state of the art, with the Synthetic Baseline, and with the accuracy of classifiers trained on real data.
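
As a reminder of what the metric measures: CAS is the accuracy, on real test data, of a classifier that was trained only on synthetic samples. The sketch below is a minimal illustration of that definition, not code from this repository; the function name and the toy labels are hypothetical.

```python
def classification_accuracy_score(predicted_labels, true_labels):
    """Accuracy on real test labels of a classifier trained only on synthetic data."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one test sample is required")
    # Count the real test samples that the synthetic-trained classifier got right.
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


# Hypothetical predictions on five real test images (toy values, not paper results).
predictions = [3, 1, 0, 7, 7]
real_labels = [3, 1, 2, 7, 5]
cas = classification_accuracy_score(predictions, real_labels)
print(f"CAS: {cas:.2f}")
```

The gap between this value and the accuracy of the same architecture trained on real data is what the comparison above refers to.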

Installation

Clone the GitHub repository:

git clone https://github.com/sup3rgiu/GaFi-Pipeline.git

Move inside the docker directory:

cd GaFi-Pipeline/docker

Build the Docker image:

docker build --rm -t gafi_pipeline .

Usage

Train the classifier on real data:

docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 --name GaFiPipeline -it --rm -v /path_to_GaFi-Pipeline:/exp -t gafi_pipeline python train_classifier.py --cfg_file ./configs/CIFAR10/ResNet20.yaml

Train GAN:

docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 --name GaFiPipeline -it --rm -v /path_to_GaFi-Pipeline:/exp -t gafi_pipeline python train_gan.py --cfg_file ./configs/CIFAR10/BigGAN_deep.yaml

Run full pipeline:
N.B.: adjust the GAN name if needed, either inside the .yaml file or with the --gan_name command-line argument.

docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 --name GaFiPipeline -it --rm -v /path_to_GaFi-Pipeline:/exp -t gafi_pipeline python run_pipeline.py --cfg_file ./configs/CIFAR10/Pipeline.yaml --gan_name GAN_NAME

Repeat the GAN training and full-pipeline steps N times, changing the seed each time, to obtain N different generators:

docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 --name GaFiPipeline -it --rm -v /path_to_GaFi-Pipeline:/exp -t gafi_pipeline python train_gan.py --cfg_file ./configs/CIFAR10/BigGAN_deep.yaml --seed NEW_SEED

docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 --name GaFiPipeline -it --rm -v /path_to_GaFi-Pipeline:/exp -t gafi_pipeline python run_pipeline.py --cfg_file ./configs/CIFAR10/Pipeline.yaml --gan_name NEW_GAN_NAME
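
To avoid typing both commands by hand for every seed, the command lines can be generated by a small helper. The sketch below only builds and prints the commands and does not run them. The seeds and GAN names are hypothetical placeholders: the name each trained GAN is saved under is not stated here, so check it after training and fill it in.

```python
import shlex

# Placeholder path taken from the commands above; replace it with the local clone.
REPO_PATH = "/path_to_GaFi-Pipeline"


def docker_prefix(repo_path, gpu="0"):
    """Return the `docker run` prefix shared by all commands in this section."""
    return [
        "docker", "run", "--runtime=nvidia",
        "-e", f"NVIDIA_VISIBLE_DEVICES={gpu}",
        "--name", "GaFiPipeline", "-it", "--rm",
        "-v", f"{repo_path}:/exp",
        "-t", "gafi_pipeline",
    ]


def commands_for_generator(seed, gan_name, repo_path=REPO_PATH):
    """Build the GAN-training and pipeline commands for one generator."""
    prefix = docker_prefix(repo_path)
    train = prefix + [
        "python", "train_gan.py",
        "--cfg_file", "./configs/CIFAR10/BigGAN_deep.yaml",
        "--seed", str(seed),
    ]
    pipeline = prefix + [
        "python", "run_pipeline.py",
        "--cfg_file", "./configs/CIFAR10/Pipeline.yaml",
        "--gan_name", gan_name,
    ]
    return [train, pipeline]


def build_plan(generators, repo_path=REPO_PATH):
    """Flatten the per-generator commands into one ordered list."""
    plan = []
    for seed, gan_name in generators:
        plan.extend(commands_for_generator(seed, gan_name, repo_path))
    return plan


if __name__ == "__main__":
    # Hypothetical (seed, GAN name) pairs; both values are assumptions.
    generators = [(1, "GAN_NAME_1"), (2, "GAN_NAME_2"), (3, "GAN_NAME_3")]
    for command in build_plan(generators):
        print(shlex.join(command))
```

The printed lines can be pasted into a terminal one at a time, which keeps the interactive `-it` flags usable.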

Run the MultiGAN script to obtain a classifier trained on a synthetic dataset sampled from the N different generators:
N.B.: adjust all the GAN names, either inside the .yaml file or as command-line arguments.

docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 --name GaFiPipeline -it --rm -v /path_to_GaFi-Pipeline:/exp -t gafi_pipeline python run_multigan.py --cfg_file ./configs/CIFAR10/MultiGAN.yaml
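
How the synthetic dataset is divided among the N generators is controlled by the script and its configuration and is not described here. Purely as an illustration of the idea, the sketch below assumes an equal share per generator; the function name, the equal split, and the dataset size of 50000 are all assumptions, not the behaviour of run_multigan.py.

```python
def samples_per_generator(total_samples, num_generators):
    """Split a sample budget as evenly as possible across generators (assumed policy)."""
    if num_generators <= 0:
        raise ValueError("num_generators must be positive")
    if total_samples < 0:
        raise ValueError("total_samples must be non-negative")
    base, remainder = divmod(total_samples, num_generators)
    # The first `remainder` generators contribute one extra sample each.
    return [base + (1 if index < remainder else 0) for index in range(num_generators)]


# Hypothetical budget: 50000 synthetic images drawn from 3 generators.
counts = samples_per_generator(50000, 3)
print(counts)       # [16667, 16667, 16666]
print(sum(counts))  # 50000
```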

All default parameters defined in the .yaml configuration files can be overridden by specifying the corresponding command-line arguments.
For example, if we want to use the default ./configs/CIFAR10/ResNet20.yaml but train in mixed precision, we can do the following:

docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 --name GaFiPipeline -it --rm -v /path_to_GaFi-Pipeline:/exp -t gafi_pipeline python train_classifier.py --cfg_file ./configs/CIFAR10/ResNet20.yaml --mixed_precision

Or if we want to train the classifier without using random erasing augmentation:

docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 --name GaFiPipeline -it --rm -v /path_to_GaFi-Pipeline:/exp -t gafi_pipeline python train_classifier.py --cfg_file ./configs/CIFAR10/ResNet20.yaml --random_erasing False

All possible arguments are defined in parser.py and can be seen by running the scripts with the -h flag.
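
For readers unfamiliar with this kind of override, the sketch below shows one common way to layer command-line arguments over file defaults with argparse. It is not the contents of parser.py: the helper names are hypothetical, only the two flags used in the examples above are modelled, and a plain dictionary stands in for the parsed .yaml file.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False'-style strings, as in `--random_erasing False`."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Every option defaults to None so that 'not given' can be detected."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--cfg_file", default=None)
    parser.add_argument("--mixed_precision", action="store_true", default=None)
    parser.add_argument("--random_erasing", type=str2bool, default=None)
    return parser


def merge_config(file_defaults, argv):
    """Start from the file defaults, then apply only the arguments actually passed."""
    args = vars(build_parser().parse_args(argv))
    config = dict(file_defaults)
    for key, value in args.items():
        if value is not None:
            config[key] = value
    return config


# Stand-in for the values loaded from the .yaml file (hypothetical defaults).
file_defaults = {"mixed_precision": False, "random_erasing": True}
config = merge_config(file_defaults, ["--random_erasing", "False"])
print(config)  # {'mixed_precision': False, 'random_erasing': False}
```

Using None as the parser default is what lets the merge distinguish an omitted flag from an explicit value.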

Citation

Should you find this repository useful, please consider citing:

@misc{lampis2023bridging,
      title={Bridging the Gap: Enhancing the Utility of Synthetic Data via Post-Processing Techniques}, 
      author={Andrea Lampis and Eugenio Lomurno and Matteo Matteucci},
      year={2023},
      eprint={2305.10118},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
