Client-driven animated GIF generation framework using an acoustic feature
This repository contains the original implementation of the paper "Client-driven animated GIF generation framework using an acoustic feature", published in Multimedia Tools and Applications (MTAP), 2021.
We present a novel methodology to generate animated GIF images using the computational resources of a client device. The method analyzes an acoustic feature from the climax section of an audio file to estimate the timestamp corresponding to the maximum pitch. Further, it processes a small video segment to generate the GIF instead of processing the entire video. This makes the proposed method computationally efficient, unlike baseline approaches that use entire videos to create GIFs.
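The core idea above (locate the maximum pitch, then process only the video segment around it) can be illustrated with a minimal sketch. It is not the repository's implementation: the function names `max_pitch_timestamp` and `segment_window`, the per-frame pitch list, the hop duration, and the fixed segment length are all assumptions made for illustration, and the pitch extraction itself is left out.

```python
from typing import Sequence, Tuple


def max_pitch_timestamp(pitches: Sequence[float], hop_seconds: float,
                        climax_start: float = 0.0) -> float:
    """Return the time (in seconds) of the frame with the highest pitch.

    `pitches` holds one pitch estimate per analysis frame of the climax
    section (hypothetical input); `climax_start` is where that section
    begins in the full audio track.
    """
    if not pitches:
        raise ValueError("pitches must not be empty")
    # Index of the first frame that reaches the maximum pitch value.
    peak_index = max(range(len(pitches)), key=lambda i: pitches[i])
    return climax_start + peak_index * hop_seconds


def segment_window(timestamp: float, segment_seconds: float,
                   video_duration: float) -> Tuple[float, float]:
    """Return (start, end) of the fixed-length segment containing `timestamp`."""
    index = int(timestamp // segment_seconds)
    start = index * segment_seconds
    # The last segment of a video may be shorter than the others.
    end = min(start + segment_seconds, video_duration)
    return start, end


# Example with made-up values: the climax section starts at 60 s and the
# pitch is sampled every 0.5 s.
peak_time = max_pitch_timestamp([220.0, 247.5, 440.0, 330.0], 0.5, 60.0)
window = segment_window(peak_time, 10.0, 125.0)
```

Only the segment returned by `segment_window` would then need to be decoded to build the GIF, which is where the saving over whole-video processing comes from.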
- Linux or macOS
- Python 3.6
- CPU or NVIDIA GPU + CUDA CuDNN
- Clone this repo:
git clone https://github.com/iamgmujtaba/gif-acoustic
cd gif-acoustic
- Install TensorFlow, Keras, and the other dependencies.
- For pip users, run the following command:
pip install -r requirements.txt
- Download the GTZAN dataset here
- Extract the downloaded file into the data folder. The structure should look like this:
├── data/
    ├── gtzan
        ├── blues
        ├── classical
        .
        .
        .
        ├── rock
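Before training, it can help to confirm that the dataset was extracted into the layout shown above. The following is a small sketch for that check; the helper name `count_genre_files` is hypothetical and not part of this repository, and it assumes each genre is a sub-folder directly under `data/gtzan`.

```python
from pathlib import Path
from typing import Dict


def count_genre_files(gtzan_root: str) -> Dict[str, int]:
    """Map each genre sub-folder under `gtzan_root` to its number of files."""
    root = Path(gtzan_root)
    if not root.is_dir():
        raise FileNotFoundError(f"dataset folder not found: {root}")
    counts = {}
    # Only direct sub-folders are treated as genres; nested folders are ignored.
    for genre_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[genre_dir.name] = sum(1 for f in genre_dir.iterdir() if f.is_file())
    return counts
```

Calling `count_genre_files("data/gtzan")` from the repository root should list one entry per genre folder (blues, classical, ..., rock); an empty result or a missing genre suggests the archive was extracted to the wrong place.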
- Run train.py to train on the GTZAN dataset:
python train.py
To run the code, you first need to configure an HLS server. Follow the hls-server guide for configuration. To generate segments and extract audio files from multiple videos, run the following script from the hls-server repository:
python .\main.py -i .\input\ -o .\output\
- Run HCR_proposed.py to measure the processing time of the proposed method on the HCR device:
python HCR_proposed.py
- Run HCR_baseline.py to measure the processing time of the baseline method on the HCR device:
python HCR_baseline.py
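For readers who want to compare processing times of their own variants, the measurement can be sketched as a simple wall-clock timer. This is an illustrative helper, not the code inside HCR_proposed.py or HCR_baseline.py; the name `time_runs` and the idea of passing the workload as a callable are assumptions.

```python
import time
from typing import Callable, List


def time_runs(task: Callable[[], object], repeats: int = 3) -> List[float]:
    """Run `task` `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations


# Placeholder workload standing in for a GIF generation call.
example_durations = time_runs(lambda: sum(range(10_000)), repeats=4)
```

Timing the proposed and baseline workloads with the same helper, and reporting the average over several runs, keeps the comparison less sensitive to one-off delays such as disk caching.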
Note: GitHub does not support the animated WebP format, so the WebP images were converted to GIF for display on GitHub.
Example of GIFs generated using the proposed method
Officially, Jetson devices do not support FFmpeg installation; MMAPI or GStreamer can be used instead. To install FFmpeg, please use the following guide: FFmpeg installation on Jetson TX2
If you use this code for your research, please cite our paper.
@article{mujtaba2021,
title={Client-driven animated GIF generation framework using an acoustic feature},
author={Mujtaba, Ghulam and Lee, Sangsoon and Kim, Jaehyoun and Ryu, Eun-Seok},
journal={Multimedia Tools and Applications},
year={2021},
publisher={Springer}}
Copyright (c) 2021, Ghulam Mujtaba. All rights reserved. This code is provided for academic, non-commercial use only. Redistribution and use in source and binary forms, with or without modification, are permitted for academic non-commercial use provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation provided with the distribution.
This software is provided by the authors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall the authors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.