Original Keras implementation of the code for the paper "Client-driven animated GIF generation framework using an acoustic feature," at 1171: Real-time 2D/3D Image Processing with Deep Learning (MTAP)

iamgmujtaba/gif-acoustic

Client-driven animated GIF generation framework using an acoustic feature

This repository contains the original implementation of the paper Client-driven animated GIF generation framework using an acoustic feature, published in MTAP 2021.

We present a novel methodology to generate animated GIF images using the computational resources of a client device. The method analyzes an acoustic feature from the climax section of an audio file to estimate the timestamp corresponding to the maximum pitch. Further, it processes a small video segment to generate the GIF instead of processing the entire video. This makes the proposed method computationally efficient, unlike baseline approaches that use entire videos to create GIFs.
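The idea above can be illustrated with a minimal sketch. It is not the repository's implementation: the zero-crossing pitch estimate is a simple stand-in for the acoustic feature used in the paper, and the function names, the frame length, and the segment length are assumptions chosen for illustration. The sketch finds the timestamp of the frame with the highest estimated pitch in a synthetic signal, then picks a short window around it, which is the only part of the video that would need to be processed.

```python
import math


def zero_crossing_pitch(frame, sample_rate):
    """Rough pitch estimate (Hz) of one frame from its zero-crossing count."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    duration = len(frame) / sample_rate
    # A pure tone crosses zero twice per cycle.
    return crossings / (2.0 * duration)


def max_pitch_timestamp(samples, sample_rate, frame_len=1024):
    """Return (timestamp_seconds, pitch_hz) of the frame with the highest estimate."""
    best_time, best_pitch = 0.0, 0.0
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        pitch = zero_crossing_pitch(samples[start:start + frame_len], sample_rate)
        if pitch > best_pitch:
            best_time, best_pitch = start / sample_rate, pitch
    return best_time, best_pitch


def segment_window(timestamp, video_duration, segment_len=3.0):
    """Center a window of segment_len seconds on timestamp, clamped to the video."""
    segment_len = min(segment_len, video_duration)
    start = max(0.0, min(timestamp - segment_len / 2.0, video_duration - segment_len))
    return start, start + segment_len


def tone(freq, seconds, sample_rate):
    """Synthetic sine tone used here instead of a real audio file."""
    n = int(seconds * sample_rate)
    return [math.sin(2.0 * math.pi * freq * i / sample_rate) for i in range(n)]


SAMPLE_RATE = 8000
# Three one-second tones; the highest pitch (880 Hz) sits in the middle second.
signal = tone(220, 1.0, SAMPLE_RATE) + tone(880, 1.0, SAMPLE_RATE) + tone(440, 1.0, SAMPLE_RATE)

timestamp, pitch = max_pitch_timestamp(signal, SAMPLE_RATE)
start, end = segment_window(timestamp, video_duration=3.0, segment_len=1.0)
print(f"max pitch ~{pitch:.0f} Hz at {timestamp:.2f} s, segment {start:.2f}-{end:.2f} s")
```

Only the window returned by `segment_window` would then be decoded and converted, which is where the saving over whole-video processing comes from.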

Prerequisites

  • Linux or macOS
  • Python 3.6
  • CPU or NVIDIA GPU + CUDA CuDNN

Getting Started

Installation

  • Clone this repo:
git clone https://github.com/iamgmujtaba/gif-acoustic
cd gif-acoustic
  • Install TensorFlow, Keras, and the other dependencies:
    • For pip users, run pip install -r requirements.txt.

Training on the GTZAN dataset

  • Download the GTZAN dataset here
  • Extract the downloaded file into the data folder. The structure should look like this:
├── data/
   ├── gtzan
      ├── blues
      ├── classical
      .
      .
      .
      ├── rock
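Before training, it can help to confirm that the extracted folders match this layout. The following is a small sketch, not part of the repository: the helper name `missing_genres` is hypothetical, and the genre list assumes the ten folder names of the standard GTZAN release (the listing above shows only the first two and the last).

```python
from pathlib import Path

# Assumed folder names: the ten genres of the standard GTZAN release.
GTZAN_GENRES = [
    "blues", "classical", "country", "disco", "hiphop",
    "jazz", "metal", "pop", "reggae", "rock",
]


def missing_genres(root="data/gtzan"):
    """Return the expected genre folders that are absent under root."""
    root = Path(root)
    return [genre for genre in GTZAN_GENRES if not (root / genre).is_dir()]


# An empty list means every expected genre folder is in place.
print("Missing genre folders:", missing_genres() or "none")
```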

Train

  • Run train.py to train the model on the GTZAN dataset:
python train.py

Test

To run the tests, you first need to configure the HLS server; follow the hls-server guide for configuration. To generate segments and extract audio files from multiple videos, run the following script from HLS-Server:

python .\main.py -i .\input\ -o .\output\
  • Run HCR_proposed.py to measure the processing time of the proposed method on the HCR device:
python HCR_proposed.py
  • Run HCR_baseline.py to measure the processing time of the baseline method on the HCR device:
python HCR_baseline.py
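The two scripts compare processing time between the proposed and baseline methods. As a rough illustration of that kind of measurement (not the code in HCR_proposed.py or HCR_baseline.py), the sketch below times two hypothetical stand-in workloads with `time.perf_counter`. The helper `time_call` and both workload functions are assumptions made for this example; real runs would call the actual GIF generation code instead.

```python
import time


def time_call(fn, *args, repeats=3, **kwargs):
    """Call fn several times; return (last_result, best_elapsed_seconds)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        t0 = time.perf_counter()
        result = fn(*args, **kwargs)
        best = min(best, time.perf_counter() - t0)
    return result, best


def whole_video_workload(n_frames):
    """Hypothetical stand-in for a baseline that touches every frame."""
    return sum(i * i for i in range(n_frames))


def short_segment_workload(n_frames, segment_frames=100):
    """Hypothetical stand-in for a method that touches only a short segment."""
    return sum(i * i for i in range(min(n_frames, segment_frames)))


_, baseline_s = time_call(whole_video_workload, 200_000)
_, proposed_s = time_call(short_segment_workload, 200_000)
print(f"baseline: {baseline_s:.4f} s, proposed: {proposed_s:.4f} s")
```

Taking the best of several repeats reduces the influence of background load on the device; averaging is an equally reasonable choice.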

Experiments

Note: GitHub does not support animated WebP images, so the WebP images were converted to GIF for display on GitHub.

Example of GIFs generated using the proposed method

[Images: animated GIFs for the videos labeled Maroon 5 Sugar, Subeme, and Happier, shown in three rows: YouTube, Baseline, and Proposed.]

Jetson Configuration

Officially, Jetson devices do not support FFmpeg installation; MMAPI or GStreamer can be used instead. To install FFmpeg anyway, follow this guide: FFmpeg installation on Jetson TX2
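After installation, a quick check can confirm that FFmpeg is reachable from Python. The sketch below is illustrative only: this README does not state how the repository invokes FFmpeg, so the helper names, the frame rate, and the output width are assumptions. It looks up the `ffmpeg` executable on PATH and builds (without running) a command that converts a short segment of a video to a GIF using the standard `-ss`, `-t`, and `-vf` options.

```python
import shutil


def ffmpeg_path():
    """Return the path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


def gif_command(video, start, duration, output, fps=10, width=320):
    """Build an ffmpeg argument list that turns a short segment into a GIF."""
    return [
        "ffmpeg", "-y",                         # overwrite the output if it exists
        "-ss", f"{start:.3f}",                  # seek to the segment start (seconds)
        "-t", f"{duration:.3f}",                # read only this many seconds
        "-i", str(video),
        "-vf", f"fps={fps},scale={width}:-1",   # lower frame rate, keep aspect ratio
        str(output),
    ]


print("ffmpeg found at:", ffmpeg_path() or "not found")
# Hypothetical file names, shown only to illustrate the resulting command.
print(" ".join(gif_command("input.mp4", 62.5, 3.0, "output.gif")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once `ffmpeg_path()` reports that the executable is available.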

Citation

If you use this code for your research, please cite our paper.

@article{mujtaba2021,
  title={Client-driven animated GIF generation framework using an acoustic feature},
  author={Mujtaba, Ghulam and Lee, Sangsoon and Kim, Jaehyoun and Ryu, Eun-Seok},
  journal={Multimedia Tools and Applications},
  year={2021},
  publisher={Springer}}

License

Copyright (c) 2021, Ghulam Mujtaba. All rights reserved. This code is provided for academic, non-commercial use only. Redistribution and use in source and binary forms, with or without modification, are permitted for academic non-commercial use provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation provided with the distribution.

This software is provided by the authors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall the authors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.
