This codebase contains the code for the paper "Dynamically Throttleable Neural Networks", published in Machine Vision and Applications, 2022.
Download the pretrained weights from the release.
Put `controller_network_2.pkl.latest` inside the folder `ckpt/controller`.
Put `cpm_r3_model_epoch1540.pth` inside `ckpt/`.
Put `model_110.pkl.latest` inside `ckpt/gated_raw_c3d`.
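As a quick sanity check before running the demo, the placement described above can be verified with a small script. This is only a sketch: the helper name `missing_checkpoints` and the constant `EXPECTED_CHECKPOINTS` are hypothetical and not part of this repo; the three paths are the ones listed above.

```python
from pathlib import Path

# Paths taken from the instructions above, relative to the repo root.
EXPECTED_CHECKPOINTS = [
    Path("ckpt/controller/controller_network_2.pkl.latest"),
    Path("ckpt/cpm_r3_model_epoch1540.pth"),
    Path("ckpt/gated_raw_c3d/model_110.pkl.latest"),
]


def missing_checkpoints(repo_root="."):
    """Return the expected checkpoint paths that are not present under repo_root."""
    root = Path(repo_root)
    return [p.as_posix() for p in EXPECTED_CHECKPOINTS if not (root / p).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing pretrained weights:")
        for path in missing:
            print("  " + path)
    else:
        print("All pretrained weights are in place.")
```

Run it from the repo root; it only checks that the files exist and does not try to load them.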
Run `pip install -r requirements.txt` to install dependencies.
To run the demo, clone this repo, open the folder in a terminal or an IDE of your choice, and run `python demo.py`. By default, it uses one GPU and enables all the features. I have not written a requirements file for dependencies yet.
The concepts and implementations are taken from the paper Toward Runtime-Throttleable Neural Networks. Several attempts were made at the specific implementation of the demo. A TNN requires its convolutional and dense layers to be gated, so gated networks were implemented first. This repo implements a gated keypoint detection framework and a gated C3D. The experiments mainly use widthwise nested sequential gates, although other options are also available.
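To make the idea of widthwise nested gating more concrete, here is a minimal, framework-free sketch. It assumes that a layer's channels are split into equally sized groups and that a throttle level in [0, 1] switches on a prefix of those groups, so a group is active only if all earlier groups are active. The function names and the rounding rule are illustrative assumptions, not the code in `network/` or `nnsearch/`.

```python
import math
from typing import List


def nested_gate_mask(num_groups: int, utilization: float) -> List[int]:
    """Return a 0/1 mask that enables the first ceil(utilization * num_groups) groups."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    active = math.ceil(utilization * num_groups)
    # Nested ordering: the mask always has the form 1...1 0...0.
    return [1 if i < active else 0 for i in range(num_groups)]


def apply_channel_gates(channels: List[float], mask: List[int]) -> List[float]:
    """Zero out every channel whose group is gated off (toy stand-in for a gated layer)."""
    if len(channels) % len(mask) != 0:
        raise ValueError("channel count must be divisible by the number of groups")
    group_size = len(channels) // len(mask)
    return [value * mask[i // group_size] for i, value in enumerate(channels)]


if __name__ == "__main__":
    mask = nested_gate_mask(num_groups=4, utilization=0.5)
    print(mask)  # [1, 1, 0, 0]
    print(apply_channel_gates([1, 2, 3, 4, 5, 6, 7, 8], mask))  # [1, 2, 3, 4, 0, 0, 0, 0]
```

In a real gated convolution, the gated-off groups would be skipped rather than multiplied by zero, which is where the compute savings come from; the multiplication here only illustrates which channels survive.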
The gesture recognition framework is based on a basic C3D network, with a gated version implemented. Some implementations are from here. Only five gesture classes are used.
Currently, the single-hand keypoint detection model is used only for visualizing hand keypoints, but its contextual information could be used to build a data-driven controller. The implementation is based on Global Context for Convolutional Pose Machines, a variant of the original paper Convolutional Pose Machines. Its implementation can be found here. Part of the implementation can also be found here.
The project file structure is listed here:
Folder/File | Description | Used in Demo |
---|---|---|
ckpt/ | Store the trained model files and checkpoints. | Yes |
dataloaders/ | Scripts to pre-process and load different datasets. | No |
dataset/ | Store different datasets. | No |
logs/ | Store training and evaluation logs. | No |
modules/ | Inside, utils.py contains utility functions for the gated network. The others are for keypoint estimation. | Yes |
network/ | Different implementations of neural networks (gated and non-gated). | Yes |
nnsearch/ | Jesse's code for throttleable NNs (with some changes). | Yes |
src/ | Some functions for displaying keypoint heatmap. | No |
visualization/ | Store local visualization outputs, images, etc. | No |
bandit_net.py | Controller network implementation. | Yes |
c3d_train.py | Original training script for video action recognition. | No |
conf.text | Configuration file for training/testing keypoint estimation. | No |
cpm_*.py | Keypoint estimation related scripts; the trained model is used for demo visualization. | No |
demo.py | The demo entry point. | Yes |
demo_train.py | Deprecated. This old demo structure uses keypoint heatmaps for gesture recognition. | No |
demo_train_cpm.py | Deprecated. This is for training gated CPM which is not working as expected. | No |
gate_test.py | Script for checking gated layer implementation. | No |
inference.py | Original inference script for video action recognition. | No |
model_util.py | Samyak's copy of code from Jesse's nnsearch/ folder. | No |
mypath.py | Training configuration and paths for gesture recognition. | No |
policy.py | Samyak's copy of the RL policy code from Jesse's nnsearch/ folder. | No |
qnet.py | Samyak's code example for contextual Q learning network. | No |
raw_c3d_*.py | The training/evaluation scripts for the demo's gesture recognition part. | Train |
shape_flop_util.py | Probably deprecated. Functions to calculate the FLOPs of defined layers. | No |
testthrottle.py | Deprecated. Samyak's original code for the controller. | No |
throttle_train.py | For training the controller network; currently not working well. | No |
train.py | Probably the same as c3d_train.py. Original training script for video action recognition. | No |
weights_experiment.py | Speed test for gated structure using all zero/random weights on CPU/GPU. | No |
The dataset for training the keypoint estimation model is the CMU Hand Dataset; both the real and synthetic data are used.
The dataset for training the gesture recognition model is from 20BN-jester V1.
Demo from Latent AI link.
If you find our work helpful, please cite:
@article{liu2022dynamically,
title={Dynamically throttleable neural networks},
author={Liu, Hengyue and Parajuli, Samyak and Hostetler, Jesse and Chai, Sek and Bhanu, Bir},
journal={Machine Vision and Applications},
volume={33},
number={4},
pages={59},
year={2022},
publisher={Springer}
}