A modified QKeras with support to affine uniform quantization for activation layers and more

LucaUrbinati44/qkeras-mod

qkeras-mod

This repository extends the original QKeras repository, starting from commit 4d61681d71c27a872dce96926a3f20908e0c7854.

The main novelty of this extension is support for affine uniform quantization [1][2][3] of activation layers, provided by a new layer called quantized_bits_featuremaps.
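
As a rough illustration of what affine uniform quantization means (following the general formulation in [1]), the sketch below maps a real-valued activation range onto unsigned integers using a scale and a zero point. The function names and the choice of an unsigned range are assumptions made for this example; this is not the implementation or the interface of quantized_bits_featuremaps.

```python
def affine_params(x_min, x_max, bits):
    """Return (scale, zero_point) mapping [x_min, x_max] onto [0, 2**bits - 1]."""
    q_max = 2 ** bits - 1
    # Widen the range so that 0.0 is always exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    if x_max == x_min:
        return 1.0, 0  # degenerate range: every value is zero
    scale = (x_max - x_min) / q_max
    zero_point = int(round(-x_min / scale))
    zero_point = max(0, min(q_max, zero_point))
    return scale, zero_point


def quantize(x, scale, zero_point, bits):
    """Real value -> integer code, clipped to the representable range."""
    q = int(round(x / scale)) + zero_point
    return max(0, min(2 ** bits - 1, q))


def dequantize(q, scale, zero_point):
    """Integer code -> approximate real value."""
    return scale * (q - zero_point)


# Hypothetical activation values, used only for illustration.
activations = [-1.0, 0.0, 0.5, 3.0]
bits = 8
scale, zero_point = affine_params(min(activations), max(activations), bits)
codes = [quantize(x, scale, zero_point, bits) for x in activations]
restored = [dequantize(q, scale, zero_point) for q in codes]
```

Unlike symmetric quantization, the zero point lets an asymmetric range (such as the output of an activation) use all available integer codes, at the cost of an extra offset term in the integer arithmetic.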

It also modifies AutoQKeras, for example by adding support for Batch Normalization fusion.
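
For readers unfamiliar with Batch Normalization fusion, the sketch below shows the standard arithmetic of folding a batch normalization layer into the preceding dense layer, so that a single layer produces the same output at inference time. It is a generic, pure-Python illustration with made-up helper names and values, and makes no claim about how the modified AutoQKeras code performs the fusion.

```python
import math


def fold_batch_norm(weights, bias, gamma, beta, mean, variance, eps=1e-3):
    """Fold BN parameters into a dense layer.

    weights[i][j] connects input i to output j; the other arguments hold one
    value per output. Returns (folded_weights, folded_bias).
    """
    factors = [g / math.sqrt(v + eps) for g, v in zip(gamma, variance)]
    folded_w = [[w * f for w, f in zip(row, factors)] for row in weights]
    folded_b = [(b - m) * f + bt
                for b, m, f, bt in zip(bias, mean, factors, beta)]
    return folded_w, folded_b


def dense(x, weights, bias):
    """Plain dense layer: y_j = sum_i x_i * w_ij + b_j."""
    return [sum(xi * weights[i][j] for i, xi in enumerate(x)) + bias[j]
            for j in range(len(bias))]


def batch_norm(y, gamma, beta, mean, variance, eps=1e-3):
    """Inference-time batch normalization applied per output."""
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, variance)]


# Hypothetical parameters: 2 inputs, 3 outputs.
w = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
b = [0.1, -0.2, 0.3]
gamma, beta = [1.0, 0.5, 2.0], [0.0, 0.1, -0.1]
mean, variance = [0.2, -0.3, 0.4], [1.0, 0.5, 2.0]
x = [1.0, -2.0]

reference = batch_norm(dense(x, w, b), gamma, beta, mean, variance)
fw, fb = fold_batch_norm(w, b, gamma, beta, mean, variance)
fused = dense(x, fw, fb)
```

Folding before quantization matters because the quantizer then sees the weights and biases that are actually used at inference time.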

All the modifications are explained in [4].

How to start

Create a new conda environment from the provided qkeras-env.yml file, replace the original QKeras files in that environment with the modified ones listed below, and then follow the qkeras-mod-explained.ipynb notebook:

  • autoqkeras/autoqkeras_internal.py -->

    <your_qkeras-env_installation_path>/autoqkeras/autoqkeras_internal.py

  • autoqkeras/forgiving_metrics/forgiving_bits.py -->

    <your_qkeras-env_installation_path>/autoqkeras/forgiving_metrics/forgiving_bits.py

  • qkeras/quantizers.py -->

    <your_qkeras-env_installation_path>/qkeras/quantizers.py

  • qkeras/utils.py -->

    <your_qkeras-env_installation_path>/qkeras/utils.py
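
The four replacements above can also be scripted. The following is a minimal sketch, assuming it is run from a checkout of this repository and that you pass your own installation path; the function name install_modified_files and the .orig backup suffix are choices made for this example and are not part of the repository.

```python
import shutil
from pathlib import Path

# Relative paths of the modified files, as listed in this README.
MODIFIED_FILES = [
    "autoqkeras/autoqkeras_internal.py",
    "autoqkeras/forgiving_metrics/forgiving_bits.py",
    "qkeras/quantizers.py",
    "qkeras/utils.py",
]


def install_modified_files(repo_root, install_root, backup=True):
    """Overwrite the original QKeras files under install_root.

    Raises FileNotFoundError if a source or target file is missing, which
    usually means one of the two paths is wrong.
    """
    repo_root, install_root = Path(repo_root), Path(install_root)
    copied = []
    for rel in MODIFIED_FILES:
        src, dst = repo_root / rel, install_root / rel
        if not src.is_file():
            raise FileNotFoundError(f"modified file not found: {src}")
        if not dst.is_file():
            raise FileNotFoundError(f"original file not found: {dst}")
        if backup:
            # Keep a copy of the original next to it, e.g. utils.py.orig.
            shutil.copy2(dst, dst.with_name(dst.name + ".orig"))
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

A call would look like install_modified_files(".", "<your_qkeras-env_installation_path>"), with the placeholder replaced by the actual path of your environment.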

If you use this code, please cite our work:

@ARTICLE{urbinati2024access,
author={Urbinati, Luca and Casu, Mario R.},
journal={IEEE Access}, 
title={High-Level Design of Precision-Scalable DNN Accelerators Based on Sum-Together Multipliers}, 
year={2024},
volume={12},
number={},
pages={44163-44189},
doi={10.1109/ACCESS.2024.3380472}}

References

[1] B. Jacob et al., "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference," arXiv:1712.05877 [cs, stat], Dec. 2017. Available: http://arxiv.org/abs/1712.05877

[2] L. Mao, "Quantization for Neural Networks," Lei Mao's Log Book, May 17, 2020. Available: https://leimao.github.io/article/Neural-Networks-Quantization/

[3] H. Wu, P. Judd, X. Zhang, M. Isaev, and P. Micikevicius, "Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation," arXiv:2004.09602 [cs, stat], Apr. 2020. Accessed: Dec. 22, 2021. Available: http://arxiv.org/abs/2004.09602

[4] Terlizzi, M. Alessio, "Mixed-precision Quantization and Inference of MLPerf Tiny DNNs on Precision-Scalable Hardware Accelerators", 2023, Politecnico di Torino. Accessed: Dec. 7, 2023. [Online]. Available: https://webthesis.biblio.polito.it/26664/.
