Journal of Xidian University, 2019, Vol. 46, Issue (2): 132-138. DOI: 10.19665/j.issn1001-2400.2019.02.022


Compression algorithm for weight-quantized deep neural network models

CHEN Yun, CAI Xiaodong, LIANG Xiaoxi, WANG Meng

  School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
  • Received: 2018-04-24  Online: 2019-04-20  Published: 2019-04-20
  • Contact: CAI Xiaodong, E-mail: caixiaodong@guet.edu.cn

Abstract:

Deep neural network models contain a large number of weight parameters. In order to reduce the storage space of such models, a compression algorithm based on weight quantization is proposed. During forward propagation, a four-value filter quantizes full-precision weights into the four states 2, 1, -1, and -2, enabling efficient weight encoding. To obtain an accurate four-value weight model, the L2 distance between the full-precision weights and the scaled four-value weights is minimized. To further improve compression, every 16 four-value weights are encoded into a single 32-bit binary number, since each four-value weight requires only 2 bits. Experimental results on the MNIST, CIFAR-10, and CIFAR-100 datasets show that the algorithm achieves the same model compression ratios as the TWN (Ternary Weight Network), namely 6.74%, 6.88%, and 6.62% respectively, while improving accuracy by 0.06%, 0.82%, and 1.51%. The results indicate that the algorithm compresses deep neural network models efficiently and accurately.
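The quantization and packing steps described in the abstract can be illustrated with a short sketch. The NumPy code below is a minimal illustration under stated assumptions, not the paper's implementation: the nearest-state assignment, the grid search over the scaling factor, and the 2-bit code table are all assumptions introduced here, since the abstract does not specify the exact minimization procedure.

import numpy as np

STATES = np.array([-2.0, -1.0, 1.0, 2.0])  # the four quantization states
# Assumed 2-bit code table; the paper's actual encoding is not given here.
CODE = {-2.0: 0b00, -1.0: 0b01, 1.0: 0b10, 2.0: 0b11}

def quantize_four_value(w, num_scales=256):
    """Return (q, alpha) such that alpha * q approximates w.
    Assigns each weight to its nearest scaled state and grid-searches
    the scale alpha to reduce the L2 distance ||w - alpha * q||_2."""
    best_err, best_q, best_alpha = np.inf, None, None
    for alpha in np.linspace(1e-3, np.abs(w).max(), num_scales):
        # Nearest-neighbor assignment to the scaled four-value grid
        q = STATES[np.argmin(np.abs(w[:, None] - alpha * STATES), axis=1)]
        err = np.sum((w - alpha * q) ** 2)
        if err < best_err:
            best_err, best_q, best_alpha = err, q, alpha
    return best_q, best_alpha

def pack16(q16):
    """Pack 16 four-value weights (2 bits each) into one 32-bit number."""
    word = 0
    for i, v in enumerate(q16):
        word |= CODE[float(v)] << (2 * i)
    return word  # 16 weights x 2 bits = 32 bits

w = np.random.randn(16).astype(np.float32)
q, alpha = quantize_four_value(w)
print("scale:", alpha, "packed word:", hex(pack16(q)))

The grid search simply keeps the sketch short; for a fixed assignment q, the closed-form scale alpha = (w · q) / (q · q) would be a natural refinement of the L2 minimization.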

Key words: weight quantization, compression, four-value filter, storage space, full-precision

CLC Number: TP391
