Compression algorithm for weights quantized deep neural network models

doi:10.19665/j.issn1001-2400.2019.02.022

Abstract

Deep neural network models typically contain a large number of weight parameters. To reduce the storage space they occupy, a weight-quantization compression algorithm for deep neural network models is proposed. During forward propagation, a four-value filter quantizes the full-precision weights into four states, 2, 1, -1, and -2, so that the weights can be encoded efficiently. The L₂ distance between the full-precision weights and the scaled four-value weights is minimized to obtain an accurate four-value weight model. To compress the model substantially, every 16 four-value weights are encoded into a single 32-bit binary number. Experiments on the MNIST, CIFAR-10, and CIFAR-100 datasets show that the algorithm achieves model compression rates of 6.74%, 6.88%, and 6.62%, respectively, the same as those of the Ternary Weight Network (TWN), while improving Top-1 accuracy over TWN by 0.06, 0.82, and 1.51 percentage points, respectively. The results indicate that the algorithm provides efficient and accurate compression of deep neural network models.

Key words: weight quantization, compression, four-value filter, storage space, full precision
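The quantization and scaling steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the four-value filter chooses between magnitudes 1 and 2, so the threshold `delta`, the alternating refinement loop, and all function names here are hypothetical. The only fixed ingredient is the least-squares fact that, for a given four-value tensor q, the scale minimizing ‖w − αq‖₂ is α = ⟨w, q⟩ / ⟨q, q⟩.

```python
import numpy as np


def quantize_four_value(w, delta):
    """Map each weight to one of {-2, -1, 1, 2}.

    delta is a hypothetical magnitude threshold (not given in the abstract):
    weights with |w| > delta get magnitude 2, all others magnitude 1.
    """
    w = np.asarray(w, dtype=np.float64)
    sign = np.where(w >= 0, 1.0, -1.0)  # zero is sent to +1 by convention here
    magnitude = np.where(np.abs(w) > delta, 2.0, 1.0)
    return sign * magnitude


def optimal_scale(w, q):
    """Closed-form least-squares scale: argmin_alpha ||w - alpha * q||_2."""
    w = np.asarray(w, dtype=np.float64).ravel()
    q = np.asarray(q, dtype=np.float64).ravel()
    return float(np.dot(w, q) / np.dot(q, q))


def fit_four_value(w, iters=10):
    """Alternate between the scale and the nearest-level assignment.

    Assumed refinement scheme: for a positive scale alpha, the nearest of the
    levels alpha and 2 * alpha switches at |w| = 1.5 * alpha, so the
    threshold is reset to that value after each scale update.
    """
    w = np.asarray(w, dtype=np.float64)
    delta = np.mean(np.abs(w))  # assumed starting threshold
    for _ in range(iters):
        q = quantize_four_value(w, delta)
        alpha = optimal_scale(w, q)
        delta = 1.5 * alpha
    return q, alpha


weights = np.array([0.9, -0.4, 0.1, -1.1])
q, alpha = fit_four_value(weights)
error = np.linalg.norm(weights - alpha * q)  # L2 distance being minimized
print(q, alpha, error)
```

At inference time only `q` and the single scalar `alpha` would need to be stored, which is what makes a compact 2-bit encoding of `q` possible.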

CLC number: TP391

CHEN Yun, CAI Xiaodong, LIANG Xiaoxi, WANG Meng. Compression algorithm for weights quantized deep neural network models[J]. Journal of Xidian University, 2019, 46(2): 132-138.

Figures/Tables: 11 (Tables 1-6, Figures 1-5)

References (21)

[1]	SZEGEDY C, LIU W, JIA Y, et al. Going Deeper with Convolutions[C]// Proceedings of the 2015 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Washington: IEEE Computer Society, 2015: 1-9.
[2]	HINTON G, DENG L, YU D, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition[J]. IEEE Signal Processing Magazine, 2012, 29(6): 82-97.
[3]	HUANG Z, SINISCALCHI S M, LEE C H. A Unified Approach to Transfer Learning of Deep Neural Networks with Applications to Speaker Adaptation in Automatic Speech Recognition[J]. Neurocomputing, 2016, 218: 448-459. doi: 10.1016/j.neucom.2016.09.018
[4]	HE K M, ZHANG X Y, REN S Q, et al. Deep Residual Learning for Image Recognition[C]// Proceedings of the 2016 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Washington: IEEE Computer Society, 2016: 770-778.
[5]	LI W, ZHAO R, XIAO T, et al. DeepReID: Deep Filter Pairing Neural Network for Person Re-identification[C]// Proceedings of the 2014 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Washington: IEEE Computer Society, 2014: 152-159.
[6]	REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: Towards Real-time Object Detection with Region Proposal Networks[C]// Advances in Neural Information Processing Systems: 2015. Vancouver, Canada: Neural Information Processing Systems Foundation, 2015: 91-99.
[7]	LIU W, ANGUELOV D, ERHAN D, et al. SSD: Single Shot Multibox Detector[C]// Lecture Notes in Computer Science: 9905. Heidelberg: Springer Verlag, 2016: 21-37.
[8]	KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet Classification with Deep Convolutional Neural Networks[C]// Advances in Neural Information Processing Systems: 2012. Vancouver, Canada: Neural Information Processing Systems Foundation, 2012: 1097-1105.
[9]	IOFFE S, SZEGEDY C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift[C]// Proceedings of the 2015 32nd International Conference on Machine Learning. Lille: International Machine Learning Society, 2015: 448-456.
[10]	SZEGEDY C, IOFFE S, VANHOUCKE V, et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning[C]// Proceedings of the 2017 31st AAAI Conference on Artificial Intelligence. Palo Alto: AAAI, 2017: 4278-4284.
[11]	HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely Connected Convolutional Networks[C]// Proceedings of the 2017 30th IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 2261-2269.
[12]	XIE S N, GIRSHICK R, DOLLAR P, et al. Aggregated Residual Transformations for Deep Neural Networks[C]// Proceedings of the 2017 30th IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 5987-5995.
[13]	HAN S, MAO H Z, DALLY W J. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding[CP/OL]. [2018-04-12]. https://arxiv.org/pdf/1510.00149.pdf.
[14]	HAN S, POOL J, TRAN J, et al. Learning Both Weights and Connections for Efficient Neural Network[C]// Advances in Neural Information Processing Systems: 2015. Vancouver, Canada: Neural Information Processing Systems Foundation, 2015: 1135-1143.
[15]	CHOLLET F. Xception: Deep Learning with Depthwise Separable Convolutions[C]// Proceedings of the 2017 30th IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 1800-1807.
[16]	IANDOLA F N, MOSKEWICZ M W, ASHRAF K, et al. SqueezeNet: AlexNet-level Accuracy with 50x Fewer Parameters and <1MB Model Size[CP/OL]. [2018-04-12]. https://arxiv.org/pdf/1602.07360v2.pdf.
[17]	HUBARA I, COURBARIAUX M, SOUDRY D, et al. Binarized Neural Networks[C]// Advances in Neural Information Processing Systems: 2016. Vancouver, Canada: Neural Information Processing Systems Foundation, 2016: 4114-4122.
[18]	COURBARIAUX M, BENGIO Y, DAVID J P. BinaryConnect: Training Deep Neural Networks with Binary Weights during Propagations[C]// Advances in Neural Information Processing Systems: 2015. Vancouver, Canada: Neural Information Processing Systems Foundation, 2015: 3123-3131.
[19]	RASTEGARI M, ORDONEZ V, REDMON J, et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks[C]// Lecture Notes in Computer Science: 9908. Heidelberg: Springer Verlag, 2016: 525-542.
[20]	LI F F, ZHANG B, LIU B. Ternary Weight Networks[CP/OL]. [2018-04-12]. https://arxiv.org/pdf/1605.04711.pdf.
[21]	LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based Learning Applied to Document Recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324. doi: 10.1109/5.726791

Hyperparameter	Value
Initial learning rate	0.01
Weight decay	0.0005
Momentum	0.9
Training images per iteration (batch size)	50

Algorithm	Top-1/%	Compression rate/%
Full-precision network	99.41	-
Binary weight network^[19]	98.81	3.15
Ternary weight network^[20]	99.35	6.74
Proposed algorithm	99.41	6.74

Hyperparameter	Value
Initial learning rate	0.01
Weight decay	0.0001
Momentum	0.9
Training images per iteration (batch size)	100

Algorithm	Top-1/%	Compression rate/%
Full-precision network	81.22	-
Binary weight network^[19]	27.28	3.49
Ternary weight network^[20]	78.82	6.88
Proposed algorithm	79.64	6.88

Algorithm	Top-1/%	Compression rate/%
Full-precision network	37.84	-
Binary weight network^[19]	8.25	3.31
Ternary weight network^[20]	34.68	6.62
Proposed algorithm	36.19	6.62
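The compression rates in these tables are consistent with the 2-bit encoding described in the abstract, where 16 four-value weights share one 32-bit number. The sketch below illustrates such a packing. The bit layout and the mapping from weight values to 2-bit codes are assumptions (the abstract does not specify them), and `pack16` and `unpack16` are hypothetical names.

```python
# Hypothetical assignment of 2-bit codes to the four weight states.
CODE_OF = {-2: 0b00, -1: 0b01, 1: 0b10, 2: 0b11}
VALUE_OF = {code: value for value, code in CODE_OF.items()}


def pack16(weights):
    """Pack exactly 16 four-value weights into one 32-bit unsigned integer.

    Weight i occupies bits 2*i and 2*i + 1 (an assumed layout).
    A value outside {-2, -1, 1, 2} raises KeyError.
    """
    if len(weights) != 16:
        raise ValueError("expected exactly 16 weights")
    word = 0
    for i, w in enumerate(weights):
        word |= CODE_OF[w] << (2 * i)
    return word


def unpack16(word):
    """Recover the 16 four-value weights from a packed 32-bit word."""
    return [VALUE_OF[(word >> (2 * i)) & 0b11] for i in range(16)]


weights = [2, 1, -1, -2] * 4
word = pack16(weights)
assert unpack16(word) == weights

# Ideal storage ratio against 32-bit full-precision weights.
bits_per_weight = 32 / 16
ideal_ratio = bits_per_weight / 32
print(f"{ideal_ratio:.2%}")
```

Two bits per weight against 32 bits gives an ideal ratio of 6.25%, slightly below the reported 6.62% to 6.88%; the source does not break down the remainder. Ternary weights also need two bits each, which fits the identical compression rates reported for the ternary weight network.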