Journal of Xidian University ›› 2021, Vol. 48 ›› Issue (6): 23-31. doi: 10.19665/j.issn1001-2400.2021.06.004
• Special Column on Architectures and Key Software Technologies of Intelligent Embedded Systems •
Received:
2021-06-30
Online:
2021-12-20
Published:
2022-02-24
Contact:
KOU Xiaoli, LI Yanni
About the author:
LIU Jiawei (1998—), male, M.S. candidate at Xidian University. E-mail: Supported by:
Abstract:
Deep learning has achieved great success in a wide range of application domains, but the robustness and performance of deep neural network models are highly vulnerable to adversarial examples carrying subtle perturbations. Existing denoising-based defense algorithms destroy useful information in clean samples and thus degrade the model's classification accuracy. To overcome this defect, a new enhanced defense algorithm against adversarial-example attacks is proposed, built on an enhanced input denoiser added to the target model and on a hidden-layer restorer, derived from convex-hull theory, that recovers the information of clean samples damaged by denoising. First, a denoiser is trained at the input layer of the model; its input is the union of clean samples and adversarial examples, so that the denoiser removes adversarial perturbations while avoiding forgetting the clean samples. Second, since the denoiser inevitably damages information carried by clean samples, a restorer is trained in the hidden layers of the model; its input is a convex combination of the hidden vectors of clean samples and adversarial examples, and the restorer is expected to remap samples lying in the wrong classification region back into the correct one, thereby yielding a more robust model. Extensive comparative simulation experiments on several standard datasets show that the proposed denoiser and restorer effectively improve model robustness, and that their defense performance against adversarial examples is superior to that of many existing representative defense algorithms.
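The two components described in the abstract, an input-layer denoiser (ID) trained on the union of clean and adversarial samples and a hidden-layer restorer (HIR) fed convex combinations of hidden vectors, can be sketched in NumPy as follows. This is an illustrative reconstruction, not the authors' code; the function names (`denoiser_training_batch`, `restorer_input`) and the regression-target choice are our own assumptions.

```python
import numpy as np

def denoiser_training_batch(clean, adversarial):
    """Input-layer denoiser (ID): train on the union of clean and
    adversarial samples, so the denoiser learns to strip perturbations
    without 'forgetting' how to pass clean samples through unchanged."""
    batch = np.concatenate([clean, adversarial], axis=0)
    # Regression targets: every sample should map back to its clean original.
    targets = np.concatenate([clean, clean], axis=0)
    return batch, targets

def restorer_input(h_clean, h_adv, lam):
    """Hidden-layer restorer (HIR) input: a convex combination of the
    hidden vectors of a clean sample and its adversarial counterpart.
    For lam in [0, 1] the mixture stays on the segment between the two
    representations, i.e. inside their convex hull."""
    assert 0.0 <= lam <= 1.0
    return lam * h_clean + (1.0 - lam) * h_adv
```

With `lam = 1` the restorer sees a purely clean hidden vector and with `lam = 0` a purely adversarial one, so a single restorer is trained over the whole segment between the two representations.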
CLC number:
LIU Jiawei, ZHANG Wenhui, KOU Xiaoli, LI Yanni. Harnessing adversarial examples via input denoising and hidden information restoring[J]. Journal of Xidian University, 2021, 48(6): 23-31.
Table 1 Comparison of adversarial robustness of the ID+HIR algorithm on the small-image CIFAR-10 dataset (classification accuracy, %)

| Defense | Clean (VGG16) | Clean (ResNet18) | FGSM ε=4/8/16 (VGG16) | FGSM (ResNet18) | CW ε=4/8/16 (VGG16) | CW (ResNet18) | BIM ε=4/8/16 (VGG16) | BIM (ResNet18) |
|---|---|---|---|---|---|---|---|---|
| No defense | 92.6 | 93.2 | 80.3/64.6/46.1 | 82.8/67.6/45.2 | 83.5/77.0/69.3 | 84.5/76.2/70.3 | 82.7/57.1/30.9 | 83.2/55.1/31.5 |
| NRP | 79.4 | 78.7 | 76.7/74.9/72.2 | 74.8/72.5/68.1 | 78.5/78.5/77.5 | 77.1/76.7/75.8 | 77.4/74.8/71.4 | 75.6/72.5/68.3 |
| Fast AT | 81.5 | 83.8 | 88.1/78.7/56.0 | 89.0/79.1/55.0 | 91.8/90.8/87.8 | 91.2/90.7/87.4 | 90.1/90.8/87.1 | 91.0/90.7/87.5 |
| ComDefend | 85.2 | 87.1 | 82.7/78.8/70.8 | 82.2/77.3/69.9 | 84.5/84.5/82.6 | 84.6/84.5/81.9 | 83.2/80.3/76.6 | 84.0/79.8/74.9 |
| FS | 91.0 | 92.0 | 82.5/65.3/48.8 | 83.4/66.7/49.1 | 89.5/88.1/83.3 | 89.7/88.9/82.7 | 85.2/62.2/41.6 | 86.3/66.9/43.9 |
| JPEG | 71.0 | 74.1 | 66.6/63.0/54.3 | 66.2/64.1/55.0 | 70.1/69.8/68.2 | 69.9/69.9/68.5 | 67.5/64.2/60.2 | 68.2/63.1/59.4 |
| JPEG | 80.2 | 82.3 | 74.4/68.2/51.0 | 73.5/69.2/51.8 | 79.0/77.8/75.1 | 79.2/77.2/74.4 | 75.7/70.5/63.3 | 76.3/71.2/64.3 |
| JPEG | 85.9 | 84.9 | 78.4/68.5/47.5 | 79.2/69.7/48.1 | 84.8/83.0/78.5 | 85.9/84.2/78.8 | 80.5/72.9/58.8 | 81.0/73.4/58.9 |
| TVM | 88.8 | 86.9 | 81.5/76.7/52.7 | 80.5/75.2/51.8 | 87.0/86.2/81.8 | 87.8/86.6/81.0 | 83.1/74.0/60.3 | 82.2/73.8/59.7 |
| ID+HIR (ours) | 91.2 | 92.1 | 89.7/86.8/73.8 | 90.0/86.2/75.4 | 90.8/90.9/90.6 | 91.2/90.8/90.2 | 90.4/88.7/82.6 | 90.2/88.0/83.1 |
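For reference, the FGSM attack evaluated in the tables perturbs each pixel by one step along the sign of the loss gradient. A minimal NumPy sketch, assuming ε is given on the 0-255 pixel scale as in the table headers and that images are normalized to [0, 1]:

```python
import numpy as np

def fgsm(x, grad, eps):
    """Fast Gradient Sign Method: one step of size eps along the sign of
    the loss gradient, then clip back to the valid pixel range [0, 1].
    eps is given in 0-255 pixel units, matching the table headers."""
    x_adv = x + (eps / 255.0) * np.sign(grad)
    return np.clip(x_adv, 0.0, 1.0)
```

The gradient `grad` would come from backpropagating the classification loss through the target model; it is passed in here so the sketch stays framework-free.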
Table 2 Comparison of adversarial robustness of the ID+HIR algorithm on the CIFAR-100 dataset (classification accuracy, %)

| Defense | Clean (VGG16) | Clean (ResNet18) | FGSM ε=4/8/16 (VGG16) | FGSM (ResNet18) | CW ε=4/8/16 (VGG16) | CW (ResNet18) | BIM ε=4/8/16 (VGG16) | BIM (ResNet18) |
|---|---|---|---|---|---|---|---|---|
| No defense | 72.6 | 76.4 | 52.1/45.3/39.6 | 51.3/42.4/37.9 | 63.2/61.4/54.8 | 62.0/59.3/51.3 | 53.4/50.1/29.3 | 52.6/48.4/28.2 |
| NRP | 65.2 | 65.4 | 65.4/62.1/61.2 | 69.2/68.5/68.3 | 68.1/68.7/66.3 | 71.8/70.9/70.3 | 69.3/67.7/59.3 | 72.1/70.7/66.4 |
| Fast AT | 65.8 | 70.1 | 70.2/62.3/55.1 | 74.1/70.2/65.2 | 72.1/70.9/69.2 | 74.2/74.1/73.6 | 72.0/70.3/68.2 | 75.5/73.7/72.9 |
| ComDefend | 68.7 | 71.4 | 66.2/61.4/60.2 | 70.2/69.8/67.2 | 70.0/69.7/65.2 | 73.8/71.2/69.1 | 68.2/66.3/60.1 | 73.2/71.7/67.5 |
| FS | 71.3 | 74.1 | 67.2/57.1/54.2 | 54.7/44.5/40.1 | 70.2/70.1/70.5 | 70.4/67.2/65.6 | 68.3/62.5/57.3 | 71.3/69.5/64.1 |
| JPEG | 55.4 | 55.7 | 47.2/42.3/39.1 | 49.2/41.3/39.1 | 57.2/57.1/56.8 | 56.1/54.1/50.4 | 57.7/55.4/50.0 | 58.2/56.2/51.2 |
| JPEG | 58.8 | 60.5 | 49.2/43.7/38.2 | 51.5/44.7/42.7 | 64.2/63.4/62.7 | 65.2/62.6/62.1 | 66.0/61.6/58.2 | 65.1/62.4/59.7 |
| JPEG | 68.2 | 71.8 | 58.9/47.2/45.2 | 55.2/47.5/45.4 | 69.3/68.9/67.2 | 68.3/67.3/63.5 | 67.3/58.2/45.5 | 63.3/59.2/47.5 |
| TVM | 69.1 | 72.3 | 61.7/51.2/49.7 | 57.1/49.3/51.6 | 70.1/68.2/67.7 | 69.8/68.2/63.1 | 67.9/61.3/50.2 | 64.2/60.8/51.7 |
| ID+HIR (ours) | 71.8 | 74.8 | 71.2/68.3/63.7 | 75.5/72.3/68.7 | 72.2/71.2/70.6 | 74.9/74.7/74.2 | 71.5/69.8/64.5 | 75.5/73.1/69.3 |
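BIM, the iterated attack in the tables, repeats the FGSM step with a small step size and projects the result back into the ε-ball around the original image after every step. A hedged NumPy sketch; the step size `alpha` and the `grad_fn` callback are illustrative assumptions, not values from the paper:

```python
import numpy as np

def bim(x, grad_fn, eps, alpha, steps):
    """Basic Iterative Method: repeated signed-gradient steps of size
    alpha, each followed by projection into the eps-ball around the
    original image x and into the valid pixel range [0, 1]."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay within the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay a valid image
    return x_adv
```

With enough steps the iterate saturates at the ε-ball boundary, which is why BIM at the same ε is usually a stronger attack than single-step FGSM, as the "No defense" rows above reflect.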
Table 3 Ablation results for the HIR module (classification accuracy, %)

| Defense | FGSM ε=4/8/16 (CIFAR-10) | FGSM (SVHN) | FGSM (MNIST) | CW ε=4/8/16 (CIFAR-10) | CW (SVHN) | CW (MNIST) |
|---|---|---|---|---|---|---|
| No defense | 52.7/44.0/34.0 | 52.1/40.0/27.4 | -/-/41.2 | 17.1/14.2/9.0 | 37.1/17.2/9.5 | -/-/74.3 |
| ID | 86.7/90.5/92.2 | 92.6/95.3/96.1 | -/-/98.9 | 81.8/82.1/73.2 | 95.8/93.5/94.6 | -/-/98.0 |
| ID+HIR | 88.3/91.5/92.2 | 93.5/96.2/96.4 | -/-/99.1 | 85.0/85.2/90.0 | 96.2/94.8/95.7 | -/-/98.7 |
[1] JOSHI A, MUKHERJEE A, SARKAR S, et al. Semantic Adversarial Attacks: Parametric Transformations That Fool Deep Classifiers[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway: IEEE, 2019: 4773-4783.
[2] JIA R, LIANG P. Adversarial Examples for Evaluating Reading Comprehension Systems[C]// Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg: ACL, 2017: 2021-2031.
[3] SZEGEDY C, ZAREMBA W, SUTSKEVER I, et al. Intriguing Properties of Neural Networks[C]// Proceedings of the 2nd International Conference on Learning Representations (ICLR). La Jolla: ICLR, 2014: 1-10.
[4] MADRY A, MAKELOV A, SCHMIDT L, et al. Towards Deep Learning Models Resistant to Adversarial Attacks[C]// Proceedings of the 6th International Conference on Learning Representations (ICLR). La Jolla: ICLR, 2018: 1-28.
[5] SHAFAHI A, HUANG W R, STUDER C, et al. Are Adversarial Examples Inevitable?[C]// Proceedings of the 7th International Conference on Learning Representations (ICLR). La Jolla: ICLR, 2019: 1-17.
[6] SHAFAHI A, NAJIBI M, GHIASI A, et al. Adversarial Training for Free![C]// Proceedings of the 33rd International Conference on Neural Information Processing Systems (NIPS). New York: ACM, 2019: 3358-3369.
[7] WANG Y, ZOU D, YI J, et al. Improving Adversarial Robustness Requires Revisiting Misclassified Examples[C]// Proceedings of the 7th International Conference on Learning Representations (ICLR). La Jolla: ICLR, 2019: 1-14.
[8] ZHENG H, ZHANG Z, GU J, et al. Efficient Adversarial Training with Transferable Adversarial Examples[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2020: 1181-1190.
[9] DING G W, SHARMA Y, LUI K Y C, et al. MMA Training: Direct Input Space Margin Maximization through Adversarial Training[C]// Proceedings of the 7th International Conference on Learning Representations (ICLR). La Jolla: ICLR, 2019: 1-28.
[10] JIA X, WEI X, CAO X, et al. ComDefend: An Efficient Image Compression Model to Defend Adversarial Examples[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2019: 6084-6092.
[11] LIAO F, LIANG M, DONG Y, et al. Defense Against Adversarial Attacks Using High-Level Representation Guided Denoiser[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2018: 1778-1787.
[12] SONG Y, KIM T, NOWOZIN S, et al. PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples[C]// Proceedings of the 6th International Conference on Learning Representations (ICLR). La Jolla: ICLR, 2018: 1-20.
[13] GU S, RIGAZIO L. Towards Deep Neural Network Architectures Robust to Adversarial Examples[C]// Proceedings of the 3rd International Conference on Learning Representations (ICLR). La Jolla: ICLR, 2015: 1-9.
[14] XU W, EVANS D, QI Y. Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks[C]// Proceedings of the 25th Annual Network and Distributed System Security Symposium (NDSS). 2018.
[15] GUO C, RANA M, CISSE M, et al. Countering Adversarial Images Using Input Transformations[C]// Proceedings of the 6th International Conference on Learning Representations (ICLR). La Jolla: ICLR, 2018: 1-12.
[16] NASEER M, KHAN S, HAYAT M, et al. A Self-Supervised Approach for Adversarial Robustness[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2020: 262-271.
[17] ZHANG Shudong, GAO Haichang, CAO Xiwen, et al. Adaptive Fast and Targeted Adversarial Attack for Speech Recognition[J]. Journal of Xidian University, 2021, 48(1): 168-175. (in Chinese)
[18] XIE C, WU Y, MAATEN L, et al. Feature Denoising for Improving Adversarial Robustness[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2019: 501-509.
[19] SALMAN H, SUN M, YANG G, et al. Denoised Smoothing: A Provable Defense for Pretrained Classifiers[C]// Proceedings of the 33rd Conference on Neural Information Processing Systems (NIPS). New York: ACM, 2020: 21945-21957.
[20] JEONG J, SHIN J. Consistency Regularization for Certified Robustness of Smoothed Classifiers[C]// Proceedings of the 33rd Conference on Neural Information Processing Systems (NIPS). New York: ACM, 2020: 6-12.
[21] RONNEBERGER O, FISCHER P, BROX T. U-Net: Convolutional Networks for Biomedical Image Segmentation[C]// Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI). Heidelberg: Springer, 2015: 234-241.
[22] ZHANG H, CISSE M, DAUPHIN Y N, et al. Mixup: Beyond Empirical Risk Minimization[C]// Proceedings of the 6th International Conference on Learning Representations (ICLR). La Jolla: ICLR, 2018: 1-13.
[23] CHEN Kaizhou. Optimization Methods[M]. Xi'an: Northwest Telecommunication Engineering Institute Press, 1985: 22. (in Chinese)
[24] HE K, ZHANG X, REN S, et al. Deep Residual Learning for Image Recognition[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2016: 770-778.
[25] GOODFELLOW I J, SHLENS J, SZEGEDY C. Explaining and Harnessing Adversarial Examples[C]// Proceedings of the 3rd International Conference on Learning Representations (ICLR). La Jolla: ICLR, 2015: 1-11.
[26] CARLINI N, WAGNER D. Towards Evaluating the Robustness of Neural Networks[C]// Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP). Piscataway: IEEE, 2017: 39-57.
[27] KOS J, FISCHER I, SONG D. Adversarial Examples for Generative Models[C]// Proceedings of the 2018 IEEE Security and Privacy Workshops (SPW). Piscataway: IEEE, 2018: 36-42.
[28] LIU S, DENG W. Very Deep Convolutional Neural Network Based Image Classification Using Small Training Sample Size[C]// Proceedings of the 3rd IAPR Asian Conference on Pattern Recognition (ACPR). Piscataway: IEEE, 2015: 730-734.
[29] WONG E, RICE L, KOLTER J Z. Fast Is Better than Free: Revisiting Adversarial Training[C]// Proceedings of the 8th International Conference on Learning Representations (ICLR). La Jolla: ICLR, 2020: 1-17.