Journal of Xidian University, 2021, Vol. 48, Issue (6): 23-31. doi: 10.19665/j.issn1001-2400.2021.06.004

• Special Column on Key Technologies of Intelligent Embedded System Architecture and Software •

An Enhanced Defense Algorithm Against Deep Adversarial Example Attacks

LIU Jiawei, ZHANG Wenhui, KOU Xiaoli, LI Yanni

  1. School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
  • Received: 2021-06-30 Online: 2021-12-20 Published: 2022-02-24
  • Corresponding authors: KOU Xiaoli, LI Yanni
  • About the authors: LIU Jiawei (1998-), male, M.S. student at Xidian University, E-mail: liujw@stu.xidian.edu.cn; ZHANG Wenhui (1996-), male, M.S. student at Xidian University, E-mail: wenhui110920@gmail.com
  • Supported by:
    the General Program of the National Natural Science Foundation of China (61472296)

Harnessing adversarial examples via input denoising and hidden information restoring

LIU Jiawei, ZHANG Wenhui, KOU Xiaoli, LI Yanni

  1. School of Computer Science and Technology,Xidian University,Xi’an 710071,China
  • Received: 2021-06-30 Online: 2021-12-20 Published: 2022-02-24
  • Contact: KOU Xiaoli, LI Yanni

Abstract:

Deep learning has achieved great success in a wide range of applications, but the robustness and performance of deep neural network models are highly vulnerable to attacks by adversarial examples carrying subtle perturbations. Existing denoising-based defense algorithms tend to destroy useful information in clean samples and thereby degrade the classification accuracy of the model. To address this defect, a new enhanced defense algorithm against adversarial example attacks is proposed, which adds an enhanced input denoiser to the target model together with a hidden-layer restorer, derived from convex hull theory, that recovers the lossy information of clean samples. The algorithm first trains a denoiser at the input layer of the model; the input of the denoiser is the union of clean and adversarial samples, and the denoiser is expected to remove adversarial perturbations while avoiding forgetting the clean samples. Second, since the denoiser inevitably damages useful information contained in the clean samples, a restorer is trained in a hidden layer of the model; the input of the restorer is a convex combination of the hidden vectors of clean and adversarial samples, and the restorer is expected to remap samples lying in the wrong classification space back to the correct classification space, so that a more robust model is obtained. Extensive comparative simulation experiments on several standard datasets show that the proposed denoiser and restorer can effectively improve the robustness of the model, and that their defense performance against adversarial examples is superior to that of many existing representative defense algorithms.
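
To make the first training stage concrete, the following is a minimal PyTorch-style sketch of the input-denoiser step described above. It is an illustration only: the names make_adversarial, denoiser and target_model are hypothetical placeholders, and the paper's actual denoiser architecture, attack method and loss are not specified in this abstract.

import torch
import torch.nn as nn

def train_denoiser_epoch(denoiser, target_model, loader, optimizer, device="cpu"):
    """One epoch of denoiser training on the union of clean and adversarial samples."""
    ce = nn.CrossEntropyLoss()
    denoiser.train()
    target_model.eval()  # the optimizer is assumed to hold only the denoiser's parameters
    for x_clean, y in loader:
        x_clean, y = x_clean.to(device), y.to(device)
        # Craft adversarial counterparts with any attack (e.g. PGD); hypothetical placeholder call.
        x_adv = make_adversarial(target_model, x_clean, y)
        # Union of clean and adversarial samples: the denoiser must strip the
        # perturbations without "forgetting" how to pass clean inputs through.
        x_all = torch.cat([x_clean, x_adv], dim=0)
        y_all = torch.cat([y, y], dim=0)
        logits = target_model(denoiser(x_all))
        loss = ce(logits, y_all)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

In this reading, pushing both halves of the batch through the same classification loss is what keeps the denoiser from over-fitting to adversarial inputs and forgetting the clean ones.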

Keywords: deep learning, adversarial examples, input denoiser, hidden-layer information restorer

Abstract:

Although deep learning has achieved great success in various applications, deep neural networks (DNNs) are vulnerable to attacks by adversarial samples with imperceptible perturbations, which greatly degrades the robustness and performance of DNNs. To overcome the weakness of existing denoising algorithms against adversarial samples, which destroy useful information in clean samples and thus reduce the classification accuracy of the model, this paper presents a novel enhanced denoising algorithm ID+HIR (Input Denoising and Hidden Information Restoring) for adversarial samples. ID+HIR consists of an enhanced input denoiser and a hidden lossy-information restorer based on convex hull theory. The algorithm first trains a denoiser at the input layer of the model, with the input of the denoiser being the concatenation of clean and adversarial samples, so that the denoiser is expected to remove the adversarial perturbations while avoiding forgetting the clean samples. Since the denoiser also damages useful information contained in the clean samples, a restorer is trained in the hidden layer of the model, with the input of the restorer being a convex combination of the hidden vectors of the clean and adversarial samples; the restorer is expected to remap samples located in the incorrect classification space back to the correct classification space, thus yielding a more robust model. Extensive comparative simulation experiments on several benchmark datasets show that the proposed denoiser and restorer can effectively improve the robustness of the model and that ID+HIR outperforms the competitive baselines.
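
As an illustration of the hidden-layer restorer, the sketch below shows a single training step under the assumption that the target model can be split into a feature extractor and a classifier head and that the hidden vectors are flattened to shape (batch, dim); the names feature_extractor, classifier and restorer are hypothetical, and the per-example sampling of the convex-combination coefficient is an assumption rather than the paper's prescription.

import torch
import torch.nn as nn

def train_restorer_step(restorer, feature_extractor, classifier,
                        x_clean, x_adv, y, optimizer):
    """One training step: restore a convex mix of clean/adversarial hidden vectors."""
    ce = nn.CrossEntropyLoss()
    with torch.no_grad():                      # the backbone stays frozen in this sketch
        h_clean = feature_extractor(x_clean)   # hidden vectors of clean samples, (B, D)
        h_adv = feature_extractor(x_adv)       # hidden vectors of adversarial samples, (B, D)
    # Convex combination: coefficients are non-negative and sum to one.
    lam = torch.rand(h_clean.size(0), 1, device=h_clean.device)
    h_mix = lam * h_clean + (1.0 - lam) * h_adv
    # The restorer should map the mixed (possibly misclassified) representation
    # back into the region where the classifier predicts the true label.
    logits = classifier(restorer(h_mix))
    loss = ce(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Sampling the coefficient per example covers the whole line segment between the two hidden vectors, which matches the convex-hull intuition in the abstract; a fixed coefficient would be a weaker special case.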

Key words: deep learning, adversarial samples, input denoising, hidden information restoring

CLC number:

  • TP183