Journal of Xidian University ›› 2024, Vol. 51 ›› Issue (1): 135-146. doi: 10.19665/j.issn1001-2400.20230213

• Computer Science and Technology •

Real-world image tampering localization combining the self-attention mechanism and convolutional neural networks

ZHONG Hao1,2, BIAN Shan1,2,3, WANG Chuntao1,2

  1. College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
  2. Guangzhou Key Laboratory of Intelligent Agriculture, Guangzhou 510642, China
  3. Guangdong Provincial Key Laboratory of Information Security Technology, Guangzhou 510006, China
  • Received: 2022-12-07 Online: 2023-09-06 Published: 2023-09-06
  • Corresponding author: BIAN Shan (born 1986), female, associate professor, Ph.D., E-mail: bianshan@scau.edu.cn
  • About the authors: ZHONG Hao (born 1995), male, master's student at South China Agricultural University, E-mail: zhneo@outlook.com
    WANG Chuntao (born 1979), male, professor, Ph.D., E-mail: wangct@scau.edu.cn
  • Funding:
    National Natural Science Foundation of China (61702199, 62172165, 61872152); Guangdong Major Project of Basic and Applied Basic Research (2019B030302008); Natural Science Foundation of Guangdong Province (2022A1515010325); Guangzhou Science and Technology Program (202102020582, 201902010081)

Abstract:

Images are an important carrier of information in the era of the mobile Internet, which makes malicious image tampering a potential cybersecurity threat. Unlike object-scale tampering in natural scenes, real-world image tampering appears in forged qualification certificates, documents, screenshots, and similar material. Such images usually undergo careful manual manipulation, so their tampering features differ from those of natural scenes and are more diverse, which makes localizing the tampered regions more challenging. Rich dependency information is important for handling these complex and diverse tampering features. The proposed method therefore uses a convolutional neural network for adaptive feature extraction, applies reversely connected full self-attention modules for multi-stage feature attention, and fuses the multi-stage attention results to localize the tampered region. On the real-world image tampering localization task, the proposed method outperforms the comparison methods: its F1 score is about 8.98% higher than that of the mainstream method MVSS-Net, and its AUC is about 3.58% higher. On the natural-scene image tampering localization task, the method matches the performance of mainstream methods, and the results provide evidence that natural-scene and real-world tampering features differ. Experimental results in both settings show that the proposed method can effectively localize the tampered regions of tampered images and that its advantage is more pronounced in complex real-world scenes.

Key words: image tampering localization, fake detection, digital image forensics, computer vision, self-attention mechanism, convolutional neural networks
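
The abstract reports results in terms of F1 and AUC. As a minimal sketch of how such metrics are commonly computed for localization, assuming they are evaluated per pixel against a binary ground-truth mask (the abstract does not state the exact protocol), the following Python functions may help. The names `pixel_f1` and `pixel_auc` are hypothetical and are not taken from the paper.

```python
def pixel_f1(pred_mask, true_mask):
    """Pixel-level F1 between two binary masks (nested lists of 0/1).

    Assumption: a pixel value of 1 marks a tampered pixel.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1          # tampered pixel correctly flagged
            elif p and not t:
                fp += 1          # authentic pixel wrongly flagged
            elif t and not p:
                fn += 1          # tampered pixel missed
    if tp == 0:
        return 0.0               # no overlap, so F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pixel_auc(scores, labels):
    """AUC from flat lists of per-pixel scores and 0/1 labels.

    Uses the rank interpretation of AUC: the probability that a randomly
    chosen tampered pixel scores higher than a randomly chosen authentic
    pixel, with ties counted as one half. Quadratic time, for illustration.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one pixel of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 0, 0]]
    true = [[1, 0, 0], [1, 0, 0]]
    print(pixel_f1(pred, true))
    print(pixel_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

F1 depends on the threshold used to binarize the predicted map, whereas AUC is threshold-free, which is one reason localization results are often reported with both.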

CLC number:

  • TP391.41