Journal of Xidian University ›› 2022, Vol. 49 ›› Issue (6): 76-85. doi: 10.19665/j.issn1001-2400.2022.06.010

• Computer Science and Technology & Artificial Intelligence •

Online classification jointed RGBT tracking based on the dual attention Siamese network

ZHANG Zhaoyu1, TIAN Chunna1, ZHOU Heng1, TIAN Xilan2

  1. School of Electronic Engineering, Xidian University, Xi'an 710071, China
  2. Digital Technology R&D Center, the 38th Research Institute of China Electronics Technology Group Corporation, Hefei 230088, China
  • Received: 2021-10-27 Online: 2022-12-20 Published: 2023-02-09
  • About the authors: ZHANG Zhaoyu (b. 1997), male, master's student at Xidian University, E-mail: zyzhang_5@stu.xidian.edu.cn | TIAN Chunna (b. 1980), female, professor, E-mail: chnatian@xidian.edu.cn | ZHOU Heng (b. 1996), male, doctoral student at Xidian University, E-mail: hengzhou@stu.xidian.edu.cn | TIAN Xilan (b. 1981), female, senior engineer, Ph.D., E-mail: 18655189340@163.com
  • Funding:
    National Natural Science Foundation of China (62173265); National Natural Science Foundation of China (61571354)

Abstract:

Visible light and thermal infrared imaging rely on different mechanisms, so the two modalities capture different information about a target. A dual-modal visual tracker based on visible light and thermal infrared sequences can exploit the inherent correlation and complementarity of the two modalities, which reduces the limitations and uncertainty of single-modal information and improves the robustness of the visual tracking system. Existing algorithms that rely on image fusion or feature concatenation cannot fully exploit the correlated and complementary information in visible light and infrared images. To address this, we design an end-to-end dual-modal Siamese network tracker for infrared and visible light sequences. The network learns deep features from the visible light and thermal infrared frames simultaneously and adaptively fuses the features of the two modalities through intra-modal and cross-modal dual attention mechanisms. In addition, because the Siamese network has limited ability to distinguish the target from the semantic background, we incorporate an online classification module into the tracking framework. The classifier, learned online, reduces the influence of distractors and adapts to changes in the target during tracking. Experimental results show that the proposed algorithm effectively improves the performance of the tracker. On the RGBT benchmark dataset GTOT, its precision rate and success rate are about 90.6% and 73.8%, which are about 5.5% and 4.3% higher than those of the baseline algorithm, respectively. The overall performance is better than that of other advanced tracking algorithms.

Key words: object tracking, RGB/Thermal infrared, Siamese network, attention mechanism, deep learning
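
The abstract describes fusing visible light and thermal infrared features through intra-modal and cross-modal attention, but it does not give the layer definitions. The following is only a minimal NumPy sketch of the general idea, not the authors' network. The function names (`intra_modal_attention`, `cross_modal_fusion`, `dual_attention_fuse`) are hypothetical, and the attention weights are computed from simple pooled statistics instead of learned layers, which is an assumption made to keep the example self-contained.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    # Subtract the maximum for numerical stability.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def intra_modal_attention(feat):
    """Reweight the channels of one modality's feature map of shape (C, H, W).

    Assumption: the channel weight is a sigmoid of the global average of that
    channel; a real network would use learned layers here.
    """
    pooled = feat.mean(axis=(1, 2))           # shape (C,)
    weights = sigmoid(pooled)                 # shape (C,), values in (0, 1)
    return feat * weights[:, None, None]      # broadcast over H and W


def cross_modal_fusion(rgb, tir):
    """Fuse two feature maps with modality weights that sum to 1.

    Assumption: each modality is scored by its global mean activation.
    """
    scores = np.array([rgb.mean(), tir.mean()])
    w = softmax(scores)
    return w[0] * rgb + w[1] * tir, w


def dual_attention_fuse(rgb, tir):
    """Apply intra-modal attention to each modality, then fuse across modalities."""
    return cross_modal_fusion(intra_modal_attention(rgb),
                              intra_modal_attention(tir))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb_feat = rng.standard_normal((8, 6, 6))   # hypothetical RGB features
    tir_feat = rng.standard_normal((8, 6, 6))   # hypothetical thermal features
    fused, modality_weights = dual_attention_fuse(rgb_feat, tir_feat)
    print(fused.shape, modality_weights)
```

In this sketch the fused map keeps the shape of the inputs, so it could be passed to a Siamese matching stage unchanged. How the published method computes and trains its attention weights is described in the full paper, not in the abstract.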

CLC number:

  • TP391