Non-Local Support Attention Network for Few-Shot Object Detection

doi:10.16180/j.cnki.issn1007-7820.2024.08.011

Abstract

The key to few-shot object detection based on meta-learning is making better use of support-branch information to help the query branch recognize novel-class objects more effectively. Many existing methods fuse support-branch information into the query branch only along the depth (channel) direction, ignoring the spatial position relationships between features. This study proposes a few-shot object detection model based on non-local support attention. The method not only adds support information to the proposal features, but also fuses support information with the features fed into the region proposal network, while taking the spatial position relationships between features into account. It also adds information from negative support samples to the detection module to help the model distinguish objects of different categories. The model achieves good detection performance on both the base classes and the novel classes of the COCO2017 dataset. Under incremental learning, compared with the method before improvement, AP (Average Precision)/AP₅₀/AP₇₅ on the novel classes increases by 3.3/3.8/4.7 mAP (mean Average Precision) and AP/AP₅₀/AP₇₅ on the base classes increases by 2.7/0.5/3.3 mAP, and the model outperforms the SOTA (State-Of-The-Art) model DAnA (Dual-Awareness Attention) under the same setting.

Key words: object detection, few-shot learning, meta-learning, incremental learning, feature fusion, attention, non-local, fine-tuning
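
The contrast the abstract draws between depth-only fusion and position-aware fusion can be illustrated with a toy sketch. This is not the authors' network: the function names `channel_reweight` and `non_local_fuse` are hypothetical, the feature maps are flattened lists of per-position channel vectors, and no learned projections are used. It only assumes the general non-local pattern of dot-product affinity, softmax weighting, and a residual sum.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def channel_reweight(query, support):
    """Depth-only fusion (hypothetical helper): pool the support map to one
    vector and scale every query position by it. The spatial layout of the
    support features is discarded by the pooling step."""
    channels = len(support[0])
    pooled = [sum(pos[c] for pos in support) / len(support) for c in range(channels)]
    return [[q[c] * pooled[c] for c in range(channels)] for q in query]


def non_local_fuse(query, support):
    """Position-aware fusion (hypothetical helper): each query position
    attends over all support positions and adds the weighted sum of support
    features back as a residual."""
    channels = len(support[0])
    fused = []
    for q in query:
        # Affinity between this query position and every support position.
        scores = [sum(a * b for a, b in zip(q, s)) for s in support]
        weights = softmax(scores)
        context = [sum(w * s[c] for w, s in zip(weights, support)) for c in range(channels)]
        fused.append([q[c] + context[c] for c in range(channels)])
    return fused


# Toy values: two query positions and two support positions, two channels each.
query = [[1.0, 0.0], [0.0, 1.0]]
support = [[4.0, 0.0], [0.0, 4.0]]

depth_only = channel_reweight(query, support)
position_aware = non_local_fuse(query, support)
```

In the depth-only variant every query position sees the same pooled support vector, whereas in the position-aware variant each query position draws mostly on the support positions it resembles.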

CLC number:

TP391

XIE Xijun, LI Feifei. Non-Local Support Attention Network for Few-Shot Object Detection[J]. Electronic Science and Technology, 2024, 37(8): 75-83.

Figures and tables (12): Figure 1, Figure 2, Figure 3, Figure 4, Table 1, Table 2, Table 3, Table 4, Figure 5, Figure 6, Figure 7, Figure 8

References (23)

[1]	Zhao Jin, Li Feifei. A GAN-based lightweight style transfer model for ink painting[J]. Electronic Science and Technology, 2023, 36(2):81-86.
[2]	Zuo Bin, Li Feifei. An effective segmentation method for COVID-19 CT image based on attention mechanism and Inf-Net[J]. Electronic Science and Technology, 2023, 36(2):22-28.
[3]	Koch G, Zemel R, Salakhutdinov R. Siamese neural networks for one-shot image recognition[C]. Lille: ICML Deep Learning Workshop, 2015:956-963.
[4]	Snell J, Swersky K, Zemel R. Prototypical networks for few-shot learning[J]. Advances in Neural Information Processing Systems, 2017, 31(2):4080-4090.
[5]	Vinyals O, Blundell C, Lillicrap T, et al. Matching networks for one shot learning[J]. Advances in Neural Information Processing Systems, 2016, 30(2):3637-3645.
[6]	Sung F, Yang Y, Zhang L, et al. Learning to compare: Relation network for few-shot learning[C]. Salt Lake City: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018:1199-1208.
[7]	Kang B, Liu Z, Wang X, et al. Few-shot object detection via feature reweighting[C]. Seoul: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019:8420-8429.
[8]	Yan X, Chen Z, Xu A, et al. Meta R-CNN: Towards general solver for instance-level low-shot learning[C]. Seoul: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019:9577-9586.
[9]	Perez-Rua J M, Zhu X, Hospedales T M, et al. Incremental few-shot object detection[C]. Seattle: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020:13846-13855.
[10]	Xiao Y, Marlet R. Few-shot object detection and viewpoint estimation for objects in the wild[C]. Online: Proceedings of the European Conference on Computer Vision, 2020:192-210.
[11]	Fan Q, Zhuo W, Tang C K, et al. Few-shot object detection with attention-RPN and multirelation detector[C]. Seattle: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020:4013-4022.
[12]	Redmon J, Farhadi A. YOLO9000: Better, faster, stronger[C]. Honolulu: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017:7263-7271.
[13]	Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. Advances in Neural Information Processing Systems, 2015, 28(2):91-99.
[14]	Zhou X, Wang D, Krähenbühl P. Objects as points[EB/OL].(2019-04-16) [2023-03-13] https://arxiv.org/abs/1904.07850.
[15]	Wang X, Girshick R, Gupta A, et al. Non-local neural networks[C]. Salt Lake City: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018:7794-7803.
[16]	Chen T I, Liu Y C, Su H T, et al. Dual-awareness attention for few-shot object detection[J]. IEEE Transactions on Multimedia, 2021, 25(7):291-301.
[17]	Chen H, Wang Y, Wang G, et al. LSTD: A low-shot transfer detector for object detection[C]. New Orleans: Proceedings of the Conference on Artificial Intelligence, 2018:2836-2843.
[18]	Wu J, Liu S, Huang D, et al. Multiscale positive sample refinement for few-shot object detection[C]. Online: Proceedings of the European Conference on Computer Vision, 2020:456-472.
[19]	Wang X, Huang T E, Darrell T, et al. Frustratingly simple few-shot object detection[C]. Vienna: Proceedings of the International Conference on Machine Learning, 2020:9861-9870.
[20]	Han G, Huang S, Ma J, et al. Meta faster R-CNN: Towards accurate few-shot object detection with attentive feature alignment[C]. Vancouver: Proceedings of the Conference on Artificial Intelligence, 2022, 36(1):780-789.
[21]	Lee H, Lee M, Kwak N. Few-shot object detection by attending to per-sample-prototype[C]. Waikoloa: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022:2445-2454.
[22]	Li B, Yang B, Liu C, et al. Beyond max-margin: Class margin equilibrium for few-shot object detection[C]. Online: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021:7363-7372.
[23]	Fan Z, Yu J G, Liang Z, et al. FGN: Fully guided network for few-shot instance segmentation[C]. Seattle: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020:9172-9181.

Method	10-shot AP	10-shot AP₇₅	30-shot AP	30-shot AP₇₅
FeatureReweighting^[7]	5.6	4.6	9.1	7.6
Meta R-CNN^[8]	8.7	6.6	12.4	10.8
MPSR^[18]	9.8	9.7	14.1	14.2
TFA w/cos^[19]	10.0	9.3	13.7	13.4
AttentionRPN^[11]	11.1	10.6	-	-
Meta Faster R-CNN^[20]	11.3	9.8	15.9	14.7
FSDetView^[10]	12.5	9.8	14.7	12.2
FSDetView+PsP^[21]	13.4	9.1	17.1	14.7
CME^[22]	15.1	16.4	16.9	17.8
NSA-Net	18.7	17.6	18.8	17.3

Method	Novel-class AP	Novel-class AP₅₀	Novel-class AP₇₅	Base-class AP	Base-class AP₅₀	Base-class AP₇₅
Faster R-CNN^[13]	N/A	N/A	N/A	34.3	58.3	35.6
Meta R-CNN^[8]	11.1	25.3	8.5	28.6	52.5	28.4
FGN^[23]	10.5	22.5	8.8	25.5	46.4	25.5
AttentionRPN^[11]	10.1	23.0	8.3	22.4	40.8	22.2
DAnA-FasterRCNN^[16]	14.0	28.9	13.0	29.4	50.6	30.3
NSA-Net	14.4	29.2	13.2	31.3	53.0	31.7
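
The gains quoted in the abstract can be checked against the rows above with simple subtraction. The sketch below assumes that the "method before improvement" is the Meta R-CNN row; the source does not state this explicitly, so treat the choice of baseline as an assumption. Differences taken from entries rounded to one decimal can deviate by about 0.1 from gains computed on unrounded scores.

```python
# Rows copied from the table above, in column order:
# novel AP, novel AP50, novel AP75, base AP, base AP50, base AP75.
meta_rcnn = [11.1, 25.3, 8.5, 28.6, 52.5, 28.4]  # assumed baseline
nsa_net = [14.4, 29.2, 13.2, 31.3, 53.0, 31.7]


def gains(improved, baseline):
    """Per-column difference, rounded to one decimal like the table entries."""
    return [round(a - b, 1) for a, b in zip(improved, baseline)]


delta = gains(nsa_net, meta_rcnn)
novel_gain, base_gain = delta[:3], delta[3:]
print("novel AP/AP50/AP75 gain:", novel_gain)
print("base  AP/AP50/AP75 gain:", base_gain)
```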

Method	1-shot novel AP	5-shot novel AP	1-shot novel AP₅₀	5-shot novel AP₅₀	1-shot novel AP₇₅	5-shot novel AP₇₅	1-shot base AP	5-shot base AP	1-shot base AP₅₀	5-shot base AP₅₀	1-shot base AP₇₅	5-shot base AP₇₅
Meta R-CNN^[8]	8.7	11.2	19.9	25.9	6.8	8.6	27.3	28.5	50.4	52.3	27.3	28.2
FGN^[23]	8.0	10.9	17.3	24.0	6.9	9.0	24.7	26.9	44.3	47.6	25.0	27.4
AttentionRPN^[11]	8.7	10.6	19.8	24.4	7.0	8.3	20.6	23.0	37.2	42.0	20.5	22.4
DAnA-FasterRCNN^[16]	11.9	14.4	25.6	30.4	10.4	13.0	27.8	32.0	46.3	54.1	27.7	32.9
NSA-Net	13.2	14.1	27.0	29.1	11.8	12.6	28.4	31.4	48.5	53.5	29.2	32.1

	Attention for Regressor	Convolution Attention	Non-local Support Attention	AP	AP₅₀	AP₇₅
a				11.1	24.6	8.5
b	√			11.3	24.8	8.5
c			√	14.3	28.8	13.2
d	√	√		12.1	26.0	9.1
e	√		√	14.4	29.2	13.2
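
One way to read the ablation rows above is to compare configurations that differ in a single module. The sketch below is illustrative only: the keys `regressor`, `conv`, and `non_local` are shorthand introduced here for the three module columns, and the helper `gain` is hypothetical.

```python
# Each row: (enabled modules, (AP, AP50, AP75)), copied from the table above.
ablation = {
    "a": (set(), (11.1, 24.6, 8.5)),
    "b": ({"regressor"}, (11.3, 24.8, 8.5)),
    "c": ({"non_local"}, (14.3, 28.8, 13.2)),
    "d": ({"regressor", "conv"}, (12.1, 26.0, 9.1)),
    "e": ({"regressor", "non_local"}, (14.4, 29.2, 13.2)),
}


def gain(row, reference):
    """AP/AP50/AP75 difference between two ablation rows, one decimal."""
    scores, ref_scores = ablation[row][1], ablation[reference][1]
    return [round(s - r, 1) for s, r in zip(scores, ref_scores)]


# Non-local support attention alone versus the plain baseline.
print("c vs a:", gain("c", "a"))
# Swapping convolution attention for non-local support attention.
print("e vs d:", gain("e", "d"))
```

Rows d and e share the regressor attention and differ only in the attention type, so their difference isolates the effect of replacing convolution attention with non-local support attention.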