Traffic Sign Detection Algorithm Incorporating Receptive Field Enhancement Module and Attention Mechanism

doi:10.16180/j.cnki.issn1007-7820.2024.06.002

Abstract

To address the shortcomings of general object detection algorithms in traffic sign detection, this study proposes a traffic sign detection algorithm that incorporates a receptive field enhancement module and attention mechanisms. The algorithm builds on YOLOv5 (You Only Look Once version 5): a Receptive Field Block (RFB) replaces the Spatial Pyramid Pooling (SPP) module in the original backbone, an Efficient Channel Attention Module (ECAM) and a Convolutional Block Attention Module (CBAM) are embedded in the feature fusion network, and Matrix Non-Maximum Suppression (Matrix NMS) is used to filter the candidate bounding boxes, improving both detection accuracy and detection speed. Experimental results show that, with the number of model parameters essentially unchanged relative to the original network, the mean average precision (mAP) of the algorithm reaches 82.31%, which is 8.59 percentage points higher than the original algorithm, and the detection speed reaches 51.89 frame·s^-1. In addition, no false or missed detections were observed in any of the test scenarios, which indicates that the algorithm generalizes better than the original algorithm and can detect traffic signs in real time.

Key words: real-time traffic sign detection, enhanced receptive field, attention mechanism, feature fusion, matrix non-maximum suppression, YOLOv5, deep learning, real-time detection
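
The abstract names Matrix NMS as the candidate-box filter but does not describe the procedure. The following is a minimal sketch of the general technique, not the authors' implementation. It assumes single-class, axis-aligned boxes given as `(x1, y1, x2, y2)`, a Gaussian decay kernel with `sigma=2.0`, and a score threshold of `0.05`; these values and the function names `iou` and `matrix_nms` are illustrative assumptions, not taken from the paper.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matrix_nms(boxes, scores, sigma=2.0, score_thr=0.05):
    """Decay the scores of overlapping boxes instead of discarding them one by one.

    Returns a list of (original_index, decayed_score) pairs, highest-ranked first.
    sigma and score_thr are assumed defaults, not values from the paper.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    n = len(order)

    # pair_iou[i][j] is the IoU between the i-th and j-th ranked boxes, for i < j.
    pair_iou = [
        [iou(boxes[order[i]], boxes[order[j]]) if i < j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # compensate[i] is the largest overlap of box i with any higher-scoring box.
    compensate = [max((pair_iou[k][i] for k in range(i)), default=0.0) for i in range(n)]

    def kernel(x):
        # Gaussian decay kernel; never zero, so the division below is safe.
        return math.exp(-sigma * x * x)

    kept = []
    for j in range(n):
        # The decay of box j is the strongest suppression exerted by any
        # higher-scoring box i, discounted by how suppressed box i itself is.
        decay = min(
            (kernel(pair_iou[i][j]) / kernel(compensate[i]) for i in range(j)),
            default=1.0,
        )
        new_score = scores[order[j]] * decay
        if new_score >= score_thr:
            kept.append((order[j], new_score))
    return kept


# Two heavily overlapping boxes and one isolated box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
for index, score in matrix_nms(boxes, scores):
    print(index, round(score, 3))
```

Because every decay factor depends only on the pairwise IoU values, the whole computation can be expressed as matrix operations and run in parallel, which is one plausible reason the Matrix NMS variant is faster than the sequential baseline in the results.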

CLC number:

TP391

YE Yuxin, JU Zhiyong, LAI Ying. Traffic Sign Detection Algorithm Incorporating Receptive Field Enhancement Module and Attention Mechanism[J]. Electronic Science and Technology, 2024, 37(6): 8-16.

Figures and tables (12): Fig. 1 to Fig. 7, Table 1 to Table 5

Category	Signs
Indication	i2, i4, i5, i10, il60, il80, il100, ip
Prohibition	p3, p5, p10, p11, p12, p19, p23, p26, p27, pr40, pr50, pa14, pm20, pl30, pl60, pl80, pl40, pl50, pl100, pl120, ph4, ph4.5, ph5, pb, pg, pn, pne
Warning	w13, w55, w57, w59

Improvement strategy	Model variants
RFB	-	✓	-	-	✓	✓
CBAM+ECA	-	-	✓	-	✓	✓
Matrix NMS	-	-	-	✓	-	✓
mAP /%	73.72	77.82	76.89	72.45	80.57	82.31
Detection speed /frame·s^-1	44.94	44.41	42.38	57.34	41.05	51.89
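
To make the contribution of each module easier to read off the ablation table, the sketch below computes the change in mAP (in percentage points) and in detection speed relative to the unmodified baseline. The numbers are copied from the table; the variant labels and the helper name `gains_over_baseline` are descriptive names introduced here for illustration.

```python
# (mAP in %, speed in frame/s) per model variant, copied from the ablation table.
# Variant labels are inferred from which strategies are checked in each column.
ABLATION = {
    "baseline": (73.72, 44.94),
    "RFB": (77.82, 44.41),
    "CBAM+ECA": (76.89, 42.38),
    "Matrix NMS": (72.45, 57.34),
    "RFB, CBAM+ECA": (80.57, 41.05),
    "RFB, CBAM+ECA, Matrix NMS": (82.31, 51.89),
}


def gains_over_baseline(table, baseline="baseline"):
    """Return {variant: (mAP change in points, speed change in frame/s)}."""
    base_map, base_fps = table[baseline]
    return {
        name: (round(m_ap - base_map, 2), round(fps - base_fps, 2))
        for name, (m_ap, fps) in table.items()
        if name != baseline
    }


for name, (d_map, d_fps) in gains_over_baseline(ABLATION).items():
    print(f"{name}: mAP {d_map:+.2f} points, speed {d_fps:+.2f} frame/s")
```

Read this way, RFB and the attention modules raise mAP at a small cost in speed, Matrix NMS alone raises speed at a small cost in mAP, and the full combination improves both.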

Sign class	F1-score (YOLOv5)	F1-score (proposed)	Improvement /%
i2	0.71	0.81	14.08
i4	0.88	0.87	-1.14
i5	0.89	0.91	2.25
i10	0.67	1.00	49.25
il60	0.87	0.91	4.60
il80	0.80	0.84	5.00
il100	0.86	0.88	2.33
ip	0.70	0.75	7.14
p3	0.55	0.76	38.18
p5	0.91	0.90	-1.10
p10	0.50	0.77	54.00
p11	0.56	0.63	12.50
p12	0.36	0.54	50.00
p19	0.50	0.73	46.00
p23	0.80	0.83	3.75
p26	0.69	0.79	14.49
p27	0.18	0.67	272.22
pa14	0.86	1.00	16.28
pb	0.92	0.92	0.00
pg	0.90	0.83	-7.78
ph4.5	0.72	0.75	4.17
ph4	0.36	0.67	86.11
ph5	0.24	0.67	179.17
pl30	0.56	0.59	5.36
pl40	0.69	0.73	5.80
pl50	0.54	0.65	20.37
pl60	0.56	0.70	25.00
pl80	0.74	0.81	9.46
pl100	0.82	0.85	3.66
pl120	0.65	0.80	23.08
pm20	0.79	0.81	2.53
pn	0.81	0.82	1.23
pne	0.94	0.94	0.00
pr40	0.94	0.97	3.19
pr50	0.67	0.67	0.00
w13	0.38	0.59	55.26
w55	0.44	0.73	65.91
w57	0.71	0.77	8.45
w59	0.76	0.86	13.16
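
The improvement column in the F1 table appears to be the relative change of the proposed model's F1-score over the YOLOv5 F1-score. The sketch below states the standard F1 definition and reproduces a few rows under that assumption; the function names `f1_score` and `improvement_rate` are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def improvement_rate(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


# (YOLOv5 F1, proposed F1) pairs copied from the table.
rows = {"i2": (0.71, 0.81), "p27": (0.18, 0.67), "pg": (0.90, 0.83)}
for sign, (old, new) in rows.items():
    print(sign, round(improvement_rate(old, new), 2))
```

Because the rate is relative, classes with a very low baseline F1-score (for example p27 and ph5) show very large percentages even though their absolute F1-scores remain moderate.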

Model	mAP /%	Detection time /ms	Detection speed /frame·s^-1
YOLOv5	73.72	22.25	44.94
Proposed model	82.31	19.27	51.89
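
The detection speed in this table is consistent with being the reciprocal of the per-image detection time. A short sketch of that conversion, assuming the time is measured per frame in milliseconds (the function name `fps_from_latency_ms` is illustrative):

```python
def fps_from_latency_ms(latency_ms):
    """Convert a per-frame detection time in milliseconds to frames per second."""
    return 1000.0 / latency_ms


yolov5_ms, proposed_ms = 22.25, 19.27  # detection times copied from the table

print(round(fps_from_latency_ms(yolov5_ms), 2))
print(round(fps_from_latency_ms(proposed_ms), 2))

# Relative speed-up of the proposed model over YOLOv5, in percent.
speedup_percent = (yolov5_ms / proposed_ms - 1) * 100
print(round(speedup_percent, 1))
```

Under this reading, the proposed model processes roughly 15% more frames per second than the YOLOv5 baseline.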

Model	Parameters /10⁶	mAP /%	Detection time /ms	Detection speed /frame·s^-1
Faster R-CNN	137.099	87.44	161.70	6.18
CenterNet	32.665	68.09	30.13	33.18
SSD	26.285	71.63	22.88	43.70
EfficientDet	3.874	64.51	25.36	39.42
YOLOv5	47.057	73.72	22.25	44.94
Proposed model	47.139	82.31	19.27	51.89