Pedestrian Detection Based on Improved YOLOv3 Algorithm

doi:10.16180/j.cnki.issn1007-7820.2021.01.002

Abstract:

When YOLOv3 is applied to single-class object detection such as pedestrian detection, its Darknet53 backbone is redundant: the excess parameters slow down detection, and the conventional bounding box loss function limits localization accuracy. To address these problems, this paper proposes a pedestrian detection method based on an improved YOLOv3 algorithm. A new multi-scale fusion network with Darknet19 as its backbone is constructed to speed up both training and detection, and a generalized intersection over union (GIoU) loss function is introduced to improve detection precision. Experimental results on pedestrian detection datasets such as the INRIA pedestrian dataset show that the proposed algorithm improves precision by about 5 percentage points over the original algorithm. Compared with Faster R-CNN, the proposed algorithm detects a single image in 0.015 s while maintaining acceptable accuracy.

Keywords: target detection, generalized intersection over union, YOLOv3, multi-scale fusion, pedestrian detection, INRIA dataset
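
The generalized intersection over union named in the abstract can be illustrated with a minimal sketch. It follows the standard definition, GIoU = IoU - |C \ (A ∪ B)| / |C|, where C is the smallest box enclosing boxes A and B, with the loss taken as 1 - GIoU. The function names and the (x1, y1, x2, y2) box layout are assumptions made for illustration; the exact formulation used inside the paper's network is not given on this page.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2, so both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Overlap region; width or height clamps to 0 when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box C that encloses both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    # Penalize the part of C that is covered by neither box.
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, true_box):
    """Loss in [0, 2): 0 for a perfect match, growing as the boxes separate."""
    return 1.0 - giou(pred_box, true_box)
```

For disjoint boxes, plain IoU is 0 no matter how far apart they are, whereas GIoU keeps decreasing with distance, so the loss still distinguishes a near miss from a far one.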

CLC number: TN247

YE Fei, LIU Zilong. Pedestrian Detection Based on Improved YOLOv3 Algorithm[J]. Electronic Science and Technology, 2021, 34(1): 5-9.

Parameter	Description	Value
Momentum	Momentum coefficient	0.9
Decay	Weight decay coefficient	0.0005
Batchsize	Batch size	64
L_r	Learning rate	0.0001
Iteration	Number of iterations	20000
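
To show how the training parameters listed above interact, the following is a rough sketch of a single weight update. It assumes stochastic gradient descent with momentum and L2 weight decay, which is a common setup for YOLO-style training but is not confirmed on this page; `sgd_momentum_step` is a hypothetical helper, not part of the authors' code.

```python
# Values taken from the training parameter table.
MOMENTUM = 0.9
DECAY = 0.0005
BATCH_SIZE = 64
LEARNING_RATE = 0.0001
ITERATIONS = 20000


def sgd_momentum_step(weight, grad, velocity):
    """One update of a scalar weight (assumed optimizer, for illustration)."""
    # L2 weight decay contributes DECAY * weight to the effective gradient.
    velocity = MOMENTUM * velocity - LEARNING_RATE * (grad + DECAY * weight)
    return weight + velocity, velocity


# Total images processed during training, counting repeats across epochs.
images_seen = BATCH_SIZE * ITERATIONS
```

With a batch size of 64 and 20000 iterations, training processes 1,280,000 images in total, repeats included.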

Algorithm	Precision/%	Recall/%	Detection time per image/s
YOLOv3-Darknet19	70.23	82.20	0.014
Proposed method	73.12	83.45	0.015

Algorithm	Precision/%	Recall/%	Detection time per image/s
HOG+SVM	57.80	50.90	0.474
YOLOv3	68.13	78.42	0.051
Faster R-CNN	80.64	85.14	0.416
Proposed method	73.12	83.45	0.015
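
The speed and precision trade-off in the comparison table above can be made explicit with a little arithmetic. The sketch below copies the table rows and derives throughput, speedup, and precision difference; the helper names are hypothetical, and the derived figures are computed here rather than reported in the source.

```python
# Rows copied from the comparison table:
# (precision in %, recall in %, detection time per image in seconds)
RESULTS = {
    "HOG+SVM": (57.80, 50.90, 0.474),
    "YOLOv3": (68.13, 78.42, 0.051),
    "Faster R-CNN": (80.64, 85.14, 0.416),
    "Proposed method": (73.12, 83.45, 0.015),
}


def throughput_fps(name):
    """Images per second implied by the per-image detection time."""
    return 1.0 / RESULTS[name][2]


def speedup(baseline, method="Proposed method"):
    """How many times faster `method` is than `baseline`."""
    return RESULTS[baseline][2] / RESULTS[method][2]


def precision_gain(baseline, method="Proposed method"):
    """Precision difference in percentage points (method minus baseline)."""
    return RESULTS[method][0] - RESULTS[baseline][0]
```

On these numbers the proposed method runs at roughly 66.7 images per second, about 3.4 times faster than YOLOv3 and about 27.7 times faster than Faster R-CNN. Its precision is 4.99 percentage points above YOLOv3 and 7.52 percentage points below Faster R-CNN.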