An improved algorithm for foggy-image pedestrian and vehicle detection

doi:10.19665/j.issn1001-2400.2020.04.010

Abstract

Abstract (Chinese version):

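Because foggy-image datasets are scarce and fog takes many visual forms, deep-learning-based object detection networks tend to overfit when detecting pedestrians and vehicles in foggy images, which leads to poor robustness and low accuracy. To address these problems, a fog-density discrimination module is added to the detection network to improve its adaptability and robustness, deformable convolution and an attention mechanism are introduced to strengthen the feature extraction capability of the convolutional neural network, and the dataset is expanded with simulated synthetic foggy images to speed up network convergence. Experimental results show that, for pedestrian and vehicle detection in foggy images, the improved network raises the mean average detection accuracy by about 2%~4% over the region-proposal-based detection network, without significantly increasing the network's training parameters or computational cost.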

Key words: foggy-image object detection, deep learning, region-proposal-based detection network
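The simulated fog described above can be illustrated with the atmospheric scattering model that the English abstract names, I(x) = J(x)t(x) + A(1 - t(x)) with t(x) = exp(-βd(x)). What follows is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the airlight value, the β chosen for each fog level, and the toy two-pixel image with its depth map are all hypothetical. It shows how a foggy copy can inherit the clear image's boxes and gain a fog-density label.

```python
import math

# Hypothetical scattering coefficients (per metre) for each fog level.
# The source does not give these values; they are assumptions for illustration.
FOG_LEVELS = {"no_fog": 0.0, "light_fog": 0.05, "dense_fog": 0.15}


def transmission(depth_m, beta):
    """Transmission t(x) = exp(-beta * d(x)) for a scene depth in metres."""
    return math.exp(-beta * depth_m)


def synthesize_fog(image, depth, beta, airlight=0.9):
    """Apply I = J * t + A * (1 - t) to every pixel.

    image: rows of RGB tuples with intensities in [0, 1]
    depth: rows of per-pixel depths in metres (same layout as image)
    airlight: assumed global atmospheric light A
    """
    foggy = []
    for pixel_row, depth_row in zip(image, depth):
        row = []
        for pixel, d in zip(pixel_row, depth_row):
            t = transmission(d, beta)
            row.append(tuple(c * t + airlight * (1.0 - t) for c in pixel))
        foggy.append(row)
    return foggy


def make_foggy_sample(sample, depth, level):
    """Return a foggy copy that inherits the boxes and adds a fog-level label."""
    return {
        "image": synthesize_fog(sample["image"], depth, FOG_LEVELS[level]),
        "boxes": sample["boxes"],   # annotation inherited from the clear image
        "fog_level": level,         # extra fog-density information
    }


# Toy data: one row, two pixels, the second much farther from the camera.
clear_image = [[(0.2, 0.4, 0.6), (0.8, 0.1, 0.3)]]
depth_map = [[5.0, 50.0]]
clear_sample = {"image": clear_image, "boxes": [("car", 0, 0, 1, 1)]}

dense_sample = make_foggy_sample(clear_sample, depth_map, "dense_fog")
```

In this sketch the distant pixel is pulled almost entirely to the airlight value while the near pixel keeps part of its colour, which is the depth-dependent behaviour the scattering model is meant to capture.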

Abstract:

To improve the accuracy of pedestrian and vehicle detection in foggy images, a foggy-image pedestrian and vehicle detection network (FPVDNet) based on Faster R-CNN is proposed. First, a fog-density discriminating module (FDM) is proposed to predict the fog density of the input image. The FDM prediction then determines the subsequent operations for each fog density (no fog, light fog, and dense fog). Second, a squeeze-and-excitation module (SE module) applies the attention mechanism to improve the feature extraction capability of the network. Meanwhile, deformable convolution is applied: it adds offsets to the sampling locations and learns those offsets from the target task, which enhances the transformation modeling capacity of the CNN. Finally, because annotated foggy-image datasets are lacking, a simulated foggy-image training dataset is generated with the atmospheric scattering model. Each simulated foggy image inherits the annotation of its clear source image and additionally carries fog-density information. Experiments with the proposed FPVDNet are carried out on 1,500 real foggy images and 500 real clear images. The results show that, compared with the original Faster R-CNN, FPVDNet improves the mean average detection accuracy by 2%~4%.

Key words: foggy-image object detection, deep learning, Faster R-CNN
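The squeeze-and-excitation attention named in the abstract can be sketched in a few lines. This is an illustrative pure-Python version of the general SE idea (global average pooling, a two-layer bottleneck with ReLU and sigmoid, then channel-wise rescaling), not the network's actual layer: the function names, the toy feature map, and the weights are hypothetical, and a real implementation would use a tensor library with learned weights.

```python
import math


def squeeze(feature_map):
    """Squeeze: global average pooling, giving one scalar per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def dense(vector, weights, bias):
    """Fully connected layer: weights is a list of rows, one row per output unit."""
    return [
        sum(w * v for w, v in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


def excite(channel_stats, w1, b1, w2, b2):
    """Excitation: bottleneck FC + ReLU, then FC + sigmoid, giving gates in (0, 1)."""
    hidden = [max(0.0, h) for h in dense(channel_stats, w1, b1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in dense(hidden, w2, b2)]


def se_block(feature_map, w1, b1, w2, b2):
    """Rescale every channel of the feature map by its learned gate."""
    gates = excite(squeeze(feature_map), w1, b1, w2, b2)
    return [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]


# Toy feature map: 2 channels of size 2x2 (hypothetical values).
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
# Hypothetical weights: 2 channels -> 1 hidden unit -> 2 channels.
w1, b1 = [[0.5, -0.5]], [0.0]
w2, b2 = [[4.0], [-4.0]], [0.0, 0.0]

reweighted = se_block(features, w1, b1, w2, b2)
```

With these toy weights the first channel receives a gate above 0.5 and the second a gate below 0.5, so the block emphasises one channel and suppresses the other while leaving the spatial layout unchanged.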

CLC number: TP301.6

WANG Yudong, GUO Jichang, WANG Tianbao. Algorithm for foggy-image pedestrian and vehicle detection[J]. Journal of Xidian University, 2020, 47(4): 70-77.

Figures and tables (9): Figures 1-6, Tables 1-3

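Algorithm	Deformable convolution	SE module	Training dataset	Mean average accuracy	Vehicle	Pedestrian
Faster R-CNN	Used	Used	①	0.731	0.772	0.690
Faster R-CNN	Used	Used	②	0.801	0.831	0.771
Faster R-CNN	Used	Used	③	0.790	0.821	0.759
Faster R-CNN	Used	Used	④	0.782	0.833	0.731
Faster R-CNN	Used	Used	⑤	0.795	0.836	0.754
FPVDNet	Used	Used	⑥	0.818	0.858	0.778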

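Algorithm	Deformable convolution	SE module	Mean average accuracy	Vehicle	Pedestrian
Faster R-CNN	Not used	Not used	0.744	0.799	0.688
Faster R-CNN	Used	Not used	0.774	0.826	0.721
Faster R-CNN	Not used	Used	0.760	0.814	0.705
Faster R-CNN	Used	Used	0.799	0.835	0.760
FPVDNet	Not used	Not used	0.771	0.815	0.726
FPVDNet	Used	Not used	0.796	0.850	0.742
FPVDNet	Not used	Used	0.782	0.849	0.714
FPVDNet	Used	Used	0.821	0.869	0.773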
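As a quick check on the table above, the like-for-like gain of FPVDNet over Faster R-CNN can be computed for each module setting. The snippet below is only a worked-arithmetic sketch: the dictionary layout and the function name are illustrative choices, and the numbers are the mean average accuracy values copied from the table.

```python
# Mean average accuracy, keyed by (deformable convolution used, SE module used).
faster_rcnn = {
    (False, False): 0.744,
    (True, False): 0.774,
    (False, True): 0.760,
    (True, True): 0.799,
}
fpvdnet = {
    (False, False): 0.771,
    (True, False): 0.796,
    (False, True): 0.782,
    (True, True): 0.821,
}


def gains_in_points(baseline, improved):
    """Accuracy difference in percentage points for each shared configuration."""
    return {
        config: round((improved[config] - baseline[config]) * 100, 1)
        for config in baseline
    }


gains = gains_in_points(faster_rcnn, fpvdnet)
for config, gain in gains.items():
    print(config, gain)
# (False, False) 2.7
# (True, False) 2.2
# (False, True) 2.2
# (True, True) 2.2
```

Under identical module settings the difference is 2.2 to 2.7 percentage points, which falls within the 2%~4% range stated in the abstract.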

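Algorithm	Deformable convolution	SE module	Detection speed (frames/s)	Parameters
Faster R-CNN	Not used	Not used	10.1	38.01×10⁶
Faster R-CNN	Used	Not used	4.8	134.20×10⁶
Faster R-CNN	Not used	Used	8.7	38.67×10⁶
Faster R-CNN	Used	Used	4.5	147.60×10⁶
FPVDNet	Not used	Not used	10.0	38.40×10⁶
FPVDNet	Used	Not used	4.5	135.10×10⁶
FPVDNet	Not used	Used	8.6	39.32×10⁶
FPVDNet	Used	Used	4.3	148.30×10⁶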
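The cost side of the comparison can be worked out the same way from the table above. This is a small arithmetic sketch with an illustrative data layout and function name. It compares FPVDNet with Faster R-CNN under the same module settings only; the much larger jump caused by switching deformable convolution on (38.01×10⁶ to 134.20×10⁶ parameters for Faster R-CNN) is a separate effect visible in the table.

```python
# (detection speed in frames/s, parameters in millions),
# keyed by (deformable convolution used, SE module used).
faster_rcnn = {
    (False, False): (10.1, 38.01),
    (True, False): (4.8, 134.20),
    (False, True): (8.7, 38.67),
    (True, True): (4.5, 147.60),
}
fpvdnet = {
    (False, False): (10.0, 38.40),
    (True, False): (4.5, 135.10),
    (False, True): (8.6, 39.32),
    (True, True): (4.3, 148.30),
}


def overhead(baseline, improved):
    """Extra parameters (percent) and speed change (frames/s) per configuration."""
    result = {}
    for config, (base_fps, base_params) in baseline.items():
        fps, params = improved[config]
        result[config] = {
            "extra_params_pct": round((params - base_params) / base_params * 100, 2),
            "fps_change": round(fps - base_fps, 1),
        }
    return result


costs = overhead(faster_rcnn, fpvdnet)
for config, cost in costs.items():
    print(config, cost)
```

In every configuration the extra parameters stay below 2% and the speed drops by at most 0.3 frames/s, consistent with the abstract's claim that the parameter count and computation are not significantly increased.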