Multiclass Garbage Classification Algorithm Based on an Improved MobileNet Network

doi:10.16180/j.cnki.issn1007-7820.2024.04.006

Abstract:

To address the large volume of garbage and the fact that a single image often contains multiple garbage objects, this study proposes a garbage detection and classification algorithm based on an improved MobileNet network. The MobileNet network is integrated into the YOLOv5 (You Only Look Once v5) object detection algorithm, and a Convolutional Block Attention Module (CBAM) is introduced in the backbone to filter meaningful information. A vision Transformer aggregates the image features, a weighted bidirectional feature pyramid network distinguishes the contributions of different features, and an Efficient Channel Attention (ECA) module combines the image features and passes them to the prediction layer. Finally, to perform better when garbage targets occlude one another, the soft Non-Maximum Suppression (soft-NMS) method is used, and the Alpha-IoU (Alpha-Intersection over Union) loss function is used for prediction on the extracted features. Experimental results show that the proposed method can locate and recognize multi-target, multi-category garbage. Its mAP (mean Average Precision) reaches 90.31%, which is 4.95 percentage points higher than that of the YOLOv5 network, and its processing time is about 2.4 s shorter. Compared with the Faster R-CNN (Region-based Convolutional Neural Network) algorithm that integrates the ResNet (Residual Network) network, the proposed algorithm improves processing efficiency while maintaining accuracy.

Key words: garbage classification, target detection, vision Transformer, MobileNet, image recognition, feature integration, data enhancement, average accuracy
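
Two of the components named in the abstract, soft-NMS and the Alpha-IoU loss, both build on the intersection over union of two boxes. As a rough illustration only, the sketch below shows the Gaussian variant of soft-NMS, which decays the scores of boxes overlapping the current top-scoring box by exp(-IoU²/σ) instead of deleting them, and the basic power form of Alpha-IoU, 1 - IoU^α. The function names, the `(x1, y1, x2, y2)` box format, and the values of `sigma`, `score_thresh`, and `alpha` are assumptions made for this example; the abstract does not state which variants or settings the authors use.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic Alpha-IoU loss, 1 - IoU**alpha (alpha is an assumed setting)."""
    return 1.0 - iou(pred, target) ** alpha


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft-NMS.

    Returns (box_index, decayed_score) pairs in the order the boxes
    were selected. Overlapping boxes are down-weighted, not removed,
    unless their decayed score falls below score_thresh.
    """
    remaining = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best = max(remaining, key=lambda item: item[1])
        remaining.remove(best)
        kept.append(best)
        # Decay the scores of the other boxes by their overlap with it.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best[0]], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                decayed.append((idx, score))
        remaining = decayed
    return kept


# Hypothetical example: boxes 0 and 1 overlap heavily, box 2 is separate.
demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
demo_scores = [0.9, 0.8, 0.7]
demo_result = soft_nms(demo_boxes, demo_scores)
```

In this example the heavily overlapped box 1 is not discarded as it would be under hard NMS with a typical threshold; it is kept with a reduced score and ranked after the non-overlapping box 2. That behaviour is what makes soft-NMS attractive when objects occlude one another.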

Chinese Library Classification (CLC) number:

TP391.4

LIANG Chenye, ZHANG Xuanxiong. Research on Multiclass Garbage Classification Algorithm Based on Improved MobileNet Network[J]. Electronic Science and Technology, 2024, 37(4): 38-46.

Figures/Tables: 17 (Figures 1 to 16, Table 1)


Model	mAP/%	Processing time/s
Proposed model	90.31	1.061
YOLOv5	85.36	3.483
CBAM-YOLO	88.44	3.023
R-CNN(ResNet50)	87.47	4.995
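
The headline figures in the abstract can be re-derived from this table. The short check below uses only the values listed above; the dictionary layout and the `compare` helper are illustrative names, not part of the paper.

```python
# (mAP in %, processing time in s), copied from the table above.
# The first entry is the paper's own model (first data row of the table).
results = {
    "Proposed model": (90.31, 1.061),
    "YOLOv5": (85.36, 3.483),
    "CBAM-YOLO": (88.44, 3.023),
    "R-CNN(ResNet50)": (87.47, 4.995),
}


def compare(baseline, model="Proposed model"):
    """Return (mAP gain in percentage points, time saved in seconds)."""
    model_map, model_time = results[model]
    base_map, base_time = results[baseline]
    return round(model_map - base_map, 2), round(base_time - model_time, 3)


map_gain, time_saved = compare("YOLOv5")
print(map_gain, time_saved)  # 4.95 2.422
```

The mAP gain over YOLOv5 is a difference of 4.95 percentage points, and the time saving of 2.422 s matches the "about 2.4 s" quoted in the abstract.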