Journal of Xidian University, 2021, Vol. 48, Issue (5): 92-99. doi: 10.19665/j.issn1001-2400.2021.05.012

Multi-scale fusion sketch recognition model by dilated convolution

YANG Yunhang (杨云航), MIN Lianquan (闵连权)

  1. School of Geospatial Information, PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, Henan, China
  • Received: 2020-12-28 Online: 2021-10-20 Published: 2021-11-09
  • Contact: Lianquan MIN
  • About the author: YANG Yunhang (born 1996), male, master's student at the PLA Strategic Support Force Information Engineering University, E-mail: 884481846@qq.com
  • Supported by:
    National Natural Science Foundation of China (41471337)

Abstract:

Most existing deep-learning-based sketch recognition methods use ordinary convolution as the main means of sketch feature extraction and ignore the sparsity of sketch objects. To address this, the paper proposes a sketch recognition model that extracts sketch features with dilated convolution. The model fuses dilated convolution with ordinary convolution. Because dilated convolution enlarges the receptive field without increasing the number of effective units in the convolution kernel, it is used for a preliminary extraction of the structural features of the sketch. However, the sparse sampling of dilated convolution means that the information gathered from distant positions is uncorrelated, which affects the classification result. Therefore, while the dilated convolution extracts the image features sparsely, an ordinary convolution with the same receptive field size extracts the input image features densely, and the features output by the two kinds of convolution are finally concatenated along the channel dimension. This approach exploits the sparse sampling of dilated convolution and also makes full use of the multi-scale information provided by the different convolution types. Experimental results show that the model achieves a recognition accuracy of 72.6% on the TU-Berlin Sketch dataset, which indicates that it has certain advantages over current mainstream sketch recognition methods.

Key words: dilated convolution, multi-scale fusion, sketch recognition, convolutional neural network, receptive field

CLC number:

  • TP391
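
The two-branch fusion described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the kernel size (3), dilation rate (2), all-ones weights, single input channel, and the helper names `effective_kernel_size`, `conv2d` and `fuse_branches` are all assumptions chosen for illustration. It shows that a 3x3 kernel with dilation 2 covers the same 5x5 window as an ordinary 5x5 kernel while using 9 weights instead of 25, and that the two branch outputs can be stacked along a channel axis.

```python
# Illustrative sketch only: kernel size 3, dilation 2 and all-ones weights are
# assumed values, not hyperparameters or weights taken from the paper.

def effective_kernel_size(kernel_size, dilation):
    """Side length of the input window spanned by a dilated square kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv2d(image, kernel, dilation=1):
    """Stride-1, no-padding 2D cross-correlation with optional dilation."""
    k = len(kernel)
    span = effective_kernel_size(k, dilation)
    out_h = len(image) - span + 1
    out_w = len(image[0]) - span + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    # Dilation skips (dilation - 1) pixels between kernel taps.
                    acc += kernel[a][b] * image[i + a * dilation][j + b * dilation]
            row.append(acc)
        out.append(row)
    return out


def fuse_branches(image, sparse_kernel, dense_kernel, dilation):
    """Run a dilated branch and an ordinary branch with the same receptive
    field, then concatenate the two outputs along a channel axis."""
    assert effective_kernel_size(len(sparse_kernel), dilation) == len(dense_kernel)
    sparse = conv2d(image, sparse_kernel, dilation)  # sparse sampling
    dense = conv2d(image, dense_kernel, 1)           # dense sampling
    return [sparse, dense]                           # layout: [channel][row][col]


image = [[float(r * 7 + c) for c in range(7)] for r in range(7)]
sparse_kernel = [[1.0] * 3 for _ in range(3)]  # 9 weights, 5x5 receptive field
dense_kernel = [[1.0] * 5 for _ in range(5)]   # 25 weights, 5x5 receptive field
fused = fuse_branches(image, sparse_kernel, dense_kernel, dilation=2)
print(effective_kernel_size(3, 2), len(fused), len(fused[0]), len(fused[0][0]))
```

On the 7x7 toy input, both branches produce a 3x3 map because they share the same receptive field, so the fused result has two channels of identical spatial size. A real model would use learned multi-channel kernels inside a deep learning framework instead of these hand-written loops.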