Journal of Xidian University ›› 2023, Vol. 50 ›› Issue (6): 148-160. doi: 10.19665/j.issn1001-2400.20230308

• Information and Communications Engineering & Computer Science and Technology •

Research on the fast implementation method of Winograd transposed convolution

LI Zhao1, HUANG Chengcheng1, HE Yizhi1, SU Xiaojie2

  1. School of Computer Science and Technology, Shandong University of Technology, Zibo 255000, China
  2. School of Automation, Chongqing University, Chongqing 400044, China
  • Received: 2022-11-04  Online: 2023-12-20  Published: 2024-01-22

Abstract:

The Winograd transposed convolution algorithm is a widely used convolution acceleration method for Field Programmable Gate Arrays (FPGAs). It avoids the zero-padding problem of transposed convolution by performing Winograd convolution after grouping. However, this method requires grouping operations on the input feature map and the convolution kernel, and the results must be reorganized to generate a complete output feature map. The complex calculation of element coordinates increases the design difficulty. To solve these problems, a Winograd transposed convolution method based on a unified transformation matrix is proposed. It uses the unified transformation matrix instead of grouping the input feature map and convolution kernel, and effectively solves the problems of overlapping summation, zero padding, convolution kernel inversion, decomposition and reorganization. Guided by this method and combined with data reuse, double buffering and pipelining, a transposed convolution accelerator is designed on an FPGA. The Gaussian-Poisson generative adversarial network is selected for experimental verification, and the proposed method is compared with mainstream transposed convolution methods. Experimental results show that the proposed method effectively reduces resource consumption and power consumption, and that the effective performance of the accelerator is 1.13x to 23.92x higher than that of existing transposed convolution methods.
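
The grouping and reorganization steps that the abstract describes for the conventional approach can be illustrated with a minimal 1-D sketch. It is not the paper's unified-transformation-matrix method, whose details the abstract does not give. The stride of 2, the 6-tap kernel, the use of the standard Winograd F(2,3) tile, and all function names are assumptions chosen for illustration only.

```python
def transposed_conv1d_direct(x, w, stride):
    """Reference: scatter each input sample, scaled by the kernel, into the output."""
    out = [0.0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            out[i * stride + k] += xi * wk  # overlapping summation
    return out


def winograd_f23(d, g):
    """Standard F(2,3): two outputs of a 3-tap correlation from a 4-sample tile."""
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]


def transposed_conv1d_grouped(x, w, stride=2):
    """Grouped form (illustrative): split the kernel by output phase,
    run Winograd on each group, then interleave the partial results."""
    # Assumed shapes: stride 2, 6-tap kernel, even-length input.
    assert stride == 2 and len(w) == 6 and len(x) % 2 == 0
    out = [0.0] * ((len(x) - 1) * stride + len(w))
    padded = [0.0, 0.0] + list(x) + [0.0, 0.0]
    for r in range(stride):
        sub = w[r::stride]    # decomposition: taps w[r], w[r+2], w[r+4]
        flipped = sub[::-1]   # kernel inversion: convolution as correlation
        for j in range(0, len(x) + 2, 2):
            y = winograd_f23(padded[j:j + 4], flipped)
            # Reorganization: compute output coordinates for this phase.
            out[j * stride + r] = y[0]
            out[(j + 1) * stride + r] = y[1]
    return out


x = [1.0, 2.0, 3.0, 4.0]
w = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
direct = transposed_conv1d_direct(x, w, 2)
grouped = transposed_conv1d_grouped(x, w, 2)
```

In this sketch the two results agree element by element. The index arithmetic in the inner loop is the kind of coordinate calculation that the abstract identifies as a source of design difficulty in the grouped approach.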

Key words: unified transformation matrix, Winograd transposed convolution, field programmable gate array, accelerator

CLC Number: 

  • TP18
