High performance CPML acceleration scheme with GPU for FDTD

doi:10.3969/j.issn.1001-2400.2015.01.031

Abstract

To overcome the computational and memory-access redundancy of the traditional GPU-accelerated CPML technique, a novel division-free, minimum-access CPML scheme is proposed. In the proposed scheme, the division operations in the CPML method are merged into a set of fixed coefficients by rearranging the CPML iteration process, and the redundant memory accesses are then eliminated by performing the FDTD and CPML updates in the PML region jointly. Experimental results show that the proposed scheme saves up to 70% of the operation time compared with the traditional GPU-CPML technique, and 44% of the field-updating time in the PML region, without any loss of accuracy.
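As a rough illustration of the two ideas in the abstract, the sketch below updates a single PML cell of a 1D electric field in two ways: once with the divisions by the cell size and by kappa performed every time step, and once with those divisions folded into coefficients computed a single time before the loop. The coefficients b and c use the standard CPML recursive-convolution form; the function names, the 1D setting, and the parameter values are assumptions made for illustration and are not taken from the paper's GPU kernels.

```python
import math

EPS0 = 8.854187817e-12  # vacuum permittivity, F/m


def cpml_coeffs(sigma, kappa, alpha, dt):
    """Standard CPML recursive-convolution coefficients b and c."""
    b = math.exp(-(sigma / kappa + alpha) * dt / EPS0)
    c = sigma * (b - 1.0) / (kappa * (sigma + kappa * alpha))
    return b, c


def step_with_divisions(e, psi, dh, sigma, kappa, alpha, dt, dx):
    """Reference update: divides by dx and kappa on every call."""
    b, c = cpml_coeffs(sigma, kappa, alpha, dt)
    deriv = dh / dx                      # spatial derivative of H
    psi = b * psi + c * deriv            # CPML auxiliary variable
    e = e + (dt / EPS0) * (deriv / kappa + psi)
    return e, psi


def precompute(sigma, kappa, alpha, dt, dx):
    """Hypothetical helper: fold every division into fixed coefficients."""
    b, c = cpml_coeffs(sigma, kappa, alpha, dt)
    return b, c / dx, dt / (EPS0 * kappa * dx), dt / EPS0


def step_division_free(e, psi, dh, coeffs):
    """Joint psi and field update: multiplies and adds only, dh read once."""
    b, c_dx, ce, cpsi = coeffs
    psi = b * psi + c_dx * dh
    e = e + ce * dh + cpsi * psi
    return e, psi
```

Calling `precompute` once and then `step_division_free` in the time loop should reproduce `step_with_divisions` to within floating-point rounding. On a GPU the same arrangement would let one kernel read each field difference once and update both the auxiliary variable and the field, which is the kind of joint update the abstract describes.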

Key words: finite difference time domain method, convolution perfectly matched layer, graphics processing unit, parallel computing, compute unified device architecture

CLC Number: TP391.9

BAI Bing; NIU Zhongqi. High performance CPML acceleration scheme with GPU for FDTD [J]. J4, 2015, 42(1): 194-199+212.

