Decoding of regular LDPC codes accelerated by the GPU

doi:10.3969/j.issn.1001-2400.2017.03.005

Journal of Xidian University

REN Jilin; CHE Shuling; ZHENG Zheng

(State Key Lab. of Integrated Service Networks, Xidian Univ., Xi'an 710071, China)

Received: 2016-07-20 Online: 2017-06-20 Published: 2017-07-17
About the author: REN Jilin (1990-), male, master's student at Xidian University, E-mail: jlren@stu.xidian.edu.cn
Funding:
Supported by the National Natural Science Foundation of China (61101148) and the Fundamental Research Funds for the Central Universities (K5051301008)

Abstract:

To exploit the high-speed parallelism of the graphics processing unit (GPU) and the parallelizable parts of the decoding process of regular low-density parity-check (LDPC) codes, a method is proposed that uses the GPU to accelerate the decoding of regular LDPC codes. The method decodes in parallel over the edges of the nodes rather than over the nodes themselves, which improves thread utilization. In addition, data are stored during decoding in the GPU's fast on-chip memory, namely shared memory and registers, which reduces the dependence on global memory and shortens data access time. Simulation results show that, with edge parallelism and on-chip memory, decoding is about 5.32 to 10.41 times as fast as a GPU-based LDPC decoding program that does not use the optimizations proposed in this paper.

Key words: low-density parity-check (LDPC) codes, graphics processing unit (GPU), compute unified device architecture (CUDA), parallel computing, shared memory, registers
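
As a rough illustration of the edge-parallel idea described in the abstract, the following Python sketch emulates the two thread layouts on the CPU. It is not the authors' code. The min-sum check-node update, the flat edge ordering (edge `e` belongs to check `e // DC`), and all names (`DC`, `check_update_for_edge`, `edge_parallel`, `node_parallel`) are assumptions made for this example; the abstract does not state which update rule or data layout the paper uses.

```python
# Sketch only: emulates GPU thread layouts with plain Python loops.
# Assumption: min-sum check-node update on a regular code, where every
# check node has the same degree DC and its edges are stored contiguously.

DC = 4  # hypothetical check-node degree


def sign(x):
    """Return -1.0 for negative inputs and 1.0 otherwise."""
    return -1.0 if x < 0 else 1.0


def check_update_for_edge(v2c, e, dc=DC):
    """Work of ONE logical thread in the edge-parallel layout.

    v2c holds variable-to-check messages, one per edge. The outgoing
    check-to-variable message on edge e combines the messages on the
    other dc - 1 edges of the same check node.
    """
    base = (e // dc) * dc  # first edge of the check node that owns e
    s, m = 1.0, float("inf")
    for k in range(base, base + dc):
        if k != e:  # exclude the edge's own incoming message
            s *= sign(v2c[k])
            m = min(m, abs(v2c[k]))
    return s * m


def edge_parallel(v2c, dc=DC):
    """One logical thread per edge: len(v2c) independent work items."""
    return [check_update_for_edge(v2c, e, dc) for e in range(len(v2c))]


def node_parallel(v2c, dc=DC):
    """One logical thread per check node: each loops over its dc edges."""
    out = [0.0] * len(v2c)
    for c in range(len(v2c) // dc):  # each iteration = one thread
        for e in range(c * dc, (c + 1) * dc):  # serial work inside a thread
            out[e] = check_update_for_edge(v2c, e, dc)
    return out


if __name__ == "__main__":
    msgs = [0.5, -1.0, 2.0, -0.25, 1.5, 0.75, -3.0, 0.1]  # two check nodes
    print(edge_parallel(msgs))
    print(edge_parallel(msgs) == node_parallel(msgs))
```

Both layouts produce the same messages; the difference is granularity. The node-parallel version exposes one work item per check node, while the edge-parallel version exposes `DC` times as many, each doing less serial work. On a GPU, the index `e` would plausibly be derived from the block and thread indices, and the `DC` messages of one check node could be staged in shared memory so that sibling threads read them from on-chip storage rather than global memory.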

REN Jilin; CHE Shuling; ZHENG Zheng. Decoding of regular LDPC codes accelerated by the GPU[J]. Journal of Xidian University, 2017, 44(3): 25-30.

