Electronic Science and Technology (电子科技) ›› 2021, Vol. 34 ›› Issue (5): 54-60. doi: 10.16180/j.cnki.issn1007-7820.2021.05.010

A Matrix Multiplication Mapping Technology Based on NoC Multi-Core System

WANG Yang, WANG Xiaolei, YUAN Ziang, YUAN Ruming

  1. School of Electronic Science and Applied Physics, Hefei University of Technology, Hefei 230009, Anhui, China
  • Online: 2021-05-15 Published: 2021-05-24
  • About the authors: WANG Yang (born 1996), male, master's student. Research interests: integrated circuit design and testing. | WANG Xiaolei (born 1978), female, associate professor. Research interests: integrated circuit design theory.
  • Supported by:
    National Natural Science Foundation of China (61874156)

Abstract:

Matrix multiplication is a basic operation in modern signal processing, and improving data-parallel processing capability is of practical importance for raising its performance. This study performs task scheduling and resource allocation for compute-intensive matrix multiplication of different dimensions on a NoC-based multi-core system and implements several mapping schemes suited to different matrix multiplications, reaching a peak performance of 5078 MFLOPS. The designed operation units are relatively independent and reconfigurable, so the design extends and generalizes well to matrix multiplication of arbitrary dimensions. This addresses the low efficiency and poor scalability of general-purpose matrix multipliers with fixed structures, which are limited by I/O bandwidth and computing resources. Analysis of experimental results for matrix multiplications of different dimensions confirms the performance and correctness of the design.

Key words: matrix multiplication, parallel computing, NoC multi-core, compute-intensive, task scheduling, resource allocation, generality, I/O bandwidth

CLC Number:

  • TN492