西安电子科技大学学报 ›› 2021, Vol. 48 ›› Issue (6): 84-95.doi: 10.19665/j.issn1001-2400.2021.06.011

• 智能嵌入式系统结构与软件关键技术专栏 • 上一篇    下一篇

面向超级计算机系统的大规模图遍历优化

谭雯1(),甘新标1(),白皓1(),肖调杰1(),陈旭光1(),雷书梦2(),刘杰1()   

  1. 1.国防科技大学 计算机学院,湖南 长沙,410073
    2.湖南信息学院 通识教育学院,湖南 长沙,410217
  • 收稿日期:2021-08-16 出版日期:2021-12-20 发布日期:2022-02-24
  • 通讯作者: 甘新标
  • 作者简介:谭 雯(1996—),女,国防科技大学硕士研究生,E-mail: lingXiTW@qq.com|白 皓(1993—),男,国防科技大学硕士研究生,E-mail: baihaobbg@163.com|肖调杰(1991—),男,助理研究员,E-mail: xiaotiaojie@nudt.edu.cn|陈旭光(1991—),男,助理研究员,E-mail: chenxuguang@nudt.edu.cn|雷书梦(1990—),女,讲师,E-mail: leishumeng2020@163.com|刘 杰(1969—),男,研究员,E-mail: liujie@nudt.edu.cn
  • 基金资助:
    国家重点研究与发展计划(2018YFB0204301);国家自然科学基金(61902411);国家数值风洞项目(NNW2019ZT6-B21);国家数值风洞项目(NNW2019ZT6-B20);国家数值风洞项目(NNW2019ZT5-A10);PDL基金(6142110190206);PDL基金(6142110180203);湖南省自然科学基金(2020JJ4669)

Optimization of large-scale graph traversal for supercomputers

TAN Wen1(),GAN Xinbiao1(),BAI Hao1(),XIAO Tiaojie1(),CHEN Xuguang1(),LEI Shumeng2(),LIU Jie1()   

  1. 1. College of Computer Science and Technology,National University of Defense Technology,Changsha 410073,China
    2. College of General Education,Information College of Hunan,Changsha 410217,China
  • Received:2021-08-16 Online:2021-12-20 Published:2022-02-24
  • Contact: Xinbiao GAN

摘要:

现实中的数据问题通常被抽象为图。在大数据时代,图数据趋于复杂,这是因为数据量大幅提升,所需要的计算规模迅速增长。大规模的图数据问题对超算平台的存储运算能力具有广泛需求,并对此提出了更高的要求。为了高效地处理大规模图数据,发挥天河超级计算机实验平台的图处理能力,基于现实世界中图结构的小世界性和无尺度性特征,面向评测超级计算机图处理能力的重要基准Graph500,提出一种主要应用于大规模图的图遍历优化方法。这一方法结合了天河平台的体系结构特征,在图结构上应用了顶点排序和优先缓存策略,即将图中顶点按度数从高到低排序,令程序在图遍历阶段优先访问高度数邻居顶点,并将部分关键高度数顶点缓存至天河系统核组内的高速缓存中,以此来减少Graph500基准程序中的无效访存,降低进程间的通信开销,提高访存带宽利用率,从而有效地提升Graph500基准测试程序在天河平台上的性能。面向天河超级计算机系统实验平台提出的应用顶点排序与优先缓存优化方法的VS-Graph500程序,其加速的效果显著,可扩展性好。当图测试规模为237时,全系统稳定测试性能为2 547.13 GTEPS,超过2020年11月Graph500国际排名榜上第7名的数据。

关键词: Graph500基准, 图结构, 顶点排序, 优先缓存, 超级计算机系统

Abstract:

In the big data era,with the significant development of graph data,the demand for computing resources is growing rapidly.Supercomputers are applied to process large-scale graph data,which puts forward higher requirements for the storage and computing capabilities of supercomputers.In order to efficiently process large-scale graph data and evaluate the graph processing capabilities of the Tianhe supercomputer,in this paper we propose a graph traversal optimization technique for improving the efficiency of the benchmark program of Graph500,an important benchmark for evaluating graph processing capabilities of supercomputer.The technique mainly adopts the vertex sorting and priority caching strategy,where the vertices in the graph are sorted by degree in a descending order and some key vertices are stored in the cache of the core group of the Tianhe system.Therefore,this technique cuts down on invalid memory access and reduces the communication overhead between processes for maximizing the usage of the bandwidth for the supercomputer system.In order to validate graph traversal based on vertex sorting and buffering,an optimized graph500 version named VS-graph500 is customized for the Tianhe supercomputer,experimental results demonstrate that the VS-graph500 has a significant acceleration and good scalability in the supercomputers testing system,and attains a stable testing performance at 2547.13EGTEPS when the graph testing scale is 37,which is superior to the 7th in Graph500 list in June 2020.

Key words: Graph500, graph structures, vertex sorting, buffer storage, supercomputers

中图分类号: 

  • TP391
Baidu
map