Optimization of large-scale graph traversal for supercomputers

Cited by: 0

Authors:

Tan W. ^{[1]}

Gan X. ^{[1]}

Bai H. ^{[1]}

Xiao T. ^{[1]}

Chen X. ^{[1]}

Lei S. ^{[2]}

Liu J. ^{[1]}

Affiliations:

[1] College of Computer Science and Technology, National University of Defense Technology, Changsha

[2] College of General Education, Information College of Hunan, Changsha

Source:

Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University | 2021, Vol. 48, No. 06

Keywords:

Buffer storage; Graph structures; Graph500; Supercomputers; Vertex sorting;

DOI:

10.19665/j.issn1001-2400.2021.06.011


Abstract:

In the big data era, graph data is growing rapidly, and so is the demand for computing resources to process it. Supercomputers are increasingly used to process large-scale graph data, which places higher requirements on their storage and computing capabilities. To process large-scale graph data efficiently and to evaluate the graph processing capability of the Tianhe supercomputer, this paper proposes a graph traversal optimization technique that improves the efficiency of the benchmark program of Graph500, an important benchmark for evaluating the graph processing capabilities of supercomputers. The technique mainly adopts a vertex sorting and priority caching strategy: the vertices of the graph are sorted by degree in descending order, and some key vertices are stored in the cache of the core group of the Tianhe system. This cuts down on invalid memory accesses and reduces the communication overhead between processes, making fuller use of the bandwidth of the supercomputer system. To validate graph traversal based on vertex sorting and buffering, an optimized Graph500 version named VS-graph500 is customized for the Tianhe supercomputer. Experimental results demonstrate that VS-graph500 achieves significant acceleration and good scalability on the supercomputer test system, and attains a stable performance of 2547.13EGTEPS when the graph test scale is 37, which is superior to the 7th-ranked entry on the June 2020 Graph500 list. © 2021, The Editorial Board of Journal of Xidian University. All rights reserved.
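The vertex sorting and priority caching idea described in the abstract can be illustrated with a minimal single-process sketch. This is not the VS-graph500 code: the function names `relabel_by_degree` and `bfs_with_hot_set` and the `hot_size` parameter are hypothetical, and the "hot" prefix of vertex IDs only models the set of key vertices that the paper keeps in the core-group cache. After relabeling, the highest-degree vertices receive the smallest IDs, so a small contiguous prefix absorbs a large share of the visited-status lookups during breadth-first search.

```python
from collections import deque


def relabel_by_degree(adj):
    """Relabel vertices so that ID 0 has the highest degree.

    adj is an adjacency list (list of neighbor lists).
    Returns (new_adj, order), where order[new_id] == old_id.
    """
    # Sort by descending degree; break ties by the old ID for determinism.
    order = sorted(range(len(adj)), key=lambda v: (-len(adj[v]), v))
    new_id = [0] * len(adj)
    for rank, old in enumerate(order):
        new_id[old] = rank
    new_adj = [sorted(new_id[u] for u in adj[old]) for old in order]
    return new_adj, order


def bfs_with_hot_set(adj, source, hot_size):
    """Top-down BFS that counts lookups landing in the hot prefix.

    Vertices with ID < hot_size stand in for the cached key vertices
    (an assumption of this sketch, not the paper's data layout).
    Returns (parent, hot_hits, total_lookups).
    """
    parent = [-1] * len(adj)
    parent[source] = source
    frontier = deque([source])
    hot_hits = 0
    total = 0
    while frontier:
        v = frontier.popleft()
        for u in adj[v]:
            total += 1
            if u < hot_size:
                hot_hits += 1  # this lookup would be served from the cache
            if parent[u] == -1:
                parent[u] = v
                frontier.append(u)
    return parent, hot_hits, total
```

On skewed, power-law-like graphs such as the Kronecker graphs used by Graph500, the ratio `hot_hits / total` gives a rough feel for how many lookups a small cache of high-degree vertices could serve, though the real benefit depends on the machine's memory hierarchy and the distributed data layout.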


Pages: 84-95

Page count: 11
