Automatic tuning to performance modelling of matrix polynomials on multicore and multi-GPU systems

Cited by: 2

Authors:

Boratto, Murilo [1]

Alonso, Pedro [2]

Gimenez, Domingo [3]

Lastovetsky, Alexey [4]

Affiliations:

[1] Univ Estado Bahia, Nucleo Arquitetura Comp & Sistemas Operacionais, Salvador, BA, Brazil

[2] Univ Politecn Valencia, Dept Sistemas Informat & Comp, Valencia, Spain

[3] Univ Murcia, Dept Sistemas Informat, Murcia, Spain

[4] Univ Coll Dublin, Sch Comp Sci, Heterogeneous Comp Lab, Dublin, Ireland

Source:

JOURNAL OF SUPERCOMPUTING | 2017 / Vol. 73 / Issue 01

Keywords:

Automatic tuning; Matrix polynomials; Performance; Multicore; Multi-GPU;

DOI:

10.1007/s11227-016-1694-y

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Automatic tuning methodologies have been used in the design of routines in recent years. The goal of these methodologies is to develop routines which automatically adapt to the conditions of the underlying computational system so that efficient executions are obtained independently of the end-user experience. This paper aims to explore programming routines that can automatically be adapted to the computational system conditions thanks to these automatic tuning methodologies. In particular, we have worked on the evaluation of matrix polynomials on multicore and multi-GPU systems as a target application. This application is very useful for the computation of matrix functions like the sine or cosine but, at the same time, the application is very time consuming since the basic computational kernel, which is the matrix multiplication, is carried out many times. The use of all available resources within a node in an easy and efficient way is crucial for the end user.


Pages: 227-239

Page count: 13

50 records in total

[41] Autonomous Execution for Multi-GPU Systems: Compiler Support
Koç University, Istanbul, Turkey
Unknown
CA, United States
Proc. SC-W: Workshops Int. Conf. High Perform. Comput., Netw., Storage Anal.: 1129 - 1140
[42] Efficient breadth first search on multi-GPU systems
Mastrostefano, Enrico
Bernaschi, Massimo
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2013, 73 (09) : 1292 - 1305
[43] Dynamic load balancing on heterogeneous multi-GPU systems
Acosta, Alejandro
Blanco, Vicente
Almeida, Francisco
COMPUTERS & ELECTRICAL ENGINEERING, 2013, 39 (08) : 2591 - 2602
[44] Tensor Movement Orchestration in Multi-GPU Training Systems
Lin, Shao-Fu
Chen, Yi-Jung
Cheng, Hsiang-Yun
Yang, Chia-Lin
2023 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, HPCA, 2023, : 1140 - 1152
[45] Gossip: Efficient Communication Primitives for Multi-GPU Systems
Kobus, Robin
Juenger, Daniel
Hundt, Christian
Schmidt, Bertil
PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP 2019), 2019,
[46] MGPUSim: Enabling Multi-GPU Performance Modeling and Optimization
Sun, Yifan
Baruah, Trinayan
Mojumder, Saiful A.
Dong, Shi
Gong, Xiang
Treadway, Shane
Bao, Yuhui
Hance, Spencer
McCardwell, Carter
Zhao, Vincent
Barclay, Harrison
Ziabari, Amir Kavyan
Chen, Zhongliang
Ubal, Rafael
Abelian, Jose L.
Kim, John
Joshi, Ajay
Kaeli, David
PROCEEDINGS OF THE 2019 46TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA '19), 2019, : 197 - 209
[47] Solving Multiple Tridiagonal Systems on a Multi-GPU Platform
Dieguez, Adrian P.
Amor, Margarita
Doallo, Ramon
2018 26TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2018), 2018, : 759 - 763
[48] Optimization of Large-Scale Sparse Matrix-Vector Multiplication on Multi-GPU Systems
Gao, Jianhua
Ji, Weixing
Wang, Yizhuo
ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2024, 21 (04)
[49] Automatic Parallelization of Kernels in Shared-Memory Multi-GPU Nodes
Cabezas, Javier
Vilanova, Lluis
Gelado, Isaac
Jablin, Thomas B.
Navarro, Nacho
Hwu, Wen-mei W.
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS'15), 2015, : 3 - 13
[50] WORKLOAD-AWARE AUTOMATIC PARALLELIZATION FOR MULTI-GPU DNN TRAINING
Shin, Sungho
Jo, Youngmin
Choi, Jungwook
Venkataramani, Swagath
Srinivasan, Vijayalakshmi
Sung, Wonyong
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 1453 - 1457
