Loop tiling for optimization of locality and parallelism

Cited by: 0

Authors:

Liu, Song [1]

Wu, Weiguo [1]

Zhao, Bo [1]

Jiang, Qing [1]

Affiliation:

[1] School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China

Source:

Jisuanji Yanjiu yu Fazhan/Computer Research and Development | 2015, Vol. 52, No. 5

Keywords:

Economic and social effects - Memory architecture - Optimal systems - Codes (symbols) - Ion beams - Iterative methods

DOI:

10.7544/issn1000-1239.2015.20131387

Abstract:

Loop tiling is a widely used loop transformation for exposing and exploiting parallelism and data locality on modern computer architectures. It falls into two main categories: fixed and parameterized. This paper systematically summarizes both types of tiling and analyzes their advantages and disadvantages. Because the tile size significantly affects the performance of tiled code, various methods for selecting an optimal tile size are described. The paper also surveys techniques for multi-level tiling, parallelism exploration, and imperfectly nested loops. Based on a detailed analysis of current research on loop tiling, several conclusions are drawn: 1) How to balance the trade-off between computational complexity and the generation efficiency of tiled code has not been completely solved, and how to use loop boundaries to efficiently bound the iteration spaces for data locality enhancement also needs further study. 2) Optimal tile size selection is still a difficult and open question, and it would be valuable to understand how tile sizes at different levels of a hierarchical memory system influence performance. 3) From the perspective of application, how to automatically generate effective tiled code for arbitrarily nested loops needs further research. In addition, how to take full advantage of shared hierarchical memory and multi-core architectures to achieve a high degree of parallelism for tiled code is another interesting direction. © 2015, Science Press. All rights reserved.
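As a rough illustration of the transformation the abstract describes, the following C sketch contrasts an untiled matrix multiplication with a tiled version whose tile size is a run-time argument, which is the sense of "parameterized" assumed here. It is not taken from the paper. The function names `matmul_naive` and `matmul_tiled`, the helper `min_sz`, the row-major layout, and the use of a single tile size for all three loops are assumptions made only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Untiled reference: C += A * B for n x n row-major matrices. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}

/* Hypothetical helper: smaller of two sizes, used to clip partial tiles. */
static size_t min_sz(size_t x, size_t y)
{
    return x < y ? x : y;
}

/*
 * Tiled version. The three outer loops step over tiles; the three
 * inner loops visit the points inside one tile. Because `tile` is a
 * run-time parameter, the same code serves any tile size. A fixed
 * tiling would instead hard-code the size as a compile-time constant.
 */
void matmul_tiled(size_t n, size_t tile,
                  const double *a, const double *b, double *c)
{
    assert(tile > 0);
    for (size_t ii = 0; ii < n; ii += tile)
        for (size_t jj = 0; jj < n; jj += tile)
            for (size_t kk = 0; kk < n; kk += tile)
                /* min_sz clips the last tile when n is not a multiple of tile. */
                for (size_t i = ii; i < min_sz(ii + tile, n); i++)
                    for (size_t j = jj; j < min_sz(jj + tile, n); j++)
                        for (size_t k = kk; k < min_sz(kk + tile, n); k++)
                            c[i * n + j] += a[i * n + k] * b[k * n + j];
}
```

Both functions accumulate the products for each output element in the same order of `k`, so they produce identical results. Which tile size performs best depends on the cache hierarchy of the target machine, which is the selection problem the abstract calls open.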

Pages: 1160-1176
