From Loop Fusion to Kernel Fusion: A Domain-Specific Approach to Locality Optimization

Authors
Qiao, Bo [1]
Reiche, Oliver [1]
Hannig, Frank [1]
Teich, Juergen [1]
Affiliations
[1] Friedrich Alexander Univ Erlangen Nurnberg FAU, Erlangen, Germany
Keywords
ALGORITHM;
DOI
10.5281/zenodo.2240193
Chinese Library Classification
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Optimizing data-intensive applications such as image processing for GPU targets with complex memory hierarchies requires exploring the tradeoffs among locality, parallelism, and computation. Loop fusion, a classical optimization technique, has proven effective at improving locality at the function level. Image processing algorithms are growing in complexity and generally consist of many kernels in a pipeline. Inter-kernel communication is intensive and presents a further opportunity for locality improvement at the system level. This paper focuses on kernel fusion, an optimization technique for improving data locality. We present a formal description of the problem by defining an objective function for locality optimization. By transforming the fusion problem into a graph partitioning problem, we propose a solution based on the minimum cut technique that searches for fusible kernels recursively. In addition, we develop an analytic model that quantitatively estimates the potential locality improvement by incorporating domain-specific knowledge and architecture details. The proposed technique is implemented in Hipacc, an image processing DSL and source-to-source compiler, and evaluated on six image processing applications on three Nvidia GPUs. A geometric mean speedup of up to 2.52 can be observed in our experiments.
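The recursive minimum-cut search mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the edge weights, the objective function, or the legality conditions for fusion, so the sketch assumes a hypothetical `is_fusible` predicate supplied by the caller, treats edge weights as an abstract locality benefit, and uses the Stoer-Wagner global minimum cut as a stand-in for whatever cut procedure the authors use. The names `stoer_wagner` and `partition` are illustrative only.

```python
def stoer_wagner(nodes, edges):
    """Global minimum cut of an undirected weighted graph (Stoer-Wagner).

    nodes: iterable of hashable node names.
    edges: dict mapping (u, v) pairs to non-negative weights; pairs with an
           endpoint outside `nodes` are ignored.
    Returns (cut_weight, side), where `side` is one side of the cut.
    """
    adj = {n: {} for n in nodes}
    for (u, v), w in edges.items():
        if u in adj and v in adj and u != v:
            adj[u][v] = adj[u].get(v, 0) + w
            adj[v][u] = adj[v].get(u, 0) + w
    groups = {n: {n} for n in adj}  # original nodes merged into each node
    best_weight, best_side = float("inf"), None
    while len(adj) > 1:
        # One phase: grow an ordering by always adding the node that is
        # most tightly connected to the nodes added so far.
        start = next(iter(adj))
        order = [start]
        conn = {v: adj[start].get(v, 0) for v in adj if v != start}
        while conn:
            nxt = max(conn, key=conn.get)
            cut_of_phase = conn.pop(nxt)
            order.append(nxt)
            for v, w in adj[nxt].items():
                if v in conn:
                    conn[v] += w
        s, t = order[-2], order[-1]
        # Cutting off the last-added node is a candidate minimum cut.
        if cut_of_phase < best_weight:
            best_weight, best_side = cut_of_phase, set(groups[t])
        # Merge t into s before the next phase.
        for v, w in adj[t].items():
            if v != s:
                adj[s][v] = adj[s].get(v, 0) + w
                adj[v][s] = adj[v].get(s, 0) + w
            del adj[v][t]
        del adj[t]
        groups[s] |= groups.pop(t)
    return best_weight, best_side


def partition(block, edges, is_fusible):
    """Split `block` along minimum cuts until every part is fusible.

    `is_fusible` is a hypothetical, caller-supplied legality check.
    """
    block = set(block)
    if len(block) == 1 or is_fusible(block):
        return [block]
    _, side = stoer_wagner(sorted(block), edges)
    return (partition(side, edges, is_fusible)
            + partition(block - side, edges, is_fusible))


# Toy pipeline a -> b -> c -> d; weights are made-up locality benefits.
edges = {("a", "b"): 5, ("b", "c"): 1, ("c", "d"): 4}
# Assumed legality rule for the demo: at most two kernels per fused block.
blocks = partition({"a", "b", "c", "d"}, edges, lambda blk: len(blk) <= 2)
print(sorted(sorted(b) for b in blocks))  # [['a', 'b'], ['c', 'd']]
```

In this toy pipeline the cheapest edge to cut is the one between `b` and `c`, so the heavier edges, which stand for the larger locality benefit, stay inside the fused blocks.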
Pages: 242-253
Number of pages: 12