Automatic Kernel Fusion for Image Processing DSLs

Cited by: 11
Authors
Qiao, Bo [1]
Reiche, Oliver [1]
Hannig, Frank [1]
Teich, Juergen [1]
Affiliations
[1] Friedrich Alexander Univ Erlangen Nurnberg FAU, Hardware Software Codesign, Dept Comp Sci, Erlangen, Germany
Keywords
Domain-Specific Languages; Image Processing; Kernel Fusion; GPUs; LANGUAGE; COMPILER
DOI
10.1145/3207719.3207723
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Programming image processing algorithms on hardware accelerators such as graphics processing units (GPUs) often involves a trade-off between software portability and performance portability. Domain-specific languages (DSLs) have proven to be a promising remedy, enabling optimization and generation of efficient code from a concise, high-level algorithm representation. This paper presents an optimization framework for image processing DSLs in the form of a source-to-source compiler. To cope with inter-kernel communication that is bound by global memory in GPU applications, kernel fusion is investigated as a primary optimization technique to improve temporal locality. To enable automatic kernel fusion, we analyze the fusibility of each kernel in the algorithm in terms of data dependencies, resource utilization, and parallelism granularity. By combining the obtained information with the domain-specific knowledge captured in the DSL, a method to automatically fuse suitable kernels is proposed and integrated into an open source DSL framework. The novel kernel fusion technique is evaluated on two filter-based image processing applications, for which speedups of up to 1.60 are obtained on an NVIDIA GeForce 745 graphics card.
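The idea of kernel fusion described in the abstract can be illustrated with a minimal sketch. It is not the paper's compiler or its fusibility analysis: the two point operators `scale` and `offset`, the functions `run_unfused` and `run_fused`, and the `traffic` counter are all hypothetical names, and plain Python lists stand in for GPU global memory. The sketch only shows why fusing a producer kernel with its consumer removes the round trip of the intermediate image through global memory.

```python
# Illustrative only: Python lists stand in for images held in GPU global memory.
# All names below are hypothetical and not taken from the paper or its framework.

def scale(p):
    """Hypothetical point operator 1."""
    return p * 2

def offset(p):
    """Hypothetical point operator 2."""
    return p + 1

def run_unfused(image):
    """Two separate kernels; the intermediate image passes through 'global memory'."""
    traffic = 0                  # counts simulated global-memory reads and writes
    tmp = []
    for p in image:              # kernel 1: read input pixel, write intermediate pixel
        tmp.append(scale(p))
        traffic += 2
    out = []
    for p in tmp:                # kernel 2: read intermediate pixel, write output pixel
        out.append(offset(p))
        traffic += 2
    return out, traffic

def run_fused(image):
    """One fused kernel; the intermediate value stays local, like a register."""
    traffic = 0
    out = []
    for p in image:              # read input pixel, write output pixel
        out.append(offset(scale(p)))
        traffic += 2
    return out, traffic

image = [0, 1, 2, 3]
unfused_out, unfused_traffic = run_unfused(image)
fused_out, fused_traffic = run_fused(image)
assert unfused_out == fused_out  # fusion must not change the result
print(unfused_traffic, fused_traffic)
```

In this toy setting the fused version performs half the simulated memory accesses. Real fusion decisions are harder, since, as the abstract notes, they also depend on data dependencies, resource utilization, and parallelism granularity, none of which this sketch models.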
Pages: 76-85
Number of pages: 10