ASPaS: A Framework for Automatic SIMDization of Parallel Sorting on x86-based Many-core Processors

Cited by: 15
Authors
Hou, Kaixi [1]
Wang, Hao [1]
Feng, Wu-chun [1]
Affiliations
[1] Virginia Tech, Dept Comp Sci, Blacksburg, VA 24060 USA
Funding
National Science Foundation (USA)
Keywords
sort; merge; transpose; vectorization; SIMD; ISA; MIC; AVX; AVX-512;
DOI
10.1145/2751205.2751247
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Due to the difficulty that modern compilers have in vectorizing applications on vector-extension architectures, programmers resort to manually programming vector registers with intrinsics in order to achieve better performance. However, the continued growth in the width of registers and the evolving library of intrinsics make such manual optimizations tedious and error-prone. Hence, we propose a framework for the Automatic SIMDization of Parallel Sorting (ASPaS) on x86-based multicore and manycore processors. That is, ASPaS takes any sorting network and a given instruction set architecture (ISA) as inputs and automatically generates vectorized code for that sorting network. By formalizing the sort function as a sequence of comparators and the transpose and merge functions as sequences of vector-matrix multiplications, ASPaS can map these functions to operations from a selected "pattern pool" that is based on the characteristics of parallel sorting, and then generate the vectorized code with the real ISA intrinsics. The performance evaluation of our ASPaS framework on the Intel Xeon Phi coprocessor illustrates that automatically generated sorting codes from ASPaS can outperform the sorting implementations from STL, Boost, and Intel TBB.
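The idea of treating the sort function as a sequence of comparators can be illustrated with a minimal scalar sketch. This is not code generated by ASPaS and uses no real ISA intrinsics. The names `NETWORK_4`, `vmin`, `vmax`, `sort_columns`, `transpose`, and `sort_segments` are hypothetical, and Python lists stand in for vector registers. Each comparator becomes one lane-wise min and one lane-wise max across two "registers", which sorts every column at once; a transpose then turns the sorted columns into sorted rows.

```python
# Assumption: a plain-Python model of lane-wise vector operations,
# not the intrinsics that ASPaS emits.

# Standard 4-input sorting network with 5 comparators. Each pair (i, j)
# leaves the smaller value in position i and the larger in position j.
NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]


def vmin(a, b):
    """Lane-wise minimum of two equal-length 'registers'."""
    return [min(x, y) for x, y in zip(a, b)]


def vmax(a, b):
    """Lane-wise maximum of two equal-length 'registers'."""
    return [max(x, y) for x, y in zip(a, b)]


def sort_columns(rows, network):
    """Apply the comparator network across rows, sorting each column."""
    rows = [list(r) for r in rows]  # copy so the input is not modified
    for i, j in network:
        lo, hi = vmin(rows[i], rows[j]), vmax(rows[i], rows[j])
        rows[i], rows[j] = lo, hi
    return rows


def transpose(rows):
    """Turn columns into rows."""
    return [list(col) for col in zip(*rows)]


def sort_segments(rows, network=NETWORK_4):
    """Sort columns with the network, then transpose into sorted rows."""
    return transpose(sort_columns(rows, network))


if __name__ == "__main__":
    data = [[4, 8, 1, 9],
            [2, 7, 3, 0],
            [6, 5, 11, 10],
            [1, 3, 2, 12]]
    for segment in sort_segments(data):
        print(segment)
```

Because the comparator sequence is fixed and independent of the data, the loop has no data-dependent branches, which is the property that lets a sorting network map onto vector min/max operations.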
Pages: 383-392
Page count: 10