ASPaS: A Framework for Automatic SIMDization of Parallel Sorting on x86-based Many-core Processors

Times cited: 15

Authors:

Hou, Kaixi [1]

Wang, Hao [1]

Feng, Wu-chun [1]

Affiliations:

[1] Virginia Tech, Dept Comp Sci, Blacksburg, VA 24060 USA

Source:

PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS'15) | 2015

Funding:

National Science Foundation (USA);

Keywords:

sort; merge; transpose; vectorization; SIMD; ISA; MIC; AVX; AVX-512;

DOI:

10.1145/2751205.2751247

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Due to the difficulty that modern compilers have in vectorizing applications on vector-extension architectures, programmers resort to manually programming vector registers with intrinsics in order to achieve better performance. However, the continued growth in the width of registers and the evolving library of intrinsics make such manual optimizations tedious and error-prone. Hence, we propose a framework for the Automatic SIMDization of Parallel Sorting (ASPaS) on x86-based multicore and manycore processors. That is, ASPaS takes any sorting network and a given instruction set architecture (ISA) as inputs and automatically generates vectorized code for that sorting network. By formalizing the sort function as a sequence of comparators and the transpose and merge functions as sequences of vector-matrix multiplications, ASPaS can map these functions to operations from a selected "pattern pool" that is based on the characteristics of parallel sorting, and then generate the vectorized code with the real ISA intrinsics. The performance evaluation of our ASPaS framework on the Intel Xeon Phi coprocessor illustrates that automatically generated sorting codes from ASPaS can outperform the sorting implementations from STL, Boost, and Intel TBB.
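
The idea of treating a sort as a sequence of comparators can be illustrated with a minimal sketch. This is not ASPaS output and does not use real ISA intrinsics: the 5-comparator network `NETWORK_4`, the function names, and the use of plain Python lists to stand in for vector registers are all assumptions made for illustration. Each comparator `(i, j)` becomes an element-wise min and max over two "registers", which is roughly how a comparator can map onto SIMD min/max operations.

```python
# Hypothetical 4-input sorting network (illustrative, not taken from the paper).
# Each pair (i, j) is one comparator: the smaller value ends up at position i,
# the larger at position j.
NETWORK_4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]


def apply_network(values, network=NETWORK_4):
    """Sort a fixed-size sequence by running every comparator in order."""
    out = list(values)
    for i, j in network:
        lo, hi = min(out[i], out[j]), max(out[i], out[j])
        out[i], out[j] = lo, hi
    return out


def apply_network_lanes(rows, network=NETWORK_4):
    """Run the same network on several stand-in 'vector registers' at once.

    rows[k] plays the role of one vector register. Every comparator becomes
    an element-wise min and an element-wise max, so each column (lane) is
    sorted independently of the others.
    """
    regs = [list(r) for r in rows]
    for i, j in network:
        lo = [min(a, b) for a, b in zip(regs[i], regs[j])]
        hi = [max(a, b) for a, b in zip(regs[i], regs[j])]
        regs[i], regs[j] = lo, hi
    return regs
```

In this sketch the lane-wise version leaves each column sorted rather than each row, so the sorted runs are not yet contiguous in any one register; the abstract lists transpose and merge as separate functions that follow the sort.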

Citation

Pages: 383-392

Page count: 10

