Fine-Grained Energy-Efficient Sorting on a Many-Core Processor Array

Cited by: 4
Authors
Stillmaker, Aaron [1]
Stillmaker, Lucas [1]
Baas, Bevan [1]
Affiliation
[1] Univ Calif Davis, Dept Elect & Comp Engn, Davis, CA 95616 USA
Keywords
parallel processing; external sorting; streaming sorting; fine-grained many-core; processor array; modular programming
DOI
10.1109/ICPADS.2012.93
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Data centers require significant and growing amounts of power to operate, and with increasing numbers of data centers worldwide, power consumption for enterprise workloads is a significant concern. Sorting is a key computational kernel in large database systems, and the development of energy-efficient sorting capabilities would therefore significantly reduce data center power usage. We propose highly parallel sorting algorithms and mappings using a modular design for a fine-grained many-core system that greatly decreases the amount of energy consumed to perform sorts of arbitrarily large data sets. The memory, computational, and nearest-neighbor inter-processor communication hardware of the many-core processor array require relatively small die area. We present the design and implementation of several sorting variants that perform the first phase of an external sort. They are built using program kernels operating on independent processors in a many-core array with 256 bytes of data memory and fewer than 128 instructions per processor. The algorithms employed are simple, and the vast majority of processors contain identical programs. Compared to a quicksort implementation on an Intel Core 2 Duo T9600, the highest-throughput design achieves up to 27x higher throughput per chip area, and the most energy-efficient sort yields a 330x reduction in energy dissipated per sorted block. Compared to a radix sort implementation on a GPU, the highest-throughput design achieves up to 22x higher throughput per chip area, and the most energy-efficient sort yields a 750x reduction in energy dissipated per sorted block.
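As a rough illustration of what "the first phase of an external sort" means, the sketch below cuts an input stream into fixed-size blocks and emits each block as a sorted run. Each block is sorted by a simulated linear chain of identical cells that talk only to their right-hand neighbor. This is a generic systolic insertion sort chosen for illustration and is not taken from the paper. The names `chain_sort`, `generate_runs`, and `block_size` are hypothetical.

```python
from typing import Iterable, Iterator, List


def chain_sort(block: List[int]) -> List[int]:
    """Sort one block with a simulated linear chain of identical cells.

    Each cell holds at most one key. When a key arrives, the cell keeps the
    smaller of (held, incoming) and forwards the larger to its right-hand
    neighbor. Illustrative assumption only, not the paper's algorithm.
    """
    cells: List[int] = []  # cells[i] is the key currently held by cell i
    for key in block:
        moving = key
        for i in range(len(cells)):
            if moving < cells[i]:
                # Keep the smaller key, pass the larger one along the chain.
                cells[i], moving = moving, cells[i]
        cells.append(moving)  # the first empty cell stores what is left
    return cells


def generate_runs(records: Iterable[int], block_size: int) -> Iterator[List[int]]:
    """Phase one of an external sort: yield sorted runs of up to block_size keys."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    block: List[int] = []
    for record in records:
        block.append(record)
        if len(block) == block_size:
            yield chain_sort(block)
            block = []
    if block:  # final, possibly shorter, run
        yield chain_sort(block)


if __name__ == "__main__":
    for run in generate_runs([5, 2, 9, 1, 7, 3, 8], block_size=3):
        print(run)
```

A second phase, not shown here, would merge the sorted runs into one fully sorted output.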
Pages: 652-659
Page count: 8
Related papers
50 in total
  • [1] Scalable energy-efficient parallel sorting on a fine-grained many-core processor array
    Stillmaker, Aaron
    Bohnenstiehl, Brent
    Stillmaker, Lucas
    Baas, Bevan
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2020, 138 : 32 - 47
  • [2] Thermal Management of a Many-Core Processor under Fine-Grained Parallelism
    Keceli, Fuat
    Moreshet, Tali
    Vishkin, Uzi
    [J]. EURO-PAR 2011: PARALLEL PROCESSING WORKSHOPS, PT I, 2012, 7155 : 249 - 259
  • [3] Display Stream Compression Decoders for Fine-Grained Many-Core Processor Arrays
    Wu, Shifu
    Baas, Bevan M.
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2021, 68 (05) : 1730 - 1734
  • [4] Study on Fine-grained Synchronization in Many-Core Architecture
    Yu, Lei
    Liu, Zhiyong
    Fan, Dongrui
    Song, Fenglong
    Zhang, Junchao
    Yuan, Nan
    [J]. SNPD 2009: 10TH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCES, NETWORKING AND PARALLEL DISTRIBUTED COMPUTING, PROCEEDINGS, 2009, : 524 - 529
  • [5] AsAP: A fine-grained many-core platform for DSP applications
    Baas, Bevan
    Yu, Zhiyi
    Meeuwsen, Michael
    Sattari, Omar
    Apperson, Ryan
    Work, Eric
    Webb, Jeremy
    Lai, Michael
    Mohsenin, Tinoosh
    Truong, Dean
    Cheung, Jason
    [J]. IEEE MICRO, 2007, 27 (02) : 34 - 45
  • [6] Energy-efficient canonical Huffman decoders on many-core processor arrays and FPGAs
    Sarangi, Satyabrata
    Baas, Bevan
    [J]. INTEGRATION-THE VLSI JOURNAL, 2023, 88 : 156 - 165
  • [7] A Fine-Grained Parallel Particle Swarm Optimization on Many-core and Multi-core Architectures
    Nedjah, Nadia
    Calazan, Rogerio de Moraes
    Mourelle, Luiza de Macedo
    [J]. PARALLEL COMPUTING TECHNOLOGIES (PACT 2017), 2017, 10421 : 215 - 224
  • [8] Unleashing Fine-Grained Parallelism on Embedded Many-Core Accelerators with Lightweight OpenMP Tasking
    Tagliavini, Giuseppe
    Cesarini, Daniele
    Marongiu, Andrea
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2018, 29 (09) : 2150 - 2163
  • [9] Designing Energy-Efficient Many-Core Servers for Exascale Computing
    Alonso, David Atienza
    [J]. 2017 30TH SYMPOSIUM ON INTEGRATED CIRCUITS AND SYSTEMS DESIGN (SBCCI 2017): CHOP ON SANDS, 2017, : XVIII - XVIII
  • [10] A High-Performance Parallel CAVLC Encoder on a Fine-Grained Many-Core System
    Xiao, Zhibin
    Baas, Bevan
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN, 2008, : 248 - 254