Energy-efficient sorting with the distributed memory architecture ePUMA

Authors
Karlsson, Andreas [1]
Sohl, Joar [1]
Liu, Dake [1]
Affiliations
[1] Linkoping Univ, Dept Elect Engn, S-58183 Linkoping, Sweden
DOI
10.1109/Trustcom.2015.620
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents the novel heterogeneous DSP architecture ePUMA and demonstrates its features through an implementation of sorting of larger data sets. We derive a sorting algorithm with fixed-size merging tasks suitable for distributed memory architectures, which allows very simple scheduling and predictable data-independent sorting time. The implementation on ePUMA utilizes the architecture's specialized compute cores and control cores, and local memory parallelism, to separate and overlap sorting with data access and control for close to stall-free sorting. Penalty-free unaligned and out-of-order local memory access is used in combination with proposed application-specific sorting instructions to derive highly efficient local sorting and merging kernels used by the system-level algorithm. Our evaluation shows that the proposed implementation can rival the sorting performance of high-performance commercial CPUs and GPUs, with two orders of magnitude higher energy efficiency, which would allow high-performance sorting on low-power devices.
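The abstract's idea of fixed-size merging tasks can be illustrated with a minimal Python sketch. This is not the paper's ePUMA implementation: the block size `BLOCK` and the functions `merge_block`, `merge_streams`, and `block_merge_sort` are hypothetical names chosen for illustration, and the block-wise merging scheme shown is a generic one, assumed here only as an example of the technique. Every merging task takes two sorted blocks of `BLOCK` elements and produces a low and a high block, so the number of tasks depends only on the input length, not on the data values.

```python
from typing import List, Tuple

BLOCK = 4  # hypothetical fixed task size (elements per block)


def merge_block(a: List[int], b: List[int]) -> Tuple[List[int], List[int]]:
    """Fixed-size task: merge two sorted BLOCK-element lists into (low, high)."""
    merged, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged[:BLOCK], merged[BLOCK:]


def merge_streams(x: List[int], y: List[int]) -> List[int]:
    """Merge two sorted runs (lengths are multiples of BLOCK) block by block."""
    xb = [x[i:i + BLOCK] for i in range(0, len(x), BLOCK)]
    yb = [y[i:i + BLOCK] for i in range(0, len(y), BLOCK)]
    if not xb:
        return list(y)
    if not yb:
        return list(x)
    out: List[int] = []
    xi, yi = 1, 1
    low, carry = merge_block(xb[0], yb[0])
    out.extend(low)
    while xi < len(xb) or yi < len(yb):
        # Load the next block from the run whose next block starts smaller.
        if yi >= len(yb) or (xi < len(xb) and xb[xi][0] <= yb[yi][0]):
            nxt = xb[xi]
            xi += 1
        else:
            nxt = yb[yi]
            yi += 1
        # Emit the low half; the high half is carried into the next task.
        low, carry = merge_block(carry, nxt)
        out.extend(low)
    out.extend(carry)
    return out


def block_merge_sort(data: List[int]) -> List[int]:
    """Sort blocks locally, then merge runs pairwise, bottom-up."""
    if len(data) % BLOCK != 0:
        raise ValueError("sketch assumes len(data) is a multiple of BLOCK")
    runs = [sorted(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)]
    while len(runs) > 1:
        merged = [merge_streams(runs[i], runs[i + 1])
                  for i in range(0, len(runs) - 1, 2)]
        if len(runs) % 2:
            merged.append(runs[-1])  # odd run out passes through unchanged
        runs = merged
    return runs[0] if runs else []
```

In this sketch, merging two runs with a total of n blocks always issues n - 1 calls to `merge_block`, which is one way a schedule can be made independent of the data values.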
Pages: 116-123
Number of pages: 8