FIFO-based Hardware Sorters for High Bandwidth Memory

Authors
Nakano, Koji [1]
Ito, Yasuaki [1]
Bordim, Jacir L. [2]
Affiliations
[1] Hiroshima Univ, Dept Informat Engn, Kagamiyama 1-4-1, Higashihiroshima 7398527, Japan
[2] Univ Brasilia, Dept Comp Sci, BR-70910900 Brasilia, DF, Brazil
Keywords
parallel sorting algorithms; hardware sorter; high bandwidth memory; burst memory access; big data analysis
DOI
10.1109/IPDPSW.2019.00112
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The main contribution of this paper is to show efficient FIFO-based hardware sorters that sort n elements of w bits each, stored in a high bandwidth memory with modest access latency. We assume that each address of the high bandwidth memory stores p elements of w bits each, which can be read or written at the same time. The access latency of the high bandwidth memory is l: accessing the p elements at a specified address takes l clock cycles. Furthermore, burst mode is supported, so k (>= 1) consecutive addresses can be accessed in k + l - 1 clock cycles in a pipelined fashion. However, if the k addresses are not consecutive, kl clock cycles are necessary to access all of them. Clearly, all n elements arranged in n/p consecutive addresses can be duplicated in 2(n/p + l - 1) clock cycles, that is, one burst read followed by one burst write. We present two types of hardware sorters that sort n = rc elements stored in an r x c matrix in the high bandwidth memory. We first develop Three-Pass-Sort and Four-Pass-Sort, which sort an r x c matrix by reading from and writing to it three times and four times, respectively. We implement these two algorithms using FIFO-based mergers that can be configured to operate in pairwise mode or sliding mode. Our hardware sorter based on Three-Pass-Sort runs in 6n/p + 3(c^2/p^2)l + O((c/p)(l + log r) + r) clock cycles using a circuit of size O(rwp), provided that r >= c^2. Our hardware sorter based on Four-Pass-Sort runs in 8n/p + 2c^2 l + O(cl + log r + p) clock cycles using a circuit of size O(rw).
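The memory access model in the abstract can be illustrated with a minimal Python sketch. It only encodes the three cycle counts stated there (burst access, non-consecutive access, and duplication of n elements); the function names `burst_cycles`, `scattered_cycles`, and `copy_cycles` are hypothetical and do not come from the paper, and the sketch assumes that n is a multiple of p.

```python
def burst_cycles(k: int, l: int) -> int:
    """Cycles to access k consecutive addresses in burst (pipelined) mode."""
    if k < 1 or l < 1:
        raise ValueError("k and l must be at least 1")
    return k + l - 1


def scattered_cycles(k: int, l: int) -> int:
    """Cycles to access k non-consecutive addresses: each pays full latency l."""
    if k < 1 or l < 1:
        raise ValueError("k and l must be at least 1")
    return k * l


def copy_cycles(n: int, p: int, l: int) -> int:
    """Cycles to duplicate n elements stored p per address.

    Assumption for this sketch: n is a multiple of p, so the data occupies
    exactly n/p consecutive addresses. One burst read plus one burst write
    gives 2(n/p + l - 1) cycles.
    """
    if n % p != 0:
        raise ValueError("n must be a multiple of p in this sketch")
    return 2 * burst_cycles(n // p, l)
```

For example, with n = 1024, p = 16, and l = 8, the data occupies 64 addresses. A burst access takes 64 + 8 - 1 = 71 clock cycles, whereas accessing 64 non-consecutive addresses takes 64 x 8 = 512 clock cycles. Duplicating the data takes 2 x 71 = 142 clock cycles.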
Pages: 663-672
Number of pages: 10