ReCSA: a dedicated sort accelerator using ReRAM-based content addressable memory

Authors
LI Huize
JIN Hai
ZHENG Long
HUANG Yu
LIAO Xiaofei
Affiliation
[1] National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Clusters and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
Keywords
ReCAM; parallel sorting; architecture design; processing-in-memory
DOI
Not available
Chinese Library Classification (CLC) number
TP333 [Memory]
Discipline classification code
Abstract
As data volumes grow, efficient sorting algorithms for large data sets are urgently needed. Hardware sorting algorithms have attracted much attention because they can exploit the parallelism of different hardware. However, traditional hardware sort accelerators suffer from the "memory wall" problem because they require multiple rounds of data transmission between the memory and the processor. In this paper, we utilize the in-situ processing ability of the ReRAM crossbar to design a new ReCAM array that can perform matrix-vector multiplication and vector-scalar comparison simultaneously in the same array. Using this ReCAM array, we present ReCSA, the first dedicated ReCAM-based sort accelerator. Besides the hardware design, we also develop algorithms that maximize memory utilization and minimize memory exchanges to improve sorting performance. The sorting algorithm in ReCSA can process various data types, such as integers, floats, doubles, and strings. We also present experiments that evaluate its performance and energy efficiency against state-of-the-art sort accelerators. The experimental results show that ReCSA achieves speedups of 90.92×, 46.13×, 27.38×, 84.57×, and 3.36× over CPU-, GPU-, FPGA-, NDP-, and PIM-based platforms, respectively, when processing numeric data sets. ReCSA also achieves performance improvements of 24.82×, 32.94×, and 18.22× over CPU-, GPU-, and FPGA-based platforms, respectively, when processing string data sets.
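The abstract names two array-level primitives, vector-scalar comparison and matrix-vector multiplication, but does not spell out the sorting algorithm. The Python sketch below is only an illustration of how those two primitives could be combined into a rank (enumeration) sort; it is an assumption made for exposition, not the algorithm used in ReCSA, and the function name `rank_sort` is hypothetical.

```python
def rank_sort(values):
    """Rank (enumeration) sort built from comparisons and a matrix-vector product.

    Illustrative only: this is NOT the ReCSA algorithm, just one way the two
    primitives named in the abstract can yield a sorted order.
    """
    n = len(values)

    # Vector-scalar comparison: row i compares every stored value against the
    # scalar values[i]. Ties are broken by index so ranks form a permutation.
    less = [
        [1 if (values[j] < values[i] or (values[j] == values[i] and j < i)) else 0
         for j in range(n)]
        for i in range(n)
    ]

    # Matrix-vector multiplication with an all-ones vector counts, for each
    # element, how many elements must precede it. That count is its rank.
    ones = [1] * n
    ranks = [sum(row[j] * ones[j] for j in range(n)) for row in less]

    # Scatter each element to its final position.
    out = [None] * n
    for i, r in enumerate(ranks):
        out[r] = values[i]
    return out


if __name__ == "__main__":
    print(rank_sort([3, 1, 2, 1]))
    print(rank_sort(["pear", "apple", "fig"]))
```

Because the sketch relies only on `<` and `==`, it works unchanged for numbers and strings. In software the comparison matrix costs O(n²) work; the attraction of a content addressable memory is that each row of comparisons can in principle be evaluated in parallel inside the array.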
Related papers
50 in total
  • [41] FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture
    Ji, Yu
    Zhang, Youyang
    Xie, Xinfeng
    Li, Shuangchen
    Wang, Peiqi
    Hu, Xing
    Zhang, Youhui
    Xie, Yuan
    TWENTY-FOURTH INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS (ASPLOS XXIV), 2019, : 733 - 747
  • [42] A Practical Highly Paralleled ReRAM-Based DNN Accelerator by Reusing Weight Pattern Repetitions
    Zhang, Yuhao
    Jia, Zhiping
    Du, Hongchao
    Xue, Runzhen
    Shen, Zhaoyan
    Shao, Zili
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2022, 41 (04) : 922 - 935
  • [43] Spara: An Energy-Efficient ReRAM-Based Accelerator for Sparse Graph Analytics Applications
    Zheng, Long
    Zhao, Jieshan
    Huang, Yu
    Wang, Qinggang
    Zeng, Zhen
    Xue, Jingling
    Liao, Xiaofei
    Jin, Hai
    2020 IEEE 34TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM IPDPS 2020, 2020, : 696 - 707
  • [44] A Novel ReRAM-Based Processing-in-Memory Architecture for Graph Traversal
    Han, Lei
    Shen, Zhaoyan
    Liu, Duo
    Shao, Zili
    Huang, H. Howie
    Li, Tao
    ACM TRANSACTIONS ON STORAGE, 2018, 14 (01)
  • [45] LSTMs for Keyword Spotting with ReRAM-based Compute-In-Memory Architectures
    Schaefer, Clemens J. S.
    Horeni, Mark
    Taheri, Pooria
    Joshi, Siddharth
    2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021
  • [46] A Survey of ReRAM-Based Architectures for Processing-In-Memory and Neural Networks
    Mittal, Sparsh
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2019, 1 (01) : 75 - 114
  • [47] An Energy-Efficient Inference Engine for a Configurable ReRAM-Based Neural Network Accelerator
    Zheng, Yang-Lin
    Yang, Wei-Yi
    Chen, Ya-Shu
    Han, Ding-Hung
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (03) : 740 - 753
  • [48] ReSpar: Reordering Algorithm for ReRAM-based Sparse Matrix-Vector Multiplication Accelerator
    Hsiao, Yi-Jou
    Nien, Chin-Fu
    Cheng, Hsiang-Yun
    2021 IEEE 39TH INTERNATIONAL CONFERENCE ON COMPUTER DESIGN (ICCD 2021), 2021, : 260 - 268
  • [49] An Energy-Efficient Mixed-Bit CNN Accelerator With Column Parallel Readout for ReRAM-Based In-Memory Computing
    Liu, Dingbang
    Zhou, Haoxiang
    Mao, Wei
    Liu, Jun
    Han, Yuliang
    Man, Changhai
    Wu, Qiuping
    Guo, Zhiru
    Huang, Mingqiang
    Luo, Shaobo
    Lv, Mingsong
    Chen, Quan
    Yu, Hao
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2022, 12 (04) : 821 - 834
  • [50] An Energy-Efficient Mixed-Bit ReRAM-based Computing-in-Memory CNN Accelerator with Fully Parallel Readout
    Liu, Dingbang
    Mao, Wei
    Zhou, Haoxiang
    Liu, Jun
    Wu, Qiuping
    Hong, Haigiao
    Yu, Hao
    2022 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS, APCCAS, 2022, : 515 - 519