SADIMM: Accelerating Sparse Attention Using DIMM-Based Near-Memory Processing

Authors
Li, Huize [1]
Chen, Dan [1]
Mitra, Tulika [1]
Affiliations
[1] Natl Univ Singapore, Sch Comp, Singapore 19077, Singapore
Funding
National Research Foundation, Singapore
Keywords
Sparse matrices; Hardware; Memory management; Software; Parallel processing; Logic; Bandwidth; Transformers; Faces; DRAM chips; Near-memory processing; sparse attention accelerator; DRAM architecture; software-hardware co-design;
DOI
10.1109/TC.2024.3500362
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The self-attention mechanism is the performance bottleneck of Transformer-based language models. In response, researchers have proposed sparse attention to speed up Transformer execution. However, sparse attention involves massive random access, which makes it a memory-intensive kernel. Memory-based architectures, such as near-memory processing (NMP), deliver notable performance gains in memory-intensive applications. Nonetheless, existing NMP-based sparse attention accelerators perform suboptimally because of both hardware and software challenges. On the hardware side, current solutions employ homogeneous logic integration, which struggles to support the diverse operations in sparse attention. On the software side, token-based dataflow is commonly adopted, leading to load imbalance after weakly connected tokens are pruned. To address these challenges, this paper introduces SADIMM, a hardware-software co-designed NMP-based sparse attention accelerator. In hardware, we propose a heterogeneous integration approach that uses different logic units for different operations within the attention mechanism, thereby improving hardware efficiency. In software, we implement a dimension-based dataflow that divides input sequences by model dimensions, which keeps the load balanced after weakly connected tokens are pruned. Compared to an NVIDIA RTX A6000 GPU, experimental results on the BERT, BART, and GPT-2 models show that SADIMM achieves 48x, 35x, and 37x speedups and 194x, 202x, and 191x energy efficiency improvements, respectively.
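The load-balancing argument in the abstract can be illustrated with a toy cost model. The sketch below is not taken from the paper: the function names, the per-token counts in `kept`, the contiguous block assignment, and the cost formula (kept keys times dimensions handled) are all assumptions chosen for illustration, and the cost of combining partial results across units is ignored.

```python
def token_based_load(kept_per_token, d_model, num_units):
    """Work per unit when contiguous blocks of query tokens go to each unit.

    Assumed cost model: a token that keeps k keys after pruning costs k * d_model.
    """
    n = len(kept_per_token)
    block = -(-n // num_units)  # ceiling division
    loads = []
    for u in range(num_units):
        rows = kept_per_token[u * block:(u + 1) * block]
        loads.append(sum(rows) * d_model)
    return loads


def dimension_based_load(kept_per_token, d_model, num_units):
    """Work per unit when each unit holds a slice of the model dimension.

    Every unit sees every token, so pruning changes all loads by the same factor.
    """
    base, extra = divmod(d_model, num_units)
    total = sum(kept_per_token)
    return [total * (base + (1 if u < extra else 0)) for u in range(num_units)]


def imbalance(loads):
    """Ratio of the busiest unit's load to the mean load (1.0 is perfectly balanced)."""
    return max(loads) / (sum(loads) / len(loads))


# Hypothetical pruning outcome: early tokens keep many keys, late tokens keep few.
kept = [8, 8, 7, 6, 2, 1, 1, 1]
d_model, units = 64, 4

tok = token_based_load(kept, d_model, units)
dim = dimension_based_load(kept, d_model, units)
print("token-based    :", tok, "imbalance", round(imbalance(tok), 2))
print("dimension-based:", dim, "imbalance", round(imbalance(dim), 2))
```

Under these assumptions both partitions do the same total work, but the token-based split leaves some units with far more work than others, while the dimension-based split stays even regardless of which tokens were pruned.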
Pages: 542-554
Page count: 13